Implement preprocess.upsample #29

ArtemisDicoTiar · 2021-11-05T12:28:43Z

What?

A function that upsamples the classes with relatively few examples so that they match the class with the most examples.

eubinecto · 2021-11-08T09:54:08Z

@ArtemisDicoTiar You already implemented this a while ago with some function, right? If there is a commit for that part, could you post it here?

ArtemisDicoTiar · 2021-11-08T12:43:41Z

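@eubinecto This is how I implemented it.
It counts how many times each wisdom appears in the df, then randomly upsamples the rest of the data to match the most frequent one.
(Looking at the code again, I don't really get why I did the counting that way lol. A simple group-and-sort would have done it lol)
For the upsampling itself, I used scikit-learn to upsample randomly.
There's not much to it, so I'll get it done now.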

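from collections import Counter

import pandas as pd
from sklearn.utils import resample


def upsample(data_df: pd.DataFrame) -> pd.DataFrame:
    # Count rows per wisdom, most frequent first
    counts = sorted(Counter(data_df['wisdom']).items(), key=lambda r: r[1], reverse=True)
    major = counts[0]  # (wisdom, count) pair of the majority class

    # Keep the majority class as-is, then upsample every other class to match it
    frames = [data_df.loc[data_df['wisdom'] == major[0]]]
    for wis, ct in counts[1:]:
        df_minority_upsampled = resample(data_df[data_df['wisdom'] == wis],
                                         replace=True,  # sample with replacement
                                         n_samples=major[1],  # to match majority class
                                         random_state=123)  # reproducible results
        frames.append(df_minority_upsampled)

    # DataFrame.append was removed in pandas 2.0, so concatenate the pieces instead
    return pd.concat(frames)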
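
The simpler counting mentioned in the comment above (a plain group-and-sort) could look roughly like this. It is only a sketch, not the code from the commit: the toy DataFrame and the names `major_wisdom` and `major_count` are made up for illustration, and it assumes a `wisdom` column as in the snippet. pandas' `value_counts()` returns the counts sorted in descending order by default, so `Counter` and `sorted` are not needed.

```python
import pandas as pd

# Toy data for illustration only: "a" appears 4 times, "b" twice, "c" once
data_df = pd.DataFrame({"wisdom": ["a", "b", "a", "c", "a", "b", "a"]})

# value_counts() groups by value and sorts by count, largest first
counts = data_df["wisdom"].value_counts()

major_wisdom = counts.index[0]      # label of the majority class
major_count = int(counts.iloc[0])   # target size for every other class
minority_wisdoms = list(counts.index[1:])
```

With this, `major_count` would replace `major[1]` as the `n_samples` argument, and the loop would iterate over `minority_wisdoms`.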

ArtemisDicoTiar mentioned this issue Nov 5, 2021

Implement preprocess #26

Open

5 tasks

ArtemisDicoTiar assigned ArtemisDicoTiar, ohsuz and teang1995 and unassigned ohsuz Nov 5, 2021

ArtemisDicoTiar added a commit that referenced this issue Nov 8, 2021

#29 upsample implemented

477ae6d

ArtemisDicoTiar linked a pull request Nov 8, 2021 that will close this issue

#29 upsample implemented #38

Merged

ArtemisDicoTiar added a commit that referenced this issue Nov 9, 2021

#29 seed number hard coded changed to parameter

f201038
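
Passing the seed as a parameter instead of hard-coding 123 might look roughly like the following. This is a sketch under assumptions, not the contents of f201038: the parameter name `seed`, the grouping with `groupby`, and the toy DataFrame are all hypothetical. Classes that already have the target size are left untouched, and the others are resampled with replacement.

```python
import pandas as pd
from sklearn.utils import resample


def upsample(data_df: pd.DataFrame, seed: int) -> pd.DataFrame:
    """Hypothetical sketch: upsample every wisdom to the size of the largest one."""
    # Size of the largest class is the target for all classes
    target = int(data_df["wisdom"].value_counts().max())

    frames = []
    for _, group in data_df.groupby("wisdom"):
        if len(group) < target:
            # Sample with replacement; the caller-supplied seed keeps it reproducible
            group = resample(group, replace=True, n_samples=target, random_state=seed)
        frames.append(group)
    return pd.concat(frames)


# Toy data for illustration only
toy_df = pd.DataFrame({"wisdom": ["a"] * 4 + ["b"] * 2 + ["c"], "eg": range(7)})
first = upsample(toy_df, seed=42)
second = upsample(toy_df, seed=42)
```

Calling it twice with the same seed should give identical frames, which is the point of keeping the seed explicit while letting the caller choose it.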

ArtemisDicoTiar closed this as completed Nov 10, 2021
