## Effective Number of Samples (ENS)

Paper: [Class-Balanced Loss Based on Effective Number of Samples](http://openaccess.thecvf.com/content_CVPR_2019/papers/Cui_Class-Balanced_Loss_Based_on_Effective_Number_of_Samples_CVPR_2019_paper.pdf), CVPR 2019, by Google

$$W_{n,c} = \frac{1}{E_{n_c}}$$

$$E_{n_c} = \frac{1 - \beta^{n_c}}{1 - \beta}$$

where $n_c$ is the number of samples in class $c$  
and $E_{n_c}$ represents the effective number of samples

n is the number of samples, and $\beta$ is a hyperparameter in [0, 1).  
*The authors report experimental results for $\beta$ = 0.9, 0.99, 0.999, and 0.9999.
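
As a quick sanity check on the formula, the effective number can be rewritten as a finite geometric series, which makes its limiting behavior easy to read off. This is a short derivation rather than anything specific to the implementation below:

$$
E_{n_c} = \frac{1 - \beta^{n_c}}{1 - \beta} = \sum_{j=0}^{n_c - 1} \beta^{j}
$$

$$
\beta = 0 \;\Rightarrow\; E_{n_c} = 1, \qquad
\beta \to 1 \;\Rightarrow\; E_{n_c} \to n_c, \qquad
n_c \to \infty \;\Rightarrow\; E_{n_c} \to \frac{1}{1 - \beta}
$$

So $\beta = 0$ gives every class the same weight (no re-weighting), while $\beta \to 1$ approaches weighting by inverse class frequency.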

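```python
import numpy as np

def get_weights_inverse_num_of_samples(no_of_classes, beta, samples_per_cls):
  # 1 - beta^n_c: the numerator of E_n for each class
  effective_num = 1.0 - np.power(beta, samples_per_cls)
  # (1 - beta) / (1 - beta^n_c) = 1 / E_n, the unnormalized class weight
  weights_for_samples = (1.0 - beta) / np.array(effective_num)
  # rescale so the weights sum to the number of classes
  weights_for_samples = weights_for_samples / np.sum(weights_for_samples) * no_of_classes
  return weights_for_samples
```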
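
To see what the function above produces, here is a minimal usage sketch. The helper `ens_class_weights`, the class counts, and the labels are all hypothetical, made up for illustration; the helper just repeats the same computation so the block runs on its own.

```python
import numpy as np

def ens_class_weights(samples_per_cls, beta):
    """Hypothetical helper: normalized 1/E_n weights, mirroring the function above."""
    samples_per_cls = np.asarray(samples_per_cls, dtype=np.float64)
    # E_n = (1 - beta^n) / (1 - beta) for each class
    effective_num = (1.0 - np.power(beta, samples_per_cls)) / (1.0 - beta)
    weights = 1.0 / effective_num
    # rescale so the weights sum to the number of classes
    return weights / weights.sum() * len(samples_per_cls)

# Made-up long-tailed class counts, for illustration only.
counts = [5000, 500, 50]
weights = ens_class_weights(counts, beta=0.999)
print(weights)

# Per-sample weights can then be looked up from each sample's class label.
labels = np.array([0, 0, 1, 2])  # hypothetical labels
sample_weights = weights[labels]
print(sample_weights)
```

The rarest class receives the largest weight, and the per-sample weights could then be multiplied into a per-sample loss.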

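- Effective number of data samples: given N data points, the number of samples that actually carry influence once duplicates and near-duplicates are excluded  
- The value grows with sample size, so a majority class gets a large effective number and a minority class a small one; because the weight is the inverse $1/E_{n_c}$, minority classes end up with the larger weights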
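
The growth described in the bullets can be checked numerically. The sketch below is illustrative only: the function name `effective_number` and the class sizes are assumptions, and the $\beta$ values are the four listed earlier. It prints $E_n$ for each size next to the ceiling $1/(1-\beta)$ that $E_n$ approaches as $n$ grows.

```python
import numpy as np

def effective_number(n, beta):
    """E_n = (1 - beta^n) / (1 - beta), for 0 <= beta < 1."""
    n = np.asarray(n, dtype=np.float64)
    return (1.0 - np.power(beta, n)) / (1.0 - beta)

sizes = np.array([10, 100, 1000, 10000])  # hypothetical class sizes

for beta in (0.9, 0.99, 0.999, 0.9999):
    e_n = effective_number(sizes, beta)
    # E_n never exceeds n, and it levels off near 1 / (1 - beta)
    print(f"beta={beta}: E_n={np.round(e_n, 1)}, cap={1.0 / (1.0 - beta):.0f}")
```

With a small $\beta$ the effective number levels off quickly, so large classes look almost alike; with $\beta$ close to 1 it tracks the raw count much longer.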

![](https://github.com/bbh-pharm/How-to-Handle-data/blob/main/Class-Imbalance/paper_capture_ENS.png?raw=true)

References:  
1. https://medium.com/gumgum-tech/handling-class-imbalance-by-introducing-sample-weighting-in-the-loss-function-3bdebd8203b4
2. https://yjchoi-95.gitbook.io/paper-review/paper-review/cvpr-2019-class-balanced-loss-based-on-effective-number-of-samples