# Instance Selection for GANs
> Terrance DeVries, Michal Drozdzal, Graham W. Taylor

- toc:true
- branch: master
- badges: true
- comments: false
- author: 최서연

ref: https://arxiv.org/pdf/2007.15255.pdf

## Abstract

Several recently proposed techniques attempt to avoid spurious samples, either by rejecting them after generation, or by truncating the model’s latent space.
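- Recent techniques avoid spurious samples either by rejecting them after generation or by truncating the model's latent space.
- These methods are effective, but much of the model ends up allocated to samples that are never used.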

Instead, the paper proposes altering the training dataset via instance selection before model training has taken place.
- The training set is changed up front, before any training happens, rather than filtering samples afterwards.

## Instance Selection for GAN

- The goal is to automatically remove the sparsest regions of the data manifold, specifically those parts that GANs struggle to capture.
- To do this, define an image embedding function F and a scoring function H.

**Embedding function**

- F projects images into an embedding space
    - Given an image dataset $X$, applying $z = F(x)$ to each data point $x ∈ X$ yields the set of embedded images $Z$.
    - For image generation, the paper suggests an aligned embedding function, such as the feature space of a pretrained image classifier.

**Scoring function**
- H is used to assess the manifold density in a neighbourhood around each embedded data point z.
    - The paper compares three choices of scoring function:
        - log likelihood under a standard Gaussian model,
        - log likelihood under a Probabilistic Principal Component Analysis (PPCA) model,
        - distance to the Kth nearest neighbour (KNN Distance).

The Gaussian model is fit to the *embedded dataset by computing the empirical mean $µ$ and the sample covariance $Σ$ of $Z$.*
- $d$ is the dimension of $z$.

$$H_{Gaussian}(z) = -\frac{1}{2}\left[\ln(|\Sigma|) + (z - \mu)^{T} \Sigma^{-1}(z - \mu) + d \ln(2\pi)\right] \tag{1}$$
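
As a rough illustration of Eq. (1), the sketch below scores every row of an embedding matrix with NumPy. The function name `gaussian_score`, the small ridge term added to the covariance, and the random toy data are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def gaussian_score(Z, ridge=1e-6):
    """Log likelihood of each row of Z under a Gaussian fit to Z (Eq. 1)."""
    n, d = Z.shape
    mu = Z.mean(axis=0)                    # empirical mean
    sigma = np.cov(Z, rowvar=False)        # sample covariance, shape (d, d)
    sigma = sigma + ridge * np.eye(d)      # assumption: ridge for numerical stability
    _, logdet = np.linalg.slogdet(sigma)   # ln|Sigma|
    diff = Z - mu
    # Squared Mahalanobis distance (z - mu)^T Sigma^{-1} (z - mu) for every row.
    maha = np.sum(diff * np.linalg.solve(sigma, diff.T).T, axis=1)
    return -0.5 * (logdet + maha + d * np.log(2 * np.pi))

# Toy stand-in for embedded images (hypothetical data, not Inceptionv3 features).
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 4))
Z[0] = 10.0                                # an obvious outlier
scores = gaussian_score(Z)
print(scores.argmin())                     # index of the least likely point
```

Higher scores mean the point sits in a denser part of the fitted Gaussian, so the outlier row should receive the lowest score.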

- Paper setting for PPCA: the number of principal components is set such that 95% of the variance in the data is preserved.

$$H_{PPCA}(z) = -\frac{1}{2}\left[\ln(|C|) + Tr\left((z - \mu)^{T} C^{-1}(z - \mu)\right) + d \ln(2\pi)\right], \quad C = WW^{T} + \sigma^{2} I \tag{2}$$
- $W$ is the fit model weight matrix,
- $µ$ is the empirical mean of $Z$, 
- $σ^2$ is the residual variance,
- $I$ is the identity matrix, 
- $d$ is the dimension of $z$.
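
A minimal sketch of Eq. (2), assuming the standard closed-form maximum likelihood fit for PPCA (top eigenvectors of the sample covariance, with $σ^2$ set to the mean of the discarded eigenvalues). The paper does not say how its PPCA model was fit, so the fitting procedure, the names `fit_ppca` and `ppca_score`, and the toy data are all assumptions of this example.

```python
import numpy as np

def fit_ppca(Z, variance_kept=0.95):
    """Closed-form PPCA fit; returns (mu, W, sigma2)."""
    n, d = Z.shape
    mu = Z.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # sort descending
    ratio = np.cumsum(eigvals) / eigvals.sum()
    q = int(np.searchsorted(ratio, variance_kept)) + 1   # components for 95% variance
    q = min(q, d - 1)                                    # keep at least one residual direction
    sigma2 = eigvals[q:].mean()                          # residual variance
    W = eigvecs[:, :q] * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return mu, W, sigma2

def ppca_score(Z, variance_kept=0.95):
    """Log likelihood of each row of Z under the fitted PPCA model (Eq. 2)."""
    d = Z.shape[1]
    mu, W, sigma2 = fit_ppca(Z, variance_kept)
    C = W @ W.T + sigma2 * np.eye(d)
    _, logdet = np.linalg.slogdet(C)
    diff = Z - mu
    maha = np.sum(diff * np.linalg.solve(C, diff.T).T, axis=1)
    return -0.5 * (logdet + maha + d * np.log(2 * np.pi))

# Hypothetical toy embeddings with two dominant directions and one outlier.
rng = np.random.default_rng(0)
Z = rng.normal(size=(400, 6)) * np.array([5.0, 4.0, 1.0, 1.0, 1.0, 1.0])
Z[0] = 50.0
scores = ppca_score(Z)
print(scores.argmin())
```

Because the trace of the matrix in Eq. (2) is taken over a scalar, it can be dropped in code; the quadratic form is computed directly.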

**KNN**
- Compute the Euclidean distance between $z$ and every point in $Z \setminus \{z\}$, then return the distance to the $K$th nearest one.
- To convert to a score, we make the resulting distance negative, such that smaller distances return larger values. 

$$H_{KNN}(z, K, Z) = -\underset{K}{\min} \left\{ \|z - z_i\|_2 : z_i \in Z \setminus \{z\} \right\} \tag{3}$$

- Here the minimum with subscript $K$ denotes the $K$th smallest value in the set; the paper sets $K = 5$.
- To perform *instance selection*, we compute scores $H(F(x))$ for each data point and keep all data points with scores above some threshold $ψ$.
- For convenience, we often set $ψ$ to be equal to some percentile of the scores, such that we preserve the top N% of the best scoring data points.
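
The KNN Distance score of Eq. (3) can be sketched with a brute-force pairwise distance matrix. This is only an illustration: the name `knn_score` and the tiny 1-D dataset are assumptions, and the all-pairs computation needs memory quadratic in the number of points, so a real dataset would likely call for a nearest-neighbour index instead.

```python
import numpy as np

def knn_score(Z, k=5):
    """Negative Euclidean distance from each row of Z to its k-th nearest other row."""
    dist = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)  # (n, n) distances
    np.fill_diagonal(dist, np.inf)          # exclude each point from its own neighbours
    kth = np.partition(dist, k - 1, axis=1)[:, k - 1]              # k-th smallest per row
    return -kth                             # smaller distance -> larger score

# Hypothetical 1-D embeddings: three clustered points and one isolated point.
Z = np.array([[0.0], [1.0], [2.0], [10.0]])
print(knn_score(Z, k=2))
```

The isolated point at 10 is far from its second nearest neighbour, so it gets the lowest score of the four.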

In Figure 1, high likelihood images share a similar visual structure, while low likelihood samples are more varied.

$$X' = \{x \in X \ \text{s.t.}\ H(F(x)) > \psi\}$$
- The reduced training set $X'$ is built from the initial training set of data points $x ∈ X$ by keeping only the points whose score exceeds $ψ$.
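
The selection step itself is a percentile threshold on the scores. A minimal sketch, assuming the scores $H(F(x))$ are already in a NumPy array; `select_instances` and `keep_fraction` are names invented for this example.

```python
import numpy as np

def select_instances(scores, keep_fraction=0.5):
    """Return indices with score > psi, where psi is a percentile of the scores."""
    psi = np.quantile(scores, 1.0 - keep_fraction)   # threshold psi
    keep = np.flatnonzero(scores > psi)              # X' = {x : H(F(x)) > psi}
    return keep, psi

scores = np.arange(10.0)     # hypothetical scores for 10 images
keep, psi = select_instances(scores, keep_fraction=0.5)
print(keep, psi)
```

The returned indices can then be used to subset the image array or file list before GAN training starts.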

In Figure 1, the most and least likely images from the Red Fox class of ImageNet illustrate why removing data points from the training set can help.

The likelihood is determined by a Gaussian model fit on feature embeddings from a pretrained Inceptionv3 classifier.

- The most likely images (a) are similarly cropped around the fox’s face, while the least likely images (b) have many odd viewpoints and often suffer from occlusion. It is logical to imagine how a generative model trained on these unusual instances may try to generate samples that mimic such conditions, resulting in undesirable outputs.


## Experiments

- review evaluation metrics,
- motivate selecting instances based on manifold density, 
- analyze the impact of applying instance selection to GAN training.

### Evaluation Metrics

- When calculating FID we follow Brock et al. [2] in using all images in the training set *to estimate the reference distribution*, and *sampling 50k images* to make up the generated distribution.
- For P&R and D&C we use an Inceptionv3 embedding.
- N and M are **set to 10k samples** for both the reference and generated distributions, and K is **set equal to 5** as recommended by Naeem et al. [19].
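
FID compares a Gaussian fit to the reference features with one fit to the generated features. The sketch below computes that Fréchet distance with NumPy only, assuming the Inceptionv3 features have already been extracted into arrays. The function names and the eigenvalue-based trace of the matrix square root are choices made for this example and are not taken from the paper or from any reference FID implementation.

```python
import numpy as np

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Tr((sigma1 sigma2)^(1/2)) equals the sum of square roots of the eigenvalues
    # of sigma1 @ sigma2, which are real and non-negative for covariance matrices.
    eig = np.linalg.eigvals(sigma1 @ sigma2)
    tr_sqrt = np.sum(np.sqrt(np.clip(eig.real, 0.0, None)))
    return diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt

def fid_from_features(real, fake):
    """FID from two (n_samples, n_features) arrays of precomputed features."""
    mu_r, mu_f = real.mean(axis=0), fake.mean(axis=0)
    cov_r, cov_f = np.cov(real, rowvar=False), np.cov(fake, rowvar=False)
    return frechet_distance(mu_r, cov_r, mu_f, cov_f)

# Hypothetical features: the "generated" set is shifted relative to the reference set.
rng = np.random.default_rng(0)
real = rng.normal(size=(1000, 8))
fake = rng.normal(loc=0.5, size=(1000, 8))
print(fid_from_features(real, fake))
```

Lower values mean the two feature distributions are closer, which is why lower FID is read as better sample quality.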

### Relationship Between Dataset Manifold Density and GAN Performance

- An image manifold is defined more accurately in regions where many data points lie close to each other.
- Because a GAN tries to reproduce the image manifold from the data points of a given dataset, the authors suspect it should perform better on datasets with a well-defined manifold (no sparse manifold regions).
    - To test this, they use the ImageNet dataset [7] and treat each of the 1000 classes as a separate dataset.
    - use a single class-conditional BigGAN from [2] that has been pretrained on ImageNet at 128 × 128 resolution. 
    - For each class, we sample 700 real images from the dataset, and generate 700 class-conditioned samples with the BigGAN.

To measure the density for each class manifold we compare three different methods: 
- Gaussian likelihood, 
- Probabilistic Principal Component Analysis (PPCA) likelihood,
- and distance to the Kth neighbour (KNN Distance) (§3).

![](https://d3i71xaburhd42.cloudfront.net/d534182c1a64143e74e9f00fd7394b9223fe62a0/5-Figure2-1.png)

Figure 2. Correlation between the manifold density estimate and FID for each class of the image dataset. Lower values on the x-axis indicate a denser dataset manifold. Lower values on the y-axis indicate better sample quality.

### Embedding and Scoring Function