This repo contains information about, and implementations (PyTorch, TensorFlow) of, the Inception Score (IS) and the Fréchet Inception Distance (FID). It is a handy toolbox that you can easily add to your projects. The TF implementations are intended to produce exactly the same output as the official ones, so their results can be reported in papers. Discussions, PRs, and issues are welcome.


Put this metrics/ folder in your project, then see the PyTorch section below and the header comment of each .py file for usage.

Some files also need to be downloaded into res/; see res/ for details.

TF implementations (almost the same as the official ones, only the interface is changed; results can be reported in papers)

PyTorch implementation (results CANNOT be reported in papers, but it gives a quick view)

  • Requirements

    • pytorch, torchvision, scipy, numpy, tqdm

    • inception score, gets around mean=9.67278, std=0.14992 on CIFAR-10 train data when n_split=10
    • FID score
    • calculate stats (mu, sigma) for custom images in a folder
    • multi-GPU support via nn.DataParallel
      • e.g. CUDA_VISIBLE_DEVICES=0,1,2,3 will use 4 GPUs.
  • command line usage

    • calculate IS, FID

      # calc IS score on CIFAR10, will download CIFAR10 data to ../data/cifar10
      # calc IS score on custom images in a folder/
      python --path foldername/
      # calc IS, FID score on custom images in a folder/, compared to CIFAR10 (given precalculated stats)
      python --path foldername/ --fid res/stats_pytorch/fid_stats_cifar10_train.npz
      # calc FID on custom images in two folders/
      python --path foldername1/ --fid foldername2/
      # calc FID on two precalculated stats
      python --path res/stats_pytorch/fid_stats_cifar10_train.npz --fid res/stats_pytorch/fid_stats_cifar10_train.npz
    • precalculate stats

      # precalculate stats for CIFAR-10 and store them as npz, will download CIFAR10 data to ../data/cifar10
      python --save-stats-path res/stats_pytorch/fid_stats_cifar10_train.npz
      # precalculate stats for images in folder/ and store them as npz
      python --path foldername/ --save-stats-path res/stats_pytorch/fid_stats_folder.npz
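As a rough illustration of what the precalculated stats (mu, sigma) are used for, the sketch below computes the Fréchet distance between two Gaussians from their means and covariances. It is a minimal NumPy sketch, not this repo's code: the function names `feature_stats` and `frechet_distance` are hypothetical, the random arrays stand in for Inception features, and the trace of the matrix square root is obtained from eigenvalues rather than from a matrix square root routine such as `scipy.linalg.sqrtm`.

```python
import numpy as np


def feature_stats(features):
    """Mean and covariance of an (N, D) feature array (hypothetical helper)."""
    return features.mean(axis=0), np.cov(features, rowvar=False)


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    mu1, mu2 = np.atleast_1d(mu1), np.atleast_1d(mu2)
    sigma1, sigma2 = np.atleast_2d(sigma1), np.atleast_2d(sigma2)
    diff = mu1 - mu2
    # Tr(sqrt(sigma1 @ sigma2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2, which are real and non-negative for
    # covariance matrices; clip tiny negative values caused by round-off.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_covmean)


# Random features stand in for Inception activations (assumption for the demo).
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(500, 8))
feats_b = rng.normal(loc=0.5, size=(500, 8))
mu_a, sigma_a = feature_stats(feats_a)
mu_b, sigma_b = feature_stats(feats_b)
print(frechet_distance(mu_a, sigma_a, mu_a, sigma_a))  # close to 0 for identical stats
print(frechet_distance(mu_a, sigma_a, mu_b, sigma_b))  # grows as the two sets differ
```

Storing only (mu, sigma) is what makes the precalculated .npz files useful: the reference dataset does not have to be passed through the network again for every comparison.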
  • in code usage

    • mode=1: image tensor has already been normalized by mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
    • mode=2: image tensor has already been normalized by mean=[0.500, 0.500, 0.500], std=[0.500, 0.500, 0.500]
      import torch
      import torchvision.datasets as dset
      import torchvision.transforms as transforms
      from metrics import is_fid_pytorch
      # using precalculated stats (.npz) for FID calculation
      is_fid_model = is_fid_pytorch.ScoreModel(mode=2, stats_file='res/stats_pytorch/fid_stats_cifar10_train.npz', cuda=cuda)
      imgs_nchw = torch.Tensor(50000, C, H, W) # torch.Tensor in -1~1, normalized by mean=[0.500, 0.500, 0.500], std=[0.500, 0.500, 0.500]
      is_mean, is_std, fid = is_fid_model.get_score_image_tensor(imgs_nchw)
      # mu, sigma can also be passed directly to get_score_image_tensor()
      is_fid_model = is_fid_pytorch.ScoreModel(mode=2, cuda=cuda)
      mu, sigma = is_fid_pytorch.read_stats_file('res/stats_pytorch/fid_stats_cifar10_train.npz')
      is_mean, is_std, fid = is_fid_model.get_score_image_tensor(imgs_nchw, mu1=mu, sigma1=sigma)
      # if FID is not needed
      is_fid_model = is_fid_pytorch.ScoreModel(mode=2, cuda=cuda)
      is_mean, is_std, _ = is_fid_model.get_score_image_tensor(imgs_nchw)
      # to also get the stats (mu, sigma) of imgs_nchw, pass return_stats=True
      is_mean, is_std, _, mu, sigma = is_fid_model.get_score_image_tensor(imgs_nchw, return_stats=True)
      # for a pytorch dataset, use get_score_dataset() instead of get_score_image_tensor(); the rest of the usage is the same
      cifar = dset.CIFAR10(root='../data/cifar10', download=True,
                           transform=transforms.Compose([
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
                           ]))
      is_mean, is_std, _ = is_fid_model.get_score_dataset(IgnoreLabelDataset(cifar))
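The two `mode` conventions above differ only in the per-channel mean and std applied to the images. The following is a minimal sketch of that arithmetic, assuming the images start as an (N, 3, H, W) array with values in [0, 1]; it uses NumPy for brevity (the same expression works on torch tensors), and `MODE_STATS` and `normalize_for_mode` are hypothetical names, not part of this repo.

```python
import numpy as np

# Per-channel statistics quoted in the mode descriptions above.
MODE_STATS = {
    1: ((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
    2: ((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
}


def normalize_for_mode(imgs_01, mode):
    """Normalize an (N, 3, H, W) batch with values in [0, 1] for the given mode."""
    mean, std = MODE_STATS[mode]
    # Reshape to (1, 3, 1, 1) so the statistics broadcast over batch and pixels.
    mean = np.asarray(mean, dtype=imgs_01.dtype).reshape(1, 3, 1, 1)
    std = np.asarray(std, dtype=imgs_01.dtype).reshape(1, 3, 1, 1)
    return (imgs_01 - mean) / std


# Random images in [0, 1) stand in for real data (assumption for the demo).
imgs_01 = np.random.default_rng(0).random((4, 3, 32, 32), dtype=np.float32)
imgs_mode2 = normalize_for_mode(imgs_01, mode=2)
# With mode=2 the values land in the -1~1 range expected in the usage above.
print(imgs_mode2.min() >= -1.0, imgs_mode2.max() <= 1.0)
```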


  • Refactor TF implementation of IS, FID Together
  • MS-SSIM score - PyTorch
  • MS-SSIM score - Tensorflow


Inception Score (IS)

  • Assumption

    • MEANINGFUL: A generated image should be clear, so the output probability of a classifier network should look like [0.9, 0.05, ...] (heavily skewed toward one class). That is, $p(y|\mathbf{x})$ has low entropy.
    • DIVERSITY: If there are 10 classes, the generated images should be spread evenly across them, so that the marginal distribution $p(y) = \frac{1}{N} \sum_{i=1}^{N} p(y|\mathbf{x}^{(i)})$ has high entropy.
    • Better models: the KL divergence between $p(y|\mathbf{x})$ and $p(y)$ should be high.
  • Formulation

    • $\mathbf{IS} = \exp (\mathbb{E}_{\mathbf{x} \sim p_g} D_{KL} [p(y|\mathbf{x}) || p(y)] )$
    • where
      • $\mathbf{x}$ is sampled from generated data
      • $p(y|\mathbf{x})$ is the output probability of Inception v3 when the input is $\mathbf{x}$
      • $p(y) = \frac{1}{N} \sum_{i=1}^{N} p(y|\mathbf{x}^{(i)})$ is the average output probability of all generated data (from InceptionV3, 1000-dim vector)
      • $D_{KL} (\mathbf{p}||\mathbf{q}) = \sum_{j} p_{j} \log \frac{p_j}{q_j}$, where $j$ indexes the dimensions of the output probability.
  • Explanation

    • for an ideal model, $p(y)$ is an evenly distributed (uniform) vector
    • larger $\mathbf{IS}$ score -> larger KL divergence -> greater diversity and clarity
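The formulation and explanation above can be sketched in a few lines of NumPy. This is a minimal sketch that assumes the class probabilities $p(y|\mathbf{x})$ are already available as an (N, num_classes) array; the function name `inception_score`, the `eps` term, and the way the data is cut into `n_split` contiguous chunks are assumptions for illustration, not necessarily what this repo does.

```python
import numpy as np


def inception_score(probs, n_split=10, eps=1e-12):
    """Return (mean, std) of exp(E_x KL(p(y|x) || p(y))) over n_split chunks."""
    probs = np.asarray(probs, dtype=np.float64)
    scores = []
    for part in np.array_split(probs, n_split):
        # Marginal p(y): average of the conditional distributions in this chunk.
        p_y = part.mean(axis=0, keepdims=True)
        # KL(p(y|x) || p(y)) per sample; eps avoids log(0).
        kl = (part * (np.log(part + eps) - np.log(p_y + eps))).sum(axis=1)
        scores.append(np.exp(kl.mean()))
    return float(np.mean(scores)), float(np.std(scores))


num_classes, n = 10, 1000
# Confident and diverse: one-hot predictions cycling through all classes.
diverse = np.eye(num_classes)[np.arange(n) % num_classes]
# Confident but collapsed: every sample is predicted as class 0.
collapsed = np.eye(num_classes)[np.zeros(n, dtype=int)]
# Diverse marginal but unclear samples: every prediction is uniform.
blurry = np.full((n, num_classes), 1.0 / num_classes)

print(inception_score(diverse))    # mean close to num_classes, the upper bound
print(inception_score(collapsed))  # mean close to 1, the lower bound
print(inception_score(blurry))     # mean close to 1 as well
```

The three toy inputs show why both assumptions are needed: the score is high only when each $p(y|\mathbf{x})$ is peaked and the marginal $p(y)$ is spread out.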
  • Reference

Fréchet Inception Distance (FID)


