
Mis-implementation of JS divergence #27

Closed
greatwallet opened this issue Sep 9, 2021 · 2 comments

@greatwallet

greatwallet commented Sep 9, 2021

Hi, according to the definition of the JS divergence (as given in your supplementary file), the JS divergence is the difference between the entropy of the average of the probabilities and the average of the entropies:

JS(p_1, \dots, p_n) = H\Big(\frac{1}{n}\sum_{i=1}^{n} p_i\Big) - \frac{1}{n}\sum_{i=1}^{n} H(p_i)
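
For concreteness, here is a minimal sketch of this definition in PyTorch (the function and variable names are illustrative, not taken from the repository):

    import torch
    from torch.distributions import Categorical

    def js_divergence(prob_list):
        # prob_list: per-classifier probability maps of shape (num_pixels, num_classes),
        # already normalized (e.g. via softmax over the class dimension).
        mean_probs = torch.stack(prob_list).mean(dim=0)
        # First term: entropy of the average probabilities, H(mean of p_i).
        full_entropy = Categorical(probs=mean_probs).entropy()
        # Second term: average of the entropies, mean of H(p_i).
        mean_entropy = torch.stack(
            [Categorical(probs=p).entropy() for p in prob_list]
        ).mean(dim=0)
        return full_entropy - mean_entropy  # per-pixel JS divergence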

However, in your code the first term of the JS divergence, i.e. the entropy of the average probabilities, is implemented as:

    full_entropy = Categorical(logits=mean_seg).entropy()

where mean_seg is defined as the average segmentation map over the 10 outputs of the ensembled pixel_classifiers.

Specifically, I traced the implementation of mean_seg:

    mean_seg = mean_seg / len(all_seg)

where mean_seg is accumulated as

    if mean_seg is None:
        mean_seg = img_seg
    else:
        mean_seg += img_seg

and img_seg comes from

    img_seg = classifier(affine_layers)
    img_seg = img_seg.squeeze()
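
Putting the trace together, the quantity that ends up in Categorical(logits=...) is an average of raw logits, roughly as in the condensed sketch below (in forward order; classifiers and affine_layers are placeholders for the repo's variables):

    mean_seg = None
    all_seg = []
    for classifier in classifiers:            # the 10 ensembled pixel_classifiers
        img_seg = classifier(affine_layers)   # raw, unnormalized class scores
        img_seg = img_seg.squeeze()
        if mean_seg is None:
            mean_seg = img_seg
        else:
            mean_seg += img_seg
        all_seg.append(img_seg)
    mean_seg = mean_seg / len(all_seg)        # average of logits, not of probabilities
    full_entropy = Categorical(logits=mean_seg).entropy()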

In fact, the img_seg tensors are all unnormalized probabilities, i.e. logits in the sense of PyTorch's distribution arguments. It looks like the code averages over logits instead of probabilities (since nn.Sigmoid is commented out in pixel_classifier):

    class pixel_classifier(nn.Module):
        def __init__(self, numpy_class, dim):
            super(pixel_classifier, self).__init__()
            if numpy_class < 32:
                self.layers = nn.Sequential(
                    nn.Linear(dim, 128),
                    nn.ReLU(),
                    nn.BatchNorm1d(num_features=128),
                    nn.Linear(128, 32),
                    nn.ReLU(),
                    nn.BatchNorm1d(num_features=32),
                    nn.Linear(32, numpy_class),
                    # nn.Sigmoid()
                )
            else:
                self.layers = nn.Sequential(
                    nn.Linear(dim, 256),
                    nn.ReLU(),
                    nn.BatchNorm1d(num_features=256),
                    nn.Linear(256, 128),
                    nn.ReLU(),
                    nn.BatchNorm1d(num_features=128),
                    nn.Linear(128, numpy_class),
                    # nn.Sigmoid()
                )
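
One way to match the definition would be to normalize each classifier's output before averaging, e.g. with a softmax over the class dimension. The sketch below assumes the last dimension indexes classes and is not the repository's actual fix:

    import torch
    import torch.nn.functional as F
    from torch.distributions import Categorical

    all_probs = []
    for classifier in classifiers:                    # placeholder for the ensemble
        img_seg = classifier(affine_layers).squeeze()
        all_probs.append(F.softmax(img_seg, dim=-1))  # normalize per classifier
    mean_probs = torch.stack(all_probs).mean(dim=0)   # average of probabilities
    full_entropy = Categorical(probs=mean_probs).entropy()  # H(mean of p_i)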

TL;DR

Softmax does not commute with averaging (a linear operation), so applying softmax to the mean of the logits is not the same as taking the mean of the softmaxed probabilities. This leads to a mis-implementation of the JS divergence:

\mathrm{softmax}\Big(\frac{1}{n}\sum_{i=1}^{n} z_i\Big) \neq \frac{1}{n}\sum_{i=1}^{n} \mathrm{softmax}(z_i)
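
A quick numerical check of this non-commutation (made-up logits for a single 3-class pixel from two classifiers):

    import torch
    import torch.nn.functional as F

    z = torch.tensor([[ 2.0, 0.0, -1.0],
                      [-1.0, 3.0,  0.0]])
    p_of_mean = F.softmax(z.mean(dim=0), dim=-1)   # softmax of the averaged logits
    mean_of_p = F.softmax(z, dim=-1).mean(dim=0)   # average of the softmaxed logits
    print(torch.allclose(p_of_mean, mean_of_p))    # False: the two results differ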

@arieling
Collaborator

arieling commented Sep 9, 2021

Thank you very much for pointing this out. We are looking into it.

@arieling
Collaborator

@greatwallet Thank you again for pointing this bug out!
We have fixed the bug in commit d9564d4.
The numbers in the README have also been updated.
