[Bug] DAN loss may be below 0 #54

Closed
xianyuanliu opened this issue Feb 4, 2021 · 2 comments
@xianyuanliu (Member)

🐛 Bug

When I run DAN on digits_dann_lightn and action_dann_lightn, the MMD loss T_mmd sometimes takes values below 0. This drives T_total_loss below 0, because T_total_loss = T_task_loss + 1 * T_mmd. Is this correct?
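
For reference, a minimal sketch of how the total loss is combined (the values and names here are illustrative only, with the trade-off weight set to 1 as above):

# Illustrative numbers, not taken from the run: a negative MMD term pulls the
# total loss below zero because the two terms are simply added.
T_task_loss = 0.35                      # classification loss, always >= 0
T_mmd = -0.42                           # MMD estimate observed to go negative
T_total_loss = T_task_loss + 1 * T_mmd  # = -0.07, i.e. below 0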

To reproduce

Steps to reproduce the behavior:

In digits_dann_lightn,

  1. After merging Add DAN to digits_dann_lightn #53, set fast_dev_run=False and logger=True in main.py.
  2. Run python main.py --cfg ./configs/MN2UP-DAN.yaml --gpus 1.
  3. Check the losses by printing them or with TensorBoard.

Stack trace/error message

This is my output with repeats=10, epoch=100, init_epoch=20.
T_mmd fluctuates, and so does T_total_loss. I think the loss should stay above 0.

[screenshot of the loss output]

Expected Behaviour

The loss should stay above 0, as it does for CDAN.
[screenshot of the CDAN loss output]

There are some useful links:
ADA code
Xlearn code
I checked these implementations and ours is nearly identical to them, so I am not sure whether our loss output is correct.

Environment

[pip3] numpy==1.19.2
[pip3] pytorch-lightning==1.0.3
[pip3] pytorch-memlab==0.2.2
[pip3] torch==1.7.0
[pip3] torchaudio==0.7.0
[pip3] torchsummary==1.5.1
[pip3] torchvision==0.8.1
[conda] blas                      1.0                         mkl
[conda] cudatoolkit               10.2.89              h74a9793_1
[conda] mkl                       2020.2                      256
[conda] mkl-service               2.3.0            py38hb782905_0
[conda] mkl_fft                   1.2.0            py38h45dec08_0
[conda] mkl_random                1.1.1            py38h47e9c7a_0
[conda] numpy                     1.19.2           py38hadc3359_0
[conda] numpy-base                1.19.2           py38ha3acd2a_0
[conda] pytorch                   1.7.0           py3.8_cuda102_cudnn7_0    pytorch
[conda] pytorch-lightning         1.0.2                    pypi_0    pypi
[conda] pytorch-memlab            0.2.2                    pypi_0    pypi
[conda] torchaudio                0.7.0                      py38    pytorch
[conda] torchsummary              1.5.1                    pypi_0    pypi
[conda] torchvision               0.8.1                py38_cu102    pytorch
@xianyuanliu xianyuanliu added bug Something isn't working question Further information is requested labels Feb 4, 2021
@haipinglu (Member)

@sz144 #53 has been merged into master. Could you please take a look at this MMD issue? Thanks.

@haipinglu haipinglu added this to To do (sorted by urgency) in v0.1.0 via automation Feb 8, 2021
@haipinglu haipinglu moved this from To do (sorted by urgency) to In progress (next 2 weeks) in v0.1.0 Mar 25, 2021
@xianyuanliu (Member, Author)

In this DAN example, we use an unbiased estimator of MMD with linear complexity, following the original paper. An unbiased estimate can dip below 0 even though the true MMD is non-negative, which may be the reason.

Referring to Xlearn, the complete (quadratic-time) version should be:

def DAN(source, target, kernel_mul=2.0, kernel_num=5, fix_sigma=None):
    # guassian_kernel is Xlearn's multi-kernel RBF helper (defined alongside
    # this function in the Xlearn code); it returns a
    # (2 * batch_size, 2 * batch_size) kernel matrix over the concatenated
    # [source; target] features.
    batch_size = int(source.size()[0])
    kernels = guassian_kernel(source, target,
        kernel_mul=kernel_mul, kernel_num=kernel_num, fix_sigma=fix_sigma)

    # Within-domain term: average k(s_i, s_j) + k(t_i, t_j) over all pairs i < j.
    loss1 = 0
    for s1 in range(batch_size):
        for s2 in range(s1 + 1, batch_size):
            t1, t2 = s1 + batch_size, s2 + batch_size
            loss1 += kernels[s1, s2] + kernels[t1, t2]
    loss1 = loss1 / float(batch_size * (batch_size - 1) / 2)

    # Cross-domain term: average k(s_i, t_j) over all source-target pairs,
    # accumulated with a negative sign.
    loss2 = 0
    for s1 in range(batch_size):
        for s2 in range(batch_size):
            t1, t2 = s1 + batch_size, s2 + batch_size
            loss2 -= kernels[s1, t2] + kernels[s2, t1]
    loss2 = loss2 / float(batch_size * batch_size)
    return loss1 + loss2
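
Because each summand of the unbiased estimate mixes positive within-domain terms and negative cross-domain terms, the estimate itself can go negative even though the true MMD is non-negative. As a sketch (not the exact repository code), a common linear-complexity variant in the same style, reusing the guassian_kernel helper above, looks like this; with only batch_size summands its variance is higher, so negative values are more likely:

def DAN_linear(source, target, kernel_mul=2.0, kernel_num=5, fix_sigma=None):
    # Sketch only: linear-time unbiased MMD estimate that pairs each sample
    # with its neighbour in the batch (with wrap-around), in the spirit of the
    # linear-time estimator used by the DAN paper. Each summand
    # k(s1, s2) + k(t1, t2) - k(s1, t2) - k(s2, t1) can be negative, so the
    # whole estimate may fall below 0.
    batch_size = int(source.size()[0])
    kernels = guassian_kernel(source, target,
        kernel_mul=kernel_mul, kernel_num=kernel_num, fix_sigma=fix_sigma)

    loss = 0
    for i in range(batch_size):
        s1, s2 = i, (i + 1) % batch_size            # neighbouring source samples
        t1, t2 = s1 + batch_size, s2 + batch_size   # matching target samples
        loss += kernels[s1, s2] + kernels[t1, t2]   # within-domain similarity
        loss -= kernels[s1, t2] + kernels[s2, t1]   # cross-domain similarity
    return loss / float(batch_size)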

v0.1.0 automation moved this from In progress to Done Apr 8, 2021