I get negative vae loss #34

Open
Yitian-Li opened this issue Apr 14, 2022 · 5 comments
@Yitian-Li

I'm trying to train a model with my own dataset, but I get a negative VAE loss, which seems quite strange.
Could you help me with this? Thanks!
VAE training log with factor rot:

/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Namespace(batch_size=64, cross_entropy_loss=False, datapath='../datasets/data/coeff', epochs=600, factor='rot', gpu=0, lr=0.0001, lr_epochs=150, lr_fac=0.5, output_path='./weights', root_folder='.', val=False, write_iteration=600)
46975
2022-04-14 10:43:10.860388: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2022-04-14 10:43:11.042504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: NVIDIA Tesla P40 major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:08:00.0
totalMemory: 23.88GiB freeMemory: 22.99GiB
2022-04-14 10:43:11.042551: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2022-04-14 10:43:11.427630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-04-14 10:43:11.427679: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
2022-04-14 10:43:11.427686: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
2022-04-14 10:43:11.427814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22300 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla P40, pci bus id: 0000:08:00.0, compute capability: 6.1)
Date: 2022-04-14 10:43:26       Epoch: [Stage 1][0/600] Loss: 2.7066.
Date: 2022-04-14 10:43:40       Epoch: [Stage 1][1/600] Loss: 2.4838.
Date: 2022-04-14 10:43:55       Epoch: [Stage 1][2/600] Loss: 2.2725.
Date: 2022-04-14 10:44:10       Epoch: [Stage 1][3/600] Loss: 2.0638.
Date: 2022-04-14 10:44:24       Epoch: [Stage 1][4/600] Loss: 1.8573.
Date: 2022-04-14 10:44:39       Epoch: [Stage 1][5/600] Loss: 1.6532.
Date: 2022-04-14 10:44:54       Epoch: [Stage 1][6/600] Loss: 1.4517.
Date: 2022-04-14 10:45:09       Epoch: [Stage 1][7/600] Loss: 1.2532.
Date: 2022-04-14 10:45:24       Epoch: [Stage 1][8/600] Loss: 1.0580.
Date: 2022-04-14 10:45:39       Epoch: [Stage 1][9/600] Loss: 0.8668.
Date: 2022-04-14 10:45:54       Epoch: [Stage 1][10/600]        Loss: 0.6800.
Date: 2022-04-14 10:46:08       Epoch: [Stage 1][11/600]        Loss: 0.4984.
Date: 2022-04-14 10:46:23       Epoch: [Stage 1][12/600]        Loss: 0.3226.
Date: 2022-04-14 10:46:38       Epoch: [Stage 1][13/600]        Loss: 0.1537.
Date: 2022-04-14 10:46:53       Epoch: [Stage 1][14/600]        Loss: -0.0077.
Date: 2022-04-14 10:47:07       Epoch: [Stage 1][15/600]        Loss: -0.1606.
Date: 2022-04-14 10:47:22       Epoch: [Stage 1][16/600]        Loss: -0.3031.
Date: 2022-04-14 10:47:37       Epoch: [Stage 1][17/600]        Loss: -0.4346.
Date: 2022-04-14 10:47:52       Epoch: [Stage 1][18/600]        Loss: -0.5539.
Date: 2022-04-14 10:48:07       Epoch: [Stage 1][19/600]        Loss: -0.6600.
Date: 2022-04-14 10:48:21       Epoch: [Stage 1][20/600]        Loss: -0.7503.
Date: 2022-04-14 10:48:37       Epoch: [Stage 1][21/600]        Loss: -0.8247.
Date: 2022-04-14 10:48:51       Epoch: [Stage 1][22/600]        Loss: -0.8824.
Date: 2022-04-14 10:49:06       Epoch: [Stage 1][23/600]        Loss: -0.9245.
Date: 2022-04-14 10:49:20       Epoch: [Stage 1][24/600]        Loss: -0.9519.
Date: 2022-04-14 10:49:35       Epoch: [Stage 1][25/600]        Loss: -0.9678.
Date: 2022-04-14 10:49:50       Epoch: [Stage 1][26/600]        Loss: -0.9839.
Date: 2022-04-14 10:50:05       Epoch: [Stage 1][27/600]        Loss: -1.0732.
Date: 2022-04-14 10:50:20       Epoch: [Stage 1][28/600]        Loss: -1.2186.
Date: 2022-04-14 10:50:34       Epoch: [Stage 1][29/600]        Loss: -1.2832.
Date: 2022-04-14 10:50:49       Epoch: [Stage 1][30/600]        Loss: -1.3243.
Date: 2022-04-14 10:51:04       Epoch: [Stage 1][31/600]        Loss: -1.3485.
Date: 2022-04-14 10:51:19       Epoch: [Stage 1][32/600]        Loss: -1.3644.
Date: 2022-04-14 10:51:34       Epoch: [Stage 1][33/600]        Loss: -1.3820.
Date: 2022-04-14 10:51:49       Epoch: [Stage 1][34/600]        Loss: -1.3819.

VAE training log with factor gamma:

root@train-disco3-0:/data1/DiscoFaceGAN/vae# python demo.py --datapath ../datasets/data/coeff --factor gamma
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Namespace(batch_size=64, cross_entropy_loss=False, datapath='../datasets/data/coeff', epochs=600, factor='gamma', gpu=0, lr=0.0001, lr_epochs=150, lr_fac=0.5, output_path='./weights', root_folder='.', val=False, write_iteration=600)
46975
2022-04-14 10:42:52.898209: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2022-04-14 10:42:53.063619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: NVIDIA Tesla P40 major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:08:00.0
totalMemory: 23.88GiB freeMemory: 23.22GiB
2022-04-14 10:42:53.063673: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2022-04-14 10:42:53.419503: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-04-14 10:42:53.419554: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
2022-04-14 10:42:53.419561: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
2022-04-14 10:42:53.419681: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22532 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla P40, pci bus id: 0000:08:00.0, compute capability: 6.1)
Date: 2022-04-14 10:43:07       Epoch: [Stage 1][0/600] Loss: 23.9197.
Date: 2022-04-14 10:43:22       Epoch: [Stage 1][1/600] Loss: 21.9015.
Date: 2022-04-14 10:43:36       Epoch: [Stage 1][2/600] Loss: 19.9280.
Date: 2022-04-14 10:43:50       Epoch: [Stage 1][3/600] Loss: 17.9590.
Date: 2022-04-14 10:44:04       Epoch: [Stage 1][4/600] Loss: 15.9933.
Date: 2022-04-14 10:44:18       Epoch: [Stage 1][5/600] Loss: 14.0305.
Date: 2022-04-14 10:44:32       Epoch: [Stage 1][6/600] Loss: 12.0710.
Date: 2022-04-14 10:44:46       Epoch: [Stage 1][7/600] Loss: 10.1152.
Date: 2022-04-14 10:45:01       Epoch: [Stage 1][8/600] Loss: 8.1635.
Date: 2022-04-14 10:45:15       Epoch: [Stage 1][9/600] Loss: 6.2167.
Date: 2022-04-14 10:45:29       Epoch: [Stage 1][10/600]        Loss: 4.2754.
Date: 2022-04-14 10:45:44       Epoch: [Stage 1][11/600]        Loss: 2.3405.
Date: 2022-04-14 10:45:58       Epoch: [Stage 1][12/600]        Loss: 0.4129.
Date: 2022-04-14 10:46:12       Epoch: [Stage 1][13/600]        Loss: -1.5061.
Date: 2022-04-14 10:46:27       Epoch: [Stage 1][14/600]        Loss: -3.4153.
Date: 2022-04-14 10:46:41       Epoch: [Stage 1][15/600]        Loss: -5.3130.
Date: 2022-04-14 10:46:55       Epoch: [Stage 1][16/600]        Loss: -7.1975.
Date: 2022-04-14 10:47:10       Epoch: [Stage 1][17/600]        Loss: -9.0669.
Date: 2022-04-14 10:47:23       Epoch: [Stage 1][18/600]        Loss: -10.9186.
Date: 2022-04-14 10:47:38       Epoch: [Stage 1][19/600]        Loss: -12.7502.
Date: 2022-04-14 10:47:52       Epoch: [Stage 1][20/600]        Loss: -14.5583.
Date: 2022-04-14 10:48:07       Epoch: [Stage 1][21/600]        Loss: -16.3394.
Date: 2022-04-14 10:48:21       Epoch: [Stage 1][22/600]        Loss: -18.0895.
Date: 2022-04-14 10:48:36       Epoch: [Stage 1][23/600]        Loss: -19.8038.
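
As a side note on whether a negative value is necessarily wrong: with cross_entropy_loss=False (as in the Namespace above), the reconstruction term is presumably a Gaussian log-likelihood over continuous coefficients, i.e. a log density rather than a cross-entropy on [0, 1] data, and a log density is not bounded below by zero, so an ELBO-style loss can legitimately go negative as reconstructions improve. Below is a minimal NumPy sketch of this effect; it is not the repository's actual loss, and all names and the sigma value are illustrative assumptions.

import numpy as np

# Toy ELBO-style loss with a Gaussian reconstruction term (illustrative only).
def gaussian_nll(x, x_hat, sigma=0.1):
    # Negative log density of x under N(x_hat, sigma^2 I). Because this is a
    # log *density*, the per-dimension term log(2*pi*sigma^2) is negative for
    # small sigma, so the sum can drop below zero once reconstructions are good.
    return 0.5 * np.sum((x - x_hat) ** 2 / sigma ** 2 + np.log(2 * np.pi * sigma ** 2))

def kl_to_standard_normal(mu, log_var):
    # KL( N(mu, diag(exp(log_var))) || N(0, I) ); always >= 0.
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

rng = np.random.default_rng(0)
x = rng.standard_normal(64)                 # a continuous coefficient vector
x_hat = x + 0.01 * rng.standard_normal(64)  # a good reconstruction
mu, log_var = np.zeros(16), np.zeros(16)    # latent posterior close to the prior

print(gaussian_nll(x, x_hat) + kl_to_standard_normal(mu, log_var))  # prints a negative value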

@zhou0425

Hello, I met the same problem. How did you solve it? Thank you!

@Yitian-Li

Yitian-Li commented Oct 11, 2022 via email

@xi4444x

xi4444x commented Mar 21, 2023

Could you please share how you solved it?

@Yitian-Li

Yitian-Li commented Mar 21, 2023 via email

@xi4444x

xi4444x commented Mar 21, 2023 via email
