NaN loss for graph classification #5

Closed
chenz97 opened this issue Jun 18, 2020 · 2 comments


chenz97 commented Jun 18, 2020

Hi, thanks for the code. When I ran the graph classification code by make test_phase=1 save_dir=save data_type='S5CONS', I found that the loss is always NaN both for training and validation, and the performance figures didn't change. Do you have any idea why this might be happening? Thank you!

chenz97 closed this as completed Jun 26, 2020
ilibarra commented

Dear dmis-lab team,

Thank you for providing the code and environment to run your hierarchical graph model. I am getting the same NaN loss errors as described above. Since I could not find a description of how to solve it, I am providing more details here.

Currently using numpy 1.11.0, tensorflow 1.19.5, and CPUs.

These are the current outputs:

 EPOCH 1 EVAL ALL
loss : nan accuracy : 0.4082 hit ratio : 0.5714 pred_rate : [1.0, 0.0, 0.0] macro f1 : 0.1932 micro f1 : 0.4082 expected return : -0.0009
03/27/2021 02:29:09 PM: EPOCH 11 TRAIN ALL 
loss : nan accuracy : 0.3177 hit ratio : 0.4822 pred_rate : [1.0, 0.0, 0.0] macro f1 : 0.1607 micro f1 : 0.3177 expected return : 0.0001
03/27/2021 02:29:09 PM:

Is there a conda environment or requirements.txt file that could ensure this behavior does not occur, so that I can avoid checking packages one by one?

Thank you for any input.
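In case it is useful for comparison, here is a minimal snippet (my own, not something shipped with the repository) that prints the versions actually loaded at runtime, so that environment reports stay consistent:

# Print the versions actually imported at runtime; this makes no assumption
# about the repository, it only standardizes the environment report.
import sys
import numpy as np
import tensorflow as tf

print("python    ", sys.version.split()[0])
print("numpy     ", np.__version__)
print("tensorflow", tf.__version__)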


ilibarra commented Mar 27, 2021

It seems that prob is NaN during the first and all subsequent epochs, even though x, y, and the labels are properly defined, and the loss is then NaN as well.

Is there any way to troubleshoot this, or to check whether the output probabilities are initialized as zero? Thanks.

In graph_trainer.py, line 69:

_, loss, pred, prob = self.sess.run([self.model.train_step, self.model.cross_entropy,
                                     self.model.prediction, self.model.prob],
                                    feed_dict=feed_dict)
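
For what it is worth, one generic way a hand-written cross-entropy produces exactly this symptom (NaN from the first step, regardless of the inputs) is taking the log of a probability that is exactly zero. I have not checked how this repository defines its loss, so the following TensorFlow 1.x sketch only illustrates that failure mode and the usual clipping workaround, not the model's actual code:

import tensorflow as tf  # TensorFlow 1.x, matching the environment above

# Hypothetical softmax output saturated to an exact 0/1 split:
# log(0) = -inf and 0 * -inf = nan, so the summed loss becomes nan.
prob  = tf.constant([[0.0, 1.0, 0.0]], dtype=tf.float32)
label = tf.constant([[1.0, 0.0, 0.0]], dtype=tf.float32)

naive_ce   = -tf.reduce_sum(label * tf.log(prob), axis=1)
clipped_ce = -tf.reduce_sum(label * tf.log(tf.clip_by_value(prob, 1e-10, 1.0)), axis=1)

with tf.Session() as sess:
    print(sess.run(naive_ce))    # [nan]
    print(sess.run(clipped_ce))  # [23.025851]

Checking the arrays returned by the sess.run call above with np.isnan(prob).any() and np.isnan(loss) would at least confirm whether the NaN originates in prob or only appears in the loss.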
