Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve no-key error message in ase target parser #242

Merged
merged 2 commits into from
Jun 12, 2024
Merged

Conversation

PicoCentauri
Copy link
Contributor

@PicoCentauri PicoCentauri commented Jun 10, 2024

Fixes #194

The default ase error message of unknown keys is very unhelpful for us. I improved the error message for ase readers. With these options

architecture:
  name: experimental.soap_bpnn
  training:
    batch_size: 2
    num_epochs: 1

training_set:
  systems:
    read_from: qm9_reduced_100.xyz
    length_unit: angstrom
  targets:
    energy:
      key: hello
      unit: eV

The error is

(tests) (base) philiploche@Philips-MacBook-Air:~/repos/lab-cosmo/metatrain/tmp$ mtt train options.yaml 
[2024-06-10 17:09:00][INFO] - This log is also available in 'outputs/2024-06-10/17-09-00/train.log'.
[2024-06-10 17:09:00][INFO] - Random seed of this run is 1038472122
[2024-06-10 17:09:00][INFO] - Setting up training set
"energy key 'hello' was not found in system 'qm9_reduced_100.xyz' at index 0"

Contributor (creator of pull-request) checklist

  • Tests updated (for new features and bugfixes)?
  • Documentation updated (for new features)?
  • Issue referenced (for PRs that solve an issue)?

📚 Documentation preview 📚: https://metatrain--242.org.readthedocs.build/en/242/

@frostedoyster
Copy link
Collaborator

frostedoyster commented Jun 10, 2024

I don't think this is what we want for #194. I constantly see very bad error messages when not using --debug. It's not just the ASE reader. The issue is that the error wrapper that we have cuts off all the stack trace and shows only the last line (or few lines). These changes are good, but they can't fix the underlying issue. I can show more examples of terrible errors if needed

@PicoCentauri
Copy link
Contributor Author

Can you give another example?

This PR is improving the error message you saw. What would you like to have for example for this error?

@frostedoyster
Copy link
Collaborator

frostedoyster commented Jun 10, 2024

Sure:

$ mtt train options.yaml
[2024-06-10 20:48:47][INFO] - This log is also available in 'outputs/2024-06-10/20-48-47/train.log'.
[2024-06-10 20:48:48][INFO] - Random seed of this run is 2490997340
[2024-06-10 20:48:48][INFO] - Setting up training set
[2024-06-10 20:48:48][WARNING] - No Forces found in section 'energy'. Continue without forces!
[2024-06-10 20:48:48][WARNING] - No Stress found in section 'energy'. Continue without stress!
[2024-06-10 20:48:48][INFO] - Setting up test set
[2024-06-10 20:48:48][INFO] - Setting up validation set
[2024-06-10 20:48:48][INFO] - Setting up model
The error below most likely originates from an architecture. If you think this is a bug, please contact its maintainer (see the architecture's documentation).

'NoneType' object does not support item assignment
mtt train options.yaml
[2024-06-10 20:52:07][INFO] - This log is also available in 'outputs/2024-06-10/20-52-07/train.log'.
[2024-06-10 20:52:08][INFO] - Random seed of this run is 2960232964
[2024-06-10 20:52:08][INFO] - Setting up training set
Interpolation key 'systems.read_from' not found
    full_key: [0].targets.energy.read_from
    object_type=dict

The point is that IMO cutting out the stack trace is a bad choice, at least at this point in the development of metatrain. Sure, if we become famous with thousands of users and we have 20 developers, then we can have a custom 1-line error for everything. But for now, I would keep the stack trace

@PicoCentauri
Copy link
Contributor Author

PicoCentauri commented Jun 10, 2024

I mean this looks like some issues inside the bpnn.

I disagree. If you are not a developer giving a traceback doesn't really help. I mean how often do you read the traceback of a compiler error and you have no idea what is going on.

But, I am open to discuss.

@frostedoyster
Copy link
Collaborator

frostedoyster commented Jun 11, 2024

$ mtt train options.yaml 
[2024-06-11 07:37:51][INFO] - This log is also available in 'outputs/2024-06-11/07-37-51/train.log'.
'<' not supported between instances of 'str' and 'int'

Why don't we just remove the custom handling of errors for now? These errors are crap

@Luthaf
Copy link
Member

Luthaf commented Jun 11, 2024

We could take inspiration on sphinx, and save the whole tracback to a file, and show the path to the file in the user-facing error message. Something like

Unexpected error occurred. This most likely originates from the 'XXX' architecture. 

If you think this is a bug, please contact its maintainer (see the architecture's documentation), 
and include the full log at /tmp/nqfiouabspvhqbfod/error.log

We could also try to include the last line of the error, although this won't be helpful most of the time as we see

@PicoCentauri
Copy link
Contributor Author

Also a good idea. However, we already have log file which contains the same as the stdout/stderr and opening another might be too much? We can however to an output log and an error log, where the latter contains the error including the traceback...

After talking to @frostedoyster, I suggest the following

            logging.error(
                f"{traceback.format_exc()}\n"
                "If the error message below is unclear, please help us improve it by "
                "opening an issue at https://github.com/lab-cosmo/metatrain/issues. "
                "Thank you!\n\n"
                f"{err}")

This will log the error message to stdout and file on the bottom which is directly visible to the user; Above an info helping us improving the message and above the traceback. This would look like for the current issue.

[2024-06-11 13:46:04][INFO] - This log is also available in 'outputs/2024-06-11/13-46-04/train.log'.
[2024-06-11 13:46:04][INFO] - Random seed of this run is 951776256
[2024-06-11 13:46:04][INFO] - Setting up training set
[2024-06-11 13:46:05][ERROR] - Traceback (most recent call last):
  File "/Users/philiploche/repos/lab-cosmo/metatrain/.tox/tests/lib/python3.12/site-packages/metatrain/__main__.py", line 119, in main
    train_model(**args.__dict__)
  File "/Users/philiploche/repos/lab-cosmo/metatrain/.tox/tests/lib/python3.12/site-packages/metatrain/cli/train.py", line 189, in train_model
    train_targets, target_info_dictionary = read_targets(
                                            ^^^^^^^^^^^^^
  File "/Users/philiploche/repos/lab-cosmo/metatrain/.tox/tests/lib/python3.12/site-packages/metatrain/utils/data/readers/readers.py", line 192, in read_targets
    blocks = read_energy(
             ^^^^^^^^^^^^
  File "/Users/philiploche/repos/lab-cosmo/metatrain/.tox/tests/lib/python3.12/site-packages/metatrain/utils/data/readers/readers.py", line 51, in read_energy
    return _base_reader(
           ^^^^^^^^^^^^^
  File "/Users/philiploche/repos/lab-cosmo/metatrain/.tox/tests/lib/python3.12/site-packages/metatrain/utils/data/readers/readers.py", line 33, in _base_reader
    return reader(filename, dtype=dtype, **reader_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/philiploche/repos/lab-cosmo/metatrain/.tox/tests/lib/python3.12/site-packages/metatrain/utils/data/readers/targets/ase.py", line 30, in read_energy_ase
    raise KeyError(
KeyError: "energy key 'hello' was not found in system 'qm9_reduced_100.xyz' at index 0"

If the error message below is unclear, please help us improve it by opening an issue at https://github.com/lab-cosmo/metatrain/issues.Thank you!

"energy key 'hello' was not found in system 'qm9_reduced_100.xyz' at index 0"

What do you think?

@frostedoyster
Copy link
Collaborator

Ok, I would save the stack trace to a file (very good idea) and remove it from the logs, where we can have the last line + the suggestion of "if this is unclear, please report it, help improve it and include the full error at <error_file>"

@PicoCentauri
Copy link
Contributor Author

Thanks for your feedback. The current output is

$ mtt train foo
(tests) (base) philiploche@tsf-444-wpa-0-067:~/repos/lab-cosmo/metatrain$ mtt train foo
[2024-06-12 16:08:01][INFO] - This log is also available at '/Users/philiploche/repos/lab-cosmo/metatrain/outputs/2024-06-12/16-08-01/train.log'.
[2024-06-12 16:08:01][ERROR] - If the error message below is unclear, please help us improve it by opening an issue at https://github.com/lab-cosmo/metatrain/issues.
When opening the issue, please include the full traceback log from '/Users/philiploche/repos/lab-cosmo/metatrain/outputs/2024-06-12/16-08-01/error.log'. Thank you!

[Errno 2] No such file or directory: '/Users/philiploche/repos/lab-cosmo/metatrain/foo'

the error.log file contains the full traceback.

@Luthaf had a good idea adding the architecture_name to the architecture error. I will open an issue and do this in another PR.

@PicoCentauri PicoCentauri force-pushed the key-error branch 3 times, most recently from a4134f1 to 44e8732 Compare June 12, 2024 15:02
Copy link
Collaborator

@frostedoyster frostedoyster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you very much @PicoCentauri

@frostedoyster frostedoyster merged commit 8586497 into main Jun 12, 2024
13 checks passed
@frostedoyster frostedoyster deleted the key-error branch June 12, 2024 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve no-debug option / Improve error messages
4 participants