Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX: Improve strictness when parsing TDB FUNCTION and PARAMETER lines #308

Merged
merged 4 commits into from Mar 9, 2021

Conversation

bocklund
Copy link
Collaborator

@bocklund bocklund commented Mar 1, 2021

Checklist

  • The documentation examples have been regenerated if the Jupyter notebooks in the examples/ have changed. To regenerate the documentation examples, run jupyter nbconvert --to rst --output-dir=docs/examples examples/*.ipynb from the top level directory)
  • If any dependencies have changed, the changes are reflected in the
    • setup.py
    • environment-dev.yml

This PR makes the TDB parser more strict by requiring that PARAMETER and FUNCTION lines are terminated by an optional-suppressed reference key and a line end. This fixes behavior where the parser would silently consume all TDB content until the next line terminator (!) for these two commands once the func_expr grammar was successfully parsed.

  • FUNCTION and PARAMETER pyparsing grammars are updated to include through the line ending, if the parsed function expression and reference key is not following by a line ending, a ParseException is now raised.
  • A failing test based on unexpected behavior of G^ref when temperature range is set? #307 was created and merging this will close unexpected behavior of G^ref when temperature range is set? #307
  • Before this change, I had 522 databases that were known to be parsed successfully pycalphad. After making the change, all 522 still parsed successfully. The only TDB that did not parse correctly was the ALNIPT_TDB database in our test suite, which had references of REF: 0, which I changed to REF0 (I also verified that Thermo-Calc 2020b says that REF: 0 is an invalid reference format in GES6)

@codecov
Copy link

codecov bot commented Mar 1, 2021

Codecov Report

Merging #308 (11908cc) into develop (be40446) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #308      +/-   ##
===========================================
+ Coverage    86.77%   86.78%   +0.01%     
===========================================
  Files           46       46              
  Lines         4460     4464       +4     
===========================================
+ Hits          3870     3874       +4     
  Misses         590      590              
Impacted Files Coverage Δ
pycalphad/io/tdb.py 87.89% <ø> (ø)
pycalphad/tests/datasets.py 100.00% <ø> (ø)
pycalphad/tests/test_database.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update be40446...11908cc. Read the comment docs.

@bocklund
Copy link
Collaborator Author

bocklund commented Mar 1, 2021

The failing Ubuntu Python 3.9 test is caused by an error uploading coverage to codecov. The test suite itself passed.

@bocklund bocklund merged commit 5038950 into pycalphad:develop Mar 9, 2021
@bocklund bocklund deleted the tdb-strict-parsing branch March 9, 2021 17:43
bocklund added a commit that referenced this pull request Apr 27, 2021
This PR fixes a regression introduced in #308 where the reference key in TDB `FUNCTION` and `PARAMETER` lines was limited to alphanumeric characters. Now `:_-` characters as well to fix some local databases that fail to parse. It's unclear what other valid characters could be. Thermo-Calc gives examples in the documentation that include `REF:250`, `REF_002`, and `REF-SGTE`, so at least those are confirmed valid characters.

On my on the 522 local TDB files, this dropped the parse failures from 41 to 14. The 14 failures to parse are due to whitespace in the reference key, e.g. `PARAM ...; 6000 N REF: 0 !`, which is invalid per Thermo-Calc's database checker.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

unexpected behavior of G^ref when temperature range is set?
2 participants