[WIP] Commit yaml files #768

bryanwweber · 2019-12-11T22:14:28Z

Thanks for contributing code! Please include a description of your change and
check your PR against the list below (for further questions, refer to the
contributing guide).

There is a clear use-case for this code change
The commit message has a short title & references relevant issues
Build passes (scons build & scons test) and unit tests address code coverage

Please fill in the issue number this pull request is fixing

Fixes #662, #669

Changes proposed in this pull request

Commit YAML input files into the repository

I also plan to convert all the tests that are currently using CTI/XML input files to use the equivalent YAML version, except of course tests that are testing the behavior of one of those two formats, or the conversion of them.

It occurred to me that ruamel.yaml will still be required to run the tests, so the benefit of removing the build-time dependency is less than I hoped. I still think this is the appropriate resolution of #662 and #669 though.

ischoegl · 2019-12-11T22:49:21Z

Some xml/cti occurrences are also replaced in #737 (addressing #736), which is pending resolution.

speth · 2019-12-12T01:42:10Z

I would suggest limiting the replacement of CTI/XML with YAML in other tests. CTI and XML are still fully supported for Cantera 2.5, and I'd hate to have a bug creep in because we stopped testing these formats.

If you really want to include all of the converted input files, I think this would be a good opportunity to think about which of the existing ones we should actually keep. There are quite a few that are either not used in the test suite or barely used and that I don't think are all that useful to users. A few examples are silicon.cti, silicon_carbide.cti, argon.cti, and nasa.cti (just a combination of nasa_gas and nasa_condensed, as far as I can tell)

ischoegl · 2019-12-15T14:20:52Z

If you really want to include all of the converted input files, I think this would be a good opportunity to think about which of the existing ones we should actually keep. There are quite a few that are either not used in the test suite or barely used and that I don't think are all that useful to users.

What is the benefit of no longer supporting those input files (there is next to no development effort involved)? Also, it is hard to know what people are using. speaking for myself, I have been using argon and silicon carbide, mostly when I need a phase that just acts as a cheap reservoir for heat transfer purposes. It’s a minor issue, but I wanted to put it out there.

bryanwweber · 2019-12-15T15:29:08Z

I think one primary reason is that we don't want to implicitly endorse one set of thermokinetics/transport data over another. By including the file with a Cantera distribution, we're kind of doing that. Obviously, there are several files we use in examples (nDodecane_reitz for one) that aren't generic, and GRI-3.0 seems like an historical exception, but I think in general we'd like to distribute fewer input files.

ischoegl · 2019-12-15T16:02:20Z

This is understandable. However, the message that users should use 'modern' input files is already muddled as cantera de facto endorses gri30, nDodecane_reitz, and the h2o2 portion of gri30 (for the latter, there are better h2o2 models around). Are there better ways to weaken this message than making cantera usage less convenient? An alternative would be to expand usage examples that have clear disclaimers regarding models used?

bryanwweber · 2019-12-15T17:24:33Z

Personally, I see the distinction as related to the example. If the example is associated with a published work, then the example should more than likely use the same input file as the publication (I believe this is the case for the RK phase example that introduced the Reitz file).

As for GRI-3.0, I personally give that an historical exception. At this point, everyone should know not to use GRI-3.0 for any publication-quality work, so I don't mind including it. We need to have some sort of input file for our examples, and having a somewhat realistic (if outdated) file seems like a benefit over having a trivial input file (e.g., 1-2 species, 1 reaction, etc.).

The provenance of many of the other files is questionable, to me. To my understanding, many of them were invented for the purpose of the related example and haven't been peer-reviewed, and we should probably do a better job of clarifying where that's the case. One problem I have is that many of them were added before I started contributing, so I'm not sure what's good and what isn't.

ischoegl · 2019-12-15T18:15:02Z

As for GRI-3.0, I personally give that an historical exception. At this point, everyone should know not to use GRI-3.0 for any publication-quality work, so I don't mind including it.

I am less optimistic in this regard, and believe that this needs to be pointed out especially to new users that may not be as familiar with the current state of affairs.

The provenance of many of the other files is questionable, to me. To my understanding, many of them were invented for the purpose of the related example and haven't been peer-reviewed, and we should probably do a better job of clarifying where that's the case.

This is my understanding also.

I agree that some simple, semi-realistic example input files are needed (whether or not those are argon or silicon-carbide is beside the point). (EDIT: I guess I misread, and am actually advocating for more scaled down examples). I personally hope that the NSF project will result in improvements here.

One suggestion I'd have for 2.5 is to add fields to all cantera-supplied YAML (and CTI?) files that include a disclaimer and/or clarify provenance (similar to the deprecated field that @speth suggested elsewhere). To that end, ck2yaml etc. could take an optional argument --comment that adds an entry so files do not have to be modified by hand.

ischoegl · 2019-12-15T18:40:30Z

Adding/referencing an open issue that could be laid to rest here: #339

bryanwweber · 2019-12-15T22:04:57Z

I am less optimistic in this regard, and believe that this needs to be pointed out especially to new users that may not be as familiar with the current state of affairs.

I'm not sure where the best place to do this is - include a disclaimer in every example that uses GRI-3.0? I guess another question is, does it matter? There's no way anyone could publish a paper these days using GRI-3.0. Are there other outputs from using Cantera? (That question is clearly unanswerable 😄 ) Maybe someone uses it at an engineering firm, wouldn't they validate with experimental results for a design?

I guess, it seems like adding a disclaimer for all the examples that use GRI-3.0 seems like adding quite a bit of noise 🤷‍♂

(EDIT: I guess I misread, and am actually advocating for more scaled down examples)

What do you mean by "scaled down"?

I personally hope that the NSF project will result in improvements here.

As far as I know, we're not planning any work specifically on existing examples. I'm also not sure what you mean by improvements - improvements where? If you have any ideas here, feel free to put an issue at Cantera/improvements.

One suggestion I'd have for 2.5 is to add fields to all cantera-supplied YAML (and CTI?) files that include a disclaimer and/or clarify provenance (similar to the deprecated field that @speth suggested elsewhere)

This is already done, see

cantera/data/gri30.yaml

Lines 1 to 6 in 6b9fe8a

    
           description: |- 
        
             GRI-Mech Version 3.0 7/30/99  CHEMKIN-II format 
        
             See README30 file at anonymous FTP site unix.sri.com, directory gri; 
        
             WorldWideWeb home page http://www.me.berkeley.edu/gri_mech/ or 
        
             through http://www.gri.org , under 'Basic  Research', 
        
             for additional information, contacts, and disclaimer

To that end, ck2yaml etc. could take an optional argument --comment that adds an entry so files do not have to be modified by hand.

I believe that ck2yaml currently reads any comments in the CK files and adds them to the description key. I'm not sure that adding a command line flag to read in a long text blob is a great UI, maybe it could be either a short message or a file name. PRs welcome!

Adding/referencing an open issue that could be laid to rest here: #339

I am not planning to address #339 in this pull request.

ischoegl · 2019-12-15T22:42:52Z

I'm not sure that adding a command line flag to read in a long text blob is a great UI, maybe it could be either a short message or a file name

The description field is clearly different as it pertains to what the original authors of the mechanism had to say. All I was pointing to here is to add a short official comment by cantera developers to clarify what those input files represent and/or how they should be used (how this gets implemented is secondary).

This suggestion would partially address #339, especially as the future fate of some of the distributed input files is being actively discussed at this point. I am writing this in an attempt to provide constructive suggestions. I am not planning a PR myself if this is just a shot in the dark; this is clearly a higher level decision. It’s not up to me ...

Ad scaled down Input / improvements: I don’t think that existing examples need work, but I was hoping for a more comprehensive suite of examples, some of which could be really basic ... 😉

ischoegl · 2019-12-17T17:05:40Z

@bryanwweber ... rather than adding a PR, I posted Cantera/enhancements#15

codecov · 2019-12-18T05:11:13Z

Codecov Report

Merging #768 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #768      +/-   ##
==========================================
+ Coverage   71.44%   71.44%   +<.01%     
==========================================
  Files         372      372              
  Lines       43935    43946      +11     
==========================================
+ Hits        31388    31399      +11     
  Misses      12547    12547

Impacted Files	Coverage Δ
test/general/test_containers.cpp	`100% <100%> (ø)`	⬆️
test/thermo/thermoFromYaml.cpp	`100% <100%> (ø)`	⬆️
src/thermo/ThermoFactory.cpp	`83.72% <100%> (+0.24%)`	⬆️
src/base/AnyMap.cpp	`87.5% <100%> (+0.05%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ba32337...b72a6ca. Read the comment docs.

bryanwweber · 2019-12-18T17:29:59Z

The description field is clearly different as it pertains to what the original authors of the mechanism had to say.

I don't see the description field in this way. I think it is a more general field that can contain whatever we want. Thank you for adding the enhancement issue for references though, I think that will be valuable if we can come up with a convention there.

This suggestion would partially address #339, especially as the future fate of some of the distributed input files is being actively discussed at this point.

Well, I ended up partially addressed #339 in 89157a4 at least identifying when I didn't know the source of the data.

Ad scaled down Input / improvements: I don’t think that existing examples need work, but I was hoping for a more comprehensive suite of examples, some of which could be really basic ... 😉

If you have something specific in mind (or even if not), feel free to open an issue on Cantera/enhancements. We do have the tutorials on the website which are pretty basic.

ischoegl · 2019-12-18T18:44:57Z

Well, I ended up partially addressed #339 in 89157a4 at least identifying when I didn't know the source of the data.

Looks good! Thanks for addressing.

bryanwweber · 2019-12-18T19:29:55Z

@speth @ischoegl I added a possible deprecated implementation for the YAML files in 546223a with two sample messages in argon.yaml and gri30.yaml.

I don't necessarily think it's worth adding to CTI given the impending deprecation of the whole format, plus I don't know where to add that 😂

ischoegl · 2019-12-18T19:43:17Z

@bryanwweber ... the way I see this alternative:

Pro:

This implementation is short and won't require further code modifcations.

Con:

@speth: The one advantage of approach [#737] would be that we wouldn't need to start of by introducing some deprecated phase definitions in gri30.yaml.

The gri30.yaml file cannot be generated from source via ck2yaml, which is imho a serious drawback.
The default loading behavior changes silently / without warning (which is inconsequential)

bryanwweber · 2019-12-18T19:48:28Z

@ischoegl: The gri30.yaml file cannot be generated from source via ck2yaml, which is imho a serious drawback.

Until we remove those phases after Cantera 2.5.0, i.e., for a limited amount of time. After Cantera 2.5.0, the file will be able to be generated directly from the CK input. As such, I don't see this as a drawback for this approach.

ischoegl · 2019-12-18T21:52:43Z

I'm ok with this solution (as long as the ck2yaml generation is preserved). The warning message could be more explicit about how transport models are handled in general (which was a nice feature of the warning messages of the alternative approach).

Edit: closed #737

bryanwweber · 2019-12-18T22:50:20Z

The warning message could be more explicit about how transport models are handled in general (which was a nice feature of the warning messages of the alternative approach).

Yes, these are just a message for the sake of seeing some output is generated. I'm interested in some feedback on the actual implementation for now, and I'll keep working on the messages. I was actually surprised at how easy this was to do, at least to the extent that the feature is available here.

ischoegl · 2019-12-18T23:58:43Z

I was actually surprised at how easy this was to do, at least to the extent that the feature is available here.

Yup. Seems like your solution hit a sweet spot - short and simple. Wouldn't have closed my PR otherwise!

I'm interested in some feedback on the actual implementation for now, and I'll keep working on the messages.

Sure. I'm wondering whether it wouldn't make sense to move all warnings to AnyMap? You can probably raise warnings for all kinds of other mapped objects (thermo phases, species thermo, etc.) with very little effort using the same idea.

bryanwweber · 2019-12-19T00:04:42Z

The reason I included it here is that it requires a YAML file with this field to actually function, and this is the only branch that has YAML files committed to the repo.

ischoegl · 2019-12-19T16:00:37Z

data/ohn.yaml

@@ -0,0 +1,18 @@
+description: This definition extracts the O/H/N submechanism from GRI-Mech 3.0


This is hand-copied from nasa.cti in 89157a4 (similar for many other files) ... would it make sense to add parsing of the header block and conversion to the description field to ctml2yaml? (this would also benefit users converting their own files)

Yes, I think it would be good to add this functionality... I think it should be possible in ctml2yaml, I believe comments are accessible in the tree. I'm not sure about cti2yaml, because that actually execs the input file, so I'm not sure if comments are available. I believe a docstring would be available and could be added though...

From what I recall, CTI files typically have a commented # header. The simplest way would be just to parse this block using with open(...) as cti_file: etc.?

ctml2yaml doesn't really stand a chance here of having a relevant description to copy, since the comment header in the CTI file would have been lost when running ctml_writer, and there are very few XML files with any header comment blocks, at least in the cantera repository.

I spent a little time trying to get access to the comments in a CTI file when writing cti2yaml, but couldn't find a way to get at them in a way that provided correspondence to the code itself -- they don't seem to be available even through the ast module. If we're only after the header comment to use for this field, then I think @ischoegl's suggestion is probably the best we can do.

@speth I think it's probably worth seeing how hard this would be to add to ctml2yaml, in case anyone has hand-edited XML files that include comments, at least for a top-level comment. Not a high priority, I'll add an issue at Cantera/enhancements eventually.

ischoegl · 2019-12-19T16:52:18Z

data/air.yaml

@@ -0,0 +1,177 @@
+description: |-


Following up on discussion in commit 89157a4 (which is not displayed outside of the commit):

I believe it would be preferable to add the description to the initial source air.inp instead of adding it by hand after conversion. I.e. ck2yaml will generate this field automatically (I did as much in a small addition added to #777).

It is a near duplicate of prandtl1.m

Python best practices are to use the with statement to manage opening and closing files. The Python docs recommend specifying newline='' to avoid certain problems with Windows-style carriage returns. See https://docs.python.org/3/library/csv.html#id3

Remove newlines and strip leading and trailing spaces to reformat the comment into a single line.

bryanwweber · 2020-01-05T17:23:16Z

@speth I think this is finished now (finally). I reverted most of the controversial formatting changes, keeping mostly the ones adding spaces and the version requirements to the docstring.

speth

Just a few final suggestions.

data/argon.yaml

data/lithium_ion_battery.yaml

test/general/test_containers.cpp

test/thermo/thermoFromYaml.cpp

In many cases, these are copied from the corresponding CTI file. In some cases, the source of the data is unknown and these are also noted.

Add requirements to all Python examples. Update formatting to resolve some PEP8 problems.

bryanwweber · 2020-01-07T03:16:07Z

@speth Thanks for all the comments! Updated again.

ischoegl · 2020-01-07T18:08:40Z

@bryanwweber ... it looks like the chemkin to yaml conversion is no longer used in SConstruct, but the chemkin inputs persist in the data folders. Was this by intent or is it an oversight?

bryanwweber · 2020-01-07T18:29:21Z

@ischoegl By intent, we decided to handle/discuss other input files in a separate PR to avoid extending the scope of this one even further.

ischoegl · 2020-01-07T18:40:02Z

@bryanwweber ... as #777 depends on chemkin input, I (temporarily?) added the chemkin->yaml conversion back into SConstruct.

bryanwweber · 2020-01-07T18:49:29Z

@ischoegl OK, we can discuss that in #777

bryanwweber mentioned this pull request Dec 12, 2019

Transitional handling of 'gri30_mix' and 'gri30_multi' phases in YAML #737

Closed

ischoegl mentioned this pull request Dec 17, 2019

Add reference field to YAML input files Cantera/enhancements#15

Closed

bryanwweber force-pushed the commit-yaml-files branch 2 times, most recently from 9aa49d2 to 68f3564 Compare December 18, 2019 04:43

bryanwweber force-pushed the commit-yaml-files branch from 68f3564 to 89157a4 Compare December 18, 2019 17:21

bryanwweber force-pushed the commit-yaml-files branch from 89157a4 to 546223a Compare December 18, 2019 19:25

bryanwweber mentioned this pull request Dec 18, 2019

Fix examples in samples folder #774

Closed

ischoegl reviewed Dec 19, 2019

View reviewed changes

bryanwweber added 5 commits January 4, 2020 13:18

[MATLAB] Allow None as a transport option

768a4bb

[MATLAB/Examples] Remove transport1.m example

67f97b4

It is a near duplicate of prandtl1.m

[YAML] Switch interfaces and examples to use YAML files

d26039d

[cti2yaml] Reformat species-level notes

8dc6b7b

Remove newlines and strip leading and trailing spaces to reformat the comment into a single line.

bryanwweber force-pushed the commit-yaml-files branch from 01e3491 to c747b45 Compare January 5, 2020 17:20

bryanwweber marked this pull request as ready for review January 5, 2020 17:20

speth requested changes Jan 7, 2020

View reviewed changes

data/argon.yaml Outdated Show resolved Hide resolved

data/lithium_ion_battery.yaml Outdated Show resolved Hide resolved

test/general/test_containers.cpp Outdated Show resolved Hide resolved

test/thermo/thermoFromYaml.cpp Outdated Show resolved Hide resolved

bryanwweber added 4 commits January 6, 2020 22:14

[YAML] Add descriptions to YAML files

caaa9bb

In many cases, these are copied from the corresponding CTI file. In some cases, the source of the data is unknown and these are also noted.

[Cython] Set python_requires in setup.py

2f251f4

[YAML] Add deprecated field to YAML files and phases

552e912

[Examples] Formatting changes

b72a6ca

Add requirements to all Python examples. Update formatting to resolve some PEP8 problems.

bryanwweber force-pushed the commit-yaml-files branch from c747b45 to b72a6ca Compare January 7, 2020 03:14

speth approved these changes Jan 7, 2020

View reviewed changes

speth merged commit 6ccefcb into Cantera:master Jan 7, 2020

bryanwweber mentioned this pull request Jan 7, 2020

ck2yaml and cti2yaml require Python 3, breaking builds that use system Python 2 #669

Closed

bryanwweber mentioned this pull request Jan 7, 2020

YAML format documentation Cantera/cantera-website#91

Merged

ischoegl mentioned this pull request Jan 9, 2020

ck2yaml may pick incorrect file for validation tests #788

Closed

This was referenced Jan 11, 2020

[SCons] Copy missing yaml input to LiC6_electrode sample #789

Closed

What should we do with all our input files? Cantera/enhancements#22

Closed

Read comment blocks from XML and CTI files when converting to YAML Cantera/enhancements#23

Open

bryanwweber deleted the commit-yaml-files branch January 11, 2020 20:53

bryanwweber mentioned this pull request Jan 14, 2020

Document requirements of Python examples #755

Closed

speth mentioned this pull request Jun 11, 2020

Adding citation to the chem input file in the data folder #339

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Commit yaml files #768

[WIP] Commit yaml files #768

bryanwweber commented Dec 11, 2019 •

edited

ischoegl commented Dec 11, 2019

speth commented Dec 12, 2019

ischoegl commented Dec 15, 2019

bryanwweber commented Dec 15, 2019

ischoegl commented Dec 15, 2019

bryanwweber commented Dec 15, 2019

ischoegl commented Dec 15, 2019 •

edited

ischoegl commented Dec 15, 2019 •

edited

bryanwweber commented Dec 15, 2019

ischoegl commented Dec 15, 2019

ischoegl commented Dec 17, 2019

codecov bot commented Dec 18, 2019 •

edited

bryanwweber commented Dec 18, 2019

ischoegl commented Dec 18, 2019

bryanwweber commented Dec 18, 2019 •

edited

ischoegl commented Dec 18, 2019 •

edited

bryanwweber commented Dec 18, 2019

ischoegl commented Dec 18, 2019 •

edited

bryanwweber commented Dec 18, 2019

ischoegl commented Dec 18, 2019 •

edited

bryanwweber commented Dec 19, 2019

ischoegl Dec 19, 2019 •

edited

bryanwweber Dec 19, 2019

ischoegl Dec 19, 2019

speth Dec 20, 2019

bryanwweber Dec 25, 2019

ischoegl Dec 19, 2019 •

edited

bryanwweber commented Jan 5, 2020

speth left a comment

bryanwweber commented Jan 7, 2020

ischoegl commented Jan 7, 2020

bryanwweber commented Jan 7, 2020

ischoegl commented Jan 7, 2020

bryanwweber commented Jan 7, 2020

		@@ -0,0 +1,18 @@
		description: This definition extracts the O/H/N submechanism from GRI-Mech 3.0

[WIP] Commit yaml files #768

[WIP] Commit yaml files #768

Conversation

bryanwweber commented Dec 11, 2019 • edited

ischoegl commented Dec 11, 2019

speth commented Dec 12, 2019

ischoegl commented Dec 15, 2019

bryanwweber commented Dec 15, 2019

ischoegl commented Dec 15, 2019

bryanwweber commented Dec 15, 2019

ischoegl commented Dec 15, 2019 • edited

ischoegl commented Dec 15, 2019 • edited

bryanwweber commented Dec 15, 2019

ischoegl commented Dec 15, 2019

ischoegl commented Dec 17, 2019

codecov bot commented Dec 18, 2019 • edited

Codecov Report

bryanwweber commented Dec 18, 2019

ischoegl commented Dec 18, 2019

bryanwweber commented Dec 18, 2019 • edited

ischoegl commented Dec 18, 2019 • edited

bryanwweber commented Dec 18, 2019

ischoegl commented Dec 18, 2019 • edited

bryanwweber commented Dec 18, 2019

ischoegl commented Dec 18, 2019 • edited

bryanwweber commented Dec 19, 2019

ischoegl Dec 19, 2019 • edited

Choose a reason for hiding this comment

bryanwweber Dec 19, 2019

Choose a reason for hiding this comment

ischoegl Dec 19, 2019

Choose a reason for hiding this comment

speth Dec 20, 2019

Choose a reason for hiding this comment

bryanwweber Dec 25, 2019

Choose a reason for hiding this comment

ischoegl Dec 19, 2019 • edited

Choose a reason for hiding this comment

bryanwweber commented Jan 5, 2020

speth left a comment

Choose a reason for hiding this comment

bryanwweber commented Jan 7, 2020

ischoegl commented Jan 7, 2020

bryanwweber commented Jan 7, 2020

ischoegl commented Jan 7, 2020

bryanwweber commented Jan 7, 2020

bryanwweber commented Dec 11, 2019 •

edited

ischoegl commented Dec 15, 2019 •

edited

ischoegl commented Dec 15, 2019 •

edited

codecov bot commented Dec 18, 2019 •

edited

bryanwweber commented Dec 18, 2019 •

edited

ischoegl commented Dec 18, 2019 •

edited

ischoegl commented Dec 18, 2019 •

edited

ischoegl commented Dec 18, 2019 •

edited

ischoegl Dec 19, 2019 •

edited

ischoegl Dec 19, 2019 •

edited