Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interface for .NET languages: initial pass #1331

Merged
merged 45 commits into from Aug 17, 2022
Merged

Interface for .NET languages: initial pass #1331

merged 45 commits into from Aug 17, 2022

Conversation

burkenyo
Copy link
Contributor

@burkenyo burkenyo commented Jul 4, 2022

Changes proposed in this pull request

This PR introduces the initial pass at creating an interface for consuming Cantera from the .NET languages (C#, F#, VB.net or any other that targets the runtime). With the turn towards open-source and cross-platform that begun with .NET Core, coupled with crucial investments in the runtime and libraries to allow high-performance code, there is increasing interest in the community in using .NET for scientific computing.

This PR introduces three C# projects:

  • Cantera, containing the core types for accessing Cantera by interoperating with the C interface. This project can be built for .Net Standard 2.0 or .Net 6, which would theoretically allow consuming Cantera from any "active" .Net implementation, including the legacy Windows-only .Net Framework. In practice, most users will likely consume it from projects targeting the cross-platform .Net 6 or newer.
  • Cantera.Tests, containing the unit tests running in XUnit.Net. This project targets .Net 6.
  • Cantera.Examples, containing a console application that allows running examples. This project targets .Net 6.

In addition, this PR introduces "sourcegen", a Python-based source generator for scaffolding code for Cantera interfaces in other languages. Currently, sourcegen parses the CLib header files and sends them to a sub-package for creating C# source files. Future efforts could potentially switch from parsing header files to using a master YAML symbols file and introduce sub-packages for scaffolding source files in other languages like Julia. For the C# interface, sourcegen presently produces over 4000 lines of code, more than were written by hand for this PR.

Finally, this PR makes a few changes to the CLib files, for example to add setters that were missing. The biggest change is to make the ct_setLogWriter function accept a function pointer for a callback function that will be used for logging. This allows a language using CLib Interop to intercept and collect Cantera logging messages.

Note: the C# projects are built using the .Net CLI. It is not currently plugged into the SCons system, but that should be feasible with a future PR.

If applicable, provide an example illustrating new features this pull request is introducing

Cantera.Examples contains a EquilibriumExample which illustrates calculating the frozen and equilibrium sound speeds of a phase, modeled after the equivalent Python example. The assumption is that future PRs will add additional examples as the .Net Interface is fleshed out.

Checklist

  • The pull request includes a clear description of this code change
  • Commit messages have short titles and reference relevant issues
  • Build passes (scons build & scons test) and unit tests address code coverage
  • Style & formatting of contributed code follows contributing guidelines
  • The pull request is ready for review

@bryanwweber
Copy link
Member

Hi @burkenyo! Welcome to Cantera 👋 I understand @speth encouraged you to submit this PR, awesome! I have one suggestion and one question I hope you can help clarify. First, can you please make sure each new file you've added here ends with a blank newline? You can see the absence of this by the red circle with the line in the Files tab.

The question is about sourcegen. Is this an existing open-source package, or something that you've written from scratch? If the latter, can you add some examples of how it should be used/configured? If the former, we should include its license file and move it to the ext folder.

Thank you!

@burkenyo
Copy link
Contributor Author

burkenyo commented Jul 5, 2022

Hi @bryanwweber, Here is a readme.md I had come up for sourcegen (which I wrote from scratch), but I wasn’t sure if y’all has a preferred place to put docs, e.g. in with code or somewhere else:


sourcegen - Python based source generator for creating interfaces for other languages

A common script “parses” the CLib header files and generates intermediate objects which represent the functions:

  • return type (string)
  • name (string)
  • params: list of
    • return type
    • name

The script delegates the source generation to a language-specific module. The module shall create the source files in a location appropriate to the build for that languages build process.

The common script takes two mandatory arguments:

  1. The name of the language-specific module.
  2. The full path of the directory where generated files shall be placed.

The language-specific modules export a class named “SourceGenerator”:

  1. A constructor which takes the output directory and the parsed config object
    The constructor stores these for later use.
  2. A “generate_source” function which will takes the included file and its list of parsed functions
    The generate_source function generates source files for each file it receives as it is called
    and/or builds up data structures containing source symbols to be generated later.
  3. A “finalize” function which takes no arguments
    The finalize function performs any final code generation tasks and necessary clean up.

The implementation details for the SourceGenerator classes is of course highly dependent on the build process of the destination language.

In addition, each module can contain a yaml-based config file named “config.yaml”. The core script recoginizes a key called “ignore”, which is a map. Each key in the “ignore” map is the name of a clib header file. The value is an array which is either empty or contains strings representing function names defined in the header file. The core script interprets the ignore map as follows:

  • for each key followed by an empty array, the entire file is excluded from code generation
  • for each key followed-by a non-empty array, only the functions referenced in the array are excluded from code generation.

The config file may contain additional values for use by the language-specific module.

Usage

Source generation can be invoked in two ways:

  • Call run.py from the shell, passing in the name of the language module and the output directory as command-line arguments.
  • Import the sourcegen module and call the generate_source function directly, passing the name of the language module and the output directory as arguments.

@burkenyo
Copy link
Contributor Author

burkenyo commented Jul 6, 2022

OK! I updated all files to contain a new line at the end. I wrote a nifty script to fixup the commits, so there are no new commits.

@ischoegl
Copy link
Member

ischoegl commented Jul 11, 2022

Hi @burkenyo ... thanks for this PR - this looks really promising!

One thing I am not 100% sure about is how this relates to Cantera/enhancements#39, although as far as I can tell, it is very much aligned. In some of the other discussions, jinja was mentioned as a potential vehicle for code generation - do you have an opinion on this? (I apologize about not being very knowledgeable in this area; from what I can tell, sourcegen would be an intermediate step?).

On a separate note, your updates to clib appear highly relevant to some other ongoing work also, in particular #1182 ... if you think that this provides a universal approach to handle errors from clib, it may be worth pulling those bits out into a separate PR so this can be merged quickly?

PS: Either way, please also add yourself to the AUTHORS file ...

Copy link
Member

@speth speth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this contribution, @burkenyo. I think this will be an interesting addition to Cantera, and I think this offers some interesting steps toward realizing the goals I had for Cantera/enhancements#39, even if C# was not one of the languages I had considered. Besides the in-line suggestions / questions, I had a few high level comments as well:

  • I appreciate the chance to review this addition while it's still a relatively tractable size. I know there are a number of additional steps that will be needed to make this a fully-fledged Cantera interface. I think it would be helpful, @burkenyo, if you could create an Issue in our "Enhancements" repo (https://github.com/Cantera/enhancements/issues) outlining what you see as the next steps for this interface (e.g., extension to wrap additional classes, integration with SCons, Nuget packaging, etc.).

  • Can you add a short README.md to interfaces/dotnet? I think this would be the place to describe:

    • What the requirements are to build the package
    • The steps to build the package and run the examples. (I guess dotnet build and dotnet run are mostly obvious for existing C# users, but it doesn't hurt to be explicit, and there are some additional requirements for what the user needs to do with the Cantera shared library)
    • What you need to do to be able to use the project, say to get autocompletion within VS Code
  • I'd love to see some more comments that explain what's going on in sourcegen, in particular csharp/_SourceGenerator.py. This would be make it easier to extend for generating additional language interfaces.

  • I think it would be nice to have the tests run as part of GH Actions (see .github/workflows/main.yml), at least on some platforms, even before worrying about integrating this with the SCons build process.

@ischoegl: Yes, I had initially envisioned using Jinja handle the templates needed for code generation, as mentioned in Cantera/enhancements#39. However, looking at what csharp/_SourceGenerator.py does with just f-strings, I'm not sure we really need much more, and there's definitely some simplicity offered by having the templating being done right inline. Either way, I don't think migrating this to Jinja would be that difficult if we thought that was warranted later.

include/cantera/base/ExternalLogger.h Outdated Show resolved Hide resolved
interfaces/dotnet/Cantera.Examples/src/Program.cs Outdated Show resolved Hide resolved
include/cantera/base/ExternalLogger.h Show resolved Hide resolved
interfaces/dotnet/Cantera/src/SpeciesCollection.cs Outdated Show resolved Hide resolved
interfaces/dotnet/Cantera.Examples/Cantera.Examples.csproj Outdated Show resolved Hide resolved
interfaces/dotnet/Cantera/src/CanteraException.cs Outdated Show resolved Hide resolved
interfaces/dotnet/Cantera.sln Show resolved Hide resolved
@burkenyo burkenyo changed the title Interface for .NET languages Interface for .NET languages: initial pass Jul 17, 2022
@burkenyo
Copy link
Contributor Author

@speth, here ya go: Cantera/enhancements#156

@burkenyo
Copy link
Contributor Author

burkenyo commented Jul 19, 2022

@speth here is a script you can use to detect long lines that were edited within a particular range of commits. I made it to find lines longer than 88 chars per your comment above. The script is written in Powershell (Core) (because forget trying to manipulate data structures in bash) and it automates some git command to gather the info.

https://gist.github.com/burkenyo/b8a78569b55d4d955f9aefe24f4b15c2

@burkenyo
Copy link
Contributor Author

HI @speth @bryanwweber @ischoegl,

Thank you for your feedback and comments. I've addressed almost everything and hope to push soon. I'm avoiding pushing to my dotnet-interface branch for now so I don't trigger so much CI churn.

Copy link
Member

@bryanwweber bryanwweber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @burkenyo! Thanks for this submission 😄 I don't know anything about C#, so I focused my review on the sourcegen package. I think that will be very useful, and it's a neat implementation. I have a few suggestions and questions below. I started by reading the README.md, so the comments on that file are made without context of reading the code. The questions I have are as though I were a new developer trying to create a new interface, and I think some further explanation is needed to onboard someone to be able to generate a new interface.

In particular, I think the data structure that should be generated by generate_source should be explained, and the purpose of finalize should be explained. For the former, if the data structure depends on the language, then I think an explanation of the data structure of the clib functions as provided to generate_source is required.

interfaces/sourcegen/README.md Outdated Show resolved Hide resolved
interfaces/sourcegen/README.md Outdated Show resolved Hide resolved
interfaces/sourcegen/README.md Outdated Show resolved Hide resolved
interfaces/sourcegen/README.md Outdated Show resolved Hide resolved
interfaces/sourcegen/README.md Show resolved Hide resolved
interfaces/sourcegen/sourcegen/csharp/_SourceGenerator.py Outdated Show resolved Hide resolved
interfaces/sourcegen/sourcegen/csharp/config.yaml Outdated Show resolved Hide resolved
@@ -0,0 +1,5 @@
This file is part of Cantera. See License.txt in the top-level directory or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this needs to be its own file, having a string constant in sourcegen/__init__.py or something like that would be fine, which could still be returned by get_preamble.

interfaces/sourcegen/sourcegen/_ochestrate.py Outdated Show resolved Hide resolved
interfaces/sourcegen/sourcegen/_ochestrate.py Outdated Show resolved Hide resolved
Copy link
Member

@speth speth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates, @burkenyo. Besides a couple of dangling issues from my previous review (that GitHub really likes to hide...), most of these are just comments on spelling in the comments.

CONTRIBUTING.md Outdated Show resolved Hide resolved
interfaces/dotnet/README.md Outdated Show resolved Hide resolved
interfaces/dotnet/README.md Outdated Show resolved Hide resolved
interfaces/dotnet/README.md Outdated Show resolved Hide resolved
interfaces/dotnet/README.md Outdated Show resolved Hide resolved
interfaces/dotnet/Cantera/src/Exceptions.cs Show resolved Hide resolved
interfaces/dotnet/Cantera/src/Exceptions.cs Outdated Show resolved Hide resolved
interfaces/dotnet/Cantera/src/SpeciesCollection.cs Outdated Show resolved Hide resolved
interfaces/dotnet/Cantera/src/SpeciesCollection.cs Outdated Show resolved Hide resolved
@speth
Copy link
Member

speth commented Jul 27, 2022

I noticed that several of the files added here ended up with CRLF line endings. Can you replace the line endings in these files? I saw it in interfaces/dotnet/examples/Application/Application.csproj, interfaces/dotnet/examples/Application/Program.cs, interfaces/dotnet/examples/SoundSpeed/Program.cs, interfaces/dotnet/examples/SoundSpeed/SoundSpeed.csproj and interfaces/dotnet/Cantera.Tests/src/ApplicationTest.cs, though this list might be incomplete.

@burkenyo
Copy link
Contributor Author

burkenyo commented Aug 5, 2022

@bryanwweber Thanks for your comments. Will take a look in the next couple days.

@burkenyo
Copy link
Contributor Author

@bryanwweber I've completely redone the sourcegen stuff. Take a look! (readme.md updates still in progress)

@bryanwweber
Copy link
Member

@bryanwweber I've completely redone the sourcegen stuff. Take a look! (readme.md updates still in progress)

Thanks! I'll take a look soon, hopefully this weekend 😊

@burkenyo
Copy link
Contributor Author

burkenyo commented Aug 14, 2022

@speth I fixed the failing test (clib_test) by fixing up the commit where I removed the extra new line from the thermo report (Correct flushing logic in ExternalLogger).

@burkenyo
Copy link
Contributor Author

@speth I removed all \rs. They seem to be created as an artifact of running dotnet new, likely do to its originating on Windows, but are irrelevant to the C#/dotnet ecosystem.

Copy link
Member

@speth speth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these updates, @burkenyo. I only have a couple minor cleanup suggestions and one bit of additional documentation that I'd like before I think this is ready to be merged. I think all the development that's needed to make this a full-featured interface can be handled in subsequent PRs.

interfaces/sourcegen/sourcegen/_helpers.py Outdated Show resolved Hide resolved
interfaces/sourcegen/sourcegen/_types.py Outdated Show resolved Hide resolved
interfaces/sourcegen/sourcegen/_types.py Outdated Show resolved Hide resolved
interfaces/dotnet/Cantera/src/ThermoPhase.cs Outdated Show resolved Hide resolved
interfaces/dotnet/Cantera/src/ThermoPhase.cs Outdated Show resolved Hide resolved
interfaces/dotnet/examples/SoundSpeed/Program.cs Outdated Show resolved Hide resolved
Comment on lines 36 to 40
### Status

The .NET Interface is in preview and still missing many features
needed for parity with the C++ and Python interfaces. Interested contributors can see
the status of the project at [Enhancement #156](https://github.com/Cantera/enhancements/issues/156).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until this is more integrated into the overall SCons-driven build process, it would be helpful to describe how to actually build the interface and run the tests / samples in the interim. I guess dotnet build is fairly straightforward, but for now, to run dotnet test you need to:

  • build Cantera normally with SCons
  • link build/lib/libcantera_shared.3.0.0.??? to build/lib/libcantera.3.0.0.??? where ??? is the OS-dependent suffix
  • run DYLD_LIBRARY_PATH=/full/path/to/build/lib dotnet test (on macOS), or substitute that with just LD_LIBRARY_PATH on Linux or do ??? on Windows.

For the samples, I'm not sure exactly what you're supposed to do.

@bryanwweber
Copy link
Member

Thanks for all the updates @burkenyo ! I'm not going to be able to review this in a timely manner, so I'll dismiss my review and let @speth and @ischoegl take it from here

@ischoegl ischoegl self-requested a review August 16, 2022 22:57
Copy link
Member

@ischoegl ischoegl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @burkenyo ... thanks for all your work on this; I believe this to be a very valuable contribution. I don't have any input on the implementation itself (I believe the PR has been carefully reviewed and you promptly addressed everything), but there are some questions I hope you can answer:

  1. Following up on @speth's comment: what is the best way to install the .NET interface for a given OS? It would help to see what the generated interface looks like.
  2. What is the plan for documentation of the interface? At the moment, we provide documentation for the 'conventional' interfaces, i.e. C++, Python and Matlab (see website) - one of the hiccups I see is that clib is undocumented, but I assume that sourcegen can be amended to produce documentation also?
  3. I assume that integration into the SCons build system won't be an issue; same about automated CI testing on GitHub
  4. What do you anticipate in terms of packaging of pre-compiled code - I believe that conda has a dotnet package, so would there be a plan to build a cantera-dotnet package? Are there other packaging systems? NuGet?
  5. How do you envision that this interface will be used? If the preferred environment is Visual Studio, should there be .sln files for examples?

Despite my questions (which should probably be answered on Cantera/enhancements#156), I don't have any objections against this PR being merged.

interfaces/dotnet/examples/.template.config/template.json Outdated Show resolved Hide resolved
@burkenyo
Copy link
Contributor Author

@ischoegl Thanks for the call-out; I've fixed-up the commits to ensure all files have a new line at the end. (Will also explain shortly about other question re: csproj and build system.)

@bryanwweber bryanwweber dismissed their stale review August 17, 2022 00:24

Changes addressed, not enough time to re-review

@burkenyo
Copy link
Contributor Author

Hi again @ischoegl, as soon as I get this PR resolved, my next stage will be to flush out Cantera/enhancements#156.
But to briefly answer your questions:

What is the best way to install the .NET interface for a given OS?

As you noticed, there is a conda package that has the .NET SDK. This would provide what one needs to compile and test the C# code. Additionally, you can install the .NET SDK directly from Microsoft. I believe there are APT and homebrew packages as well (these have the added benefit of not having any MS stuff in there).

What is the plan for documentation of the interface?

C# has first-class support for doc-comments using the /// syntax, which is compatible with doxygen to generate documentation websites about the actual API. Bonus, if you have a C# plugin in your text editor, you'll see the docs when you hover over symbols in code. I've ensured that all significant public APIs in this PR have been marked-up appropriately. In addition, the examples will allow the user to see the APIs in action.

I assume that integration into the SCons build system won't be an issue

I haven't quite figured out how to wire this stuff into Scons. I do think we want to use the excellent open-source build tools that are part of the .NET SDK to handle the actual dependency resolution, compiling, packaging, and test-running. That said, I think we want to write a custom builder or use the generic run-a-command builder to orchestrate all these calls to the dotnet CLI as part of the large build and CI testing.

What do you anticipate in terms of packaging of pre-compiled code

I was envisioning two NuGet packages as the primary means of distribution. One will be the primary Cantera package, which will include all the native libraries needed as well as the compiled "assembly" (as it is called in .NET land – as a jitted ecosystem, the assembly contains all the intermediate bytecode which the runtime compiles on the fly to native code). A .NET assembly can be consumed from any .NET-compatible language, including not just C# but F# and others as well. The second package will include the examples as project templates. This allows the user to run the command dotnet new ct-examples to create a directory of "projects", one for each example, and the use dotnet run --project X to run example X directly.

How do you envision that this interface will be used? If the preferred environment is Visual Studio, should there be .sln files for examples?

I don't anticipate the user needing Visual Studio at all. All they should need is the open-source dotnet CLI and their favorite text editor. Many popular editors such as VS Code and Sublime feature plugins backed by OmniSharp (also 100% open-source), so they can get all the goodies such as code completion and error highlighting. However, since the open-source tooling does understand a .sln file as a grouping of related projects, you are probably right that it is a good idea to include one in the examples.

@ischoegl
Copy link
Member

@burkenyo … thank you for your clarifications! I am looking forward to seeing this developed into a full-fledged Cantera interface!

Uses an MSBuild task and a Python script to generate C# P/Invoke calls and SafeHandles from the Cantera CLib header files.
The script is written in Python because it's already part of the build environment.
Instead of using heap-allocated arrays that require garbage collection to pass the values, create a (stack-allocated) tuple on the fly and pass a pointer to it to the CLib function.
Uncaught exceptions thrown in a managed method called-backed from the Cantera library will cause the process to crash. This change provides a means of registering an exception that is caught in a call-back so it can be thrown when emerging completely from the interop layer.
Each example is now its own project. The projects will be packaged together as a dotnet CLI template.
This brings the C# code in line with the standards spelled out in CONTRIBUTING.MD
Separated method into a simple null check which should be inlined and a separate call to a local method which should handle the throwing (throws are never inlined in C#).
* Replaced all instances of file I/O and path manipulation with instances of the pathlib.Path class
* Removed redundant parentheses from generators, and replaced some generators entirely with the map() function.
* Renamed some ambiguous local variables
* Fixed some typos
* Changed to single invocation of generate_source(), and removed
  finalize().
* Split up ignore functionality into ingore_files and ignore_functions.
* Added HeaderFile dataclass to represent a parsed CLib header file.
* Created a with_unpack_iter decorator to provide unpacking capabilities
  to dataclasses, replacing unpack helper function.
* Refactored the C# SourceGenerator to fit new API and greatly increase
  readability and maintainability.
* Added a typed Config class.
* Made some names more readable.
* Reworked file structure slightly: only strict record types in _dataclasses.py
Restores original definition of ct_setLogWriter by introducing ct_setLogCallback.
* Add a solution file
* Update author in template.json
We’re able to use the same basic native library name (“cantera_shared”) on all systems and copy that native library and example data files directly from the upstream SCons build output and data directory.
These comments illuminate the necessity of the various testing dependencies.
@burkenyo
Copy link
Contributor Author

I believe I have now addressed all comments. @speth Ready for final review.

Copy link
Member

@speth speth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @burkenyo. This looks good to me.

Glad to see the updates to make it easy to run the tests and examples in-place, even before it's integrated better with SCons.

@speth speth merged commit 2138987 into Cantera:main Aug 17, 2022
speth pushed a commit that referenced this pull request Aug 17, 2022
#1331
* Ensures lines in source files are a maximum of 88 characters
* Ensures all if statements use blocks
* Trims spurious end-of-line whitespace
* Adds licensing preamble to source files
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants