Interoperability between EGSnrc and PyMedPhys #399
Yup :) sounds awesome :) |
From what I can tell there is no easy, direct path from Fortran to WebAssembly. It might be possible if flang compiler support aligns with the particular flavour of Fortran the EGSnrc Mortran system emits. According to this it looks like only Fortran 2008 is supported. If instead Fortran 77 is emitted, which I suspect is the case, then it might be possible to rely on f2c to transform like:
Gfortran offers more robust Fortran language support, but I don't recall ever hearing about bytecode support. Actually I think the GCC developers intentionally resist it because it can enable some forms of licence circumvention, though this may have changed once LLVM appeared. Even if the above pathway is viable, a faithful port might still be impossible. There are serious limitations in WebAssembly at the moment, especially around floating-point optimizations (which might be necessary) and memory management. Signalling NaNs, for example, will be difficult to provide reliably. This will almost certainly be a very bumpy road. |
Pinging @darcymason. He had some ideas when we were chatting about this on the 19th. |
Here are some related issues that might be able to help with ideas: |
And here is the pr that did it: |
Some continued improvement is also going on at: |
@rtownson, this is where we'll be tracking EGSnrc to the browser efforts for a bit. See how we go ay. |
It does look like EGSnrc's mortran does emit f77: |
I just had some theories, even assumptions, I would say ... that since scipy ships Fortran libraries, there shouldn't be any fundamental limitation to mortran->fortran->...->webassembly working. @SimonBiggs's last comment seems hopeful in that regard. In any case, you guys are deeper into this than me. Watching with interest... |
@mchamberland I think we might gain the most initially by making it so that EGSnrc can be bundled into a Python Wheel. @scaramallion used https://github.com/joerick/cibuildwheel to make wheels for a range of OSes (pydicom/pylibjpeg-libjpeg#3 (comment)) for his library https://github.com/pydicom/pylibjpeg-libjpeg. I think it would be awesome to have an EGSnrc Python binding API as well as have it wheel installable from pip. Down the road we could consider the webassembly route... but for now... I think we can have some huge wins just by making EGSnrc more accessible from Python. @mchamberland might this be something you'd be keen on helping me achieve? |
The EGSnrc team is motivated to remove Mortran from the code base entirely for a few reasons. I had proposed replacing Mortran with a mix of Make and Python (where it makes sense). A proposed alternative would be rewriting the entirety of EGSnrc in rust. If people feel like contributing down these roads for EGSnrc that might be more helpful than keeping the legacy anchor that is mortran around. |
Oh wow, rewriting egs in rust. That is a very promising (and ambitious)
option. (It would also give webassembly for free...)
How many are interested in undergoing such an effort?
|
...a big downside of course is that one loses the rich verified history that egs has. I know personally I would much rather run my simulations on a tried and true code base. ...see, even when one uses scipy, we are still using Fortran. There is value in using old code bases, ones that have a deep and long history of being verified. But, if there is enough automated testing of the prior code base, and it can be verified that the new Rust code is equivalent to the old code, then I would be willing to make the jump. |
True, very true, and it's a deep conversation to be had. If there were two years of funding for such a project I'd take it on. |
Either way, it should be possible to distribute the current egs as Python wheels. And if that is done, there are already huge usability wins for the community. At the end of the day, if someone removes mortran, or rewrites it into rust, it still doesn't remove the need in the community for an easy to install python wheel package for egs. And to create that package one need only set up a compile CI for each OS. ... Big wins with small effort :) it is always great if a big project is full of really small ones that can be successful in their own right. |
Yes, we have been discussing an EGSnrc rewrite over the last year (well, the last 10 years really...). For the history buffs, there is a revealing quote in one of the EGS4 files: "And yes, Mortran MUST DIE but we have to put something as powerful in its place. Any volunteers?" That was in... 1991!! I reckon the call to action stands, almost 30 years later. |
We have been looking into rewriting EGSnrc in C++ and also in Rust, and @mxxo is looking into that already. Any comments and suggestions welcome, technical or philosophical! Once we embark on this in earnest, we will announce it here and create a project branch where everyone will be welcome to contribute. |
I am quite keen to put my +1 vote next to choosing Rust over C++. I would also certainly be interested in lending a hand. |
@SimonBiggs you are right, there is a lot of value in a vetted code. But the way the code is designed is rigid (global variables, monolithic code, Fortran 77 backend, etc.) and fragile (for much the same reasons) in terms of development. Most of the vetting value resides in the core physics routines, and any new implementation of these can be vetted by brute force against the legacy code as a reference. I think it is worth a shot! My top priorities would be:
|
The task may seem daunting (it did to me until a couple years ago), but in reality the radiation transport problem EGSnrc is solving is conceptually simple once the core physics and transport mechanics is resolved (it is, thanks 🙏 to its authors!). In the end, it is just a collection of particle objects with a few variables (r, u, E) that scatter (and create secondaries of course) via a relatively small set of functions. A lot of the EGSnrc code has to do with arcane methods for input, output, and general data processing, much of which can be done much more easily today (especially if standard libraries are invoked, for example, although we want to limit external dependencies). |
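To make that description concrete, the "collection of particle objects with a few variables (r, u, E)" might be sketched in Python roughly as follows; the `Particle` class and `step` function here are purely illustrative, not part of any EGSnrc API.

```python
from dataclasses import dataclass

@dataclass
class Particle:
    """Minimal particle state: position r, direction u, energy E (illustrative)."""
    r: tuple  # (x, y, z) position
    u: tuple  # (ux, uy, uz) unit direction
    E: float  # energy

def step(p: Particle, distance: float) -> Particle:
    """Advance a particle along its current direction; a stand-in for one transport step."""
    x, y, z = p.r
    ux, uy, uz = p.u
    return Particle((x + distance * ux, y + distance * uy, z + distance * uz), p.u, p.E)

# Example: a 1 MeV particle moving along +z, advanced by 2.5 length units
p = Particle((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0)
p = step(p, 2.5)
print(p.r)  # (0.0, 0.0, 2.5)
```

Scattering and secondary creation would then be a small set of functions acting on such objects, much as the comment above describes.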
@SimonBiggs I record your Rust vote! It is not an easy sell (I am still not 100% sold on this choice myself!) |
Also, if we truly are still in the brainstorming stage, can I extend horizons a little? If the choice of C++/Rust is for speed... don't discount Python (sounds crazy, I know), but there are numerical libraries built to be compiled for GPUs. This one, for example, is built by DeepMind + Google with machine learning in mind: They are certainly aiming for speed, and they achieve it in both development time and run time... If all options are on the table, personally, I would prefer to retract my Rust vote and instead vote for a combination that leverages where a huge amount of computation speed-up research + resources are currently being focused. |
@ftessier +1 for (modern) C++. Rust is interesting and exciting, but the long term is hard to predict. C++ will definitely be around for at least a few more decades. |
@hdclark this is why we are hesitant to go "full Rust", and will for the moment try it out with actual code (philosophical discussions can only get you so far...). I have been watching the Rust project since it became more serious in 2018, and I have to say I am enthralled by their governance, their commitment to code efficiency, and their long-term stability (a horizon of 40 years). The core mission of Rust, to make programming memory safe (and thread safe, by the same token), is really appealing. I would not bet on C++ being around (meaning mainstream) for decades, given the effervescence today around functional and safer programming, and with web applications having already overtaken the software industry. It is rapidly becoming the dinosaur in the room. I am not against C++ on principle though, I mean, we are still using Mortran 😂 |
@SimonBiggs I am indeed not considering Python for the EGSnrc backend, perhaps naively. I will look at jax, I was not aware of such a project! I love Python for all my scripting needs, but my understanding of its performance for numerical computation is limited to the general idea that, as an interpreted language, it must be order(s) of magnitude slower than C. I am not at this point ready to commit to the effort of rewriting core EGSnrc routines in Python, but I sure would like to see an actual comparison! 😄 No worries about your vote retraction; I find this discussion stimulating, keep the ideas and opinions coming. 👍 |
@ftessier Sounds like a reasoned approach. I predict C++ will be around for a while because the proposals since C++11 have gratuitously stolen useful ideas from Python, Haskell, and even ... Rust. Many big-ticket projects (e.g., LLVM, Chromium, V8, even Zircon) are using C++. That being said, if the EGSnrc bikeshed gets painted a Rusty colour I will be happy. |
@ftessier the built in parallelisation of the random number generation is quite interesting: https://jax.readthedocs.io/en/latest/notebooks/Common_Gotchas_in_JAX.html#JAX-PRNG |
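For readers more at home in NumPy than JAX, a loose analogue of JAX's explicit key handling is NumPy's `SeedSequence.spawn`, which derives statistically independent child streams from one root seed. The mechanism differs from JAX's key splitting, but the "derive fresh streams, never reuse one" discipline is similar; this is an illustrative sketch, not the JAX PRNG itself.

```python
import numpy as np

# One root seed; spawn independent child sequences, e.g. one per worker thread.
root = np.random.SeedSequence(12345)
children = root.spawn(4)
streams = [np.random.default_rng(child) for child in children]

# Each stream produces a statistically independent sample sequence.
samples = [rng.random(3) for rng in streams]
```

This is also one tidy way to hand out per-thread seeds in the "pleasingly parallel" setting described below.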
Indeed, can't say I get it, but I think I get the idea. In EGSnrc of course, it is all pleasingly parallel (independent threads really), so we can just provide different seeds to each thread and off we go 😃 . Aaah the Mersenne Twister, my favorite RNG for "ever"; but upon looking up modern RNGs recently (led there, you guessed it, by the Rust Rand Book), I have taken a liking to xoshiro generators, and I am looking forward to implementing |
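For the curious, xoshiro256** is short enough to sketch in pure Python (64-bit arithmetic emulated with a mask). This follows the published reference algorithm, but treat it as an illustration rather than a vetted RNG implementation.

```python
MASK64 = (1 << 64) - 1

def _rotl(x, k):
    """64-bit left rotation."""
    return ((x << k) | (x >> (64 - k))) & MASK64

class Xoshiro256StarStar:
    """Sketch of the xoshiro256** generator (state must not be all zeros)."""
    def __init__(self, seed_state):
        self.s = list(seed_state)  # four 64-bit state words

    def next(self):
        s = self.s
        result = (_rotl((s[1] * 5) & MASK64, 7) * 9) & MASK64
        t = (s[1] << 17) & MASK64
        s[2] ^= s[0]
        s[3] ^= s[1]
        s[1] ^= s[2]
        s[0] ^= s[3]
        s[2] ^= t
        s[3] = _rotl(s[3], 45)
        return result

rng = Xoshiro256StarStar([1, 2, 3, 4])
values = [rng.next() for _ in range(5)]
```

In a compiled language the state update is a handful of cheap integer operations, which is a large part of the appeal over the Mersenne Twister's big state.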
There is an ongoing discussion about the short term solution for the EGSnrc default rng in issue 292 if you feel like jumping in. |
Hi @ftessier, a Python Monte Carlo statistics library has been swapping out its backend for JAX recently. The benefit seems to be speed only a little slower than C on a CPU, without the maintenance burden, plus the ability to run the code on TPUs and GPUs with massive speed benefits. Their blog post announcement is over at: https://pymc-devs.medium.com/the-future-of-pymc3-or-theano-is-dead-long-live-theano-d8005f8a0e9b I think a JAX-based EGSnrc would be easy to maintain and potentially an order of magnitude (or two) faster than a CPU-locked codebase. |
Thanks for the heads up on this @SimonBiggs, this is a possibility I had not considered. I had thought about a simulation interface in Python with a C (and Mortran!) backend as a possible step forward. Reading a bit about JAX feeds those thoughts, to say the least! In the past year I have been eyeing Rust, however... |
The fact that most articles in the following search are old and unofficial makes me think Rust might not be the best way to go if speed is the ultimate aim: https://www.google.com/search?q=compiling+rust+to+GPU ...at the end of the day I'd imagine you'd want to be able to run the software on something like the following: https://lambdalabs.com/gpu-workstations/vector I think any re-write must make running on the GPU a first class citizen. And JAX comes with all of the power of XLA compilation (https://www.tensorflow.org/xla/architecture) built in... |
Point taken, and we certainly ought to think about running on GPUs or TPUs, especially if it comes at "zero" cost, so to speak, and if the source code remains general, i.e., it does not become tied to any particular hardware-of-the-day. |
Yup, and since JAX code is essentially numpy syntax it is exceptionally readable, maintainable, and the numpy syntax is sticking around for the long haul. |
@ftessier, to add another Monte Carlo library now taking this approach: the following library by Uber (https://pyro.ai/#about), which performs Hamiltonian Monte Carlo, is now rewriting its backend in numpy/JAX: https://github.com/pyro-ppl/numpyro ...the more I read into it, the more it seems like this is a brilliant path/direction for EGSnrc to follow... |
Thanks for the update @SimonBiggs. I am looking at JAX (and XLA) documentation again, and the two parts of EGSnrc that in my mind could benefit from this are geometry operations (linear algebra) and simultaneous particle transport (à la STOPS in vmc++). In order to play with this, we need a rudimentary port of radiation transport into python; anyone? I will add this to our list of potential EGSnrc-related projects! |
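As a starting point for such a rudimentary port, here is a deliberately simplified, vectorized sketch in plain NumPy: a batch of photons in an infinite homogeneous medium, sampling exponential free paths and terminating on a toy per-interaction absorption probability. None of the numbers or physics here reflect actual EGSnrc routines; the point is only the batch-of-particles shape that a JAX port (which favours whole-array operations) would take.

```python
import numpy as np

def transport_batch(n, mu=0.2, p_absorb=0.3, seed=0):
    """Toy photon transport: exponential free paths (attenuation coefficient mu),
    isotropic rescattering, absorption with probability p_absorb per interaction.
    Returns the final (absorption) position of every particle."""
    rng = np.random.default_rng(seed)
    pos = np.zeros((n, 3))
    direction = np.tile([0.0, 0.0, 1.0], (n, 1))  # all start along +z
    alive = np.ones(n, dtype=bool)
    while alive.any():
        m = alive.sum()
        # Sample free path lengths from the exponential distribution (mean 1/mu).
        s = rng.exponential(1.0 / mu, size=m)
        pos[alive] += direction[alive] * s[:, None]
        # Absorb a random subset of particles at the interaction site.
        absorbed = rng.random(m) < p_absorb
        idx = np.flatnonzero(alive)
        alive[idx[absorbed]] = False
        # Survivors scatter isotropically: draw a fresh unit direction.
        k = (~absorbed).sum()
        if k:
            v = rng.normal(size=(k, 3))
            v /= np.linalg.norm(v, axis=1, keepdims=True)
            direction[idx[~absorbed]] = v
    return pos

endpoints = transport_batch(1000)
```

Swapping `numpy` for `jax.numpy` (and restructuring the loop for `jit`) is roughly the experiment being proposed above.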
Would you be able to point out a minimum viable set of functions within the |
I had a quick look at this - if the starting point is Fortran rather than Mortran, it looks like something like transpyle may be able to do some of the transpiling work. I think it would be necessary given the ultimate scale of the conversion. Or perhaps someone else knows a different tool - that one is the best I could find. If you could post the Fortran (as per @SimonBiggs request above) I'd be willing to at least start looking at whether that can do some basic translation of flow structures, statements etc. |
If one wants to be able to call the EGSnrc code within Python I've had some success with NumPy's f2py (https://numpy.org/doc/stable/f2py/) over in the following PR: However, I struggled with being able to wrap the EGSnrc source in a decoupled manner into modular pieces that I could then wrap up. The main issue here likely is that my fortran/mortran skills are lacking... |
Ideally the RNG would indeed be independent and reproducible, however in practice this is going to be difficult across different languages and compilers! I predict we will have to resort to brute-force (fuzzy) testing of the functions instead, but that does not worry me. The easiest to port would be the geometry library, but the most meaningful would be the physics engine. Porting EGSnrc is a sizeable project, and for the purpose of testing a JAX implementation, for example, I would suggest a simpler Monte Carlo transport code as a proof of principle. What I mean is that the EGSnrc code has developed its own intricacies (and idiosyncrasies!) over time. It would probably take more time than it is worth to port all of it at this stage for the sake of investigating whether JAX has efficiency advantages (or not). The key routine to port though is The difficulty in porting (having tried a few ports myself, mostly for fun :-)) is not rewriting the functions; that's mostly straight math operations. The contortions appear because of the data structures implemented as a long list of common blocks. They are well organized (kudos to the original authors), but it requires some deciphering. I would be willing to help bootstrap this, of course! Again, the issue is to not waste a good few months' time rewriting the code, only to find out it does not pan out at all. Transpilation at the outset, as @darcymason suggested, might be an early investigative option (the starting point can be Fortran, as Mortran just outputs a Fortran file). I would suggest working in stages, from simpler building blocks, even synthetic ones. For example, first compare RNG speed. Then, load a If anyone tries this, please keep in touch, I would be very interested to witness progress, and eventually contribute. If this has traction here, don't hesitate to start a Discussion on |
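The "brute-force (fuzzy) testing" idea can be made concrete even without EGSnrc itself: drive two implementations of a pure function with the same randomized inputs and compare outputs to within floating-point tolerance. Both implementations below are stand-ins (a textbook Compton-kinematics formula and an algebraic rearrangement of it), chosen only to show the testing pattern.

```python
import random

MEC2 = 0.511  # electron rest energy, MeV

def scattered_energy_a(e, cos_theta):
    """Reference implementation: Compton scattered-photon energy."""
    return e / (1.0 + (e / MEC2) * (1.0 - cos_theta))

def scattered_energy_b(e, cos_theta):
    """'Port' under test: algebraically rearranged form of the same formula."""
    return e * MEC2 / (MEC2 + e * (1.0 - cos_theta))

def fuzz_compare(f, g, trials=10_000, tol=1e-12, seed=42):
    """Feed both implementations random inputs; return the worst relative error."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        e = rng.uniform(0.01, 20.0)    # photon energy, MeV
        ct = rng.uniform(-1.0, 1.0)    # cosine of scattering angle
        a, b = f(e, ct), g(e, ct)
        worst = max(worst, abs(a - b) / a)
    assert worst < tol, f"implementations diverge: {worst}"
    return worst

err = fuzz_compare(scattered_energy_a, scattered_energy_b)
```

The same harness shape works across language boundaries, e.g. with the "reference" side calling the legacy Fortran via a wrapper and the "port" side being the new code.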
I used that too (many, many years ago) - it certainly might be helpful for stitching Python code together with parts that haven't been converted.
I noticed that in my quick review too - a central question would be whether to continue to use globals, or pass data classes into subroutines. The latter is certainly better practice, but would make the code transpilation more complicated - another reason why a programmable tool is useful. The transpyle library uses XML to store the intermediate abstract syntax tree - I'm not a huge fan of XML in general, but in this case it might make customizing code conversion much more doable.
Sounds like a good idea. It probably makes sense to look into that before any transpiler effort like I mentioned. Those kinds of tests are outside my abilities, but the transpilation does intrigue me and I will help there if I can. I might take a look at |
In the following notebook on Google's colab infrastructure (for reproducible hardware comparisons) I set up a million particles with position, direction, and energy, and then I iterated on them with random normal numbers. 10 iterations took ~1.2 seconds when CPU bound, 10 iterations took ~14 ms when running on the GPU: https://colab.research.google.com/drive/1QZKnByThJrq9NhgBbFb8Nx6cPybFVWZJ?usp=sharing To test it out for yourself, to swap to GPU you need to press "Change runtime type": Then select "GPU": Given online benchmarks show that NumPy on the CPU is on par with C++ in terms of speed: https://stackoverflow.com/a/7614252/3912576 It seems like, for the hardware available on Google's colaboratory, a speed up of about 2 orders of magnitude might be expected. Does my reasoning seem okay?
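For anyone wanting to reproduce the CPU side of that comparison without the notebook, a minimal NumPy version of the same update loop looks like this (the particle count is reduced here so it runs quickly; the state layout mirrors the position/direction/energy arrays described above).

```python
import time
import numpy as np

NUM_PARTICLES = 100_000  # reduced from 1_000_000 for a quick local run
ITERATIONS = 10

rng = np.random.default_rng(1)

# Rows: position (x, y, z), direction (u, v, w), energy E for every particle.
state = np.zeros((7, NUM_PARTICLES))

start = time.perf_counter()
for _ in range(ITERATIONS):
    # Perturb every component of every particle with random normal numbers.
    state += rng.standard_normal(state.shape)
elapsed = time.perf_counter() - start
print(f"{ITERATIONS} iterations over {NUM_PARTICLES} particles: {elapsed:.3f} s")
```

Timings will of course vary with hardware and BLAS/RNG build, so treat any single number as indicative only.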
Up next 🙂. |
May as well use an actual table... is there a table (for example https://github.com/nrc-cnrc/EGSnrc/blob/master/HEN_HOUSE/data/eii_penelope.data) that you would recommend as being most relevant? |
I think having that transpilation would be amazing and it would be a good way to divide and conquer this. It would be great if you could investigate that avenue. |
Ok, this is at least interesting then! I have run your python script locally (with
For the record, here is the code I used:

// compile with: g++ -std=c++17 -pedantic -Wall -O3 -Wextra particles.cc
// includes
#include <iostream>
#include <vector>
#include <chrono>
#include <random>
#include <cstdlib> // EXIT_SUCCESS
// defines
#define NUM_PARTICLES 1000000
#define ITERATIONS 10
#define RUNS 10
// main
int main () {
// positions
std::vector<double> x(NUM_PARTICLES, 0.0);
std::vector<double> y(NUM_PARTICLES, 0.0);
std::vector<double> z(NUM_PARTICLES, 0.0);
// directions
std::vector<double> u(NUM_PARTICLES, 0.0);
std::vector<double> v(NUM_PARTICLES, 0.0);
std::vector<double> w(NUM_PARTICLES, 0.0);
// energies
std::vector<double> E(NUM_PARTICLES, 0.0);
// random number generator
std::mt19937 generator(1);
std::uniform_real_distribution<double> sample(-1.0, 1.0);
// update particle arrays a number of times
for (int run=0; run<RUNS; run++) {
// poor man's timer
auto start = std::chrono::high_resolution_clock::now();
// update particle arrays
for (int i=0; i<ITERATIONS; i++) {
for (int n=0; n<NUM_PARTICLES; n++) {
x[n] += sample(generator);
y[n] += sample(generator);
z[n] += sample(generator);
u[n] += sample(generator);
v[n] += sample(generator);
w[n] += sample(generator);
E[n] += sample(generator);
}
}
// report duration
auto stop = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(stop-start);
std::cout << "duration = " << duration.count() << " milliseconds" << std::endl;
}
return EXIT_SUCCESS;
}

So Jax/Numpy is not bad! I will try to run this locally on the GPU with |
And I've transpiled in a rough way probably about 70% of ELECTR (from Mortran, not Fortran
I'll post over there when I've got things developed a little further. |
Thanks @darcymason, nice to know you can transpile at this level already!! In my mind, transpilation would be extremely useful to convince ourselves that acceptable performance can be achieved with a Python backend. For the actual implementation, I would root for a rewrite from scratch (perhaps with some inspiration from the transpiled code), to actually get away from the somewhat convoluted and "fragile" original code. |
@ftessier I have the resources to run a number of tests against both EGSnrc and the corresponding Python replacement at the whole-system level. Additionally, I would be happy to do some unit testing on the Python code. I would, however, need a library of input files for testing. I've always been afraid of doing an implementation of EGSnrc myself because of how hard it is to unit test the original code base. |
Agreed - the transpiled code I'm getting so far is built for testing only; starting from scratch one could definitely make it more pythonic and take advantage of arrays, generators, etc. for improved readability, maintainability and speed. |
@crcrewso Thanks for the enthusiasm and the offer to help! Indeed, unit testing the original code is not a path I want to embark on (and lose myself in). For the back end, when it comes to testing the physics, we can probably get away with fuzzy testing the hell out of every function (e.g., compton, electr, brems, etc.), since these are mostly math operations with well-defined inputs and outputs (except for the darn commons). Eventually, we can also test the geometry in this way (easier, since egs++ is already well packaged in objects with a well-defined-ish API). One more difficult part to verify is At any rate, we have some way to go to test efficiency with toy models, before embarking on this outright. And I am still eyeing Rust, in other news. 😉 Please don't hesitate to comment further on the related |
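As a taste of what per-function fuzzy testing could look like, here is a property-style check on Compton kinematics (the standard textbook relation, not EGSnrc's actual routine): for random inputs, the scattered photon never gains energy, the electron kinetic energy is non-negative, and the photon and electron energies balance.

```python
import random

MEC2 = 0.511  # electron rest energy, MeV

def compton_scatter(e, cos_theta):
    """Scattered photon energy and electron kinetic energy (textbook kinematics)."""
    e_out = e / (1.0 + (e / MEC2) * (1.0 - cos_theta))
    return e_out, e - e_out

rng = random.Random(0)
for _ in range(10_000):
    e = rng.uniform(0.001, 50.0)   # incident photon energy, MeV
    ct = rng.uniform(-1.0, 1.0)    # cosine of scattering angle
    e_out, t_electron = compton_scatter(e, ct)
    # Properties: no energy gain, non-negative electron energy, energy balance.
    assert 0.0 < e_out <= e
    assert t_electron >= 0.0
    assert abs((e_out + t_electron) - e) < 1e-12 * e
```

Checks like these sidestep the need for bit-exact RNG reproduction across languages, which is the point being made above.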
@ftessier no worries. I’m glad this has picked up some steam. I’ll be happy to help testing! |
@darcymason That is awesome. Would you be able to share the partially transpiled code? I think it'd significantly help me understand what's going on 🙂. |
Sure, I'll work on posting it up - it's still a hot mess, though, so I'll add an issue there to explain where I think the next steps are going. I think I see a path through to the end, but haven't hit all the corners yet.
And another set of eyes would probably do the same for me 😄 |
WIP now available at darcymason/egsnrc2py. |
@mchamberland are you alright for me to close this issue? |
@mchamberland yep. I’m watching the other two repos where this is being worked on. |
Let's open the discussion on whether EGSnrc itself can be included in the web app. I have no idea where to even start. I'll offer what expertise I have.