
Gamma Index multiprocessing #72

Closed
jasqs opened this issue Feb 19, 2019 · 33 comments
Labels
enhancement New feature or request

Comments

@jasqs
Collaborator

jasqs commented Feb 19, 2019

Hi,

I am using the gamma index analysis in my project and need to perform the analysis for many 3D dose distributions, which is time-consuming. Do you plan to add multiprocessing to the gamma_shell script?

best
Jan Gajewski

@SimonBiggs
Member

Hi @jasqs,

I certainly could do just that for you if you would like. You mention CLI, what do you envision the output of the CLI would look like? Would it be just a pass rate? Or would you be after a full 3D grid saved in an accessible file format?

Also, to help me: if you were to design it yourself, what inputs would you use? I'm not asking you to design it, but if you show me how you might envisage using it, that would help me make sure whatever is made meets your need.

Cheers,

Simon

@SimonBiggs
Member

Actually sorry, it's been a long day on my end. I completely misread your question.

Yes, I could implement multiprocessing if you would like. As a question, do you happen to have an NVIDIA card? I could also potentially implement CUDA acceleration, which would drastically speed things up.

Also, the gamma function in its current form can already be made relatively fast with a few parameter adjustments. @Centrus007, you've done quite a bit of work using this; would you be able to give a little guidance on parameters that can be used for a fast calculation?

Cheers,

Simon

@jasqs
Collaborator Author

jasqs commented Feb 19, 2019

Hi,

thank you for the quick response. I actually have two NVIDIA Titan X GPUs available on my server. We normally use them for Monte Carlo calculations. Implementing the gamma index on the GPU would certainly solve our time-performance problem (as it did with Monte Carlo). Would it be too much effort to prepare such a version for us? A multiprocessing version would also be helpful, as our server has 40 cores.

Jan Gajewski

@SimonBiggs
Member

SimonBiggs commented Feb 19, 2019 via email

@SimonBiggs
Member

SimonBiggs commented Feb 19, 2019 via email

@Matthew-Jennings
Member

Hi @jasqs,

I visited Cracow in June last year! It is a beautiful city. I didn't know you had a proton therapy centre there! I'd have visited it if I'd known. We're getting one soon here in Adelaide, Australia. :)

Until multiprocessing is implemented, here are some things you can try to improve gamma calculation times:

  1. (recommended) Try calculating gamma on a random voxel subset of your dose grid. Experiment with it as needed.

    • You can do this by passing a value (say, 10000) to the random_subset parameter. Although not thoroughly tested, we've found that calculating gamma on just 3% of the total number of dose grid voxels gives an almost identical pass rate to calculating on the full voxel population. I haven't yet inspected the gamma distribution itself (rather than the pass rate), but I'm confident it will be very similar. You can test this yourself very easily.
    • The reduction in calculation time roughly approximates the percentage of total points used for the random subset. For example, for one of my personal cases, the dose grid had 290,000 points which took ~5 min to calculate normally. Calculating on a random subset of 10000 points (~3.4% of the total) reduced the calculation time to ~13 seconds with no change in pass rate.
  2. If pass rate is all that matters, try setting the max_gamma parameter to a value just above 1 (say, 1.1). Beware that we've observed slight (<1%) changes in the pass rate value for different values of max_gamma above unity. The pass rates should strictly be identical. We're aware and debugging. But if strict accuracy isn't too important, this is an option.

  3. Try reducing the interp_fraction parameter from its default of 10 to 5. Once again, you might observe a small change in pass rate.
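As a quick, self-contained illustration of tip 1 (using synthetic gamma values rather than real dose data, so the numbers here are only illustrative), the subset estimate of the pass rate tracks the full-grid value closely:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a full 3D gamma map; real values would come from gamma_shell.
full_gamma = rng.gamma(shape=2.0, scale=0.4, size=(50, 50, 50))

def pass_rate(gamma_vals):
    # Fraction of valid (non-NaN) voxels with gamma <= 1, as a percentage.
    valid = gamma_vals[~np.isnan(gamma_vals)]
    return 100 * np.mean(valid <= 1)

# Tip 1: evaluate on a random subset of voxels instead of all of them.
flat = full_gamma.ravel()
subset = rng.choice(flat, size=10000, replace=False)

full_rate = pass_rate(flat)       # all ~125,000 voxels
subset_rate = pass_rate(subset)   # ~8% of the voxels
```

For this synthetic map the two pass rates agree to well within a percentage point, which is the behaviour described above.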

@Matthew-Jennings Matthew-Jennings added the enhancement New feature or request label Feb 20, 2019
@SimonBiggs
Member

The pass rates should strictly be identical. We're aware and debugging.

Could that have just been fluctuations due to the use of random_subset?
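As a rough sanity check (an illustration, not a measurement from any real dataset): the pass rate estimated from a random subset is a binomial proportion, so its sampling noise can be estimated directly:

```python
import math

# Standard error of a pass rate estimated from a random voxel subset:
# the subset pass rate is a binomial proportion with SE = sqrt(p*(1-p)/n).
p, n = 0.97, 10000          # e.g. a ~97% pass rate estimated from 10000 voxels
se_percent = 100 * math.sqrt(p * (1 - p) / n)
print(round(se_percent, 2))  # → 0.17
```

So a 10000-voxel subset carries roughly ±0.2 percentage points of noise, which is the same order as the small sub-1% discrepancies being discussed.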

@jasqs
Collaborator Author

jasqs commented Feb 20, 2019

Thank you @Centrus007 and @SimonBiggs for comments.

@Centrus007 we have had a proton centre in operation since 2016 (more than 100 patients treated -> CCB). If you visit Krakow someday, just let me know and you are invited. What solution (company) are you going to have in Adelaide? In Krakow we have an IBA cyclotron and two dedicated gantries with scanning, as well as a self-made eye treatment room. I am working in a research group on two projects. One of them is dedicated to using fast MC simulations (FRED, a GPU-based MC tool -> paper) in clinical routine.

I will test the random voxel subset technique and let you know the results. The max_gamma solution is good for calculating the passing rate only, but in our study we want the full 3D GI map so we can import it into the TPS and see exactly where the differences are.

@SimonBiggs I would be very interested in contributing to pymedphys. Right now my code is used mostly by me and my colleagues. It is not properly commented yet, and many things in it could probably be written much better, as I am not an expert in Python (yet). For the last few years I have mostly been writing in Matlab. Now I have moved all my work to Python, as it was more convenient for sharing with our external partners. In my repository you can find mostly functions for dose evaluation, scanning spot analysis, Bragg peak analysis (including the Bortfeld fit), DVH analysis, and reading/writing files specific to our FRED MC tool and DICOM. If you think any of this could be of interest to you, I can share it and modify/comment it as needed.

@SimonBiggs
Member

@jasqs, most of what you've described there would be most welcome within pymedphys if you would be okay with it being in there.

Don't stress too much about the coding style; if you just bring one function in, we can bounce back and forth iterating together until we're all happy with what is being merged. I find I learn most effectively in processes like that, with people reading my code while providing feedback and guidance.

I would recommend picking one of those functions (maybe the Bragg peak analysis tools), starting with a small function, and submitting a pull request, trying to follow the style of the functions you already see within pymedphys. Ask questions, and we can code a bit with you on your first submissions to help out.

@Centrus007 can you add @jasqs to the repo and give him membership rights? And maybe add a module for @jasqs to begin writing in? @jasqs do you have a preference for the name of the module that you will begin writing in?

@Matthew-Jennings
Member

@jasqs, I haven't personally had a lot of involvement with the project; but a good friend of mine, Scott Penfold (https://researchers.adelaide.edu.au/profile/scott.penfold), is front-running the project from a physics perspective. I actually spoke with him a little today and he said he'd seen some presentations on FRED; it's a small world!

My understanding is that PROTOM won the contract in the end, but I'm not sure on any more specifics (e.g. number of beam lines).

If you do decide that you'd like to contribute (please know that we'd really appreciate it if you do!), I've added you to the PyMedPhys repo. You should be able to make contributions now! Some things to note if you do contribute:

  • The meat of the code is in src/pymedphys. You'll notice that there are a number of folders within src/pymedphys that organise the project into "modules". Although far from perfect at the moment, these help logically separate the code in the project. For code that needs documentation, our documentation sections roughly match these modules.
  • Unless you think your content already fits nicely within an existing module, I'd agree with @SimonBiggs and suggest we make a new one for you. You can name it whatever you like! I've had a little look through your code, and the scanning-spot analysis and Bragg-peak stuff look like great additions!
  • Obviously you can add or leave out whatever you like. Please don't worry too much about commenting, code quality, etc. We can work with you on that. Besides, some of our own work could do with plenty of polishing :)

@SimonBiggs
Member

@jasqs could you give installation of the following two packages a shot:

Let me know how you go. If you are able to install those, then all that needs to be done within the gamma code is an import check for tfinterp, followed by verifying that it and TensorFlow are built and installed with CUDA support; if both of those conditions are true, I would dynamically swap out RegularGridInterpolator for tfinterp.regular_nd.

That way GPU support is optional, but available for those advanced users who want it.

Then instead of the interpolation search being done on the CPU with scipy.interpolate.RegularGridInterpolator, that part of the computation would be done on the GPU using TensorFlow and tfinterp.regular_nd.
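That dynamic swap could look something like the following sketch. The wiring is hypothetical: tfinterp.regular_nd's exact signature is assumed from the comment above, and the fallback is the existing SciPy interpolator.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Prefer the GPU path only when tfinterp and a CUDA-enabled TensorFlow are
# both importable; otherwise fall back to SciPy on the CPU.
try:
    import tfinterp  # optional dependency; signature assumed below
    import tensorflow as tf
    _use_gpu = tf.test.is_built_with_cuda()
except ImportError:
    _use_gpu = False

if _use_gpu:
    def interpolate(points, values, xi):
        # GPU interpolation via TensorFlow (call signature is an assumption)
        return tfinterp.regular_nd(points, values, xi)
else:
    def interpolate(points, values, xi):
        # CPU interpolation, as the gamma code currently does
        return RegularGridInterpolator(points, values)(xi)

# Tiny 1D usage example: linear interpolation on a regular grid.
result = interpolate(
    (np.array([0.0, 1.0, 2.0]),),      # grid axes
    np.array([0.0, 10.0, 20.0]),       # values on the grid
    np.array([[0.5], [1.5]]),          # query points
)
```

The advantage of this pattern is exactly what's described above: GPU support stays optional, and users without the extra packages silently get the existing CPU behaviour.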

@jasqs
Collaborator Author

jasqs commented Feb 25, 2019

@SimonBiggs Unfortunately I am not able to compile tfinterp. I have CUDA 9.0 properly installed, but I get an error when compiling tfinterp. I will keep working on this, but I am not a C++ expert, so I do not yet know where the problem is.

@SimonBiggs
Member

SimonBiggs commented Feb 25, 2019 via email

@SimonBiggs
Member

SimonBiggs commented Mar 1, 2019

Notes regarding setup. I will make multiple edits to this comment until it is complete. I have not finished this yet, but I am recording here what I have been doing on Ubuntu 18.04, based on the installation instructions at https://docs-cupy.chainer.org/en/stable/install.html.

Downloaded the following https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.105-1_amd64.deb

Based on the instructions at https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=debnetwork I ran:

sudo dpkg -i cuda-repo-ubuntu1804_10.1.105-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda-10-0

Then

pip install cupy-cuda100

@jasqs
Collaborator Author

jasqs commented Mar 18, 2019

@SimonBiggs Is there any progress on implementing gamma index calculation with multiprocessing/GPU?

@SimonBiggs
Member

There was a little; I tried a few different angles. But I think for now I won't dive into the GPU side of things. I shall hopefully prioritise the multiprocessing option soon, however.

@SimonBiggs
Member

@jasqs the three places where the code is slowest all seem to relate to a numpy array being copied. I suspect I will initially get a far better speed-up by making those arrays not need to be copied rather than by implementing multiprocessing. So I'm going to focus on the following issue instead:

#79
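The copy-versus-view distinction driving that issue can be seen in a few lines of NumPy. This illustrates the general behaviour, not the specific arrays in gamma_shell:

```python
import numpy as np

a = np.arange(5.0)

view = a[1:4]        # basic slicing returns a view: no data is copied
copy = a[[1, 2, 3]]  # fancy indexing returns a copy: new memory is allocated

view[0] = 99.0       # writes through to `a`, since it's the same memory
assert a[1] == 99.0
assert copy[0] == 1.0        # the copy is unaffected by the write
assert view.base is a        # a view keeps a reference to its owner
assert copy.base is not a    # the copy does not
```

Restructuring the hot loops so they index with slices (views) rather than boolean or integer masks (copies) avoids the allocation entirely, which is the kind of speed-up described above.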

@jasqs
Collaborator Author

jasqs commented Apr 10, 2019

@SimonBiggs did you try to implement GI with multiprocessing? I am trying to calculate 3D GI for an image with ~8.5E6 pixels (head and neck CT resolution) and it takes a very long time. Calculating only on a random subset of the data also takes very long, and I cannot validate the results as I am not able to calculate the full image in a reasonable time. If not on GPU, I think calculating on multiple CPUs would solve the problem in my case.

@SimonBiggs
Member

SimonBiggs commented Apr 10, 2019 via email

@SimonBiggs
Member

SimonBiggs commented Apr 10, 2019 via email

@Matthew-Jennings
Member

Matthew-Jennings commented Apr 10, 2019

For however many cores you have, split the reference grid into that many
pieces, then calculate gamma on each core for the subset of the reference,
and then join the resulting gamma values back up at the end.

Is it that simple? How do you handle the edges of each subgrid? Have the grids overlap by an amount at least as large as the distance_to_agreement, and then be sure to score each edge voxel in only one of the subgrids?
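One way it can be that simple (a sketch of the split-and-join idea, not pymedphys's actual implementation): if the split is by reference points rather than by subgrids, each worker still searches the full evaluation grid, so edge voxels and overlaps never arise. Here is a minimal runnable illustration with a stand-in per-point gamma function; a thread pool is used so the sketch runs anywhere, and for real CPU-bound gamma work you would swap in multiprocessing.Pool, which has the same API.

```python
import numpy as np
from multiprocessing.pool import ThreadPool  # same API as multiprocessing.Pool

def gamma_for_points(ref_points):
    # Stand-in for the real per-reference-point gamma search: each point is
    # evaluated against the full evaluation grid, so chunks need no overlap.
    return np.sqrt(ref_points)

def parallel_gamma(ref_points, n_workers=4):
    chunks = np.array_split(ref_points, n_workers)    # split the reference
    with ThreadPool(n_workers) as pool:
        results = pool.map(gamma_for_points, chunks)  # one chunk per worker
    return np.concatenate(results)                    # join the gamma values

pts = np.linspace(0.0, 9.0, 10)
gamma_vals = parallel_gamma(pts)
```

Because the join is a plain concatenation in the original point order, no edge-voxel bookkeeping is needed.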

@SimonBiggs
Member

SimonBiggs commented Apr 11, 2019 via email

@Matthew-Jennings
Member

Ah yes, that makes sense.

@SimonBiggs
Member

:)

@jasqs
Collaborator Author

jasqs commented Apr 11, 2019

This is exactly how I was thinking it should work. I did not finish the calculation on the full grid; it had been running for 2 days and I had to stop it :)

@SimonBiggs
Member

SimonBiggs commented Apr 11, 2019 via email

@jasqs
Collaborator Author

jasqs commented Apr 12, 2019

Using just 10 random points, it calculated in about 3 min.

@SimonBiggs
Member

Can you investigate the dose grids and see how different the evaluation and the reference are at those positions?

@jasqs
Collaborator Author

jasqs commented Apr 12, 2019

I managed to split the calculations across 40 cores, but each instance loads a copy of the whole evaluation grid, which takes a lot of RAM. I suppose the multiprocessing should be done internally in gamma_shell so that the evaluation grid is loaded only once.

@SimonBiggs I will check the differences and get back.
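One standard way to avoid those per-process copies (a sketch, assuming Python ≥ 3.8 and the stdlib multiprocessing.shared_memory module; this is not how gamma_shell currently works) is to place the evaluation grid in shared memory once and have workers attach to it by name:

```python
import numpy as np
from multiprocessing import shared_memory

# Place the evaluation grid in shared memory once; workers then attach by
# name instead of each receiving a pickled copy of the whole array.
grid = np.arange(12.0).reshape(3, 4)  # stand-in for the evaluation grid
shm = shared_memory.SharedMemory(create=True, size=grid.nbytes)
shared_grid = np.ndarray(grid.shape, dtype=grid.dtype, buffer=shm.buf)
shared_grid[:] = grid                 # one-time copy into shared RAM

# Inside a worker process you would attach via the name (shm.name):
worker_shm = shared_memory.SharedMemory(name=shm.name)
view = np.ndarray(grid.shape, dtype=grid.dtype, buffer=worker_shm.buf)
total = float(view.sum())             # read without duplicating the grid

# Drop the numpy views before closing, then release the shared block.
del view, shared_grid
worker_shm.close()
shm.close()
shm.unlink()
```

With 40 workers this keeps one copy of the grid in RAM instead of forty, which addresses the memory blow-up described above.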

@SimonBiggs
Member

SimonBiggs commented Apr 12, 2019 via email

@SimonBiggs
Member

SimonBiggs commented Apr 12, 2019 via email

@SimonBiggs
Member

SimonBiggs commented Apr 12, 2019 via email

@SimonBiggs
Member

I think this will potentially be resolved by better documentation of the gamma tool's speed profile: making it clear that it will be very slow for disagreeing datasets.
