Change egsphant format to allow up to 95 materials #633

rtownson · 2020-09-15T17:58:49Z

Change the egsphant reading/writing algorithms in ctcreate and DOSXYZnrc so that up to 95 materials are allowed. Previously, for media indices greater than 9 they were replaced with * and the resulting egsphant file would not be valid, resulting in a crash in DOSXYZnrc.

As per Fred's suggestion below, the full range of 95 ASCII printable characters are used for the material indices in the egsphant file. This makes the file slightly less human-readable, but is backwards compatible (expanding to 2 digits per index is not). I made a change from his indexing scheme so that it starts from 0.

>>> nums  = [ i for i in range(0,95) ]
>>> chars = [ chr((i+16) % 95 + 32) for i in nums ]
>>> index = [ (ord(c)+47) % 95 for c in chars ]

>>> nums
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94]
>>> chars
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '<', '=', '>', '?', '@', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '[', '\\', ']', '^', '_', '`', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '{', '|', '}', '~', ' ', '!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/']
>>> index
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94]

This didn't require any changes to egs++ or egs_view, because the XYZ geometry uses the ramp file to assign materials. Other custom codes will need to be updated in order to read new phantoms with >9 materials, but phantoms generated with <=9 materials will still be backward compatible because the indexing starts at 0.

3rd party codes that I know of that might want to update: 3ddose_tools, egs_brachy, CERR (I'm not sure if it reads egsphant, but it does read 3ddose).

MartinMartinov · 2020-09-15T18:23:07Z

Hi Reid, Something I have actually been using to get around the 9 media issue in egs_brachy and 3ddose_tools is using the following list of numbers and characters

123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz

to enumerate up to 61 different media. The reason we went with adding characters instead of using 2 digits is so that our code could still read old egsphants (reading less than 9 media), while being able to use egsphant with more media. Just thought I'd add the CLRP perspective :)

Martin

ojalaj · 2020-09-15T18:26:07Z

@MartinMartinov Great idea to have backward compatibility.

rtownson · 2020-09-15T18:33:27Z

Thanks @MartinMartinov I like this suggestion, it does seem to provide better backwards compatibility.

mainegra · 2020-09-15T21:53:33Z

I also had a "fixed" version back when I was running calculations for TG-195. And I went with something similar to what @MartinMartinov suggested. I used a sort of hexadecimal numbering system. It is good that this is being finally addressed for good!

mainegra

The only issue with this is that if one wants to visually inspect the egsphant file there won't be any space between the values ... for a program that knows the format is not an issue though.

ojalaj · 2020-09-16T05:24:15Z

@mainegra there are no spaces in the current version with single digits (up to 9 materials) either, but I don't see it as a problem, when e.g. quickly "visualizing" .egsphant in some text editor.

ojalaj · 2020-09-16T06:33:09Z

@rtownson @mainegra @blakewalters When applying the fix above, the .egsphant turns out OK, but due to the change, in DOSXYZnrc scoring and/or .3ddose output there is something wrong (I did re-compile in both ctcreate and dosxyznrc). See attached screenshot. Actually when I open the .3ddose I see that every other value is 0, so in this screenshot the those low values might be interpolated by CERR.

I had an e-mail correspondence with @blakewalters about this ~10 years ago, but remembered only this one change and can't remember was there something else that needed to be changed in order to make this work (correctly).

ftessier · 2020-09-16T11:52:59Z

Thanks @MartinMartinov I like this suggestion, it does seem to provide better backwards compatibility.

While at it, why not allow for all 95 printable ASCII characters, starting at number 1 (char 49), up to the tilde (char 126), and then wrap around with space (char 32) up until number 0 (char 48). For a given medium index i (starting at 1), the corressponding ASCII code is c = (i+16) mod 95 + 32. Conversely, for a given character c, the corresponding medium index is i = (c+46) mod 95 + 1.

ojalaj · 2020-09-16T12:10:28Z

@ftessier et al - while I like the progress of more sophisticated solution for this, do you have some quick-fix for my issue above?

ftessier · 2020-09-16T12:29:24Z

do you have some quick-fix for my issue above?

I am guessing a corresponding change is required on the DOSXYZnrc side to read the egsphant information in double-digits. Looking into it now.

ftessier · 2020-09-16T12:37:04Z

Seems like the format for reading the media indices in DOSXYZnrc has to be adjusted to i2 (on line 1521). Try changing it to :medformat: FORMAT($IMAXi2); in your dosxyznrc.mortran file (under your $EGS_HOME), recompile with make and see if that works!

rtownson · 2020-09-16T14:33:43Z

@ftessier great suggestion on using the full ASCII set. I'll look into implementing this, unless you already are!

ojalaj · 2020-09-16T14:57:59Z

@ftessier Thanks a lot! Will try this once another simulation finishes (~tomorrow). I will confirm you then!

ftessier · 2020-09-16T15:45:36Z

@ftessier great suggestion on using the full ASCII set. I'll look into implementing this, unless you already are!

Go ahead! Note that using the space ' ' character (ascii decimal 32) might prove confusing, but this will happen for medium 79, so at this point one is already confused anyways 😂 (we could also skip the ' ' char in the list, for 94 media).

ftessier · 2020-09-17T16:32:58Z

@ftessier Thanks a lot! Will try this once another simulation finishes (~tomorrow). I will confirm you then!

Any luck @ojalaj?

ojalaj · 2020-09-17T17:23:27Z

Sorry @ftessier - I need to postpone this till tomorrow - I had accidentally used ECUT=0.521 in my simulation instead of 0.7, so needed to restart with 0.7 - and it will finish tomorrow so hopefully you will wake-up with new information on this.

rtownson · 2020-09-17T17:40:50Z

@ojalaj I have another alternative for you to test! The commit I just added does the switch to using the full range of ASCII characters as Fred suggested.

blakewalters · 2020-09-17T17:56:19Z

Hi Everyone: Sorry to be late to this party, and it sounds like you're already well on your way to iterating down to a solution, but I was able to retrieve that ancient email (2013) to which @ojalaj was referring:

"Hi Jarkko:

There should be no practical upper limit to the HU number that ctcreate can handle, provided that your CT ramp also goes up that high. Now, from your output, I can see that, for some reason, the high CT upper bounds you're entering for your ramp are causing the remainder of the inputs on those lines to shift to the right. These errors will cause an error in the phantom (.egsphant) file that you're creating. Now, the new version of ctcreate has a less restricted input format for these lines, so I would suggest that, once you get the new BEAMnrc release up and running, you use it to convert your data.

Currently, only one digit can be used to store medium number in the .egsphant file. Thus, medium numbers >=10 will cause problems. So, to get around the fact that you have 10 media, you'll have to modify a couple of codes:

Go into %OMEGA_HOME%\progs\ctcreate/\ctcreate.mortran and, within the subroutine write_phantom, change the line:

1399 FORMAT($IMAXi1);

to:

1399 FORMAT($IMAXi2);

then recompile ctcreate by typing "make" in this directory.

Go into %EGS_HOME%\dosxyznrc/dosxyznrc.mortran and change the line:

:medformat: FORMAT($IMAXi1);

to:

:medformat: FORMAT($IMAXi2);

then recompile dosxyznrc by typing "make" in this directory. This will allow dosxyznrc to read 2 digits specifying the medium no.

Good luck, and let us know if there are any further issues."

Now, I can't find a response to this particular email, so I'm assuming that the fix worked at the time.

ftessier · 2020-09-17T18:00:44Z

Thanks @blakewalters for confirming, and thanks @rtownson for the ascii conversion update, which means we don't have to modify dosxyznrc in the end, and .egsphant file size won't double overnight! 😃

ftessier · 2020-09-17T18:02:46Z

Thanks @blakewalters for confirming, and thanks @rtownson for the ascii conversion update, which means we don't have to modify dosxyznrc in the end, and .egsphant file size won't double overnight! 😃

Well we have to modify dosxyznrc for the conversion from ascii back to medium index:

i = (c+46) mod 95 + 1.

rtownson · 2020-09-17T18:06:24Z

Well we have to modify dosxyznrc for the conversion from ascii back to medium index:
i = (c+46) mod 95 + 1.

Oh, whoops. I made those changes but they didn't get caught. I guess I forgot to copy the modified dosxyz from egs_home to HEN_HOUSE! I'll fix that.

ojalaj · 2020-09-18T09:58:17Z

@ftessier If there is no reason not to increase $MXMED to 95, then why not? Having it already in maximum would perhaps decrease the possibility for Reddit forum questions titled "How do I increase the number of media in dosxyznrc"&""Why do I get this error with 25 media" etc.

Anyhow, @rtownson @ftessier I tried with changes provided here. I did re-compile both ctcreate and dosxyz (under EGS_HOME). .egsphant looks as expected, i.e., I have expected characters for media # 9+. Then I tried both batch and interactive runs. For interactive run I get following:

 BATCH #  TIME-ELAPSED  TOTAL CPUTIME  RATIO  TIME OF DAY  RNG pointers

     1          0.0            0.0      0.00    12:46:53   ixx jxx =   60  93

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x2B7F535A1697
#1  0x2B7F535A1CDE
#2  0x2B7F540343FF
#3  0x43C486 in photon_
#4  0x43DA3C in shower_
#5  0x49F8F8 in MAIN__ at dosxyznrc_e88d089.F:?
/home/XXXXXX/EGS2020d/HEN_HOUSE//scripts/run_user_code: line 225: 13103 Segmentation fault      (core dumped) $command -i $input_file

and for batch runs to wX.eo files the following:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x2B771A4D8697
#1  0x2B771A4D8CDE
#2  0x2B771AF6B3FF
#3  0x43C486 in photon_
#4  0x43DA3C in shower_
#5  0x49F8F8 in MAIN__ at dosxyznrc_e88d089.F:?

and some slurm related seg fault lines.

This shouldn't be memory related since in Slurm generated emails I have "Memory Utilized: 2.58 GB Memory Efficiency: 85.89% of 3.00 GB".

Any hints how to fix this?

ftessier · 2020-09-18T10:07:08Z

Thanks @ojalaj, I guess we have to go back and test this ourselves then 😃 . Strange, the code change seems pretty innocuous. The issue with increasing $MXMED is that in the mortran backend the memory is pre-allocated, so we have to be careful with array sizes, which can rapidly grow in larger phantoms! I will review if there is any issue with $MXMED in this regard. I agree that it makes perfect sense to increase it to 95.

ojalaj · 2020-09-18T10:19:53Z

@ftessier OK, I fully understand the memory issue with $MXMED.

FYI that since in ctcreate/Makefile it has DOSXYZ_HOME = $(HEN_HOUSE)user_codes$(DSEP)dosxyznrc$(DSEP), I made the changes to dosxyznrc.mortran both under HEN_HOUSE/user_codes/dosxyznrc and EGS_HOME/dosxyznrc, with 'make'. Don't know was this unnecessary/mistake. In addition, there were no other errors/warnings in run-time .egslog/.egslst files.

ftessier · 2020-09-18T10:24:45Z

I find some instances in the EGSnrc core where $MXMED is used in 2d and 3d arrays. For example, in the nist_brems common block, the other arrays dimensions are 50 and 100, so going from 20 media to 95, will increase the number of array elements by 375000, or 1.5 MB at 4 bytes per element. Although this is tiny by today's standard, it will impact performance by dispersing memory access. Might be interesting to test! @nrc-cnrc/egsnrc what do yout think?

mainegra · 2020-09-18T12:45:58Z

I find some instances in the EGSnrc core where $MXMED is used in 2d and 3d arrays. For example, in the nist_brems common block, the other arrays dimensions are 50 and 100, so going from 20 media to 95, will increase the number of array elements by 375000, or 1.5 MB at 4 bytes per element. Although this is tiny by today's standard, it will impact performance by dispersing memory access. Might be interesting to test! @nrc-cnrc/egsnrc what do yout think?

@ftessier one would have to test if there is any significant impact in performance when defaulting to 95 media instead of a much smaller number of media. A simplistic test could be running a simulation for a large homogeneous phantom once with $MXMED set to 1 and then increasing $MXMED to 95. See if there is any significant time penalty. This is not completely realistic as it will depend on the hardware available to a specific user. And probably OS as well. But at least we could get an idea of how bad it could get. I'm assuming in cluster environments with dedicated hardware this would be of no relevance.

But for users working with their home computers, maybe running VMs, RAM could be somewhat limited. Perhaps most machines in the world nowadays have enough memory to deal with EGSnrc defaulting to 95 media? 😉 I was working at home using a Windows laptop on WSL2 under the assumption 8GB was good enough for the quick things I was running, but the machine was constantly accessing the hard drive (swapping???) and at times slow. It all got better after I installed 16 GB of RAM and an SSD hard drive.

Maybe a compromising solution would be to increase $MXMED from 9 to 20 or 25 and hope it is very clearly explained in the manuals how change this.

BTW in common nist_brems the arrays are declared as $REAL which currently defaults to double precision, hence for your calculation above one must consider 8 bytes per element.

ftessier · 2020-09-18T14:17:28Z

I get roughly a 10 MB increase in RESident memory when going from $MXMED=1 to $MXMED=95 (I set the stack size to 10 and total maximum of voxels to 1x1x1 for the sake of this test). If I in fact used 95 media, then if I understand correctly, I may be using about 100 MB more (VIRTual size has increased by about that). I did not notice any time difference between the runs: the OS is smart enough to page only the active memory addresses in RAM residence (note that the TIME+ field below is not the total running time, just the time at which I took the snapshot!)

### MXMED = 1
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                   
21001 ftessier  20   0   51484   4128   1872 R  99.7  0.0   0:12.14 dosxyznrc

### MXMED = 95
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                   
20725 ftessier  20   0  150940  15476   1876 R 100.0  0.1   0:42.93 dosxyznrc

So all in all, I say we just increase the default $MXMED in dosxyznrc to 95. At any rate, once you start increasing the total number of voxels into the 10^6, you are already in the 10s of MB of memory, mitigating the increase due to $MXMED (media memory does not scale with the number of voxels).

For the record, with $MXMED=95, but the other defaults as is (stack size 10000, voxel limits 128x128x128), the memory looks like this:

### MXMED = 20 (current default)
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND   
25389 ftessier  20   0  153004  30412   1876 R  98.7  0.1   0:03.01 dosxyznrc

### MXMED = 95
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND   
25005 ftessier  20   0  237780  40044   1872 R 100.0  0.2   0:10.41 dosxyznrc

so resident memory increases from 30 MB to 40 MB, i.e. by 30%.

By the way, dosxyznrc (EGSnrc) itself can handle $MXMED > 95, this restriction is only relevant if you are going to read or write an .egsphant file using the 95 printable ascii characters.

rtownson · 2020-09-18T14:27:25Z

Any hints how to fix this?

@ojalaj Did you copy the new HEN_HOUSE/users_codes/dosxyznrc/dosxyznrc.mortran into egs_home/dosxyznrc? Did you get the most recent updates that has changes to dosxyz, in commit 6c75322?

ojalaj · 2020-09-18T15:31:52Z

@rtownson Yes and yes.

rtownson · 2020-09-18T15:48:12Z

@ojalaj could you email me the egsphant file, perhaps? Or the CT data and ctcreate input file, if it is anonymous.

ojalaj · 2020-09-18T16:09:25Z

@rtownson I'll send you a link to download the .egsphant and .inp soon.

ojalaj · 2020-09-21T20:17:36Z

After the most recent commit no seg fault anymore!

blakewalters

Nice work, @rtownson!

ftessier · 2020-09-22T16:38:22Z

I made a change from his indexing scheme so that it starts from 0.

Doesn't that break the legacy files, which expect indices 1 to 9? Oh, were they 0 to 9 before?

rtownson · 2020-09-22T16:59:36Z

That does not break the legacy files, which expect indices 1 to 9? Oh, were they 0 to 9 before?

Yes they were 0-9 before, which was a surprise to me. But both Jarkko and I had old egsphant files with 0's in them! I guess it made sense to use the extra character to get 10 materials.

blakewalters · 2020-09-22T17:06:08Z

Good question, @ftessier. It's always been possible (but I'm not sure how probable) to have med=0 in the .egsphant file but then this gets assigned to vacuum, not the first medium of the ramp (i.e. in the list of media in the .egphant file). I think we're okay as long as, on the ctcreate end, the ramp starts with med=1, which was the case when I tested this.

Add support for more than 9 materials in egsphant files by changing the material indexing to iterate through all 95 printable ASCII characters. Previously, media indices greater than 9 were replaced with '*' and the resulting invalid egsphant file caused a crash in DOSXYZnrc.

ftessier · 2020-09-22T18:20:41Z

@rtownson I squashed the 3 commites together; is the combined commit message fine by you?

ojalaj · 2020-09-22T18:47:41Z

I have always thought that med=0 is somehow 'reserved'/hardcoded for vacuum inside the code. E.g. in PIRS794 Ch 5.3 for medsur it says "The default setting of medsur is 0, indicating the surrounding region is vacuum.", Ch 16.2.5 for ctcreate it says "If the CT number in a voxel is less than material ct lower bound then the medium in that voxel is set to vacuum with density zero." (and in this case med=vacuum=0). And yes, as @rtownson wrote, I do have legacy .egsphant files with 0s, perhaps due to latter example from PIRS794.

Revert to the egs_brachy encoding for egsphant files, allowing for 63 media encoded with the set of alphanumeric characters (where 0 is reserved for vacuum). Briefly, #633 expanded the number of media in egsphant files to 95, using all printable ascii characters. However, the number of media was also independently expanded in egs_glib for the development of egs_brachy, but using a different encoding consisting of only the 63 alphanumeric ascii characters. This led to inconsistent egsphant files that are no longer interchangeable. This issue was originally reported and discussed in clrp-code/egs_brachy#22.

Revert to the egs_brachy encoding for egsphant files, allowing for 62 media encoded with the set of alphanumeric characters (where 0 is reserved for vacuum). Briefly, #633 expanded the number of media in egsphant files to 95, using all printable ascii characters. However, the number of media was also independently expanded in egs_glib for the development of egs_brachy, but using a different encoding consisting of only the 62 alphanumeric ascii characters. This led to inconsistent egsphant files that are no longer interchangeable. This issue was originally reported and discussed in clrp-code/egs_brachy#22.

rtownson added the bug label Sep 15, 2020

rtownson requested review from ftessier, mainegra and blakewalters September 15, 2020 17:58

rtownson self-assigned this Sep 15, 2020

rtownson marked this pull request as draft September 15, 2020 18:29

mainegra reviewed Sep 15, 2020

View reviewed changes

rtownson changed the title ~~Fix ctcreate to allow double digit media indices~~ Change egsphant format to allow up to 95 materials Sep 17, 2020

rtownson force-pushed the fix-ctcreate-nmed branch from d087ea1 to 6c75322 Compare September 17, 2020 18:07

rtownson marked this pull request as ready for review September 17, 2020 18:18

blakewalters approved these changes Sep 21, 2020

View reviewed changes

mainegra approved these changes Sep 22, 2020

View reviewed changes

ftessier force-pushed the fix-ctcreate-nmed branch from 11c6aee to 373b42e Compare September 22, 2020 18:19

ftessier merged commit 1a094d9 into develop Sep 22, 2020

ftessier deleted the fix-ctcreate-nmed branch September 22, 2020 20:06

rtownson mentioned this pull request Dec 16, 2021

EGSnrc update for >9 egsphant materials requires egs_glib update clrp-code/egs_brachy#22

Closed

ftessier mentioned this pull request Jul 8, 2022

Discrepancy between EGSnrc ctcreate and egs_glib egsphant media encoding #894

Closed

ftessier mentioned this pull request Jul 8, 2022

Fix #894: revert egsphant to egs_brachy encoding #895

Merged

Change egsphant format to allow up to 95 materials #633

Change egsphant format to allow up to 95 materials #633

Conversation

rtownson commented Sep 15, 2020 • edited

MartinMartinov commented Sep 15, 2020 • edited by ftessier

ojalaj commented Sep 15, 2020

rtownson commented Sep 15, 2020

mainegra commented Sep 15, 2020

mainegra left a comment

Choose a reason for hiding this comment

ojalaj commented Sep 16, 2020

ojalaj commented Sep 16, 2020

ftessier commented Sep 16, 2020 • edited

ojalaj commented Sep 16, 2020

ftessier commented Sep 16, 2020

ftessier commented Sep 16, 2020

rtownson commented Sep 16, 2020

ojalaj commented Sep 16, 2020

ftessier commented Sep 16, 2020 • edited

ftessier commented Sep 17, 2020 • edited

ojalaj commented Sep 17, 2020

rtownson commented Sep 17, 2020

blakewalters commented Sep 17, 2020 • edited

ftessier commented Sep 17, 2020

ftessier commented Sep 17, 2020 • edited

rtownson commented Sep 17, 2020

ojalaj commented Sep 18, 2020 • edited

ftessier commented Sep 18, 2020

ojalaj commented Sep 18, 2020

ftessier commented Sep 18, 2020

mainegra commented Sep 18, 2020

ftessier commented Sep 18, 2020 • edited

rtownson commented Sep 18, 2020

ojalaj commented Sep 18, 2020

rtownson commented Sep 18, 2020

ojalaj commented Sep 18, 2020

ojalaj commented Sep 21, 2020

blakewalters left a comment

Choose a reason for hiding this comment

ftessier commented Sep 22, 2020 • edited

rtownson commented Sep 22, 2020

blakewalters commented Sep 22, 2020

ftessier commented Sep 22, 2020

ojalaj commented Sep 22, 2020

rtownson commented Sep 15, 2020 •

edited

MartinMartinov commented Sep 15, 2020 •

edited by ftessier

ftessier commented Sep 16, 2020 •

edited

ftessier commented Sep 16, 2020 •

edited

ftessier commented Sep 17, 2020 •

edited

blakewalters commented Sep 17, 2020 •

edited

ftessier commented Sep 17, 2020 •

edited

ojalaj commented Sep 18, 2020 •

edited

ftessier commented Sep 18, 2020 •

edited

ftessier commented Sep 22, 2020 •

edited