Multi-thread experiment not using the specified number of threads #3658

Closed
NaroaCS opened this issue Mar 9, 2023 · 17 comments
Labels: About Batch (This issue concerns batch experments), About HPC (This issue concerns the use of GAMA in an High Performance Computing environment), 😅 Workaround (Issue is not fixed but a workaround exists)

@NaroaCS

NaroaCS commented Mar 9, 2023

Describe the bug
I am running a batch experiment on a server that has 40 threads (2 CPUs x 10 threads/CPU x 2 hyperthreading).
I have set parallel: 40 in the experiment definition in the .gaml script, but when I launch it, it only runs 4 threads in parallel.

To Reproduce
I guess it is difficult to reproduce, but this is how I have defined the experiment:
experiment batch_experiments_pheromone type: batch parallel: 40 repeat: 15 until: (cycle >= numberOfDays * numberOfHours * 3600 / step) {
    parameter var: evaporation among: [0.05, 0.1, 0.15, 0.2, 0.25, 0.3];
    parameter var: exploitationRate among: [0.6, 0.65, 0.7, 0.75, 0.8];
    parameter var: numBikes among: [150, 250, 350];
    parameter var: WanderingSpeed among: [1/3.6#m/#s, 3/3.6#m/#s, 5/3.6#m/#s];
}

and here is the full code: https://github.com/CityScope/VehicleClustering.git

I am running the script with the following command, which I execute in the 'headless' folder of GAMA:
bash gama-headless.sh -batch batch_experiments_pheromone ../../VehicleClustering/CS_CityScope_GAMA/models/clustering.gaml

Expected behavior
I would expect GAMA to make use of all the threads available instead of only 4. I can tell it uses only 4 because it prints 'FINISH INITIALIZATION' four times in a row, runs the four simulations, and repeats the process once they are done. See the screenshot below.

Screenshots
image

image

Desktop (please complete the following information):

  • OS: Ubuntu 22.04.1
  • PC Model: Server running Ubuntu
  • GAMA version: 1.8.2
@ptaillandier
Collaborator

I tried with GAMA version 1.9 (here) and it seems to work well. You should use this version rather than GAMA 1.8.2. However, with the default options, GAMA will not run 40 simulations in parallel, but only 15, corresponding to the number of repetitions. If you really want to run 40 in parallel, you have to set this preference to "true": "Execution" -> "Parallelism" -> "In batch mode, allow to run simulations with different parameter sets..."

Or you can add this line in the init of the experiment:
gama.pref_parallel_simulations_all <- true;
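
To make the difference between 15 and 40 parallel simulations concrete, here is a rough back-of-the-envelope sketch in Python (not GAMA code). It counts the parameter values listed in the experiment above and assumes that the batch explores every combination exhaustively and that all simulations take about the same time, so that they run in successive "waves". Both points are assumptions, and the `waves` helper is purely illustrative.

```python
from math import ceil, prod

# Number of values listed for each parameter in the experiment definition.
values_per_parameter = {
    "evaporation": 6,
    "exploitationRate": 5,
    "numBikes": 3,
    "WanderingSpeed": 3,
}
repeat = 15  # the `repeat:` facet of the experiment

# Assumption: every combination of parameter values is explored once.
parameter_sets = prod(values_per_parameter.values())
total_runs = parameter_sets * repeat


def waves(total: int, in_parallel: int) -> int:
    """Number of successive groups needed to run `total` simulations."""
    return ceil(total / in_parallel)


# Default preference: only the repetitions of one parameter set run together,
# so there is one wave of 15 simulations per parameter set.
waves_default = parameter_sets
# Preference enabled: simulations from different parameter sets share a wave.
waves_all_40 = waves(total_runs, 40)
# What was observed on GAMA 1.8.2: only 4 simulations at a time.
waves_observed_4 = waves(total_runs, 4)

print(parameter_sets, total_runs, waves_default, waves_all_40, waves_observed_4)
```

Under these assumptions, going from 4 to 40 simultaneous simulations cuts the number of waves by roughly a factor of ten.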

@ptaillandier ptaillandier added the ❌ Not reproducible The issue cannot be reproduced in the same context label Mar 10, 2023
@chapuisk
Contributor

Hey,
There is also an option for the batch script wrapper, -hpc (see the script's -help for more), that lets you define the number of cores GAMA will use to distribute experiment runs.
Kevin

@NaroaCS
Author

NaroaCS commented Mar 10, 2023

Thanks so much to both of you - it is working now!! 😄
Here's what I did:

  • I updated to GAMA version 1.9.
  • When I included the -hpc 40 in the bash command, it started to launch the 15 repetitions in parallel
  • Once I included the gama.pref_parallel_simulations_all <- true; in the experiment init, it launched the 40 in parallel.

@RoiArthurB
Collaborator

@chapuisk the -hpc option should only be needed when you want to limit the number of threads; by default, GAMA is supposed to use all available resources.

@NaroaCS can you verify that it still works with Patrick's suggestion but without the -hpc option? If not, it'll be another issue 🙃

@NaroaCS
Author

NaroaCS commented Mar 10, 2023

Just tested:

  • It works well without the -hpc now that I have gama.pref_parallel_simulations_all <- true;
  • Without the gama.pref_parallel_simulations_all <- true; I understand that it should be running 15 threads, but it only does so if I include the -hpc

@AlexisDrogoul
Member

@NaroaCS So even on GAMA 1.9, w/o the -hpc option and w/o gama.pref_parallel_simulations_all <- true; , you only run 4 simulations in parallel ?
@RoiArthurB If it is the case, could you open a new issue specifically on this ?

@AlexisDrogoul AlexisDrogoul added 😅 Workaround Issue is not fixed but a workaround exists About Batch This issue concerns batch experiments labels Mar 11, 2023
@AlexisDrogoul AlexisDrogoul added this to the GAMA 1.9.0 milestone Mar 11, 2023
@AlexisDrogoul AlexisDrogoul added the About HPC This issue concerns the use of GAMA in an High Performance Computing environment label Mar 11, 2023
@RoiArthurB
Collaborator

@AlexisDrogoul from what I understand, with the pref_parallel_simulations_all everything works as expected, but without it, it needs the -hpc parameter to have more than 4 simulations...

That's quite strange, but it seems quickly fixable by enabling pref_parallel_simulations_all by default for headless batch runs (which, IMO, is the expected behavior) 🤔🤔🤔

@AlexisDrogoul
Member

OK. I agree with this change. At first, I thought that we might not need to set the preference (as it might have other side effects), but it appears that the exploration algorithms do not factor out their calls to this preference much, so there is no single place where we could write something like if (GamaExecutorService.CONCURRENCY_SIMULATIONS_ALL.getValue() || experiment.isBatch() && experiment.isHeadless())...... unless of course you create a new method in GamaExecutorService and call that method instead.
So I guess it is OK to set the value of the preference when launching a new experiment.
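
For readers who want to see what such a centralised check would decide, here is a minimal sketch in Python (GAMA itself is written in Java, and this is not its code). The function name and its three boolean arguments are hypothetical stand-ins for the preference value and the two experiment properties mentioned above. Note that in the Java expression `&&` binds more tightly than `||`, which the explicit parentheses below reproduce.

```python
def run_all_simulations_in_parallel(pref_all: bool,
                                    is_batch: bool,
                                    is_headless: bool) -> bool:
    """Hypothetical single entry point for the parallelism decision.

    Mirrors `pref || batch && headless`: the preference wins when it is set,
    otherwise headless batch experiments are treated as if it were set.
    """
    return pref_all or (is_batch and is_headless)


# Headless batch run with the preference left at its default value.
print(run_all_simulations_in_parallel(False, True, True))
# Batch run from the GUI with the preference left at its default value.
print(run_all_simulations_in_parallel(False, True, False))
```

Calling one function like this from every exploration algorithm would avoid changing the stored preference, at the cost of touching each call site.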

@AlexisDrogoul
Member

I've committed a solution -- please test and close the issue if it is the intended behaviour.

@AlexisDrogoul
Member

The commit is here: gama-platform/gama@ba1f1ee

(somehow, my first line has disappeared).

@NaroaCS
Author

NaroaCS commented Mar 13, 2023

I just tested it with this version
GAMA_1.9.0_Linux_with_JDK_03.13.23_648d692c.zip (sorry I didn't know how to download it from the commit).

Now, w/o pref_parallel_simulations_all and w/o -hpc:

  • If I set parallel: 40 in the experiment, it correctly launches 40 simulations
  • If I set parallel: 15 in the experiment, it launches 15
  • If I don't define the parallel option, it also launches 40 sims because that is the number of threads

If I do it w/o pref_parallel_simulations_all but setting -hpc 15, the -hpc seems not to do anything, and it still launches 40 sims.

@AlexisDrogoul
Member

AlexisDrogoul commented Mar 13, 2023

@RoiArthurB if my understanding is correct, -hpc should be treated like parallel:, right ?

So we should probably work a bit on the priority of these options.

Right now, we have:

  • no -hpc, no parallel:: pref_parallel_simulations_all is considered as true and all cores are used.
  • no -hpc, parallel: defined: pref_parallel_simulations_all is considered as true but in the limit defined by parallel
  • -hpc defined, parallel: defined or not: it seems that the value of -hpc is not considered.

In my opinion, when the modeler wants to limit the number of threads, -hpc should have the priority, followed by parallel:. The easiest solution would then be to translate the value passed from -hpc to parallel: before launching the experiment. What do you think ?
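
The proposed priority can be written down as a small decision function. This is only a sketch of the proposal in Python, not GAMA's implementation: `effective_parallelism` and its argument names are hypothetical, `hpc` stands for the value passed to -hpc and `parallel` for the parallel: facet, with `None` meaning "not given".

```python
from typing import Optional


def effective_parallelism(available_cores: int,
                          hpc: Optional[int] = None,
                          parallel: Optional[int] = None) -> int:
    """Proposed priority: -hpc first, then parallel:, then all cores."""
    if hpc is not None:
        # Proposal: the command-line limit overrides the model's facet.
        return hpc
    if parallel is not None:
        return parallel
    # Neither limit given: use every available core.
    return available_cores


# The three situations listed above, on the 40-thread server.
print(effective_parallelism(40))                       # no -hpc, no parallel:
print(effective_parallelism(40, parallel=15))          # parallel: only
print(effective_parallelism(40, hpc=15, parallel=40))  # -hpc should win
```

With the behaviour reported at the time, the third case still launched 40 simulations; under the proposal it would launch 15.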

@NaroaCS
Author

NaroaCS commented Mar 21, 2023

Not sure if this is a related issue or maybe it is unrelated. Let me know and I can move it to a new Issue if needed.

Currently, I am running 15 threads in parallel in a server that has 1 T of RAM and 40 threads.

I have set the specs in Gama.ini as follows:

-Xms4096m
-Xmx100g
-Xss1g
-Xmn50g

But I keep getting this error after some hours of execution:

Message: Your system is running out of memory. GAMA will exit now. Please try to quit other applications and relaunch it
Exception in thread "Timer-0" java.lang.OutOfMemoryError: Java heap space
        at java.prefs/java.util.prefs.FileSystemPreferences.sync(FileSystemPreferences.java:768)
        at java.prefs/java.util.prefs.FileSystemPreferences.flush(FileSystemPreferences.java:844)
        at java.prefs/java.util.prefs.FileSystemPreferences.syncWorld(FileSystemPreferences.java:484)
        at java.prefs/java.util.prefs.FileSystemPreferences$3.run(FileSystemPreferences.java:451)
        at java.base/java.util.TimerThread.mainLoop(Timer.java:566)
        at java.base/java.util.TimerThread.run(Timer.java:516)
naroa@matlaberp4:~/GAMA_1.9.0_march13/headless$ 

This is the behavior of the server:
image

Memory use seems very low, but the cache and buffer are high (green plot). That may not be due to this process, since they already seemed high before I launched anything. Can cache and buffer memory trigger the OutOfMemory error? I also see some peaks in the load, which I am not sure should have happened, and I don't know whether they can also affect memory usage.
Do you have any ideas on why this might be happening? Is Gama accumulating data that I should be clearing? I am saving everything that I need on .csv-s, so I wouldn't need any other info on the simulations.

Thanks for your help!

@RoiArthurB
Collaborator

> In my opinion, when the modeler wants to limit the number of threads, -hpc should have the priority, followed by parallel:. The easiest solution would then be to translate the value passed from -hpc to parallel: before launching the experiment. What do you think ?

@AlexisDrogoul I agree and will commit something to apply it.


> Currently, I am running 15 threads in parallel in a server that has 1 T of RAM and 40 threads.
>
> I have set the specs in Gama.ini as follows:
>
> -Xms4096m
> -Xmx100g
> -Xss1g
> -Xmn50g
>
> But I keep getting this error after some hours of execution:
>
> Message: Your system is running out of memory. GAMA will exit now. Please try to quit other applications and relaunch it
> Exception in thread "Timer-0" java.lang.OutOfMemoryError: Java heap space
>         at java.prefs/java.util.prefs.FileSystemPreferences.sync(FileSystemPreferences.java:768)
>         at java.prefs/java.util.prefs.FileSystemPreferences.flush(FileSystemPreferences.java:844)
>         at java.prefs/java.util.prefs.FileSystemPreferences.syncWorld(FileSystemPreferences.java:484)
>         at java.prefs/java.util.prefs.FileSystemPreferences$3.run(FileSystemPreferences.java:451)
>         at java.base/java.util.TimerThread.mainLoop(Timer.java:566)
>         at java.base/java.util.TimerThread.run(Timer.java:516)
> naroa@matlaberp4:~/GAMA_1.9.0_march13/headless$

@NaroaCS If you still start the headless with the bash script, then the configuration file Gama.ini isn't read. Instead, you should use the -m parameter, which sets the Java -Xmx option, as follows:

$ bash gama-headless.sh -m 100g -batch batch_experiments_pheromone ../../VehicleClustering/CS_CityScope_GAMA/models/clustering.gaml

This is because the headless is started with Java "by hand" (not through a binary, as the GAMA GUI is), and we never thought about reading this file... But it might be a great default behavior; I will add that too 🤔

Also, we do not recommend tuning memory with -Xss (per-thread stack size) or -Xmn (young-generation size); instead, let GAMA scale across the full RAM and all cores by itself (this lets GAMA allocate RAM to threads more dynamically when needed, etc.).
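
If the headless script is launched from another program, the command above can be assembled from its parts. The following is a small sketch in Python, not part of GAMA: `headless_command` is a hypothetical helper, and it only uses the flags mentioned in this thread (-m, -hpc, -batch). Placing -hpc before -batch is an assumption, since the thread does not show where it goes.

```python
from typing import List, Optional


def headless_command(experiment: str,
                     model_path: str,
                     max_memory: Optional[str] = None,
                     hpc_cores: Optional[int] = None) -> List[str]:
    """Build the argument list for gama-headless.sh (hypothetical helper)."""
    cmd = ["bash", "gama-headless.sh"]
    if max_memory is not None:
        # -m sets the maximum Java heap size, e.g. "100g".
        cmd += ["-m", max_memory]
    if hpc_cores is not None:
        # Assumption: -hpc is accepted before -batch.
        cmd += ["-hpc", str(hpc_cores)]
    cmd += ["-batch", experiment, model_path]
    return cmd


cmd = headless_command(
    "batch_experiments_pheromone",
    "../../VehicleClustering/CS_CityScope_GAMA/models/clustering.gaml",
    max_memory="100g",
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, cwd=...)` with `cwd` pointing at GAMA's headless folder, which is where the script is run from in this thread.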

> Do you have any ideas on why this might be happening? Is Gama accumulating data that I should be clearing? I am saving everything that I need on .csv-s, so I wouldn't need any other info on the simulations.

I think it's because the maximum memory wasn't set properly (so it was limited to 4 GB).

Also, if you want to make better use of the RAM over a long batch, you can add the facet keep_simulations: false to your experiment. By default, this facet is true: ended simulations are kept in RAM so that charts can be drawn across all simulations. But since you said you don't need to keep anything from them, you can add this so that the garbage collector clears every ended simulation from memory ;)

@lesquoyb lesquoyb moved this from To be tested to To fix in Gama 1.9.1 Mar 22, 2023
@lesquoyb lesquoyb removed ❌ Not reproducible The issue cannot be reproduced in the same context 👍 Fix to be tested labels Mar 22, 2023
@NaroaCS
Author

NaroaCS commented Mar 22, 2023

Thanks so much, @RoiArthurB !! :D
I've just launched the simulations, so I'll need to wait for a day or so to see if it stays alive. I'll keep you posted!

@NaroaCS
Author

NaroaCS commented Mar 24, 2023

That was it! I'm not getting the memory error anymore!
Thanks so much 😃

@lesquoyb lesquoyb moved this from To fix to Done in Gama 1.9.1 Mar 28, 2023
@lesquoyb
Collaborator

I think that covers your initial problem. I'm closing this issue and will let @RoiArthurB open one new issue for the -hpc problem and another for reading parameters from the ini file.
