Speedup loop_ecology that was too slow for the evol simulation #58
Conversation
Seems to deliver the described speedup and to be compatible with the reported unit tests on the species list and the richness plotting objects.
Thank you for looking into this and contributing to the code. Unfortunately, the code does not fully recreate the species object class (admittedly, that is not properly covered by the unit tests):
While it's a bit faster than the current implementation, I'm not convinced that this is an actual bottleneck. A test with a 5828-species file I had at hand showed ~700 ms vs. 420 ms (median) for the current vs. the new implementation. Do you have profiling data that shows this to be the bottleneck? Nevertheless, thank you again for looking into this.
@benj919 Hi, sorry, I forgot to mention that only a few evol simulations (~20) are really slow with the original implementation. The modified part becomes a bottleneck only with certain evol landscape configurations, such as the 12421th landscapes config file, and only after about 200 iterations; at the very beginning it is still fast enough.
Whereas I printed the content of species_list for both the current implementation and the improved one into two different text files and compared them with the Linux diff command: I see no difference. PS: The evol simulation with the 12421th landscapes configuration could not finish within 48 hours. With the current implementation, after about 200 steps the concerned code segment takes about 40 seconds, whereas the improved implementation takes about 6 seconds.
Obviously the identical() function fails for the doubles here: printing rounds and truncates the values away from their true value. If you compare the View() calls in the sample code above, you'll see the missing class attributes.
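A minimal illustration (not from the PR itself) of why diff-ing printed output can miss differences that identical() still catches:

```r
# Two doubles that differ only below R's default print precision.
x <- 1/3
y <- x + 1e-12

# Printed (and hence diff-ed) output looks the same...
print(x)  # 0.3333333
print(y)  # 0.3333333

# ...but identical() detects the difference bit-for-bit.
identical(x, y)  # FALSE
```

This is why comparing text files produced with print() is weaker evidence of correctness than an identical() check on the objects themselves.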
How many species do you have in your simulations? I don't know what your enumeration of simulations stands for. Do you have a profiling trace for them?
The one I used, number 12421, has the parameter max_number_of_species = 200000. I used the files provided by ohagen, and I do not have a profiling trace for the current implementation because the concerned simulation does not seem to be able to finish within 48 hours.
I have added the explicit return statement to test2. If there is none, R will (silently) return the value of the last expression evaluated in a function. Do you have some profiling data for any of your simulations? Simulations with a lot of species routinely run for a long time. How did you identify this part of the code as the bottleneck?
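A small illustration of R's implicit-return behaviour (toy functions, not code from the PR):

```r
# Without an explicit return(), R returns the value of the last
# evaluated expression -- which may not be what you intended.
f_implicit <- function(x) {
  y <- x * 2
  y            # last expression: this becomes the return value
}

f_explicit <- function(x) {
  y <- x * 2
  return(y)    # explicit and unambiguous
}

f_implicit(3)  # 6
f_explicit(3)  # 6

# Pitfall: if the last statement is an assignment, the value is
# returned invisibly, which can silently change downstream behaviour.
g <- function(x) {
  y <- x * 2   # value is returned, but invisibly
}
print(g(3))    # 6, although g(3) alone prints nothing
```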
I submitted a lot of simulations on a cluster and found a few that did not finish within 48 hours. I then picked one simulation with all its config and landscape files and kept running it locally with some timing code inserted into the package. I found that the loop_ecology() function becomes slower and slower, so I put more timing instructions into that function to figure out which lines are the cause. If we keep running the package on the evol simulation 12420 provided by ohagen, the current implementation becomes slower at each iteration; after 200 iterations, the concerned code segment takes up to 40 seconds. The total runtime grows at each step, which is how I identified this as a bottleneck. With the modified implementation the simulation does not slow down as much: at step 200, the concerned code with the modification only takes about 6 seconds. By the way, I also compared the resulting species plots generated at each iteration between the current implementation and the modified one, and I see no difference.
Also, you can compare the objects to themselves before and after applying the function:
@ZHG2017, I forwarded @benj919 the last optimization report you sent me, which might contain some hints for the questions he asked. I will merge this after @benj919's questions are answered.
@benj919, I will add a further issue if you think that adding the tests @ZHG2017 developed could be handy for our checking routine.
For optimized code, please ensure that the results are bitwise identical to the previous results, unless there is a proven reason for the change to alter the result. For identifying bottlenecks, please familiarize yourself with R's profiling functions to get a reliable overview of what's going on. I have forwarded @ohagen the scripts I used during the initial optimizations to check for correctness, along with a calling script using optparse that has a switch to enable and run the profiling data collection.
I don't think I could use any script to identify the bottleneck, because the concerned simulation runs for more than 48 hours and I have never waited until its end. At the moment I have no idea how to get a result bitwise identical to the previous one. So the choice becomes: either we accept results that are not bitwise identical, or we abandon those slow simulations.
You don't need those simulations to finish to start looking for trends: you can stop them after 48 h and already evaluate the profiling traces. How many time steps did those simulations finish in 48 hrs? Out of those, how many hours were spent in the optimized code fragment above? We know that simulations with many species have an escalating cost associated with them; that's why we originally implemented the abort conditions on too many species being alive. You should get the code for the bitwise comparison from ohagen soon.
Let me provide a bit more detail: there is one folder for the config files (more than one file inside) and another folder for the landscape files (more than one file inside); the R script uses both folders for one specific simulation. My question then is: how could we benchmark and identify the bottleneck of a simulation that never finishes?
By not finishing it. The logfiles should tell you roughly how many time steps a simulation can finish within 48 hrs. By adjusting either the start or the end time step you can limit the simulation to run for a shorter amount of time, or you can just abort the simulation after 48 hrs and compare the partial outputs. Once you have identified and optimized the bottlenecks present in those partial runs, you can repeat the process and progress further into the simulation within the next 48 hrs. Depending on the input size, you should be able to figure out the trends within 12-hour or even shorter runs.
I need to mention more details: keeping this simulation running locally, we can see that at each step it runs slower than at the previous step. The timing I printed out shows that the only code segment whose runtime increases at each step is the one I tried to improve, and locally I can see an obvious speedup with the modification, so I wonder why another benchmark would be necessary to identify the bottleneck. By the way, I am reading your script and I see Rprof() used for benchmarking, which I had considered before, but I personally don't like it since it shows too much irrelevant information, like the intermediate function calls, which is troublesome.
You need to show the overall behaviour of the code. If the simulation gets slower with every time step, the base assumption is that every piece of code gets slower; the interesting part is which piece gets disproportionately slower. For that you need to either do the profiling or, if you want to go with manual timing, fully time all the major parts of the code. Once you have that information you can plot, for example, runtime vs. time step, where every bar is split into the time per component/function. Do you have graphs showing this behaviour for all code parts? If this were the only code segment that gets slower, the simulation would have to go through ~4000+ time steps at 40 sec/iteration. Rprof can also profile by code line, which usually promotes the high-level functions to the top of the list: Rprof(..., line.profiling = TRUE) and summaryRprof(..., lines = "show"). I agree that it shows a lot of things, but they are needed to decide what is actually important and what is not.
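A hedged sketch of the line-profiling workflow described above; the workload here is a self-contained stand-in for the real simulation, not code from the package:

```r
# Toy workload so the sketch runs on its own; in practice you would
# wrap the simulation call (or source the simulation script with
# keep.source = TRUE) between Rprof() and Rprof(NULL).
slow_step <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + sqrt(i)
  s
}

Rprof("profile.out", line.profiling = TRUE)  # start collecting samples
invisible(replicate(50, slow_step(1e5)))     # work to be profiled
Rprof(NULL)                                  # stop profiling

# Summarise by source line; the slowest lines float to the top.
summaryRprof("profile.out", lines = "show")
```

Note that line-level information is only available for code carrying source references, so functions pasted into a bare console may show up without line locations.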
I first put timing around each function in the main loop of the package and saw that only loop_ecology() shows this behaviour; the other functions, like loop_dispersal() or loop_evolution(), do not. Then I did some timing inside loop_ecology() and found that the code segment I identified as the bottleneck increases its runtime from less than one second to about 40 seconds, and after another 50 steps it takes up to 60 seconds.
Hi, I have another question. I have just tried the following code:
But I got an error like the following:
So I wonder how I could get all the attributes of species_list, either using View() or anything else? Could you please give me a suggestion?
Here's a one-line lapply implementation that preserves the attributes:
View() is an interactive function whose output can't usually be printed. The outer species_list is just a list; you can get the attributes of its elements with the attributes() function:
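A hedged sketch of inspecting and preserving attributes with attributes() and lapply(); the contents of species_list and the class name here are illustrative, not the package's actual species class:

```r
# Illustrative stand-in for one species object with a class attribute.
species_list <- list(
  structure(list(abundance = c(a = 1.2, b = 3.4)), class = "species_stub")
)

# Inspect attributes from a plain Rscript session (no RStudio needed):
attributes(species_list[[1]])
str(species_list[[1]])

# A transformation that keeps every attribute, by copying the whole
# element and modifying only the field of interest:
doubled <- lapply(species_list, function(sp) {
  out <- sp
  out$abundance <- sp$abundance * 2
  out
})
identical(attributes(doubled[[1]]), attributes(species_list[[1]]))  # TRUE
```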
OK, I see, thank you! Let me run a partial benchmark locally to see if it still works. I did not know this syntax; I need to learn more R syntax later.
On the View() error: are you using RStudio? View() is very helpful for manually/visually inspecting elements, as is the environment browser it offers.
Oh, I see. In fact, I launch the code with the Rscript command; I do not use RStudio or any IDE.
The runtime looks good:
I use Sys.time() at the beginning and at the end of this loop to get the runtime of the current implementation as the first timing message, then another timing for the optimized one; I keep both implementations in the same script just to compare the two. By the way, the simulation runs in reverse order, from 2000 to 0.
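A sketch of this Sys.time() timing pattern; the computations below are stand-ins for the two loop implementations, not the package's code:

```r
# Time the "current" implementation.
t0 <- Sys.time()
result_old <- sum(sqrt(seq_len(1e6)))   # stand-in for the current loop
t1 <- Sys.time()
cat("current implementation:", format(t1 - t0), "\n")

# Time the "optimized" implementation in the same script.
t0 <- Sys.time()
result_new <- sum(sqrt(seq_len(1e6)))   # stand-in for the optimized loop
t1 <- Sys.time()
cat("optimized implementation:", format(t1 - t0), "\n")

# Keeping both in one run also allows a direct correctness check:
stopifnot(identical(result_old, result_new))
```

For short code segments, system.time() or the microbenchmark package give more reliable numbers than a single Sys.time() difference.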
After about 200 iterations, the runtime shows that the current implementation becomes a bottleneck:
Hi benj919, I did some plot checking as well as bitwise species_list checking between the current implementation and the improved one using identical(), so I have adopted your suggested modification and pushed again for merging. Thank you for the suggested modification for getting bitwise correctness!
Allocating the species_list is superfluous, but otherwise this should work. I'd recommend merging future changes into the development branch though, to keep track of what's in the released package and what's still under development.
OK, I thought it was just a small modification, so I asked to merge directly into the master branch; next time I will ask to merge into the development branch first.
Agree with Benji! |
Hi, I have made the exact improvement in the loop_ecology() function in ecology.R, which improves performance for the evol simulations where the last for loop constructing species_list was identified as a bottleneck.