[Re] Ten years challenge: Velho and Legrand (2009) - Accuracy Study and Improvement of Network Simulation in the SimGrid Framework #39
Comments
Thanks for your submission. I'm afraid I have a conflict of interest (https://rr-france.github.io/bookrr/) and cannot edit it. @labarba Could you edit this submission for the Ten Years Reproducibility Challenge (only 1 reviewer needed)? |
I don't have field expertise in computer networks, and I'm also slammed with multiple service roles, so I cannot take one more task. |
@labarba Ok thanks for the quick answer. |
OK, let's try that. @alegrand, any reviewer suggestion? |
And we have a reviewer: @rgrunbla |
Thanks a lot @rgrunbla ! |
Hi, I have a few questions / comments:
I get 10581 lines of output (according to wc -l) before it crashes, and it is always the same number of lines. Could you confirm whether this also happens on your side?
Thanks, Rémy |
Hi Remy, sorry for the late reply. Huge thanks for being more responsive than me. I've been quite busy over the last days so I haven't been able to peacefully look at what may be wrong. I should finally be able to do this in the following days. Thanks for your patience. Best, Arnaud |
@benoit-girard @alegrand Gentle reminder |
Hi @rgrunbla, @benoit-girard and @rougier . I really apologize for this unacceptable delay. :( I have finally been able to find some time to look into this this afternoon (on a completely fresh system on an old laptop, as my regular one recently crashed, 3 weeks after the end of the warranty :( ). Good catch! There is indeed a problem, even though it does not fail in the same way as it does for you. In my case the script runs to completion (no crash), which is why I had not noticed this. But indeed, when I monitor stderr, there is one configuration for which I get the same message as you:
I have to say I do not understand why it would crash and stop on your machine and run to completion on mine, since we're both using a Docker image! Anyway, I looked into the log to determine which configuration fails. There is only one, which corresponds to this in the log file:
This is extremely weird since when looking in the log files of the original article, I should get:
So out of the 7920 tested configurations, there is one that mysteriously fails (the simulation does not even start) and it does not appear to have any particular characteristic. I'm going to investigate this during the weekend and I'll keep you posted. |
Thanks for the update! |
Hello, This is really crazy. It seems to fail solely for this particular configuration (B=1E5, L=0.5, S=17000, M=GTNets) and I really cannot figure out why. It works like a charm for every value of S in [16990,17010] except 17000 (note that this parameter is the size of the message which is sent from one host to another in the simulation, so this is really weird)!!!
I've activated SimGrid's debugging logs, and when you compare the execution for 17000 and 17001, ignoring pointer address differences, the first difference is right before the deadlock message: there is no next action end for GTNets whereas there should be one. So I've run gdb, and even attached to the forked child (because the GTNets simulation is forked to determine the next completion), and although a flow was created, it appears that there is no event to dequeue. It already took me quite some time, and I do not know GTNets well enough (and I really do not want to debug it) to dig into what could be wrong. The goal of this challenge was not to fix/improve the old code (especially as this one was a prototype which has been deprecated when SimGrid moved to NS3, as GTNets was no longer developed).
So I think I'll stop the investigation there and amend the article by explaining that I checked that all the (7920) "new" results match the (316800) "old" ones, except for that one mysterious configuration where it stops. While I'm at it, I'll take @rgrunbla 's suggestion about the overlay into account and update the URLs with more stable ones (gforge is closing, and docker recently announced that they would automatically remove images that had not been accessed in the last few months. :( ). The other weird point is that the behavior of this old perl script seems to differ between my machine (the simulation fails silently) and that of @rgrunbla (it stops). If we ever understand what was wrong, I'll also amend the article. |
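For readers who want to repeat the log comparison described above, here is a minimal sketch in Python of one way to find the first difference between two debug logs while ignoring pointer addresses. It is not the procedure actually used in this thread: the function names, the `<ptr>` placeholder and the sample log lines are all invented for illustration and do not reproduce real SimGrid or GTNets output.

```python
import re
from itertools import zip_longest

# Hexadecimal pointer values (e.g. 0x55d1a0) change from run to run.
POINTER = re.compile(r"0x[0-9a-fA-F]+")

def mask_pointers(line: str) -> str:
    """Replace every hexadecimal address with a fixed placeholder."""
    return POINTER.sub("<ptr>", line.rstrip("\n"))

def first_difference(log_a, log_b):
    """Return (line_number, a, b) for the first differing line, or None.

    If one log ends early, the missing line is reported as None.
    """
    for n, (a, b) in enumerate(zip_longest(log_a, log_b), start=1):
        a = None if a is None else mask_pointers(a)
        b = None if b is None else mask_pointers(b)
        if a != b:
            return n, a, b
    return None

# Hypothetical log excerpts, invented for illustration only.
run_17000 = ["[debug] flow created at 0x55d1a0",
             "[info] deadlock detected"]
run_17001 = ["[debug] flow created at 0x7f33b8",
             "[debug] next action end",
             "[info] done"]

print(first_difference(run_17000, run_17001))
# (2, '[info] deadlock detected', '[debug] next action end')
```

With real log files, the two arguments could be open file objects, since both iterate line by line. Any field that legitimately differs between the two runs (here, the message size itself) would need to be masked in the same way.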
Yeah, sorry about the long delay in answering. I'm completely satisfied with @alegrand's answer and explanations. No additional requirements, everything is OK on my side. |
Ok, then I have the pleasure to state that the paper has been accepted! |
@rgrunbla : do you have an ORCID? If yes, please let me know... |
Yes I do ! https://orcid.org/0000-0002-9146-9888 |
@alegrand a gentle reminder that I need your updated article.pdf to proceed with publication. |
🔔 This is a wakeup call for @alegrand. All we need from you is a final PDF for publication! 🔔 |
@alegrand please finalize the pdf update (alegrand/reproducibility-challenge#1) |
@alegrand a gentle reminder that without the updated pdf your paper cannot be published. |
@benoit-girard Maybe you can try email |
I just tried that, we will see... |
I have finally taken the time to polish this article. I deeply apologize to the editors and the reviewer (who all worked in a timely manner) and thank them for their patience. |
Some information has been lost since March: I had requested the integration of my pull request (alegrand/reproducibility-challenge#1), which contains the necessary metadata concerning editor and reviewer identity, as well as the dates of submission, acceptance and publication. |
@benoit-girard, I have finally merged and kind of fixed the latex (for some reason, after merging, lualatex no longer wanted to break my very long hrefs generated by the swhid). It is here and is, imho, OK for a final version: https://github.com/alegrand/reproducibility-challenge/blob/master/article.pdf |
@benoit-girard @alegrand Is the article ready to be published then? I can help if necessary. |
Well yes, I think so. As I said, the final version is ready but I'm not sure how to proceed with the publication. |
@benoit-girard will proceed with the publication (or I can do it, just tell me, Benoît) |
Hi @benoit-girard and @rougier . Someone recently asked me how to cite this work and I just realized that I cannot find this article at http://rescience.github.io/read/#volume-8-2022 nor at http://rescience.github.io/read/#volume-9-2023 and that this issue is still open. Did I miss something again? |
I think it is still unpublished. @benoit-girard can you confirm? What is needed for publication? |
Sure, where are the latex sources? |
Next to the pdf I mentioned earlier. 😄 https://github.com/alegrand/reproducibility-challenge/blob/master/article.tex 🙏 |
Ok, here is the sandbox version: https://sandbox.zenodo.org/record/6265 |
It's a matter of formatting, but page 2 begins with a comma, which is quite strange... Otherwise, it seems to be OK. |
Good catch. @alegrand OK if I remove the spurious space before the comma? |
Of course. Sorry about this. I hadn't noticed. Be my guest! 🙏 |
It's online !!! https://zenodo.org/record/10275726. It will soon appear on the journal website. |
Original article: https://dl.acm.org/doi/10.4108/ICST.SIMUTOOLS2009.5592 or https://hal.inria.fr/inria-00361031/file/simutools09.pdf
PDF URL: https://github.com/alegrand/reproducibility-challenge
Metadata URL: https://hal.inria.fr/inria-00361031/
Code URL: https://github.com/alegrand/reproducibility-challenge
Scientific domain: Computer Science
Programming language: C/C++
Suggested editor: @khinsen or @rougier