Using time stamp with RunPostCrab? #61
Comments
I'm not sure I understand the issue here. What's the point of redoing a production with the same code version? You'll just duplicate what you've already done, no? Currently, if you run the post-crab script for a sample that is already in the database (meaning the sample name is already in the database), the sample is updated; otherwise a new one is added. By updated, I mean that only the variables here [1] are updated, and the path is not one of them, since the script was designed to handle the update of an existing task, not a completely new task with the same name. Moreover, the path is deprecated in favor of the list of files, which is correctly updated.
To summarize, if you relaunch a production with the exact same code, you really should delete the previous one from the database to avoid any issues. But indeed, we'll run into issues when launching the same code with a different Python configuration (systematics, for example), so we'll need to make the name unique in this case. Maybe that's your case? Adding a hash of the Python configuration to the sample name may be enough to solve the issue. Also, just for information, it's not really an issue for the database to have samples with the same name, as samples are uniquely identified by their ids and not by their names. It is, however, an issue for the post-crab script, since you only have access to the sample name. The timestamp is also already included in the database in the created and modified fields, so adding it to the name is not really needed.
[1] https://github.com/cp3-llbb/GridIn/blob/master/scripts/runPostCrab.py#L145-L167
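To make the hash suggestion concrete, here is a minimal sketch of how a configuration hash could be appended to a sample name. The function name, the `_` separator and the 8-character digest length are assumptions for illustration; none of this is taken from runPostCrab.py.

```python
import hashlib


def sample_name_with_config_hash(base_name, config_text, digest_length=8):
    """Return base_name suffixed with a short hash of the configuration text.

    Hypothetical helper: the name, the separator and the digest length are
    illustrative choices, not part of the existing script.
    """
    # Hash the configuration content, so that two productions made with the
    # same code version but different configurations get different names.
    digest = hashlib.sha1(config_text.encode("utf-8")).hexdigest()
    return "{}_{}".format(base_name, digest[:digest_length])


if __name__ == "__main__":
    # Hypothetical sample name and configuration snippets.
    print(sample_name_with_config_hash("MySample_v1", "lumiMask = 'full'"))
    print(sample_name_with_config_hash("MySample_v1", "lumiMask = 'partial'"))
```

The same configuration always gives the same name, so rerunning an identical production would still update the existing entry, while a changed configuration would produce a new one.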
Well, in my case it is something stupid: I made the changes to use the full lumi, and for some reason these changes were not propagated to the crab config files (which I don't understand, but perhaps it was a mistake on my side). Anyway, I wanted to run a second time with these changes taken into account, and then I discovered the "issue" I mentioned. But ok, I understand that I didn't manage my entries in the best way. I'll try to simply delete the wrong entries and recreate them with the correct prod. Thanks S.
Btw, is there any indication of how to delete entries in the db?
Ok, I see. Thinking a bit about this, I don't think we even have an easy way to remove a sample from the database, except by doing raw SQL queries...
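For reference, a raw-SQL deletion could look roughly like the sketch below. It uses an in-memory SQLite database and a hypothetical `sample` table with `sample_id`, `name` and `created` columns; the real database backend and schema may differ, so treat every table and column name here as an assumption. Since several samples can share a name, the sketch looks entries up by name but deletes by id.

```python
import sqlite3


def find_samples(conn, name):
    """List (sample_id, name, created) for every sample with this name."""
    # Table and column names are hypothetical, not the real schema.
    query = "SELECT sample_id, name, created FROM sample WHERE name = ? ORDER BY sample_id"
    return conn.execute(query, (name,)).fetchall()


def delete_sample(conn, sample_id):
    """Delete one sample by id and return the number of rows removed."""
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        cursor = conn.execute("DELETE FROM sample WHERE sample_id = ?", (sample_id,))
    return cursor.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sample (sample_id INTEGER PRIMARY KEY, name TEXT, created TEXT)")
    conn.executemany(
        "INSERT INTO sample (sample_id, name, created) VALUES (?, ?, ?)",
        [(1, "MySample_v1", "first run"), (2, "MySample_v1", "second run")],
    )
    print(find_samples(conn, "MySample_v1"))  # inspect the entries before deleting
    print(delete_sample(conn, 1))             # remove the outdated entry by id
    print(find_samples(conn, "MySample_v1"))
```

Any rows in other tables that reference the deleted sample (its list of files, for example) would have to be handled as well; this sketch does not cover that.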
Hmm, ok. Then perhaps for the time being I'll just hack the samples.json to point to the correct locations of the new prods. That's dirty, but it's ok for now.
We solved the issue "offline" with Simon, but we should really provide a script to delete samples from the database. Closing this issue and opening a new one.
Dear all,
Should we consider adding a time stamp (in addition to the tag related to the code version) when running runPostCrab, to distinguish two prods made with the same code version? It seems that at the moment this is not possible (is it?).
EDIT:
Actually, I think there is a bug. If you run twice with the same version of the code, the db shows something strange. Example: a first run on Dec 8th, then a second one today (the 11th). As you can see, the first line of the entry is correctly updated, but the repository is not. It still points to the one created on Dec 8th.