Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve G4 data setup #132

Merged
merged 4 commits into from
Mar 8, 2023
Merged

Improve G4 data setup #132

merged 4 commits into from
Mar 8, 2023

Conversation

nkrah
Copy link
Collaborator

@nkrah nkrah commented Mar 7, 2023

Changed and extended the Geant4 data setup. Now the script not only checks whether the geant4_data folder exists, but also checks whether all required subfolders exist. If anything is missing, a full fresh download of all data is triggered.

The script introspects the tar files to get the directory into which the tar files are extracted and in this way identifies subfolders which are probably outdated and not needed anymore. The user is informed about this, but the files are not removed.

If a subfolder into which a tar file needs to be extracted already exists (e.g. from a previous failed attempt to install G4 data), this script first deletes the directory and lets the tarfile library recreate it. This has to do with the way user IDs and permissions are handled by tarfile.

NB: The PR also corrects some improperly defined dictionaries (env variables) and adds an environment variable for a dataset that was previously missing.

* Check if data is up-to-date
* fix incorrectly defined env variable dict
* check for outdated data after download
@nkrah
Copy link
Collaborator Author

nkrah commented Mar 8, 2023

Some background info about tar extraction: The tar file contains the user ID of the user who packaged the tar, e.g. somebody at CERN. This user ID does most likely not correspond to the local user ID on the machine where opengate is installed. extractall function of the tarfile library uses the user ID given by the tar file to extract into a directory and changes the user IDs to those of the local user running the python script only after the extraction process. This means, if the directory into which the files are extracted already exists, e.g. from a previous run of the script, and it does not give write access to everybody, most likely the extractall function will fail because of lack of permission. If instead the directory does not exist yet, extractall creates it (with the user ID from the tar file) and the extraction works fine. Therefore, the download_G4_data() function in this PR checks whether the desired destination directory already exists and deletes it in that case before running the extractall() function.

Here is the excerpt from tarfile's doc:
"Extract all members from the archive to the current working directory or directory path. If optional members is given, it must be a subset of the list returned by getmembers(). Directory information like owner, modification time and permissions are set after all members have been extracted. This is done to work around two problems: A directory’s modification time is reset each time a file is created in it. And, if a directory’s permissions do not allow writing, extracting files to it will fail."
[https://docs.python.org/3/library/tarfile.html]

@tbaudier
Copy link
Contributor

tbaudier commented Mar 8, 2023

Thank you very much Nils, I merged

#129

@tbaudier tbaudier merged commit e1b825d into OpenGATE:master Mar 8, 2023
@nkrah
Copy link
Collaborator Author

nkrah commented Mar 8, 2023

Thanks. I was actually still slightly improving the PR. How should we do?

@dsarrut
Copy link
Contributor

dsarrut commented Mar 8, 2023

Please also measure the computation time needed for the check. (a simple "import opengate" before and after the PR merge). itk import is already "very" slow. This is not an issue when running a large simulation, but it may become cumbersome when developing and running scripts. thanks !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants