Extend conda interface: store conda environments in project #1372

Merged: 10 commits, Mar 28, 2024

jan-janssen (Member) commented Mar 12, 2024

This pull request changes the location of the conda environment and stores it inside the project in a conda folder. In addition, it checks if an environment already exists and in that case just raises a warning rather than recreating the environment.
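
A minimal sketch of the check described above, assuming each environment lives in its own subfolder of the project's conda folder. The helper name `ensure_project_env` and the per-environment subfolder layout are hypothetical and not taken from pyiron_base; the sketch only shows the "warn instead of recreate" behavior.

```python
import os
import warnings
from typing import Tuple


def ensure_project_env(project_path: str, env_name: str) -> Tuple[str, bool]:
    """Return (env_dir, created) for an environment stored inside the project.

    Hypothetical helper for illustration; not part of the pyiron_base API.
    """
    env_dir = os.path.join(project_path, "conda", env_name)
    if os.path.isdir(env_dir):
        # The environment is already there: warn and keep it as it is.
        warnings.warn(
            f"Environment {env_name!r} already exists at {env_dir}; skipping creation."
        )
        return env_dir, False
    os.makedirs(env_dir)
    # A real implementation would call conda here to populate env_dir.
    return env_dir, True
```

Calling the helper twice with the same arguments creates the folder on the first call and only emits a warning on the second.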

jan-janssen added the format_black label (reformat the code using the black standard) Mar 12, 2024
jan-janssen linked an issue Mar 12, 2024 that may be closed by this pull request
srmnitc (Member) commented Mar 14, 2024

How would this play out when the file system allows only a limited number of files, for example on an HPC cluster? One might want to keep the project on backed-up storage, but including the conda environment there would be overkill.

jan-janssen (Member, Author) commented Mar 14, 2024

> How would this play out when the file system allows only a limited number of files, for example on an HPC cluster? One might want to keep the project on backed-up storage, but including the conda environment there would be overkill.

I would exclude all directories which have conda in the name from the backup. This is similar to the way snakemake does it. The advantage is that it is clearer for the user which environment belongs to which project.
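
The exclusion rule suggested here could be sketched as follows, assuming a backup script that collects files by walking the project tree. The function name `iter_backup_files` and the `marker` parameter are hypothetical; a real backup tool would more likely use its own exclude patterns.

```python
import os
from typing import Iterator


def iter_backup_files(root: str, marker: str = "conda") -> Iterator[str]:
    """Yield file paths below root, skipping directories whose name contains marker.

    Hypothetical helper for illustration; not part of pyiron_base or snakemake.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [d for d in dirnames if marker not in d]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Because the rule is a plain substring match on the directory name, it would also skip an unrelated folder such as `anaconda_results`, so the naming convention has to be known to users.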

srmnitc (Member) commented Mar 14, 2024

> > How would this play out when the file system allows only a limited number of files, for example on an HPC cluster? One might want to keep the project on backed-up storage, but including the conda environment there would be overkill.

> I would exclude all directories which have conda in the name from the backup. This is similar to the way snakemake does it. The advantage is that it is clearer for the user which environment belongs to which project.

Is this possible, for example on cmti? I feel a lot of users might simply not know about it. I do see the advantages. I suggest we discuss this in one of the pyiron meetings.

jan-janssen (Member, Author)

I would like to include this functionality in the 0.8 release of pyiron_base. As we have no pyiron meeting next week because of the Easter holidays, it would be great to discuss this asynchronously.

srmnitc (Member) commented Mar 26, 2024

I still have reservations about storing the whole environment in the project, mainly about the number of files created and the backup, as I mentioned. Additionally, won't the pr.pack function now include the environment files themselves? But maybe @pmrv or others can also take a look.

jan-janssen (Member, Author)

> Mainly about the number of files created and the backup, as I mentioned.

You can still create a central project and store all your environments there; only the default is changing.

> Additionally, won't the pr.pack function now include the environment files themselves?

This is correct. Still, I guess in most cases people copy their jobs to an archive project anyway, as discussed in pyiron/FAQs#46.

Comment on lines 225 to 227
conda_dir = os.path.join(self.path, "conda")
os.makedirs(conda_dir, exist_ok=True)
return CondaEnvironment(env_path=conda_dir)
Contributor

Would it not be possible to make the conda path configurable here? That would sidestep @srmnitc's concerns, which I think are valid.

jan-janssen (Member, Author)

@pmrv and @srmnitc, based on your suggestions, I kept the default of creating a new central conda environment. Users can now disable the global installation and create the conda environment inside a given project only, using the global_installation=False option:

pr.conda_environment.create(env_name="test", env_file="environment.yml", global_installation=False)
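
A minimal sketch of how a flag like this might select the environment location. The function `resolve_env_dir`, the `global_root` parameter, and the per-environment subfolder layout are assumptions made for illustration; the actual pyiron_base implementation may resolve paths differently.

```python
import os


def resolve_env_dir(
    project_path: str,
    env_name: str,
    global_root: str,
    global_installation: bool = True,
) -> str:
    """Pick the directory for an environment based on the installation mode.

    Hypothetical helper for illustration; not part of the pyiron_base API.
    """
    if global_installation:
        # Default: a central location shared by all projects.
        return os.path.join(global_root, env_name)
    # Opt-in: keep the environment in the project's own conda folder.
    return os.path.join(project_path, "conda", env_name)
```

Keeping the central location as the default means existing backup and file-quota setups are unaffected unless a user explicitly opts in to project-local environments.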

srmnitc (Member) left a comment

lgtm

jan-janssen merged commit f379c2c into main on Mar 28, 2024
25 checks passed
jan-janssen deleted the conda_ex branch on March 28, 2024 11:19
Labels
format_black reformat the code using the black standard
Development

Successfully merging this pull request may close these issues.

More flexible conda environments
4 participants