Use `mamba` under a feature flag to create conda environments #6815
e1592ec to 1a7d11b
(that was with a small project installing just one package from I did a local test using
`mamba` is a fast drop-in replacement for the conda command-line utility, written in C++. I'm adding a feature flag so we can test it on selected projects that keep failing with OOM errors when solving dependencies, even when they have just one dependency, because they add conda-forge as a channel.
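As a rough illustration of what such a flag could gate, here is a minimal Python sketch. It is not the code from this PR: the function names are hypothetical, and the only details taken from the thread are the flag name `CONDA_USES_MAMBA`, the `conda install ... mamba` bootstrap command quoted later in the review, and the `conda env create` / `mamba env create` pair from the PR description.

```python
from typing import List, Set

# Flag name as it appears in the docs change of this PR.
CONDA_USES_MAMBA = "CONDA_USES_MAMBA"


def install_mamba_command() -> List[str]:
    """Bootstrap step: mamba is not available yet, so plain conda is used."""
    return [
        "conda", "install", "--yes", "--quiet",
        "--name=base", "--channel=conda-forge", "mamba",
    ]


def env_create_command(enabled_flags: Set[str], environment_file: str) -> List[str]:
    """Pick the executable for creating the environment (hypothetical helper)."""
    executable = "mamba" if CONDA_USES_MAMBA in enabled_flags else "conda"
    return [executable, "env", "create", "--file", environment_file]
```

Projects without the flag keep the existing `conda env create` path, so the change stays opt-in.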
1a7d11b to d34b8d0
IMHO, this is a good PR and it could help us reduce resource usage on our builders when building with conda. However, we have migrated our builders to bigger servers and builds are not failing anymore, so it's more of a nice-to-have currently. We can come back to this if we start having performance issues with conda again that make our builds fail.
we're working towards a |
I talked to Eric today to raise this topic again due to the acceptance that I'm reopening this PR to re-visit soon and give it another test pass (I remember that it was working good but just in case) and see if we can deploy this. The rollout plan would be something like:
'--yes',
'--quiet',
'--name=base',
'--channel=conda-forge',
@SylvainCorlay! Is it possible to install `mamba` from a different, and smaller, channel than `conda-forge` here?
I can't use `micromamba` at this point for "reasons" and I would like to install it with `conda`, but ideally using a channel that only contains `mamba`, so it does not make conda consume too many resources.
Would you consider using a miniforge flavor that includes mamba instead of miniconda?
Comment from the peanut gallery: regro/conda-metachannel#31 (comment) conda-metachannel service is down, unfortunately.
Perhaps if there was a `mamba` channel, it would help the bootstrapping problem. `conda install mamba -c mamba` would be blazing fast, and from there mamba & conda-forge could be used as usual.
About mamba in miniforge... from what I read in conda-forge/miniforge#23, looks like there is no consensus (the path of least resistance seems to be creating a `miniforge-mamba` or a `microforge` with `micromamba`?)
I can't use anything other than regular `conda` (like `micromamba`, `miniforge`, etc.) at this moment because I can't modify the Docker image we are currently using (I will be able to do this in the future, but I don't know exactly when yet).
So, given this restriction, I was asking whether something like what @astrojuanlu mentioned (`conda install mamba -c mamba`) already exists, because calling `conda install mamba -c conda-forge` currently has the same problem of consuming a lot of resources just because it uses the `conda-forge` channel, which contains millions of packages. Although the problem is there, it's not a blocker to start testing `mamba` here, but having a better workaround for this would be good.
> Comment from the peanut gallery: regro/conda-metachannel#31 (comment) conda-metachannel service is down, unfortunately.

This was a good idea when @astrojuanlu suggested it to me. However, if it's currently down, it doesn't seem to be something we can rely on by default.
Gotcha
My goal here is to avoid this random problem with the available tools and restrictions:
$ conda install --yes --quiet --name=base --channel=conda-forge mamba
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Solving environment: ...working... failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): ...working... Killed
Command killed due to excessive memory consumption
FYI @humitos miniforge now includes a "mambaforge" installer which has mamba pre-installed.
I am looking into what we could do to allow `conda install mamba` from a raw miniconda to be faster.
@humitos
One way to do this without having to maintain another channel with a copy of mamba and its dependencies is to use a conda lock file.
https://pypi.org/project/conda-lock/
You could generate a conda lock file offline, and use it in your script so that no solving is required.
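A minimal sketch of that lock-file idea, written as Python helpers that only build the command lines. The flags are my understanding of recent conda-lock releases (`--kind explicit`, `--file`, `--platform`) and of `conda install --file` with an explicit spec file; verify them with `conda-lock --help` before relying on this. The function names and any file names passed in are hypothetical.

```python
from typing import List


def lock_command(environment_file: str, platform: str = "linux-64") -> List[str]:
    """Run once, offline: solve the environment and write an explicit lock file."""
    return [
        "conda-lock", "--kind", "explicit",
        "--file", environment_file,
        "--platform", platform,
    ]


def install_from_lock_command(lock_file: str) -> List[str]:
    """Run in the build: install pinned package URLs, so no solve is needed."""
    # Assumption: lock_file is an explicit spec file produced by the step above.
    return ["conda", "install", "--yes", "--quiet", "--name=base", "--file", lock_file]
```

The trade-off is that the lock file has to be regenerated whenever a newer `mamba` is wanted, in exchange for skipping the memory-hungry solve against `conda-forge`.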
We don't have `mamba` at this point, so we need to force using `conda`. When we have `micromamba` installed in the Docker image, we will need to update this to use `mamba` here instead.
@@ -27,6 +27,18 @@ In case you prefer to use the latest ``conda`` version available, this is the fl
Makes Read the Docs to install all the requirements at once on ``conda create`` step.
This helps users to pin dependencies on conda and to improve build time.

``CONDA_USES_MAMBA``: :featureflags:`CONDA_USES_MAMBA`

``conda`` solver consumes 1Gb minimum when installing any package using ``conda-forge`` channel.
I think you mean resolver
- ``conda`` solver consumes 1Gb minimum when installing any package using ``conda-forge`` channel.
+ Conda's resolver consumes 1Gb minimum when installing any package using ``conda-forge`` channel.
Well... I think it's the same, but they call it "Solver" :)
https://docs.conda.io/projects/conda/en/latest/api/solver.html#conda.core.solve.Solver
Co-authored-by: Santos Gallegos <santos_g@outlook.com>
Awesome!
We are deploying this tomorrow. Please, contact us at email support if you want to enable this feature on your projects and give us feedback about how it works for your cases.
`mamba` is a fast drop-in replacement for the conda command-line utility, written in C++. See https://github.com/QuantStack/mamba
I'm adding a feature flag so we can test it on selected projects that keep failing with OOM errors when solving dependencies, even when they have just one dependency, because they add conda-forge as a channel in their environment file.
This is another attempt at making conda environments more stable. I'm not sold on this solution, but the tests I did were successful and the time was cut in half (`conda env create` compared to `mamba env create`). Peak memory was 230 MB with `mamba` and 955 MB with `conda-env`.
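The description does not say how these numbers were measured. One way to collect comparable figures, sketched here under the assumption of a Unix host, is to wait on the child process with `os.wait4` and read its `ru_maxrss`. The helper name `peak_rss` is hypothetical.

```python
import os
import subprocess
from typing import List, Tuple


def peak_rss(argv: List[str]) -> Tuple[int, int]:
    """Run argv and return (exit_code, peak resident set size of that child).

    Unix only. ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    """
    proc = subprocess.Popen(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    # wait4 reaps this specific child and returns its own resource usage.
    _, status, usage = os.wait4(proc.pid, 0)
    if os.WIFEXITED(status):
        code = os.WEXITSTATUS(status)
    else:
        code = -os.WTERMSIG(status)  # killed by a signal, e.g. the OOM killer
    proc.returncode = code  # keep the Popen object consistent after manual reaping
    return code, usage.ru_maxrss
```

Calling it once with the `conda env create` command and once with the `mamba env create` command gives a per-command peak. Note that it only covers the directly spawned process, not any grandchildren.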
The changes add a new step (installing mamba) that requires using conda-forge, which takes a few extra seconds. We could install it inside the Docker image directly (after installing conda) if we find that mamba helps considerably with conda environments.
Here is a good explanation of all these memory- and CPU-intensive problems when using `conda`: https://www.anaconda.com/understanding-and-improving-condas-performance/