New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for creating envs from requirements.txt files via pip #172

Open
wants to merge 8 commits into
base: develop
from

Conversation

Projects
None yet
9 participants
@dan-blanchard
Copy link

dan-blanchard commented Sep 27, 2015

I found myself frequently wishing this existed, so I added it.

It will use conda packages if available, and fallback to pip otherwise.

There was a bit of nonsense necessary to get editable packages working
properly, but it all works.

dan-blanchard added some commits Sep 26, 2015

Add basic support for creating environments from requirements.txt fil…
…es via pip.

I found myself frequently wishing this existed, so I added it.

It will use conda packages if available, and fallback to pip otherwise.

There was a bit of nonsense necessary to get editable packages working
properly, but it all works.
@dan-blanchard

This comment has been minimized.

Copy link

dan-blanchard commented Sep 30, 2015

I tweaked this a bit so that it now pulls in all dependencies of packages in requirements.txt and checks to see if those are available as conda packages. That way people don't end up with pip-installed numpy and other stuff like that.

@dan-blanchard

This comment has been minimized.

Copy link

dan-blanchard commented Sep 30, 2015

All tests are now passing, so this is good for review.

@dan-blanchard

This comment has been minimized.

Copy link

dan-blanchard commented Oct 23, 2015

I have no idea why this build is failing on Travis when it works fine on my machine.

@msarahan

This comment has been minimized.

Copy link

msarahan commented on conda_env/env.py in a114e73 Oct 23, 2015

why not check if dir exists first?

if not os.path.isdir(temp_dir):
    os.makedirs(temp_dir)

Seems a little nicer than dealing with exceptions?

"""
pip_cmd = ('pip', 'install', '--src', temp_dir, '--download', temp_dir,
'--no-use-wheel', '-r', requirements_path)
try:

This comment has been minimized.

@msarahan

msarahan Oct 23, 2015

Contributor

See comment elsewhere. Please use isdir to check if dir exists, then create if necessary rather than try/except.

def from_file(filename):
if not os.path.exists(filename):
raise exceptions.EnvironmentFileNotFound(filename)
with open(filename, 'rb') as fp:
return from_yaml(fp.read(), filename=filename)
if filename.endswith('.txt'):

This comment has been minimized.

@msarahan

msarahan Oct 23, 2015

Contributor

This seems a little too broad of a catch-all. Since the standard name is requirements.txt, would it be OK to match just that? Otherwise, I can see this leading to some interesting bug hunts.

This comment has been minimized.

@dan-blanchard

dan-blanchard Oct 26, 2015

I feel like people frequently have things like dev-requirements.txt and test-requirements.txt, so maybe we could do:

if filename.endswith('requirements.txt'):

This comment has been minimized.

@lukeyeager

lukeyeager Aug 30, 2016

FWIW, I'd like to match *requirements*.txt so that files like requirements_test.txt will work too.

return from_yaml(fp.read(), filename=filename)
if filename.endswith('.txt'):
return from_requirements_txt(filename)
else:

This comment has been minimized.

@msarahan

msarahan Oct 23, 2015

Contributor

Here, rather than a catch-all else statement, replace with the standard for yaml input (I think this is environment.yaml?)

If we don't match standard names, then we need to fall out, or provide a way for people to manually specify what kind of file they are providing. This means modifying this function to accept another argument to specify this.

This comment has been minimized.

@dan-blanchard

dan-blanchard Oct 26, 2015

I feel like checking for specific file names is a little overly strict. If people are passing in file names as arguments part of conda env create -n foo -f FILENAME, it seems strange to restrict what the filenames must be. Why not just assume that .yml or .yaml files should be loaded as such, and .txt files should be treated as pip/setuptools requirements files?

This comment has been minimized.

@msarahan

msarahan Oct 27, 2015

Contributor

This is fine, until someone else defines a new, shiny format for their package management tool that also happens to use yaml. Making it explicit outside of standard filenames is safer over the long haul. I think your idea of "endswith" is a good compromise between flexibility and prevention of future unexpected failure.

This comment has been minimized.

@dan-blanchard

dan-blanchard Oct 27, 2015

The --file argument is making it explicit. The user is explicitly saying, "This is the file that defines my environment." The default value for that argument is environment.yml, but anything else that is assigned to it is obviously what the person wants to create their environment from. If in the future someone comes up with a new package format that use YAML that they would like to be supported by conda env, they're going to need to create pull request to add it anyway, so I don't see how just checking based on the extension right now (and raising an error if the extension isn't in {'txt', 'yml', 'yaml'}) is dangerous in any way.

This comment has been minimized.

@msarahan

msarahan Oct 27, 2015

Contributor

It's not dangerous, but the error messages that someone will encounter in the future will not be helpful. All I am really asking is that you provide a helpful error message if loading fails. Loading should never fail, so long as files follow known standards (environment.yml and requirements.txt). YAML and txt as file extensions happen to currently capture the current range of possibilities. When someone adds something new (and you know they will, and I hope they use YAML, because I think it is technically good), I'd like users to meet a friendly error message along the lines of:

The specified file is not recognized as a conda-env input.  Conda-env currently support conda-env's environment.yaml specification, and pip's requirements.txt specification.  Please file an issue for the new format you're trying to use at https://github.com/conda/conda-env/issues

Rather than something obscure like:
KeyError: ...

@msarahan

This comment has been minimized.

Copy link
Contributor

msarahan commented Oct 23, 2015

Sorry I don't have time to help you troubleshoot Travis right now. I would like to merge this if you can get it past Travis. Please see my other comments. This is a nice addition - thanks for your work on it.

@malev

This comment has been minimized.

Copy link
Contributor

malev commented Oct 27, 2015

@dan-blanchard what's gonna happen if in the same directory you have an environment.yml file and a requirements.txt file?
Also, I think you sould take a look at https://github.com/conda/conda-env/tree/develop/conda_env/specs
You could have a new spec called requirements_txt_spec (maybe)
And this one should return a env like object.
This is how we architecture the different ways to create an environment.
make sense?

@dan-blanchard

This comment has been minimized.

Copy link

dan-blanchard commented Oct 27, 2015

@dan-blanchard what's gonna happen if in the same directory you have an environment.yml file and a requirements.txt file?

The environment.yml file will be used by default (since it's the default value for the --file argument to conda env create), and requirements.txt will be used if you specify it as the file you want to use with --file.

You could have a new spec called requirements_txt_spec (maybe)

I can certainly do that if that's what y'all want me to, but I went with this since it is so similar to what is happening for the pip requirements in environment.yml. I wasn't sure if the extra spec was really necessary, but obviously I'm not intimately familiar with the conda-env code.

@malev

This comment has been minimized.

Copy link
Contributor

malev commented Oct 27, 2015

@dan-blanchard I think going with specs is a better approach because you'll take advantage of the architecture already in place for creating new environments.

@malev malev referenced this pull request Nov 18, 2015

Merged

Requirements spec #203

@@ -80,6 +80,8 @@ def execute(args, parser):
# FIXME conda code currently requires args to have a name or prefix
if args.name is None:
args.name = env.name
else:

This comment has been minimized.

@malev

malev Nov 18, 2015

Contributor

don't fully understand what's going on here

This comment has been minimized.

@dan-blanchard

dan-blanchard Nov 19, 2015

Without this the environment won't get a name, because requirements.txt files cannot contain the name of the environment.

@@ -50,7 +64,7 @@ def from_environment(name, prefix, no_builds=False):


def from_yaml(yamlstr, **kwargs):
"""Load and return a ``Environment`` from a given ``yaml string``"""
"""Load and return an ``Environment`` from a given ``yaml string``"""

This comment has been minimized.

@malev

malev Nov 18, 2015

Contributor

nice catch!

@malev

This comment has been minimized.

Copy link
Contributor

malev commented Nov 18, 2015

I'm starting to feel like this might be outside the scope of the project. What does @msarahan thinks about?

@msarahan

This comment has been minimized.

Copy link
Contributor

msarahan commented Nov 19, 2015

I don't think it's out of scope, necessarily. I see it as a good compatibility layer that may make conda a more compelling platform. The best platform isn't conda or pip, it's both. I don't fully understand the long-term support implications, though. It sort of joins conda to pip, and requires you to keep supporting pip once people start using this feature. What happens if/when pip is replaced down the line? You'll have to support pip likely longer than pip itself is supported. I don't think this issue is a deal breaker. I just think we need to discuss long-term implications, and figure out if any engineering solution can reduce future risk.

@malev

This comment has been minimized.

Copy link
Contributor

malev commented Nov 19, 2015

@msarahan that's my concern.
On the other hand I like it ...
I'm gonna try to put some time on run tests and try out this PR
Thanks!

# properly.
specs = list(chain(*[s.split() if '-e' in s else [s] for s in specs]))
# Directory where pip will store VCS checkouts for editable packages
src_dir = join(config.envs_dirs[0], env.name, 'src')

This comment has been minimized.

@malev

malev Nov 20, 2015

Contributor

looks like config.envx_dirs[0] is the location of the root environment.
how is this gonna work when we want to install into some other environment? (like the one we have just created)

This comment has been minimized.

@dan-blanchard

dan-blanchard Nov 21, 2015

config.envs_dirs[0] should be $CONDA_ROOT/envs, and I'm joining that with the name of the environment and src, so the final path will be $CONDA_ROOT/envs/$CONDA_ENV/src. That's exactly what we want.

@dan-blanchard

This comment has been minimized.

Copy link

dan-blanchard commented Jan 19, 2016

@githubfun, I've been successfully using my branch since September without any issues, so if this is something you need as much as I do, I suggest you just clone my branch and install it in your conda root environment with python setup.py develop.

@dan-blanchard

This comment has been minimized.

Copy link

dan-blanchard commented Feb 8, 2016

Any movement on adding this officially? I was talking to somebody at work about how they can use conda with requirements.txt files now and then I remembered it's only with my fork.

@cowlicks

This comment has been minimized.

Copy link

cowlicks commented Feb 11, 2016

One thing I haven't seen mentioned here, conda install has a --file flag which can take a requirements.txt file. This is only helpful when everything in the requirements.txt is has a conda package. But it is still probably useful in some cases @githubfun @dan-blanchard . Sorry if y'all were already aware of this.

However using conda install --file is usually a pretty bad experience since most of the time there is not a conda package for everything in the requirements.txt. Trying to use conda install --file makes me want this.

At the end of the day what I (and probably others) want is a consistent workflow across projects, whether they use a requirements.txt or environment.yml. Something like

  1. cd project-name
  2. conda env create
  3. source activate project-name

Which is what this PR would let me do.

So yeah +1 to this.

@malev how does #203 address this?

@AlJohri

This comment has been minimized.

Copy link

AlJohri commented Apr 24, 2016

@malev any word on how to get this merged in? what can we do to help?

@dan-blanchard

This comment has been minimized.

Copy link

dan-blanchard commented Jun 2, 2016

This is only helpful when everything in the requirements.txt is has a conda package.

Yeah, the main advantage of the approach here is that this will support editable packages on GitHub and the like, which are currently not supported by conda env.

@mlaprise

This comment has been minimized.

Copy link

mlaprise commented Oct 21, 2016

hey @dan-blanchard, like you probably know, I'm really interested by this. Are you using this approach in production ? As of today, is there any other ways to achieve the same goal ?

@dan-blanchard

This comment has been minimized.

Copy link

dan-blanchard commented Oct 22, 2016

hey @mlaprise, we still aren't using conda in production at Parse.ly, but I use this at least once a week for development. I don't know of any other way to achieve the same goal.

@jni

This comment has been minimized.

Copy link
Contributor

jni commented Jun 2, 2017

FWIW I would very much like to see this addition merged. I have wanted it so many times.

@nicoddemus

This comment has been minimized.

Copy link
Contributor

nicoddemus commented Jun 2, 2017

@jni this repository has been deprecated, if there's still interested in this PR it should be re-opened in conda/conda.

@ccordoba12 @kalefranz perhaps using the GH API to post a comment in all open issues and PRs to let people now that the developement has been moved to conda/conda is a good approach to prevent confusion?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment