Adding new software to install #448

Open
rbiswas4 opened this issue May 5, 2017 · 17 comments

Comments

@rbiswas4
Member

rbiswas4 commented May 5, 2017

@heather999

I was trying to use the lsst sims install you put up on Cori, but wanted to install some other Python packages using conda. Specifically, I tried to install basemap (https://anaconda.org/anaconda/basemap) to view plots of the sky in different projections.

This fails with the message:





    Traceback (most recent call last):
      File "/global/common/cori/contrib/lsst/lsstDM/twinkles_20170208/Linux64/miniconda2/3.19.0.lsst4/lib/python2.7/site-packages/conda/exceptions.py", line 479, in conda_exception_handler
        return_value = func(*args, **kwargs)
      File "/global/common/cori/contrib/lsst/lsstDM/twinkles_20170208/Linux64/miniconda2/3.19.0.lsst4/lib/python2.7/site-packages/conda/cli/main.py", line 145, in _main
        exit_code = args.func(args, p)
      File "/global/common/cori/contrib/lsst/lsstDM/twinkles_20170208/Linux64/miniconda2/3.19.0.lsst4/lib/python2.7/site-packages/conda/cli/main_install.py", line 80, in execute
        install(args, parser, 'install')
      File "/global/common/cori/contrib/lsst/lsstDM/twinkles_20170208/Linux64/miniconda2/3.19.0.lsst4/lib/python2.7/site-packages/conda/cli/install.py", line 420, in install
        raise CondaRuntimeError('RuntimeError: %s' % e)
    CondaRuntimeError: Runtime error: RuntimeError: Runtime error: Could not open u'/global/common/cori/contrib/lsst/lsstDM/twinkles_20170208/Linux64/miniconda2/3.19.0.lsst4/pkgs/conda-env-2.6.0-0.tar.bz2.part' for writing ([Errno 13] Permission denied: u'/global/common/cori/contrib/lsst/lsstDM/twinkles_20170208/Linux64/miniconda2/3.19.0.lsst4/pkgs/conda-env-2.6.0-0.tar.bz2.part').

Should I expect this to happen, and what should I do to get around this problem? Thanks!

@heather999
Contributor

It may indeed be a permissions issue. Let me take a look. For installations we use for production, we probably do not want to generally allow modification of the conda environment for concern about upgrading other packages in the environment. This twinkles_20170208 install is "older" and as far as I know, not used for production.
Is basemap something generally useful that we should consider including in the future?

@heather999
Contributor

For fun I tried to install basemap and, as I feared, other packages would be impacted, such as numpy and scipy. I could try pinning the modules we know are used by DM so they wouldn't be touched. Alternatively, we could try setting up a separate conda environment to try this out.

The following NEW packages will be INSTALLED:

    basemap:      1.0.7-np111py27_0
    cffi:         1.9.1-py27_0
    cryptography: 1.7.1-py27_0
    geos:         3.4.2-0
    idna:         2.2-py27_0
    ipaddress:    1.0.18-py27_0
    mkl:          11.3.3-0
    pyasn1:       0.2.3-py27_0
    pycparser:    2.17-py27_0
    pyopenssl:    16.2.0-py27_0

The following packages will be UPDATED:

    conda:        4.2.13-py27_0            --> 4.3.17-py27_0
    numpy:        1.11.2-py27_nomkl_0      [nomkl] --> 1.11.2-py27_0
    scipy:        0.18.1-np111py27_nomkl_0 [nomkl] --> 0.18.1-np111py27_0

@rbiswas4
Member Author

rbiswas4 commented May 5, 2017

I need it but I would not argue that it is generally useful enough for including in your future installs.

> For installations we use for production, we probably do not want to generally allow modification of the conda environment for concern about upgrading other packages in the environment.

I understand the concern about the production environment. But it would be nice for other DESC members trying to run on NERSC to be able to build on your work in setting up stack installs, adding new packages or modifying numpy versions etc. if necessary. While one could argue that each member trying to run on NERSC could follow your instructions for installing the stack, I wonder if it is possible to achieve this separation by using conda environments, or by cloning your install. That would be so much more efficient.

In the shorter term, could you please point me to your instructions for stack installs on NERSC?

@rbiswas4
Member Author

rbiswas4 commented May 5, 2017

Did not see your update before I posted my reply, but yes I think conda environments would be a great way to go ... but this does not solve the problem of requiring a new numpy version, right?

So, I think the question is

  • (a) If we are not talking about the production environment, can we still change numpy and get the stack install to run? Or does this mean that we need to find a version of the external package which matches the stack numpy/scipy/conda? (I think such packages may well be available.)
  • (b) In the current environment, is it a problem if I want to install a new package which does not upgrade these dependencies?

@heather999
Contributor

You're right, the conda env won't solve the problem on its own... but it could be a playground as we try to figure this out.

In this case, I think the nomkl versions of numpy and scipy are a hard requirement for DM and that's precisely what the basemap installation is balking at. Other than that, the versions are exactly the same. It may be that basemap really wouldn't care if it was numpy nomkl. So I'd be inclined to try basemap with the versions of the packages that DM prefers.
I've done a little test trying to pin numpy and scipy to the nomkl versions, but I don't think I've got it working quite yet. I'll try again.

To answer (b), I believe there's no problem with installing a new package that leaves the rest of the environment untouched. It just seems there are very few such packages! I think anything that depends on numpy or scipy defaults to the mkl version of those packages, while we really want the nomkl versions. When you run "conda install <package>", conda reports the changes it will perform and asks for confirmation, so it's easy enough to take a peek before proceeding.
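The "take a peek" step can also be scripted. The following is a minimal sketch, assuming the plan text has the same layout as the conda output quoted earlier in this thread (abridged here); the helper names `changed_packages` and `loses_nomkl` are made up for illustration and are not part of conda.

```python
import re

# Abridged copy of the plan conda printed earlier in this thread.
PLAN = """\
The following NEW packages will be INSTALLED:

    basemap:      1.0.7-np111py27_0
    geos:         3.4.2-0
    mkl:          11.3.3-0

The following packages will be UPDATED:

    conda:        4.2.13-py27_0            --> 4.3.17-py27_0
    numpy:        1.11.2-py27_nomkl_0      [nomkl] --> 1.11.2-py27_0
    scipy:        0.18.1-np111py27_nomkl_0 [nomkl] --> 0.18.1-np111py27_0
"""

# Matches lines such as "numpy: 1.11.2-py27_nomkl_0 [nomkl] --> 1.11.2-py27_0".
CHANGE = re.compile(r"^\s*([\w.-]+):\s+(\S+)(?:\s+\[[^\]]*\])?\s+-->\s+(\S+)")


def changed_packages(plan_text):
    """Return {name: (old, new)} for every 'old --> new' line in the plan."""
    changes = {}
    for line in plan_text.splitlines():
        match = CHANGE.match(line)
        if match:
            name, old, new = match.groups()
            changes[name] = (old, new)
    return changes


def loses_nomkl(changes):
    """Names of packages whose build string would drop the 'nomkl' tag."""
    return sorted(name for name, (old, new) in changes.items()
                  if "nomkl" in old and "nomkl" not in new)


changes = changed_packages(PLAN)
print(loses_nomkl(changes))  # ['numpy', 'scipy']
```

Newly installed packages have no "-->" arrow, so they are ignored; only existing packages whose build would change are reported.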

@rbiswas4
Member Author

rbiswas4 commented May 5, 2017

Ah I see, I did not know about pinning packages. This sounds good! Thanks for trying this!

@heather999
Contributor

Still struggling with pinning in this case. I opened an issue and am hoping an expert will respond:
ContinuumIO/anaconda-issues#1684

@heather999
Contributor

Haven't forgotten about this. I have a workaround that avoids introducing the mkl versions of numpy and scipy: we have to include nomkl in the conda install step:
conda install basemap nomkl
Now it leaves numpy and scipy alone. I notice, however, that there is another package, numexpr, which DM did not initially set up to use nomkl; when I now attempt to install basemap and nomkl, numexpr would be updated to a nomkl version. This seems like an oversight on the part of the DM installation, so I put in an inquiry on community:
https://community.lsst.org/t/why-is-the-numexpr-python-module-not-nomkl-like-numpy-scipy/1853
No response yet.
What I'm learning though, is that it's really difficult to maintain the set of packages that DM initially installs if we want to introduce additional python packages. It might be fine, but it would be helpful to have clear guidelines of what packages should be left as is and which are ok to tweak - particularly as it pertains to the nomkl variants.

@rbiswas4
Member Author

rbiswas4 commented May 9, 2017

@heather999 Thanks for pushing this along!

Do I understand correctly that when the _nomkl_ part is absent, it is an mkl version?

It might be good to draw the attention of @danielsf and @rhiannonlynne here. I think numexpr is one of the packages that are treated as dependencies for either pandas or oorb in sims and is not essential for DM.

@heather999
Contributor

Yes, @rbiswas4, that is my understanding: if nomkl is not indicated, then it is the mkl version, and that has been the case since last year:
https://www.continuum.io/blog/developer-blog/anaconda-25-release-now-mkl-optimizations
This seems to affect numpy, scipy, and numexpr. It very well could be irrelevant - so I'm happy to get some input from the sims folks - thanks!

@rhiannonlynne

I think you're right that numexpr just came from pandas. Actually I'm surprised that conda let you use a non-nomkl version, so I think it's safe to go back to nomkl. The version of numpy can be important, so I'd install nomkl and pin numpy to 1.11 for now; I think that would get you the right versions of everything else.
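For reference, conda documents a per-environment pin file at conda-meta/pinned, with one "name version-spec" line per package. The sketch below writes such a file; the `write_pins` helper is hypothetical, the scipy pin is an illustrative assumption based on the 0.18.1 version shown earlier (only the numpy pin was suggested above), and whether the conda version used in this thread honors the file reliably is not something this sketch verifies.

```python
import os
import tempfile


def write_pins(env_prefix, pins):
    """Write one 'name version-spec' line per pin to <env_prefix>/conda-meta/pinned."""
    meta_dir = os.path.join(env_prefix, "conda-meta")
    if not os.path.isdir(meta_dir):
        os.makedirs(meta_dir)
    pinned_path = os.path.join(meta_dir, "pinned")
    with open(pinned_path, "w") as handle:
        for name, spec in pins:
            handle.write("{} {}\n".format(name, spec))
    return pinned_path


# Demonstrate against a throwaway directory rather than a real environment.
demo_prefix = tempfile.mkdtemp()
path = write_pins(demo_prefix, [("numpy", "1.11.*"), ("scipy", "0.18.*")])
with open(path) as handle:
    print(handle.read())
```

Pointing `env_prefix` at a real environment would overwrite any existing pin file, so it is worth checking for one first.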

@heather999
Contributor

heather999 commented May 10, 2017

Just wanted to give you an update, @rbiswas4: I have cloned the conda env and made you your own environment. I have played with pinning, but have run into trouble. Fortunately, it seems I'm not the only one, so I opened a new post on community:
https://community.lsst.org/t/pinning-conda-packages-installed-with-dm-lsst-apps-and-lsst-sims/1859
which seems related to this JIRA:
https://jira.lsstcorp.org/browse/DM-9074
I'm hoping to learn how feasible it is to use the conda pinning mechanism. It sounds like unless DM has created their own channel which cloned the set of packages and versions from the conda channels at the time DM set their package versions, I may not be able to reliably pin our packages. At the very least, I need to figure out how to make this work :) Possibly it would be enough to pin using a looser definition of the package versions rather than requiring an exact match of all versions.

@heather999
Contributor

Learned a little more about pinning, and I got that to work. However, I continued to run into dependency problems when trying to add the basemap package.
Also relearned that you really cannot do a conda install in a conda environment other than root. Rather you need to install everything in a conda env up front either at the command line, or by providing a specification file. One could then use pip install to add more packages but again we run into problems with freezing the python packages DM depends on to avoid upgrades. It is not straightforward to freeze the DM python packages and then add more packages.

So, this is what I've done:
Dump the current set of installed python packages:
conda list --explicit > specfile.txt
Edit the specfile to add the basemap package I want to install. In this case I ran
conda info basemap
to find a build that is compatible with the version of numpy we have.
Then create a new environment which clones the DM python environment and adds basemap:
conda create --name rb --file specfile.txt
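The spec-file edit can be sketched in Python as follows. This is an illustration only: the `add_package_urls` helper is hypothetical, the header lines approximate what `conda list --explicit` writes, and the `example.invalid` URLs are placeholders (the build strings come from the conda output earlier in this thread). Note that conda's documentation says dependencies are not checked when installing from an explicit spec file, so anything basemap needs, such as geos, has to be listed as well.

```python
def add_package_urls(spec_lines, urls):
    """Return a copy of an explicit spec file's lines with new package URLs appended."""
    existing = {line.strip() for line in spec_lines}
    result = [line.rstrip("\n") for line in spec_lines]
    for url in urls:
        if url not in existing:  # skip packages that are already listed
            result.append(url)
            existing.add(url)
    return result


# Approximate shape of `conda list --explicit` output; URLs are placeholders.
spec = [
    "# This file may be used to create an environment using:",
    "# $ conda create --name <env> --file <this file>",
    "@EXPLICIT",
    "https://example.invalid/linux-64/numpy-1.11.2-py27_nomkl_0.tar.bz2",
]
new_urls = [
    "https://example.invalid/linux-64/geos-3.4.2-0.tar.bz2",
    "https://example.invalid/linux-64/basemap-1.0.7-np111py27_0.tar.bz2",
]
for line in add_package_urls(spec, new_urls):
    print(line)
```

In practice the lines would be read from and written back to specfile.txt before running the conda create step.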

@rbiswas4 I haven't used basemap, nor do I know how to test it out. I'm curious to see if this worked as we might expect. To test it, you can log on to Cori, set up twinkles_20170208, and then run source activate rb, which should point your PATH at the conda environment with DM and basemap.

I think in the future, perhaps our best bet is to gather a list of desired python packages in addition to what DM installs by default. We would plan to install those at the time of any new DM installations. We can grow that list as needed, but care will be required to introduce them.

@rbiswas4
Member Author

rbiswas4 commented May 11, 2017

@heather999 Thanks!

I probably won't get a chance to test this out till the weekend. basemap was the example I ran into, but my worry is not exactly about basemap but about any package that has other dependencies. If basemap happens to have something particularly singular about it, we could replace it with a better-behaved package for the purposes of testing.

If you are looking for a piece of code which tests basemap:

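    import matplotlib.pyplot as plt
    from mpl_toolkits.basemap import Basemap

    fig, ax = plt.subplots()
    # All-sky Mollweide projection using the astronomical longitude convention.
    m = Basemap(#llcrnrlat=-32., llcrnrlon=48.,
                #urcrnrlat=-22., urcrnrlon=58,
                projection='moll', lon_0=0., lat_0=0., ax=ax, celestial=True)
    # Convert (lon, lat) in degrees to map projection coordinates.
    x, y = m([0, 20, 40], [0, -20, 20])
    m.scatter(x, y, marker='.', color='b')
    # Success criterion: test.png is written without errors.
    fig.savefig('test.png')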

I think we need to check two things:
(a) We should be able to import both basemap and the stack functions.
(b) There might be a question about time and size: how large is the new environment, and how long does the install take?
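Check (a) can be sketched as a small import test run inside the activated environment. The `check_imports` helper is hypothetical, and `lsst.sims.utils` is only an assumed example of a stack module; substitute whichever stack packages matter to you.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; return {name: None or an error message}."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # report any import-time failure, not only ImportError
            results[name] = "{}: {}".format(type(exc).__name__, exc)
    return results


# "lsst.sims.utils" is an assumed example of a stack module.
modules = ["mpl_toolkits.basemap", "lsst.sims.utils", "numpy"]
for name, error in sorted(check_imports(modules).items()):
    print("{:<24} {}".format(name, "ok" if error is None else error))
```

Any module that fails to import is listed with its error instead of stopping the run, which makes it easier to see whether basemap or the stack is the one missing.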

Thanks!

@heather999
Contributor

Unfortunately I think any package our group is likely to be interested in will have dependencies on packages like numpy, scipy, etc - so basemap might be a very good example of what we would typically run into.

That said, while my initial run of the test failed, I now have it working. I had to re-install the conda env. It seems conda does not approve of my attempt to remove and then reintroduce a conda env with the same name. So instead of rb, the env is now named ame; that was conda's doing. The test.png file was generated and I could open it via xdg-open :)

@rbiswas4
Member Author

@heather999 I tested this out and this works! A slight problem I encountered was this:

Fontconfig error: Cannot load default config file

but it did not stop me from doing what I needed to do.

Now that we have a working solution, my question is what the best way is to do this on, say, the latest stack (preferably without increasing anybody's work):

  • If we provide a script that sets up the latest stack (which you have already set up) and runs source activate ame as part of the script, does that ensure that no one can modify the production software while everyone still gets the latest install?

Thanks!

@heather999
Contributor

Hi @rbiswas4 Yes, I saw that same Fontconfig error. I found this reference http://www.stata.com/support/faqs/unix/fontconfig-error/, this may be something to ask the NERSC helpdesk about. I'm assuming it's rather innocuous, but hopefully we can fix that.

Concerning how to move ahead and use this - yes, we could have a setup script that points to this slightly updated python environment. That would avoid any possibility of inadvertently updating the production environment.
If someone wants to play around and add more packages - I'd recommend they clone the environment and have at it. Then put in a request to have it added to the official user environment, where we work out the details of avoiding any upgrades to DM-used python packages.

Is that flexible enough?
