pythonPackages.mlflow: init at 1.4.0 #74091

Open: tbenst wants to merge 5 commits into master from tbenst:mlflow


tbenst commented Nov 25, 2019

Motivation for this change

Add mlflow, an open source platform for the machine learning lifecycle. Note that this package is only partially functional on NixOS and is not intended to support features requiring conda.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nix-review --run "nix-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

tbenst commented Nov 25, 2019

Here's one issue I'm stuck on:

> nix-shell -I nixpkgs=. -p 'python3.buildEnv.override { extraLibs = [ python3Packages.mlflow ]; }'
$ mlflow server --host 0.0.0.0
Traceback (most recent call last):
  File "/nix/store/vl5qa9893ckq3pic1lnm3z1wgvzni989-python3.7-gunicorn-20.0.2/bin/.gunicorn-wrapped", line 6, in <module>
    from gunicorn.app.wsgiapp import run
  File "/nix/store/vl5qa9893ckq3pic1lnm3z1wgvzni989-python3.7-gunicorn-20.0.2/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 9, in <module>
    from gunicorn.app.base import Application
  File "/nix/store/vl5qa9893ckq3pic1lnm3z1wgvzni989-python3.7-gunicorn-20.0.2/lib/python3.7/site-packages/gunicorn/app/base.py", line 10, in <module>
    from gunicorn import util
  File "/nix/store/vl5qa9893ckq3pic1lnm3z1wgvzni989-python3.7-gunicorn-20.0.2/lib/python3.7/site-packages/gunicorn/util.py", line 26, in <module>
    import pkg_resources
ModuleNotFoundError: No module named 'pkg_resources'
Running the mlflow server failed. Please see the logs above for details.

This appears to be benoitc/gunicorn#1716, so I tried adding setuptools to propagatedBuildInputs and setuptools_scm to buildInputs. I tried adding these to both mlflow and gunicorn, but there was no change.

Also possibly related to #68314. Anyone know how to fix this?


tbenst commented Nov 25, 2019

I originally was going to post this in mlflow/mlflow, but after tracing the error it appears to be a NixOS specific error that is caused by a bad PATH...not sure what's going on...

tl;dr why are python3.7 paths being appended to my PATH for some packages instead of the relevant python3.7.5 paths?

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): NixOS
  • MLflow installed from (source or binary): source
  • MLflow version (run mlflow --version): 1.4.0
  • Python version: 3.7.5
  • Exact command to reproduce: mlflow server

Describe the problem

Provide the exact sequence of commands / steps that you executed before running into the problem.
I am trying to package mlflow for NixOS. I get the following error:

$ mlflow server
Traceback (most recent call last):
  File "/nix/store/vl5qa9893ckq3pic1lnm3z1wgvzni989-python3.7-gunicorn-20.0.2/bin/.gunicorn-wrapped", line 6, in <module>
    from gunicorn.app.wsgiapp import run
  File "/nix/store/vl5qa9893ckq3pic1lnm3z1wgvzni989-python3.7-gunicorn-20.0.2/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 9, in <module>
    from gunicorn.app.base import Application
  File "/nix/store/vl5qa9893ckq3pic1lnm3z1wgvzni989-python3.7-gunicorn-20.0.2/lib/python3.7/site-packages/gunicorn/app/base.py", line 10, in <module>
    from gunicorn import util
  File "/nix/store/vl5qa9893ckq3pic1lnm3z1wgvzni989-python3.7-gunicorn-20.0.2/lib/python3.7/site-packages/gunicorn/util.py", line 26, in <module>
    import pkg_resources
ModuleNotFoundError: No module named 'pkg_resources'
Running the mlflow server failed. Please see the logs above for details.

At first I thought this was benoitc/gunicorn#1716, so I added setuptools as a dependency, but the issue remained.
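
A quick way to check whether a given interpreter can see pkg_resources at all is sketched below. `has_module` is a hypothetical helper name used only for illustration; it is not part of mlflow or gunicorn.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module `name` can be located on sys.path."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # gunicorn/util.py imports pkg_resources in the traceback above.
    print("pkg_resources importable:", has_module("pkg_resources"))
```

Running this with the interpreter that the failing gunicorn wrapper uses, rather than the one from the interactive shell, is what makes the result meaningful.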

Other info / logs

Since there's no stacktrace back to mlflow, I next tried following the code base by eye to see how the error occurs. mlflow server is handled by https://github.com/mlflow/mlflow/blob/3fc02ff20938ac5f6eb5681fac9cb693f55e1a19/mlflow/cli.py#L237
and calls https://github.com/mlflow/mlflow/blob/3fc02ff20938ac5f6eb5681fac9cb693f55e1a19/mlflow/server/__init__.py#L62
This constructs full_command of ['gunicorn', '-b', '127.0.0.1:5000', '-w', '4', 'mlflow.server:app'] and calls https://github.com/mlflow/mlflow/blob/3fc02ff20938ac5f6eb5681fac9cb693f55e1a19/mlflow/utils/process.py#L9
cmd_env equals the following: https://gist.githubusercontent.com/tbenst/0dbeecd11a5d91b57577a1f3919110f1/raw/d0f668530c0ec57c2d06e84447d34a192b093bdf/cmd_env. The error appears to come from https://github.com/mlflow/mlflow/blob/3fc02ff20938ac5f6eb5681fac9cb693f55e1a19/mlflow/utils/process.py#L35
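
The call chain above boils down to the pattern sketched here: copy the parent's environment, optionally overlay extra variables, and hand the result to subprocess.Popen. This is a simplified stand-in written for illustration, not mlflow's actual process.py; the name `run_with_env` and its parameters are assumptions.

```python
import os
import subprocess
import sys


def run_with_env(cmd, extra_env=None, cwd=None):
    """Run `cmd` with a copy of os.environ overlaid with `extra_env`.

    Returns (exit_code, stdout_text).
    """
    cmd_env = os.environ.copy()
    if extra_env:
        cmd_env.update(extra_env)
    child = subprocess.Popen(
        cmd,
        env=cmd_env,
        cwd=cwd,
        universal_newlines=True,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    stdout, _ = child.communicate()
    return child.returncode, stdout


if __name__ == "__main__":
    # The child sees whatever PATH the parent passes in cmd_env.
    code, out = run_with_env(
        [sys.executable, "-c", "import os; print(os.environ.get('PATH'))"]
    )
    print(code, out.strip())
```

As far as I know, on POSIX CPython looks up a bare `cmd[0]` such as 'gunicorn' using the PATH from the `env` mapping when one is given, so the contents of cmd_env decide which gunicorn gets executed.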

I'd like to note that

$ gunicorn -b 127.0.0.1:5000 -w 4 mlflow.server:app 
[2019-11-25 10:57:05 -0800] [16341] [INFO] Starting gunicorn 19.9.0
[2019-11-25 10:57:05 -0800] [16341] [INFO] Listening at: http://127.0.0.1:5000 (16341)
[2019-11-25 10:57:05 -0800] [16341] [INFO] Using worker: sync
[2019-11-25 10:57:05 -0800] [16368] [INFO] Booting worker with pid: 16368
[2019-11-25 10:57:05 -0800] [16378] [INFO] Booting worker with pid: 16378
[2019-11-25 10:57:05 -0800] [16388] [INFO] Booting worker with pid: 16388
[2019-11-25 10:57:05 -0800] [16432] [INFO] Booting worker with pid: 16432

succeeds. Additionally, in a python shell, I can do the following without any issue:

>>> import os
>>> cmd_env = os.environ.copy()
>>> cwd = None
>>> import subprocess
>>> cmd = ['gunicorn', '-b', '127.0.0.1:5000', '-w', '4', 'mlflow.server:app']
>>> cwd = None
>>> child = subprocess.Popen(cmd, env=cmd_env, cwd=cwd, universal_newlines=True,
...                                  stdin=subprocess.PIPE)
>>> [2019-11-25 11:00:43 -0800] [2241] [INFO] Starting gunicorn 19.9.0
[2019-11-25 11:00:43 -0800] [2241] [INFO] Listening at: http://127.0.0.1:5000 (2241)
[2019-11-25 11:00:43 -0800] [2241] [INFO] Using worker: sync
[2019-11-25 11:00:43 -0800] [2297] [INFO] Booting worker with pid: 2297
[2019-11-25 11:00:43 -0800] [2304] [INFO] Booting worker with pid: 2304
[2019-11-25 11:00:43 -0800] [2309] [INFO] Booting worker with pid: 2309
[2019-11-25 11:00:43 -0800] [2321] [INFO] Booting worker with pid: 2321

However, if I copy mlflow_cmd_env (see gist), then I recreate the error:

>>> mlflow_cmd_env = { ....very long.... }
>>> child = subprocess.Popen(cmd, env=mlflow_cmd_env,  cwd=cwd, universal_newlines=True, stdin=subprocess.PIPE)
>>> Traceback (most recent call last):
  File "/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/bin/.gunicorn-wrapped", line 6, in <module>
    from gunicorn.app.wsgiapp import run
  File "/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 9, in <module>
    from gunicorn.app.base import Application
  File "/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/app/base.py", line 12, in <module>
    from gunicorn import util
  File "/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/util.py", line 12, in <module>
    import pkg_resources
ModuleNotFoundError: No module named 'pkg_resources'

Next, look at the differences:

>>> for k,v in cmd_env.items():
...   if mlflow_cmd_env[k]!=v:
...     print(k)
... 
HOST_PATH
out
buildInputs
buildCommandPath
NIX_CFLAGS_COMPILE
NIX_LDFLAGS
PATH
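
Note that the loop above only walks the keys of cmd_env and would raise KeyError if mlflow_cmd_env lacked one of them. A slightly more defensive comparison might look like this sketch (`diff_env` is a hypothetical helper name):

```python
def diff_env(a, b):
    """Return sorted keys whose values differ, or that exist in only one mapping."""
    missing = object()  # sentinel so an absent key differs from an empty value
    keys = set(a) | set(b)
    return sorted(k for k in keys if a.get(k, missing) != b.get(k, missing))


if __name__ == "__main__":
    shell_env = {"PATH": "/a/bin", "HOME": "/home/me", "ONLY_IN_SHELL": "1"}
    wrapped_env = {"PATH": "/b/bin", "HOME": "/home/me", "ONLY_IN_WRAPPER": "2"}
    print(diff_env(shell_env, wrapped_env))
```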

Now, let's figure out which is responsible:

>>> new_cmd['HOST_PATH'] = mlflow_cmd_env['HOST_PATH']
>>> child = subprocess.Popen(cmd, env=new_cmd,  cwd=cwd, universal_newlines=True, stdin=subprocess.PIPE)
>>> [2019-11-25 11:35:39 -0800] [13682] [INFO] Starting gunicorn 19.9.0
[2019-11-25 11:35:39 -0800] [13682] [INFO] Listening at: http://127.0.0.1:5000 (13682)
[2019-11-25 11:35:39 -0800] [13682] [INFO] Using worker: sync
[2019-11-25 11:35:39 -0800] [13702] [INFO] Booting worker with pid: 13702
[2019-11-25 11:35:39 -0800] [13710] [INFO] Booting worker with pid: 13710
[2019-11-25 11:35:39 -0800] [13714] [INFO] Booting worker with pid: 13714
[2019-11-25 11:35:39 -0800] [13717] [INFO] Booting worker with pid: 13717

KeyboardInterrupt
>>> [2019-11-25 11:35:43 -0800] [13682] [INFO] Handling signal: int
[2019-11-25 11:35:43 -0800] [13714] [INFO] Worker exiting (pid: 13714)
[2019-11-25 11:35:43 -0800] [13702] [INFO] Worker exiting (pid: 13702)
[2019-11-25 11:35:43 -0800] [13717] [INFO] Worker exiting (pid: 13717)
[2019-11-25 11:35:43 -0800] [13710] [INFO] Worker exiting (pid: 13710)
[2019-11-25 11:35:43 -0800] [13682] [INFO] Shutting down: Master

KeyboardInterrupt
>>> new_cmd['PATH'] = mlflow_cmd_env['PATH']
>>> child = subprocess.Popen(cmd, env=new_cmd,  cwd=cwd, universal_newlines=True, stdin=subprocess.PIPE)
>>> Traceback (most recent call last):
  File "/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/bin/.gunicorn-wrapped", line 6, in <module>
    from gunicorn.app.wsgiapp import run
  File "/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 9, in <module>
    from gunicorn.app.base import Application
  File "/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/app/base.py", line 12, in <module>
    from gunicorn import util
  File "/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/util.py", line 12, in <module>
    import pkg_resources
ModuleNotFoundError: No module named 'pkg_resources'

So the issue is that PATH changes.
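
The one-variable-at-a-time check done by hand above can be automated roughly as follows. This is only a sketch: `find_breaking_vars` is a hypothetical name, and it assumes `cmd` is short-lived (for example `gunicorn --version` rather than a running server).

```python
import os
import subprocess
import sys


def find_breaking_vars(cmd, good_env, bad_env):
    """Apply each differing variable from bad_env to good_env, one at a time.

    Returns the names whose override alone makes `cmd` fail.
    """
    culprits = []
    for key in sorted(bad_env):
        if good_env.get(key) == bad_env[key]:
            continue
        trial = dict(good_env)
        trial[key] = bad_env[key]
        try:
            result = subprocess.run(
                cmd,
                env=trial,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            failed = result.returncode != 0
        except FileNotFoundError:
            # The override (e.g. PATH) made the executable itself unfindable.
            failed = True
        if failed:
            culprits.append(key)
    return culprits


if __name__ == "__main__":
    demo_cmd = [
        sys.executable,
        "-c",
        "import os, sys; sys.exit(1 if os.environ.get('FLAG') == 'bad' else 0)",
    ]
    good = dict(os.environ, FLAG="ok")
    bad = dict(good, FLAG="bad")
    print(find_breaking_vars(demo_cmd, good, bad))
```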

>>> for p in pyshell:
...   if p in mlflow:
...     continue
...   else:
...     print(p)
... 
/nix/store/9d4yqmgbc2a2bmh51h4bw4lbj223j4mz-python3-3.7.5-env/bin
>>> for p in mlflow:
...   if p in pyshell:
...     continue
...   else:
...     print(p)
... 
/nix/store/gpnm7i19lpj8p43mjrdw03d0hjalmskl-python3-3.7.5/bin
/nix/store/vip1apgf32s3ash2gmayjvsn6xw4slwi-python3.7-mlflow-1.4.0/bin
/nix/store/q9m23mv47gmaass0z259dbgg8xr862f9-python3.7-alembic-1.2.1/bin
/nix/store/jhcx4gb0f4l69xbc9kg16n0kzqqim98f-python3.7-Mako-1.1.0/bin
/nix/store/5088myssxnqwx0v0zi077cig4mxcdlz2-python3.7-setuptools-41.4.0/bin
/nix/store/a71ljrc2mi8f2hh28jw0gg7xbqn4zn3b-python3.7-chardet-3.0.4/bin
/nix/store/3d7qbv3gwaj6gjkahall8c50s0bq2csh-python3.7-Flask-1.1.1/bin
/nix/store/x4xgjqhpsgck4qqk9h0gyja4z2z1jm50-python3.7-numpy-1.17.3/bin
/nix/store/mnhf2sayqn6xhy3dis0k84d3jvw477lh-python3.7-xlrd-1.2.0/bin
/nix/store/bq1vk78w92n9kk3ycgrhdx7r8rma50zf-python3.7-tables-3.6.1/bin
/nix/store/397kn759lcxm2x379hh5nbyg081bcf4h-python3.7-pbr-5.4.3/bin
/nix/store/5q3905k786hg0k8q58fxzh4h3gn9v27a-python3.7-python-gflags-3.1.2/bin
/nix/store/vljgxs2rf8zczzrl9dr7v44r2vm3zc5d-python3.7-websocket_client-0.56.0/bin
/nix/store/acp1rmqsfkph4rs6llrdiabgxaicg0fi-python3.7-databricks-cli-0.9.1/bin
/nix/store/nm0phx7v93dxr23yy6ayclx96d1y6r0m-python3.7-tabulate-0.8.5/bin
/nix/store/fw3b63z1s3qz0d6hkbd5wqqng4c7d3ni-python3.7-sqlparse-0.3.0/bin
/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/bin
/nix/store/lkwx29nvav1gm17licakszhgyzrir0vl-python3-3.7.5-env/bin

Huh, that's weird, I'm in python3.7.5, why are all these python3.7 paths here? Sure enough, if I try

Traceback (most recent call last):
  File "/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/bin/.gunicorn-wrapped", line 6, in <module>
    from gunicorn.app.wsgiapp import run
  File "/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 9, in <module>
    from gunicorn.app.base import Application
  File "/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/app/base.py", line 12, in <module>
    from gunicorn import util
  File "/nix/store/mpvq0adhzjsm3nznya786mcv1198zjm8-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/util.py", line 12, in <module>
    import pkg_resources
ModuleNotFoundError: No module named 'pkg_resources'

Whereas in shell:

$ which gunicorn
/nix/store/9d4yqmgbc2a2bmh51h4bw4lbj223j4mz-python3-3.7.5-env/bin/gunicorn
$ /nix/store/9d4yqmgbc2a2bmh51h4bw4lbj223j4mz-python3-3.7.5-env/bin/gunicorn
usage: gunicorn [OPTIONS] [APP_MODULE]
gunicorn: error: No application module specified.
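
To see which gunicorn a particular PATH value would pick, without launching anything, something like the following sketch can help. `resolve_in` is a hypothetical helper; it just forwards to the standard library's shutil.which with an explicit search path.

```python
import shutil


def resolve_in(path_value, program):
    """Return the file `program` resolves to under the PATH string `path_value`, or None."""
    return shutil.which(program, path=path_value)


if __name__ == "__main__":
    import os

    # Compare the interactive PATH with any other PATH string of interest.
    print(resolve_in(os.environ.get("PATH", ""), "gunicorn"))
```

Calling it once with the nix-shell PATH and once with the PATH from mlflow_cmd_env should show whether the env wrapper or the bare gunicorn store path wins.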

jonringer commented Nov 25, 2019

$ mlflow server --host 0.0.0.0
Traceback (most recent call last):
  File "/nix/store/vl5qa9893ckq3pic1lnm3z1wgvzni989-python3.7-gunicorn-20.0.2/bin/.gunicorn-wrapped", line 6, in <module>
    from gunicorn.app.wsgiapp import run
  File "/nix/store/vl5qa9893ckq3pic1lnm3z1wgvzni989-python3.7-gunicorn-20.0.2/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 9, in <module>
    from gunicorn.app.base import Application
  File "/nix/store/vl5qa9893ckq3pic1lnm3z1wgvzni989-python3.7-gunicorn-20.0.2/lib/python3.7/site-packages/gunicorn/app/base.py", line 10, in <module>
    from gunicorn import util
  File "/nix/store/vl5qa9893ckq3pic1lnm3z1wgvzni989-python3.7-gunicorn-20.0.2/lib/python3.7/site-packages/gunicorn/util.py", line 26, in <module>
    import pkg_resources
ModuleNotFoundError: No module named 'pkg_resources'
Running the mlflow server failed. Please see the logs above for details.

this is failing because gunicorn needs setuptools, not your package

@jonringer jonringer mentioned this pull request Nov 25, 2019

tbenst commented Nov 25, 2019

@jonringer doh! I tried the same fix earlier but wrote propogatedBuildInputs facepalm.

Now I get a new error:

[nix-shell:~/code/nixpkgs]$ mlflow server
[2019-11-25 13:48:52 -0800] [19192] [INFO] Starting gunicorn 19.9.0
[2019-11-25 13:48:52 -0800] [19192] [INFO] Listening at: http://127.0.0.1:5000 (19192)
[2019-11-25 13:48:52 -0800] [19192] [INFO] Using worker: sync
[2019-11-25 13:48:52 -0800] [19195] [INFO] Booting worker with pid: 19195
[2019-11-25 13:48:52 -0800] [19195] [ERROR] Exception in worker process
Traceback (most recent call last):
  File "/nix/store/057kqmjj9zixqlsgzzmbjvmh3wwinb0l-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
    worker.init_process()
  File "/nix/store/057kqmjj9zixqlsgzzmbjvmh3wwinb0l-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/workers/base.py", line 129, in init_process
    self.load_wsgi()
  File "/nix/store/057kqmjj9zixqlsgzzmbjvmh3wwinb0l-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/workers/base.py", line 138, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/nix/store/057kqmjj9zixqlsgzzmbjvmh3wwinb0l-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/nix/store/057kqmjj9zixqlsgzzmbjvmh3wwinb0l-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 52, in load
    return self.load_wsgiapp()
  File "/nix/store/057kqmjj9zixqlsgzzmbjvmh3wwinb0l-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/nix/store/057kqmjj9zixqlsgzzmbjvmh3wwinb0l-python3.7-gunicorn-19.9.0/lib/python3.7/site-packages/gunicorn/util.py", line 350, in import_app
    __import__(module)
ModuleNotFoundError: No module named 'mlflow'
[2019-11-25 13:48:52 -0800] [19195] [INFO] Worker exiting (pid: 19195)
[2019-11-25 13:48:52 -0800] [19192] [INFO] Shutting down: Master
[2019-11-25 13:48:52 -0800] [19192] [INFO] Reason: Worker failed to boot.
Running the mlflow server failed. Please see the logs above for details.

And yet this works:

[nix-shell:~/code/nixpkgs]$ python 
Python 3.7.5 (default, Oct 14 2019, 23:08:55) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mlflow
>>> 

[nix-shell:~/code/nixpkgs]$ gunicorn -b 127.0.0.1:5000 -w 4 mlflow.server:app
[2019-11-25 14:03:31 -0800] [10056] [INFO] Starting gunicorn 19.9.0
[2019-11-25 14:03:31 -0800] [10056] [INFO] Listening at: http://127.0.0.1:5000 (10056)
[2019-11-25 14:03:31 -0800] [10056] [INFO] Using worker: sync
[2019-11-25 14:03:31 -0800] [10059] [INFO] Booting worker with pid: 10059
[2019-11-25 14:03:31 -0800] [10060] [INFO] Booting worker with pid: 10060
[2019-11-25 14:03:31 -0800] [10061] [INFO] Booting worker with pid: 10061
[2019-11-25 14:03:31 -0800] [10062] [INFO] Booting worker with pid: 10062
@tbenst tbenst force-pushed the tbenst:mlflow branch from 5a8b007 to ef67a38 Nov 25, 2019

jonringer commented Nov 25, 2019

the only thing i can think of, is that the new worker process isn't getting the same PYTHONPATH


jonringer commented Nov 25, 2019

If mlflow is meant to be an "application" rather than a package, you could move it out of python-modules and use python.withPackages(...) to create an interpreter with the dependencies the server needs. Otherwise, I'm not sure of a good way to give the worker processes access to those packages.


tbenst commented Nov 25, 2019

@jonringer I need to use mlflow both as a program (mlflow server) and as a library imported from another program. The problem seems to be that when I run which python in the nix-shell, I get /nix/store/8hmc3nfyqcbcc35khnpf54z7b8h0qrzi-python3-3.7.5-env/bin/python, but when I call which python from subprocess.Popen I get /nix/store/gpnm7i19lpj8p43mjrdw03d0hjalmskl-python3-3.7.5/bin/python, which is the system python and does not have mlflow installed.


tbenst commented Dec 3, 2019

Had a useful conversation with @jtojnar on irc, so figured I'd copy here if only to refer back later:

<tbenst> anyone have experience with subprocesses and PATH? having an issue 
         where PATH is changing in python when using subprocess
<tbenst> https://github.com/NixOS/nixpkgs/pull/74091#issuecomment-558367416
<tbenst> I tried adding `shell=True` to the `Popen` call but no dice
<tbenst> I need the `Popen` call to use python where it can import `mlflow`
<jtojnar> tbenst the python from shebang is not part of PATH
<jtojnar> so it will not be available outside of nix-shell
<tbenst> jtojnar, good to know. However I get the same problem with
         `nix-build -I nixpkgs=. -A python3Packages.mlflow && result/bin/mlflow server`
<jtojnar> tbenst if you want to use that, you either need to wrap the script and
          set PATH in the wrapper, or use the python through absolute path
<jtojnar> maybe through something like sys.executable
<tbenst> jtojnar, the script has the correct PATH. it's just this subprocess call to gunicorn. If I call gunicorn from nix-shell all is good, but as soon as I call it through subprocess it no longer has correct PATH
<tbenst> interesting I'll take a look at sys.executable
<jtojnar> tbenst well, then it does not have the correct PATH
<jtojnar> tbenst sys.executable will not work if this comment is correct
          https://github.com/NixOS/nixpkgs/blob/d2b71c643a2ef49b57497987f847d4c366604c6f/pkgs/development/tools/pipenv/default.nix#L31-L35
<jtojnar> will need the direct path to interpreter
<tbenst> jtojnar, very cool, giving that a try, ty
<jtojnar> tbenst looking at https://github.com/NixOS/nixpkgs/pull/74091#issuecomment-558356275,
          what you want to do is have dependency pick up module from environment
<tbenst> jtojnar, sorry I didn't quite understand "dependency pick up module from
         environment", could you expand on that?
<tbenst> I think that means adding the PATH for mlflow to cmd_env["PATH"]
<jtojnar> tbenst mlflow provides a Python module and then runs its dependency
          gunicorn  and expects it to find the module
<jtojnar> tbenst for that you would need the subprocess to preserve PYTHONPATH
          env var (or possibly extend it with the value of `site.getsitepackages`)
<tbenst> jtojnar, I tried `cmd_env['PYTHONPATH'] = ':'.join(site.getsitepackages())`
         but didn't fix it
* jtojnar sent a long message:  < https://matrix.org/_matrix/media/r0/download/matrix.org/LHmQBUpAkMUrPcRzhujTGWOu >
<jtojnar> tbenst do you see it in gunicorn?
<tbenst> jtojnar, not sure I understand the question, or at least not sure how to
         answer it. Right now I'm trying to just call gunicorn directly by overriding the mlflow script, but finding it mighty challenging
<jtojnar> tbenst it would be nice to see if the env var/sitepackages are getting
          through the subprocess
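
The idea from the log, making the child interpreter see the same import paths as the parent, can be sketched like this. Both helper names are hypothetical, and this only helps if nothing between the parent and the child interpreter (such as a wrapper script) resets PYTHONPATH.

```python
import os
import subprocess
import sys


def env_with_parent_sys_path(base_env=None):
    """Copy the environment and set PYTHONPATH to the parent's current sys.path."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTHONPATH"] = os.pathsep.join(p for p in sys.path if p)
    return env


def child_can_import(module, env):
    """Start a fresh interpreter with `env` and report whether `module` imports."""
    result = subprocess.run(
        [sys.executable, "-c", "import " + module],
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    print(child_can_import("json", env_with_parent_sys_path()))
```

Checking `child_can_import("mlflow", ...)` with and without the extended environment would show whether the variable actually gets through to the subprocess.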

I missed the long message at first, reposting here in case matrix deletes.

I did an experiment that confirms @jtojnar's suspicions that PYTHONPATH is somehow not being set in the subprocess:

patchPhase = ''
    substituteInPlace mlflow/utils/process.py --replace \
      "child = subprocess.Popen(cmd, env=cmd_env, cwd=cwd, universal_newlines=True," \
      "import site; py_path=':'.join(site.getsitepackages()); print('MAINPROC '+py_path); cmd_env['PYTHONPATH'] = py_path; child = subprocess.Popen(['echo', 'SUBPROCESS ', '$PYTHONPATH'], env=cmd_env, cwd=cwd, universal_newlines=True,"
  '';

Output: https://gist.github.com/tbenst/f13ad655e6ad4cc6dae31c49b9f3643a

A few things are very odd, namely:

❯ nix-shell -p python3
$ python
>>> import os
>>> import site
>>> py_path=':'.join(site.getsitepackages())
>>> cmd_env = os.environ.copy()
>>> cmd_env["PYTHONPATH"] = py_path
>>> import subprocess
>>> subprocess.Popen(['echo', 'SUBPROCESS ', '$PYTHONPATH'], env=cmd_env, cwd=None, universal_newlines=True, stdin=subprocess.PIPE)
<subprocess.Popen object at 0x7f14357cda90>
SUBPROCESS  $PYTHONPATH
>>> subprocess.Popen('echo SUBPROCESS $PYTHONPATH', env=cmd_env, cwd=None, universal_newlines=True, stdin=subprocess.PIPE, shell=True)
<subprocess.Popen object at 0x7f14357d2278>
SUBPROCESS /nix/store/s5f3vpmig33nk4zyk228q55wdydd3pc2-python3-3.7.3/lib/python3.7/site-packages
  • Secondly, even though I set PYTHONPATH, it appears to have no effect. Once again, this appears to happen only in the mlflow script. It works fine if I call python3:
>>> cmd_env["PYTHONPATH"] = "hello"
>>> subprocess.Popen('echo SUBPROCESS $PYTHONPATH', env=cmd_env, cwd=None, universal_newlines=True, stdin=subprocess.PIPE, shell=True)
<subprocess.Popen object at 0x7f14357cdf60>
SUBPROCESS hello
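
One note on the first echo call: with a list argument and no shell=True, nothing expands $PYTHONPATH, so echo receives the literal string. A sketch that reads the variable inside the child process instead, so no shell is involved (`child_env_value` is a hypothetical helper name):

```python
import ast
import os
import subprocess
import sys


def child_env_value(name, env):
    """Return the value of variable `name` as seen by a child Python process.

    Returns None if the variable is unset in the child.
    """
    code = "import os, sys; sys.stdout.write(repr(os.environ.get(sys.argv[1])))"
    out = subprocess.run(
        [sys.executable, "-c", code, name],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout
    return ast.literal_eval(out)


if __name__ == "__main__":
    demo_env = dict(os.environ, DEMO_VAR="hello")
    print(child_env_value("DEMO_VAR", demo_env))
```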

tbenst commented Dec 3, 2019

It appears that this issue is caused by #23676


jonringer commented Dec 3, 2019

What you're looking for is buildPythonApplication. I would recommend adding mlflow = with python3Packages; toPythonApplication mlflow; to all-packages.nix, then just calling nix-shell -p mlflow.

❯ nix-shell -p python3
$ python
>>> import os
>>> import site
>>> py_path=':'.join(site.getsitepackages())
>>> cmd_env = os.environ.copy()
>>> cmd_env["PYTHONPATH"] = py_path
>>> import subprocess
>>> subprocess.Popen(['echo', 'SUBPROCESS ', '$PYTHONPATH'], env=cmd_env, cwd=None, universal_newlines=True, stdin=subprocess.PIPE)
<subprocess.Popen object at 0x7f14357cda90>
SUBPROCESS  $PYTHONPATH
>>> subprocess.Popen('echo SUBPROCESS $PYTHONPATH', env=cmd_env, cwd=None, universal_newlines=True, stdin=subprocess.PIPE, shell=True)
<subprocess.Popen object at 0x7f14357d2278>
SUBPROCESS /nix/store/s5f3vpmig33nk4zyk228q55wdydd3pc2-python3-3.7.3/lib/python3.7/site-packages

You just happened to construct the only path that matters: the Python standard lib (site-packages) gets put on PYTHONPATH because the interpreter is wrapped:

[nix-shell:~/projects/nixpkgs]$ python
Python 3.7.5 (default, Oct 14 2019, 23:08:55)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import subprocess
>>> subprocess.Popen(['echo $PYTHONPATH'], stdin=subprocess.PIPE, shell=True)
<subprocess.Popen object at 0x7f2331c81dd0>
>>> /nix/store/gpnm7i19lpj8p43mjrdw03d0hjalmskl-python3-3.7.5/lib/python3.7/site-packages:/nix/store/gpnm7i19lpj8p43mjrdw03d0hjalmskl-python3-3.7.5/lib/python3.7/site-packages

jonringer commented Dec 3, 2019

while your package is in python-modules, you will need to patch the source code to allow for the packages to call commands or reference store paths. If you're just an application, then you're free to use wrapping mechanisms to your pleasure


jonringer commented Dec 3, 2019

[11:12:46] jon@jon-workstation ~/projects/nixpkgs ((ef67a380caa...))
$ nix-shell -p "with import ./. {}; with python3Packages; toPythonApplication mlflow"

[nix-shell:~/projects/nixpkgs]$ mlflow --help
Usage: mlflow [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  artifacts    Upload, list, and download artifacts from an MLflow artifact...
  azureml      Serve models on Azure ML.
  db           Commands for managing an MLflow tracking database.
  experiments  Manage experiments.
  models       Deploy MLflow models locally.
  run          Run an MLflow project from the given URI.
  runs         Manage runs.
  sagemaker    Serve models on SageMaker.
  server       Run the MLflow tracking server.
  ui           Launch the MLflow tracking UI for local viewing of run...

[nix-shell:~/projects/nixpkgs]$ mlflow server
[2019-12-03 11:13:26 -0800] [20735] [INFO] Starting gunicorn 19.9.0
[2019-12-03 11:13:26 -0800] [20735] [INFO] Listening at: http://127.0.0.1:5000 (20735)
[2019-12-03 11:13:26 -0800] [20735] [INFO] Using worker: sync
[2019-12-03 11:13:26 -0800] [20738] [INFO] Booting worker with pid: 20738
[2019-12-03 11:13:26 -0800] [20739] [INFO] Booting worker with pid: 20739
[2019-12-03 11:13:26 -0800] [20740] [INFO] Booting worker with pid: 20740
[2019-12-03 11:13:26 -0800] [20741] [INFO] Booting worker with pid: 20741
@tbenst tbenst force-pushed the tbenst:mlflow branch from ef67a38 to 3cb9de5 Dec 3, 2019

tbenst commented Dec 3, 2019

@jonringer ah, I didn't know about toPythonApplication, neat! I think this resolves the issue for now, as I believe mlflow only uses subprocess when used as a command-line app, but I'm about to do more testing to verify. If I'm right, this should be good for final review and ready for merge after gunicorn is merged from staging-next.


tbenst commented Dec 4, 2019

Definitely progress, as mlflow server now runs inside of a nix-shell. Unfortunately, mlflow server still fails with the import error if I run the pythonApplication outside of nix-shell. I also tried setting NIX_PYTHONPATH, but this did not affect the subprocess path either.

According to @FRidh's comment, I thought that setting NIX_PYTHONPATH would append to PYTHONPATH, but any value that I set for cmd_env is unset in subprocess. Thus, subprocess env seems to be completely broken inside of a wrapped application.

I basically want the initialization that happens with mlflow and .mlflow-wrapped to also occur in my subprocess, but outside of using nix-shell have no idea how to do this.

I suppose I could always run mlflow inside of a nix-shell, but that feels wrong?


jonringer commented Dec 4, 2019

if you want things set in that manner, you will want to use python.withPackages


tbenst commented Dec 4, 2019

@jonringer not sure I understand--are you suggesting I add python.withPackages (ps: with ps; [mlflow]) to the derivation and then somehow set the PATH in subprocess? Wouldn't that be recursive? Also, currently I'm unable to set PATH in subprocess

Edit: didn't mean to sound like I have a preferred solution in the last comment. I'd simply be happy with mlflow server working without needing nix-shell


jonringer commented Dec 4, 2019

the python.withPackages(...) function will create a python interpreter which is wrapped with PYTHONPATH, so you don't have to worry about what is or isn't present when you go to call python
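
One way to verify what a given interpreter actually ends up with is to ask it for its sys.path from a separate process. A minimal sketch, where `child_sys_path` is a hypothetical helper and the interpreter path is whatever you want to inspect:

```python
import json
import subprocess
import sys


def child_sys_path(interpreter):
    """Return sys.path as reported by a fresh process of `interpreter`."""
    out = subprocess.run(
        [interpreter, "-c", "import json, sys; print(json.dumps(sys.path))"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout
    return json.loads(out)


if __name__ == "__main__":
    for entry in child_sys_path(sys.executable):
        print(entry)
```

Pointing it at the interpreter from the withPackages environment and at the bare python3 store path should make the difference in site-packages directories visible.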


tbenst commented Dec 4, 2019

@jonringer I think this is what you mean? Doesn't apply to subprocess :/:

configuration.nix:

let
  python = pkgs.python37.withPackages (ps: with ps; [ mlflow ]);
in 
{
  environment.systemPackages = [ python ];
}
> mlflow server
ModuleNotFoundError: No module named 'mlflow'

Edit: I suppose I could write my own gunicorn command...? And avoid the vars set by gunicorn and .gunicorn-wrapped


jonringer commented Dec 4, 2019

That should work.... as it will load all transitive dependencies into a different env derivation:

$ nix-shell -p "with import ./. {}; python37.withPackages (ps: with ps; [ requests ])"

[nix-shell:/home/jon/projects/nixpkgs]$ echo $PYTHONPATH


[nix-shell:/home/jon/projects/nixpkgs]$ python
Python 3.7.5 (default, Oct 14 2019, 23:08:55)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print(sys.path)
['', '/nix/store/drr8qcgiccfc5by09r5zc30flgwh1mbx-python3-3.7.5/lib/python37.zip', '/nix/store/drr8qcgiccfc5by09r5zc30flgwh1mbx-python3-3.7.5/lib/python3.7', '/nix/store/drr8qcgiccfc5by09r5zc30flgwh1mbx-python3-3.7.5/lib/python3.7/lib-dynload', '/nix/store/drr8qcgiccfc5by09r5zc30flgwh1mbx-python3-3.7.5/lib/python3.7/site-packages', '/nix/store/2hfkdy309kzfmnm8qksm5ms12i40qji1-python3-3.7.5-env/lib/python3.7/site-packages']
>>>
KeyboardInterrupt
>>>

[nix-shell:/home/jon/projects/nixpkgs]$ ls /nix/store/2hfkdy309kzfmnm8qksm5ms12i40qji1-python3-3.7.5-env/lib/python3.7/site-packages
__pycache__                                    cffi                     cryptography                OpenSSL                   pyasn1-0.4.8.dist-info      pyparsing-2.4.5.dist-info  requests                   six.py           urllib3-1.25.7.dist-info
_cffi_backend.cpython-37m-x86_64-linux-gnu.so  cffi-1.13.2.dist-info    cryptography-2.8.dist-info  packaging                 pycparser                   pyparsing.py               requests-2.22.0.dist-info  socks.py
certifi                                        chardet                  idna                        packaging-19.2.dist-info  pycparser-2.19.dist-info    PySocks-1.7.0.dist-info    sitecustomize.py           sockshandler.py
certifi-2019.9.11.dist-info                    chardet-3.0.4.dist-info  idna-2.8.dist-info          pyasn1                    pyOpenSSL-19.0.0.dist-info  README.txt                 six-1.12.0.dist-info       urllib3
Contributor Author

tbenst commented Dec 4, 2019

@jonringer any idea how to add mlflow from within the derivation?

  # TODO need to add mlflow here
  mlPython = python.withPackages(ps: with ps; [ gunicorn]);
  patchPhase = ''
    substituteInPlace mlflow/utils/process.py --replace \
      "child = subprocess.Popen(cmd, env=cmd_env, cwd=cwd, universal_newlines=True," \
      "cmd[0]='${mlPython.interpreter}'; cmd.insert(1, '$out/bin/gunicornMlflow'); child = subprocess.Popen(cmd, env=cmd_env, cwd=cwd, universal_newlines=True,"
  '';

  gunicornScript = writeText "gunicornMlflow"
    ''
      import re
      import sys
      from gunicorn.app.wsgiapp import run
      if __name__ == '__main__':
        sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', ''', sys.argv[0])
        sys.exit(run())
    '';

  postInstall = ''
    ln -s ${gunicornScript} $out/bin/gunicornMlflow
    wrapPythonProgramsIn $out/bin/gunicornMlflow
  '';
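
For readers following the substitution above, here is a rough sketch of what the patched line does to the command list before it reaches subprocess.Popen. The function name patch_command, the example arguments, and the paths are hypothetical and only for illustration; the real command is whatever mlflow builds internally.

```python
def patch_command(cmd, interpreter, script):
    """Return a copy of cmd whose launcher is replaced by interpreter + script.

    Mirrors the injected statements:
        cmd[0] = '<interpreter>'; cmd.insert(1, '<script>')
    """
    patched = list(cmd)          # copy so the caller's list is left untouched
    patched[0] = interpreter     # run the env's Python instead of the bare launcher
    patched.insert(1, script)    # the wrapper script becomes the first argument
    return patched


if __name__ == "__main__":
    # Hypothetical command line, not the one mlflow actually generates.
    original = ["gunicorn", "-b", "127.0.0.1:5000", "example.module:app"]
    print(patch_command(original, "/path/to/env/bin/python",
                        "/path/to/out/bin/gunicornMlflow"))
```

The remaining arguments pass through unchanged; only the program being executed is swapped.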
tbenst force-pushed the tbenst:mlflow branch from 3cb9de5 to bb63bb8 Dec 4, 2019

Contributor Author

tbenst commented Dec 4, 2019

Got it working! The trick was to manually replicate the basic gunicorn script that is generated from setup.py, and allow wrapPython to do its thing in postFixup.

Edit: there might be a more elegant approach in https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/version-management/sourcehut/default.nix but it's not clear whether that would work, so I'm content for now.
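
As a small aside on the replicated script: the re.sub line in it only normalizes sys.argv[0] by stripping Windows-style launcher suffixes, so on Linux it leaves the path alone. A minimal sketch, where the helper name normalize_argv0 is made up for illustration:

```python
import re


def normalize_argv0(argv0):
    """Strip a trailing '-script.pyw' or '.exe' launcher suffix, if present."""
    return re.sub(r"(-script\.pyw|\.exe)?$", "", argv0)


if __name__ == "__main__":
    for name in ("gunicorn-script.pyw", "gunicorn.exe", "/some/bin/gunicornMlflow"):
        print(name, "->", normalize_argv0(name))
```

Note that inside a Nix indented string the empty Python string has to be written as ''' because '' would otherwise end the Nix string.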

tbenst force-pushed the tbenst:mlflow branch 2 times, most recently from 5670584 to ffd7467 Dec 4, 2019
Contributor

jonringer left a comment

Personally, I would move mlflow out of python-packages and use buildPythonApplication instead. It would solve most of your issues.

];

checkPhase = "pytest tests";
# tests folder is missing in PyPI

jonringer Dec 4, 2019

Contributor

Please fetch the source from GitHub instead of PyPI so we can get some unit tests.

jonringer Dec 5, 2019

Contributor

Do you have an example of how you would like to use mlflow? It would probably help me gauge what should be reflected in the expressions.

tbenst Dec 5, 2019

Author Contributor

definitely, here's a basic one: https://gist.github.com/tbenst/e793704190322915e9641c64351322a2

Essentially, I use mlflow to track model performance (similar to TensorBoard, if you're familiar) and store model parameters. I have a box on AWS running mlflow server, and then use mlflow in local Python scripts to log to the server.

Really appreciate all of your help!

tbenst Dec 5, 2019

Author Contributor

I should also mention that databricks-cli is only needed for using their SaaS offering of mlflow server. I don't use this, so as far as I'm concerned we could just remove the import, but I thought I'd leave it in case another Nix user wants it, and I only have goodwill towards Databricks for open-sourcing everything.

  patchPhase = ''
    substituteInPlace mlflow/utils/process.py --replace \
      "child = subprocess.Popen(cmd, env=cmd_env, cwd=cwd, universal_newlines=True," \
      "cmd[0]='$out/bin/gunicornMlflow'; child = subprocess.Popen(cmd, env=cmd_env, cwd=cwd, universal_newlines=True,"
  '';

  gunicornScript = writeText "gunicornMlflow"
    ''
      #!/usr/bin/env python
      import re
      import sys
      from gunicorn.app.wsgiapp import run
      if __name__ == '__main__':
        sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', ''', sys.argv[0])
        sys.exit(run())
    '';

  postInstall = ''
    gpath=$out/bin/gunicornMlflow
    cp ${gunicornScript} $gpath
    chmod 555 $gpath
  '';
Comment on lines +37 to +58

jonringer Dec 4, 2019

Contributor

this should be unneeded

tbenst Dec 5, 2019

Author Contributor

Even if I move it out of python-modules and use buildPythonApplication, this is still needed. Without it, gunicorn will not be able to import mlflow, which is necessary.

Note that this is a subtle hack--I'm leveraging

# Rewrite "#! .../env python" to "#! /nix/store/.../python".
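
To make the hack a bit more concrete, here is an approximate sketch of that kind of shebang rewrite. It is not the nixpkgs implementation (which is done in shell); the function name rewrite_shebang and the interpreter path are hypothetical.

```python
import re

# Matches a first line such as "#!/usr/bin/env python" or "#! /usr/bin/env python3".
_ENV_PYTHON = re.compile(r"^#!\s*\S*/env\s+python\S*")


def rewrite_shebang(script_text, interpreter):
    """Point an 'env python' shebang at an absolute interpreter path.

    Only the first line is inspected; scripts with any other shebang
    (or none) are returned unchanged.
    """
    first, sep, rest = script_text.partition("\n")
    if _ENV_PYTHON.match(first):
        first = "#!" + interpreter
    return first + sep + rest


if __name__ == "__main__":
    script = "#!/usr/bin/env python\nimport sys\n"
    # Hypothetical store path, for illustration only.
    print(rewrite_shebang(script, "/nix/store/example-python-env/bin/python"))
```

Because the copied gunicornMlflow script starts with an env python shebang, a rewrite like this is what lets it end up running under an interpreter that can see mlflow.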

];

checkPhase = "python querystring_parser/tests.py";
# one test fails due to https://github.com/bernii/querystring-parser/issues/35

jonringer Dec 4, 2019

Contributor

You just want to run unit tests; you should be able to exclude the integration tests.

tbenst Dec 5, 2019

Author Contributor
Contributor Author

tbenst commented Dec 5, 2019

> personally, i would move mlflow out of python-packages, and use buildPythonApplication instead. It would solve most of your issues.

I am using mlflow = with python3Packages; toPythonApplication mlflow; in all-packages.nix; would buildPythonApplication do something different? If so, I suppose I could make separate Nix derivations for python-modules and the application itself. To use mlflow, you need both A) a server running via mlflow server and B) to import mlflow in your Python code.

Edit: no difference in behavior using buildPythonApplication. The subprocess is still broken and cannot be passed PYTHONPATH. I think the current solution is the only viable option given the current site-initialization approach in nixpkgs.

here's the branch for buildPythonApplication: https://github.com/tbenst/nixpkgs/tree/mlflow-app
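
For anyone reading along, the underlying mechanism can be sketched outside of Nix: search paths added inside the parent interpreter at startup (here sys.path.append stands in for a sitecustomize-style hook, which is an assumption about how the env is set up) are not inherited by a child process, whereas a PYTHONPATH environment variable is. The helper names below are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile

# Child-side check: is the directory given as argv[1] on the child's sys.path?
CHECK = (
    "import os, sys; t = os.path.realpath(sys.argv[1]); "
    "print(any(os.path.realpath(p) == t for p in sys.path if p))"
)


def child_sees(path, env):
    """Start a fresh interpreter and report whether `path` is on its sys.path."""
    result = subprocess.run(
        [sys.executable, "-c", CHECK, path],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == "True"


def demo():
    with tempfile.TemporaryDirectory() as extra:
        sys.path.append(extra)  # in-process change, like a site hook would make
        try:
            base_env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
            in_process_only = child_sees(extra, base_env)
            via_pythonpath = child_sees(extra, {**base_env, "PYTHONPATH": extra})
        finally:
            sys.path.remove(extra)
    return in_process_only, via_pythonpath


if __name__ == "__main__":
    print(demo())
```

The first value shows that the child does not see the parent's in-process path, and the second that it does once the path is exported through the environment.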

@tbenst tbenst force-pushed the tbenst:mlflow branch from ffd7467 to 21bd378 Dec 5, 2019