Cannot install a wheel package in a directory with whitespaces #228

Closed
simnalamburt opened this issue Feb 16, 2017 · 10 comments · Fixed by #238
Labels
Type: Bug 🐛 This issue is a bug.

Comments

@simnalamburt
Contributor

simnalamburt commented Feb 16, 2017

Summary

  • If the name of the current working directory contains whitespace,
  • you won't be able to install a package in the wheel format.

This bug occurs regardless of the value of the PIPENV_VENV_IN_PROJECT environment variable.

How to reproduce this error

Please comment if you cannot reproduce the error.

$ mkdir 'directory with spaces'
$ cd 'directory with spaces/'


$ pipenv --three
Creating a Pipfile for this project...
Creating a virtualenv for this project...
⠋Running virtualenv with interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in /home/simnalamburt/.local/share/virtualenvs/directory with spaces/bin/python3
Also creating executable in /home/simnalamburt/.local/share/virtualenvs/directory with spaces/bin/python
Installing setuptools, pip, wheel...done.

Virtualenv location:


$ pipenv install numpy
Creating a virtualenv for this project...
⠋Using base prefix '/usr'
New python executable in /home/simnalamburt/.local/share/virtualenvs/directory with spaces/bin/python
Traceback (most recent call last):
  File "/usr/bin/virtualenv", line 11, in <module>
    sys.exit(main())
  File "/usr/lib/python3.6/site-packages/virtualenv.py", line 713, in main
    symlink=options.symlink)
  File "/usr/lib/python3.6/site-packages/virtualenv.py", line 925, in create_environment
    site_packages=site_packages, clear=clear, symlink=symlink))
  File "/usr/lib/python3.6/site-packages/virtualenv.py", line 1370, in install_python
    os.symlink(py_executable_base, full_pth)
FileExistsError: [Errno 17] File exists: 'python' -> '/home/simnalamburt/.local/share/virtualenvs/directory with spaces/bin/python3.6'

Virtualenv location:
Installing numpy...
Collecting numpy
  Using cached numpy-1.12.0-cp36-cp36m-manylinux1_x86_64.whl
Installing collected packages: numpy

Error:  An error occurred while installing numpy!
Exception:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/usr/lib/python3.6/site-packages/pip/commands/install.py", line 342, in run
    prefix=options.prefix_path,
  File "/usr/lib/python3.6/site-packages/pip/req/req_set.py", line 784, in install
    **kwargs
  File "/usr/lib/python3.6/site-packages/pip/req/req_install.py", line 851, in install
    self.move_wheel_files(self.source_dir, root=root, prefix=prefix)
  File "/usr/lib/python3.6/site-packages/pip/req/req_install.py", line 1064, in move_wheel_files
    isolated=self.isolated,
  File "/usr/lib/python3.6/site-packages/pip/wheel.py", line 345, in move_wheel_files
    clobber(source, lib_dir, True)
  File "/usr/lib/python3.6/site-packages/pip/wheel.py", line 316, in clobber
    ensure_dir(destdir)
  File "/usr/lib/python3.6/site-packages/pip/utils/__init__.py", line 83, in ensure_dir
    os.makedirs(path)
  File "/usr/lib/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/usr/lib/python3.6/site-packages/numpy-1.12.0.dist-info'


My Environment

I could reproduce the error in multiple environments. Please comment if you want more information.

$ uname -a
Linux 4.9.6-1-ARCH #1 SMP PREEMPT Thu Jan 26 09:22:26 CET 2017 x86_64 GNU/Linux
$ python3 --version
Python 3.6.0
$ pipenv --version
pipenv, version 3.4.1
$ uname -a
Darwin 16.4.0 Darwin Kernel Version 16.4.0: Thu Dec 22 22:53:21 PST 2016; root:xnu-3789.41.3~3/RELEASE_X86_64 x86_64
$ python3 --version
Python 3.6.0
$ pipenv --version
pipenv, version 3.4.1
@simnalamburt
Contributor Author

simnalamburt commented Feb 16, 2017

I'd like to fix this issue right away, so I've put together a few plans.

@kennethreitz @nateprewitt Please review and comment. If you choose a plan, I'll send a PR right away. Thanks!


What causes the issue?

When the directory name contains whitespace, so does the name of the virtualenv.

/home/user/.local/share/virtualenvs/directory with spaces/bin/pip

And if you try to execute pip with a format string like the one below:

delegator.run('{0} install pip'.format(which_pip()))

The shell will interpret the command like this:

/home/user/.local/share/virtualenvs/directory with spaces/bin/pip install pip
                                              ^^^^
                                              Word splitting has occurred!

Since there's no executable named /home/user/.local/share/virtualenvs/directory, it'll fail.

So you should double quote the path like this:

delegator.run('"{0}" install pip'.format(which_pip()))

But this is only one pattern that I've found. There are multiple places that cause the whitespace issue, and we literally need to quote the path "wherever possible" to fix it. I'm not familiar with the pipenv codebase, so I'm having trouble quoting all the paths. I need someone to help me out.
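
The word splitting described above can be reproduced without running any command. The sketch below is only an illustration: it uses the standard library's shlex.split to approximate POSIX shell word splitting, and shlex.quote (Python 3.3+) as one possible way to quote the path. The pip_path value is the example path from this comment, not something read from pipenv.

```python
import shlex

# Example virtualenv path from this comment; it contains two spaces.
pip_path = '/home/user/.local/share/virtualenvs/directory with spaces/bin/pip'

# Built the same way as the failing call: the path is not quoted.
unquoted = '{0} install pip'.format(pip_path)

# Same command, but with the path quoted for a POSIX shell.
quoted = '{0} install pip'.format(shlex.quote(pip_path))

# shlex.split approximates how a POSIX shell breaks a command into words.
unquoted_words = shlex.split(unquoted)
quoted_words = shlex.split(quoted)

print(unquoted_words)  # the path is broken into three separate words
print(quoted_words)    # the path survives as a single word
```

In the unquoted case the first word is /home/user/.local/share/virtualenvs/directory, which is exactly the non-existent executable the shell tries to run.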


How can we solve this issue?

Plan A. Slugify the directory name

--- a/pipenv/project.py
+++ b/pipenv/project.py
@@ -6,6 +6,7 @@ import pipfile
 import toml

 import delegator
+from slugify import slugify
 from requests.compat import OrderedDict

 from .utils import format_toml, mkdir_p
@@ -26,7 +27,7 @@ class Project(object):
     @property
     def name(self):
         if self._name is None:
-            self._name = self.pipfile_location.split(os.sep)[-2]
+            self._name = slugify(self.pipfile_location.split(os.sep)[-2])
         return self._name

     @property

Pros

  1. It is the simplest solution, and it prevents the 'whitespace in path' error once and for all.

Cons

  1. It adds two additional dependencies: python-slugify and Unidecode. This may be undesirable for some people, since pipenv is meant to be installed globally.
  2. It resolves the issue only when the PIPENV_VENV_IN_PROJECT environment variable is not set.
  3. It may be undesirable for the few users who care about the name of the virtualenv. This can be a somewhat serious issue, since it completely changes project names that contain characters outside the Latin alphabet. For example, '한글' will be turned into 'hangeul'.

Plan B. Double quote ALL the paths to prevent word splitting.

EDIT: Sorry, we can't choose this option. Please see the next comment for why.

You might think this problem can be solved by escaping paths with backslashes, but that's just a temporary solution. You'd have to properly escape all the special characters like |, ^, &, etc. and handle existing backslashes correctly, which is complicated and error-prone. Double quoting is the easiest way.

(You don't want to handle this escaping by hand.)

Pros

  1. This is the standard solution. You should always double quote variables when you write shell scripts.
  2. It'll resolve the issue regardless of the value of the PIPENV_VENV_IN_PROJECT environment variable.

Cons

  1. There's no good way to ensure that all the paths are properly double quoted. A human must check it by eye.
  2. I'm not familiar with the pipenv codebase, so either someone helps me or it'll take me some time to fix this.

Plan C. Do both Plan A and B

@simnalamburt
Contributor Author

simnalamburt commented Feb 16, 2017

I found that plan B cannot solve this problem.

I tried to solve this issue with both plan A and plan B, and plan A worked perfectly. But plan B could not solve the issue because of a bug in pip. Please see pypa/pip#923 for further details.

Since there's no easy way left other than plan A to solve this problem, I made a pull request with plan A. Please comment if you have another good idea. Thanks.

Use the command below to try the patched version of pipenv.

pip install git+https://github.com/simnalamburt/pipenv.git@plan-a
References

@nateprewitt
Sponsor Member

Closing this for the reasons discussed here in #230. Please let us know if you find more information here that will allow us to proceed with a globally workable solution. Thanks again @simnalamburt!

@simnalamburt
Contributor Author

Reopened by https://github.com/kennethreitz/pipenv/pull/230#issuecomment-280413728

Anyone who wants an alternative design, please comment! 😄 Let's solve this problem elegantly with our collective intelligence!

@nateprewitt added the Type: Bug 🐛 label Feb 17, 2017
@simnalamburt
Contributor Author

simnalamburt commented Feb 20, 2017

Summary:

  • Let's solve this problem only for the case where the PIPENV_VENV_IN_PROJECT environment variable is not set.
  • I'll solve multiple issues with a single compatibility break.
  • To do so, we need to design a new naming scheme for virtualenv directories.
  • Question: do we have to care about the readability of the virtualenv's name?
  • If we don't, I'll just hash the project's full path with BLAKE2 and use the result as the name of the virtualenv.

This comment is long. Dear project owner and collaborators, please read it and advise. Thanks!



First, I'll focus on solving this problem for the case where the PIPENV_VENV_IN_PROJECT environment variable is not enabled. When it is enabled, pypa/virtualenv#53 leaves no way to solve this problem without abandoning pypa/virtualenv or fixing pypa/virtualenv#53. Personally, I don't think pypa/virtualenv#53 is an unresolvable problem: by building a tiny native executable in C, we should be able to work around the "whitespace in the shebang" issue. But that is another problem, and changing the behavior of virtualenv would require a lot of debate and inconvenience thousands of users, so I'll save that issue for another day. For now, I'll focus on solving this error without the PIPENV_VENV_IN_PROJECT environment variable.

As I said, if we have to break backward compatibility at all, I want to solve as many problems as possible in that one break. I don't want frequent compatibility breaks to create the misconception that "pipenv is unstable, and it breaks things".

The problems related to the name of the virtualenv that I know are:

  1. Cannot install a wheel package in a directory with whitespaces (#228)
  2. If multiple pipenv projects have the same directory name, they'll overwrite each other's environment.

Please comment if there are additional issues with virtualenv names. Let's solve all these problems at once.

We already have a "same project name" issue today. At present, all of these directories are considered the same project.

  • /home/user/my-project
  • /home/user/workspace/my-project
  • /home/user/another workspace/my-project

But with https://github.com/kennethreitz/pipenv/pull/230 merged, this problem gets even worse. All of these directories will be considered the same project.

  • /home/user/my-project
  • /home/user/My,project
  • /home/user/MY_PROJECT
  • /home/user/workspace/my-project
  • /home/user/another workspace/my-project
  • /home/user/another workspace/My Project
  • etc
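
The collision can be seen with a small sketch. Note that simplified_slug below is a hypothetical stand-in written for this illustration, not the python-slugify implementation, and it assumes the virtualenv name is derived only from the last component of the project path.

```python
import posixpath
import re


def simplified_slug(name):
    """Hypothetical stand-in for a slugify function (not python-slugify itself).

    Lowercases the name and collapses every run of characters other than
    a-z and 0-9 into a single '-'.
    """
    return re.sub(r'[^a-z0-9]+', '-', name.lower()).strip('-')


project_dirs = [
    '/home/user/my-project',
    '/home/user/My,project',
    '/home/user/MY_PROJECT',
    '/home/user/workspace/my-project',
    '/home/user/another workspace/my-project',
    '/home/user/another workspace/My Project',
]

# Assumption: the name comes only from the last path component.
venv_names = {d: simplified_slug(posixpath.basename(d)) for d in project_dirs}

for directory, name in venv_names.items():
    print('{0!r} -> {1!r}'.format(directory, name))

# Every project above ends up with the same virtualenv name.
distinct_names = set(venv_names.values())
print(distinct_names)
```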

To get rid of these "name problems" once and for all, I think we need to design a new naming scheme for virtualenv directories together.

There can be multiple solutions to this problem, but they differ depending on the requirements for the virtualenv's name. So here's an important question: do we have to care about the readability of the virtualenv's name?

Personally, I think we don't have to. Since pipenv fully encapsulates the virtualenv, users rarely need to care about the names and details of virtualenvs. They are created automatically somewhere outside the project directory, and users don't need to know their names, since they can fully access the virtualenv by using the pipenv run ~ and pipenv shell commands as a proxy. Pipenv effectively hides the virtualenv from users.

The only time users see the name of the virtualenv is when they run the pipenv shell command, since the name is shown in the shell prompt. Even then they never need to type the name manually, so I think we don't need to care about it.

The reason I ask is that these problems become much easier to solve if we don't need to care about the name of the virtualenv!

Proposal 1. Use a cryptographic hash function, then base64-encode the result.

Hash the full project path with a cryptographic hash function, then encode the digest as URL-safe base64 or as a hexadecimal string. Let's use that as the name of the virtualenv.

Using a cryptographic hash function to identify something is a widely accepted approach, used by software such as git. Cryptographic hash functions have pre-image resistance, second pre-image resistance, and collision resistance. Users are not going to create a huge number of virtualenvs (more than 2**32), so it's fine to use a cryptographic hash function to identify the full project path.

I want to use BLAKE2, since it's one of the fastest cryptographic hash functions at present and it has been part of the standard library since Python 3.6. For older versions I'll use the pyblake2 package, since the owner of the blake2 project wants to transfer its name to the pyblake2 author. (I heard this from the project owner directly.)

So it'll be like this.

import sys
import base64
if sys.version_info >= (3, 6):
    from hashlib import blake2b
else:
    from pyblake2 import blake2b

path = b'/home/Hyeon Kim/My Workspace/My Project'
base64.urlsafe_b64encode(blake2b(path, digest_size=9).digest())
# b'05AKATXSLXYA'

Now 05AKATXSLXYA will be the name of our virtualenv. It'll be placed at ~/.local/share/virtualenvs/05AKATXSLXYA
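
As a rough sketch of how this could be wrapped up, the snippet below puts the hashing into a helper. The name hashed_venv_name is hypothetical and is not part of pipenv, and the snippet assumes Python 3.6+ so that hashlib.blake2b is available. It only checks the properties that matter here: the name is deterministic, whitespace-free, and different for projects that share a directory name.

```python
import base64
from hashlib import blake2b  # in the standard library since Python 3.6


def hashed_venv_name(project_path):
    """Hypothetical helper: derive a virtualenv name from the full project path."""
    digest = blake2b(project_path.encode('utf-8'), digest_size=9).digest()
    # 9 bytes encode to exactly 12 base64 characters, with no '=' padding.
    return base64.urlsafe_b64encode(digest).decode('ascii')


name_a = hashed_venv_name('/home/user/my-project')
name_b = hashed_venv_name('/home/user/workspace/my-project')
name_c = hashed_venv_name('/home/user/another workspace/My Project')

# Same directory name, different full paths: the names no longer collide.
print(name_a, name_b, name_c)
```

If URL-safe base64 turns out to be awkward in some context, the same digest could be rendered with hexdigest() instead, at the cost of a longer name.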

It's short enough and unique enough. Furthermore, it'll help pipenv work on Windows, since there'll be no more strange artifacts in the virtualenv's path. All we need to do is detect the OS and change the path to %USERPROFILE%\AppData\Local\pipenv or somewhere similar.
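
The OS detection could look something like the sketch below. Everything here is an assumption made for illustration: venv_base_dir is a hypothetical helper, the Windows location is just the one suggested above, and the OS name and environment are passed in as parameters so that the behavior can be checked on any platform.

```python
import ntpath
import posixpath


def venv_base_dir(os_name, environ):
    """Hypothetical helper: choose the directory that holds all virtualenvs.

    os_name is expected to look like os.name ('nt' on Windows), and environ
    is a mapping such as os.environ.
    """
    if os_name == 'nt':
        # Location suggested in this comment; not an established convention.
        return ntpath.join(environ['USERPROFILE'], 'AppData', 'Local', 'pipenv')
    return posixpath.join(environ['HOME'], '.local', 'share', 'virtualenvs')


windows_dir = venv_base_dir('nt', {'USERPROFILE': 'C:\\Users\\me'})
posix_dir = venv_base_dir('posix', {'HOME': '/home/me'})

print(windows_dir)
print(posix_dir)
```

In real use the call would be venv_base_dir(os.name, os.environ).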


... but what if we do have to care about the readability of the virtualenv's name?

Proposal 2. Make a big change.

Instead of ~/.local/share/virtualenvs/<name of venv>, store virtualenvs at ~/.local/share/virtualenvs/<ID of venv>/<readable name of venv>. <ID of venv> is a base64 encoding or hash of the virtualenv's full path, and <readable name of venv> is a readable name for the virtualenv without whitespace. We'll also have to handle various special characters to make pipenv work properly on multiple platforms.

Currently I don't have enough time to elaborate on this solution and, honestly, I don't like it. It may require modifying pew, or we may even need to abandon pew.
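
For reference, a rough sketch of the directory layout in Proposal 2 might look like this. It is only an illustration under several assumptions: venv_location is a hypothetical helper, the ID is taken to be a BLAKE2 hash of the project's full path (Python 3.6+), and the readable name only replaces whitespace with '-', without handling the other special characters mentioned above.

```python
import base64
import posixpath
import re
from hashlib import blake2b  # in the standard library since Python 3.6

VENV_ROOT = '~/.local/share/virtualenvs'


def venv_location(project_path):
    """Hypothetical helper: build <root>/<ID of venv>/<readable name of venv>."""
    digest = blake2b(project_path.encode('utf-8'), digest_size=9).digest()
    venv_id = base64.urlsafe_b64encode(digest).decode('ascii')
    # Readable part: the directory name with each run of whitespace replaced by '-'.
    readable = re.sub(r'\s+', '-', posixpath.basename(project_path))
    return posixpath.join(VENV_ROOT, venv_id, readable)


spaced = venv_location('/home/user/another workspace/My Project')
first = venv_location('/home/user/my-project')
second = venv_location('/home/user/workspace/my-project')

print(spaced)
print(first)
print(second)
```

The ID directory keeps projects with the same directory name apart, while the last path component stays readable.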


Thank you for reading this long comment. I really want proposal 1 to be accepted. Please review and advise. Thanks! :shipit:

@simnalamburt
Contributor Author

I have finished implementing proposal 1. Please try it with the command below:

pip install --upgrade git+https://github.com/simnalamburt/pipenv.git@issue-228

I have tested it on Arch Linux and OS X, and it worked fine.

@kennethreitz
Contributor

send a pull request!

@simnalamburt
Contributor Author

@kennethreitz Yes sir

@kennethreitz
Contributor

i don't personally think this is the right approach, having the project virtualenv named after the project directory is important

@kennethreitz
Contributor

but we can do it in exceptional cases, perhaps, like when there's a space.
