New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mount `pip_cache_path` in Docker container #3556

Merged
merged 3 commits into from Feb 15, 2018

Conversation

Projects
None yet
5 participants
@humitos
Member

humitos commented Jan 26, 2018

This allow us to have permanent pip cache between different builds /
projects to work faster locally and under slow internet connections.

Closes #3553

Mount `pip_cache_path` in Docker container
This allow us to have permanent pip cache between different builds /
projects to work faster locally and under slow internet connections.
@safwanrahman

This comment has been minimized.

Member

safwanrahman commented Jan 26, 2018

I actually dont like having pip cache in build.
One may put unvalidate package in the cache and that may invalid everything!
Most of the project do not include hash in their requirements.
So I think it may make build fails more often!

@humitos

This comment has been minimized.

Member

humitos commented Jan 26, 2018

@safwanrahman we were discussing this in the chat to "fix the problem" of working under intermitent/slow internet connections. This would be only for development and not a setting for production actually.

I think the examples you propose (the "why is not a good idea") are valid. So, maybe we should limit this line https://github.com/rtfd/readthedocs.org/pull/3299/files#diff-d61f6aff9736141f30b581471d5d55ceR444 only when DEBUG=True. What do you think?

@safwanrahman

This comment has been minimized.

Member

safwanrahman commented Jan 26, 2018

@humitos Yeah. I think putting it in development mode is totally a great idea!

@@ -449,7 +449,7 @@ def checkout_path(self, version=LATEST):
@property
def pip_cache_path(self):
"""Path to pip cache."""
if getattr(settings, 'GLOBAL_PIP_CACHE', False):
if getattr(settings, 'GLOBAL_PIP_CACHE', False) and settings.DEBUG:

This comment has been minimized.

@ericholscher

ericholscher Jan 30, 2018

Member

Is there a default GLOBAL_PIP_CACHE in dev.py?

This comment has been minimized.

@humitos

humitos Jan 30, 2018

Member

No. By default this feature is deactivated. It could be something like:

GLOBAL_PIP_CACHE = os.path.join(SITE_ROOT, '.cache', 'pip')

what do you think?

This comment has been minimized.

@agjohnson

agjohnson Jan 31, 2018

Contributor

I'd say leave deactivated so our development environments are close to production use.

@@ -728,6 +728,10 @@ def create_container(self):
'bind': self.project.doc_path,
'mode': 'rw'
},
self.project.pip_cache_path: {

This comment has been minimized.

@agjohnson

agjohnson Jan 30, 2018

Contributor

This is slightly redundant, as the project pip cache path is under doc_path already.

Also, normally to share a pip cache, we'd set CACHE_DIR env variable, or --cache-dir argument on pip. To use underlying storage like this, but then repeated remount this path to different locations through docker might cause problems if pip stores and metadata that includes the full path. I don't know enough about pip internals to answer this.

It seems like the most correct option would be to explicitly set CACHE_DIR on the pip commands if using a global cache, and to mount the global cache conditionally on the binds argument here if GLOBAL_PIP_CACHE is True.

This comment has been minimized.

@humitos

humitos Jan 30, 2018

Member

This is slightly redundant, as the project pip cache path is under doc_path already.

Mmm... Yes. This is redundant when the GLOBAL_PIP_CACHE is not set. So, maybe we should add this path only when it's set (as you said, conditionally).

Also, normally to share a pip cache, we'd set CACHE_DIR env variable, or --cache-dir argument on pip

We are using the --cache-dir in the command at python_environments.py file, and that dir points to the project.pip_cache_path

I prefer to use the argument option instead of the env variable (since it's less explicit and more complicated to debug later)

I don't know enough about pip internals to answer this

As far as I know, pip doesn't save metadata, but creates a hash of the package downloaded and use it as filename --but use the real one for wheels:

├── http
│   ├── 0
│   │   ├── 0
│   │   │   ├── 3
│   │   │   │   └── f
│   │   │   │       └── 8
│   │   │   │           └── 003f82596103cb4b5d71b69aeb8119f406b1bc78110a190848c4fb24
│   │   │   ├── 6
│   │   │   │   └── 6
│   │   │   │       └── b
│   │   │   │           └── 0066b726b642ce2657f32a7c7b9f63ae698a2a16695cda32f18eb806
└── wheels
    ├── 02
    │   └── 2e
    │       └── 0d
    │           └── 30ef9919ce3a07990f191c164d07b083761535fb721a1af3cf
    │               ├── ip_associations_python_novaclient_ext-0.2-cp27-none-any.whl
    │               └── ip_associations_python_novaclient_ext-0.2-cp36-none-any.whl
    ├── 06
    │   └── 3d
    │       └── 74
    │           └── f64c3ce3266df2598031818ecc86bc25c8ccede0723973f119
    │               └── anyjson-0.3.3-cp27-none-any.whl

I'd say that mounting conditionally under the binds depending on this setting is the missing piece here and the rest is safe.

This comment has been minimized.

@agjohnson

agjohnson Jan 31, 2018

Contributor

This sounds good. I don't think there is any metadata dependent on the path either, so this is most likely safe I'm sure.

Mount the GLOBAL_PIP_CACHE conditionally
Refactor, by creating a new method, where the binds are created to be
easier to read.
@humitos

This comment has been minimized.

Member

humitos commented Jan 30, 2018

I just refactor the creation of bindings by creating a new method in that class which will mount the global cache conditionally.

@agjohnson

Changes look 👍

@safwanrahman

Looks good.
r+

@safwanrahman

This comment has been minimized.

Member

safwanrahman commented Feb 13, 2018

@ericholscher Can I merge this Pull request?

@safwanrahman safwanrahman merged commit 1376860 into master Feb 15, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@stsewd stsewd deleted the humitos/cache/pip-docker branch Jul 23, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment