Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mount pip_cache_path in Docker container #3556

Merged
merged 3 commits into from Feb 15, 2018
Merged

Conversation

@humitos
Copy link
Member

@humitos humitos commented Jan 26, 2018

This allow us to have permanent pip cache between different builds /
projects to work faster locally and under slow internet connections.

Closes #3553

This allow us to have permanent pip cache between different builds /
projects to work faster locally and under slow internet connections.
@safwanrahman
Copy link
Member

@safwanrahman safwanrahman commented Jan 26, 2018

I actually dont like having pip cache in build.
One may put unvalidate package in the cache and that may invalid everything!
Most of the project do not include hash in their requirements.
So I think it may make build fails more often!

Loading

@humitos
Copy link
Member Author

@humitos humitos commented Jan 26, 2018

@safwanrahman we were discussing this in the chat to "fix the problem" of working under intermitent/slow internet connections. This would be only for development and not a setting for production actually.

I think the examples you propose (the "why is not a good idea") are valid. So, maybe we should limit this line https://github.com/rtfd/readthedocs.org/pull/3299/files#diff-d61f6aff9736141f30b581471d5d55ceR444 only when DEBUG=True. What do you think?

Loading

@safwanrahman
Copy link
Member

@safwanrahman safwanrahman commented Jan 26, 2018

@humitos Yeah. I think putting it in development mode is totally a great idea!

Loading

@@ -449,7 +449,7 @@ def checkout_path(self, version=LATEST):
@property
def pip_cache_path(self):
"""Path to pip cache."""
if getattr(settings, 'GLOBAL_PIP_CACHE', False):
if getattr(settings, 'GLOBAL_PIP_CACHE', False) and settings.DEBUG:
Copy link
Member

@ericholscher ericholscher Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a default GLOBAL_PIP_CACHE in dev.py?

Loading

Copy link
Member Author

@humitos humitos Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. By default this feature is deactivated. It could be something like:

GLOBAL_PIP_CACHE = os.path.join(SITE_ROOT, '.cache', 'pip')

what do you think?

Loading

Copy link
Contributor

@agjohnson agjohnson Jan 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say leave deactivated so our development environments are close to production use.

Loading

@@ -728,6 +728,10 @@ def create_container(self):
'bind': self.project.doc_path,
'mode': 'rw'
},
self.project.pip_cache_path: {
Copy link
Contributor

@agjohnson agjohnson Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly redundant, as the project pip cache path is under doc_path already.

Also, normally to share a pip cache, we'd set CACHE_DIR env variable, or --cache-dir argument on pip. To use underlying storage like this, but then repeated remount this path to different locations through docker might cause problems if pip stores and metadata that includes the full path. I don't know enough about pip internals to answer this.

It seems like the most correct option would be to explicitly set CACHE_DIR on the pip commands if using a global cache, and to mount the global cache conditionally on the binds argument here if GLOBAL_PIP_CACHE is True.

Loading

Copy link
Member Author

@humitos humitos Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly redundant, as the project pip cache path is under doc_path already.

Mmm... Yes. This is redundant when the GLOBAL_PIP_CACHE is not set. So, maybe we should add this path only when it's set (as you said, conditionally).

Also, normally to share a pip cache, we'd set CACHE_DIR env variable, or --cache-dir argument on pip

We are using the --cache-dir in the command at python_environments.py file, and that dir points to the project.pip_cache_path

I prefer to use the argument option instead of the env variable (since it's less explicit and more complicated to debug later)

I don't know enough about pip internals to answer this

As far as I know, pip doesn't save metadata, but creates a hash of the package downloaded and use it as filename --but use the real one for wheels:

├── http
│   ├── 0
│   │   ├── 0
│   │   │   ├── 3
│   │   │   │   └── f
│   │   │   │       └── 8
│   │   │   │           └── 003f82596103cb4b5d71b69aeb8119f406b1bc78110a190848c4fb24
│   │   │   ├── 6
│   │   │   │   └── 6
│   │   │   │       └── b
│   │   │   │           └── 0066b726b642ce2657f32a7c7b9f63ae698a2a16695cda32f18eb806
└── wheels
    ├── 02
    │   └── 2e
    │       └── 0d
    │           └── 30ef9919ce3a07990f191c164d07b083761535fb721a1af3cf
    │               ├── ip_associations_python_novaclient_ext-0.2-cp27-none-any.whl
    │               └── ip_associations_python_novaclient_ext-0.2-cp36-none-any.whl
    ├── 06
    │   └── 3d
    │       └── 74
    │           └── f64c3ce3266df2598031818ecc86bc25c8ccede0723973f119
    │               └── anyjson-0.3.3-cp27-none-any.whl

I'd say that mounting conditionally under the binds depending on this setting is the missing piece here and the rest is safe.

Loading

Copy link
Contributor

@agjohnson agjohnson Jan 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds good. I don't think there is any metadata dependent on the path either, so this is most likely safe I'm sure.

Loading

Refactor, by creating a new method, where the binds are created to be
easier to read.
@humitos
Copy link
Member Author

@humitos humitos commented Jan 30, 2018

I just refactor the creation of bindings by creating a new method in that class which will mount the global cache conditionally.

Loading

Copy link
Contributor

@agjohnson agjohnson left a comment

Changes look 👍

Loading

Copy link
Member

@safwanrahman safwanrahman left a comment

Looks good.
r+

Loading

@safwanrahman
Copy link
Member

@safwanrahman safwanrahman commented Feb 13, 2018

@ericholscher Can I merge this Pull request?

Loading

@safwanrahman safwanrahman merged commit 1376860 into master Feb 15, 2018
1 check passed
Loading
@stsewd stsewd deleted the humitos/cache/pip-docker branch Jul 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants