Copy ~/.cache/x into Docker #320

@yeldarby

Description

Every time I run cog predict, the huggingface package downloads the 500MB GPT2 pretrained weights into ~/.cache/huggingface, which is quite slow and bandwidth-intensive.

I can't figure out how to bundle these into the Docker image.

So far I've tried:

  • Using run in cog.yaml (doesn't work because the run commands don't have access to the files yet)
  • Copying ~/.cache/huggingface into my working dir and having my predict.py copy the files into place at the top. This doesn't work for unknown reasons; possibly everything outside of /src is read-only. The command I used: os.system("cp -r /src/cache/huggingface /root/.cache/huggingface")
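
One way to find out why the copy in the second attempt fails is to do it with shutil.copytree rather than os.system. os.system only returns an exit status, while copytree raises an exception that names the failing path. A minimal sketch, not a confirmed fix: the helper name copy_cache is made up for illustration, and the paths in the usage comment are the ones from the attempt above.

```python
import shutil
from pathlib import Path


def copy_cache(src: str, dst: str) -> bool:
    """Copy a cache directory tree and report why it failed, if it did.

    Hypothetical helper for illustration; not part of cog or transformers.
    """
    try:
        # Create the parent of dst if needed (e.g. /root/.cache)
        Path(dst).parent.mkdir(parents=True, exist_ok=True)
        # dirs_exist_ok (Python 3.8+) lets the copy merge into an existing directory
        shutil.copytree(src, dst, dirs_exist_ok=True)
        return True
    except OSError as exc:
        # A read-only filesystem or a missing source directory would surface here
        print(f"cache copy failed: {exc}")
        return False


# Usage at the top of predict.py, with the paths from the attempt above:
#   copy_cache("/src/cache/huggingface", "/root/.cache/huggingface")
```

If the guess about a read-only filesystem is right, the printed message should say so explicitly.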

Update
Fixed this for my use-case by overriding the transformers cache location with an environment variable:

import os
os.environ['TRANSFORMERS_CACHE'] = '/src/cache'

Since /src is mounted from my local filesystem, it looks like the weights were saved there the first time I ran it and loaded from there the next time.
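
As far as I can tell, the override has to run before transformers is imported, because the library reads its cache location when it is first loaded. A minimal sketch of how the top of predict.py could look: use_local_cache is a hypothetical helper name, and /src/cache is the path from the snippet above.

```python
import os
from pathlib import Path


def use_local_cache(cache_dir: str) -> str:
    """Point the transformers cache at cache_dir and make sure it exists.

    Hypothetical helper; it only sets the TRANSFORMERS_CACHE variable
    used in the update above.
    """
    path = Path(cache_dir)
    # Create the directory so the first download has somewhere to land
    path.mkdir(parents=True, exist_ok=True)
    os.environ["TRANSFORMERS_CACHE"] = str(path)
    return str(path)


# Usage at the top of predict.py; call it before importing transformers:
#   use_local_cache("/src/cache")
#   from transformers import GPT2LMHeadModel
```

The first run still downloads the weights, but because /src is mounted from the host, later runs should find them on disk.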
