Every time I run cog predict, the huggingface package downloads the 500MB GPT2 pretrained weights into ~/.cache/huggingface, which is quite slow and bandwidth intensive.
I can't figure out how to bundle these into the Docker image.
So far I've tried:
- Using run in cog.yaml (doesn't work because it doesn't have access to the file yet)
- Copying ~/.cache/huggingface into my working dir and having my predict.py copy the files at the top (which doesn't work for unknown reasons; possibly everything outside of /src is read-only):
  os.system("cp -r /src/cache/huggingface /root/.cache/huggingface")
Update
Fixed this for my use-case by overriding the transformers cache location with an environment variable:
import os
os.environ['TRANSFORMERS_CACHE'] = '/src/cache'
Since /src is mounted from my local filesystem, it looks like the weights were saved there the first time I ran it and loaded from there the next time.
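For anyone copying this, here is a slightly fuller sketch of the same workaround. It rests on a few assumptions that aren't in my original snippet: the helper names configure_cache and load_gpt2 are made up for illustration, "gpt2" is assumed to be the checkpoint name, and the environment variable has to be set before transformers is imported for the first time. Passing cache_dir to from_pretrained is an extra safeguard on top of the environment variable, not something I've verified is required. Newer transformers releases may prefer a different variable (such as HF_HOME), so check the docs for your version.

```python
import os
from pathlib import Path


def configure_cache(cache_dir: str) -> Path:
    """Point the transformers cache at cache_dir and make sure it exists."""
    path = Path(cache_dir)
    path.mkdir(parents=True, exist_ok=True)
    # Set this before `transformers` is imported for the first time.
    os.environ["TRANSFORMERS_CACHE"] = str(path)
    return path


def load_gpt2(cache_dir: str):
    """Hypothetical helper: load GPT-2 with weights cached under cache_dir."""
    path = configure_cache(cache_dir)
    # Imported lazily so the environment variable is already in place.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    # "gpt2" is an assumed checkpoint name; cache_dir is also passed
    # explicitly in case the environment variable is not picked up.
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2", cache_dir=str(path))
    model = GPT2LMHeadModel.from_pretrained("gpt2", cache_dir=str(path))
    return tokenizer, model
```

In predict.py this would be called as load_gpt2("/src/cache"), so the first run downloads into the mounted /src directory and later runs read from it.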