Model download returns 0 on failure #1714

Closed
mvollrath opened this issue Dec 11, 2017 · 2 comments
Labels
enhancement Feature requests and improvements

Comments

@mvollrath

While building a Docker image containing spaCy, there was an issue downloading the basic model with python -m spacy download en:

ReadTimeoutError: HTTPSConnectionPool(host='github-production-release-asset-2e65be.s3.amazonaws.com', port=443): Read timed out.

The download returned 0, so the build continued. Next, while building a Rasa NLU model, it couldn't find the model (as you might expect):

IOError: Can't find model 'en'

So I thought, if the download doesn't report failure, maybe I should use the validate command to make sure there's a model present before continuing the build? But validate also returns 0 when no models are present:

    Installed models (spaCy v2.0.5)
    /usr/local/lib/python2.7/dist-packages/spacy


    No models found in your current environment.

I would expect the download tool to return non-zero when it fails to finish downloading a model.
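Until the exit code is fixed, one possible workaround is a separate build step that checks for the model package and fails the build itself. The following is a minimal sketch, not part of spaCy: the helper names `model_installed` and `exit_code_for` are hypothetical, it assumes Python 3 (`importlib.util` is not available on the Python 2.7 used above), and it assumes the `en` shortcut resolves to the `en_core_web_sm` package.

```python
import importlib.util


def model_installed(package_name):
    """Return True if a top-level package with this name is on the import path."""
    return importlib.util.find_spec(package_name) is not None


def exit_code_for(package_name):
    """Map package presence to a shell-style exit code: 0 if found, 1 otherwise."""
    return 0 if model_installed(package_name) else 1
```

A build script could call `sys.exit(exit_code_for("en_core_web_sm"))` right after the download step, so that a missing model stops the build at that point instead of later.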

Your Environment

  • Python version: 2.7.12
  • Platform: Linux-4.4.0-98-generic-x86_64-with-Ubuntu-16.04-xenial
  • spaCy version: 2.0.5
  • Environment Information: building an image with Docker 17.09.0
@ines ines added the enhancement Feature requests and improvements label Dec 12, 2017
@ines
Member

ines commented Dec 12, 2017

Thanks – good point, I didn't even realise the download command currently behaves like this – this should definitely be fixed. The validate command was originally intended as more of a user-facing utility that prints nicely formatted and helpful info about the models, so we didn't really consider the automated usage and exit codes here... but we might as well do it properly, so thanks for the suggestion!

Btw, if you're downloading models as part of an automated process, you can also just run pip install directly, and use the URL of the model archive (see the model releases). This lets you download the exact model and model version you need, and saves you the extra roundtrip to the spaCy compatibility table. Instead of calling spacy.load(), you can also import the model as a module:

    import en_core_web_sm
    nlp = en_core_web_sm.load()

In some cases, it might be a little nicer to get a more "native" ImportError if the model isn't installed, rather than a spaCy error somewhere down the line. (But this also depends on your personal preference. You can find more details on this in the "Using models in production" section in the docs.)
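The import-based approach can be sketched as a small loader that turns the `ImportError` into a non-zero exit. This is an illustration only: `load_model_or_exit` is a hypothetical helper, and it assumes the model package exposes a `load()` function, as `en_core_web_sm` does in the snippet above.

```python
import importlib
import sys


def load_model_or_exit(module_name):
    """Import a model package by name and return the result of its load().

    If the package is not installed, exit with a message; sys.exit() with a
    string prints it to stderr and produces exit status 1.
    """
    try:
        model_module = importlib.import_module(module_name)
    except ImportError as err:
        sys.exit("Model package {!r} is not installed: {}".format(module_name, err))
    return model_module.load()
```

Calling `nlp = load_model_or_exit("en_core_web_sm")` at the start of an automated job would make a missing model fail immediately, rather than surfacing as an error further down the pipeline.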

@lock

lock bot commented May 8, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators May 8, 2018