Mindmeld Docker Containers Currently Unrunnable #378

Closed
Zozman opened this issue Oct 18, 2021 · 1 comment

Zozman commented Oct 18, 2021

Currently, the docker containers provided in the docker_containers directory of this repo are unrunnable. When attempting to run them, the following errors occur:

warning: en_core_web_sm not found on disk. Downloading the model.
/usr/bin/python: No module named spacy

To work around this, I tried installing both spaCy and the en_core_web_sm pipeline by adding the following to the Dockerfile:

# Install spaCy and the proper pipeline used by mindmeld for this language
RUN pip3 install -U spacy
RUN export LC_ALL=C.UTF-8 && \
    export LANG=C.UTF-8 && \
    python3 -m spacy download en_core_web_sm

This solves the original issue, but the following error then appears when running the container:

KeyError: "[E002] Can't find factory for 'tok2vec'. This usually happens when spaCy calls `nlp.create_pipe` with a component name that's not built in - for example, when constructing the pipeline from a model's meta.json. If you're using a custom component, you can write to `Language.factories['tok2vec']` or remove it from the model meta and add it via `nlp.add_pipe` instead."
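This error is consistent with a pipeline package built for one spaCy major version being loaded by another, since spaCy pipeline packages are versioned so that their first two version components follow the spaCy release they target. A minimal sketch for checking this inside the container is below. The helper names are hypothetical, and the distribution name `en_core_web_sm` is an assumption about how the pipeline was installed; the issue does not state which versions actually ended up in the image.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version: str) -> tuple:
    """Return the first two dot-separated components of a version string."""
    return tuple(version.split(".")[:2])


def model_matches_spacy(spacy_version: str, model_version: str) -> bool:
    """True when the pipeline package targets the same spaCy major.minor release."""
    return major_minor(spacy_version) == major_minor(model_version)


if __name__ == "__main__":
    # Assumed distribution names; adjust if the image installs them differently.
    spacy_v = installed_version("spacy")
    model_v = installed_version("en_core_web_sm")
    print("spacy:", spacy_v, "en_core_web_sm:", model_v)
    if spacy_v and model_v:
        print("compatible:", model_matches_spacy(spacy_v, model_v))
```

Running this as the last step of the build would show whether the pipeline downloaded earlier still matches the spaCy version that remains after mindmeld is installed.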

From this it seems that mindmeld downgrades spaCy when it is installed, so I think the required version below would need to change:

"spacy~=2.3,!=2.3.6", # avoid 2.3.6 because it was yanked from PyPI

Then maybe all you would need is to install the correct pipeline.
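To see why a freshly installed spaCy would be downgraded, it helps to spell out what the requirement `~=2.3,!=2.3.6` accepts: any 2.x release from 2.3 onward, except 2.3.6. The sketch below is a simplified, standard-library-only check. The function names are hypothetical, it only handles plain numeric versions, and the version strings in the usage lines are illustrative rather than taken from the issue. For real use, `SpecifierSet` from the `packaging` library is the more robust choice.

```python
def parse_version(version: str) -> tuple:
    """Parse a plain numeric version such as '2.3.7' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies_mindmeld_pin(version: str) -> bool:
    """Approximate check for the requirement 'spacy~=2.3,!=2.3.6'."""
    parts = parse_version(version)
    # '~=2.3' is equivalent to '>=2.3, ==2.*'
    in_2x_from_2_3 = parts[0] == 2 and parts >= (2, 3)
    # '!=2.3.6' excludes the release that was yanked from PyPI
    return in_2x_from_2_3 and parts != (2, 3, 6)


if __name__ == "__main__":
    # Illustrative versions only.
    for candidate in ("2.3.7", "2.3.6", "3.1.3"):
        print(candidate, satisfies_mindmeld_pin(candidate))
```

Any 3.x release fails this check, so pip would replace it with a 2.3.x release when resolving mindmeld's dependencies, while a pipeline package downloaded beforehand would stay in place.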

This is officially over my head, so I am filing this issue.

Steps to reproduce:

  1. Go to docker_containers/mindmeld_docker directory of this repo.
  2. Run docker-compose build --no-cache.

Zozman commented Oct 19, 2021

I provided a possible fix in #380, but I am not sure whether that is the better approach or whether it would be better to change the version of spaCy that mindmeld installs.
