Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow string argument for disable/enable/exclude #11406

Merged
merged 9 commits into from
Aug 31, 2022

Conversation

svlandeg
Copy link
Member

Description

When loading an nlp pipeline, we assumed and documented that disable, enable and exclude should be lists, but this wasn't enforced or checked, resulting in silent odd behavior when giving these arguments a single string:

nlp = spacy.load("en_core_web_md", disable="tok2vec")
print(nlp.pipe_names)
#  ['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']
print(nlp._disabled)
# {'t', 'e', 'senter', 'c', 'v', 'o', 'k', '2'}

exclude="tok2vec" did in fact work, by accident, because "tok2vec" in "tok2vec" equals True because of substring comparisons.

In this PR, we broaden the typing of the loading functions to include str as an option, and turn the string into a list before anything else. This makes the functionality consistent with the current implementation of nlp.select_pipes, which also caters for both a single string and a list. Note that the implementation of _resolve_component_status was allowing for disable to be str but the typing didn't reflect that, so I updated that as well.

As part of the docs update, I also added the 3.4 tags for the new enable argument from #10784

Types of change

bug/consistency fix

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@svlandeg svlandeg added feat / serialize Feature: Serialization, saving and loading feat / pipeline Feature: Processing pipeline and components labels Aug 30, 2022
spacy/language.py Outdated Show resolved Hide resolved
spacy/language.py Outdated Show resolved Hide resolved
spacy/language.py Outdated Show resolved Hide resolved
svlandeg and others added 3 commits August 30, 2022 16:28
@adrianeboyd adrianeboyd merged commit 8fc0efc into explosion:master Aug 31, 2022
@svlandeg svlandeg deleted the fix/disable_str branch August 31, 2022 09:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat / pipeline Feature: Processing pipeline and components feat / serialize Feature: Serialization, saving and loading
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants