TypeError: pipe() got an unexpected keyword argument 'n_threads' #1

Open
aecio opened this issue Feb 25, 2021 · 4 comments


aecio commented Feb 25, 2021

A TypeError is raised from internal code that calls `nlp.pipe` from the spaCy library. The error occurs on the line `for idx, doc in enumerate(nlp.pipe(texts, n_threads=16, batch_size=100)):`, and removing `n_threads=16` seems to make it work with the spaCy version I'm using.

VisualTextAnalyzer.plot_text_summary(yelp_data, category_column='category', text_column='comments')
Word Frequency:
Analyzing 69 documents (positive category)
Analyzing 65 documents (negative category)
Named Entity Recognition:
Analyzing 69 documents (positive category)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-2065f93d9da3> in <module>
----> 1 VisualTextAnalyzer.plot_text_summary(yelp_data, category_column='category', text_column='comments')

~/workspace/nyu/d3m/piracy-demo/TextExplorer/VisualTextAnalyzer/_data_preprocessing.py in plot_text_summary(data, category_column, text_column, positive_label, negative_label, words_entities)
    343     processed_data = {}
    344     if words_entities is None:
--> 345         processed_data = get_words_entities(data,category_column, text_column, positive_label, negative_label)
    346         global_processed_data = processed_data
    347     else:

~/workspace/nyu/d3m/piracy-demo/TextExplorer/VisualTextAnalyzer/_data_preprocessing.py in get_words_entities(data, category_column, text_column, positive_label, negative_label)
    261     processed_data["words"] =  get_words (positive_texts, negative_texts, labels)
    262     print('Named Entity Recognition:')
--> 263     processed_data["entities"] = get_entities (positive_texts, negative_texts, labels)
    264     raw_text = {}
    265     raw_text['positive_texts'] = positive_texts

~/workspace/nyu/d3m/piracy-demo/TextExplorer/VisualTextAnalyzer/_data_preprocessing.py in get_entities(positive_texts, negative_texts, labels)
    219 
    220 def get_entities (positive_texts, negative_texts, labels):
--> 221     positive_entities = get_entities_frequency(positive_texts, labels['pos'])
    222     negative_entities = get_entities_frequency(negative_texts, labels['neg'])
    223 

~/workspace/nyu/d3m/piracy-demo/TextExplorer/VisualTextAnalyzer/_data_preprocessing.py in get_entities_frequency(texts, label)
    191     alias = {'ORG':'ORGANIZATION', 'LOC':'PLACE', 'GPE':'CITY/COUNTRY', 'NORP':'GROUP', 'FAC':'BUILDING'}
    192     unique_entities = {}
--> 193     for idx, doc in enumerate(nlp.pipe(texts, n_threads=16, batch_size=100)):
    194         for entity in doc.ents:
    195             if entity.label_ in {'CARDINAL', 'ORDINAL', 'QUANTITY'}:

TypeError: pipe() got an unexpected keyword argument 'n_threads'

Spacy version:

$ pip show spacy
Name: spacy
Version: 3.0.3
Summary: Industrial-strength Natural Language Processing (NLP) in Python
Home-page: https://spacy.io
Author: Explosion
Author-email: contact@explosion.ai
License: MIT
Location: ~/miniconda2/envs/myenv/lib/python3.6/site-packages
Requires: preshed, tqdm, typer, pathy, srsly, requests, importlib-metadata, murmurhash, cymem, thinc, setuptools, pydantic, jinja2, packaging, spacy-legacy, typing-extensions, wasabi, numpy, catalogue, blis
Required-by: en-core-web-sm, text-labeling, visual-text-explorer
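
For reference, a minimal sketch of that workaround (hypothetical: en_core_web_sm and the sample texts stand in for whatever the analyzer actually loads), mirroring the call in get_entities_frequency with n_threads dropped:

import spacy

# Stand-in pipeline and data; the analyzer may load a different model
nlp = spacy.load("en_core_web_sm")
texts = ["Great food and friendly staff.", "The wait at Joe's Diner was far too long."]

# Under spaCy 3.x, simply omit n_threads; batch_size is still accepted
for idx, doc in enumerate(nlp.pipe(texts, batch_size=100)):
    for entity in doc.ents:
        if entity.label_ in {'CARDINAL', 'ORDINAL', 'QUANTITY'}:
            continue
        print(idx, entity.text, entity.label_)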

ginward commented Jun 12, 2021

I think n_threads has been replaced by n_process, as described here:

https://spacy.io/usage/processing-pipelines
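
A minimal sketch of the renamed argument under spaCy 3.x (the model name and parameter values here are only illustrative):

import spacy

nlp = spacy.load("en_core_web_sm")
texts = ["The pasta was excellent.", "Parking downtown was a nightmare."]

# n_threads (spaCy 2.x) is gone; n_process controls multiprocessing in 3.x
for idx, doc in enumerate(nlp.pipe(texts, n_process=2, batch_size=100)):
    print(idx, [(ent.text, ent.label_) for ent in doc.ents])

n_process=1 is the default (everything stays in one process); values above 1 spawn worker processes, which is the closest replacement for the old n_threads.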

@yang-peilin

> I think n_threads has been replaced by n_process, as described here:
>
> https://spacy.io/usage/processing-pipelines

COOL!!! It works.


tinltan commented Oct 15, 2021

I encountered the same error when using lda2vec-tensorflow (by nateraw), in this part of the code:

# Initialize a preprocessor
P = Preprocessor(df, "texts", max_features=30000, maxlen=10000, min_count=30)

# Run the preprocessing on your dataframe
P.preprocess()

Here is the specific error:

~.conda\envs\env4\lib\site-packages\lda2vec\nlppipe.py in tokenize_and_process(self)
     61     print("\n---------- Tokenizing Texts ----------")
     62     # Iterate over all uncleaned documents
---> 63     for i, doc in tqdm(enumerate(self.nlp.pipe(texts, n_threads=4))):
     64         # Variable for holding cleaned tokens (to be joined later)
     65         doc_texts = []

TypeError: pipe() got an unexpected keyword argument 'n_threads'

How can I fix this when the offending argument is inside the package's own code?

I tried downgrading spaCy in my conda environment to spacy==2.3.7, but I ran into a lot of conflicts with other packages. How can I change the n_threads argument in lda2vec's nlppipe?

Thank you in advance! :)
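
One possible workaround, sketched under the assumption that the patch below runs before P.preprocess() reaches the failing nlp.pipe call (the import path follows the traceback above), is to wrap spacy's Language.pipe so the removed n_threads keyword is silently dropped:

from spacy.language import Language

# Keep a reference to the real spaCy 3.x pipe method
_original_pipe = Language.pipe

def _pipe_without_n_threads(self, texts, *args, n_threads=None, **kwargs):
    # Swallow the removed n_threads keyword and delegate everything else
    return _original_pipe(self, texts, *args, **kwargs)

Language.pipe = _pipe_without_n_threads

# lda2vec's internal self.nlp.pipe(texts, n_threads=4) call now succeeds
from lda2vec.nlppipe import Preprocessor

Editing the installed nlppipe.py directly and deleting n_threads=4 (or renaming it to n_process=4) accomplishes the same thing without a patch.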

yang-peilin commented Nov 15, 2021 via email
