max_length parameter error with the latest version #202

Closed
shinshinsakasaka opened this issue Jul 14, 2022 · 1 comment

Comments

@shinshinsakasaka

Thank you for developing a great tool.

I'm getting an error related to the max_length parameter. I installed pke with: pip install git+https://github.com/boudinfl/pke.git

  • Python and spaCy versions

Python 3.9.12


✔ Loaded compatibility table

================= Installed pipeline packages (spaCy v3.4.0) =================
ℹ spaCy installation: C:\Users\shins\anaconda_new\lib\site-packages\spacy

NAME             SPACY            VERSION
en_core_web_sm   >=3.4.0,<3.5.0   3.4.0     ✔

  • The load_document calls I tried and the errors I got are listed below.

extractor.load_document(input=text, language='en', normalization=None)

ValueError: [E088] Text of length 1210306 exceeds maximum of 1000000. The parser and NER models 
 require roughly 1GB of temporary memory per 100,000 characters in the input. This means long texts 
 may cause memory allocation errors. If you're not using the parser or NER, it's probably safe to 
 increase the `nlp.max_length` limit. The limit is in number of characters, so you can check whether 
 your inputs are too long by checking `len(text)`

extractor.load_document(input=text, language='en', max_length=1210310, normalization=None)

TypeError: load_document() got an unexpected keyword argument 'max_length'

How can I fix this problem? I appreciate your help.

@ygorg
Collaborator

ygorg commented Sep 30, 2022

Hi, thanks for using pke.
You can process your document with spaCy yourself, raise nlp.max_length, and then pass the resulting spaCy Doc to pke, like so:

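import pke
import spacy

document = 'My text'
nlp = spacy.load('en_core_web_sm')
# Raise spaCy's character limit before processing the long text.
nlp.max_length = 1000000000000000000000000
preproc_doc = nlp(document)

extractor = pke.unsupervised.MultipartiteRank()
# load_document also accepts an already-processed spaCy Doc.
extractor.load_document(preproc_doc)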
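
Rather than hard-coding a very large limit, the limit can be sized to the input, and the rule of thumb from the E088 message (roughly 1GB of temporary memory per 100,000 characters for the parser and NER) gives a quick estimate of what lifting it may cost. This is a minimal sketch in plain Python; the helper names `estimate_parser_memory_gb` and `max_length_for` are hypothetical and are not part of pke or spaCy:

```python
# Rule of thumb quoted in spaCy's E088 message: roughly 1 GB of temporary
# memory per 100,000 characters for the parser and NER models.
GB_PER_100K_CHARS = 1.0
SPACY_DEFAULT_MAX_LENGTH = 1_000_000  # limit reported in the error above


def estimate_parser_memory_gb(n_chars: int) -> float:
    """Rough temporary-memory estimate (in GB) for n_chars characters."""
    return n_chars / 100_000 * GB_PER_100K_CHARS


def max_length_for(text: str, default: int = SPACY_DEFAULT_MAX_LENGTH) -> int:
    """Return a limit large enough to admit `text`, never below the default."""
    return max(default, len(text) + 1)


if __name__ == "__main__":
    n_chars = 1_210_306  # text length reported in the issue
    print(f"{estimate_parser_memory_gb(n_chars):.1f} GB")  # prints "12.1 GB"
```

With a loaded pipeline, this would be used as nlp.max_length = max_length_for(document) before calling nlp(document). The estimate is only as good as the rule of thumb in the error message.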

If you run into this error, pke may not be the right tool for your use case (cf. pke#131).
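
One possible alternative for very long inputs, not proposed in this thread and offered only as an assumption about what might help, is to split the text into pieces that each stay under the character limit and process them separately. Keyphrases ranked per chunk are not equivalent to a ranking over the whole document, so this changes the results. The helper `split_text` below is hypothetical and uses plain Python only:

```python
from typing import List


def split_text(text: str, max_chars: int = 1_000_000) -> List[str]:
    """Split text into chunks of at most max_chars characters.

    Prefers to cut at the last paragraph break, then at the last space,
    inside each window; falls back to a hard cut when neither exists.
    Joining the chunks gives back the original text unchanged.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    start = 0
    while len(text) - start > max_chars:
        window_end = start + max_chars
        # Look for a natural boundary inside the current window.
        cut = text.rfind("\n\n", start, window_end)
        if cut <= start:
            cut = text.rfind(" ", start, window_end)
        if cut <= start:
            cut = window_end  # no usable boundary: hard cut
        chunks.append(text[start:cut])
        start = cut
    chunks.append(text[start:])
    return chunks
```

Each chunk could then be run through nlp and passed to load_document on its own, with the per-chunk keyphrases merged afterwards in whatever way suits the task.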

ygorg closed this as completed Sep 30, 2022