ValueError #307

Closed
olegs opened this issue Jan 17, 2022 · 0 comments
Labels
bug Something isn't working

olegs commented Jan 17, 2022

To reproduce, run a paper search for the title "Semi-supervised peak calling with SPAN and JBR Genome Browser". The task fails with a ValueError while vectorizing the corpus:

[2022-01-17 20:43:56,132: INFO/ForkPoolWorker-2] Searching for a publication with title=Semi-supervised peak calling with SPAN and JBR Genome Browser
[2022-01-17 20:43:56,497: INFO/ForkPoolWorker-2] Analyzing 1 paper(s) from Pubmed
[2022-01-17 20:43:56,498: INFO/ForkPoolWorker-2] Expanding related papers by references
[2022-01-17 20:43:56,512: INFO/ForkPoolWorker-2] Loading publication data
[2022-01-17 20:43:56,526: INFO/ForkPoolWorker-2] Found 1 papers in database
[2022-01-17 20:43:56,527: INFO/ForkPoolWorker-2] Analyzing title and abstract texts
[2022-01-17 20:43:56,527: INFO/ForkPoolWorker-2] Building corpus from 1 papers
[2022-01-17 20:43:56,527: INFO/ForkPoolWorker-2] Processing stemming for all papers
[2022-01-17 20:43:56,556: INFO/ForkPoolWorker-2] Creating global shortest stem to word map
[2022-01-17 20:43:56,556: INFO/ForkPoolWorker-2] Creating stemmed corpus
[2022-01-17 20:43:56,566: ERROR/ForkPoolWorker-2] Task analyze_search_paper[08f50970-dd88-4da1-a8b2-86aefa187cad] raised unexpected: ValueError('After pruning, no terms remain. Try a lower min_df or a higher max_df.')
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/pubtrends/lib/python3.8/site-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/user/miniconda3/envs/pubtrends/lib/python3.8/site-packages/celery/app/trace.py", line 650, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/user/pysrc/celery/tasks_main.py", line 149, in analyze_search_paper
    return _analyze_id_list(
  File "/home/user/pysrc/celery/tasks_main.py", line 123, in _analyze_id_list
    analyzer.analyze_papers(ids, query, topics, test=test, task=task)
  File "/home/user/pysrc/papers/analyzer.py", line 150, in analyze_papers
    self.corpus, self.corpus_tokens, self.corpus_counts = vectorize_corpus(
  File "/home/user/pysrc/papers/analysis/text.py", line 48, in vectorize_corpus
    counts = vectorizer.fit_transform([list(chain(*sentences)) for sentences in papers_sentences_corpus])
  File "/home/user/miniconda3/envs/pubtrends/lib/python3.8/site-packages/sklearn/feature_extraction/text.py", line 1221, in fit_transform
    X, self.stop_words_ = self._limit_features(X, vocabulary,
  File "/home/user/miniconda3/envs/pubtrends/lib/python3.8/site-packages/sklearn/feature_extraction/text.py", line 1092, in _limit_features
    raise ValueError("After pruning, no terms remain. Try a lower"
ValueError: After pruning, no terms remain. Try a lower min_df or a higher max_df
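
The log shows the corpus was built from a single paper, which is probably what triggers the pruning error. scikit-learn treats a float `min_df`/`max_df` as a fraction of the document count and an integer as an absolute count, then drops every term whose document frequency falls outside that range. Below is a simplified pure-Python illustration of that rule, not the scikit-learn code itself. The helper name `surviving_terms`, the sample tokens, and the threshold values `0.001` and `0.8` are assumptions for illustration; the values actually used in `vectorize_corpus` are not shown in this issue.

```python
from collections import Counter
from numbers import Integral


def surviving_terms(docs, min_df, max_df):
    """Hypothetical helper: return the terms that survive df-based pruning.

    Floats are interpreted as a proportion of documents, integers as
    absolute document counts, mirroring how CountVectorizer reads
    min_df and max_df.
    """
    n_docs = len(docs)
    low = min_df if isinstance(min_df, Integral) else min_df * n_docs
    high = max_df if isinstance(max_df, Integral) else max_df * n_docs
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    return sorted(term for term, count in df.items() if low <= count <= high)


# One paper only, as in the log above (tokens are made up).
single_paper = [["peak", "call", "span", "peak"]]

# Assumed thresholds: with one document, max_df=0.8 means "at most 0.8
# documents", but every term occurs in exactly 1 document.
print(surviving_terms(single_paper, min_df=0.001, max_df=0.8))  # []

# Allowing terms that occur in all documents keeps the vocabulary.
print(surviving_terms(single_paper, min_df=0.001, max_df=1.0))  # ['call', 'peak', 'span']
```

If the real thresholds behave like the assumed ones, a possible fix is to relax `max_df` (or skip pruning entirely) when the corpus contains only one or a few papers.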