Text Snowball
petermr edited this page Aug 18, 2021
·
7 revisions
Goal. To create a list of terms (words and phrases) which are effective at searching the literature for a desired topic.
Why not arXiv?
(We
- query pygetpapers with a relevant (maybe fairly general) Boolean query (AND, OR, NOT) built from the TERMS, e.g. (cyclic voltammetry) NOT generator
- download a small corpus (<= 100 papers)
- rapidly inspect these for frequent relevant terms
  - RAKE / YAKE (adjust phrase length)
  - human eyeballs
  - spaCy
- triage the list
- create/append to list of terms
- repeat until we get enough, or give up
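One pass of the loop above can be sketched in plain Python. This is a minimal sketch, not the project's code: `build_query`, `frequent_phrases` and `snowball_step` are hypothetical helper names, the stopword list is a tiny placeholder, and the phrase counter is a crude stand-in for RAKE / YAKE (it only counts runs of non-stopwords). It assumes the downloaded papers are already available as plain-text strings, and that the search service accepts quoted phrases joined with OR and NOT.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); a real run needs a fuller one.
STOPWORDS = {"a", "an", "and", "at", "by", "for", "in", "is", "of", "on",
             "the", "to", "was", "were", "with"}


def build_query(terms, exclude=()):
    """OR the terms together as quoted phrases, then NOT each excluded term."""
    query = "(" + " OR ".join(f'"{t}"' for t in terms) + ")"
    for term in exclude:
        query += f' NOT "{term}"'
    return query


def candidate_phrases(text, max_len=3):
    """Yield phrases of 1..max_len words that contain no stopword or punctuation."""
    for chunk in re.split(r"[^\w\s-]+", text.lower()):
        run = []
        for word in chunk.split() + [None]:  # None flushes the last run
            if word is None or word in STOPWORDS:
                for n in range(1, max_len + 1):
                    for i in range(len(run) - n + 1):
                        yield " ".join(run[i:i + n])
                run = []
            else:
                run.append(word)


def frequent_phrases(texts, max_len=3, min_count=2):
    """Return (phrase, count) pairs seen at least min_count times, most frequent first."""
    counts = Counter()
    for text in texts:
        counts.update(candidate_phrases(text, max_len))
    return [(p, c) for p, c in counts.most_common() if c >= min_count]


def snowball_step(terms, accepted):
    """Append the phrases a human accepted during triage, skipping duplicates."""
    result = list(terms)
    for phrase in accepted:
        if phrase not in result:
            result.append(phrase)
    return result
```

A typical round would call `build_query(terms, exclude=["generator"])`, pass the result to pygetpapers, run `frequent_phrases` over the downloaded texts with `max_len` adjusted to the desired phrase length, triage the output by eye, and feed the accepted phrases to `snowball_step`.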
Now we have a list of terms:
- create a dictionary from the terms
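As an illustration of the final step, the triaged terms could be written out as a small XML dictionary. This is a sketch under stated assumptions: the `dictionary` / `entry` layout with `term` and `name` attributes is assumed here for illustration, and the format actually expected by the downstream tools may differ. `terms_to_dictionary` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def terms_to_dictionary(title, terms):
    """Serialise terms as <dictionary title=...><entry term=... name=.../></dictionary>.

    The element and attribute names are an assumed layout, not a confirmed schema.
    """
    root = ET.Element("dictionary", {"title": title})
    # Normalise case and whitespace, drop blanks and duplicates, sort for stable output.
    cleaned = sorted({t.strip().lower() for t in terms if t.strip()})
    for term in cleaned:
        ET.SubElement(root, "entry", {"term": term, "name": term})
    return ET.tostring(root, encoding="unicode")
```

Normalising and de-duplicating before writing keeps repeated snowball rounds from producing the same entry twice.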
Set up on Google Colab
Choose voltammetry as the topic
Use EPMC (Europe PMC) as the source