## Co-Occurrence of Terms Analysis

Co-occurrence of terms analysis: check how often pre-selected cognitive terms appear in abstracts alongside ERP terms.

This analysis searches PubMed for papers that contain the specified ERP and cognitive (COG) terms. The data extracted is the number of papers that contain both terms, which is used to infer which cognitive terms each ERP is associated with.
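
As a rough illustration of how joint paper counts can be turned into associations, the sketch below normalises each co-occurrence count by the number of papers found for the ERP alone, then picks the strongest pairing. This normalisation is an assumption, not necessarily what `erpsc` does internally, and every name and number in it (`erp_counts`, `cooc`, `ERP_A`, `term_x`, and so on) is a made-up placeholder rather than PubMed data.

```python
# Toy, made-up numbers for illustration only; these are NOT PubMed results.
erps = ["ERP_A", "ERP_B"]
terms = ["term_x", "term_y"]

# Hypothetical number of papers found for each ERP on its own.
erp_counts = {"ERP_A": 200, "ERP_B": 50}

# Hypothetical number of papers that mention both the ERP and the term.
cooc = {
    ("ERP_A", "term_x"): 80,
    ("ERP_A", "term_y"): 40,
    ("ERP_B", "term_x"): 30,
    ("ERP_B", "term_y"): 5,
}


def cooc_percent(erp, term):
    """Percentage of the ERP's papers that also mention the term (assumed normalisation)."""
    return 100.0 * cooc[(erp, term)] / erp_counts[erp]


def top_term_for_erp(erp):
    """Term with the highest co-occurrence percentage for the given ERP."""
    return max(terms, key=lambda term: cooc_percent(erp, term))


def top_erp_for_term(term):
    """ERP with the highest co-occurrence percentage for the given term."""
    return max(erps, key=lambda erp: cooc_percent(erp, term))


for erp in erps:
    best = top_term_for_erp(erp)
    print(f"{erp}: strongest term is {best} ({cooc_percent(erp, best):.2f}%)")
```

Dividing by the ERP's own paper count keeps heavily studied terms from dominating purely because of their volume; other normalisations (for example, dividing by the term's paper count) would give different rankings.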

NOTE:
- The COG terms used here are a somewhat arbitrary selection; a better, more principled set of terms is needed.

In [1]:
# Import custom code
from erpsc.count import Count
from erpsc.core.io import save_pickle_obj, load_pickle_obj

ImportError: No module named erpsc.count

In [2]:
# Initialize object for term co-occurrence counts
counts = Count()

In [3]:
# Load ERPs and terms from file
counts.set_erps_file()
counts.set_terms_file('cognitive')

In [4]:
# OR: Set small set of ERPs and terms for tests

# Small test set of words
erps = ['N400', 'P600']
cog_terms = ['language', 'memory'] 

# Add ERPs and terms
counts.set_erps(erps)
counts.set_terms(cog_terms)

Unloading previous ERP words.
Unloading previous terms words.


In [5]:
# Scrape the co-occurrence of terms data
counts.scrape_data(db='pubmed')
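
For readers curious what a paper-count lookup against PubMed might look like, here is a minimal sketch that only builds request URLs for the NCBI E-utilities ESearch endpoint, using `rettype=count` so that the response would contain just the number of matching records. How `scrape_data` actually forms its queries is not shown in this notebook, so the helper `build_count_url` and the simple quoted `AND` query are assumptions for illustration. No request is sent.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_count_url(erp, term, db="pubmed"):
    """Build an ESearch URL asking for the count of papers matching both terms.

    Hypothetical helper: assumes a plain AND of two quoted terms, which may
    differ from the query erpsc constructs.
    """
    query = f'"{erp}" AND "{term}"'
    params = {"db": db, "term": query, "rettype": "count"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Same small test set as used in this notebook.
erps = ["N400", "P600"]
cog_terms = ["language", "memory"]

# One URL per (ERP, term) pair; fetching and parsing is left out on purpose.
urls = {(erp, term): build_count_url(erp, term) for erp in erps for term in cog_terms}

for pair, url in urls.items():
    print(pair, url)
```

Each ERP and term pair needs its own query, so the number of requests grows with the product of the two list sizes; a real scraper would also need to respect NCBI's rate limits.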

In [7]:
# Check the most commonly associated COG term for each ERP
counts.check_cooc_erps()

For the  N400  the most common association is 	 language   with 	 %45.72
For the  P600  the most common association is 	 language   with 	 %58.95


In [8]:
# Check the most commonly associated ERP for each term
counts.check_cooc_terms()

For  language     the strongest associated ERP is 	 P600  with 	 %58.95
For  memory       the strongest associated ERP is 	 N400  with 	 %21.94


In [9]:
# Check the terms with the most papers
counts.check_top()

The most studied ERP is  N400    with     1914 papers
The most studied term is  memory  with   225924  papers


In [10]:
# Check how many papers were found for each term - ERPs
counts.check_counts('erp')

N400  -     1914
P600  -      553


In [11]:
# Check how many papers were found for each term - COGs
counts.check_counts('term')

language           -     146923
memory             -     225924


In [None]:
# Save pickle file of results
save_pickle_obj(counts, 'test2')

In [None]:
# Load from pickle file
counts = load_pickle_obj('test2_counts')