## 1.1. Search PubMed

Search PubMed for papers that mention MIMIC-II or MIMIC-III.

https://www.ncbi.nlm.nih.gov/pubmed/

https://www.ncbi.nlm.nih.gov/books/NBK25499/

In [15]:
import lcp.reuse as reuse
from Bio import Entrez
from IPython.display import display

### Trying to refine search query

For example, searching the most general term, `mimic-ii OR mimic-iii`, returns this false positive: https://www.ncbi.nlm.nih.gov/pubmed/12403307, "*Synthesis of a new antischistosomally active and toxicologically tolerant C-12 monothione surrogate of the universal antihelmintic praziquantel*".

In [21]:
entrez_email = 'mimic-support@physionet.org'

search_strings = [
    'mimic-ii OR mimic-iii',
    '(mimic-ii OR mimic-iii) AND (database OR clinical OR waveform OR icu OR physionet)',
    '(mimic-ii OR mimic-iii) AND (database OR clinical OR waveform OR icu)',
    '(mimic-ii OR mimic-iii) AND (database OR clinical OR waveform)',
    '(mimic-ii OR mimic-iii) AND (database OR clinical)',
    '(mimic-ii OR mimic-iii) AND (database)',
    '(mimic-ii OR mimic-iii) AND (clinical)',
]

In [22]:
search_results = reuse.search_list(search_strings, entrez_email)
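
`reuse.search_list` comes from the project's own `lcp` package, and its implementation is not shown here. As a rough sketch of what a single PubMed search involves, the snippet below builds an E-utilities `esearch` request with the standard library and reads the hit count and PMIDs from the JSON reply. The helper names `build_esearch_url` and `run_esearch` are hypothetical, and this is not necessarily how `lcp.reuse` works.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi'


def build_esearch_url(term, email, retmax=200):
    """Return the esearch URL for a PubMed query (hypothetical helper)."""
    params = {
        'db': 'pubmed',      # search the PubMed database
        'term': term,        # the query string, e.g. 'mimic-ii OR mimic-iii'
        'retmax': retmax,    # maximum number of PMIDs to return
        'retmode': 'json',   # ask for a JSON reply
        'email': email,      # contact address requested by NCBI
    }
    return ESEARCH_URL + '?' + urlencode(params)


def run_esearch(term, email, retmax=200):
    """Run the search over the network and return (count, list of PMIDs)."""
    with urlopen(build_esearch_url(term, email, retmax)) as response:
        result = json.load(response)['esearchresult']
    return int(result['count']), result['idlist']
```

Calling `run_esearch('mimic-ii OR mimic-iii', entrez_email)` needs network access, and the count it returns will change as PubMed grows.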

In [23]:
for ss in search_strings:
    search_result = search_results[ss]
    print('%s:\n - Count: %s' % (search_result.search_string, search_result.count))
    print('\n')

mimic-ii OR mimic-iii:
 - Count: 123


(mimic-ii OR mimic-iii) AND (database OR clinical OR waveform OR icu OR physionet):
 - Count: 120


(mimic-ii OR mimic-iii) AND (database OR clinical OR waveform OR icu):
 - Count: 119


(mimic-ii OR mimic-iii) AND (database OR clinical OR waveform):
 - Count: 116


(mimic-ii OR mimic-iii) AND (database OR clinical):
 - Count: 116


(mimic-ii OR mimic-iii) AND (database):
 - Count: 101


(mimic-ii OR mimic-iii) AND (clinical):
 - Count: 76




### Inspecting Differences

In [18]:
def showdiff(results_a, results_b):
    """Show the paper titles returned by one query but not the other."""
    print('Query A: %s, %s results' % (results_a.search_string, results_a.count))
    print('Query B: %s, %s results' % (results_b.search_string, results_b.count))
    titles_a = set(results_a.paper_titles)
    titles_b = set(results_b.paper_titles)
    print('(A - B):')
    display(titles_a - titles_b)
    print('(B - A):')
    display(titles_b - titles_a)
    print('\n')
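
The comparison relies on plain set difference. A toy illustration with made-up titles (not real search results): when query B only adds `AND` conditions to query A, every title in B should also be in A, so `B - A` is expected to be empty.

```python
# Hypothetical titles, used only to illustrate the set arithmetic.
titles_a = {'Paper 1', 'Paper 2', 'Paper 3'}  # broader query
titles_b = {'Paper 2', 'Paper 3'}             # narrower query

only_in_a = titles_a - titles_b  # dropped by the narrower query
only_in_b = titles_b - titles_a  # empty when B is a restriction of A

print(sorted(only_in_a))
print(sorted(only_in_b))
```

When neither query is a restriction of the other, as with `(database)` versus `(clinical)`, both differences can be non-empty.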

In [25]:
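# Compare each query with the next one in the list. Stop one short of the end:
# range(7) would index search_strings[7] and raise an IndexError.
for i in range(len(search_strings) - 1):
    showdiff(search_results[search_strings[i]], search_results[search_strings[i+1]])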

Query A: mimic-ii OR mimic-iii, 123 results
Query B: (mimic-ii OR mimic-iii) AND (database OR clinical OR waveform OR icu OR physionet), 120 results
(A - B):


{'Automated Diagnosis Coding with Combined Text Representations.',
 'HIV fusion peptide penetrates, disorders, and softens T-cell membrane mimics.',
 'Synthesis of a new antischistosomally active and toxicologically tolerant C-12 monothione surrogate of the universal antihelmintic praziquantel.'}

(B - A):


set()



Query A: (mimic-ii OR mimic-iii) AND (database OR clinical OR waveform OR icu OR physionet), 120 results
Query B: (mimic-ii OR mimic-iii) AND (database OR clinical OR waveform OR icu), 119 results
(A - B):


{'Photoplethysmography-Based Method for Automatic Detection of Premature Ventricular Contractions.'}

(B - A):


set()



Query A: (mimic-ii OR mimic-iii) AND (database OR clinical OR waveform OR icu), 119 results
Query B: (mimic-ii OR mimic-iii) AND (database OR clinical OR waveform), 116 results
(A - B):


{'An early respiratory distress detection method with Markov models.',
 'Prediction using patient comparison vs. modeling: a case study for mortality prediction.',
 'Wavelet based time series forecast with application to acute hypotensive episodes prediction.'}

(B - A):


set()



Query A: (mimic-ii OR mimic-iii) AND (database OR clinical OR waveform), 116 results
Query B: (mimic-ii OR mimic-iii) AND (database OR clinical), 116 results
(A - B):


set()

(B - A):


set()



Query A: (mimic-ii OR mimic-iii) AND (database OR clinical), 116 results
Query B: (mimic-ii OR mimic-iii) AND (database), 101 results
(A - B):


{'A Predictive Model for Medical Events Based on Contextual Embedding of Temporal Sequences.',
 'A Study of Neural Word Embeddings for Named Entity Recognition in Clinical Text.',
 'A computational approach to early sepsis detection.',
 'A computational approach to mortality prediction of alcohol use disorder inpatients.',
 'Assessing the Comorbidity Gap between Clinical Studies and Prevalence in Elderly Patient Populations.',
 'Diagnosis code assignment: models and evaluation metrics.',
 'False ventricular tachycardia alarm suppression in the ICU based on the discrete wavelet transform in the ECG signal.',
 'Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms.',
 'Performance comparison of multi-label learning algorithms on clinical data for chronic diseases.',
 'Prediction of Sepsis in the Intensive Care Unit With Minimal Electronic Health Record Data: A Machine Learning Approach.',
 'Psychiatric symptom recognit

(B - A):


set()



Query A: (mimic-ii OR mimic-iii) AND (database), 101 results
Query B: (mimic-ii OR mimic-iii) AND (clinical), 76 results
(A - B):


{'A data mining approach to reduce the false alarm rate of patient monitors.',
 'A methodology for prediction of acute hypotensive episodes in ICU via a risk scoring model including analysis of ST-segment variations.',
 'A physiological time series dynamics-based approach to patient monitoring and outcome prediction.',
 'Adaptive online monitoring for ICU patients by combining just-in-time learning and principal component analysis.',
 'Comparison between invasive and non-invasive blood pressure in young, middle and old age.',
 'Discriminative and Distinct Phenotyping by Constrained Tensor Factorization.',
 'Effect of Antipyretic Therapy on Mortality in Critically Ill Patients with Sepsis Receiving Mechanical Ventilation Treatment.',
 'Efficient and sparse feature selection for biomedical text classification via the elastic net: Application to ICU risk stratification from nursing notes.',
 'Evaluation of monitoring cardiac output by long time interval analysis of a radial arterial blood

(B - A):


{'A Predictive Model for Medical Events Based on Contextual Embedding of Temporal Sequences.',
 'A Study of Neural Word Embeddings for Named Entity Recognition in Clinical Text.',
 'A computational approach to early sepsis detection.',
 'A computational approach to mortality prediction of alcohol use disorder inpatients.',
 'Assessing the Comorbidity Gap between Clinical Studies and Prevalence in Elderly Patient Populations.',
 'Diagnosis code assignment: models and evaluation metrics.',
 'False ventricular tachycardia alarm suppression in the ICU based on the discrete wavelet transform in the ECG signal.',
 'Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms.',
 'Performance comparison of multi-label learning algorithms on clinical data for chronic diseases.',
 'Prediction of Sepsis in the Intensive Care Unit With Minimal Electronic Health Record Data: A Machine Learning Approach.',
 'Psychiatric symptom recognit






### Conclusion

Only the most general query returns false positives.

Narrowing from the most general query to the second most general one removes these false positives:
- 'HIV fusion peptide penetrates, disorders, and softens T-cell membrane mimics.' https://www.ncbi.nlm.nih.gov/pubmed/20655315
- 'Synthesis of a new antischistosomally active and toxicologically tolerant C-12 monothione surrogate of the universal antihelmintic praziquantel.' https://www.ncbi.nlm.nih.gov/pubmed/12403307

It also drops one relevant paper (a false negative):
- 'Automated Diagnosis Coding with Combined Text Representations.' https://www.ncbi.nlm.nih.gov/pubmed/28423783
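
The trade-off can be put in numbers using the counts printed earlier. This is a back-of-the-envelope sketch that assumes the two titles listed above are the only false positives among the 123 results and that all 120 results of the second query are relevant; neither assumption was checked paper by paper here.

```python
# Counts taken from the search output above.
general_count = 123    # results for 'mimic-ii OR mimic-iii'
false_positives = 2    # irrelevant papers in the general query (assumed complete)
narrow_count = 120     # results for the second most general query
missed = 1             # relevant paper lost by narrowing

# Relevant papers found by the general query.
relevant = general_count - false_positives

# Consistency check: the narrower query plus the paper it missed
# should account for every relevant paper.
assert narrow_count + missed == relevant

precision_general = relevant / general_count  # share of general results that are relevant
recall_narrow = narrow_count / relevant       # share of relevant papers the narrow query keeps

print('Precision of general query: %.4f' % precision_general)
print('Recall of narrower query:   %.4f' % recall_narrow)
```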



## 1.2. Search Google Scholar

Query: `("mimic ii" OR "mimic iii") AND ("database" OR "clinical" OR "waveform" OR ICU)`

https://scholar.google.com/scholar?q=%28%22mimic+ii%22+OR+%22mimic+iii%22%29+AND+%28%22database%22+OR+%22clinical%22%29&btnG=&hl=en&as_sdt=1%2C22&as_vis=1

https://scholar.google.com/scholar/help.html
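
The search URL above carries the query percent-encoded in its `q` parameter. A minimal sketch of building such a URL with the standard library; the helper name `scholar_url` is hypothetical, and only the `q` and `hl` parameters that appear in the URL above are used.

```python
from urllib.parse import urlencode


def scholar_url(query, language='en'):
    """Return a Google Scholar search URL for the query (hypothetical helper)."""
    # urlencode percent-encodes quotes and parentheses and turns spaces into '+'.
    return 'https://scholar.google.com/scholar?' + urlencode({'q': query, 'hl': language})


query = '("mimic ii" OR "mimic iii") AND ("database" OR "clinical" OR "waveform" OR ICU)'
print(scholar_url(query))
```

Note that the URL listed above encodes only the `"database" OR "clinical"` clause, so it is narrower than the query stated in the text.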

In [None]:
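# The next cell needs this import. Older scholarly releases expose
# search_author at module level; newer ones use `from scholarly import scholarly`.
import scholarly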

In [None]:
print(next(scholarly.search_author('Steven A. Cholewiak')))