# ScienceIO API Demo

In this demo, we'll:
- Log in with our user account
- Make our first request
- Load the response into a pandas DataFrame and analyze it

In [6]:
import pandas as pd
import yaml
from IPython.display import display, JSON

from analytics import get_top, report  # helpers used later in this demo
from scienceio import ScienceIO

## Initialize client

In [7]:
scio = ScienceIO()

If an account already exists for your login, you will be notified and can proceed to log in.

## Make a request to the API

In [8]:
query_text = (
    """
    The patient is a 21-day-old Caucasian male here for 2 days of congestion - 
    mom has been suctioning yellow discharge from the patient's nares, plus she has noticed 
    some mild problems with his breathing while feeding (but negative for any perioral cyanosis or retractions). 
    One day ago, mom also noticed a tactile temperature and gave the patient Tylenol. Baby also has 
    had some decreased p.o. intake. His normal breast-feeding is down from 20 minutes q.2h. 
    to 5 to 10 minutes secondary to his respiratory congestion. He sleeps well, but has been more tired 
    and has been fussy over the past 2 days. The parents noticed no improvement with albuterol treatments given
    in the ER. His urine output has also decreased; normally he has 8 to 10 wet and 5 dirty diapers per 24 hours,
    now he has down to 4 wet diapers per 24 hours. Mom denies any diarrhea. His bowel movements are yellow colored and soft in nature.
    """
)


# Make a request
response_query = scio.annotate(text=query_text)

<class 'str'>


In [9]:
# The `annotations` list can be passed directly to a DataFrame
df = pd.DataFrame(response_query['annotations'])
display(df)
report(df)

|   | annotation_id | concept_id | concept_name | concept_type | pos_end | pos_start | score_id | score_type | text |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | UMLS:C0086418 | Homo sapiens | species_&_viruses | 13 | 6 | 0.99999 | 0.999998 | patient |
| 1 | 2 | UMLS:C0007457 | Caucasoid Race | context | 39 | 30 | 0.99991 | 0.99987 | Caucasian |
| 2 | 3 | UMLS:C0086582 | Males | anatomy_&_physiology | 44 | 40 | 0.999974 | 0.999978 | male |
| 3 | 4 | UMLS:C0700148 | Congestion | medical_conditions | 74 | 64 | 0.999889 | 0.999847 | congestion |
| 4 | 5 | UMLS:C0221205 | Yellow color | context | 109 | 103 | 0.999917 | 0.999898 | yellow |
| 5 | 6 | UMLS:C0030685 | Patient Discharge | medical_procedures | 119 | 110 | 0.996726 | 0.995705 | discharge |
| 6 | 7 | UMLS:C0086418 | Homo sapiens | species_&_viruses | 136 | 129 | 0.999974 | 0.999986 | patient |
| 7 | 8 | UMLS:C0595944 | Anterior nares | anatomy_&_physiology | 144 | 139 | 0.999976 | 0.99979 | nares |
| 8 | 9 | UMLS:C0010520 | Cyanosis | medical_conditions | 260 | 243 | 0.99922 | 0.998834 | perioral cyanosis |
| 9 | 10 | UMLS:C1704938 | FBXW7 wt Allele | genetics | 291 | 288 | 0.999992 | 0.999987 | ago |


Found 24 mentions of healthcare information (20 unique).
Found 20 unique concepts, spanning 7 categories.
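
The `pos_start` and `pos_end` columns in the table above appear to be character offsets into the submitted text, with `pos_end` exclusive (for example, 13 - 6 = 7, the length of "patient"). That reading is inferred from the output and is not confirmed anywhere in this demo. Under that assumption, a sanity check could be sketched as follows; `mismatched_spans` and the sample data are hypothetical and not part of the `scienceio` package.

```python
def mismatched_spans(text, annotations):
    """Return annotations whose [pos_start, pos_end) slice differs from their `text` field.

    Assumes half-open character offsets, which is an inference from the demo output.
    """
    return [
        a for a in annotations
        if text[a["pos_start"]:a["pos_end"]] != a["text"]
    ]


# Hypothetical sample data using the field names shown in the table
sample_text = "The patient has congestion."
sample_annotations = [
    {"text": "patient", "pos_start": 4, "pos_end": 11},
    {"text": "congestion", "pos_start": 16, "pos_end": 26},
    {"text": "patient", "pos_start": 0, "pos_end": 7},  # deliberately wrong offsets
]

bad = mismatched_spans(sample_text, sample_annotations)
print(len(bad), "annotation(s) do not match their offsets")
```

Offsets count every character, including leading whitespace and newlines, so a check like this should run against exactly the string that was passed to `annotate`.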


In [10]:
# This is the text we'll send to the API
query_text = (
    """
    The COVID-19 pandemic has shown a markedly low proportion of 
    cases among children. Age disparities in observed COVID-19 cases could be 
    explained by children having lower susceptibility to infection, lower 
    propensity to show clinical symptoms or both. COVID-19 has shown to induce
    Kawasaki syndrome in some children. Kawasaki (KD), or Mucocutaneous Lymph Node Syndrome
    as it is often called, is an acute illness that affects mostly children 6 months to 5 years 
    old—median age in the KD cohort was 25 months—and features inflammation of small and medium 
    blood vessels. Although the cause of KD is unknown, it is believed to occur in genetically 
    predisposed children after exposure to an environmental trigger such as an infection.
    """
)

# Make a request
response_query = scio.annotate(text=query_text)

<class 'str'>


In [11]:
# The `annotations` list can be passed directly to a DataFrame
df = pd.DataFrame(response_query['annotations'])
display(df)

|   | annotation_id | concept_id | concept_name | concept_type | pos_end | pos_start | score_id | score_type | text |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | UMLS:C5203670 | COVID-19 | medical_conditions | 14 | 6 | 0.999316 | 0.999914 | COVID-19 |
| 1 | 2 | UMLS:C1615608 | Pandemics | anatomy_&_physiology | 23 | 15 | 0.999836 | 0.998517 | pandemic |
| 2 | 3 | UMLS:C0008059 | Child | context | 85 | 77 | 0.581539 | 0.505265 | children |
| 3 | 4 | UMLS:C0001779 | Age | anatomy_&_physiology | 90 | 87 | 0.999856 | 0.999865 | Age |
| 4 | 5 | UMLS:C5203670 | COVID-19 | medical_conditions | 123 | 115 | 0.998266 | 0.999884 | COVID-19 |
| 5 | 6 | UMLS:C0008059 | Child | context | 162 | 154 | 0.549657 | 0.500652 | children |
| 6 | 7 | UMLS:C0009450 | Communicable Diseases | medical_conditions | 203 | 194 | 0.922211 | 0.999919 | infection |
| 7 | 8 | UMLS:C5203670 | COVID-19 | medical_conditions | 267 | 259 | 0.999292 | 0.999893 | COVID-19 |
| 8 | 9 | UMLS:C0026691 | Mucocutaneous Lymph Node Syndrome | medical_conditions | 306 | 289 | 0.999932 | 0.999624 | Kawasaki syndrome |
| 9 | 10 | UMLS:C0008059 | Child | species_&_viruses | 323 | 315 | 0.539449 | 0.52799 | children |
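
Some rows in this output carry noticeably lower scores than the rest (the `Child` rows sit near 0.5 to 0.6). A minimal sketch of thresholding on the score columns follows. It assumes `score_id` and `score_type` are confidence values between 0 and 1, which the demo does not state explicitly; `filter_by_score` and the 0.9 cutoff are hypothetical choices for illustration.

```python
def filter_by_score(annotations, min_score=0.9, score_keys=("score_id", "score_type")):
    """Keep annotations whose listed scores all meet the threshold.

    Assumption: each score is a confidence in [0, 1]; the cutoff is arbitrary.
    """
    return [
        a for a in annotations
        if all(a[key] >= min_score for key in score_keys)
    ]


# Three rows copied from the table above (only the relevant fields)
rows = [
    {"concept_name": "COVID-19", "score_id": 0.999316, "score_type": 0.999914},
    {"concept_name": "Child", "score_id": 0.581539, "score_type": 0.505265},
    {"concept_name": "Communicable Diseases", "score_id": 0.922211, "score_type": 0.999919},
]

kept = filter_by_score(rows)
print([a["concept_name"] for a in kept])
```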


## Viewing the results

In [12]:
report(df)

Found 26 mentions of healthcare information (17 unique).
Found 15 unique concepts, spanning 4 categories.
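
The `analytics` module is not shown in this demo, so what `report` computes is not documented here. One plausible reading of its output is that it counts rows, distinct `text` values, distinct `concept_id` values, and distinct `concept_type` values. The sketch below implements that reading on a plain list of annotation dicts; `summarize_annotations` is a hypothetical stand-in, not the actual helper.

```python
def summarize_annotations(annotations):
    """Summary counts under an assumed interpretation of the `report` output."""
    return {
        "mentions": len(annotations),
        "unique_mentions": len({a["text"] for a in annotations}),
        "unique_concepts": len({a["concept_id"] for a in annotations}),
        "categories": len({a["concept_type"] for a in annotations}),
    }


# A few rows in the shape shown in the annotation tables
sample = [
    {"concept_id": "UMLS:C5203670", "concept_type": "medical_conditions", "text": "COVID-19"},
    {"concept_id": "UMLS:C0008059", "concept_type": "context", "text": "children"},
    {"concept_id": "UMLS:C5203670", "concept_type": "medical_conditions", "text": "COVID-19"},
]

summary = summarize_annotations(sample)
print(
    f"Found {summary['mentions']} mentions ({summary['unique_mentions']} unique); "
    f"{summary['unique_concepts']} unique concepts in {summary['categories']} categories."
)
```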


In [13]:
# top mentions
get_top(df, "text")

|   | text | mentions |
|---|---|---|
| 0 | children | 5 |
| 1 | KD | 3 |
| 2 | COVID-19 | 3 |
| 3 | infection | 2 |
| 4 | environmental | 1 |
| 5 | pandemic | 1 |
| 6 | inflammation | 1 |
| 7 | illness | 1 |
| 8 | features | 1 |
| 9 | Age | 1 |
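
`get_top` also comes from the `analytics` module, which is not shown. Its output looks like a frequency count of one column, limited to the ten most common values. A rough equivalent on a list of annotation dicts could be sketched with `collections.Counter`; `top_values` is a hypothetical name, and the default of ten rows is an assumption based on the output above.

```python
from collections import Counter


def top_values(annotations, column, n=10):
    """Return the n most frequent values of `column` as (value, count) pairs."""
    counts = Counter(a[column] for a in annotations)
    return counts.most_common(n)


# Hypothetical mentions, not the full API response
mentions = [
    {"text": "children"},
    {"text": "KD"},
    {"text": "children"},
    {"text": "COVID-19"},
    {"text": "KD"},
    {"text": "children"},
]

top_two = top_values(mentions, "text", n=2)
print(top_two)
```

With the DataFrame already in hand, `df[column].value_counts().head(n)` produces the same kind of ranking directly in pandas.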


In [14]:
# top concepts
get_top(df, "concept_name")

|   | concept_name | mentions |
|---|---|---|
| 0 | Mucocutaneous Lymph Node Syndrome | 5 |
| 1 | Child | 4 |
| 2 | COVID-19 | 3 |
| 3 | Age | 2 |
| 4 | Communicable Diseases | 2 |
| 5 | Blood Vessel | 1 |
| 6 | Characteristics | 1 |
| 7 | Chronic disease | 1 |
| 8 | Cohort Studies | 1 |
| 9 | Environment | 1 |


In [15]:
# top concept types
get_top(df, "concept_type")

|   | concept_type | mentions |
|---|---|---|
| 0 | medical_conditions | 12 |
| 1 | context | 7 |
| 2 | anatomy_&_physiology | 4 |
| 3 | species_&_viruses | 3 |
