# ScienceIO API Analytics

In this demo, we'll:
- Log in with our user account
- Make our first request
- Load the response into a pandas DataFrame and analyze it

In [4]:
import pandas as pd
import yaml
from IPython.display import display, JSON

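# Import the helpers used below explicitly instead of a wildcard import
from analytics import get_score_dict, get_top_dict, quantize_scores, report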
from scienceio import ScienceIO

## Initialize client

In [5]:
scio = ScienceIO()

If an account already exists for the login you provide, you will be notified and can proceed to log in.

## Make a request to the API

In [18]:
# This is the text we'll send to the API
query_text = (
    """
    The COVID-19 pandemic has shown a markedly low proportion of 
    cases among children. Age disparities in observed COVID-19 cases could be 
    explained by children having lower susceptibility to infection, lower 
    propensity to show clinical symptoms or both. COVID-19 has shown to induce
    Kawasaki syndrome in some children. Kawasaki (KD), or Mucocutaneous Lymph Node Syndrome
    as it is often called, is an acute illness that affects mostly children 6 months to 5 years 
    old—median age in the KD cohort was 25 months—and features inflammation of small and medium 
    blood vessels. Although the cause of KD is unknown, it is believed to occur in genetically 
    predisposed children after exposure to an environmental trigger such as an infection.
    """
)

# Make a request
response_query = scio.annotate(text=query_text)




In [19]:
# We can pass `annotations` directly to a dataframe
df = pd.DataFrame(response_query['annotations'])
df = quantize_scores(df)
display(df.head())

| | annotation_id | concept_id | concept_name | concept_type | pos_end | pos_start | score_id | score_type | text |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | UMLS:C5203670 | COVID-19 | medical_conditions | 14 | 6 | Very High | Very High | COVID-19 |
| 1 | 2 | UMLS:C1615608 | Pandemics | anatomy_&_physiology | 23 | 15 | Very High | Very High | pandemic |
| 2 | 3 | UMLS:C0008059 | Child | context | 85 | 77 | Low | Low | children |
| 3 | 4 | UMLS:C0001779 | Age | anatomy_&_physiology | 90 | 87 | Very High | Very High | Age |
| 4 | 5 | UMLS:C5203670 | COVID-19 | medical_conditions | 123 | 115 | Very High | Very High | COVID-19 |
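Before analyzing the annotations, it can help to sanity-check the character offsets. The sketch below rebuilds a small frame from the five rows shown above and checks that `pos_end - pos_start` equals the length of the matched `text`. It assumes `pos_end` is exclusive, which is consistent with these rows but is not confirmed by the API documentation.

```python
import pandas as pd

# Rows copied from the df.head() output above (only the columns needed here).
sample = pd.DataFrame(
    {
        "text": ["COVID-19", "pandemic", "children", "Age", "COVID-19"],
        "pos_start": [6, 15, 77, 87, 115],
        "pos_end": [14, 23, 85, 90, 123],
    }
)

# Assumption: pos_end is exclusive, so the span length equals len(text).
sample["span_length"] = sample["pos_end"] - sample["pos_start"]
sample["length_matches"] = sample["span_length"] == sample["text"].str.len()

print(sample)
```

If any row reports `length_matches` as `False`, the offsets probably follow a different convention than the one assumed here.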


## Viewing the results

In [None]:
report(df)

In [5]:
# top mentions
get_top_dict(df, "text")

{'text': ['children',
  'KD',
  'COVID-19',
  'infection',
  'environmental',
  'pandemic',
  'inflammation',
  'illness',
  'features',
  'Age'],
 'mentions': [5, 3, 3, 2, 1, 1, 1, 1, 1, 1]}
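The implementation of `get_top_dict` is not shown in this notebook. As a rough sketch of how a similar result could be produced with plain pandas, the hypothetical helper `top_values` below counts the most frequent values in a column with `value_counts` and returns them in the same dictionary shape.

```python
import pandas as pd


def top_values(frame, column, n=10):
    """Hypothetical stand-in for get_top_dict: the n most frequent values in `column`."""
    # value_counts() sorts by frequency in descending order by default.
    counts = frame[column].value_counts().head(n)
    return {column: counts.index.tolist(), "mentions": counts.tolist()}


# The `text` values from the first five annotations shown earlier.
sample = pd.DataFrame(
    {"text": ["COVID-19", "pandemic", "children", "Age", "COVID-19"]}
)

result = top_values(sample, "text")
print(result)
```

On this five-row sample, `COVID-19` comes first with two mentions; the order of the tied single-mention values depends on pandas.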

In [6]:
# top concepts
get_top_dict(df, "concept_name")

{'concept_name': ['Mucocutaneous Lymph Node Syndrome',
  'Child',
  'COVID-19',
  'Age',
  'Communicable Diseases',
  'Blood Vessel',
  'Characteristics',
  'Chronic disease',
  'Cohort Studies',
  'Environment'],
 'mentions': [5, 4, 3, 2, 2, 1, 1, 1, 1, 1]}

In [7]:
# top concept types
get_top_dict(df, "concept_type")

{'concept_type': ['Medical Conditions',
  'Context',
  'Anatomy & Physiology',
  'Species & Viruses'],
 'mentions': [12, 7, 4, 3]}
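The raw `concept_type` values in the DataFrame are lowercase with underscores (for example `medical_conditions`), while the output above shows display labels such as `Medical Conditions`. How the helper performs this conversion is not shown; one way to get the same labels, offered here only as an assumption, is to replace underscores with spaces and title-case the result.

```python
import pandas as pd

# Raw concept_type values as they appear in the annotations table.
raw_types = pd.Series(["medical_conditions", "anatomy_&_physiology", "context"])

# Replace underscores with spaces, then capitalize each word.
pretty_types = raw_types.str.replace("_", " ", regex=False).str.title()

print(pretty_types.tolist())
# ['Medical Conditions', 'Anatomy & Physiology', 'Context']
```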

In [8]:
# count for score_id
get_score_dict(df)

{'Very High': 18, 'Low': 6, 'High': 1, 'Moderate': 1, 'Very Low': 0}
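The output above includes `'Very Low': 0`, so levels with no annotations are still reported. A minimal sketch of that behavior, using a hypothetical `count_scores` helper rather than the actual `get_score_dict` implementation, looks up each known level and defaults missing ones to zero. Unlike the output above, this sketch returns the levels in a fixed order instead of sorting them by count.

```python
import pandas as pd

# The five score labels that appear in the notebook output.
SCORE_LEVELS = ["Very High", "High", "Moderate", "Low", "Very Low"]


def count_scores(frame, column="score_id"):
    """Hypothetical stand-in for get_score_dict: count every level, including absent ones."""
    counts = frame[column].value_counts()
    return {level: int(counts.get(level, 0)) for level in SCORE_LEVELS}


# The score_id values from the first five annotations shown earlier.
sample = pd.DataFrame(
    {"score_id": ["Very High", "Very High", "Low", "Very High", "Very High"]}
)

score_counts = count_scores(sample)
print(score_counts)
# {'Very High': 4, 'High': 0, 'Moderate': 0, 'Low': 1, 'Very Low': 0}
```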