# Accessing clinical profiles through Python

First, we'll import `ClinicalProfileServer` from the `clinicalprofiles` client library. The library also provides `Variable` and `ClinicalProfile` classes, but you can ignore them for now.

In [1]:
from clinicalprofiles import ClinicalProfileServer

We'll set the base URL for our profile server.

> Note: This will eventually change to https://hapi.clinicalprofiles.org.

In [2]:
BASE_URL = "https://hapi.clinicalprofiles.org"

Now we construct the `ClinicalProfileServer`, which will give us access to a list of all profiles as they appear on the server.

In [3]:
C = ClinicalProfileServer(BASE_URL)

To see a list of available profile names, use the `.keys()` method. As with a dictionary, you can then retrieve a particular profile from a `ClinicalProfileServer` using square brackets:

In [4]:
C.keys()

['jhu-asthma-profile-33-labs',
 'jhu-asthma-profile-1-labs',
 'jhu-asthma-profile-63-diagnoses',
 'jhu-asthma-profile-77-diagnoses',
 'jhu-asthma-profile-88-diagnoses',
 'jhu-asthma-profile-17-labs',
 'jhu-asthma-profile-16-labs',
 'jhu-asthma-profile-89-diagnoses',
 'jhu-asthma-profile-76-diagnoses',
 'jhu-asthma-profile-62-diagnoses']

In [5]:
p = C['jhu-asthma-profile-17-labs']
print(p)

<ClinicalProfile type=[Lab Results] 'jhu-asthma-profile-17-labs'>
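
Because `.keys()` returns plain strings, ordinary list operations can narrow the names down before you download anything. The following is a minimal sketch that uses the names printed above as a stand-in for `C.keys()`, so it runs without a server connection. Treating the `-labs` suffix as a marker for lab profiles is an assumption based on the names shown, not a documented convention.

```python
# Stand-in for C.keys(): the profile names printed above.
profile_names = [
    "jhu-asthma-profile-33-labs",
    "jhu-asthma-profile-1-labs",
    "jhu-asthma-profile-63-diagnoses",
    "jhu-asthma-profile-77-diagnoses",
    "jhu-asthma-profile-88-diagnoses",
    "jhu-asthma-profile-17-labs",
    "jhu-asthma-profile-16-labs",
    "jhu-asthma-profile-89-diagnoses",
    "jhu-asthma-profile-76-diagnoses",
    "jhu-asthma-profile-62-diagnoses",
]

# Assumption: lab profiles can be recognized by the "-labs" suffix.
lab_profiles = [name for name in profile_names if name.endswith("-labs")]
print(lab_profiles)
```

With a live server, replacing `profile_names` with `C.keys()` should give the same kind of filtered list.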


## Using clinical profiles after downloading

You have now downloaded a `ClinicalProfile` object! We can look at what's inside it:

In [6]:
[s for s in vars(p) if not s.startswith("_")]

['name',
 'last_updated',
 'date',
 'version',
 'url',
 'status',
 'population',
 'cohort',
 'source',
 'reporter',
 'type',
 'profile']

As you can see, we have access to simple metadata like `last_updated` (converted to a Python `datetime` object for your viewing pleasure) as well as `reporter`:

In [7]:
print(p.last_updated)
print(f"Reported by {p.reporter['display']}")

2019-01-22 13:25:51.038000+00:00
Reported by Johns Hopkins School of Medicine
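
Since `last_updated` is a timezone-aware `datetime`, it can be compared and subtracted like any other. Here is a small sketch that rebuilds the timestamp printed above with the standard library instead of reading it from `p`. The two reference dates are hypothetical values chosen only for illustration.

```python
from datetime import datetime, timezone

# Stand-in for p.last_updated: the timestamp printed above.
last_updated = datetime.fromisoformat("2019-01-22 13:25:51.038000+00:00")

# Hypothetical reference dates, chosen only for illustration.
cutoff = datetime(2019, 1, 1, tzinfo=timezone.utc)
as_of = datetime(2019, 2, 1, tzinfo=timezone.utc)

# Both sides are timezone-aware, so comparison and subtraction are well defined.
updated_since_cutoff = last_updated > cutoff
age_in_days = (as_of - last_updated).days
print(updated_since_cutoff, age_in_days)
```

Note that the reference dates carry `tzinfo` as well: Python raises a `TypeError` when a timezone-aware `datetime` is compared with a naive one.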


We also have access to the variables directly, using the `get_variables` method:

In [8]:
variables = p.get_variables()
print("The first few variables in this profile are:")
print(variables[:5])

The first few variables in this profile are:
[<Variable 'Glucose [Mass/volume] in Serum or Plasma' μ=121.78 s=65.46>, <Variable 'Glucose [Mass/volume] in Blood' μ=132.44 s=76.43>, <Variable 'Glucose [Moles/volume] in Serum or Plasma' μ=130.25 s=65.67>, <Variable 'Fasting glucose [Mass/volume] in Serum or Plasma' μ=96.41 s=38.93>, <Variable 'Potassium [Moles/volume] in Venous blood' μ=4.23 s=0.56>]
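
One possible use of these summary statistics is to see how far a single measurement lies from the profile's mean. The sketch below reads `μ` and `s` in the printed representation as the mean and standard deviation, and copies the numbers by hand, because the attribute names on `Variable` are not shown here. The measurement value is hypothetical.

```python
# Mean and standard deviation copied from the first Variable printed above
# ('Glucose [Mass/volume] in Serum or Plasma').
mean, std_dev = 121.78, 65.46


def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / std_dev


# Hypothetical measurement, chosen only for illustration.
z = z_score(150.0, mean, std_dev)
print(f"z = {z:.2f}")
```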


If you prefer to work with a pandas DataFrame instead of `Variable` objects, you can request the variables as a dataframe using the `as_dataframe` method. It takes an optional `columns` argument, a list containing any of the following:

* min
* max
* mean
* stdDev
* deciles

If you include `deciles`, the dataframe will contain columns named `decile_X`, where `X` ∈ {10, 20, ..., 90}.

In [9]:
df = p.as_dataframe(["min", "max", "deciles"])

In [10]:
df.head()

| | decile_10 | decile_20 | decile_30 | decile_40 | decile_50 | decile_60 | decile_70 | decile_80 | decile_90 | max | min |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Glucose [Mass/volume] in Serum or Plasma | 79.0 | 86.0 | 90.0 | 95.0 | 101.0 | 109.0 | 120.0 | 141.0 | 186.0 | 1436.0 | 3.0 |
| Glucose [Mass/volume] in Blood | 84.0 | 91.0 | 96.0 | 101.0 | 107.0 | 116.0 | 130.0 | 153.0 | 208.0 | 907.0 | 36.0 |
| Glucose [Moles/volume] in Serum or Plasma | 84.0 | 91.0 | 96.0 | 101.0 | 108.0 | 116.0 | 131.0 | 157.0 | 208.0 | 571.0 | 30.0 |
| Fasting glucose [Mass/volume] in Serum or Plasma | 79.0 | 80.0 | 81.0 | 86.0 | 87.0 | 88.0 | 91.0 | 94.0 | 100.0 | 275.0 | 71.0 |
| Potassium [Moles/volume] in Venous blood | 3.6 | 3.8 | 4.0 | 4.1 | 4.2 | 4.3 | 4.5 | 4.6 | 4.9 | 9.4 | 1.5 |
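
The decile columns make it possible to place a measurement within the profile's distribution. The following sketch uses only the standard library and the decile values for `Glucose [Mass/volume] in Serum or Plasma` copied from the output above. The helper `decile_band` and the sample values are hypothetical and not part of the `clinicalprofiles` library.

```python
from bisect import bisect_right

# decile_10 .. decile_90 for 'Glucose [Mass/volume] in Serum or Plasma',
# copied from the output above.
deciles = [79.0, 86.0, 90.0, 95.0, 101.0, 109.0, 120.0, 141.0, 186.0]


def decile_band(value, deciles):
    """Return (lower, upper) percentile bounds around `value`.

    None marks an open end: below decile_10 or above decile_90.
    """
    i = bisect_right(deciles, value)  # number of decile cut points <= value
    lower = i * 10 if i > 0 else None
    upper = (i + 1) * 10 if i < len(deciles) else None
    return lower, upper


# Hypothetical measurement, chosen only for illustration.
band = decile_band(110.0, deciles)
print(band)
```

In a live session, the same list could presumably be read from the matching row of `df` rather than typed in by hand.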
