# Getting started with JupyterHealthClient

First, you'll want to create a `JupyterHealthClient`.
In a managed deployment, credentials are typically loaded from the `$JHE_TOKEN` and `$JHE_URL` environment variables.
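
As a rough sketch of what loading credentials from the environment might look like, the helper below reads `JHE_URL` and `JHE_TOKEN` from a mapping and fails early if either is missing. `load_jhe_credentials` is a hypothetical helper for illustration, not part of `jupyterhealth_client`, and the client itself may resolve these variables differently.

```python
import os


def load_jhe_credentials(env=None):
    """Hypothetical helper: return (url, token) from JHE_URL and JHE_TOKEN."""
    env = os.environ if env is None else env
    # Collect the names of any variables that are unset or empty.
    missing = [name for name in ("JHE_URL", "JHE_TOKEN") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["JHE_URL"], env["JHE_TOKEN"]


# Example with an explicit mapping; the URL and token are placeholders.
url, token = load_jhe_credentials(
    {"JHE_URL": "https://jhe.example.org", "JHE_TOKEN": "not-a-real-token"}
)
```

The resulting values could then be passed explicitly as `JupyterHealthClient(url=url, token=token)`.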

In [None]:
from jupyterhealth_client import Code, JupyterHealthClient

# use anonymize=True to allow output in documentation
jh_client = JupyterHealthClient(anonymize=True)
# or jh_client = JupyterHealthClient(url=url, token=token)

## Retrieving information

### Getting the current user

First, we can see who we are logged in as:

In [None]:
jh_client.get_user()

### Getting study information

We can list all the studies we currently have access to,
including the organization each one is associated with.

`study_id` will be useful for retrieving observations later.

In [None]:
print("All my studies:")
for study in jh_client.list_studies():
    print(f"  - [{study['id']}] {study['name']} org:{study['organization']['name']}")

And we can get a single study by id:

In [None]:
jh_client.get_study(study["id"])

### Getting patient information

We can list patients we have access to with `list_patients()`,
and see which studies they have shared data with using `get_patient_consents`.

`list` endpoints all return _generators_ and should handle pagination automatically when there are a lot of results.
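
To illustrate what "generator with automatic pagination" means in practice, here is a minimal sketch that is independent of the client. `iter_pages`, `fake_fetch`, and the page layout are assumptions made up for this example; they do not describe how `jupyterhealth_client` is implemented.

```python
def iter_pages(fetch_page):
    """Yield items one at a time, requesting further pages only when needed.

    `fetch_page(page)` is a hypothetical callable returning
    (items, next_page), with next_page=None on the last page.
    """
    page = 1
    while page is not None:
        items, page = fetch_page(page)
        yield from items


# A fake two-page backend standing in for an HTTP API.
FAKE_PAGES = {1: (["a", "b"], 2), 2: (["c"], None)}
requested = []


def fake_fetch(page):
    requested.append(page)  # record which pages were actually requested
    return FAKE_PAGES[page]


results = iter_pages(fake_fetch)
first = next(results)  # only the first page has been requested at this point
requested_after_first = list(requested)
remaining = list(results)  # consuming the rest triggers the second request
```

Because results arrive lazily, stopping a loop early avoids fetching pages that are never used. Wrap the call in `list(...)` when all results are needed at once.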

In [None]:
# show all the patients with study data I have access to:
print("Patients with data I have access to:")

for patient in jh_client.list_patients():
    consents = jh_client.get_patient_consents(patient["id"])
    if not consents["studies"] and not consents["studiesPendingConsent"]:
        continue
    print(
        f"[{patient['id']}] {patient['nameFamily']}, {patient['nameGiven']} ({patient['telecomEmail']})"
    )
    for study in consents["studies"]:
        for scope in study["scopeConsents"]:
            if scope["consented"]:
                # remember which patients have which data for later in the demo
                if scope["code"]["codingCode"] == Code.BLOOD_GLUCOSE.value:
                    cgm_patient_id = patient["id"]
                    cgm_study_id = study["id"]
                if scope["code"]["codingCode"] == Code.BLOOD_PRESSURE.value:
                    bp_patient_id = patient["id"]
                    bp_study_id = study["id"]
                print(f"  - [{study['id']}] {study['name']} ({scope['code']['text']})")
    for study in consents["studiesPendingConsent"]:
        print(f"  - (not consented) [{study['id']}] {study['name']}")

## Retrieving Observations

`list_observations_df` retrieves all observations into a pandas DataFrame.
You can filter by:

- `study_id` - fetch data authorized to a single study
- `patient_id` - fetch data for a single patient
- `code` - a `Code` filter to select only a single measurement type (e.g. `Code.BLOOD_PRESSURE`)

At least one of `study_id` or `patient_id` must be specified.
`code` is always optional.

To get all blood pressure data for a single study:

In [None]:
bp_iter = jh_client.list_observations(study_id=bp_study_id, code=Code.BLOOD_PRESSURE)
bp_iter

In [None]:
observation = next(iter(bp_iter))
observation

The interesting data is in `valueAttachment`, which is a base64-encoded JSON blob. We can extract it:

In [None]:
import base64
import json

json.loads(base64.decodebytes(observation["valueAttachment"]["data"].encode()).decode())
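
To try the decoding step without a server, the round trip can be reproduced on a synthetic observation. This is only a sketch: the payload fields below are made up, and real attachments contain whatever the data source recorded.

```python
import base64
import json

# Made-up payload for illustration only.
payload = {"systolic_blood_pressure": {"value": 120, "unit": "mmHg"}}

# Encode as described in the text: JSON, then base64, stored under "data".
fake_observation = {
    "valueAttachment": {
        "data": base64.b64encode(json.dumps(payload).encode()).decode()
    }
}

# Decode: base64 text -> bytes -> parsed JSON (json.loads accepts bytes).
decoded = json.loads(base64.b64decode(fake_observation["valueAttachment"]["data"]))
```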

Or we can use `tidy_observation` to turn the nested structure of an Observation into one more suitable for DataFrames.

`tidy_observation` takes nested fields and turns them into a single flat dictionary, so

```python
{"a": {"b": 5}}
```

becomes

```python
{"a_b": 5}
```
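
The flattening idea can be sketched in a few lines. This is an illustration of the general technique only, not the actual `tidy_observation` implementation; `flatten` and its parameters are hypothetical names.

```python
def flatten(nested, parent_key="", sep="_"):
    """Flatten nested dicts into a single level, joining keys with `sep`."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the joined key as a prefix.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


flat = flatten({"a": {"b": 5}, "c": 1})
```

Flat keys like these map directly onto DataFrame column names, which is why this shape is convenient for tabular analysis.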

`tidy_observation` also understands the structure of the `valueAttachment`, so it handles the base64/json bit, too:

In [None]:
from jupyterhealth_client import tidy_observation

tidy_observation(observation)

### Loading observations into a DataFrame

`list_observations_df` takes the same arguments as `list_observations`, but returns a DataFrame instead of a generator.
The observations are passed through `tidy_observation`, so the keys above are the columns of the DataFrame.

The same blood pressure data, loaded as a DataFrame:

In [None]:
# get all blood pressure data
full_bp = jh_client.list_observations_df(study_id=bp_study_id, code=Code.BLOOD_PRESSURE)
full_bp.columns

The data frame preserves all fields recorded by JHE, which is a lot.
You can thin this out by selecting columns to make things more manageable.

Generally the most informative columns are:

- `code` - the code identifying the data type for the row (most useful when no `code` filter is given; otherwise it always matches the input `code`)
- `subject_reference` - the `Patient/$id` identifier (useful when you have retrieved data for multiple patients)
- `effective_time_frame_date_time` - the effective time of the Observation in UTC. Also available as `effective_time_frame_date_time_local` if the local time-of-day at the time and place of measurement is useful.
- `*_value` columns - the actual measurements, e.g. `systolic_blood_pressure_value`, `blood_glucose_value`, etc.
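
As a sketch of how these columns might be picked out, the example below builds a small synthetic DataFrame using the column names listed above and keeps only the identifying columns plus every `*_value` column. The rows are made up, and `resource_type` is a hypothetical stand-in for the many less useful columns.

```python
import pandas as pd

# Synthetic rows; values are made up for illustration.
df = pd.DataFrame(
    {
        "subject_reference": ["Patient/1", "Patient/1", "Patient/2"],
        "effective_time_frame_date_time": pd.to_datetime(
            ["2024-01-01T08:00:00Z", "2024-01-02T08:00:00Z", "2024-01-01T09:30:00Z"]
        ),
        "systolic_blood_pressure_value": [120, 124, 131],
        "diastolic_blood_pressure_value": [80, 82, 85],
        "resource_type": ["Observation"] * 3,  # hypothetical extra column
    }
)

# Keep the identifying columns plus every column whose name ends in "_value".
value_columns = [name for name in df.columns if name.endswith("_value")]
slim = df[["subject_reference", "effective_time_frame_date_time", *value_columns]]

# Mean of each measurement per patient.
per_patient = slim.groupby("subject_reference")[value_columns].mean()
```

Selecting by suffix avoids hard-coding measurement names, which can help when a DataFrame mixes several measurement types.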

Now we can select those columns and use `groupby("subject_reference")` to plot each patient separately, in case we have more than one patient.

In [None]:
bp = full_bp[
    [
        "subject_reference",
        "effective_time_frame_date_time",
        "systolic_blood_pressure_value",
        "diastolic_blood_pressure_value",
    ]
]
bp

In [None]:
bp.groupby("subject_reference").plot(
    x="effective_time_frame_date_time",
    y=["systolic_blood_pressure_value", "diastolic_blood_pressure_value"],
    style="o",
)

### Continuous Glucose Monitor (CGM) data for a single patient

We can do the same with CGM data.
This time, we use `patient_id` and `code` to retrieve CGM data for a single patient.

In [None]:
# get all cgm data
full_cgm = jh_client.list_observations_df(
    patient_id=cgm_patient_id, code=Code.BLOOD_GLUCOSE
)
full_cgm.columns

We can transform the data to have the columns expected by `cgmquantify` and plot it:

In [None]:
import cgmquantify

cgm = full_cgm.loc[:, ["effective_time_frame_date_time_local", "blood_glucose_value"]]
# define columns cgmquantify expects
cgm["Time"] = cgm.effective_time_frame_date_time_local
cgm["Glucose"] = cgm.blood_glucose_value
cgm["Day"] = cgm["Time"].dt.date
cgmquantify.plotglucosebounds(cgm)