In [1]:
%load_ext autoreload

In [2]:
%autoreload 2

In [3]:
%load_ext chapyter



In [4]:
#https://translational-medicine.biomedcentral.com/articles/10.1186/s12967-020-02620-5

In [7]:
import os
os.environ["NOTEBOOK_NAME"] = "10_SepsisByAge.ipynb"

In [8]:
%%mimicSQL

Retrieve all data from the PATIENTS table of the patients database. I want only the patients who have “sepsis”, “severe sepsis”, or “septic shock”.

To retrieve the data you're asking for, we first need to identify the patients with "sepsis", "severe sepsis", and "septic shock". These conditions are typically recorded in the DIAGNOSES_ICD table. Here's a SQL query that should get you started:

```sql
SELECT p.*
FROM "mimic"."patients" p
JOIN "mimic"."diagnoses_icd" d ON p.subject_id = d.subject_id
WHERE d.icd9_code IN ('99591', '99592', '78552')
```

In this query, '99591', '99592', and '78552' are the ICD-9 codes for "sepsis", "severe sepsis", and "septic shock" respectively. Note that the query returns every patient who has ever received one of these diagnoses, not only those diagnosed on their most recent visit. Because the join produces one row per matching diagnosis record, a patient with several matching diagnoses appears more than once in the result.


Unnamed: 0,row_id,subject_id,gender,dob,dod,dod_hosp,dod_ssn,expire_flag
0,235,250,F,2164-12-27 00:00:00.000,2188-11-22 00:00:00.000,2188-11-22 00:00:00.000,,1
1,235,250,F,2164-12-27 00:00:00.000,2188-11-22 00:00:00.000,2188-11-22 00:00:00.000,,1
2,252,269,M,2130-09-30 00:00:00.000,,,,0
3,252,269,M,2130-09-30 00:00:00.000,,,,0
4,650,689,F,2128-12-27 00:00:00.000,,,,0
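
The repeated rows above (subject_id 250 and 269 each appear twice) are consistent with the join returning one row per matching diagnosis record. Below is a minimal sketch of that effect using Python's built-in `sqlite3` module. The toy tables and their contents are hypothetical stand-ins, not MIMIC-III data, and SQLite is used only for illustration in place of Athena.

```python
import sqlite3

# Toy in-memory tables (hypothetical data, not MIMIC-III rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (subject_id INTEGER, gender TEXT);
    CREATE TABLE diagnoses_icd (subject_id INTEGER, icd9_code TEXT);
    INSERT INTO patients VALUES (1, 'F'), (2, 'M'), (3, 'F');
    -- Patient 1 has two sepsis-related codes; patient 2 has one; patient 3 has none.
    INSERT INTO diagnoses_icd VALUES
        (1, '99591'), (1, '78552'), (2, '99592'), (3, '4019');
""")

join_sql = """
    SELECT {select} p.*
    FROM patients p
    JOIN diagnoses_icd d ON p.subject_id = d.subject_id
    WHERE d.icd9_code IN ('99591', '99592', '78552')
    ORDER BY p.subject_id
"""

# One row per matching diagnosis: patient 1 appears twice.
joined_rows = conn.execute(join_sql.format(select="")).fetchall()
# DISTINCT collapses the repeats to one row per patient.
distinct_rows = conn.execute(join_sql.format(select="DISTINCT")).fetchall()

print(joined_rows)
print(distinct_rows)
```

Counting genders on the joined rows would count patient 1 twice, which is why row counts and patient counts can differ.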


In [9]:
%%mimicPython

How many men and women are there?

You can use the `value_counts()` function in pandas to count the number of men and women in the dataframe. Here's how you can do it:

```python
df['gender'].value_counts()
```

This will return the count of each unique value in the 'gender' column of the dataframe.


Result : gender
M    4267
F    3496
Name: count, dtype: int64


In [10]:
%%mimicSQL

Now retrieve the same dataset from MIMIC-III, except only patients over 18.

To restrict the dataset to patients over 18, we need each patient's age at admission. We can compute it by subtracting the date of birth (dob) in the PATIENTS table from the admission time (admittime) in the ADMISSIONS table. Here's a SQL query that should get you started:

```sql
SELECT p.*
FROM "mimic"."patients" p
JOIN "mimic"."admissions" a ON p.subject_id = a.subject_id
JOIN "mimic"."diagnoses_icd" d ON p.subject_id = d.subject_id
WHERE d.icd9_code IN ('99591', '99592', '78552')
AND DATE_PART('year', AGE(a.admittime, p.dob)) > 18
```

This query is intended to return all patients over the age of 18 who have ever been diagnosed with "sepsis", "severe sepsis", or "septic shock", not just those diagnosed on their most recent visit.

Query failed! State: FAILED
Reason: FUNCTION_NOT_FOUND: line 6:23: Function 'age' not registered
SQL RETRIEVAL FAI

In [11]:
%%mimicSQL

Try again.

I apologize for the confusion. The AGE function is not available in AWS Athena. We can instead approximate the age by subtracting the year of birth from the year of admission. Here's the corrected SQL query:

```sql
SELECT p.*
FROM "mimic"."patients" p
JOIN "mimic"."admissions" a ON p.subject_id = a.subject_id
JOIN "mimic"."diagnoses_icd" d ON p.subject_id = d.subject_id
WHERE d.icd9_code IN ('99591', '99592', '78552')
AND EXTRACT(year FROM a.admittime) - EXTRACT(year FROM p.dob) > 18
```

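This query returns patients who have been diagnosed with "sepsis", "severe sepsis", or "septic shock" and who have at least one admission in a calendar year more than 18 years after their birth year. Subtracting years ignores month and day, so the computed age can exceed the true age by up to one year. Because admissions and diagnoses_icd are both joined on subject_id only, each patient row is repeated once for every pairing of a qualifying admission with a matching diagnosis, and the age filter is not tied to the admission at which the diagnosis was recorded.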


Unnamed: 0,row_id,subject_id,gender,dob,dod,dod_hosp,dod_ssn,expire_flag
0,235,250,F,2164-12-27 00:00:00.000,2188-11-22 00:00:00.000,2188-11-22 00:00:00.000,,1
1,235,250,F,2164-12-27 00:00:00.000,2188-11-22 00:00:00.000,2188-11-22 00:00:00.000,,1
2,252,269,M,2130-09-30 00:00:00.000,,,,0
3,252,269,M,2130-09-30 00:00:00.000,,,,0
4,650,689,F,2128-12-27 00:00:00.000,,,,0
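
Subtracting calendar years, as the corrected query does, ignores month and day. Below is a minimal Python sketch of how that differs from age in completed years. The helper names `age_by_year_difference` and `age_in_full_years` are hypothetical, and the dates are made up for illustration.

```python
from datetime import date


def age_by_year_difference(dob: date, admit: date) -> int:
    """Approximate age: calendar-year subtraction, as in the SQL above."""
    return admit.year - dob.year


def age_in_full_years(dob: date, admit: date) -> int:
    """Completed years of age at admission."""
    # Subtract one if the birthday has not yet occurred in the admission year.
    had_birthday = (admit.month, admit.day) >= (dob.month, dob.day)
    return admit.year - dob.year - (0 if had_birthday else 1)


dob = date(2100, 12, 27)
admit = date(2119, 1, 15)  # roughly 18 years and 3 weeks after birth

approx_age = age_by_year_difference(dob, admit)
exact_age = age_in_full_years(dob, admit)
print(approx_age, exact_age)
```

For these dates the year difference is 19, so the patient passes a `> 18` filter, while the completed age is 18.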


In [12]:
%%mimicPython

How many men and women are there?

To find out the number of men and women in the dataframe, you can use the `value_counts()` function on the 'gender' column of the dataframe. Here's how you can do it:

```python
df['gender'].value_counts()
```

This will return the count of each unique value in the 'gender' column, which should be 'M' for male and 'F' for female.


Result : gender
M    9590
F    7811
Name: count, dtype: int64
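
The totals above (17,401 rows) exceed those from the first query (7,763 rows) even though a filter was added. One likely cause is that admissions and diagnoses_icd are each joined on subject_id alone, so every qualifying admission is paired with every matching diagnosis row. Below is a toy `sqlite3` sketch of that multiplication, with hypothetical data. Joining diagnoses to admissions on hadm_id, a column both MIMIC-III tables carry, is shown as one possible way to limit it, not as the method used in this notebook.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (subject_id INTEGER);
    CREATE TABLE admissions (subject_id INTEGER, hadm_id INTEGER);
    CREATE TABLE diagnoses_icd (subject_id INTEGER, hadm_id INTEGER, icd9_code TEXT);
    INSERT INTO patients VALUES (1);
    -- One patient with three admissions.
    INSERT INTO admissions VALUES (1, 100), (1, 101), (1, 102);
    -- Two sepsis-related codes, both recorded during admission 101.
    INSERT INTO diagnoses_icd VALUES (1, 101, '99592'), (1, 101, '78552');
""")

codes = "('99591', '99592', '78552')"

# Both joins keyed on subject_id: 3 admissions x 2 diagnoses = 6 rows.
by_subject = conn.execute(f"""
    SELECT COUNT(*)
    FROM patients p
    JOIN admissions a ON p.subject_id = a.subject_id
    JOIN diagnoses_icd d ON p.subject_id = d.subject_id
    WHERE d.icd9_code IN {codes}
""").fetchone()[0]

# Diagnoses tied to their own admission via hadm_id: 2 rows.
by_admission = conn.execute(f"""
    SELECT COUNT(*)
    FROM patients p
    JOIN admissions a ON p.subject_id = a.subject_id
    JOIN diagnoses_icd d ON a.hadm_id = d.hadm_id
    WHERE d.icd9_code IN {codes}
""").fetchone()[0]

print(by_subject, by_admission)
```

With the hadm_id join, an age filter on `a.admittime` would also apply to the admission at which the diagnosis was recorded rather than to any admission of the patient.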


In [28]:
%%mimicSQL

Fine. Give me the same table above, but only unique subject_ids.

To retrieve the same dataset with unique subject_ids, you can use the DISTINCT keyword in SQL. Here's how you can modify the previous query:

```sql
SELECT DISTINCT p.*
FROM "mimic"."patients" p
JOIN "mimic"."admissions" a ON p.subject_id = a.subject_id
JOIN "mimic"."diagnoses_icd" d ON p.subject_id = d.subject_id
WHERE d.icd9_code IN ('99591', '99592', '78552')
AND EXTRACT(year FROM a.admittime) - EXTRACT(year FROM p.dob) > 18
```

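This query will return all unique patients over the age of 18 who have ever been diagnosed with "sepsis", "severe sepsis", or "septic shock". Because the PATIENTS table holds one row per subject_id, applying DISTINCT to p.* leaves one row per patient.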


Unnamed: 0,row_id,subject_id,gender,dob,dod,dod_hosp,dod_ssn,expire_flag
0,669,711,M,2100-03-06 00:00:00.000,2185-05-26 00:00:00.000,,2185-05-26 00:00:00.000,1
1,728,773,M,2051-06-10 00:00:00.000,2109-03-17 00:00:00.000,2109-03-17 00:00:00.000,2109-03-17 00:00:00.000,1
2,755,801,F,2151-02-17 00:00:00.000,2198-08-25 00:00:00.000,2198-08-25 00:00:00.000,2198-08-25 00:00:00.000,1
3,55,62,M,2044-05-08 00:00:00.000,,,,0
4,57,64,F,2116-06-27 00:00:00.000,,,,0


In [29]:
%%mimicPython

How many men and women are there?

You can use the `value_counts()` function in pandas to count the number of men and women in the dataframe. Here's how you can do it:

```python
df['gender'].value_counts()
```

This will return the count of each unique value in the 'gender' column of the dataframe.


Result : gender
M    2591
F    2083
Name: count, dtype: int64


In [30]:
%%mimicPython

What percentage of men and women here died?

To calculate the percentage of men and women who died, you can use the `value_counts()` function on the 'gender' column of the dataframe where 'expire_flag' is 1 (indicating death). Here's how you can do it:

```python
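# Count deceased patients (expire_flag == 1) by gender
deceased = df[df['expire_flag'] == 1]['gender'].value_counts()
# Count all patients by gender
total = df['gender'].value_counts()

# Series division aligns on the gender index
percentage_deceased = (deceased / total) * 100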
```

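This returns, for each gender, the percentage of patients with expire_flag set to 1. In MIMIC-III that flag marks any recorded date of death, in or out of hospital, so these figures describe all-cause mortality over the whole record rather than deaths during the sepsis admission.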


Result : gender
M    64.260903
F    61.353817
Name: count, dtype: float64
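
Because expire_flag is 0 or 1, the same percentages can also be obtained as the mean of the flag within each gender. Below is a minimal pandas sketch on a small made-up dataframe. The values are hypothetical, not the query result, and the variable names `pct_from_counts` and `pct_from_mean` are introduced here for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the query result: one row per patient.
df = pd.DataFrame({
    "gender": ["M", "M", "M", "M", "F", "F"],
    "expire_flag": [1, 1, 1, 0, 1, 0],
})

# Approach used in the cell above: deceased counts divided by total counts.
deceased = df[df["expire_flag"] == 1]["gender"].value_counts()
total = df["gender"].value_counts()
pct_from_counts = deceased / total * 100

# Equivalent: the mean of a 0/1 flag is the fraction of ones.
pct_from_mean = df.groupby("gender")["expire_flag"].mean() * 100

print(pct_from_mean)
```

The two approaches agree whenever every gender has at least one death. If a gender had none, it would be missing from `deceased` and the division would give NaN for it, whereas the group mean would give 0.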
