In [None]:
%pip install ipython-sql psycopg2 python-dotenv

In [2]:
%load_ext sql

In [3]:
%load_ext dotenv
%dotenv
import os
dbpw = os.environ.get("postgress_pw")


In [4]:
%sql postgresql://postgres:{dbpw}@localhost:5432/health_db
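
If the password contains characters such as `@`, `/` or `:`, pasting it raw into the connection URL can break URL parsing. A minimal sketch of percent-encoding it first, assuming only the standard library; `build_db_url` is a hypothetical helper and the sample password is made up:

```python
from urllib.parse import quote

def build_db_url(password, user="postgres", host="localhost", port=5432, db="health_db"):
    """Hypothetical helper: return a PostgreSQL URL with the password percent-encoded."""
    # safe="" forces '/', '@' and ':' to be encoded as well.
    return f"postgresql://{user}:{quote(password, safe='')}@{host}:{port}/{db}"

# Made-up password containing characters that are special inside a URL.
example_url = build_db_url("p@ss/word:1")
print(example_url)
```

The resulting string could be used in place of the hand-written URL on the `%sql` line.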

# INTERMEDIATE QUERIES

1. Unique patients who were admitted to the ICU at least 5 times during their hospital stay, along with the
count of their ICU stays (the number of times they were admitted to the ICU). Keep only the top 1000 patients
who had the longest ICU stays.
Output: subject_id, count
Order: count (Descending), subject_id (Descending)


In [None]:
%%sql
SELECT subject_id, COUNT(*) as count
FROM icustays 
GROUP BY subject_id
HAVING COUNT(*) >= 5
ORDER BY count DESC, subject_id DESC
LIMIT 1000;

2. Top 1000 most prescribed medications during the first 12 hours of admission for patients.
Output: drug, prescription_count
Order: prescription_count (Descending), drug (Descending)


In [None]:
%%sql
SELECT p.drug AS drug, COUNT(*) AS prescription_count
FROM admissions a
JOIN prescriptions p ON a.subject_id = p.subject_id AND a.hadm_id = p.hadm_id
WHERE p.starttime BETWEEN a.admittime AND a.admittime + INTERVAL '12 hours'
GROUP BY p.drug
ORDER BY prescription_count DESC, drug DESC
LIMIT 1000;
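
`BETWEEN` is inclusive at both ends, so a prescription starting exactly 12 hours after admission is counted. A small sketch on made-up rows illustrates the boundaries. It assumes SQLite as a stand-in for PostgreSQL, so `datetime(a.admittime, '+12 hours')` replaces `a.admittime + INTERVAL '12 hours'`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE admissions (subject_id INT, hadm_id INT, admittime TEXT);
CREATE TABLE prescriptions (subject_id INT, hadm_id INT, drug TEXT, starttime TEXT);
INSERT INTO admissions VALUES (1, 10, '2100-01-01 08:00:00');
INSERT INTO prescriptions VALUES
  (1, 10, 'Aspirin', '2100-01-01 08:00:00'),  -- exactly at admission, inside
  (1, 10, 'Aspirin', '2100-01-01 20:00:00'),  -- exactly 12 hours later, inside
  (1, 10, 'Aspirin', '2100-01-01 20:00:01'),  -- one second late, outside
  (1, 10, 'Heparin', '2100-01-01 07:59:59');  -- before admission, outside
""")

result = conn.execute("""
SELECT p.drug, COUNT(*) AS prescription_count
FROM admissions a
JOIN prescriptions p ON a.subject_id = p.subject_id AND a.hadm_id = p.hadm_id
WHERE p.starttime BETWEEN a.admittime AND datetime(a.admittime, '+12 hours')
GROUP BY p.drug
ORDER BY prescription_count DESC, p.drug DESC
""").fetchall()
print(result)
```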

3. Patients with multiple admissions, and the number of times they were diagnosed with any ailment related
to the term 'ALCOHOLIC' (case insensitive) in the drgcodes table's description column (the diagnosis
description should contain the term 'alcoholic'; remember the match is case insensitive). Use only the
drgcodes table.
Output: subject_id, diagnoses_count
Order: diagnoses_count (Descending), subject_id (Descending)


In [None]:
%%sql
SELECT subject_id, COUNT(*) as diagnoses_count
FROM drgcodes 
WHERE lower(description) LIKE '%alcoholic%'
GROUP BY subject_id
HAVING COUNT(*) > 1
ORDER BY diagnoses_count DESC, subject_id DESC;

4. Patients with an admission type of "URGENT" (case sensitive) who died during their hospital stay. Also
include the long titles of the ailments that they had (D_ICD_DIAGNOSES.long_title). Only display the first
1000 such records.
Output: subject_id, hadm_id, icd_code, long_title
Order: subject_id (Descending), hadm_id (Descending), icd_code (Descending), long_title (Descending)


In [None]:
%%sql
SELECT a.subject_id, a.hadm_id, d_icd.icd_code, d_icd.long_title
FROM admissions a
JOIN diagnoses_icd diag_icd ON a.subject_id = diag_icd.subject_id AND a.hadm_id = diag_icd.hadm_id
JOIN d_icd_diagnoses d_icd ON diag_icd.icd_code = d_icd.icd_code
WHERE a.admission_type = 'URGENT' AND a.hospital_expire_flag = 1 AND d_icd.long_title IS NOT NULL
ORDER BY a.subject_id DESC, a.hadm_id DESC, d_icd.icd_code DESC, d_icd.long_title DESC
LIMIT 1000;

##### HOPEFULLY RIGHT - AS PER PIAZZA


5. Average duration (in days) of ICU stays for patients who had a particular laboratory test (e.g. LABEVENTS
ITEMID = 50878) during their stay. Include the patient's subject_id and the average duration of stay in days. Only
return the first 1000 records. (When grouping columns, make sure that records with different subject_id
and hadm_id are counted as separate records, since two records may share a subject_id
but differ in hadm_id.) Further, only consider records where the LOS column of the ICUSTAYS table is not
NULL.
Output: subject_id, avg_stay_duration
Order: avg_stay_duration (Descending), subject_id (Descending)


In [None]:
%%sql
SELECT icu.subject_id, AVG(icu.los) as avg_stay_duration
FROM icustays icu 
JOIN labevents lab ON icu.subject_id = lab.subject_id AND icu.hadm_id = lab.hadm_id
WHERE icu.los IS NOT NULL AND lab.itemid = 50878
GROUP BY icu.subject_id, icu.hadm_id
ORDER BY avg_stay_duration DESC, icu.subject_id DESC
LIMIT 1000;
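
The join above repeats each ICU stay once per matching lab row. A small sketch on made-up rows (SQLite as a stand-in for PostgreSQL) suggests this leaves `AVG(los)` unchanged, because every stay of an admission is repeated the same number of times, while `COUNT(*)` is inflated:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE icustays (subject_id INT, hadm_id INT, stay_id INT, los REAL);
CREATE TABLE labevents (subject_id INT, hadm_id INT, itemid INT);
-- One admission with two ICU stays (1.0 and 3.0 days) and three matching lab rows.
INSERT INTO icustays VALUES (1, 10, 100, 1.0), (1, 10, 101, 3.0);
INSERT INTO labevents VALUES (1, 10, 50878), (1, 10, 50878), (1, 10, 50878);
""")

row = conn.execute("""
SELECT icu.subject_id, AVG(icu.los), COUNT(*), COUNT(DISTINCT icu.stay_id)
FROM icustays icu
JOIN labevents lab ON icu.subject_id = lab.subject_id AND icu.hadm_id = lab.hadm_id
WHERE icu.los IS NOT NULL AND lab.itemid = 50878
GROUP BY icu.subject_id, icu.hadm_id
""").fetchone()
print(row)  # (subject_id, average los, joined rows, distinct stays)
```

If a count or sum over ICU stays were ever needed from the same join, `COUNT(DISTINCT icu.stay_id)` or an `EXISTS` subquery would avoid the inflation.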

6. Patients who had at least 1 admission with the diagnosis code '5723' (use the ICD_CODE column in
DIAGNOSES_ICD). Include the total number of distinct admissions for each patient (use column
'admissions.hadm_id' for this), along with the earliest and latest admit times (admissions.admittime).
Additionally, from this result, give the count of distinct records where the patient was diagnosed with '5723' (column
name 'diagnosis_count' in the resulting table; use the ICD_CODE column of the diagnoses_icd table). Ensure
that the results only include patients who had at least one such admission. Also limit your results to the
first 1000 records. Make sure that when you group columns you use only the subject_id column.
Output: subject_id, gender, total_admissions, last_admission, first_admission, diagnosis_count
Order: total_admissions (Descending), diagnosis_count (Descending), last_admission (Descending), first_admission
(Descending), gender (Descending), subject_id (Descending)


In [None]:
%%sql
SELECT p.subject_id, p.gender,
       COUNT(DISTINCT a.hadm_id) AS total_admissions,
       MAX(a.admittime) AS last_admission,
       MIN(a.admittime) AS first_admission,
       SUM(CASE WHEN diag_icd.icd_code = '5723' THEN 1 ELSE 0 END) AS diagnosis_count
FROM patients p 
JOIN admissions a ON p.subject_id = a.subject_id
LEFT JOIN diagnoses_icd diag_icd ON a.subject_id = diag_icd.subject_id AND a.hadm_id = diag_icd.hadm_id
GROUP BY p.subject_id, p.gender
HAVING SUM(CASE WHEN diag_icd.icd_code = '5723' THEN 1 ELSE 0 END) > 0
ORDER BY total_admissions DESC, diagnosis_count DESC, last_admission DESC, first_admission DESC, gender DESC, p.subject_id DESC
LIMIT 1000;
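
Joining admissions to diagnoses emits one row per diagnosis code, which is why the query counts `DISTINCT a.hadm_id` rather than rows. A sketch on made-up data (SQLite as a stand-in for PostgreSQL; the extra `joined_rows` column is added only for illustration) shows the difference, together with the conditional `SUM(CASE ...)` count:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (subject_id INT, gender TEXT);
CREATE TABLE admissions (subject_id INT, hadm_id INT, admittime TEXT);
CREATE TABLE diagnoses_icd (subject_id INT, hadm_id INT, icd_code TEXT);
INSERT INTO patients VALUES (1, 'F'), (2, 'M');
INSERT INTO admissions VALUES
  (1, 10, '2100-01-01 00:00:00'), (1, 11, '2100-06-01 00:00:00'),
  (2, 20, '2100-02-01 00:00:00');
-- Admission 10 carries two codes, admission 11 has none, patient 2 never has '5723'.
INSERT INTO diagnoses_icd VALUES (1, 10, '5723'), (1, 10, '4019'), (2, 20, '4019');
""")

result = conn.execute("""
SELECT p.subject_id, p.gender,
       COUNT(*) AS joined_rows,
       COUNT(DISTINCT a.hadm_id) AS total_admissions,
       MAX(a.admittime) AS last_admission,
       MIN(a.admittime) AS first_admission,
       SUM(CASE WHEN d.icd_code = '5723' THEN 1 ELSE 0 END) AS diagnosis_count
FROM patients p
JOIN admissions a ON p.subject_id = a.subject_id
LEFT JOIN diagnoses_icd d ON a.subject_id = d.subject_id AND a.hadm_id = d.hadm_id
GROUP BY p.subject_id, p.gender
HAVING SUM(CASE WHEN d.icd_code = '5723' THEN 1 ELSE 0 END) > 0
""").fetchall()
print(result)
```

Patient 1 produces three joined rows but only two distinct admissions, and patient 2 is dropped by the `HAVING` clause.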

7. Patients who had at least 5 ICU stays (distinct stay_id values in ICUSTAYS). Include the total number of ICU stays
and the average length of stay across all ICU admissions. Additionally, filter the results to only include
patients who had an ICU stay in any kind of MICU (Medical Intensive Care Unit): FIRST_CAREUNIT or
LAST_CAREUNIT of the ICUSTAYS table must contain the term 'MICU' (case sensitive). Limit
the output to the top 500 patients. Make sure that whenever you group records, records with different
subject_id are treated as separate records.
Output: subject_id, total_stays, avg_length_of_stay
Order: avg_length_of_stay (Descending), total_stays (Descending), subject_id (Descending)


In [None]:
%%sql
SELECT subject_id, COUNT(DISTINCT stay_id) AS total_stays, AVG(los) AS avg_length_of_stay
FROM icustays
WHERE los IS NOT NULL
GROUP BY subject_id
-- Apply the MICU condition per patient so total_stays and the average still cover all ICU stays.
HAVING COUNT(DISTINCT stay_id) >= 5
   AND BOOL_OR(first_careunit LIKE '%MICU%' OR last_careunit LIKE '%MICU%')
ORDER BY avg_length_of_stay DESC, total_stays DESC, subject_id DESC
LIMIT 500;
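
The statement for problem 7 asks for totals across all ICU stays while keeping only patients with some MICU stay. A sketch on made-up rows contrasts filtering rows in `WHERE` with filtering patients in `HAVING`. It assumes SQLite as a stand-in for PostgreSQL, so `MAX(CASE ...) = 1` is used as a portable substitute for `BOOL_OR`. SQLite's `LIKE` is also case-insensitive for ASCII by default, unlike PostgreSQL's, which does not matter for this toy data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE icustays (subject_id INT, stay_id INT, "
    "first_careunit TEXT, last_careunit TEXT, los REAL)"
)
# Made-up patient: four SICU stays of 2 days and one MICU stay of 7 days.
rows = [(1, 100 + i, "SICU", "SICU", 2.0) for i in range(4)]
rows.append((1, 104, "MICU", "MICU", 7.0))
conn.executemany("INSERT INTO icustays VALUES (?, ?, ?, ?, ?)", rows)

# Row-level filter: only MICU rows survive, so the patient has 1 counted stay.
row_level = conn.execute("""
SELECT subject_id, COUNT(DISTINCT stay_id), AVG(los) FROM icustays
WHERE first_careunit LIKE '%MICU%' OR last_careunit LIKE '%MICU%'
GROUP BY subject_id
HAVING COUNT(DISTINCT stay_id) >= 5
""").fetchall()

# Patient-level filter: all stays are aggregated, then patients without MICU are dropped.
patient_level = conn.execute("""
SELECT subject_id, COUNT(DISTINCT stay_id), AVG(los) FROM icustays
GROUP BY subject_id
HAVING COUNT(DISTINCT stay_id) >= 5
   AND MAX(CASE WHEN first_careunit LIKE '%MICU%' OR last_careunit LIKE '%MICU%'
                THEN 1 ELSE 0 END) = 1
""").fetchall()
print(row_level, patient_level)
```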


8. Patients with a history of heart-related diagnoses (DIAGNOSES_ICD.icd_code should start with 'V4', case
sensitive) who were prescribed a specific medication (PRESCRIPTIONS.DRUG should have 'prochlorperazine'
or 'bupropion' in its name, case insensitive, so any drug named 'BUPROpion', 'buto bupropion
amine' or 'prochlorperazine 60' should be included by your query). Finally, filter the results to only include
patients with a distinct diagnoses count greater than one (use DISTINCT DIAGNOSES_ICD.icd_code). Make
sure that when you group columns in the resulting table, records having different distinct_diagnoses_count,
subject_id and drug are treated as separate records.
Output: subject_id, hadm_id, distinct_diagnoses_count, drug
Order: distinct_diagnoses_count (Descending), subject_id (Descending), hadm_id (Descending), drug (Ascending)


In [None]:
%%sql
SELECT d.subject_id, d.hadm_id, COUNT(DISTINCT d.icd_code) AS distinct_diagnoses_count, p.drug
FROM DIAGNOSES_ICD d
JOIN PRESCRIPTIONS p ON d.subject_id = p.subject_id AND d.hadm_id = p.hadm_id
WHERE d.icd_code LIKE 'V4%' AND (LOWER(p.drug) LIKE '%prochlorperazine%' OR LOWER(p.drug) LIKE '%bupropion%')
GROUP BY d.subject_id, d.hadm_id, p.drug
HAVING COUNT(DISTINCT d.icd_code) > 1
ORDER BY distinct_diagnoses_count DESC, d.subject_id DESC, d.hadm_id DESC, p.drug ASC;


9. Patients who were diagnosed with a heart condition (DIAGNOSES_ICD.ICD_CODE starts with 'I21', case
sensitive) during their first admission and were readmitted afterwards, so the second admission's admittime
must be greater than the first admission's discharge time. Retrieve only the first 1000 rows.
Output: subject_id
Order: subject_id (Descending)


In [None]:
%%sql
SELECT a1.subject_id 
FROM admissions a1 
JOIN admissions a2 ON a1.subject_id = a2.subject_id AND a1.hadm_id != a2.hadm_id
JOIN diagnoses_icd diag_icd ON a1.subject_id = diag_icd.subject_id AND a1.hadm_id = diag_icd.hadm_id
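WHERE a1.dischtime < a2.admittime AND diag_icd.icd_code LIKE 'I21%'
  -- a1 must be the earliest admission of the patient
  AND a1.admittime = (SELECT MIN(a0.admittime) FROM admissions a0 WHERE a0.subject_id = a1.subject_id)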
GROUP BY a1.subject_id
ORDER BY a1.subject_id DESC
LIMIT 1000;
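
The statement for problem 9 ties the 'I21' diagnosis to the patient's first admission. A sketch on made-up rows (SQLite as a stand-in for PostgreSQL) compares the self-join with and without a check that `a1` is the earliest admission. The aliases `a0` and `d` and the sample codes are illustrative only:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE admissions (subject_id INT, hadm_id INT, admittime TEXT, dischtime TEXT);
CREATE TABLE diagnoses_icd (subject_id INT, hadm_id INT, icd_code TEXT);
INSERT INTO admissions VALUES
  (1, 10, '2100-01-01 00:00:00', '2100-01-05 00:00:00'),
  (1, 11, '2100-02-01 00:00:00', '2100-02-03 00:00:00'),
  (2, 20, '2100-01-01 00:00:00', '2100-01-02 00:00:00'),
  (2, 21, '2100-03-01 00:00:00', '2100-03-04 00:00:00'),
  (2, 22, '2100-05-01 00:00:00', '2100-05-02 00:00:00');
-- Patient 1 has I21 on the first admission, patient 2 only on the second.
INSERT INTO diagnoses_icd VALUES (1, 10, 'I214'), (2, 20, 'J189'), (2, 21, 'I214');
""")

base = """
SELECT a1.subject_id
FROM admissions a1
JOIN admissions a2 ON a1.subject_id = a2.subject_id AND a1.hadm_id != a2.hadm_id
JOIN diagnoses_icd d ON a1.subject_id = d.subject_id AND a1.hadm_id = d.hadm_id
WHERE a1.dischtime < a2.admittime AND d.icd_code LIKE 'I21%' {extra}
GROUP BY a1.subject_id
ORDER BY a1.subject_id DESC
"""
first_only = """AND a1.admittime = (SELECT MIN(a0.admittime) FROM admissions a0
                                    WHERE a0.subject_id = a1.subject_id)"""

any_admission = [r[0] for r in conn.execute(base.format(extra=""))]
first_admission = [r[0] for r in conn.execute(base.format(extra=first_only))]
print(any_admission, first_admission)
```

Without the extra condition, patient 2 is returned even though the diagnosis belongs to the second admission.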

##### Not Sure whether the query for 10 is right

10. Patients who have been prescribed the same medication during multiple admissions, along with details
of the drug (PRESCRIPTIONS.DRUG). Retrieve only the first 1000 rows. Whenever you group columns,
remember that records with different subject_id and drug columns (PRESCRIPTIONS table) are treated as
separate.
Output: subject_id, anchor_year, drug
Order: subject_id (Descending), anchor_year (Descending), drug (Descending).

In [None]:
%%sql
SELECT p.subject_id, pt.anchor_year, p.drug
FROM (
    SELECT subject_id, drug
    FROM prescriptions
    GROUP BY subject_id, drug
    HAVING COUNT(DISTINCT hadm_id) > 1
) p
JOIN patients pt ON p.subject_id = pt.subject_id
ORDER BY p.subject_id DESC, pt.anchor_year DESC, p.drug DESC
LIMIT 1000;
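
One way to gain some confidence in the query for problem 10 is to run it on a few made-up rows where the expected answer is obvious. A minimal sketch, assuming SQLite as a stand-in for PostgreSQL and invented sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (subject_id INT, anchor_year INT);
CREATE TABLE prescriptions (subject_id INT, hadm_id INT, drug TEXT);
INSERT INTO patients VALUES (1, 2150), (2, 2160);
INSERT INTO prescriptions VALUES
  (1, 10, 'Aspirin'), (1, 11, 'Aspirin'),   -- same drug, two admissions
  (1, 10, 'Heparin'), (1, 10, 'Heparin'),   -- same drug twice, one admission
  (2, 20, 'Aspirin');                       -- single admission only
""")

result = conn.execute("""
SELECT p.subject_id, pt.anchor_year, p.drug
FROM (
    SELECT subject_id, drug
    FROM prescriptions
    GROUP BY subject_id, drug
    HAVING COUNT(DISTINCT hadm_id) > 1
) p
JOIN patients pt ON p.subject_id = pt.subject_id
ORDER BY p.subject_id DESC, pt.anchor_year DESC, p.drug DESC
LIMIT 1000
""").fetchall()
print(result)
```

Only the drug prescribed in two different admissions survives. A drug repeated within a single admission does not, which matches the wording "during multiple admissions".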