In [2]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [3]:
%sql mysql+pymysql://root:your_password_here@localhost:XYZ/healthcare_db

This section demonstrates how to connect to the MySQL database safely without exposing sensitive credentials. The connection uses placeholder values for the username and password, so the notebook can be shared publicly on GitHub. To run it locally, you only need to replace the placeholders with your own credentials. This approach ensures security while keeping the workflow reproducible and easy to follow.

In [7]:
# You can run this in Jupyter Notebook
# No password is stored here

import mysql.connector

# Fill in your credentials locally
db_user = "your_username_here"
db_password = "your_password_here"
db_host = "localhost"
db_name = "healthcare_db"

# Connect to the database
try:
    conn = mysql.connector.connect(
        user=db_user,
        password=db_password,
        host=db_host,
        database=db_name
    )
    cursor = conn.cursor()
    print("✅ Database connection successful!")
except mysql.connector.Error as err:
    print("❌ Error: ", err)

ModuleNotFoundError: No module named 'mysql'
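
One way to keep credentials out of the notebook entirely is to read them from environment variables and build the connection string at run time. The sketch below is only an illustration: the variable names `DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_PORT`, and `DB_NAME` and the helper `build_mysql_url` are assumptions, not part of the original notebook.

```python
import os
from urllib.parse import quote_plus


def build_mysql_url(env=os.environ):
    """Build a mysql+pymysql connection URL from environment variables.

    The variable names used here are hypothetical; adapt them to your setup.
    """
    user = env.get("DB_USER", "root")
    # quote_plus escapes characters such as '@' or '/' that would break the URL
    password = quote_plus(env.get("DB_PASSWORD", ""))
    host = env.get("DB_HOST", "localhost")
    port = env.get("DB_PORT", "3306")  # 3306 is MySQL's default port
    name = env.get("DB_NAME", "healthcare_db")
    return f"mysql+pymysql://{user}:{password}@{host}:{port}/{name}"


# Example with made-up values; no real credentials appear in the notebook
example_url = build_mysql_url({"DB_USER": "analyst", "DB_PASSWORD": "p@ss/word"})
print(example_url)
```

The resulting string can be passed to the `%sql` magic in place of the hard-coded URL. The `ModuleNotFoundError` above means the `mysql-connector-python` package is not installed in the kernel's environment; the `%sql` cells are unaffected because their connection string uses the `pymysql` driver.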

In [11]:
%%sql
SELECT
*
FROM
patients_clean
LIMIT 1;

patient_id,age,gender,blood_type_code,medical_condition,admission_date,doctor_id,hospital_normalized,hospital_id,insurance_provider,billing_amount,room_number,admission_type_id,discharge_date,medication,test_results,needs_date_review
P75010,30,Male,B-,Cancer,2024-01-31,26505,sons and miller,34331,Blue Cross,18856.0,328,1,2024-02-02,Paracetamol,Normal,0


SECTION 1: DATASET OVERVIEW

LET'S START BY COUNTING THE NUMBER OF PATIENTS IN THE CLEANED DATASET

I begin the analysis by calculating the total number of patients in the cleaned dataset. This provides a baseline understanding of the data.

In [12]:
%%sql
SELECT
    COUNT(*) AS Number_of_patients 
FROM
    Patients_clean;

Number_of_patients
52466


Next, I identify the number of unique hospitals, doctors, and admission types represented in the dataset. This helps me understand the scope of healthcare providers involved and the diversity of patient admission categories.

In [14]:
%%sql
SELECT
    COUNT(DISTINCT H.Hospital_name) AS cnt_Hospitals,
    COUNT(DISTINCT Doctor_id) AS Cnt_doctor,
    COUNT(DISTINCT Admission_type_id) AS cnt_admission_type
FROM
    Patients_clean P
JOIN
    Hospital_ref H
ON
    P.Hospital_id=H.Hospital_id;

cnt_Hospitals,Cnt_doctor,cnt_admission_type
38305,38791,3


Let us find how many patients need date review.

In [33]:
%%sql
SELECT
    SUM(needs_date_review) AS Date_to_be_reviewed
FROM
    Patients_clean;

Date_to_be_reviewed
6277


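Patients Requiring Date Review: The query above counts the patient records flagged for date inconsistencies (6,277 of 52,466). This helps assess data quality and highlights entries that may need correction before deeper analysis.

Missing Values Check

Next, I check every column for NULL values or empty strings. The query returns no rows, so no column in the cleaned dataset contains a missing value.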

In [36]:
%%sql
SELECT
    *
FROM
    Patients_clean
WHERE 
    patient_id IS NULL OR patient_id LIKE '' -- Checks for a NULL value or an empty string.
    OR age IS NULL OR age LIKE ''
    OR gender IS NULL OR gender LIKE ''
    OR blood_type_code IS NULL OR blood_type_code LIKE ''
    OR medical_condition IS NULL OR medical_condition LIKE ''
    OR admission_date IS NULL OR admission_date LIKE ''
    OR doctor_id IS NULL OR doctor_id LIKE ''
    OR hospital_normalized IS NULL OR hospital_normalized LIKE '' 
    OR hospital_id IS NULL OR hospital_id LIKE ''
    OR insurance_provider IS NULL OR insurance_provider LIKE ''
    OR billing_amount IS NULL OR billing_amount LIKE ''
    OR room_number IS NULL OR room_number LIKE ''
    OR admission_type_id IS NULL OR admission_type_id LIKE ''
    OR discharge_date IS NULL OR discharge_date LIKE ''
    OR medication IS NULL OR medication LIKE ''
    OR test_results IS NULL OR test_results LIKE ''
    OR needs_date_review IS NULL OR needs_date_review LIKE '';

patient_id,age,gender,blood_type_code,medical_condition,admission_date,doctor_id,hospital_normalized,hospital_id,insurance_provider,billing_amount,room_number,admission_type_id,discharge_date,medication,test_results,needs_date_review
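
Writing one condition per column by hand is easy to get wrong when the schema changes. As a rough sketch, the same query text can be generated from a list of column names. The helper `build_missing_value_query` is hypothetical; the column names are copied from the output header above.

```python
# Column names taken from the patients_clean output shown earlier
COLUMNS = [
    "patient_id", "age", "gender", "blood_type_code", "medical_condition",
    "admission_date", "doctor_id", "hospital_normalized", "hospital_id",
    "insurance_provider", "billing_amount", "room_number",
    "admission_type_id", "discharge_date", "medication", "test_results",
    "needs_date_review",
]


def build_missing_value_query(table, columns):
    """Return a SELECT that finds rows with a NULL or empty value in any column."""
    # Identifiers are interpolated directly, so pass only trusted, hard-coded names
    conditions = [f"{col} IS NULL OR {col} LIKE ''" for col in columns]
    where_clause = "\n    OR ".join(conditions)
    return f"SELECT *\nFROM {table}\nWHERE\n    {where_clause};"


missing_value_sql = build_missing_value_query("Patients_clean", COLUMNS)
print(missing_value_sql)
```

The printed statement can be pasted into a `%%sql` cell. Because table and column names cannot be bound as query parameters, this approach is only appropriate when the names come from the analyst, never from user input.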


SECTION 2: PATIENT DEMOGRAPHICS

Patient Age Distribution

To understand the demographic spread of the dataset, I calculate the youngest, oldest, and average patient age. This provides a quick overview of the age range represented in the hospital records.

In [38]:
%%sql
SELECT
    MIN(Age) AS Youngest_Patient_Age,
    MAX(Age) AS Oldest_Patient_Age,
    AVG(Age) AS Avg_age
FROM
    Patients_clean;

Youngest_Patient_Age,Oldest_Patient_Age,Avg_age
13,89,51.5164


Gender

I examine the number of male and female patients to understand the gender breakdown of the dataset.

In [39]:
%%sql
SELECT
    Gender,
    COUNT(Patient_id) AS Patients_per_gender
FROM
    Patients_clean
GROUP BY 
    Gender;

Gender,Patients_per_gender
Male,26280
Female,26186


Identifying Age Outliers

I check for unusual or extreme age values, such as negatives or ages over 130, which could skew analysis or indicate data entry errors.

In [46]:
%%sql
SELECT
    Age
FROM
    Patients_clean
WHERE
    Age < 0       -- negative ages
    OR Age > 130; -- implausibly high ages (over 130)

Age


SECTION 3: HOSPITAL ANALYSIS

Hospitals With the Highest Patient Volume

I analyze which hospitals receive the most patients to understand facility workload and identify high-traffic healthcare centers within the dataset.

In [122]:
%%sql
SELECT
    H.Hospital_name,
    COUNT(Patient_id) AS Patient_per_hospital
FROM
    Patients_clean P
JOIN
    Hospital_ref H
ON
    P.Hospital_id=H.Hospital_id
GROUP BY
    H.Hospital_name
HAVING
    COUNT(Patient_id)>1
ORDER BY
    COUNT(Patient_id) DESC;

Hospital_name,Patient_per_hospital
llc smith,42
ltd smith,38
johnson plc,37
smith ltd,35
smith plc,33
smith group,32
smith inc,32
smith llc,32
group smith,31
johnson inc,31


Patient Distribution by Hospital and Admission Type

I break down patient counts for each hospital by admission type to understand how different facilities manage various categories of patient admissions.

In [5]:
%%sql
SELECT
    H.Hospital_name,
    A.admission_type ,
    COUNT(P.Patient_id) AS Patient_per_hospital
FROM
    Patients_clean P
JOIN Admission_type_ref A
    ON P.admission_type_id=A.admission_type_id
JOIN
    Hospital_ref H
ON
    P.Hospital_id=H.Hospital_id
GROUP BY
    H.Hospital_name,
    A.admission_type
ORDER BY
    H.Hospital_name,
    A.admission_type,
    Patient_per_hospital DESC;

Hospital_name,admission_type,Patient_per_hospital
abbott and thompson sullivan,Elective,1
abbott inc,Elective,1
abbott ltd,Emergency,1
abbott moore and williams,Emergency,1
abbott peters and hoffman,Urgent,2
abbott vazquez bautista and,Elective,1
abbottcastillo,Urgent,1
abbottcoleman,Emergency,1
abbottferrell,Urgent,1
abbotthill,Elective,1


Section 4: Doctor Analysis

Doctors With the Highest Patient Load

I analyze which doctors handle the most patients, giving insight into workload distribution and identifying highly engaged healthcare providers in the dataset.

In [21]:
%%sql
SELECT
    A.Doctor_name,
    COUNT(Patient_id) AS Sum_of_patients
FROM
    Doctor_ref A
JOIN
    Patients_clean P
ON
    A.doctor_id=P.doctor_id
GROUP BY
    A.Doctor_name
ORDER BY
    COUNT(Patient_id) DESC;

Doctor_name,Sum_of_patients
Michael Smith,25
John Smith,22
Michael Johnson,19
David Smith,19
Robert Johnson,19
Robert Smith,19
Michael Williams,18
James Smith,18
John Johnson,17
Christopher Smith,17


Doctor Workload by Admission Type

I examine how many patients each doctor treats across different admission types. This helps reveal specialization patterns and variations in workload based on patient categories.

In [115]:
%%sql
SELECT
    D.Doctor_name,
    A.Admission_type,
    COUNT(Patient_id) AS Sum_of_patients
FROM
    Doctor_ref D
JOIN
    Patients_clean P
ON
    D.doctor_id=P.doctor_id
JOIN
    Admission_type_ref A
ON
    A.Admission_type_id=P.Admission_type_id
GROUP BY 
    D.Doctor_name,
    A.Admission_type
ORDER BY 
    D.Doctor_name,
    A.Admission_type;

Doctor_name,Admission_type,Sum_of_patients
Aaron Acevedo,Elective,1
Aaron Adams,Urgent,1
Aaron Aguilar,Emergency,1
Aaron Alexander,Urgent,1
Aaron Anderson,Elective,1
Aaron Arnold,Elective,1
Aaron Baker,Emergency,2
Aaron Barker,Urgent,1
Aaron Barrett,Elective,1
Aaron Barry,Emergency,1


Section 5: Admission Type Analysis

Patient Count by Admission Type

I analyze how many patients fall under each admission type to understand the most common reasons patients enter the hospital and identify major categories of care.

In [117]:
%%sql
SELECT
    A.Admission_type,
    COUNT(Patient_id) AS Patients_per_admission_type
FROM
    Admission_type_ref A
JOIN
    Patients_clean P
ON
    A.admission_type_id=P.Admission_type_id
GROUP BY
    A.Admission_type
ORDER BY
    COUNT(Patient_id) DESC;

Admission_type,Patients_per_admission_type
Elective,17642
Urgent,17531
Emergency,17293


Average Billing and Length of Stay by Admission Type

I calculate the average billing amount and the average hospital stay for each admission type. This highlights cost patterns and differences in patient care duration across categories.

In [122]:
%%sql
SELECT
    A.Admission_type,
    AVG(P.Billing_amount),
    AVG(DATEDIFF(P.Discharge_date,P.admission_date)) AS Avg_stay
FROM
    Admission_type_ref A
JOIN
    Patients_clean P
ON
    A.admission_type_id=P.Admission_type_id
GROUP BY
    A.Admission_type;

Admission_type,AVG(P.Billing_amount),Avg_stay
Urgent,25537.896469,24.8467
Emergency,25571.213265,25.7299
Elective,25655.466727,25.3673


Section 6: Billing and Room Data

Billing Amount Overview

I assess the lowest, highest, and average billing amounts in the dataset to understand the overall cost distribution and identify potential outliers in hospital charges.

In [40]:
%%sql
SELECT
    MIN(Billing_amount) AS Lowest_Billing_amount,
    MAX(Billing_amount) AS Highest_Billing_amount,
    AVG(Billing_amount)AS Avg_Billing_amount
FROM
    Patients_clean;

Lowest_Billing_amount,Highest_Billing_amount,Avg_Billing_amount
9.0,52764.0,25588.411505


Identifying Billing Outliers

I check for suspicious or extreme billing amounts, such as negative values or unusually high charges, to spot potential data entry errors or anomalies.

In [137]:
%%sql
SELECT
    Billing_amount
FROM
    Patients_clean
WHERE 
    Billing_amount<0 OR Billing_amount>100000;

Billing_amount


Room Utilization Analysis

I analyze the number of patients per room each year to identify patterns of underutilization or overutilization, helping to assess how hospital resources are being used over time.

In [43]:
%%sql
SELECT
    YEAR(Admission_date) AS Time,
    Room_number,
    COUNT(Patient_id) AS Tot_patient_per_room -- Total number of patients who used the room
FROM
    Patients_clean
GROUP BY 
    Room_number,
    YEAR(Admission_date) -- Group by year to see room utilization for each year
ORDER BY 
    YEAR(Admission_date),
    COUNT(Patient_id) DESC;

Time,Room_number,Tot_patient_per_room
2019,433,31
2019,497,29
2019,481,28
2019,439,28
2019,423,28
2019,286,27
2019,432,27
2019,171,27
2019,161,26
2019,367,26


Average Room Utilization Per Year

I calculate the yearly average number of patients per room to track trends in room usage over time, highlighting periods of higher or lower hospital resource demand.

In [52]:
%%sql
SELECT
    Time, -- Include the year to get a yearly view of room utilization
    AVG(Tot_patient_per_room) AS AVG_patient_per_room
FROM(
SELECT
    YEAR(Admission_date) AS Time,
    Room_number,
    COUNT(patient_id) AS Tot_patient_per_room
FROM
    Patients_clean
GROUP BY
    Room_number,
    YEAR(Admission_date))AS t
GROUP BY
    Time
ORDER BY
    Time; -- Chronological order, matching the output below


Time,AVG_patient_per_room
2019,16.98
2020,26.665
2021,25.74
2022,26.0475
2023,26.245
2024,9.4875
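
The query aggregates in two steps: first a count per room and year, then an average of those counts per year. The sketch below reproduces that logic in plain Python on a handful of made-up records, purely to illustrate the shape of the computation; the sample data is hypothetical and unrelated to the real table.

```python
from collections import Counter, defaultdict

# Hypothetical (admission_year, room_number) pairs, one per patient record
records = [
    (2019, 101), (2019, 101), (2019, 102),
    (2020, 101), (2020, 102), (2020, 102), (2020, 102),
]

# Inner query: COUNT(patient_id) grouped by room and year
per_room_year = Counter(records)

# Outer query: AVG of those counts grouped by year
counts_by_year = defaultdict(list)
for (year, room), patient_count in per_room_year.items():
    counts_by_year[year].append(patient_count)

avg_by_year = {
    year: sum(counts) / len(counts)
    for year, counts in sorted(counts_by_year.items())
}
print(avg_by_year)  # {2019: 1.5, 2020: 2.0}
```

Note that a room with no admissions in a given year produces no row in the inner query, so the average covers only rooms that were used at least once that year.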


Room Utilization Classification

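I categorize each room's yearly patient count as over-utilized (more than 20 patients), under-utilized (fewer than 10), or well-utilized (10 to 20). This helps quickly identify which rooms may require resource adjustments or better management. Because the counts are per room per year, the same room can appear more than once.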

In [53]:
%%sql
SELECT
    Room_number,
    Tot_patient_per_room,
CASE 
    WHEN Tot_patient_per_room>20 THEN 'over_utilized'
    WHEN Tot_patient_per_room<10 THEN 'under_utilized'
    ELSE 'well_utilized' END AS Utilization
FROM(
SELECT
    YEAR(Admission_date) AS Time,
    Room_number,
    COUNT(patient_id) AS Tot_patient_per_room
FROM
    Patients_clean
GROUP BY 
    Room_number,
    Time) AS t
ORDER BY 
    Tot_patient_per_room DESC
LIMIT 10;


Room_number,Tot_patient_per_room,Utilization
449,43,over_utilized
431,43,over_utilized
274,42,over_utilized
368,42,over_utilized
284,42,over_utilized
420,42,over_utilized
197,42,over_utilized
491,42,over_utilized
458,41,over_utilized
393,41,over_utilized
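
The boundary behaviour of the `CASE` expression is easy to misread, so here is a small Python sketch of the same rule. The function name `classify_utilization` and its `high` and `low` parameters are assumptions for illustration; the thresholds 20 and 10 come from the query above.

```python
def classify_utilization(patient_count, high=20, low=10):
    """Mirror the CASE expression: > high is over, < low is under, else well."""
    if patient_count > high:
        return "over_utilized"
    if patient_count < low:
        return "under_utilized"
    return "well_utilized"


# Counts of exactly 10 or 20 fall into the middle category
for count in (43, 20, 10, 9):
    print(count, classify_utilization(count))
```

Both comparisons are strict, so rooms with exactly 10 or exactly 20 patients in a year are classified as well-utilized.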


Section 7: Blood Type and Medical Conditions

I examine the number of patients for each blood type to understand the prevalence of different blood groups within the hospital population.

In [185]:
%%sql
SELECT
    Blood_type_code AS Blood_group,
    COUNT(patient_id) AS Nbr_of_patients
FROM
    Patients_clean
GROUP BY
    Blood_type_code
ORDER BY
    Nbr_of_patients DESC;

Blood_group,Nbr_of_patients
A+,6594
AB+,6573
A-,6566
O+,6560
B-,6556
B+,6556
AB-,6550
O-,6511


Most Common Medical Conditions

I analyze the frequency of medical conditions among patients to identify the most prevalent health issues treated in the hospital.

In [186]:
%%sql
SELECT
    Medical_condition,
    COUNT(patient_id) AS Nbr_of_patients
FROM
    Patients_clean
GROUP BY
    Medical_condition
ORDER BY
    Nbr_of_patients DESC;

Medical_condition,Nbr_of_patients
Diabetes,8822
Arthritis,8800
Hypertension,8754
Cancer,8705
Asthma,8702
Obesity,8683


Medical Conditions by Hospital

I examine how medical conditions are distributed across hospitals to detect patterns and identify which facilities handle specific health issues more frequently.

In [22]:
%%sql
SELECT
    H.Hospital_name,
    P.Medical_condition,
    COUNT(P.patient_id) AS Nbr_of_patients
FROM 
    Patients_clean P
JOIN
    Hospital_ref H
ON
    P.Hospital_id=H.Hospital_id
GROUP BY 
    P.Medical_condition,
    H.Hospital_name
ORDER BY
    COUNT(P.patient_id) DESC,
    H.Hospital_name
LIMIT 10;

Hospital_name,Medical_condition,Nbr_of_patients
smith plc,Arthritis,12
llc smith,Diabetes,11
llc smith,Hypertension,11
ltd smith,Obesity,11
smith llc,Arthritis,11
group smith,Diabetes,10
plc williams,Cancer,10
inc smith,Arthritis,9
johnson group,Hypertension,9
johnson inc,Cancer,9


Medical Conditions by Doctor

I analyze which doctors treat specific medical conditions most frequently to identify patterns in specialization and workload distribution.

In [192]:
%%sql
SELECT
    D.Doctor_name,
    P.Medical_condition,
    COUNT(P.patient_id) AS Nbr_of_patients
FROM
    Patients_clean P
JOIN
    Doctor_ref D
ON
    P.Doctor_id=D.Doctor_id
GROUP BY 
    P.Medical_condition,
    D.Doctor_name
ORDER BY 
    COUNT(P.patient_id) DESC,
    D.Doctor_name
LIMIT 10;

Doctor_name,Medical_condition,Nbr_of_patients
John Smith,Arthritis,8
Andrew Williams,Obesity,7
Michael Smith,Hypertension,7
Christopher Brown,Diabetes,6
James Johnson,Hypertension,6
Michael Johnson,Cancer,6
Michael Smith,Diabetes,6
Amy Martin,Arthritis,5
Christopher Williams,Cancer,5
Daniel Jones,Arthritis,5


Section 8: Admission & Discharge Dates

Patient Length of Stay Analysis

I calculate the shortest, longest, and average lengths of stay to understand patient hospitalization patterns and assess hospital resource usage. Records whose discharge date is not later than the admission date are excluded.

In [56]:
%%sql
SELECT
    MAX(Stay_length) AS Longest_stay,
    MIN(Stay_length) AS Shortest_stay,
    AVG(Stay_length) AS Average_stay
FROM
    (SELECT
    Patient_id,
    DATEDIFF(Discharge_date,Admission_date) AS Stay_length
FROM
    Patients_clean
WHERE Admission_date<discharge_date)AS Stay;

Longest_stay,Shortest_stay,Average_stay
354,1,43.0974


Pending Date Reviews

I check how many patient records still require date verification, ensuring data accuracy before further analysis or reporting.

In [204]:
%%sql
SELECT
    SUM(needs_date_review) AS Patient_date_to_be_reviewed
FROM
    Patients_clean;

Patient_date_to_be_reviewed
6277


Monthly Admission Patterns

I analyze average patient stay lengths by month and year to identify seasonal trends and patterns in hospital admissions.

In [58]:
%%sql
SELECT
    YEAR(Admission_date) AS YEARS,
    MONTH(Admission_date) AS Months,
    AVG(DATEDIFF(Discharge_date,Admission_date)) AS Avg_Stay_length
FROM
    Patients_clean
WHERE
    Admission_date<discharge_date 
GROUP BY 
    YEAR(Admission_date),
    MONTH(Admission_date)
ORDER BY 
    YEARS ASC,
    Avg_stay_length DESC;



YEARS,Months,Avg_Stay_length
2019,12,63.163
2019,5,41.4326
2019,6,36.9423
2019,7,30.61
2019,8,26.0466
2019,9,20.4178
2019,10,18.8145
2019,11,15.8195
2020,1,70.9834
2020,2,62.4814


Average Patient Stay

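I calculate the overall average length of stay as a baseline for spotting hospitals or cases with unusually long stays, which may indicate resource strain or complex treatments. This query applies no date filter, so the result (25.31 days) is lower than the 43.10-day average in the Patient Length of Stay Analysis, which excludes records whose discharge date is not later than the admission date.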

In [63]:
%%sql
SELECT
    AVG(DATEDIFF(Discharge_date,Admission_date)) AS Avg_Stay_length
FROM
    Patients_clean;

Avg_Stay_length
25.3129
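
To see how much a date filter can move an average, the sketch below computes stay lengths with Python's `datetime` on three sample records. The first pair matches the sample row shown at the top of the notebook; the other two are hypothetical, and `stay_length` is an illustrative helper that plays the role of `DATEDIFF(Discharge_date, Admission_date)`.

```python
from datetime import date

# (admission_date, discharge_date) pairs
stays = [
    (date(2024, 1, 31), date(2024, 2, 2)),   # 2 days (sample row P75010)
    (date(2023, 5, 1), date(2023, 5, 11)),   # 10 days (hypothetical)
    (date(2023, 8, 20), date(2023, 8, 14)),  # -6 days (hypothetical bad record)
]


def stay_length(admission, discharge):
    """Days between admission and discharge; negative if the dates are reversed."""
    return (discharge - admission).days


all_lengths = [stay_length(a, d) for a, d in stays]
valid_lengths = [stay_length(a, d) for a, d in stays if a < d]

avg_all = sum(all_lengths) / len(all_lengths)        # no filter
avg_valid = sum(valid_lengths) / len(valid_lengths)  # admission < discharge only
print(avg_all, avg_valid)  # 2.0 6.0
```

A single reversed pair of dates pulls the unfiltered average well below the filtered one, which is why it matters to state which version of the metric is being reported.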


Hospitals With Unusually Long Stays

I identify hospitals where the average patient stay exceeds 180 days per year, highlighting cases with unusually long hospitalizations that may indicate complex treatments or operational issues.

In [82]:
%%sql
SELECT
    Years,
    Hospital,
    Avg_stay_length
FROM
    (SELECT
    H.Hospital_name AS Hospital,
    YEAR(P.Admission_date) AS Years,
    AVG(DATEDIFF(P.Discharge_date,P.Admission_date)) AS Avg_Stay_length
FROM
    Patients_clean P
JOIN
    Hospital_ref H
ON
    P.Hospital_id=H.Hospital_id
GROUP BY
    H.Hospital_name,
    YEAR(P.Admission_date) -- Group by year to spot which hospitals had unusually long stays each year
) AS t
WHERE Avg_stay_length > 180 -- An average stay over 180 days is treated as unusual
ORDER BY
    Years, -- Order by year so each year's unusual hospitals appear together
    Avg_Stay_length DESC;

Years,Hospital,Avg_stay_length
2019,and castaneda romero powell,352.0
2019,and williams hart lucas,351.0
2019,barker sanders thomas and,349.0
2019,diaz and lewis moody,349.0
2019,jonessmith,349.0
2019,reed sullivan larson and,349.0
2019,allen harrington grant and,347.0
2019,nash and macias levine,346.0
2019,mora plc,345.0
2019,poole owens parsons and,345.0


Admission Type

Average Stay by Admission Type

I calculate the average length of stay for each admission type to understand how patient stay durations vary based on the nature of their admission.

In [88]:
%%sql
SELECT
    A.Admission_type,
    AVG(DATEDIFF(P.Discharge_date,P.Admission_date)) AS Avg_Stay_length
FROM
    Patients_clean P
JOIN 
    Admission_type_ref A
ON 
    P.Admission_type_id=A.Admission_type_id
GROUP BY 
   A.Admission_type;

Admission_type,Avg_Stay_length
Urgent,24.8467
Emergency,25.7299
Elective,25.3673


Admission Types With Unusually Long Stays

I identify admission types where the average patient stay exceeds 35 days per year, highlighting cases that may require special attention or indicate more complex care requirements.

In [95]:
%%sql
SELECT
    Years,
    Admission_type,
    Avg_stay_length
FROM
    (SELECT
    A.Admission_type AS Admission_type,
    YEAR(P.Admission_date) AS Years,
    AVG(DATEDIFF(P.Discharge_date,P.Admission_date)) AS Avg_Stay_length
FROM
    Patients_clean P
JOIN
    Admission_type_ref A
ON
    P.Admission_type_id=A.Admission_type_id
GROUP BY
    A.Admission_type,
    YEAR(P.Admission_date) -- Group by year to spot which admission types had unusually long stays each year
) AS t
WHERE Avg_stay_length > 35 -- An average stay over 35 days is treated as unusual
ORDER BY
    Years, -- Order by year so each year's unusual admission types appear together
    Avg_Stay_length DESC;

Years,Admission_type,Avg_stay_length
2024,Emergency,55.2899
2024,Urgent,54.8155
2024,Elective,48.7048
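
Filtering on an aggregate through a derived table and an outer `WHERE` can also be written with a `HAVING` clause. The sketch below shows that alternative on a tiny in-memory table. It uses SQLite through Python's `sqlite3` module so that it runs without a MySQL server, which means `julianday` and `strftime` stand in for MySQL's `DATEDIFF` and `YEAR`. The table `stays` and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stays (admission_type TEXT, admission_date TEXT, discharge_date TEXT)"
)
conn.executemany(
    "INSERT INTO stays VALUES (?, ?, ?)",
    [
        ("Urgent", "2024-01-01", "2024-02-20"),    # 50 days
        ("Urgent", "2024-03-01", "2024-04-10"),    # 40 days
        ("Elective", "2024-01-01", "2024-01-11"),  # 10 days
        ("Elective", "2023-06-01", "2023-08-10"),  # 70 days
    ],
)

# HAVING filters the groups directly, so no derived table is needed
query = """
SELECT
    strftime('%Y', admission_date) AS years,
    admission_type,
    AVG(julianday(discharge_date) - julianday(admission_date)) AS avg_stay_length
FROM stays
GROUP BY strftime('%Y', admission_date), admission_type
HAVING AVG(julianday(discharge_date) - julianday(admission_date)) > 35
ORDER BY years, avg_stay_length DESC
"""
long_stays = conn.execute(query).fetchall()
conn.close()
print(long_stays)  # [('2023', 'Elective', 70.0), ('2024', 'Urgent', 45.0)]
```

In MySQL the same idea would keep `YEAR(...)` and `DATEDIFF(...)` and move the `> 35` condition into `HAVING`; both forms should return the same rows, so the choice is mostly about readability.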


Section 9: Medication and Test Results

Yearly Medication Usage

I analyze which medications are most frequently prescribed each year to identify trends in treatment patterns and commonly used drugs over time.

In [100]:
%%sql
SELECT
    YEAR(Admission_date) AS Years,
    Medication,
    COUNT(Patient_id) AS Number_of_patient
FROM
    Patients_clean
GROUP BY
    Years,
    Medication
ORDER BY
    Years,
    Number_of_patient DESC;

Years,Medication,Number_of_patient
2019,Penicillin,1433
2019,Lipitor,1365
2019,Aspirin,1346
2019,Paracetamol,1344
2019,Ibuprofen,1304
2020,Paracetamol,2195
2020,Penicillin,2139
2020,Lipitor,2134
2020,Ibuprofen,2105
2020,Aspirin,2093


Medication Patterns by Medical Condition

I examine which medications are prescribed for specific medical conditions each year to uncover treatment patterns and trends in patient care.

In [103]:
%%sql
SELECT
    YEAR(Admission_date) AS Years,
    Medical_condition,
    Medication,
    COUNT(Medication) AS Times
FROM
    Patients_clean
GROUP BY
    Years,
    Medical_condition,
    Medication
ORDER BY
    Years,
    Medical_condition,
    Times DESC;

Years,Medical_condition,Medication,Times
2019,Arthritis,Penicillin,235
2019,Arthritis,Aspirin,223
2019,Arthritis,Lipitor,222
2019,Arthritis,Ibuprofen,214
2019,Arthritis,Paracetamol,213
2019,Asthma,Aspirin,269
2019,Asthma,Paracetamol,242
2019,Asthma,Penicillin,239
2019,Asthma,Lipitor,238
2019,Asthma,Ibuprofen,201


Medication Patterns by Admission Type

I analyze which medications are most frequently used for each admission type annually to uncover trends in treatment approaches based on patient admission categories.

In [106]:
%%sql
SELECT
    YEAR(P.Admission_date) AS Years,
    A.Admission_type,
    P.Medication,
    COUNT(P.Medication) AS Amount
FROM
    Patients_clean P
JOIN
    Admission_type_ref A
ON
  P.Admission_type_id=A.Admission_type_id  
GROUP BY
    Years,
    A.admission_type,
    P.Medication
ORDER BY
    Years,
    A.Admission_type,
    Amount DESC;

Years,Admission_type,Medication,Amount
2019,Elective,Penicillin,513
2019,Elective,Ibuprofen,469
2019,Elective,Aspirin,454
2019,Elective,Lipitor,443
2019,Elective,Paracetamol,442
2019,Emergency,Lipitor,445
2019,Emergency,Penicillin,445
2019,Emergency,Aspirin,435
2019,Emergency,Paracetamol,421
2019,Emergency,Ibuprofen,401


Test Result Patterns by Medical Condition

I examine how test results vary for different medical conditions each year to identify diagnostic trends and patterns in patient care.

In [108]:
%%sql
SELECT
    YEAR(Admission_date) AS Years,
    Medical_condition,
    Test_results,
    COUNT(Test_results) AS Times
FROM
    Patients_clean
GROUP BY
    Years,
    Medical_condition,
    Test_results
ORDER BY
    Years,
    Medical_condition,
    Times DESC;

Years,Medical_condition,Test_results,Times
2019,Arthritis,Inconclusive,383
2019,Arthritis,Abnormal,375
2019,Arthritis,Normal,349
2019,Asthma,Normal,412
2019,Asthma,Abnormal,412
2019,Asthma,Inconclusive,365
2019,Cancer,Abnormal,409
2019,Cancer,Inconclusive,408
2019,Cancer,Normal,363
2019,Diabetes,Normal,389


Test Result Patterns by Admission Type

I analyze how test results vary across different admission types each year to uncover trends in diagnostics and patient treatment patterns.

In [109]:
%%sql
SELECT
    YEAR(P.Admission_date) AS Years,
    A.Admission_type,
    P.Test_results,
    COUNT(P.Test_results) AS Times
FROM
    Patients_clean P
JOIN
    Admission_type_ref A
ON
  P.Admission_type_id=A.Admission_type_id  
GROUP BY
    Years, -- Grouped by year to reveal the pattern; month could be used instead.
    A.admission_type,
    P.Test_results
ORDER BY
    Years, -- Sorted by year for the same reason; month would also work.
    A.Admission_type,
    Times DESC;

Years,Admission_type,Test_results,Times
2019,Elective,Abnormal,784
2019,Elective,Normal,783
2019,Elective,Inconclusive,754
2019,Emergency,Abnormal,736
2019,Emergency,Normal,713
2019,Emergency,Inconclusive,698
2019,Urgent,Abnormal,779
2019,Urgent,Inconclusive,779
2019,Urgent,Normal,766
2020,Elective,Inconclusive,1171
