# **HEALTHCARE ANALYTICS**

You've joined HealthTech Analytics as a junior data engineer. The clinical team built a normalized transactional database (3NF), but analytics queries are slow.

Your job: analyze the OLTP schema, identify performance issues, then design and build an optimized star schema.

This mirrors real-world data engineering work.

## **IMPORT PACKAGES**

In [1]:
#Import os and sys
import os
import sys 
from pathlib import Path


#Extracts the root path of the project and appends it to sys.path
project_root = Path().resolve().parent
sys.path.append(str(project_root))

#Imports loadEnv from config module
from config.config import loadEnv

#Imports read_sql_file from the Read_Files.readFile module
from Read_Files.readFile import read_sql_file
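
The `config.config` and `Read_Files.readFile` modules are not shown in this notebook. As a rough sketch only, the two helpers might look something like the following, assuming `loadEnv` returns a named environment variable (after the `.env` file has been loaded into the environment) and `read_sql_file` returns the script text. The names `load_env_sketch` and `read_sql_file_sketch` are hypothetical and are not the project's actual functions.

```python
import os
from pathlib import Path


def load_env_sketch(key, default=None):
    """Hypothetical stand-in for loadEnv: return an environment variable, or default if unset."""
    # Assumption: the .env file has already been loaded into os.environ
    # (for example by python-dotenv's load_dotenv()).
    return os.environ.get(key, default)


def read_sql_file_sketch(path):
    """Hypothetical stand-in for read_sql_file: return the SQL script as one stripped string."""
    return Path(path).read_text(encoding="utf-8").strip()
```

Returning `None` for a missing key is what makes the truthiness checks in the credentials cell meaningful.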

## **BYPASS KEY ERROR**

In [2]:
#Resolves the KeyError by bypassing the missing DEFAULT key
%config SqlMagic.style = '_DEPRECATED_DEFAULT'

## **LOAD CREDENTIALS**

In [3]:
#Loads the password from .env
password = loadEnv("password")
print("\033[92mPassword successfully loaded\033[0m\n" if password else "Password Not Found\n")

#Loads the database name from .env
database = loadEnv("database")
print("\033[92mDatabase successfully loaded\033[0m" if database else "Database Not Found")

Password successfully loaded

Database successfully loaded


## **SERVER CONNECTION AND DATABASE CREATION**

Connects to the MySQL server, creates the database if it does not already exist, and then connects to that database.

In [4]:
%load_ext sql
#Connects to MySQL server
%sql mysql+pymysql://root:$password@localhost:3306/

#Creates a python string to be passed to sql
sql = f"CREATE DATABASE IF NOT EXISTS `{database}`;"
%sql $sql

#Connects to the newly created database
%sql mysql+pymysql://root:$password@localhost:3306/$database


 * mysql+pymysql://root:***@localhost:3306/
1 rows affected.
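
The connection strings above interpolate the password directly into the URL. If the password ever contains reserved characters such as `@`, `/`, or `:`, the URL may be parsed incorrectly. A minimal sketch of percent-encoding the credentials with the standard library follows; `build_mysql_url` is a hypothetical helper and the password shown is made up.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host="localhost", port=3306, database=""):
    """Build a mysql+pymysql URL with the credentials percent-encoded."""
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Example with a made-up password containing reserved characters.
url = build_mysql_url("root", "p@ss/word", database="healthcare")
print(url)  # mysql+pymysql://root:p%40ss%2Fword@localhost:3306/healthcare
```

The resulting string could then be passed to `%sql` in the same way the cell above passes `$sql`.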


## **CHECK THE CONNECTED DATABASE**

In [5]:
%%sql 

#Shows the database we are currently connected to
SELECT DATABASE();

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
1 rows affected.


DATABASE()
healthcare


## **ONLINE TRANSACTION PROCESSING (OLTP)**

## **TABLE CREATION**

Creates the patients, specialties, departments, providers, encounters, diagnoses, encounter diagnoses, procedures, encounter procedures, and billing tables.

### **PATIENTS TABLE**

In [6]:
#Reads the sql script
patientsTableContent = read_sql_file(project_root / "OLTP" / "DDL" / "Create_Tables" / "patientsTable.sql")

#Creates the patients table
%sql $patientsTableContent;


   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
0 rows affected.


[]

### **SPECIALTIES TABLE**

In [7]:
#Reads the sql script
specialtiesTableContent = read_sql_file(project_root / "OLTP" / "DDL" / "Create_Tables" / "specialtiesTable.sql")

#Creates the specialties table
%sql $specialtiesTableContent;


   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
0 rows affected.


[]

### **DEPARTMENTS TABLE**

In [8]:
#Reads the sql script
departmentsTableContent = read_sql_file(project_root / "OLTP" / "DDL" / "Create_Tables" / "departmentsTable.sql")

#Creates the departments table
%sql $departmentsTableContent;


   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
0 rows affected.


[]

### **PROVIDERS TABLE**

In [9]:
#Reads the sql script
providersTableContent = read_sql_file(project_root / "OLTP" / "DDL" / "Create_Tables" / "providersTable.sql")

#Creates the providers table
%sql $providersTableContent;


   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
0 rows affected.


[]

### **ENCOUNTERS TABLE**

In [10]:
#Reads the sql script
encountersTableContent = read_sql_file(project_root / "OLTP" / "DDL" / "Create_Tables" / "encountersTable.sql")

#Creates the encounters table
%sql $encountersTableContent;


   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
0 rows affected.


[]

### **DIAGNOSES TABLE**

In [11]:
#Reads the sql script
diagnosesTableContent = read_sql_file(project_root / "OLTP" / "DDL" / "Create_Tables" / "diagnosesTable.sql")

#Creates the diagnoses table
%sql $diagnosesTableContent;


   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
0 rows affected.


[]

### **ENCOUNTER DIAGNOSES TABLE**

In [12]:
#Reads the sql script
encounterDiagnosesTableContent = read_sql_file(project_root / "OLTP" / "DDL" / "Create_Tables" / "encounterDiagnosesTable.sql")

#Creates the encounter_diagnoses table
%sql $encounterDiagnosesTableContent;


   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
0 rows affected.


[]

### **PROCEDURES TABLE**

In [13]:
#Reads the sql script
proceduresTableContent = read_sql_file(project_root / "OLTP" / "DDL" / "Create_Tables" / "proceduresTable.sql")

#Creates the procedures table
%sql $proceduresTableContent;


   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
0 rows affected.


[]

### **ENCOUNTER PROCEDURES TABLE**

In [14]:
#Reads the sql script
encounterProceduresTableContent = read_sql_file(project_root / "OLTP" / "DDL" / "Create_Tables" / "encountersProceduresTable.sql")

#Creates the encounter_procedures table
%sql $encounterProceduresTableContent;


   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
0 rows affected.


[]

### **BILLING TABLE**

In [15]:
#Reads the sql script
billingTableContent = read_sql_file(project_root / "OLTP" / "DDL" / "Create_Tables" / "billingTable.sql")

#Creates the billing table
%sql $billingTableContent;


   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
0 rows affected.


[]

### **DISPLAY ALL TABLES IN THE DATABASE**

In [16]:
%sql SHOW TABLES;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
10 rows affected.


Tables_in_healthcare
billing
departments
diagnoses
encounter_diagnoses
encounter_procedures
encounters
patients
procedures
providers
specialties
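
The ten table-creation cells above all follow the same read-then-execute pattern. As a sketch only, the pattern could be folded into one loop. The file names are copied from the cells above; `run_scripts` and its `execute` callback are hypothetical, and the ordering comment assumes the DDL declares foreign keys, which this notebook does not show.

```python
from pathlib import Path

# Same order as the cells above. Assumption: if the DDL declares foreign keys,
# parent tables must be created before the tables that reference them.
DDL_FILES = [
    "patientsTable.sql",
    "specialtiesTable.sql",
    "departmentsTable.sql",
    "providersTable.sql",
    "encountersTable.sql",
    "diagnosesTable.sql",
    "encounterDiagnosesTable.sql",
    "proceduresTable.sql",
    "encountersProceduresTable.sql",
    "billingTable.sql",
]


def run_scripts(directory, file_names, execute):
    """Read each script in order and hand its text to `execute`; return the names that ran."""
    ran = []
    for name in file_names:
        script = (Path(directory) / name).read_text(encoding="utf-8")
        execute(script)  # caller decides how the SQL is actually run
        ran.append(name)
    return ran
```

In the notebook, `execute` might be a small function that forwards the script to the SQL magic, for example through `get_ipython().run_cell_magic("sql", "", script)`. That wiring is an assumption and was not part of the original workflow.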


## **INSERT VALUES INTO THE DATABASE**

### **INSERT INTO SPECIALTIES**

In [17]:
#Reads the sql script
specialtiesContent = read_sql_file(project_root / "OLTP" / "DML" / "Insert_Data" / "insertIntoSpecialties.sql")

#Inserts rows into the specialties table
%sql $specialtiesContent;



   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
3 rows affected.


[]

In [18]:
%%sql

#Reads all the data in the specialties table

SELECT * FROM `specialties`;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
3 rows affected.


specialty_id,specialty_name,specialty_code
1,Cardiology,CARD
2,Internal Medicine,IM
3,Emergency,ER


### **INSERT INTO DEPARTMENTS**

In [19]:
#Reads the sql script
departmentsContent = read_sql_file(project_root / "OLTP" / "DML" / "Insert_Data" / "insertIntoDepartments.sql")

#Inserts rows into the departments table
%sql $departmentsContent;



   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
3 rows affected.


[]

In [20]:
%%sql

#Reads all the data in the departments table

SELECT * FROM `departments`;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
3 rows affected.


department_id,department_name,floor,capacity
1,Cardiology Unit,3,20
2,Internal Medicine,2,30
3,Emergency,1,45


### **INSERT INTO PROVIDERS**

In [21]:
#Reads the sql script
providersContent = read_sql_file(project_root / "OLTP" / "DML" / "Insert_Data" / "insertIntoProviders.sql")

#Inserts rows into the providers table
%sql $providersContent;



   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
3 rows affected.


[]

In [22]:
%%sql

#Reads all the data in the providers table

SELECT * FROM `providers`;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
3 rows affected.


provider_id,first_name,last_name,credential,specialty_id,department_id
101,James,Chen,MD,1,1
102,Sarah,Williams,MD,2,2
103,Michael,Rodriguez,MD,3,3


### **INSERT INTO PATIENTS**

In [23]:
#Reads the sql script
patientsContent = read_sql_file(project_root / "OLTP" / "DML" / "Insert_Data" / "insertIntoPatients.sql")

#Inserts rows into the patients table
%sql $patientsContent;



   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
3 rows affected.


[]

In [24]:
%%sql

#Reads all the data in the patients table

SELECT * FROM `patients`;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
3 rows affected.


patient_id,first_name,last_name,date_of_birth,gender,mrn
1001,John,Doe,1955-03-15,M,MRN001
1002,Jane,Smith,1962-07-22,F,MRN002
1003,Robert,Johnson,1948-11-08,M,MRN003


### **INSERT INTO DIAGNOSES**

In [25]:
#Reads the sql script
diagnosesContent = read_sql_file(project_root / "OLTP" / "DML" / "Insert_Data" / "insertIntoDiagnoses.sql")

#Inserts rows into the diagnoses table
%sql $diagnosesContent;



   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
3 rows affected.


[]

In [26]:
%%sql

#Reads all the data in the diagnoses table

SELECT * FROM `diagnoses`;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
3 rows affected.


diagnosis_id,icd10_code,icd10_description
3001,I10,Hypertension
3002,E11.9,Type 2 Diabetes
3003,I50.9,Heart Failure


### **INSERT INTO PROCEDURES**

In [27]:
#Reads the sql script
proceduresContent = read_sql_file(project_root / "OLTP" / "DML" / "Insert_Data" / "insertIntoProcedures.sql")

#Inserts rows into the procedures table
%sql $proceduresContent;



   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
3 rows affected.


[]

In [28]:
%%sql

#Reads all the data in the procedures table

SELECT * FROM `procedures`;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
3 rows affected.


procedure_id,cpt_code,cpt_description
4001,99213,Office Visit
4002,93000,EKG
4003,71020,Chest X-ray


### **INSERT INTO ENCOUNTERS**

In [29]:
#Reads the sql script
encountersContent = read_sql_file(project_root / "OLTP" / "DML" / "Insert_Data" / "insertIntoEncounters.sql")

#Inserts rows into the encounters table
%sql $encountersContent;



   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
4 rows affected.


[]

In [30]:
%%sql

#Reads all the data in the encounters table

SELECT * FROM `encounters`;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
4 rows affected.


encounter_id,patient_id,provider_id,encounter_type,encounter_date,discharge_date,department_id
7001,1001,101,Outpatient,2024-05-10 10:00:00,2024-05-10 11:30:00,1
7002,1001,101,Inpatient,2024-06-02 14:00:00,2024-06-06 09:00:00,1
7003,1002,102,Outpatient,2024-05-15 09:00:00,2024-05-15 10:15:00,2
7004,1003,103,ER,2024-06-12 23:45:00,2024-06-13 06:30:00,3


### **INSERT INTO BILLING**

In [31]:
#Reads the sql script
billingContent = read_sql_file(project_root / "OLTP" / "DML" / "Insert_Data" / "insertIntoBilling.sql")

#Inserts rows into the billing table
%sql $billingContent;



   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
2 rows affected.


[]

In [32]:
%%sql

#Reads all the data in the billing table

SELECT * FROM `billing`;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
2 rows affected.


billing_id,encounter_id,claim_amount,allowed_amount,claim_date,claim_status
14001,7001,350.0,280.0,2024-05-11,Paid
14002,7002,12500.0,10000.0,2024-06-08,Paid


### **INSERT INTO ENCOUNTER DIAGNOSES**

In [33]:
#Reads the sql script
encounterDiagnosesContent = read_sql_file(project_root / "OLTP" / "DML" / "Insert_Data" / "insertIntoEncounterDiagnoses.sql")

#Inserts rows into the encounter_diagnoses table
%sql $encounterDiagnosesContent;



   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
6 rows affected.


[]

In [34]:
%%sql

#Reads all the data in the encounter diagnoses table

SELECT * FROM `encounter_diagnoses`;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
6 rows affected.


encounter_diagnosis_id,encounter_id,diagnosis_id,diagnosis_sequence
8001,7001,3001,1
8002,7001,3002,2
8003,7002,3001,1
8004,7002,3003,2
8005,7003,3002,1
8006,7004,3001,1


### **INSERT INTO ENCOUNTER PROCEDURES**

In [35]:
#Reads the sql script
encounterProceduresContent = read_sql_file(project_root / "OLTP" / "DML" / "Insert_Data" / "insertIntoEncounterProcedures.sql")

#Inserts rows into the encounter_procedures table
%sql $encounterProceduresContent;



   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
4 rows affected.


[]

In [36]:
%%sql

#Reads all the data in the encounter procedures table

SELECT * FROM `encounter_procedures`;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
4 rows affected.


encounter_procedure_id,encounter_id,procedure_id,procedure_date
9001,7001,4001,2024-05-10
9002,7001,4002,2024-05-10
9003,7002,4001,2024-06-02
9004,7003,4001,2024-05-15


## **ANALYTICAL QUERIES**

### **MONTHLY ENCOUNTERS BY SPECIALTY**

In [37]:
#Reads the sql script
monthlyEncountersContent = read_sql_file(project_root / "OLTP" / "Analytical_Queries" / "monthlyEncounters.sql")

In [38]:
#Runs the monthly encounters query
%sql $monthlyEncountersContent;


   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
4 rows affected.


encounter_month,specialty_name,encounter_type,total_encounters,unique_patients
2024-05,Cardiology,Outpatient,1,1
2024-05,Internal Medicine,Outpatient,1,1
2024-06,Cardiology,Inpatient,1,1
2024-06,Emergency,ER,1,1


### **TOP DIAGNOSIS-PROCEDURE PAIRS**

In [39]:
#Reads the sql script
topDiagnosesProceduresContent = read_sql_file(project_root / "OLTP" / "Analytical_Queries" / "topDiagnosesProcedures.sql")

#Runs the top diagnosis-procedure pairs query
%sql $topDiagnosesProceduresContent;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
5 rows affected.


icd10_code,cpt_code,encounter_count
E11.9,99213,2
I10,99213,2
E11.9,93000,1
I10,93000,1
I50.9,99213,1


### **30-DAY READMISSION RATE**

In [40]:
#Reads the sql script
readmissionRateContent = read_sql_file(project_root / "OLTP" / "Analytical_Queries" / "readmissionRate.sql")

#Runs the 30-day readmission rate query
%sql $readmissionRateContent;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
1 rows affected.


specialty_name,total_discharges,readmissions,readmission_rate_percent
Cardiology,1,0,0.0
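
The definition used in `readmissionRate.sql` is not shown. The sketch below assumes one common reading: an inpatient discharge counts as readmitted when the same patient has another inpatient encounter starting within 30 days after that discharge. It runs on in-memory SQLite with trimmed tables; MySQL would use `DATEDIFF` where this uses `julianday`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed copies of the OLTP tables, loaded with the rows shown earlier.
conn.executescript("""
CREATE TABLE specialties (specialty_id INTEGER PRIMARY KEY, specialty_name TEXT);
CREATE TABLE providers (provider_id INTEGER PRIMARY KEY, specialty_id INTEGER);
CREATE TABLE encounters (
    encounter_id INTEGER PRIMARY KEY, patient_id INTEGER, provider_id INTEGER,
    encounter_type TEXT, encounter_date TEXT, discharge_date TEXT
);
INSERT INTO specialties VALUES (1,'Cardiology'),(2,'Internal Medicine'),(3,'Emergency');
INSERT INTO providers VALUES (101,1),(102,2),(103,3);
INSERT INTO encounters VALUES
    (7001,1001,101,'Outpatient','2024-05-10 10:00:00','2024-05-10 11:30:00'),
    (7002,1001,101,'Inpatient','2024-06-02 14:00:00','2024-06-06 09:00:00'),
    (7003,1002,102,'Outpatient','2024-05-15 09:00:00','2024-05-15 10:15:00'),
    (7004,1003,103,'ER','2024-06-12 23:45:00','2024-06-13 06:30:00');
""")

# Inner query: one row per inpatient stay, flagged 1 if a later inpatient stay
# for the same patient begins within 30 days of the discharge, else 0.
readmission_rows = conn.execute("""
SELECT s.specialty_name,
       COUNT(*) AS total_discharges,
       SUM(stays.readmitted) AS readmissions,
       ROUND(100.0 * SUM(stays.readmitted) / COUNT(*), 2) AS readmission_rate_percent
FROM (
    SELECT e.provider_id,
           EXISTS (
               SELECT 1
               FROM encounters r
               WHERE r.patient_id = e.patient_id
                 AND r.encounter_type = 'Inpatient'
                 AND r.encounter_id <> e.encounter_id
                 AND r.encounter_date > e.discharge_date
                 AND julianday(r.encounter_date) - julianday(e.discharge_date) <= 30
           ) AS readmitted
    FROM encounters e
    WHERE e.encounter_type = 'Inpatient'
) AS stays
JOIN providers p ON p.provider_id = stays.provider_id
JOIN specialties s ON s.specialty_id = p.specialty_id
GROUP BY s.specialty_name
""").fetchall()

print(readmission_rows)
```

With only one inpatient stay in the sample data (encounter 7002), the rate is necessarily zero. The self-referencing subquery on `encounters` is the expensive part of this pattern on larger data.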


### **REVENUE BY SPECIALTY AND MONTH**

In [41]:
#Reads the sql script
revenueBySpecialtyContent = read_sql_file(project_root / "OLTP" / "Analytical_Queries" / "revenueBySpecialty.sql")

#Runs the revenue by specialty query
%sql $revenueBySpecialtyContent;

   mysql+pymysql://root:***@localhost:3306/
 * mysql+pymysql://root:***@localhost:3306/healthcare
2 rows affected.


claim_month,specialty_name,total_allowed_amount
2024-06,Cardiology,10000.0
2024-05,Cardiology,280.0
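
The SQL in `revenueBySpecialty.sql` is not shown either. A sketch that reproduces the two rows above follows, on in-memory SQLite with trimmed tables. It assumes revenue means the sum of `allowed_amount`, grouped by the month of `claim_date`, and sorted by amount descending; the original may sort differently, for example by month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed copies of the OLTP tables, loaded with the rows shown earlier.
conn.executescript("""
CREATE TABLE specialties (specialty_id INTEGER PRIMARY KEY, specialty_name TEXT);
CREATE TABLE providers (provider_id INTEGER PRIMARY KEY, specialty_id INTEGER);
CREATE TABLE encounters (encounter_id INTEGER PRIMARY KEY, provider_id INTEGER);
CREATE TABLE billing (
    billing_id INTEGER PRIMARY KEY, encounter_id INTEGER,
    allowed_amount REAL, claim_date TEXT
);
INSERT INTO specialties VALUES (1,'Cardiology'),(2,'Internal Medicine'),(3,'Emergency');
INSERT INTO providers VALUES (101,1),(102,2),(103,3);
INSERT INTO encounters VALUES (7001,101),(7002,101),(7003,102),(7004,103);
INSERT INTO billing VALUES
    (14001,7001,280.0,'2024-05-11'),
    (14002,7002,10000.0,'2024-06-08');
""")

# Three joins are needed to get from a billing row to its specialty name.
revenue_rows = conn.execute("""
SELECT strftime('%Y-%m', b.claim_date) AS claim_month,
       s.specialty_name,
       SUM(b.allowed_amount) AS total_allowed_amount
FROM billing b
JOIN encounters e ON e.encounter_id = b.encounter_id
JOIN providers p ON p.provider_id = e.provider_id
JOIN specialties s ON s.specialty_id = p.specialty_id
GROUP BY claim_month, s.specialty_name
ORDER BY total_allowed_amount DESC
""").fetchall()

for row in revenue_rows:
    print(row)
```

Only encounters 7001 and 7002 have billing rows, and both belong to a Cardiology provider, which is why no other specialty appears.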


## **ONLINE ANALYTICAL PROCESSING (OLAP)**