In [4]:
import pandas as pd

df = pd.read_csv("blood_pressure_data.csv")
df.head()


Unnamed: 0,Date,Time,Patient First Name,Patient Last Name,PCP,Clinic Visit Provider,Clinic,Systolic BP reading,Diastolc BP reading
0,6/23/2021,11:41:00 AM,A,A,Dr. Clark,Dr. Clark,GIM,149,89
1,10/26/2021,1:45:00 PM,A,A,Dr. Clark,Dr. Basit,Parkland Cardiology,141,64
2,9/11/2021,1:05:00 PM,C,A,Dr. Clark,Dr. Basit,UTSW Cardiology,110,67
3,9/11/2021,1:07:00 PM,C,A,Dr. Clark,Dr. Basit,UTSW Cardiology,110,63
4,1/15/2021,10:33:00 AM,G,A,Dr. Fish,Dr. Basit,Parkland Cardiology,168,103


In [5]:
df.columns


Index(['Date', 'Time', 'Patient First Name', 'Patient Last Name', 'PCP',
       'Clinic Visit Provider', 'Clinic', 'Systolic BP reading',
       'Diastolc BP reading'],
      dtype='str')

In [6]:
df.shape


(154, 9)

In [7]:
df.isna().sum()


Date                     0
Time                     0
Patient First Name       0
Patient Last Name        0
PCP                      0
Clinic Visit Provider    0
Clinic                   0
Systolic BP reading      0
Diastolc BP reading      0
dtype: int64

In [8]:
# Create a patient ID by joining the first- and last-name fields
df["Patient_ID"] = df["Patient First Name"] + "_" + df["Patient Last Name"]

df["Patient_ID"].head()


0    A_A
1    A_A
2    C_A
3    C_A
4    G_A
Name: Patient_ID, dtype: str

In [14]:
# Categorize each reading as Low, Elevated, or Normal.
# The Low check runs first, so it takes precedence over Elevated.
def bp_category(sbp, dbp):
    if (sbp < 90) or (dbp < 60):
        return "Low"
    if (sbp >= 130) or (dbp >= 80):
        return "Elevated"
    return "Normal"

df["BP_Category"] = df.apply(
    lambda row: bp_category(row["Systolic BP reading"], row["Diastolc BP reading"]),
    axis=1
)

df["BP_Category"].value_counts()



BP_Category
Elevated    101
Low          40
Normal       13
Name: count, dtype: int64

In [15]:
# Identify elevated readings
df["is_elevated"] = df["BP_Category"] == "Elevated"

# Group by patient and check if they EVER had an elevated reading
patient_elevated = (
    df.groupby("Patient_ID")["is_elevated"]
      .any()
)

total_patients = df["Patient_ID"].nunique()
elevated_patients = patient_elevated.sum()

print("Total patients:", total_patients)
print("Patients with ≥1 elevated reading:", elevated_patients)
print("Percent elevated:", round(100 * elevated_patients / total_patients, 1), "%")


Total patients: 74
Patients with ≥1 elevated reading: 58
Percent elevated: 78.4 %


## Part 1: How Many Patients Have Elevated Blood Pressure?

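Each blood pressure reading was categorized as Low, Elevated, or Normal using fixed thresholds: Low if systolic < 90 or diastolic < 60; otherwise Elevated if systolic ≥ 130 or diastolic ≥ 80; otherwise Normal. Because the Low check runs first, a reading that meets both the Low and Elevated criteria is labeled Low.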
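
The same rules as the notebook's `bp_category` function can also be applied column-wise instead of row by row with `apply`. The following is a minimal sketch, assuming NumPy is available; the `toy` DataFrame and its values are made up for illustration and are not taken from `blood_pressure_data.csv`.

```python
import numpy as np
import pandas as pd

# Toy readings for illustration only; not from the real dataset.
toy = pd.DataFrame({
    "Systolic BP reading": [85, 120, 141, 150],
    "Diastolc BP reading": [70, 75, 64, 55],
})

sbp = toy["Systolic BP reading"]
dbp = toy["Diastolc BP reading"]

# np.select picks the choice for the first condition that is True,
# so listing the Low condition first mirrors the order of checks in bp_category.
toy["BP_Category"] = np.select(
    [(sbp < 90) | (dbp < 60), (sbp >= 130) | (dbp >= 80)],
    ["Low", "Elevated"],
    default="Normal",
)

print(toy)
```

In this toy data the last row (150/55) is labeled Low even though its systolic value is above 130, because the Low condition is evaluated first.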

A patient was considered to have elevated blood pressure if they had at least one reading categorized as "Elevated." 

After grouping readings by patient and checking whether each patient ever had an elevated reading:

- Total unique patients: 74  
- Patients with ≥1 elevated reading: 58  
- Percent of patients with elevated blood pressure: 78.4%

This step converts reading-level data into a patient-level classification: readings are grouped by `Patient_ID`, and `.any()` flags each patient who has at least one elevated reading.
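
To see how individual readings roll up to one row per patient, the group-by can return a few summary columns at once. This is an illustrative sketch on made-up data, not part of the original analysis; the `toy` DataFrame and the column names `n_readings`, `n_elevated`, and `ever_elevated` are assumptions introduced here.

```python
import pandas as pd

# Toy data for illustration only; IDs and categories are made up.
toy = pd.DataFrame({
    "Patient_ID": ["A_A", "A_A", "C_A", "C_A", "G_A"],
    "BP_Category": ["Elevated", "Normal", "Normal", "Low", "Elevated"],
})
toy["is_elevated"] = toy["BP_Category"] == "Elevated"

# One row per patient: number of readings, number of elevated readings,
# and whether the patient ever had an elevated reading.
per_patient = toy.groupby("Patient_ID")["is_elevated"].agg(
    n_readings="count",
    n_elevated="sum",
    ever_elevated="any",
)
print(per_patient)

# The mean of a boolean column is the share of True values.
pct = round(100 * per_patient["ever_elevated"].mean(), 1)
print("Percent elevated:", pct, "%")
```

Keeping `n_readings` next to `ever_elevated` makes it easier to check whether a patient was flagged on the basis of a single reading or several.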
