In [None]:
import pandas as pd
import numpy as np

In [None]:
df = pd.read_csv('readmission.csv')

# Info for all variables

**Rosiglitazone**: Rosiglitazone is a medication used to treat type 2 diabetes. It belongs to a class of drugs called thiazolidinediones (TZDs) and helps improve insulin sensitivity in the body.

**Acarbose**: Acarbose is an oral medication used to treat type 2 diabetes. It works by slowing down the digestion and absorption of carbohydrates in the small intestine, helping to control blood sugar levels.

**Miglitol**: Miglitol is another oral medication used in the management of type 2 diabetes. It is an alpha-glucosidase inhibitor that slows the breakdown of carbohydrates in the intestines, reducing the post-meal rise in blood glucose.

**Troglitazone**: Troglitazone was a medication used to treat type 2 diabetes. However, it has been withdrawn from the market due to concerns about liver toxicity.

**Tolazamide**: Tolazamide is an oral medication used to lower blood sugar levels in people with type 2 diabetes. It belongs to the class of drugs known as sulfonylureas.

**Examide**: Examide is not a widely recognized drug name, and no specific information about it could be found. It may be a rarely used diabetes medication or a misspelled term.

**Citoglipton**: Citoglipton is likewise not a recognized drug name, and specific information about it is not readily available. It may be a misspelling, possibly of sitagliptin. Sitagliptin is used along with diet and exercise, and sometimes with other medications, to lower blood sugar levels in adults with type 2 diabetes (a condition in which blood sugar is too high because the body does not produce or use insulin normally).

**Insulin**: Insulin is a hormone produced by the pancreas that regulates blood sugar levels. In diabetes management, insulin is often used as a medication to supplement or replace the body's natural insulin production in individuals with type 1 diabetes and some cases of type 2 diabetes.


In [None]:
df['rosiglitazone'].info()

<class 'pandas.core.series.Series'>
RangeIndex: 101766 entries, 0 to 101765
Series name: rosiglitazone
Non-Null Count   Dtype 
--------------   ----- 
101766 non-null  object
dtypes: object(1)
memory usage: 795.2+ KB


To judge whether each variable is worth keeping, we compute the fraction of patients who take **rosiglitazone, acarbose, miglitol, tolazamide, examide, citoglipton, and insulin**. A patient counts as taking a medication when the column value is anything other than "No". If fewer than 0.1% of patients took a particular medication, the variable may not need to be considered in our analysis.

According to the results, only **rosiglitazone, acarbose, and insulin** are used by more than 0.1% of patients. Therefore, we will analyze the effect of these three medications in more detail.

The percentage of patients using **rosiglitazone** is approximately 6.3%

In [None]:
len(df[df['rosiglitazone'] != "No"])/len(df)

0.0625454473989348

The percentage of patients using **acarbose** is approximately 0.3%

In [None]:
len(df[df['acarbose'] != "No"])/len(df)

0.003026551107442564

The percentage of patients using **miglitol** is approximately 0.04%

In [None]:
len(df[df['miglitol'] != "No"])/len(df)

0.00037340565611304366

The percentage of patients using **tolazamide** is approximately 0.04%

In [None]:
len(df[df['tolazamide'] != "No"])/len(df)

0.0003832321207475974

The percentage of patients using **examide** is approximately 0%

In [None]:
len(df[df['examide'] != "No"])/len(df)

0.0

The percentage of patients using **citoglipton** is approximately 0%

In [None]:
len(df[df['citoglipton'] != "No"])/len(df)

0.0

The percentage of patients using **insulin** is approximately 53%

In [None]:
len(df[df['insulin'] != "No"])/len(df)

0.5343926262209382
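The per-column checks above can also be done in a single pass. The following is a minimal sketch on a small made-up DataFrame (the `toy` data and the names `THRESHOLD`, `usage`, and `keep` are hypothetical and not part of the original notebook); it assumes, as in the cells above, that any value other than "No" counts as taking the medication.

```python
import pandas as pd

# Hypothetical toy data standing in for readmission.csv (values are made up).
toy = pd.DataFrame({
    'rosiglitazone': ['No', 'Steady', 'No', 'No'],
    'examide': ['No', 'No', 'No', 'No'],
    'insulin': ['Up', 'No', 'Down', 'Steady'],
})

THRESHOLD = 0.001  # 0.1%, the cutoff used in the text

# Fraction of rows whose value is anything other than "No", per column.
usage = toy.ne('No').mean()

# Names of the columns whose usage exceeds the cutoff.
keep = usage[usage > THRESHOLD].index.tolist()

print(usage)
print(keep)
```

On the real data, the same two lines could be applied to `df[med_cols]` for whichever list of medication columns is being screened.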

In [None]:
df['insulin'].unique()

array(['No', 'Up', 'Steady', 'Down'], dtype=object)

The `insulin` column takes four distinct values: **'No', 'Up', 'Steady', and 'Down'**. Any value other than 'No' is treated here as taking insulin. The number of patients with each value is counted below.

In [None]:
print(len(df[df['insulin']=='No']))
print(len(df[df['insulin']=='Up']))
print(len(df[df['insulin']=='Steady']))
print(len(df[df['insulin']=='Down']))

47383
11316
30849
12218
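The four counts above can also be obtained with one call to `Series.value_counts`. This is a small sketch on a made-up Series (`toy_insulin` is hypothetical); on the real data the same call would be made on `df['insulin']`.

```python
import pandas as pd

# Hypothetical toy column standing in for df['insulin'] (values are made up).
toy_insulin = pd.Series(
    ['No', 'Up', 'Steady', 'No', 'Down', 'Steady', 'No'], name='insulin'
)

# Number of rows per distinct value, sorted by frequency.
counts = toy_insulin.value_counts()

# The same breakdown expressed as fractions of all rows.
shares = toy_insulin.value_counts(normalize=True)

print(counts)
print(shares)
```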


In [None]:
len(df[(df['insulin']=='No') & (df['readmitted'] != 'NO') ])

20705

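For each value of **insulin**, the fraction of all patients who have that value and were readmitted (`readmitted != 'NO'`). The denominator is the full dataset, so these are shares of all patients, not readmission rates within each insulin group.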

In [None]:
(len(df[(df['insulin']=='No') & (df['readmitted'] != 'NO') ])) / (len(df['insulin']))

0.20345695025843602

In [None]:
(len(df[(df['insulin']=='Up') & (df['readmitted'] != 'NO') ])) / (len(df['insulin']))

0.057307941748717645

In [None]:
(len(df[(df['insulin']=='Steady') & (df['readmitted'] != 'NO') ])) / (len(df['insulin']))

0.13673525538981585

In [None]:
(len(df[(df['insulin']=='Down') & (df['readmitted'] != 'NO') ])) / (len(df['insulin']))

0.06338069689287189
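The four fractions above divide by the size of the whole dataset. A readmission rate within each insulin group divides by the size of that group instead; for 'No', using the counts already shown, that is 20705 / 47383, or about 0.437. The sketch below computes both quantities side by side on a small made-up DataFrame (`toy`, `share_of_all`, and `rate_within` are hypothetical names, not part of the original notebook).

```python
import pandas as pd

# Hypothetical toy data (values are made up).
toy = pd.DataFrame({
    'insulin':    ['No', 'No', 'No', 'No', 'Up', 'Up', 'Steady', 'Down'],
    'readmitted': ['NO', '>30', 'NO', 'NO', '>30', 'NO', '>30', '>30'],
})

# True where the patient was readmitted.
readmit = toy['readmitted'].ne('NO')

# Share of ALL patients who are in each insulin group and were readmitted
# (the quantity computed in the cells above).
share_of_all = readmit.groupby(toy['insulin']).sum() / len(toy)

# Readmission rate WITHIN each insulin group.
rate_within = readmit.groupby(toy['insulin']).mean()

print(share_of_all)
print(rate_within)
```

The within-group rate is usually the easier number to compare across groups, because it does not depend on how large each group is.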

In [None]:
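# Binary-encode the three retained medications:
# 0 = not taking ('No'), 1 = taking ('Steady', 'Up', or 'Down')
med_cols = ['rosiglitazone', 'acarbose', 'insulin']
df[med_cols] = df[med_cols].replace({'No': 0, 'Steady': 1, 'Up': 1, 'Down': 1})
df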

Unnamed: 0,encounter_id,patient_nbr,race,gender,age,weight,admission_type_id,discharge_disposition_id,admission_source_id,time_in_hospital,...,citoglipton,insulin,glyburide-metformin,glipizide-metformin,glimepiride-pioglitazone,metformin-rosiglitazone,metformin-pioglitazone,change,diabetesMed,readmitted
0,2278392,8222157,Caucasian,Female,[0-10),?,6,25,1,1,...,No,0,No,No,No,No,No,No,No,NO
1,149190,55629189,Caucasian,Female,[10-20),?,1,1,7,3,...,No,1,No,No,No,No,No,Ch,Yes,>30
2,64410,86047875,AfricanAmerican,Female,[20-30),?,1,1,7,2,...,No,0,No,No,No,No,No,No,Yes,NO
3,500364,82442376,Caucasian,Male,[30-40),?,1,1,7,2,...,No,1,No,No,No,No,No,Ch,Yes,NO
4,16680,42519267,Caucasian,Male,[40-50),?,1,1,7,1,...,No,1,No,No,No,No,No,Ch,Yes,NO
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
101761,443847548,100162476,AfricanAmerican,Male,[70-80),?,1,3,7,3,...,No,1,No,No,No,No,No,Ch,Yes,>30
101762,443847782,74694222,AfricanAmerican,Female,[80-90),?,1,4,5,5,...,No,1,No,No,No,No,No,No,Yes,NO
101763,443854148,41088789,Caucasian,Male,[70-80),?,1,1,7,1,...,No,1,No,No,No,No,No,Ch,Yes,NO
101764,443857166,31693671,Caucasian,Female,[80-90),?,2,3,7,10,...,No,1,No,No,No,No,No,Ch,Yes,NO
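The 0/1 encoding applied above can also be written as a comparison against 'No', which avoids listing every dosage label. This is a minimal sketch on made-up data (`toy` and `encoded` are hypothetical names); it assumes that every value other than 'No' means the patient takes the medication, which matches the mapping used in the cell above.

```python
import pandas as pd

# Hypothetical toy data (values are made up).
toy = pd.DataFrame({
    'rosiglitazone': ['No', 'Steady', 'No'],
    'insulin': ['Up', 'No', 'Down'],
})

med_cols = ['rosiglitazone', 'insulin']

# Work on a copy so the original labels stay available.
encoded = toy.copy()

# True where the value is not 'No', then cast True/False to 1/0.
encoded[med_cols] = toy[med_cols].ne('No').astype(int)

print(encoded)
```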
