
# Getting data from Google Drive

In [0]:
from pydrive.drive import GoogleDrive
from pydrive.auth import GoogleAuth
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate the Colab user and hand the credentials to PyDrive.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Shared file: https://drive.google.com/file/d/1ie8EIXf9w8tm5dppuhHvFC9J4YybaJAX/view?usp=sharing
download = drive.CreateFile({'id': '1ie8EIXf9w8tm5dppuhHvFC9J4YybaJAX'})
download.GetContentFile('result.csv')  # save a local copy as result.csv

# Loading data into pandas

In [0]:
import pandas as pd

df = pd.read_csv("result.csv")
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 133 entries, 0 to 132
Data columns (total 2 columns):
hadm_id    133 non-null int64
text       133 non-null object
dtypes: int64(1), object(1)
memory usage: 2.2+ KB


In [0]:
df.head()

Unnamed: 0,hadm_id,text
0,158463,[**2157-7-11**] 8:25 PM\n SCROTAL U.S. ...
1,158463,[**2157-7-12**] 2:45 PM\n CT ABD W&W/O C ...
2,158463,[**2157-7-11**] 1:05 PM\n CHEST (PA & LAT) ...
3,158463,[**2157-7-12**] 9:27 AM\n BILAT LOWER EXT VEIN...
4,158463,"[**Last Name (LF) 3176**],[**First Name3 (LF) ..."


In [0]:
df.shape
# There are 133 rows, one per note. Several rows share the same hadm_id,
# so a single admission can have multiple notes.

(133, 2)
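
To see how many distinct admissions sit behind the rows, one option is to count unique `hadm_id` values. The following is a minimal sketch on a made-up toy frame (the `notes` data is hypothetical, not taken from `result.csv`):

```python
import pandas as pd

# Toy stand-in for result.csv: one row per note, keyed by admission ID.
notes = pd.DataFrame({
    "hadm_id": [101, 101, 102, 103, 103, 103],
    "text": ["note a", "note b", "note c", "note d", "note e", "note f"],
})

n_rows = len(notes)                         # number of notes
n_admissions = notes["hadm_id"].nunique()   # number of distinct admissions
notes_per_admission = notes["hadm_id"].value_counts().sort_index()

print(n_rows, n_admissions)
print(notes_per_admission.to_dict())
```

On the real data the same two calls would be `len(df)` and `df['hadm_id'].nunique()`.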

# Grouping all notes by hadm_id

In [0]:
# Concatenate all notes of one admission into a single comma-separated string.
df = df.groupby(['hadm_id'])['text'].apply(lambda x: ','.join(x)).reset_index()
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24 entries, 0 to 23
Data columns (total 2 columns):
hadm_id    24 non-null int64
text       24 non-null object
dtypes: int64(1), object(1)
memory usage: 512.0+ bytes


In [0]:
df.head()

Unnamed: 0,hadm_id,text
0,102821,[**2169-8-28**] 2:06 AM\n CHEST (PORTABLE AP) ...
1,104485,[**2189-3-21**] 9:00 AM\n CHEST (PORTABLE AP) ...
2,117599,[**2130-2-22**] 5:01 AM\n CHEST (PORTABLE AP) ...
3,121896,"[**Last Name (LF) 8994**],[**First Name3 (LF) ..."
4,125785,[**2179-2-10**] 8:37 PM\n CHEST PORT. LINE PLA...


In [0]:
df.shape
# There are 24 unique admissions (hadm_id values).

(24, 2)
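
The grouping step is easier to follow on a small example. This sketch uses a hypothetical three-row frame and `agg(",".join)`, which should behave like the `apply(lambda x: ','.join(x))` call used in this notebook:

```python
import pandas as pd

# Hypothetical notes: admission 101 has two notes, admission 102 has one.
notes = pd.DataFrame({
    "hadm_id": [101, 102, 101],
    "text": ["first note", "only note", "second note"],
})

# One row per admission; notes are joined in their original row order.
merged = (
    notes.groupby("hadm_id")["text"]
    .agg(",".join)
    .reset_index()
)
print(merged)
```

After grouping, any text search runs once per admission instead of once per note.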

# Finding Endotracheal Tube
Notes for intubated patients mention an endotracheal tube or an ET tube, so we use a case-insensitive text search to find them. We look for "endotracheal tube" first and for "ET tube" next.

In [0]:
df_endotracheal_tube = df[df.text.str.contains('Endotracheal tube', regex=True, na=False, case=False)]
df_endotracheal_tube.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 7 entries, 6 to 23
Data columns (total 2 columns):
hadm_id    7 non-null int64
text       7 non-null object
dtypes: int64(1), object(1)
memory usage: 168.0+ bytes


In [0]:
list(df_endotracheal_tube['text'])

['[**Last Name (LF) 385**],[**First Name3 (LF) 386**]                      NMED SICU-B                [**2134-11-8**]  5:21 AM\n CT ABDOMEN W/O CONTRAST; CT PELVIS W/O CONTRAST                 Clip # [**Clip Number (Radiology) 74851**]\n Reason: assess intraabdominal bleed\n Admitting Diagnosis: STROKE;TELEMETRY;TRANSIENT ISCHEMIC ATTACK\n ______________________________________________________________________________\n [**Hospital 2**] MEDICAL CONDITION:\n  74 year old man with increasing abd distention\n REASON FOR THIS EXAMINATION:\n  assess intraabdominal bleed\n No contraindications for IV contrast\n ______________________________________________________________________________\n                                  PFI REPORT\n Large left retroperitoneal hematoma (13 x 12 x 14 cm) with hematocrit level\n associated with anticoagulation.  Sigmoid dilated with stool.  Small bilateral\n effusions/atelectasis.  D/W Dr. [**First Name (STitle) **] ([**Doctor First Name **]) at time pt. left

In [0]:
df_endotracheal_tube

Unnamed: 0,hadm_id,text
6,132634,"[**Last Name (LF) 385**],[**First Name3 (LF) 3..."
11,155431,[**2188-11-8**] 1:48 PM\n CHEST (PORTABLE AP);...
13,158463,[**2157-7-11**] 8:25 PM\n SCROTAL U.S. ...
14,158711,[**2158-5-20**] 5:00 AM\n CHEST (PORTABLE AP) ...
16,164728,[**2154-11-9**] 12:51 PM\n RENAL U.S. ...
18,186208,[**2108-2-28**] 6:52 PM\n CHEST (PORTABLE AP);...
23,193816,[**2144-1-9**] 10:30 AM\n CHEST (PORTABLE AP) ...


In [0]:
df_endotracheal_tube.shape
# 7 admissions mention an endotracheal tube in their notes.

(7, 2)

# Finding ET Tube
Now we look for "ET tube".

In [0]:
df_et_tube = df[df.text.str.contains('ET tube', regex=True, na=False, case=False)]
df_et_tube.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 5 entries, 6 to 23
Data columns (total 2 columns):
hadm_id    5 non-null int64
text       5 non-null object
dtypes: int64(1), object(1)
memory usage: 120.0+ bytes


In [0]:
list(df_et_tube['text'])

['[**Last Name (LF) 385**],[**First Name3 (LF) 386**]                      NMED SICU-B                [**2134-11-8**]  5:21 AM\n CT ABDOMEN W/O CONTRAST; CT PELVIS W/O CONTRAST                 Clip # [**Clip Number (Radiology) 74851**]\n Reason: assess intraabdominal bleed\n Admitting Diagnosis: STROKE;TELEMETRY;TRANSIENT ISCHEMIC ATTACK\n ______________________________________________________________________________\n [**Hospital 2**] MEDICAL CONDITION:\n  74 year old man with increasing abd distention\n REASON FOR THIS EXAMINATION:\n  assess intraabdominal bleed\n No contraindications for IV contrast\n ______________________________________________________________________________\n                                  PFI REPORT\n Large left retroperitoneal hematoma (13 x 12 x 14 cm) with hematocrit level\n associated with anticoagulation.  Sigmoid dilated with stool.  Small bilateral\n effusions/atelectasis.  D/W Dr. [**First Name (STitle) **] ([**Doctor First Name **]) at time pt. left

In [0]:
df_et_tube

Unnamed: 0,hadm_id,text
6,132634,"[**Last Name (LF) 385**],[**First Name3 (LF) 3..."
12,157669,[**2115-12-7**] 10:11 AM\n C-SPINE NON-TRAUMA ...
13,158463,[**2157-7-11**] 8:25 PM\n SCROTAL U.S. ...
18,186208,[**2108-2-28**] 6:52 PM\n CHEST (PORTABLE AP);...
23,193816,[**2144-1-9**] 10:30 AM\n CHEST (PORTABLE AP) ...


In [0]:
df_et_tube.shape
# 5 admissions mention an ET tube in their notes.

(5, 2)
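
Because the search is case-insensitive and matches substrings, the pattern `ET tube` also matches any word that ends in "et" and is followed by " tube". A possible refinement, sketched here on invented sentences (the `toy` frame is hypothetical), is to add `\b` word boundaries:

```python
import pandas as pd

# Invented example sentences, not taken from the real notes.
toy = pd.DataFrame({
    "hadm_id": [1, 2, 3],
    "text": [
        "ET tube in standard position.",
        "Please reset tube feeds tonight.",  # "reset tube" contains "et tube"
        "No lines or tubes.",
    ],
})

# Substring search, as used in this notebook.
loose = toy["text"].str.contains("ET tube", case=False, na=False)

# Same search, but "ET" and "tube" must be whole words.
strict = toy["text"].str.contains(r"\bET tube\b", case=False, na=False)

print(loose.tolist())
print(strict.tolist())
```

Whether the stricter pattern changes the count on the real notes is not checked here.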

# Conclusion
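The two searches overlap: four hadm_id values (132634, 158463, 186208, 193816) appear in both result tables, so the counts cannot simply be added (7 + 5 = 12). Counting each admission once gives 8 of 24 (7 + 5 - 4). In this limited study, then, about 33% of the ICU admissions of patients with a cancer history mention an endotracheal tube or ET tube, which we take as evidence of intubation.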
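
The overlap between the two searches can be checked with plain set arithmetic. This sketch copies the `hadm_id` values from the two result tables shown earlier; the variable names are illustrative:

```python
# hadm_id values from the "Endotracheal tube" result table.
endotracheal_ids = {132634, 155431, 158463, 158711, 164728, 186208, 193816}
# hadm_id values from the "ET tube" result table.
et_ids = {132634, 157669, 158463, 186208, 193816}

both = endotracheal_ids & et_ids     # admissions matched by both searches
either = endotracheal_ids | et_ids   # admissions matched by at least one search

total_admissions = 24                # rows in the grouped DataFrame
share = round(100 * len(either) / total_admissions, 1)

print(len(both), len(either), share)
```

In pandas, the same union could be obtained by combining the two boolean masks with `|` before filtering `df`.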