### Chargemaster data format

Hospital data (county in parentheses):
- EvergreenHealth (King)
- Kindred Hospital Seattle - First Hill (King)
- UW Medicine - Harborview (King)
- Providence St. Peter Hospital (Thurston)
- Providence St. Mary Medical Center (Walla Walla)

#### Read in data and basic data formatting

In [149]:
import pandas as pd

1. EvergreenHealth

In [150]:
evergreen = pd.read_excel('evergreenhealth.xlsx')
evergreen.head()

Unnamed: 0,MSDRG 1,MSDRG 1 Name,Average TOTAL CHARGES
0,3.0,"ECMO OR TRACH W MV >96 HRS OR PDX EXC FACE, MO...",707407.966667
1,4.0,"TRACH W MV >96 HRS OR PDX EXC FACE, MOUTH & NE...",481304.005
2,20.0,INTRACRANIAL VASCULAR PROCEDURES W PDX HEMORRH...,298443.5875
3,23.0,CRANIOTOMY W MAJOR DEVICE IMPLANT OR ACUTE COM...,225376.578667
4,24.0,CRANIO W MAJOR DEV IMPL/ACUTE COMPLEX CNS PDX ...,105068.886667


In [151]:
# rename the columns
evergreen.columns = ['drg_code', 'name', 'price']
evergreen.head()

Unnamed: 0,drg_code,name,price
0,3.0,"ECMO OR TRACH W MV >96 HRS OR PDX EXC FACE, MO...",707407.966667
1,4.0,"TRACH W MV >96 HRS OR PDX EXC FACE, MOUTH & NE...",481304.005
2,20.0,INTRACRANIAL VASCULAR PROCEDURES W PDX HEMORRH...,298443.5875
3,23.0,CRANIOTOMY W MAJOR DEVICE IMPLANT OR ACUTE COM...,225376.578667
4,24.0,CRANIO W MAJOR DEV IMPL/ACUTE COMPLEX CNS PDX ...,105068.886667


In [152]:
# drop rows with NaN
evergreen = evergreen.dropna()

In [153]:
# change the datatype of drg_code to int
evergreen['drg_code'] = evergreen['drg_code'].astype(int)

In [154]:
# round the price to 4 decimal places
evergreen = evergreen.round({'price': 4})
evergreen.head()

Unnamed: 0,drg_code,name,price
0,3,"ECMO OR TRACH W MV >96 HRS OR PDX EXC FACE, MO...",707407.9667
1,4,"TRACH W MV >96 HRS OR PDX EXC FACE, MOUTH & NE...",481304.005
2,20,INTRACRANIAL VASCULAR PROCEDURES W PDX HEMORRH...,298443.5875
3,23,CRANIOTOMY W MAJOR DEVICE IMPLANT OR ACUTE COM...,225376.5787
4,24,CRANIO W MAJOR DEV IMPL/ACUTE COMPLEX CNS PDX ...,105068.8867


In [155]:
# add hospital name, size, and county to dataframe
evergreen.insert(0, 'hospital', 'EvergreenHealth')
evergreen.insert(1, 'hospital_size', 'Large')
evergreen.insert(2, 'county', 'King')
evergreen

Unnamed: 0,hospital,hospital_size,county,drg_code,name,price
0,EvergreenHealth,Large,King,3,"ECMO OR TRACH W MV >96 HRS OR PDX EXC FACE, MO...",707407.9667
1,EvergreenHealth,Large,King,4,"TRACH W MV >96 HRS OR PDX EXC FACE, MOUTH & NE...",481304.0050
2,EvergreenHealth,Large,King,20,INTRACRANIAL VASCULAR PROCEDURES W PDX HEMORRH...,298443.5875
3,EvergreenHealth,Large,King,23,CRANIOTOMY W MAJOR DEVICE IMPLANT OR ACUTE COM...,225376.5787
4,EvergreenHealth,Large,King,24,CRANIO W MAJOR DEV IMPL/ACUTE COMPLEX CNS PDX ...,105068.8867
...,...,...,...,...,...,...
527,EvergreenHealth,Large,King,981,EXTENSIVE O.R. PROCEDURE UNRELATED TO PRINCIPA...,98713.9750
528,EvergreenHealth,Large,King,982,EXTENSIVE O.R. PROCEDURE UNRELATED TO PRINCIPA...,65374.9112
529,EvergreenHealth,Large,King,983,EXTENSIVE O.R. PROCEDURE UNRELATED TO PRINCIPA...,32406.9000
530,EvergreenHealth,Large,King,987,NON-EXTENSIVE O.R. PROC UNRELATED TO PRINCIPAL...,115053.5150


2. Kindred Hospital Seattle - First Hill

In [156]:
kindred_firsthill = pd.read_excel('kindred_seattle_firsthill.xlsx')
kindred_firsthill.head()

Unnamed: 0,Facility ID,FACNAME,Final DRG Code,DRGNAME,AVG_of_Charges,AVG_of_Reimb per DC,MIN_of_Reimb per DC1,MAX_of_Reimb per DC1
0,1000/4003,Paramount,166,OTHER RESP SYSTEM O.R. PROCEDURES W MCC,777425.446667,305118.976667,101403.75,535089.18
1,1000/4003,Paramount,177,RESPIRATORY INFECTIONS & INFLAMMATIONS W MCC,254795.957,110739.08,36375.0,463752.8
2,1000/4003,Paramount,189,PULMONARY EDEMA & RESPIRATORY FAILURE,171400.991176,53539.758235,11616.0,115832.21
3,1000/4003,Paramount,207,RESPIRATORY SYSTEM DIAGNOSIS W VENTILATOR SUPP...,429516.148333,114989.987778,49106.75,293124.15
4,1000/4003,Paramount,853,INFECTIOUS & PARASITIC DISEASES W O.R. PROCEDU...,543390.104444,233097.047778,45476.6,713200.0


In [157]:
# we only need drg_code, name, and avg_of_charges
kindred_firsthill = kindred_firsthill[['Final DRG Code', 'DRGNAME', 'AVG_of_Charges']]
kindred_firsthill

Unnamed: 0,Final DRG Code,DRGNAME,AVG_of_Charges
0,166,OTHER RESP SYSTEM O.R. PROCEDURES W MCC,777425.446667
1,177,RESPIRATORY INFECTIONS & INFLAMMATIONS W MCC,254795.957
2,189,PULMONARY EDEMA & RESPIRATORY FAILURE,171400.991176
3,207,RESPIRATORY SYSTEM DIAGNOSIS W VENTILATOR SUPP...,429516.148333
4,853,INFECTIOUS & PARASITIC DISEASES W O.R. PROCEDU...,543390.104444
...,...,...,...
793,981,EXTENSIVE O.R. PROCEDURE UNRELATED TO PRINCIPA...,NO DATA
794,559,"AFTERCARE, MUSCULOSKELETAL SYSTEM & CONNECTIVE...",NO DATA
795,862,POSTOPERATIVE & POST-TRAUMATIC INFECTIONS W MCC,NO DATA
796,919,COMPLICATIONS OF TREATMENT W MCC,NO DATA


In [158]:
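# drop rows with NaN or the 'NO DATA' placeholder, then rename columns
# (dropna returns a new DataFrame, so the result must be assigned back)
kindred_firsthill = kindred_firsthill.dropna()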
kindred_firsthill = kindred_firsthill[kindred_firsthill.AVG_of_Charges != 'NO DATA']
kindred_firsthill.columns = ['drg_code', 'name', 'price']
kindred_firsthill

Unnamed: 0,drg_code,name,price
0,166,OTHER RESP SYSTEM O.R. PROCEDURES W MCC,777425.446667
1,177,RESPIRATORY INFECTIONS & INFLAMMATIONS W MCC,254795.957
2,189,PULMONARY EDEMA & RESPIRATORY FAILURE,171400.991176
3,207,RESPIRATORY SYSTEM DIAGNOSIS W VENTILATOR SUPP...,429516.148333
4,853,INFECTIOUS & PARASITIC DISEASES W O.R. PROCEDU...,543390.104444
...,...,...,...
771,853,INFECTIOUS & PARASITIC DISEASES W O.R. PROCEDU...,1278652.401667
772,870,SEPTICEMIA OR SEVERE SEPSIS W MV >96 HOURS,417961.128333
773,871,SEPTICEMIA OR SEVERE SEPSIS W/O MV >96 HOURS W...,303152.91
774,949,AFTERCARE W CC/MCC,153721.9
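The cell above removes the `'NO DATA'` placeholder by string comparison. As an alternative sketch, `pd.to_numeric` with `errors='coerce'` turns any non-numeric entry into NaN, so one `dropna` handles placeholders and missing values together. The toy frame and its values below are illustrative assumptions, not rows from the Kindred file.

```python
import pandas as pd

# Toy frame mimicking the Kindred price column; values are illustrative only.
toy = pd.DataFrame({
    "drg_code": [166, 177, 981],
    "name": ["A", "B", "C"],
    "price": [777425.446667, "NO DATA", None],
})

# Anything that is not a number (here the string 'NO DATA') becomes NaN.
toy["price"] = pd.to_numeric(toy["price"], errors="coerce")

# A single dropna then removes placeholder rows and genuinely missing rows.
toy_clean = toy.dropna(subset=["price"])
print(toy_clean)
```

This also leaves `price` as a float column, which makes a separate `astype(float)` step unnecessary.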


In [159]:
# change the datatype of price to float
kindred_firsthill['price'] = kindred_firsthill['price'].astype(float)
kindred_firsthill.dtypes

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  kindred_firsthill['price'] = kindred_firsthill['price'].astype(float)


drg_code      int64
name         object
price       float64
dtype: object
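The warning above appears because `kindred_firsthill` was produced by selecting from another DataFrame, and pandas cannot tell whether the assignment is meant to reach the original. One common way to avoid it, sketched here on a toy frame with made-up values, is to call `.copy()` when taking the subset.

```python
import pandas as pd

# Toy stand-in for the raw spreadsheet; the 'extra' column is hypothetical.
raw = pd.DataFrame({
    "Final DRG Code": [166, 177],
    "DRGNAME": ["A", "B"],
    "AVG_of_Charges": ["777425.446667", "254795.957"],
    "extra": [1, 2],
})

# .copy() makes the subset an independent DataFrame, so later column
# assignments are unambiguous and do not trigger SettingWithCopyWarning.
subset = raw[["Final DRG Code", "DRGNAME", "AVG_of_Charges"]].copy()
subset.columns = ["drg_code", "name", "price"]
subset["price"] = subset["price"].astype(float)
print(subset.dtypes)
```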

In [160]:
# round the price to 4 decimal places
kindred_firsthill = kindred_firsthill.round({'price': 4})
kindred_firsthill.head()

Unnamed: 0,drg_code,name,price
0,166,OTHER RESP SYSTEM O.R. PROCEDURES W MCC,777425.4467
1,177,RESPIRATORY INFECTIONS & INFLAMMATIONS W MCC,254795.957
2,189,PULMONARY EDEMA & RESPIRATORY FAILURE,171400.9912
3,207,RESPIRATORY SYSTEM DIAGNOSIS W VENTILATOR SUPP...,429516.1483
4,853,INFECTIOUS & PARASITIC DISEASES W O.R. PROCEDU...,543390.1044


In [161]:
# add hospital name, size, and county to dataframe
kindred_firsthill.insert(0, 'hospital', 'Kindred Hospital Seattle - First Hill')
kindred_firsthill.insert(1, 'hospital_size', 'Small')
kindred_firsthill.insert(2, 'county', 'King')
kindred_firsthill

Unnamed: 0,hospital,hospital_size,county,drg_code,name,price
0,Kindred Hospital Seattle - First Hill,Small,King,166,OTHER RESP SYSTEM O.R. PROCEDURES W MCC,7.774254e+05
1,Kindred Hospital Seattle - First Hill,Small,King,177,RESPIRATORY INFECTIONS & INFLAMMATIONS W MCC,2.547960e+05
2,Kindred Hospital Seattle - First Hill,Small,King,189,PULMONARY EDEMA & RESPIRATORY FAILURE,1.714010e+05
3,Kindred Hospital Seattle - First Hill,Small,King,207,RESPIRATORY SYSTEM DIAGNOSIS W VENTILATOR SUPP...,4.295161e+05
4,Kindred Hospital Seattle - First Hill,Small,King,853,INFECTIOUS & PARASITIC DISEASES W O.R. PROCEDU...,5.433901e+05
...,...,...,...,...,...,...
771,Kindred Hospital Seattle - First Hill,Small,King,853,INFECTIOUS & PARASITIC DISEASES W O.R. PROCEDU...,1.278652e+06
772,Kindred Hospital Seattle - First Hill,Small,King,870,SEPTICEMIA OR SEVERE SEPSIS W MV >96 HOURS,4.179611e+05
773,Kindred Hospital Seattle - First Hill,Small,King,871,SEPTICEMIA OR SEVERE SEPSIS W/O MV >96 HOURS W...,3.031529e+05
774,Kindred Hospital Seattle - First Hill,Small,King,949,AFTERCARE W CC/MCC,1.537219e+05


3. UW Medicine - Harborview

In [162]:
uw_harborview = pd.read_excel('uwmedicine_harborview.xlsx', skiprows = list(range(0,4)))
uw_harborview.columns = uw_harborview.iloc[0, :]
uw_harborview.drop(uw_harborview.index[0], inplace=True)
uw_harborview

Unnamed: 0,DRG,DRG Description,Average Charge per Case,Average Self-Pay and Post Service Discount per Case,Average Self-Pay and Prompt-Pay Discount per Case
1,003,ECMO OR TRACHEOSTOMY WITH MV >96 HOURS OR PRIN...,713666.631829,499566.64228,449609.978052
2,004,TRACHEOSTOMY WITH MV >96 HOURS OR PRINCIPAL DI...,832838.6924,582987.08468,524688.376212
3,020,INTRACRANIAL VASCULAR PROCEDURES WITH PRINCIPA...,337176.958365,236023.870856,212421.48377
4,021,INTRACRANIAL VASCULAR PROCEDURES WITH PRINCIPA...,268733.310636,188113.317445,169301.985701
5,022,INTRACRANIAL VASCULAR PROCEDURES WITH PRINCIPA...,84868.448824,59407.914176,53467.122759
...,...,...,...,...,...
257,965,OTHER MULTIPLE SIGNIFICANT TRAUMA WITHOUT CC/MCC,55205.347059,38643.742941,34779.368647
258,974,HIV WITH MAJOR RELATED CONDITION WITH MCC,86515.685,60560.9795,54504.88155
259,981,EXTENSIVE O.R. PROCEDURES UNRELATED TO PRINCIP...,204057.862759,142840.503931,128556.453538
260,982,EXTENSIVE O.R. PROCEDURES UNRELATED TO PRINCIP...,93088.954444,65162.268111,58646.0413
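The read step above skips the preamble rows, then promotes the first remaining row to the header and drops it. Since the same pattern is repeated for the Providence files, it could be wrapped in a small helper. This is only a sketch: `promote_first_row_to_header` is a hypothetical name, and the toy rows are made up for illustration.

```python
import pandas as pd

def promote_first_row_to_header(df: pd.DataFrame) -> pd.DataFrame:
    """Use the first row as column names and drop that row from the data."""
    out = df.iloc[1:].copy()           # everything after the header row
    out.columns = df.iloc[0].tolist()  # header row becomes the column names
    return out

# Toy frame whose first row holds the real column names.
toy = pd.DataFrame([
    ["DRG", "DRG Description", "Average Charge per Case"],
    ["003", "EXAMPLE A", 100.0],
    ["004", "EXAMPLE B", 200.0],
])
result = promote_first_row_to_header(toy)
print(result)
```

As in the notebook cell, the original row labels are kept, so the data starts at index 1.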


In [163]:
uw_harborview = uw_harborview[['DRG', 'DRG Description', 'Average Charge per Case']]
uw_harborview

Unnamed: 0,DRG,DRG Description,Average Charge per Case
1,003,ECMO OR TRACHEOSTOMY WITH MV >96 HOURS OR PRIN...,713666.631829
2,004,TRACHEOSTOMY WITH MV >96 HOURS OR PRINCIPAL DI...,832838.6924
3,020,INTRACRANIAL VASCULAR PROCEDURES WITH PRINCIPA...,337176.958365
4,021,INTRACRANIAL VASCULAR PROCEDURES WITH PRINCIPA...,268733.310636
5,022,INTRACRANIAL VASCULAR PROCEDURES WITH PRINCIPA...,84868.448824
...,...,...,...
257,965,OTHER MULTIPLE SIGNIFICANT TRAUMA WITHOUT CC/MCC,55205.347059
258,974,HIV WITH MAJOR RELATED CONDITION WITH MCC,86515.685
259,981,EXTENSIVE O.R. PROCEDURES UNRELATED TO PRINCIP...,204057.862759
260,982,EXTENSIVE O.R. PROCEDURES UNRELATED TO PRINCIP...,93088.954444


In [164]:
# drop NaN and rename columns; .copy() gives an independent DataFrame
uw_harborview = uw_harborview.dropna().copy()
uw_harborview.columns = ['drg_code', 'name', 'price']
# change data types, converting this table's own columns
uw_harborview['drg_code'] = uw_harborview['drg_code'].astype(int)
uw_harborview['price'] = uw_harborview['price'].astype(float)

In [165]:
uw_harborview.dtypes

drg_code      int64
name         object
price       float64
dtype: object

In [166]:
# round price to 4 decimal places
uw_harborview = uw_harborview.round({'price': 4})
uw_harborview

Unnamed: 0,drg_code,name,price
1,3,ECMO OR TRACHEOSTOMY WITH MV >96 HOURS OR PRIN...,713666.6318
2,4,TRACHEOSTOMY WITH MV >96 HOURS OR PRINCIPAL DI...,832838.6924
3,20,INTRACRANIAL VASCULAR PROCEDURES WITH PRINCIPA...,337176.9584
4,21,INTRACRANIAL VASCULAR PROCEDURES WITH PRINCIPA...,268733.3106
5,22,INTRACRANIAL VASCULAR PROCEDURES WITH PRINCIPA...,84868.4488
...,...,...,...
257,965,OTHER MULTIPLE SIGNIFICANT TRAUMA WITHOUT CC/MCC,55205.3471
258,974,HIV WITH MAJOR RELATED CONDITION WITH MCC,86515.6850
259,981,EXTENSIVE O.R. PROCEDURES UNRELATED TO PRINCIP...,204057.8628
260,982,EXTENSIVE O.R. PROCEDURES UNRELATED TO PRINCIP...,93088.9544


In [167]:
# add hospital name, size, and county to dataframe
uw_harborview.insert(0, 'hospital', 'UW Medicine - Harborview Medical Center')
uw_harborview.insert(1, 'hospital_size', 'Large')
uw_harborview.insert(2, 'county', 'King')
uw_harborview

Unnamed: 0,hospital,hospital_size,county,drg_code,name,price
1,UW Medicine - Harborview Medical Center,Large,King,3,ECMO OR TRACHEOSTOMY WITH MV >96 HOURS OR PRIN...,713666.6318
2,UW Medicine - Harborview Medical Center,Large,King,4,TRACHEOSTOMY WITH MV >96 HOURS OR PRINCIPAL DI...,832838.6924
3,UW Medicine - Harborview Medical Center,Large,King,20,INTRACRANIAL VASCULAR PROCEDURES WITH PRINCIPA...,337176.9584
4,UW Medicine - Harborview Medical Center,Large,King,21,INTRACRANIAL VASCULAR PROCEDURES WITH PRINCIPA...,268733.3106
5,UW Medicine - Harborview Medical Center,Large,King,22,INTRACRANIAL VASCULAR PROCEDURES WITH PRINCIPA...,84868.4488
...,...,...,...,...,...,...
257,UW Medicine - Harborview Medical Center,Large,King,965,OTHER MULTIPLE SIGNIFICANT TRAUMA WITHOUT CC/MCC,55205.3471
258,UW Medicine - Harborview Medical Center,Large,King,974,HIV WITH MAJOR RELATED CONDITION WITH MCC,86515.6850
259,UW Medicine - Harborview Medical Center,Large,King,981,EXTENSIVE O.R. PROCEDURES UNRELATED TO PRINCIP...,204057.8628
260,UW Medicine - Harborview Medical Center,Large,King,982,EXTENSIVE O.R. PROCEDURES UNRELATED TO PRINCIP...,93088.9544


4. Providence St. Peter Hospital

In [168]:
providence_stpeter = pd.read_excel('providence_st_peter.xlsx', skiprows = list(range(0,7)))
providence_stpeter.columns = providence_stpeter.iloc[0, :]
providence_stpeter.drop(providence_stpeter.index[0], inplace=True)
providence_stpeter

Unnamed: 0,Facility,MS DRG,Description,Number of Claims,Average Proposed Charge
1,Providence St. Peter Hospital,003,"Ecmo Or Trach W Mv >96 Hrs Or Pdx Exc Face, Mo...",28,697327.00683
2,Providence St. Peter Hospital,004,"Trach W Mv >96 Hrs Or Pdx Exc Face, Mouth & Ne...",12,542412.5057
3,Providence St. Peter Hospital,011,"Tracheostomy For Face, Mouth & Neck Diagnoses ...",6,147253.14
4,Providence St. Peter Hospital,020,Intracranial Vascular Procedures W Pdx Hemorrh...,7,428392.872857
5,Providence St. Peter Hospital,021,Intracranial Vascular Procedures W Pdx Hemorrh...,4,276916.05
...,...,...,...,...,...
640,Providence St. Peter Hospital,983,Extensive O.R. Procedure Unrelated To Principa...,6,43776.111667
641,Providence St. Peter Hospital,987,Non-Extensive O.R. Proc Unrelated To Principal...,19,161707.765937
642,Providence St. Peter Hospital,988,Non-Extensive O.R. Proc Unrelated To Principal...,11,60946.599091
643,Providence St. Peter Hospital,989,Non-Extensive O.R. Proc Unrelated To Principal...,1,87350.86


In [169]:
providence_stpeter = providence_stpeter[['Facility', 'MS DRG', 'Description', 'Average Proposed Charge']]
providence_stpeter.columns = ['hospital', 'drg_code', 'name', 'price']
providence_stpeter.head()

Unnamed: 0,hospital,drg_code,name,price
1,Providence St. Peter Hospital,3,"Ecmo Or Trach W Mv >96 Hrs Or Pdx Exc Face, Mo...",697327.00683
2,Providence St. Peter Hospital,4,"Trach W Mv >96 Hrs Or Pdx Exc Face, Mouth & Ne...",542412.5057
3,Providence St. Peter Hospital,11,"Tracheostomy For Face, Mouth & Neck Diagnoses ...",147253.14
4,Providence St. Peter Hospital,20,Intracranial Vascular Procedures W Pdx Hemorrh...,428392.872857
5,Providence St. Peter Hospital,21,Intracranial Vascular Procedures W Pdx Hemorrh...,276916.05


In [170]:
# change data types
providence_stpeter['drg_code'] = providence_stpeter['drg_code'].astype(int)
providence_stpeter['price'] = providence_stpeter['price'].astype(float)
providence_stpeter.dtypes

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  providence_stpeter['drg_code'] = providence_stpeter['drg_code'].astype(int)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  providence_stpeter['price'] = providence_stpeter['price'].astype(float)


hospital     object
drg_code      int64
name         object
price       float64
dtype: object

In [171]:
# round price to 4 decimal places
providence_stpeter = providence_stpeter.round({'price': 4})
providence_stpeter.head()

Unnamed: 0,hospital,drg_code,name,price
1,Providence St. Peter Hospital,3,"Ecmo Or Trach W Mv >96 Hrs Or Pdx Exc Face, Mo...",697327.0068
2,Providence St. Peter Hospital,4,"Trach W Mv >96 Hrs Or Pdx Exc Face, Mouth & Ne...",542412.5057
3,Providence St. Peter Hospital,11,"Tracheostomy For Face, Mouth & Neck Diagnoses ...",147253.14
4,Providence St. Peter Hospital,20,Intracranial Vascular Procedures W Pdx Hemorrh...,428392.8729
5,Providence St. Peter Hospital,21,Intracranial Vascular Procedures W Pdx Hemorrh...,276916.05


In [172]:
providence_stpeter.insert(1, 'hospital_size', 'Large')
providence_stpeter.insert(2, 'county', 'Thurston')
providence_stpeter

Unnamed: 0,hospital,hospital_size,county,drg_code,name,price
1,Providence St. Peter Hospital,Large,Thurston,3,"Ecmo Or Trach W Mv >96 Hrs Or Pdx Exc Face, Mo...",697327.0068
2,Providence St. Peter Hospital,Large,Thurston,4,"Trach W Mv >96 Hrs Or Pdx Exc Face, Mouth & Ne...",542412.5057
3,Providence St. Peter Hospital,Large,Thurston,11,"Tracheostomy For Face, Mouth & Neck Diagnoses ...",147253.1400
4,Providence St. Peter Hospital,Large,Thurston,20,Intracranial Vascular Procedures W Pdx Hemorrh...,428392.8729
5,Providence St. Peter Hospital,Large,Thurston,21,Intracranial Vascular Procedures W Pdx Hemorrh...,276916.0500
...,...,...,...,...,...,...
640,Providence St. Peter Hospital,Large,Thurston,983,Extensive O.R. Procedure Unrelated To Principa...,43776.1117
641,Providence St. Peter Hospital,Large,Thurston,987,Non-Extensive O.R. Proc Unrelated To Principal...,161707.7659
642,Providence St. Peter Hospital,Large,Thurston,988,Non-Extensive O.R. Proc Unrelated To Principal...,60946.5991
643,Providence St. Peter Hospital,Large,Thurston,989,Non-Extensive O.R. Proc Unrelated To Principal...,87350.8600


5. Providence St. Mary Medical Center

In [173]:
providence_stmary = pd.read_excel('providence_st_mary.xlsx', skiprows = list(range(0,7)))
providence_stmary.columns = providence_stmary.iloc[0, :]
providence_stmary.drop(providence_stmary.index[0], inplace=True)
providence_stmary

Unnamed: 0,Facility,MS DRG,Description,Number of Claims,Average Proposed Charge
1,Providence St. Mary Medical Center,003,"Ecmo Or Trach W Mv >96 Hrs Or Pdx Exc Face, Mo...",1,34617.95
2,Providence St. Mary Medical Center,025,Craniotomy & Endovascular Intracranial Procedu...,2,136247.3182
3,Providence St. Mary Medical Center,026,Craniotomy & Endovascular Intracranial Procedu...,1,42713.12
4,Providence St. Mary Medical Center,029,Spinal Procedures W Cc Or Spinal Neurostimulators,3,90854.076667
5,Providence St. Mary Medical Center,030,Spinal Procedures W/O Cc/Mcc,1,98089.94
...,...,...,...,...,...
447,Providence St. Mary Medical Center,964,Other Multiple Significant Trauma W Cc,3,39856.403333
448,Providence St. Mary Medical Center,981,Extensive O.R. Procedure Unrelated To Principa...,4,82048.23
449,Providence St. Mary Medical Center,982,Extensive O.R. Procedure Unrelated To Principa...,7,52606.302857
450,Providence St. Mary Medical Center,987,Non-Extensive O.R. Proc Unrelated To Principal...,5,116117.77


In [174]:
providence_stmary = providence_stmary[['Facility', 'MS DRG', 'Description', 'Average Proposed Charge']]
providence_stmary.columns = ['hospital', 'drg_code', 'name', 'price']
providence_stmary = providence_stmary.dropna()
providence_stmary.head()

Unnamed: 0,hospital,drg_code,name,price
1,Providence St. Mary Medical Center,3,"Ecmo Or Trach W Mv >96 Hrs Or Pdx Exc Face, Mo...",34617.95
2,Providence St. Mary Medical Center,25,Craniotomy & Endovascular Intracranial Procedu...,136247.3182
3,Providence St. Mary Medical Center,26,Craniotomy & Endovascular Intracranial Procedu...,42713.12
4,Providence St. Mary Medical Center,29,Spinal Procedures W Cc Or Spinal Neurostimulators,90854.076667
5,Providence St. Mary Medical Center,30,Spinal Procedures W/O Cc/Mcc,98089.94


In [175]:
# change data types
providence_stmary['drg_code'] = providence_stmary['drg_code'].astype(int)
providence_stmary['price'] = providence_stmary['price'].astype(float)
providence_stmary.dtypes

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  providence_stmary['drg_code'] = providence_stmary['drg_code'].astype(int)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  providence_stmary['price'] = providence_stmary['price'].astype(float)


hospital     object
drg_code      int64
name         object
price       float64
dtype: object

In [176]:
# round price to 4 decimal places
providence_stmary = providence_stmary.round({'price': 4})
providence_stmary.head()

Unnamed: 0,hospital,drg_code,name,price
1,Providence St. Mary Medical Center,3,"Ecmo Or Trach W Mv >96 Hrs Or Pdx Exc Face, Mo...",34617.95
2,Providence St. Mary Medical Center,25,Craniotomy & Endovascular Intracranial Procedu...,136247.3182
3,Providence St. Mary Medical Center,26,Craniotomy & Endovascular Intracranial Procedu...,42713.12
4,Providence St. Mary Medical Center,29,Spinal Procedures W Cc Or Spinal Neurostimulators,90854.0767
5,Providence St. Mary Medical Center,30,Spinal Procedures W/O Cc/Mcc,98089.94


In [177]:
providence_stmary.insert(1, 'hospital_size', 'Medium')
providence_stmary.insert(2, 'county', 'Walla Walla')
providence_stmary

Unnamed: 0,hospital,hospital_size,county,drg_code,name,price
1,Providence St. Mary Medical Center,Medium,Walla Walla,3,"Ecmo Or Trach W Mv >96 Hrs Or Pdx Exc Face, Mo...",34617.9500
2,Providence St. Mary Medical Center,Medium,Walla Walla,25,Craniotomy & Endovascular Intracranial Procedu...,136247.3182
3,Providence St. Mary Medical Center,Medium,Walla Walla,26,Craniotomy & Endovascular Intracranial Procedu...,42713.1200
4,Providence St. Mary Medical Center,Medium,Walla Walla,29,Spinal Procedures W Cc Or Spinal Neurostimulators,90854.0767
5,Providence St. Mary Medical Center,Medium,Walla Walla,30,Spinal Procedures W/O Cc/Mcc,98089.9400
...,...,...,...,...,...,...
447,Providence St. Mary Medical Center,Medium,Walla Walla,964,Other Multiple Significant Trauma W Cc,39856.4033
448,Providence St. Mary Medical Center,Medium,Walla Walla,981,Extensive O.R. Procedure Unrelated To Principa...,82048.2300
449,Providence St. Mary Medical Center,Medium,Walla Walla,982,Extensive O.R. Procedure Unrelated To Principa...,52606.3029
450,Providence St. Mary Medical Center,Medium,Walla Walla,987,Non-Extensive O.R. Proc Unrelated To Principal...,116117.7700
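All five hospital sections end with the same steps: drop missing rows, cast `drg_code` and `price`, round the price, and prepend the hospital, size, and county columns. A single helper could apply them uniformly, which reduces the chance of one table being converted from another table's columns. The sketch below assumes the input already has the three columns `drg_code`, `name`, and `price`; the function name `standardize` and the toy values are hypothetical.

```python
import pandas as pd

def standardize(df, hospital, hospital_size, county):
    """Return a cleaned copy with the shared six-column layout.

    Assumes `df` already has the columns drg_code, name, price.
    """
    out = df.copy()
    # Treat placeholders such as 'NO DATA' as missing, then drop missing rows.
    out["price"] = pd.to_numeric(out["price"], errors="coerce")
    out = out.dropna(subset=["drg_code", "price"])
    out["drg_code"] = out["drg_code"].astype(int)
    out["price"] = out["price"].round(4)
    out.insert(0, "hospital", hospital)
    out.insert(1, "hospital_size", hospital_size)
    out.insert(2, "county", county)
    return out

# Toy input with a zero-padded code and a placeholder price.
toy = pd.DataFrame({
    "drg_code": ["003", "004", "020"],
    "name": ["A", "B", "C"],
    "price": [707407.966667, "NO DATA", 298443.5875],
})
clean = standardize(toy, "Example Hospital", "Large", "King")
print(clean)
```

Because the helper works on a copy and only reads from its own argument, each hospital table is converted from its own columns.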


#### Join the 5 tables together

In [178]:
hospitals = [evergreen, kindred_firsthill, uw_harborview, providence_stpeter, providence_stmary]
all_hospital_fee = pd.concat(hospitals)
all_hospital_fee

Unnamed: 0,hospital,hospital_size,county,drg_code,name,price
0,EvergreenHealth,Large,King,3,"ECMO OR TRACH W MV >96 HRS OR PDX EXC FACE, MO...",707407.9667
1,EvergreenHealth,Large,King,4,"TRACH W MV >96 HRS OR PDX EXC FACE, MOUTH & NE...",481304.0050
2,EvergreenHealth,Large,King,20,INTRACRANIAL VASCULAR PROCEDURES W PDX HEMORRH...,298443.5875
3,EvergreenHealth,Large,King,23,CRANIOTOMY W MAJOR DEVICE IMPLANT OR ACUTE COM...,225376.5787
4,EvergreenHealth,Large,King,24,CRANIO W MAJOR DEV IMPL/ACUTE COMPLEX CNS PDX ...,105068.8867
...,...,...,...,...,...,...
447,Providence St. Mary Medical Center,Medium,Walla Walla,964,Other Multiple Significant Trauma W Cc,39856.4033
448,Providence St. Mary Medical Center,Medium,Walla Walla,981,Extensive O.R. Procedure Unrelated To Principa...,82048.2300
449,Providence St. Mary Medical Center,Medium,Walla Walla,982,Extensive O.R. Procedure Unrelated To Principa...,52606.3029
450,Providence St. Mary Medical Center,Medium,Walla Walla,987,Non-Extensive O.R. Proc Unrelated To Principal...,116117.7700
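In the combined table each hospital keeps its own row labels, so the same index value can appear more than once (the output starts at 0 and ends at 450 even though it holds all five tables). That does not affect the CSV written with `index=False`, but it can surprise label-based lookups in memory. A minimal sketch of the difference, using two made-up frames:

```python
import pandas as pd

# Two toy frames with overlapping row labels; values are illustrative.
a = pd.DataFrame({"hospital": ["A", "A"], "price": [1.0, 2.0]}, index=[0, 1])
b = pd.DataFrame({"hospital": ["B", "B"], "price": [3.0, 4.0]}, index=[1, 2])

# Default behaviour keeps the original labels, so label 1 appears twice.
stacked = pd.concat([a, b])

# ignore_index=True discards the old labels and numbers rows from 0.
renumbered = pd.concat([a, b], ignore_index=True)

print(stacked.index.tolist())
print(renumbered.index.tolist())
```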


In [179]:
# save dataframe as csv
all_hospital_fee.to_csv('all_hospital_fee.csv', index=False)