# Task 3 - US Healthcare Readmissions and Mortality
Author: [Adrian Vega](https://github.com/adriacv17) <br>Repository: [datafun-06-projects](https://github.com/adriacv17/datafun-06-projects) <br>Date: 09/28/2023 <br>Custom Exploratory Data Project

## Section 1-Load - Read from a data file into a pandas DataFrame.

In [None]:
import pandas as pd # import pandas library
import numpy as np # import numpy library
import statistics as stats #import statistics library

labvaluesdf = pd.read_csv('HepatitisCdata.csv', index_col=0) # Create DataFrame from csv file

#rename columns for clarity on lab test
labvaluesdf.columns = ['Category', 'Age', 'Sex', 'Albumin', 'Alkaline Phosphatase',
                        'Alanine Transaminase', 'Aspartate Transamimase', 'Bilirubin',
                          'Acetylcholinesterase', 'Cholesterol', 'Creatinine',
                            'Gamma-Glutamyl Transferase', 'Proteins']

labvaluesdf # call labvaluesdf DataFrame

## Section 2-View - Display the first 5 rows and the last 5 rows.

In [None]:
labvaluesdf.head(5) #display first 5 rows

In [None]:
labvaluesdf.tail(5) #display last 5 rows

## Section 3-Describe: Use the DataFrame describe() function to calculate basic descriptive statistics for all numeric columns. 

In [None]:
pd.set_option('display.precision', 2) # format for floating-point values

labvaluesdf.describe()  # basic descriptive statistics; describe() summarizes numeric columns by default

## Section 4-Series: Use the Series method describe() to calculate the descriptive stats for all category/text columns.

In [None]:
# https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.describe.html for help with library

labvaluesdf.describe(exclude=[np.number]) # describe only the non-numeric columns by excluding numeric dtypes

In [None]:
labvaluesdf.describe(include='object') # same idea, selecting object (text) columns explicitly
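
The heading names the Series method, while the cells above call `describe()` on the whole DataFrame. A minimal sketch of the Series form follows. The `sample` frame is a small illustrative stand-in for `labvaluesdf`, not the real data.

```python
import pandas as pd

# Illustrative stand-in for labvaluesdf; the notebook itself reads HepatitisCdata.csv.
sample = pd.DataFrame({
    'Age': [32, 32, 62, 64],
    'Sex': ['m', 'm', 'f', 'f'],
})

# Series.describe() on a text column reports count, unique, top, and freq.
sex_summary = sample['Sex'].describe()
print(sex_summary)
```

Calling `describe()` on one column at a time can be handy when only a single category column is of interest.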

## Section 5-Unique: Use the Series method unique() to get unique category values. 

In [None]:
unique_category = labvaluesdf['Category'].unique() # Category/type of the patient: get all unique values
print(unique_category)

In [None]:
unique_sex = labvaluesdf['Sex'].unique() # m = male, f = female: get all unique values
print(unique_sex)
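
`unique()` lists the distinct values but not how often each occurs. As a hedged companion sketch, `value_counts()` could be used alongside it. The `sex` Series below holds illustrative values only, not the real column.

```python
import pandas as pd

# Illustrative values only; the real column comes from labvaluesdf['Sex'].
sex = pd.Series(['m', 'm', 'm', 'f', 'f'], name='Sex')

unique_sex = sex.unique()        # distinct values, in order of first appearance
sex_counts = sex.value_counts()  # number of rows carrying each value

print(unique_sex)
print(sex_counts)
```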

## Section 6-Histograms: Use the DataFrame's hist() function to create a histogram for each numerical column.

In [None]:
import matplotlib.pyplot as plt

#enable Matplotlib support
%matplotlib inline

# Histogram of Age, plotted separately from the laboratory tests

histogram = labvaluesdf['Age'].hist()
plt.title('Histogram of Age') # set title
plt.ylabel('Frequency') #label Y axis
plt.xlabel('Age') #label X axis



In [None]:
#Histograms of laboratory tests

columns = ['Albumin', 'Alkaline Phosphatase', 'Alanine Transaminase',
            'Aspartate Transamimase', 'Bilirubin',
            'Acetylcholinesterase', 'Cholesterol', 'Creatinine',
            'Gamma-Glutamyl Transferase', 'Proteins'] # make list of columns for loop

for col in columns: # loop through each column, making one histogram per lab test
    histogram = labvaluesdf.hist(column=col) # produce histogram in a new figure

    plt.title(f'Histogram of {col.title()}') # set title, use title function
    plt.ylabel('Frequency') # label Y axis: number of rows in each bin
    plt.xlabel(col.title()) # label X axis with the lab test measured, use title function

## Section 7-List: Get some of your information into a list. Process each item in the list (use for or comprehensions as you like). 

In [None]:
# Get information into lists

abnormal_AST = [AST for AST in labvaluesdf['Aspartate Transamimase'].tolist() if AST > 34] # keep AST levels greater than 34 (abnormal)

print('List of abnormal AST levels:', abnormal_AST)

abnormal_ALT = [ALT for ALT in labvaluesdf['Alanine Transaminase'].tolist() if ALT > 35] # keep ALT levels greater than 35 (abnormal; female range used)

print('List of abnormal ALT levels:', abnormal_ALT)
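
The comprehensions above can also be written as a boolean mask on the Series. The sketch below compares the two forms. It assumes the same threshold of 34 and uses the five AST values visible in the Section 8 output rather than the full column.

```python
import pandas as pd

# AST values copied from the first five rows of the Section 8 output.
ast = pd.Series([22.1, 24.7, 52.6, 22.6, 24.8], name='Aspartate Transamimase')

AST_UPPER_LIMIT = 34  # same threshold as the comprehension in the notebook

# List comprehension, as in the cell above.
abnormal_list = [value for value in ast.tolist() if value > AST_UPPER_LIMIT]

# Equivalent boolean mask; the result keeps the original index labels.
abnormal_series = ast[ast > AST_UPPER_LIMIT]

print(abnormal_list)
print(abnormal_series.tolist())
```

The mask form keeps the row labels, which may help when the abnormal values need to be traced back to individual rows.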


## Section 8-Filter: Use filter() to show only part of the information. 

In [288]:

labvaluesdf.filter(['Age', 'Sex', 'Alanine Transaminase','Aspartate Transamimase', 'Bilirubin']) #filtered for big indicators of liver disease

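| | Age | Sex | Alanine Transaminase | Aspartate Transamimase | Bilirubin |
|---|---|---|---|---|---|
| 1 | 32 | m | 7.7 | 22.1 | 7.5 |
| 2 | 32 | m | 18.0 | 24.7 | 3.9 |
| 3 | 32 | m | 36.2 | 52.6 | 6.1 |
| 4 | 32 | m | 30.6 | 22.6 | 18.9 |
| 5 | 32 | m | 32.6 | 24.8 | 9.6 |
| ... | ... | ... | ... | ... | ... |
| 611 | 62 | f | 5.9 | 110.3 | 50.0 |
| 612 | 64 | f | 2.9 | 44.4 | 20.0 |
| 613 | 64 | f | 3.5 | 99.0 | 48.0 |
| 614 | 46 | f | 39.0 | 62.0 | 20.0 |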
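
Note that `DataFrame.filter()` selects by label (here, column names) and does not filter on values. The sketch below contrasts it with a boolean mask for rows. The `sample` frame reuses the first five rows of the output above, and the bilirubin cutoff of 9 is an arbitrary illustrative value, not a clinical threshold.

```python
import pandas as pd

# First five rows copied from the output above (index labels 1 to 5).
sample = pd.DataFrame(
    {
        'Age': [32, 32, 32, 32, 32],
        'Sex': ['m'] * 5,
        'Alanine Transaminase': [7.7, 18.0, 36.2, 30.6, 32.6],
        'Aspartate Transamimase': [22.1, 24.7, 52.6, 22.6, 24.8],
        'Bilirubin': [7.5, 3.9, 6.1, 18.9, 9.6],
    },
    index=[1, 2, 3, 4, 5],
)

# filter(like=...) keeps the columns whose label contains the substring.
transaminases = sample.filter(like='Trans')

# Rows are selected by value with a boolean mask, not with filter().
# The cutoff of 9 is arbitrary and only for illustration.
high_bilirubin = sample[sample['Bilirubin'] > 9]

print(transaminases.columns.tolist())
print(high_bilirubin.index.tolist())
```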


## Section 9-Map: Use map() to transform some of the data.
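
No cell follows this heading in the source. As a minimal sketch of what `map()` could look like here, the example below expands the `Sex` codes into full labels. The `sex_labels` dictionary and the sample values are assumptions for illustration; only the meaning of the codes (m = male, f = female) comes from the notebook.

```python
import pandas as pd

# Illustrative values only; the real column comes from labvaluesdf['Sex'].
sex = pd.Series(['m', 'm', 'f', 'f'], name='Sex')

# Hypothetical mapping based on the notebook's note that m = male, f = female.
sex_labels = {'m': 'Male', 'f': 'Female'}

# Series.map() looks up each value in the dict; values not in the dict become NaN.
sex_full = sex.map(sex_labels)

print(sex_full.tolist())
```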