## Common Programmatic Assessments in pandas
### Gather

In [1]:
import pandas as pd

In [2]:
patients = pd.read_csv('patients.csv')
treatments = pd.read_csv('treatments.csv')
adverse_reactions = pd.read_csv('adverse_reactions.csv')

### Assess
These are the programmatic assessment methods in pandas that you will probably use most often:

* .head (DataFrame and Series)
* .tail (DataFrame and Series)
* .sample (DataFrame and Series)
* .info (DataFrame; Series too since pandas 1.4)
* .describe (DataFrame and Series)
* .value_counts (Series; DataFrame too since pandas 1.1)
* Various methods of indexing and selecting data (.loc and bracket notation with/without boolean indexing, also .iloc)

Try them out below and keep their results in mind. Some will come in handy later in the lesson.

Check out the [pandas API reference](https://pandas.pydata.org/pandas-docs/stable/reference/index.html) for detailed usage information.
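
The indexing and selecting methods in the list above differ mainly in whether they address rows by label or by position. The following is a minimal sketch on a hypothetical three-row frame named `toy`. Its values are borrowed from the first `patients` rows, but the index labels 10, 20 and 30 are made up so that labels and positions do not coincide.

```python
import pandas as pd

# Hypothetical stand-in for the patients table; the index labels are invented
# so that label-based and position-based access give visibly different keys.
toy = pd.DataFrame(
    {
        "given_name": ["Zoe", "Pamela", "Jae"],
        "city": ["Rancho California", "Armstrong", "York"],
        "state": ["California", "Illinois", "Nebraska"],
    },
    index=[10, 20, 30],
)

by_label = toy.loc[20, "city"]      # .loc: row with index label 20, column "city"
by_position = toy.iloc[1, 1]        # .iloc: second row, second column
state_column = toy["state"]         # bracket notation: one column as a Series
nebraska_rows = toy[toy["state"] == "Nebraska"]  # bracket notation with a boolean mask
```

Here `by_label` and `by_position` point at the same cell, reached once through its label and once through its position.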

Try `.head` and `.tail` on the `patients` table.

In [3]:
patients.head().T

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| patient_id | 1 | 2 | 3 | 4 | 5 |
| assigned_sex | female | female | male | male | male |
| given_name | Zoe | Pamela | Jae | Liêm | Tim |
| surname | Wellish | Hill | Debord | Phan | Neudorf |
| address | 576 Brown Bear Drive | 2370 University Hill Road | 1493 Poling Farm Road | 2335 Webster Street | 1428 Turkey Pen Lane |
| city | Rancho California | Armstrong | York | Woodbridge | Dothan |
| state | California | Illinois | Nebraska | NJ | AL |
| zip_code | 92390 | 61812 | 68467 | 7095 | 36303 |
| country | United States | United States | United States | United States | United States |
| contact | 951-719-9170ZoeWellish@superrito.com | PamelaSHill@cuvox.de+1 (217) 569-3204 | 402-363-6804JaeMDebord@gustr.com | PhanBaLiem@jourrapide.com+1 (732) 636-8246 | 334-515-7487TimNeudorf@cuvox.de |


In [4]:
patients.tail().T

| | 498 | 499 | 500 | 501 | 502 |
|---|---|---|---|---|---|
| patient_id | 499 | 500 | 501 | 502 | 503 |
| assigned_sex | male | male | female | female | male |
| given_name | Mustafa | Ruman | Jinke | Chidalu | Pat |
| surname | Lindström | Bisliev | de Keizer | Onyekaozulu | Gersten |
| address | 2530 Victoria Court | 494 Clarksburg Park Road | 649 Nutter Street | 3652 Boone Crockett Lane | 2778 North Avenue |
| city | Milton Mills | Sedona | Overland Park | Seattle | Burr |
| state | ME | AZ | MO | WA | Nebraska |
| zip_code | 3852 | 86341 | 64110 | 98109 | 68324 |
| country | United States | United States | United States | United States | United States |
| contact | 207-477-0579MustafaLindstrom@jourrapide.com | 928-284-4492RumanBisliev@gustr.com | 816-223-6007JinkedeKeizer@teleworm.us | ChidaluOnyekaozulu@jourrapide.com1 360 443 2060 | PatrickGersten@rhyta.com402-848-4923 |
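
Both `.head` and `.tail` return five rows by default and accept a row count `n`. The sketch below uses a hypothetical ten-row frame (not the real `patients` data) to show a few variations, including a negative `n`.

```python
import pandas as pd

# Hypothetical ten-row frame standing in for a larger table.
toy = pd.DataFrame({"patient_id": range(1, 11)})

first_three = toy.head(3)        # first 3 rows
last_two = toy.tail(2)           # last 2 rows
all_but_last_two = toy.head(-2)  # negative n: every row except the last 2
```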


Try `.sample` on the `treatments` table.

In [5]:
treatments.sample()

| | given_name | surname | auralin | novodra | hba1c_start | hba1c_end | hba1c_change |
|---|---|---|---|---|---|---|---|
| 105 | finlay | sheppard | - | 31u - 30u | 7.51 | 7.17 | 0.34 |


In [6]:
treatments.sample(3)

| | given_name | surname | auralin | novodra | hba1c_start | hba1c_end | hba1c_change |
|---|---|---|---|---|---|---|---|
| 7 | eddie | archer | 31u - 38u | - | 7.89 | 7.55 | 0.34 |
| 145 | ruman | bisliev | 46u - 53u | - | 7.72 | 7.39 | 0.33 |
| 193 | svanhvít | guðjónsdóttir | 37u - 44u | - | 7.57 | 7.11 | NaN |
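
Because `.sample` draws rows at random, rerunning the cells above will usually show different records. If a repeatable draw is wanted, the `random_state` parameter can pin it down. This is a minimal sketch on a hypothetical four-row frame whose values are borrowed from the samples shown above.

```python
import pandas as pd

# Hypothetical stand-in for the treatments table.
toy = pd.DataFrame(
    {
        "given_name": ["eddie", "ruman", "finlay", "svanhvít"],
        "hba1c_start": [7.89, 7.72, 7.51, 7.57],
    }
)

draw_a = toy.sample(2, random_state=42)   # seeded draw of 2 rows
draw_b = toy.sample(2, random_state=42)   # same seed, so the same rows come back
half = toy.sample(frac=0.5, random_state=0)  # draw a fraction of the rows instead of a count
```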


Try `.info` on the `treatments` table.

In [7]:
treatments.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 280 entries, 0 to 279
Data columns (total 7 columns):
given_name      280 non-null object
surname         280 non-null object
auralin         280 non-null object
novodra         280 non-null object
hba1c_start     280 non-null float64
hba1c_end       280 non-null float64
hba1c_change    171 non-null float64
dtypes: float64(3), object(4)
memory usage: 15.4+ KB
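
In the `.info` output above, `hba1c_change` has 171 non-null values out of 280 entries, so 109 are missing. One way to get such counts directly is to combine `.isna` with `.sum`. The sketch below assumes a hypothetical three-row frame with one missing value, not the real `treatments` data.

```python
import pandas as pd

# Hypothetical stand-in for the numeric treatments columns; one change value is missing.
toy = pd.DataFrame(
    {
        "hba1c_start": [7.89, 7.72, 7.57],
        "hba1c_end": [7.55, 7.39, 7.11],
        "hba1c_change": [0.34, 0.33, float("nan")],
    }
)

missing_per_column = toy.isna().sum()  # number of missing values in each column
non_null_per_column = toy.count()      # the non-null counts that .info reports
```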


Try `.describe` on the `patients` table.

In [8]:
patients.describe()

| | patient_id | zip_code | weight | height | bmi |
|---|---|---|---|---|---|
| count | 503.0 | 491.0 | 503.0 | 503.0 | 503.0 |
| mean | 252.0 | 49084.118126 | 173.43499 | 66.634195 | 27.483897 |
| std | 145.347859 | 30265.807442 | 33.916741 | 4.411297 | 5.276438 |
| min | 1.0 | 1002.0 | 48.8 | 27.0 | 17.1 |
| 25% | 126.5 | 21920.5 | 149.3 | 63.0 | 23.3 |
| 50% | 252.0 | 48057.0 | 175.3 | 67.0 | 27.2 |
| 75% | 377.5 | 75679.0 | 199.5 | 70.0 | 31.75 |
| max | 503.0 | 99701.0 | 255.9 | 79.0 | 37.7 |
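
By default, `.describe` on a DataFrame summarizes only the numeric columns, which is why text columns such as `state` are absent above. It also works on a single Series, and an extreme value such as the minimum `height` of 27 can be traced back to its row with boolean indexing. A minimal sketch, assuming a hypothetical four-row frame:

```python
import pandas as pd

# Hypothetical stand-in for two patients columns.
toy = pd.DataFrame(
    {
        "height": [66, 27, 70, 72],
        "state": ["NY", "NY", "AL", "WA"],
    }
)

height_summary = toy["height"].describe()  # numeric Series: count, mean, std, min, quartiles, max
state_summary = toy["state"].describe()    # text Series: count, unique, top, freq
shortest = toy.loc[toy["height"] == toy["height"].min()]  # row(s) holding the minimum height
```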


Try `.value_counts` on the *adverse_reaction* column of the `adverse_reactions` table.

In [9]:
adverse_reactions["adverse_reaction"].value_counts()

hypoglycemia                 19
injection site discomfort     6
headache                      3
cough                         2
nausea                        2
throat irritation             2
Name: adverse_reaction, dtype: int64
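
Two optional parameters of `.value_counts` are often useful during assessment: `normalize=True` returns proportions instead of counts, and `dropna=False` also counts missing values. The sketch below uses a hypothetical four-element Series, not the real `adverse_reactions` data.

```python
import pandas as pd

# Hypothetical stand-in for the adverse_reaction column, with one missing entry.
reactions = pd.Series(
    ["hypoglycemia", "hypoglycemia", "headache", None],
    name="adverse_reaction",
)

counts = reactions.value_counts()                    # missing values are dropped by default
shares = reactions.value_counts(normalize=True)      # proportions of the non-missing values
with_missing = reactions.value_counts(dropna=False)  # missing values get their own row
```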

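Try selecting the records in the `patients` table for patients who are from the *city* New York. The query below also matches the variants `York` and `NY`, so its first result (index 2: York, Nebraska) is not a New York record.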

In [10]:
patients.query(" city=='{}' | city=='{}' | city=='{}' ".format("New York","York","NY")).T

Unnamed: 0,2,9,35,84,129,142,152,188,213,215,229,237,244,247,251,263,277,301,461
patient_id,3,10,36,85,130,143,153,189,214,216,230,238,245,248,252,264,278,302,462
assigned_sex,male,female,female,female,female,male,male,male,female,male,male,male,male,male,male,female,male,female,male
given_name,Jae,Sophie,Kamila,Nương,Rebecca,Finley,Christopher,Søren,Onyemaechi,John,John,John,John,Tuukka,John,Julia,John,Onyekachukwu,Cannan
surname,Debord,Cabrera,Pecinová,Vũ,Jephcott,Chandler,Woodward,Sørensen,Onwughara,Doe,Doe,Doe,Doe,Leppäluoto,Doe,Carvalho,Doe,Obinna,Cabrera
address,1493 Poling Farm Road,3303 Anmoore Road,3558 Longview Avenue,465 Southern Street,989 Wayback Lane,2754 Westwood Avenue,3450 Southern Street,2397 Bell Street,685 Duncan Avenue,123 Main Street,123 Main Street,123 Main Street,123 Main Street,1886 Bicetown Road,123 Main Street,3662 Shinn Street,123 Main Street,2970 Forest Avenue,2102 Geraldine Lane
city,York,New York,New York,New York,New York,New York,New York,New York,New York,New York,New York,New York,New York,New York,New York,New York,New York,New York,New York
state,Nebraska,New York,New York,NY,NY,New York,NY,NY,NY,NY,NY,NY,NY,NY,NY,NY,NY,NY,NY
zip_code,68467,10011,10004,10001,10004,10001,10004,10011,10013,12345,12345,12345,12345,10011,12345,10036,12345,10004,10014
country,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States,United States
contact,402-363-6804JaeMDebord@gustr.com,SophieCabreraIbarra@teleworm.us1 718 795 9124,718-501-0503KamilaPecinova@dayrep.com,VuCamNuong@fleckens.hu516-720-5094,631-370-7406RebeccaJephcott@armyspy.com,516-740-5280FinleyChandler@dayrep.com,ChristopherWoodward@jourrapide.com+1 (516) 630...,SrenSrensen@superrito.com1 212 201 3108,917-622-9142OnyemaechiOnwughara@einrot.com,johndoe@email.com1234567890,johndoe@email.com1234567890,johndoe@email.com1234567890,johndoe@email.com1234567890,917-408-8855TuukkaLeppaluoto@teleworm.us,johndoe@email.com1234567890,JuliaAzevedoCarvalho@superrito.com+1 (212) 782...,johndoe@email.com1234567890,OnyekachukwuObinna@teleworm.us646-982-6609,646-289-4177CannanCabreraOrdonez@superrito.com
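
The same kind of selection can be written with `.loc` and a boolean mask, which is the boolean indexing mentioned in the methods list. This is a sketch, assuming a hypothetical three-row frame built from values in the result above. It compares an exact match on `New York` with a match on several variants via `.isin`.

```python
import pandas as pd

# Hypothetical stand-in for three patients rows taken from the result above.
toy = pd.DataFrame(
    {
        "given_name": ["Jae", "Sophie", "Nương"],
        "city": ["York", "New York", "New York"],
        "state": ["Nebraska", "New York", "NY"],
    }
)

in_new_york = toy.loc[toy["city"] == "New York"]      # exact match with a boolean mask
same_with_query = toy.query("city == 'New York'")     # equivalent selection with .query
variants = toy.loc[toy["city"].isin(["New York", "York", "NY"])]  # wider net, also catches York
```

In this toy data the exact match excludes the York, Nebraska row, while the rows it keeps still spell the state two ways (`New York` and `NY`).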
