## Common Programmatic Assessments in pandas
### Gather

In [1]:
import pandas as pd

In [2]:
patients = pd.read_csv('patients.csv')
treatments = pd.read_csv('treatments.csv')
adverse_reactions = pd.read_csv('adverse_reactions.csv')

### Assess
These are the programmatic assessment methods in pandas that you will probably use most often:

* .head (DataFrame and Series)
* .tail (DataFrame and Series)
* .sample (DataFrame and Series)
* .info (DataFrame; also available on Series since pandas 1.4)
* .describe (DataFrame and Series)
* .value_counts (Series; also available on DataFrame since pandas 1.1)
* Various methods of indexing and selecting data (.loc and bracket notation with/without boolean indexing, also .iloc)
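
The indexing styles named in the last bullet can be compared side by side. This is a minimal sketch on a toy DataFrame: the column names mirror the `patients` table, but the rows and the index labels are made up for illustration.

```python
import pandas as pd

# Toy data (assumption): made-up rows with a non-default index so that
# label-based and position-based access are easy to tell apart.
df = pd.DataFrame(
    {
        "given_name": ["Ana", "Ben", "Cy"],
        "city": ["New York", "Boston", "New York"],
        "weight": [150.0, 180.5, 165.2],
    },
    index=[10, 20, 30],
)

by_label = df.loc[20, "city"]      # label-based: row labeled 20, column "city"
by_position = df.iloc[0, 2]        # position-based: first row, third column
one_column = df["weight"]          # bracket notation with a column name returns a Series
heavy = df[df["weight"] > 160]     # boolean indexing keeps rows where the mask is True
heavy_names = df.loc[df["weight"] > 160, "given_name"]  # mask and column in one step

print(by_label, by_position)
print(heavy)
```

`.loc` works with index labels and column names, while `.iloc` works with integer positions, so the two only agree when the index happens to be 0, 1, 2, and so on.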

Try them out below and keep their results in mind. Some will come in handy later in the lesson.

Check out the [pandas API reference](https://pandas.pydata.org/pandas-docs/stable/api.html) for detailed usage information.

Try `.head` and `.tail` on the `patients` table.

In [3]:
patients.head()

| | patient_id | assigned_sex | given_name | surname | address | city | state | zip_code | country | contact | birthdate | weight | height | bmi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | female | Zoe | Wellish | 576 Brown Bear Drive | Rancho California | California | 92390.0 | United States | 951-719-9170ZoeWellish@superrito.com | 7/10/1976 | 121.7 | 66 | 19.6 |
| 1 | 2 | female | Pamela | Hill | 2370 University Hill Road | Armstrong | Illinois | 61812.0 | United States | PamelaSHill@cuvox.de+1 (217) 569-3204 | 4/3/1967 | 118.8 | 66 | 19.2 |
| 2 | 3 | male | Jae | Debord | 1493 Poling Farm Road | York | Nebraska | 68467.0 | United States | 402-363-6804JaeMDebord@gustr.com | 2/19/1980 | 177.8 | 71 | 24.8 |
| 3 | 4 | male | Liêm | Phan | 2335 Webster Street | Woodbridge | NJ | 7095.0 | United States | PhanBaLiem@jourrapide.com+1 (732) 636-8246 | 7/26/1951 | 220.9 | 70 | 31.7 |
| 4 | 5 | male | Tim | Neudorf | 1428 Turkey Pen Lane | Dothan | AL | 36303.0 | United States | 334-515-7487TimNeudorf@cuvox.de | 2/18/1928 | 192.3 | 27 | 26.1 |


In [4]:
patients.tail()

| | patient_id | assigned_sex | given_name | surname | address | city | state | zip_code | country | contact | birthdate | weight | height | bmi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 498 | 499 | male | Mustafa | Lindström | 2530 Victoria Court | Milton Mills | ME | 3852.0 | United States | 207-477-0579MustafaLindstrom@jourrapide.com | 4/10/1959 | 181.1 | 72 | 24.6 |
| 499 | 500 | male | Ruman | Bisliev | 494 Clarksburg Park Road | Sedona | AZ | 86341.0 | United States | 928-284-4492RumanBisliev@gustr.com | 3/26/1948 | 239.6 | 70 | 34.4 |
| 500 | 501 | female | Jinke | de Keizer | 649 Nutter Street | Overland Park | MO | 64110.0 | United States | 816-223-6007JinkedeKeizer@teleworm.us | 1/13/1971 | 171.2 | 67 | 26.8 |
| 501 | 502 | female | Chidalu | Onyekaozulu | 3652 Boone Crockett Lane | Seattle | WA | 98109.0 | United States | ChidaluOnyekaozulu@jourrapide.com1 360 443 2060 | 2/13/1952 | 176.9 | 67 | 27.7 |
| 502 | 503 | male | Pat | Gersten | 2778 North Avenue | Burr | Nebraska | 68324.0 | United States | PatrickGersten@rhyta.com402-848-4923 | 5/3/1954 | 138.2 | 71 | 19.3 |
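
Both `.head` and `.tail` return five rows by default and accept a row count as their first argument. A short sketch, assuming a toy frame whose `patient_id` values are made up:

```python
import pandas as pd

# Toy data (assumption): ten made-up patient ids, 1 through 10.
df = pd.DataFrame({"patient_id": range(1, 11)})

default_head = df.head()   # default n=5: the first five rows
first_three = df.head(3)   # the first three rows
last_two = df.tail(2)      # the last two rows

print(first_three)
print(last_two)
```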


Try `.sample` on the `treatments` table.

In [5]:
treatments.sample()

| | given_name | surname | auralin | novodra | hba1c_start | hba1c_end | hba1c_change |
|---|---|---|---|---|---|---|---|
| 233 | arne | jørgensen | 32u - 43u | - | 9.65 | 9.31 | 0.34 |
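
With no arguments, `.sample` returns one random row, so the row shown will differ from run to run. The sketch below, on made-up values, shows how `n`, `frac`, and `random_state` might be used to draw more rows and to make the draw repeatable.

```python
import pandas as pd

# Toy data (assumption): six made-up HbA1c readings.
df = pd.DataFrame({"hba1c_start": [7.6, 7.9, 8.1, 7.7, 9.6, 7.5]})

one_row = df.sample()                         # default: a single random row
five_rows = df.sample(n=5, random_state=42)   # five rows, reproducible thanks to the seed
again = df.sample(n=5, random_state=42)       # the same seed selects the same rows
half = df.sample(frac=0.5, random_state=0)    # a fraction of the rows instead of a count

print(five_rows)
```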


Try `.info` on the `treatments` table.

In [6]:
# Call the method with parentheses; a bare `treatments.info` only returns the bound method object
treatments.info()
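
The difference between referencing a method and calling it can be seen on a small frame. This is a sketch with made-up data; writing the summary to an `io.StringIO` buffer is only a convenience here so the text can be inspected as a string.

```python
import io

import pandas as pd

# Toy data (assumption): made-up rows with one missing value per column.
df = pd.DataFrame(
    {"surname": ["day", "sykes", None], "hba1c_change": [0.51, None, 0.38]}
)

attribute = df.info      # no parentheses: a reference to the method, nothing is summarized

buffer = io.StringIO()
df.info(buf=buffer)      # with parentheses: the summary is produced (here, into the buffer)
summary = buffer.getvalue()

print(summary)           # column names, non-null counts, dtypes, memory usage
```

The non-null counts in the summary are a quick way to spot columns with missing values.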

Try `.describe` on the `treatments` table.

In [7]:
treatments.describe()

| | hba1c_start | hba1c_end | hba1c_change |
|---|---|---|---|
| count | 280.0 | 280.0 | 171.0 |
| mean | 7.985929 | 7.589286 | 0.546023 |
| std | 0.568638 | 0.569672 | 0.279555 |
| min | 7.5 | 7.01 | 0.2 |
| 25% | 7.66 | 7.27 | 0.34 |
| 50% | 7.8 | 7.42 | 0.38 |
| 75% | 7.97 | 7.57 | 0.92 |
| max | 9.95 | 9.58 | 0.99 |
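
The `count` row reports non-missing values only, which is why *hba1c_change* shows 171 while the other two columns show 280. A minimal sketch on made-up data illustrates this, along with how `.describe` behaves on a single Series and on a text column:

```python
import numpy as np
import pandas as pd

# Toy data (assumption): made-up values with two missing hba1c_change entries.
df = pd.DataFrame(
    {
        "hba1c_start": [7.5, 7.8, 8.0, 9.0],
        "hba1c_change": [0.3, np.nan, 0.5, np.nan],
        "surname": ["day", "sykes", "archer", "day"],
    }
)

numeric = df.describe()                      # by default, only numeric columns are summarized
series_stats = df["hba1c_start"].describe()  # the same statistics for one Series
text_stats = df["surname"].describe()        # text columns report count, unique, top, freq

print(numeric)
print(text_stats)
```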


Try `.value_counts` on the *adverse_reaction* column of the `adverse_reactions` table.

In [8]:
# Call the method with parentheses to get the counts; without them only the bound method is returned
adverse_reactions.adverse_reaction.value_counts()
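
When called with parentheses, `.value_counts` returns one row per distinct value, sorted from most to least frequent. The sketch below uses a short made-up Series, not the real *adverse_reaction* column.

```python
import pandas as pd

# Toy data (assumption): six made-up reaction labels.
reactions = pd.Series(
    ["hypoglycemia", "cough", "hypoglycemia", "nausea", "hypoglycemia", "cough"]
)

counts = reactions.value_counts()                # occurrences of each distinct value
shares = reactions.value_counts(normalize=True)  # the same, as fractions of the total

print(counts)
print(shares)
```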

Try selecting the records in the `patients` table for patients that are from the *city* New York.

In [10]:
len(patients[patients.city=='New York'])


18
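
The cell above counts the matching rows rather than displaying them. A minimal sketch on made-up rows separates the steps: build the boolean mask, select the records, then count them in either of two equivalent ways.

```python
import pandas as pd

# Toy data (assumption): made-up names and cities.
df = pd.DataFrame(
    {
        "given_name": ["Ana", "Ben", "Cy", "Dee"],
        "city": ["New York", "Boston", "New York", "Seattle"],
    }
)

mask = df["city"] == "New York"   # boolean Series: True where the city matches exactly
new_yorkers = df[mask]            # the selected records themselves
n_rows = len(new_yorkers)         # count by measuring the selection
n_true = int(mask.sum())          # count by summing the mask (True counts as 1)

print(new_yorkers)
print(n_rows, n_true)
```

The comparison is an exact string match, so a different spelling of the same city would not be selected.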