# pandas Dataframes - Examining Data

## lesson_2_2_1

### Import packages

In [13]:
import pandas as pd

### Creating a Basic Dataframe From a List of Records

In [14]:
# define the data as a list
data = [
    ("Dexter","Johnsons","dog","shiba inu","red sesame",1.5,35,"m",False,"both",True),
    ("Alfred","Johnsons","cat","mix","tuxedo",4,12,"m",True,"indoor",True),
    ("Petra","Smith","cat","ragdoll","calico",6,None,"f",False,"both",True),
    ("Ava","Smith","dog","mix","blk/wht",12,32,"f",True,"both",False),
    ("Schroder","Brown","cat","mix","orange",13,15,"m",False,"indoor",True),
    ("Blackbeard","Brown","bird","parrot","multi",5,3,"f",False,"indoor",None),
]

# define the labels
labels = ["name","owner","type","breed","color","age","weight","gender","health issues","indoor/outboor","vaccinated"]

# create dataframe
vet_records = pd.DataFrame.from_records(data, columns=labels)

In [15]:
vet_records

,name,owner,type,breed,color,age,weight,gender,health issues,indoor/outboor,vaccinated
0,Dexter,Johnsons,dog,shiba inu,red sesame,1.5,35.0,m,False,both,True
1,Alfred,Johnsons,cat,mix,tuxedo,4.0,12.0,m,True,indoor,True
2,Petra,Smith,cat,ragdoll,calico,6.0,,f,False,both,True
3,Ava,Smith,dog,mix,blk/wht,12.0,32.0,f,True,both,False
4,Schroder,Brown,cat,mix,orange,13.0,15.0,m,False,indoor,True
5,Blackbeard,Brown,bird,parrot,multi,5.0,3.0,f,False,indoor,


### Examining the Data in a Dataframe

There are several ways to examine the data in a pandas dataframe. Two of the most common are `.head()` and `.tail()`, which by default show the first five and the last five rows of the dataframe, respectively.

In [16]:
# displays the first five rows in the dataframe
vet_records.head()

,name,owner,type,breed,color,age,weight,gender,health issues,indoor/outboor,vaccinated
0,Dexter,Johnsons,dog,shiba inu,red sesame,1.5,35.0,m,False,both,True
1,Alfred,Johnsons,cat,mix,tuxedo,4.0,12.0,m,True,indoor,True
2,Petra,Smith,cat,ragdoll,calico,6.0,,f,False,both,True
3,Ava,Smith,dog,mix,blk/wht,12.0,32.0,f,True,both,False
4,Schroder,Brown,cat,mix,orange,13.0,15.0,m,False,indoor,True


In [17]:
# displays the last five rows in the dataframe
vet_records.tail()

,name,owner,type,breed,color,age,weight,gender,health issues,indoor/outboor,vaccinated
1,Alfred,Johnsons,cat,mix,tuxedo,4.0,12.0,m,True,indoor,True
2,Petra,Smith,cat,ragdoll,calico,6.0,,f,False,both,True
3,Ava,Smith,dog,mix,blk/wht,12.0,32.0,f,True,both,False
4,Schroder,Brown,cat,mix,orange,13.0,15.0,m,False,indoor,True
5,Blackbeard,Brown,bird,parrot,multi,5.0,3.0,f,False,indoor,
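
Both methods also accept a row count, which helps when five rows is too many or too few. The sketch below uses a small stand-in dataframe named `pets` (a hypothetical name with only three of the columns from `vet_records`) so that it runs on its own.

```python
import pandas as pd

# Hypothetical stand-in for vet_records, reduced to three columns
pets = pd.DataFrame(
    {
        "name": ["Dexter", "Alfred", "Petra", "Ava", "Schroder", "Blackbeard"],
        "type": ["dog", "cat", "cat", "dog", "cat", "bird"],
        "age": [1.5, 4, 6, 12, 13, 5],
    }
)

# Pass the number of rows you want instead of relying on the default of five
first_two = pets.head(2)  # rows with index labels 0 and 1
last_two = pets.tail(2)   # rows with index labels 4 and 5

print(first_two)
print(last_two)
```

Note that `.tail()` keeps the original index labels, so the last two rows are still labelled 4 and 5.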


In [18]:
# displays all the records of the dataframe
vet_records

,name,owner,type,breed,color,age,weight,gender,health issues,indoor/outboor,vaccinated
0,Dexter,Johnsons,dog,shiba inu,red sesame,1.5,35.0,m,False,both,True
1,Alfred,Johnsons,cat,mix,tuxedo,4.0,12.0,m,True,indoor,True
2,Petra,Smith,cat,ragdoll,calico,6.0,,f,False,both,True
3,Ava,Smith,dog,mix,blk/wht,12.0,32.0,f,True,both,False
4,Schroder,Brown,cat,mix,orange,13.0,15.0,m,False,indoor,True
5,Blackbeard,Brown,bird,parrot,multi,5.0,3.0,f,False,indoor,


#### `.dtypes` shows the data type of each column in the dataframe. A `dtype` of `object` means the column holds general Python objects, most often strings, or a mix of types.

In [19]:
# object usually means a column of strings or of mixed types
vet_records.dtypes

name               object
owner              object
type               object
breed              object
color              object
age               float64
weight            float64
gender             object
health issues        bool
indoor/outboor     object
vaccinated         object
dtype: object

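Notice that all the string columns are listed as `object`. pandas is built on NumPy, whose native string types are fixed-width, so by default pandas stores text as Python string objects under the `object` dtype, which allows values of any length. The `vaccinated` column is also `object` because it mixes booleans with a missing value.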
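
To see what an `object` column actually contains, you can inspect the Python type of each value. This is a minimal sketch on a hypothetical three-row dataframe named `pets`; the conversion with `astype("string")` assumes pandas 1.0 or later, and newer pandas releases may report text columns with a dedicated string dtype rather than `object`.

```python
import pandas as pd

# Hypothetical stand-in for a few vet_records columns
pets = pd.DataFrame(
    {
        "name": ["Dexter", "Alfred", "Blackbeard"],
        "health issues": [False, True, False],
        "vaccinated": [True, True, None],  # missing value for Blackbeard
    }
)

print(pets.dtypes)

# A complete boolean column gets the bool dtype,
# while a boolean column with a missing value falls back to object
vaccinated_types = {type(value).__name__ for value in pets["vaccinated"]}
print(vaccinated_types)

# Text can be converted to the dedicated string dtype explicitly
names_as_string = pets["name"].astype("string")
print(names_as_string.dtype)
```

Checking the element types this way is a quick test of whether an `object` column is all text or genuinely mixed.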

#### `.describe` shows summary statistics for the columns that support them. By default, only numeric columns are included.

In [20]:
# summary statistics for the numeric columns; missing values are excluded from the counts
vet_records.describe()

,age,weight
count,6.0,5.0
mean,6.916667,19.4
std,4.58712,13.649176
min,1.5,3.0
25%,4.25,12.0
50%,5.5,15.0
75%,10.5,32.0
max,13.0,35.0


In [21]:
# to show all columns in `.describe` add `include="all"`
vet_records.describe(include="all")

,name,owner,type,breed,color,age,weight,gender,health issues,indoor/outboor,vaccinated
count,6,6,6,6,6,6.0,5.0,6,6,6,5
unique,6,3,3,4,6,,,2,2,2,2
top,Dexter,Johnsons,cat,mix,red sesame,,,m,False,both,True
freq,1,2,3,3,1,,,3,4,3,4
mean,,,,,,6.916667,19.4,,,,
std,,,,,,4.58712,13.649176,,,,
min,,,,,,1.5,3.0,,,,
25%,,,,,,4.25,12.0,,,,
50%,,,,,,5.5,15.0,,,,
75%,,,,,,10.5,32.0,,,,
max,,,,,,13.0,35.0,,,,
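
The two kinds of summary can also be requested separately. The sketch below, on a hypothetical stand-in dataframe named `pets`, shows that `count` skips missing values and that calling `.describe()` on a single text column returns `count`, `unique`, `top`, and `freq` instead of numeric statistics.

```python
import pandas as pd

# Hypothetical stand-in for vet_records, reduced to three columns
pets = pd.DataFrame(
    {
        "type": ["dog", "cat", "cat", "dog", "cat", "bird"],
        "age": [1.5, 4, 6, 12, 13, 5],
        "weight": [35, 12, None, 32, 15, 3],  # Petra's weight is missing
    }
)

# Numeric summary: weight has one fewer observation than age
numeric_summary = pets.describe()
print(numeric_summary.loc["count"])

# Summary of a single text column
type_summary = pets["type"].describe()
print(type_summary)
```

Comparing `count` across columns is a quick way to spot which columns contain missing values.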


#### `.at` reads or changes the value of a specific cell, addressed by row label and column label

In [10]:
# change a specific value with `.at`
vet_records.at[0, "weight"] = 34.7 

In [11]:
# notice the weight was changed for Dexter
vet_records

,name,owner,type,breed,color,age,weight,gender,health issues,indoor/outboor,vaccinated
0,Dexter,Johnsons,dog,shiba inu,red sesame,1.5,34.7,m,False,both,True
1,Alfred,Johnsons,cat,mix,tuxedo,4.0,12.0,m,True,indoor,True
2,Petra,Smith,cat,ragdoll,calico,6.0,,f,False,both,True
3,Ava,Smith,dog,mix,blk/wht,12.0,32.0,f,True,both,False
4,Schroder,Brown,cat,mix,orange,13.0,15.0,m,False,indoor,True
5,Blackbeard,Brown,bird,parrot,multi,5.0,3.0,f,False,indoor,
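
`.at` works for reading as well as writing, and `.iat` is its position-based counterpart. The following is a small sketch on a hypothetical two-row dataframe named `pets`.

```python
import pandas as pd

# Hypothetical two-row stand-in for vet_records
pets = pd.DataFrame({"name": ["Dexter", "Alfred"], "weight": [35.0, 12.0]})

# Read one cell: row label 0, column label "weight"
before = pets.at[0, "weight"]

# Write one cell in place; no other cell is touched
pets.at[0, "weight"] = 34.7

# .iat addresses the same cell by integer position (row 0, column 1)
after = pets.iat[0, 1]

print(before, after)
```

Unlike `.assign`, `.at` modifies the existing dataframe rather than returning a new one.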


#### `.assign` returns a new dataframe with an additional column of data

In [22]:
# add the ratio age:weight as a new column of the dataframe
# the division is vectorized: pandas computes it for every row at once, without an explicit loop
vet_records = vet_records.assign(age_weight=(vet_records['age']/vet_records['weight']))

In [24]:
# review the new dataframe
vet_records

,name,owner,type,breed,color,age,weight,gender,health issues,indoor/outboor,vaccinated,age_weight
0,Dexter,Johnsons,dog,shiba inu,red sesame,1.5,34.7,m,False,both,True,0.043228
1,Alfred,Johnsons,cat,mix,tuxedo,4.0,12.0,m,True,indoor,True,0.333333
2,Petra,Smith,cat,ragdoll,calico,6.0,,f,False,both,True,
3,Ava,Smith,dog,mix,blk/wht,12.0,32.0,f,True,both,False,0.375
4,Schroder,Brown,cat,mix,orange,13.0,15.0,m,False,indoor,True,0.866667
5,Blackbeard,Brown,bird,parrot,multi,5.0,3.0,f,False,indoor,,1.666667
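
Because `.assign` returns a new dataframe, the original is left unchanged unless you reassign the result. This sketch uses a hypothetical three-row dataframe named `pets` and passes a `lambda` so the new column is computed from the dataframe that `.assign` is called on.

```python
import pandas as pd

# Hypothetical stand-in for vet_records; Petra's weight is missing
pets = pd.DataFrame(
    {
        "name": ["Dexter", "Alfred", "Petra"],
        "age": [1.5, 4.0, 6.0],
        "weight": [35.0, 12.0, None],
    }
)

# The lambda receives the dataframe and returns the new column
with_ratio = pets.assign(age_weight=lambda df: df["age"] / df["weight"])

print("age_weight" in pets.columns)  # False: the original is unchanged
print(with_ratio["age_weight"].round(6).tolist())
```

A missing weight produces a missing ratio, which matches the empty `age_weight` value for Petra in the output above.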
