**DataFrames Day 1**

Learning Outcomes: Prep for LO 1. Students will be able to select, run, and interpret univariate and bivariate statistical analyses and visualizations.

In [1]:
#Import the pandas package
import pandas as pd

Pandas is a popular open-source Python library used for data manipulation and analysis.

# **Dataframes**

DataFrame: A two-dimensional table provided by the pandas package. It consists of data, rows, and columns. It can be created from scratch or from a csv file (what we will primarily do this semester).

# **Creating Dataframes from Scratch**

There are multiple ways to create dataframes from scratch. Most of the dataframes we will use in this class will be created from a csv file, so we won't cover all the ways you can create them from scratch. Below is one way. Collapsed at the end of the file are other ways. If interested, you can review those on your own.

The dataframe below is created from a list of lists where each list contains the data for a row. The data lists do NOT include the row labels (the character names), so those are supplied separately through the index argument.

In [2]:
data=[['Gryffindor', 11, 'Male', 'Half-blood'], ['Gryffindor', 11, 'Female', 'Muggle-born'],['Slytherin', 11, 'Male', 'Pure-blood'],
      ['Slytherin', 31, 'Male', 'Half-blood'], ['Gryffindor', 63, 'Male', 'Half-Giant']]

HP_df = pd.DataFrame(data,
                  index=['Harry Potter', 'Hermione Granger', 'Draco Malfoy', 'Severus Snape', 'Rubeus Hagrid'],
                  columns=['House', 'Age', 'Gender', 'BloodStatus'])

HP_df.index.names = ['Name'] # name the row labeled index 'Name'. Optional: if you skip this, the index name is blank
HP_df

Name,House,Age,Gender,BloodStatus
Harry Potter,Gryffindor,11,Male,Half-blood
Hermione Granger,Gryffindor,11,Female,Muggle-born
Draco Malfoy,Slytherin,11,Male,Pure-blood
Severus Snape,Slytherin,31,Male,Half-blood
Rubeus Hagrid,Gryffindor,63,Male,Half-Giant


In [3]:
#Make your own DataFrame with 2 rows and 2 columns
bears = [["red", "small"], ["black and white", "large"]]
DFBear = pd.DataFrame(bears, index=["Chifu","Po"], columns=["color", "size"])
DFBear

Unnamed: 0,color,size
Chifu,red,small
Po,black and white,large


# **Creating DataFrames from CSV File**

The code below shows how to create a dataframe from a csv file.

Notice that this csv file doesn't have an automatic row labeled index (like Name in the Harry Potter dataframe). When creating the dataframe, a row labeled index was added with values 0,1,2,3, etc. It doesn't have a name.

The row labeled index and the row range index of this dataframe have the same values when sorted in the default sorting.

In [4]:
df = pd.read_csv('https://www.dropbox.com/s/puruptejplzhts5/insurance.csv?dl=1')
df

Unnamed: 0,age,sex,bmi,children,smoker,region,charges
0,19,female,27.900,0,yes,southwest,16884.92400
1,18,male,33.770,1,no,southeast,1725.55230
2,28,male,33.000,3,no,southeast,4449.46200
3,33,male,22.705,0,no,northwest,21984.47061
4,32,male,28.880,0,no,northwest,3866.85520
...,...,...,...,...,...,...,...
1333,50,male,30.970,3,no,northwest,10600.54830
1334,18,female,31.920,0,no,northeast,2205.98080
1335,18,female,36.850,0,no,southeast,1629.83350
1336,21,female,25.800,0,no,southwest,2007.94500
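
The default index behavior described above can be checked without downloading anything. Below is a minimal sketch that reads a small CSV from an in-memory string. The CSV text and the names `csv_text`, `default_df`, and `named_df` are made up for illustration. `index_col` is a standard `pd.read_csv` parameter that uses one of the file's columns as the row labeled index.

```python
import io
import pandas as pd

# Hypothetical CSV content, standing in for a file on disk
csv_text = "name,age,smoker\nAna,19,yes\nBen,18,no\nCal,28,no\n"

# Without index_col, pandas adds an unnamed row index 0, 1, 2, ...
default_df = pd.read_csv(io.StringIO(csv_text))
print(default_df.index.tolist())   # [0, 1, 2]
print(default_df.index.name)       # None

# With index_col, the 'name' column becomes the row labeled index
named_df = pd.read_csv(io.StringIO(csv_text), index_col="name")
print(named_df.index.tolist())     # ['Ana', 'Ben', 'Cal']
```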


# **Return Only A Few Rows (.head)**

Calling the .head() method on the dataframe returns the first 5 records. If you put a number inside the parentheses, it returns that many records: .head(10) returns the first 10 records.

In [24]:
df.head()

Unnamed: 0,age,sex,bmi,children,smoker,region,charges
0,19,female,27.9,0,yes,southwest,16884.924
1,18,male,33.77,1,no,southeast,1725.5523
2,28,male,33.0,3,no,southeast,4449.462
3,33,male,22.705,0,no,northwest,21984.47061
4,32,male,28.88,0,no,northwest,3866.8552


In [25]:
df.head(10)

Unnamed: 0,age,sex,bmi,children,smoker,region,charges
0,19,female,27.9,0,yes,southwest,16884.924
1,18,male,33.77,1,no,southeast,1725.5523
2,28,male,33.0,3,no,southeast,4449.462
3,33,male,22.705,0,no,northwest,21984.47061
4,32,male,28.88,0,no,northwest,3866.8552
5,31,female,25.74,0,no,southeast,3756.6216
6,46,female,33.44,1,no,southeast,8240.5896
7,37,female,27.74,3,no,northwest,7281.5056
8,37,male,29.83,2,no,northeast,6406.4107
9,60,female,25.84,0,no,northwest,28923.13692
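
As a quick check of the row counts, here is a small sketch on a made-up 12-row frame (`toy_df` is a hypothetical name, not part of the insurance data). It also uses `.tail()`, the pandas counterpart of `.head()` that returns rows from the end.

```python
import pandas as pd

# Hypothetical frame with 12 rows
toy_df = pd.DataFrame({"n": range(12)})

first_five = toy_df.head()      # default is 5 rows
first_three = toy_df.head(3)    # first 3 rows
last_two = toy_df.tail(2)       # last 2 rows

print(len(first_five), len(first_three), len(last_two))  # 5 3 2
print(last_two["n"].tolist())                            # [10, 11]
```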


# **Read DataFrame Columns**

Return just the charges column

In [26]:
df['charges']

0       16884.92400
1        1725.55230
2        4449.46200
3       21984.47061
4        3866.85520
           ...     
1333    10600.54830
1334     2205.98080
1335     1629.83350
1336     2007.94500
1337    29141.36030
Name: charges, Length: 1338, dtype: float64

The output also shows each row's index value to the left of the charges value.
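
Selecting more than one column works the same way, except that you pass a list of column names. The sketch below uses a made-up three-row frame (`toy_df`, with values copied from the first rows of the insurance data) to show that single brackets return a Series while a list of names returns a DataFrame.

```python
import pandas as pd

# Hypothetical three-row stand-in for the insurance dataframe
toy_df = pd.DataFrame({"age": [19, 18, 28],
                       "bmi": [27.9, 33.77, 33.0],
                       "charges": [16884.924, 1725.5523, 4449.462]})

one_col = toy_df["charges"]            # single name -> Series
two_cols = toy_df[["age", "charges"]]  # list of names -> DataFrame

print(type(one_col).__name__)   # Series
print(type(two_cols).__name__)  # DataFrame
print(list(two_cols.columns))   # ['age', 'charges']
```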

# **Read DataFrame Rows**

**.iloc[]** is a pandas indexer used to access and retrieve data from a DataFrame based on its integer **range index** (row position). It is written with square brackets, not parentheses.

The syntax is generally: df.iloc[row_index, column_index]. When the column_index is used, it will return a specific cell, not an entire row.

Here are a few ways it is used.

Return the third row (remember that the first row is represented by 0) of the insurance dataframe.

In [27]:
df.iloc[2]

age                28
sex              male
bmi              33.0
children            3
smoker             no
region      southeast
charges      4449.462
Name: 2, dtype: object

Return the first 2 rows of the insurance dataframe.

In [28]:
df.iloc[:2]

Unnamed: 0,age,sex,bmi,children,smoker,region,charges
0,19,female,27.9,0,yes,southwest,16884.924
1,18,male,33.77,1,no,southeast,1725.5523


Return rows 3-5 of the insurance dataframe.

In [29]:
df.iloc[2:5] #it doesn't include the last index row number you include

Unnamed: 0,age,sex,bmi,children,smoker,region,charges
2,28,male,33.0,3,no,southeast,4449.462
3,33,male,22.705,0,no,northwest,21984.47061
4,32,male,28.88,0,no,northwest,3866.8552
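
The slicing rules above follow ordinary Python list slicing. Here is a small sketch on a made-up frame (`toy_df` and the other variable names are hypothetical) that also shows two variations not covered above: a negative position and a list of positions.

```python
import pandas as pd

# Hypothetical six-row frame
toy_df = pd.DataFrame({"letter": list("abcdef")})

middle = toy_df.iloc[2:5]       # positions 2, 3, 4 (position 5 is excluded)
last_row = toy_df.iloc[-1]      # negative positions count from the end
picked = toy_df.iloc[[0, 3]]    # a list selects specific positions

print(middle["letter"].tolist())   # ['c', 'd', 'e']
print(last_row["letter"])          # f
print(picked["letter"].tolist())   # ['a', 'd']
```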


**.loc[]** is a pandas indexer used to access and retrieve data from a DataFrame based on its **labeled index**. Like .iloc[], it is written with square brackets.

The syntax is generally: df.loc[row_index, column_index]. When the column_index is used, it will return a specific cell, not an entire row.

Return Harry Potter's row by both range index and labeled index

In [30]:
#Range Index
HP_df.iloc[0]

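IndexError: single positional indexer is out-of-bounds

Note: this error appears because the cell was run after HP_df had been emptied in the Delete section. If you re-create HP_df first, the cell returns Harry Potter's row.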

In [None]:
#Labeled Index
HP_df.loc['Harry Potter']
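
To compare the two indexers side by side, here is a minimal sketch on a small stand-in for the Harry Potter dataframe (`toy_df` is a hypothetical name). It also illustrates one difference worth knowing: position slices with `.iloc` exclude the end, while label slices with `.loc` include it.

```python
import pandas as pd

# Small stand-in for the Harry Potter dataframe
toy_df = pd.DataFrame({"House": ["Gryffindor", "Gryffindor", "Slytherin"],
                       "Age": [11, 11, 11]},
                      index=["Harry Potter", "Hermione Granger", "Draco Malfoy"])

by_position = toy_df.iloc[0]            # first row, by range index
by_label = toy_df.loc["Harry Potter"]   # same row, by labeled index
print(by_position.equals(by_label))     # True

# Position slices exclude the end; label slices include it
position_slice = toy_df.iloc[0:1]
label_slice = toy_df.loc["Harry Potter":"Hermione Granger"]
print(len(position_slice))   # 1
print(len(label_slice))      # 2
```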

# **Read DataFrame Cells**

By including the column_index, we can use .iloc[] and .loc[] to return DataFrame cells.

Return the BMI value of the 2nd row.

In [5]:
df.iloc[1, 2]

33.77

Return the BMI values of the first two rows

In [6]:
df.iloc[:2, 2]

0    27.90
1    33.77
Name: bmi, dtype: float64

Return Harry Potter's Blood Status

In [7]:
HP_df.loc['Harry Potter', 'BloodStatus']

'Half-blood'
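
For single cells, pandas also provides `.at[]` (label based) and `.iat[]` (position based). The sketch below is an optional aside on a made-up two-row frame (`toy_df` and `col_pos` are hypothetical names). It also shows `columns.get_loc`, which looks up a column's position so you don't have to count columns by hand.

```python
import pandas as pd

# Two-row stand-in for the Harry Potter dataframe
toy_df = pd.DataFrame({"Age": [11, 31],
                       "BloodStatus": ["Half-blood", "Half-blood"]},
                      index=["Harry Potter", "Severus Snape"])

print(toy_df.at["Severus Snape", "Age"])   # 31 (row label, column label)
print(toy_df.iat[0, 1])                    # Half-blood (row position, column position)

# Look up a column's position instead of hard-coding it
col_pos = toy_df.columns.get_loc("BloodStatus")
print(col_pos)                             # 1
print(toy_df.iloc[0, col_pos])             # Half-blood
```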

# **Adding DataFrame Rows**

You can add a row to a dataframe using .loc[]. This adds a row at the specified row label.

Add Neville Longbottom's information to the Harry Potter dataframe.

In [None]:
nevilleLongbottomList = ["Gryffindor", 11, "Male", "Pure-blood"]

HP_df.loc['Neville Longbottom'] = nevilleLongbottomList

HP_df
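
Whether `.loc` adds or overwrites depends on whether the label already exists. Here is a minimal sketch on a made-up two-row frame (`toy_df` is a hypothetical name, and the changed age of 12 is invented purely to make the overwrite visible).

```python
import pandas as pd

# Two-row stand-in for the Harry Potter dataframe
toy_df = pd.DataFrame([["Gryffindor", 11], ["Slytherin", 11]],
                      index=["Harry Potter", "Draco Malfoy"],
                      columns=["House", "Age"])

# New label: a row is appended at the end
toy_df.loc["Neville Longbottom"] = ["Gryffindor", 11]
print(len(toy_df))                         # 3

# Existing label: the row is overwritten and no row is added
toy_df.loc["Draco Malfoy"] = ["Slytherin", 12]
print(len(toy_df))                         # 3
print(toy_df.loc["Draco Malfoy", "Age"])   # 12
```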

# **Adding DataFrame Columns**

Resetting the Harry Potter df:

In [None]:
data=[['Gryffindor', 11, 'Male', 'Half-blood'], ['Gryffindor', 11, 'Female', 'Muggle-born'],['Slytherin', 11, 'Male', 'Pure-blood'],
      ['Slytherin', 31, 'Male', 'Half-blood'], ['Gryffindor', 63, 'Male', 'Half-Giant']]

HP_df = pd.DataFrame(data,
                  index=['Harry Potter', 'Hermione Granger', 'Draco Malfoy', 'Severus Snape', 'Rubeus Hagrid'],
                  columns=['House', 'Age', 'Gender', 'BloodStatus'])

HP_df.index.names = ['Name']
HP_df

Adding Lists as DataFrame Columns

In [None]:
hairColor = ['Brown', 'Brown', 'Blond', 'Black', 'Black']

HP_df["HairColor"] = hairColor
HP_df

Add eye color before the hair color column

In [None]:
eyeColor = ['Green', 'Brown', 'Blue', 'Brown', 'Brown']

HP_df.insert(4, 'EyeColor', eyeColor)

HP_df
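
The two approaches above differ only in where the new column lands. A minimal sketch on a made-up two-row frame (`toy_df` is a hypothetical name): bracket assignment appends the column at the end, while `insert` places it at the given column position.

```python
import pandas as pd

# Two-row stand-in for the Harry Potter dataframe
toy_df = pd.DataFrame({"House": ["Gryffindor", "Slytherin"], "Age": [11, 31]},
                      index=["Harry Potter", "Severus Snape"])

toy_df["HairColor"] = ["Brown", "Black"]           # appended as the last column
toy_df.insert(1, "EyeColor", ["Green", "Brown"])   # placed at column position 1

print(list(toy_df.columns))  # ['House', 'EyeColor', 'Age', 'HairColor']
```

Note that `insert` changes the dataframe in place and returns None, so there is no need to assign its result back to a variable.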

# **Edit/Update**

Updating rows and columns is just like adding rows and columns, except you reference a row/column that already exists.

Resetting the Harry Potter dataframe:

In [7]:
data=[['Slytherin', 40, 'Male', 'Half-blood'], ['Gryffindor', 40, 'Male', 'Pure-blood'],['Slytherin', 40, 'Male', 'Pure-blood'],
      ['Ravenclaw', 40, 'Male', 'Half-blood'], ['Gryffindor', 40, 'Male', 'Pure-blood']]

HP_df = pd.DataFrame(data,
                  index=['Harry Potter', 'Hermione Granger', 'Draco Malfoy', 'Severus Snape', 'Rubeus Hagrid'],
                  columns=['House', 'Age', 'Gender', 'BloodStatus'])

HP_df.index.names = ['Name']
HP_df

Name,House,Age,Gender,BloodStatus
Harry Potter,Slytherin,40,Male,Half-blood
Hermione Granger,Gryffindor,40,Male,Pure-blood
Draco Malfoy,Slytherin,40,Male,Pure-blood
Severus Snape,Ravenclaw,40,Male,Half-blood
Rubeus Hagrid,Gryffindor,40,Male,Pure-blood


Correct Harry Potter's row (should be Gryffindor, 11, Male, Half-blood). Use range index.



In [8]:
HP_df.iloc[0] = ['Gryffindor', 11, 'Male', 'Half-blood']
HP_df

Name,House,Age,Gender,BloodStatus
Harry Potter,Gryffindor,11,Male,Half-blood
Hermione Granger,Gryffindor,40,Male,Pure-blood
Draco Malfoy,Slytherin,40,Male,Pure-blood
Severus Snape,Ravenclaw,40,Male,Half-blood
Rubeus Hagrid,Gryffindor,40,Male,Pure-blood


Correct Severus Snape's row. Should be Slytherin, 31, Male, Half-blood. Use labeled index.

In [9]:
HP_df.loc['Severus Snape'] = ['Slytherin', 31, 'Male', 'Half-blood']
HP_df

Name,House,Age,Gender,BloodStatus
Harry Potter,Gryffindor,11,Male,Half-blood
Hermione Granger,Gryffindor,40,Male,Pure-blood
Draco Malfoy,Slytherin,40,Male,Pure-blood
Severus Snape,Slytherin,31,Male,Half-blood
Rubeus Hagrid,Gryffindor,40,Male,Pure-blood


Correct the gender column. Should be Harry - Male, Hermione - Female, Draco - Male, Severus - Male, and Hagrid - Male.

In [10]:
HP_df['Gender'] = ['Male', 'Female', 'Male', 'Male', 'Male']
HP_df

Name,House,Age,Gender,BloodStatus
Harry Potter,Gryffindor,11,Male,Half-blood
Hermione Granger,Gryffindor,40,Female,Pure-blood
Draco Malfoy,Slytherin,40,Male,Pure-blood
Severus Snape,Slytherin,31,Male,Half-blood
Rubeus Hagrid,Gryffindor,40,Male,Pure-blood


Correct Hagrid's blood status. It should be 'Half-Giant'. Use range index.

In [15]:
HP_df.iloc[4,3] = "Half-Giant"
HP_df


Name,House,Age,Gender,BloodStatus
Harry Potter,Gryffindor,11,Male,Half-blood
Hermione Granger,Gryffindor,40,Female,Pure-blood
Draco Malfoy,Slytherin,40,Male,Pure-blood
Severus Snape,Slytherin,31,Male,Half-blood
Rubeus Hagrid,Gryffindor,40,Male,Half-Giant


Correct Hermione's blood status. It should be 'Muggle-born'. Use labeled index.

In [13]:
HP_df.loc["Hermione Granger", "BloodStatus"] = 'Muggle-born'
HP_df

Name,House,Age,Gender,BloodStatus
Harry Potter,Gryffindor,11,Male,Half-blood
Hermione Granger,Gryffindor,40,Female,Muggle-born
Draco Malfoy,Slytherin,40,Male,Pure-blood
Severus Snape,Slytherin,31,Male,Half-blood
Rubeus Hagrid,Gryffindor,40,Male,Half-Giant


# **Delete**

Resetting the dataframe:

In [16]:
data=[['Gryffindor', 11, 'Male', 'Half-blood'], ['Gryffindor', 11, 'Female', 'Muggle-born'],['Slytherin', 11, 'Male', 'Pure-blood'],
      ['Slytherin', 31, 'Male', 'Half-blood'], ['Gryffindor', 63, 'Male', 'Half-Giant']]

HP_df = pd.DataFrame(data,
                  index=['Harry Potter', 'Hermione Granger', 'Draco Malfoy', 'Severus Snape', 'Rubeus Hagrid'],
                  columns=['House', 'Age', 'Gender', 'BloodStatus'])

HP_df.index.names = ['Name']
HP_df

Name,House,Age,Gender,BloodStatus
Harry Potter,Gryffindor,11,Male,Half-blood
Hermione Granger,Gryffindor,11,Female,Muggle-born
Draco Malfoy,Slytherin,11,Male,Pure-blood
Severus Snape,Slytherin,31,Male,Half-blood
Rubeus Hagrid,Gryffindor,63,Male,Half-Giant


Delete Rows: Delete Draco and Hagrid using labeled index.

In [17]:
HP_df = HP_df.drop(['Draco Malfoy', 'Rubeus Hagrid'])
HP_df

Name,House,Age,Gender,BloodStatus
Harry Potter,Gryffindor,11,Male,Half-blood
Hermione Granger,Gryffindor,11,Female,Muggle-born
Severus Snape,Slytherin,31,Male,Half-blood


Delete Columns: Delete house and gender using labeled index

In [18]:
HP_df = HP_df.drop(['House', 'Gender'], axis=1)
HP_df

Name,Age,BloodStatus
Harry Potter,11,Half-blood
Hermione Granger,11,Muggle-born
Severus Snape,31,Half-blood


NOTE: axis=1 refers to columns and axis=0 refers to rows. axis=0 is the default, which is why you don't need to specify it when dropping rows.
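
If the axis numbers are hard to remember, `drop` also accepts the keywords `columns=` and `index=`. The sketch below, on a made-up two-row frame (`toy_df` and the result names are hypothetical), suggests that both spellings give the same result and that `drop` returns a new dataframe rather than changing the original.

```python
import pandas as pd

# Two-row stand-in for the Harry Potter dataframe
toy_df = pd.DataFrame({"House": ["Gryffindor", "Slytherin"],
                       "Age": [11, 31],
                       "Gender": ["Male", "Male"]},
                      index=["Harry Potter", "Severus Snape"])

# Dropping columns: axis=1 and columns= are equivalent
via_axis = toy_df.drop(["House", "Gender"], axis=1)
via_keyword = toy_df.drop(columns=["House", "Gender"])
print(via_axis.equals(via_keyword))       # True

# Dropping rows: the default axis=0 and index= are equivalent
rows_default = toy_df.drop(["Severus Snape"])
rows_keyword = toy_df.drop(index=["Severus Snape"])
print(rows_default.equals(rows_keyword))  # True

# The original is unchanged unless you assign the result back
print(toy_df.shape)                       # (2, 3)
```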

Delete Rows: Delete Snape using Range Index

In [19]:
HP_df = HP_df.drop(HP_df.index[2])
HP_df

Name,Age,BloodStatus
Harry Potter,11,Half-blood
Hermione Granger,11,Muggle-born


Delete Columns: Delete Age using Range Index

In [20]:
HP_df = HP_df.drop(HP_df.columns[0], axis=1)
HP_df

Name,BloodStatus
Harry Potter,Half-blood
Hermione Granger,Muggle-born


Delete a range of rows (indexes). Delete the two remaining rows.

In [21]:
HP_df = HP_df.drop(HP_df.index[0:2])
HP_df

Name,BloodStatus


# **Sort By**

Using the insurance dataframe, sort by age ascending

In [31]:
df = df.sort_values(by=['age'], ascending=True)
df

Unnamed: 0,age,sex,bmi,children,smoker,region,charges
1248,18,female,39.820,0,no,southeast,1633.96180
482,18,female,31.350,0,no,southeast,1622.18850
492,18,female,25.080,0,no,northeast,2196.47320
525,18,female,33.880,0,no,southeast,11482.63485
529,18,male,25.460,0,no,northeast,1708.00140
...,...,...,...,...,...,...,...
398,64,male,25.600,2,no,southwest,14988.43200
335,64,male,34.500,0,no,southwest,13822.80300
378,64,female,30.115,3,no,northwest,16455.70785
1265,64,male,23.760,0,yes,southeast,26926.51440


Sort by age descending

In [32]:
df = df.sort_values(by=['age'], ascending=False)
df

Unnamed: 0,age,sex,bmi,children,smoker,region,charges
635,64,male,38.190,0,no,northeast,14410.932100
1051,64,male,26.410,0,no,northeast,14394.557900
603,64,female,39.050,3,no,southeast,16085.127500
752,64,male,37.905,0,no,northwest,14210.535950
768,64,female,39.700,0,no,southwest,14319.031000
...,...,...,...,...,...,...,...
781,18,male,41.140,0,no,southeast,1146.796600
270,18,male,29.370,1,no,southeast,1719.436300
427,18,female,29.165,0,no,northeast,7323.734819
399,18,female,38.170,0,no,southeast,1631.668300
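
Notice in the sorted outputs that each row keeps its original index label, so after sorting the row labeled index and the row range index no longer line up. Here is a small sketch on a made-up four-row frame (`toy_df`, `sorted_df`, and `renumbered` are hypothetical names) showing this, along with a sort on two columns and `reset_index(drop=True)` for renumbering the labels.

```python
import pandas as pd

# Hypothetical four-row stand-in for the insurance dataframe
toy_df = pd.DataFrame({"age": [28, 18, 18, 33],
                       "charges": [4449.462, 1725.5523, 2205.9808, 21984.47061]})

# Sort by age ascending; break ties by charges descending
sorted_df = toy_df.sort_values(by=["age", "charges"], ascending=[True, False])
print(sorted_df.index.tolist())    # [2, 1, 0, 3] (original labels travel with the rows)

# Renumber the labels 0, 1, 2, ... in the new order
renumbered = sorted_df.reset_index(drop=True)
print(renumbered.index.tolist())   # [0, 1, 2, 3]
```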


# **You Try: Investigate Insurance Data**

**Return the record with the most expensive insurance charge.**

In [35]:
df.sort_values('charges', ascending=False).head(1)


Unnamed: 0,age,sex,bmi,children,smoker,region,charges
543,54,female,47.41,0,yes,southeast,63770.42801


**Return the record with the 5th most expensive insurance charge.**

In [46]:
df2 = df.sort_values('charges', ascending=False).head(5)
df2.iloc[4]


age                  33
sex              female
bmi               35.53
children              0
smoker              yes
region        northwest
charges     55135.40209
Name: 819, dtype: object
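
One possible alternative to sorting the whole dataframe is `nlargest`, which returns only the top rows for a given column. This is a sketch on a made-up frame (`toy_df`, `top5`, and `fifth` are hypothetical names), not the only way to solve the problem.

```python
import pandas as pd

# Hypothetical charges, deliberately out of order
toy_df = pd.DataFrame({"charges": [300.0, 100.0, 500.0, 200.0, 600.0, 400.0]})

top5 = toy_df.nlargest(5, "charges")   # five largest charges, largest first
fifth = top5.iloc[-1]                  # last of those five = 5th most expensive

print(fifth["charges"])  # 200.0
print(fifth.name)        # 3 (the row's original index label)
```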

**Replace the charges value of the record with the cheapest insurance charge with 1500.0000.**

In [57]:
df3 = df.sort_values('charges', ascending=True)
df3.iloc[0,6] = 1500.0000
df3

Unnamed: 0,age,sex,bmi,children,smoker,region,charges
940,18,male,23.210,0,no,southeast,1500.00000
808,18,male,30.140,0,no,southeast,1131.50660
1244,18,male,33.330,0,no,southeast,1135.94070
663,18,male,33.660,0,no,southeast,1136.39940
22,18,male,34.100,0,no,southeast,1137.01100
...,...,...,...,...,...,...,...
819,33,female,35.530,0,yes,northwest,55135.40209
577,31,female,38.095,1,yes,northeast,58571.07448
1230,52,male,34.485,3,yes,northwest,60021.39897
1300,45,male,30.360,0,yes,southeast,62592.87309
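
The same replacement can be done without sorting and without counting column positions. The sketch below, on a made-up three-row frame (`toy_df` and `cheapest_label` are hypothetical names), uses `idxmin` to find the label of the cheapest row and `.loc` to update it by label.

```python
import pandas as pd

# Hypothetical three-row stand-in for the insurance dataframe
toy_df = pd.DataFrame({"age": [19, 18, 28],
                       "charges": [16884.924, 1725.5523, 4449.462]})

cheapest_label = toy_df["charges"].idxmin()      # index label of the smallest charge
toy_df.loc[cheapest_label, "charges"] = 1500.0   # update that cell by label

print(cheapest_label)               # 1
print(toy_df["charges"].tolist())   # [16884.924, 1500.0, 4449.462]
```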


# **You Try: Add Student/Professor**

Iterate through the Harry Potter dataframe and determine if a character is a student or a professor. If their age is 18 or greater they are a professor. If their age is less than 18 they are a student.

Save the occupation (professor or student) in a list called occupation_list.

Add occupation_list to a new column at the end of the df and label it "Occupation".

Display all rows of the df.

This problem involves new concepts you may not be familiar with if you are new to python. These resources may be helpful.

[For loop](https://www.pythoncheatsheet.org/cheatsheet/control-flow) (to iterate through the Harry Potter dataframe)

[If statement](https://www.pythoncheatsheet.org/cheatsheet/control-flow)

If you don't remember/know how to add an item to a list in python, I'll let you have the experience of googling it.

# **You Try: Add New Row**

Add the new_row to the end of the insurance dataframe. This will require some problem solving and thinking beyond what was taught today.

In [58]:
new_row = [20, 'female', 25.123, 1, 'no', 'southeast', 2345.12345]


# **Optional: Other Ways to Create DataFrames from Scratch**

**Option 2:** Create from a list of lists where each list contains the data for a row. The data lists DO include the row labeled index names.

DataFrame(data, columns)

Must set the index.

In [59]:
data=[['Harry Potter', 'Gryffindor', 11, 'Male', 'Half-blood'], ['Hermione Granger','Gryffindor', 11, 'Female', 'Muggle-born'],
     ['Draco Malfoy','Slytherin', 11, 'Male', 'Pure-blood'], ['Severus Snape','Slytherin', 31, 'Male', 'Half-blood'],
     ['Rubeus Hagrid','Gryffindor', 63, 'Male', 'Half-Giant']]

HP_df2 = pd.DataFrame(data,
                  columns=['Name','House', 'Age', 'Gender', 'BloodStatus'])

HP_df2.set_index('Name', inplace=True) # use the 'Name' column as the row labeled index
HP_df2

Name,House,Age,Gender,BloodStatus
Harry Potter,Gryffindor,11,Male,Half-blood
Hermione Granger,Gryffindor,11,Female,Muggle-born
Draco Malfoy,Slytherin,11,Male,Pure-blood
Severus Snape,Slytherin,31,Male,Half-blood
Rubeus Hagrid,Gryffindor,63,Male,Half-Giant


**Option 3:** Create from a dictionary. The key is the column name. The values (which are lists) are the data for that column.

DataFrame(data, index)

Optional: Name the index

In [60]:
data = {'House':['Gryffindor', 'Gryffindor', 'Slytherin', 'Slytherin', 'Gryffindor'], 'Age':[11,11,11,31,63],
                   'Gender':['Male', 'Female', 'Male', 'Male', 'Male'],
                   'BloodStatus':['Half-blood', 'Muggle-born', 'Pure-blood', 'Half-blood', 'Half-Giant']}

HP_df3 = pd.DataFrame(data,
                  index=['Harry Potter', 'Hermione Granger', 'Draco Malfoy', 'Severus Snape', 'Rubeus Hagrid'])

HP_df3.index.names = ['Name']
HP_df3

Name,House,Age,Gender,BloodStatus
Harry Potter,Gryffindor,11,Male,Half-blood
Hermione Granger,Gryffindor,11,Female,Muggle-born
Draco Malfoy,Slytherin,11,Male,Pure-blood
Severus Snape,Slytherin,31,Male,Half-blood
Rubeus Hagrid,Gryffindor,63,Male,Half-Giant


NOTE: You can also pass a dictionary to DataFrame without passing the index, and it will still create a DataFrame. The row labels will then be the default values 0,1,2,3,4 (matching the row range index) rather than names (Harry Potter, Hermione Granger, etc.).

In [61]:
data = {'House':['Gryffindor', 'Gryffindor', 'Slytherin', 'Slytherin', 'Gryffindor'], 'Age':[11,11,11,31,63],
                   'Gender':['Male', 'Female', 'Male', 'Male', 'Male'],
                   'BloodStatus':['Half-blood', 'Muggle-born', 'Pure-blood', 'Half-blood', 'Half-Giant']}

HP_df4 = pd.DataFrame(data)

HP_df4

Unnamed: 0,House,Age,Gender,BloodStatus
0,Gryffindor,11,Male,Half-blood
1,Gryffindor,11,Female,Muggle-born
2,Slytherin,11,Male,Pure-blood
3,Slytherin,31,Male,Half-blood
4,Gryffindor,63,Male,Half-Giant


In [66]:
HP_df4['Occupation'] = ['Professor' if age >= 18 else 'Student' for age in HP_df4.Age]

HP_df4

Unnamed: 0,House,Age,Gender,BloodStatus,Occupation
0,Gryffindor,11,Male,Half-blood,Student
1,Gryffindor,11,Female,Muggle-born,Student
2,Slytherin,11,Male,Pure-blood,Student
3,Slytherin,31,Male,Half-blood,Professor
4,Gryffindor,63,Male,Half-Giant,Professor
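
The one-line list comprehension above is compact. For readers newer to Python, the same logic can be sketched as an explicit for loop with an if statement, following the steps in the "You Try: Add Student/Professor" prompt. The frame `toy_df` is a made-up stand-in holding only the Age column.

```python
import pandas as pd

# Stand-in for the Harry Potter dataframe, with only the Age column
toy_df = pd.DataFrame({"Age": [11, 11, 11, 31, 63]},
                      index=["Harry Potter", "Hermione Granger", "Draco Malfoy",
                             "Severus Snape", "Rubeus Hagrid"])

occupation_list = []
for age in toy_df["Age"]:          # visit each age in row order
    if age >= 18:
        occupation_list.append("Professor")
    else:
        occupation_list.append("Student")

# Add the list as a new column at the end of the dataframe
toy_df["Occupation"] = occupation_list
print(toy_df["Occupation"].tolist())
# ['Student', 'Student', 'Student', 'Professor', 'Professor']
```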

