*Over time, we've seen Supreme Court Justices nominated at a young age who then serve for multiple decades, well into old age.*

*Within the current court (Roberts), two Justices have served for a relatively long time: Justice Clarence Thomas (30 years come October) and Justice Stephen Breyer (27 years).*

*Respecting the original intent of Article III and Federalist No. 78, this analysis in no way points a finger at any particular Justice for their length of time on the court.*

### Column Descriptions

0. **name** - Justice's full name
1. **birth_date** - Justice's date of birth
2. **birth_st** - State in which Justice was born
3. **gender** - Justice's gender
4. **race** - Justice's race
5. **religion** - Justice's religion
6. **state** - State in which Justice lives
7. **party** - Political party of the President who nominated Justice
8. **president** - President who nominated Justice
9. **senate_aye** - Senate votes for the nomination of Justice
10. **senate_nay** - Senate votes against the nomination of Justice
11. **term_start** - Date when Justice started on the court
12. **term_end** - Date when Justice ended on the court
13. **death_date** - Justice's death date

In [1]:
# import libraries
import numpy as np
import pandas as pd
import datetime

In [2]:
scotus = pd.read_csv("supreme_data.csv")

In [3]:
scotus.sample(3)

Unnamed: 0.1,Unnamed: 0,name,birth_date,birth_st,gender,race,religion,state,party,president,senate_aye,senate_nay,term_start,term_end,death_date
9,10,"Chase, Samuel",1741-04-17,MD,M,White,episcopalian,MD,Federalist,1,,,1796-02-04,1811-06-19,1811-06-19
49,50,"Lamar, Lucius Quintus Cincinnatus",1825-09-17,GA,M,White,methodist,MS,Democrat,22,32.0,28.0,1888-01-18,1893-01-23,1893-01-23
75,76,"Roberts, Owen Josephus",1875-05-02,PA,M,White,episcopalian,PA,Republican,31,,,1930-06-02,1945-07-31,1955-05-17


# Data Cleaning

**Task 1: Drop the ```Unnamed: 0``` column**

In [4]:
scotus.drop(columns = ["Unnamed: 0"], inplace = True)

In [5]:
scotus.columns

Index(['name', 'birth_date', 'birth_st', 'gender', 'race', 'religion', 'state',
       'party', 'president', 'senate_aye', 'senate_nay', 'term_start',
       'term_end', 'death_date'],
      dtype='object')

**Task 2: Keep the first and last name for each Justice under the ```name``` column**

##### 2.1: Output all of the Justices' names

In [6]:
for i in scotus["name"]:
    print(i)

Jay, John
Rutledge, John
Cushing, William
Wilson, James
Blair, John, Jr.
Iredell, James
Johnson, Thomas
Paterson, William
Rutledge, John
Chase, Samuel
Ellsworth, Oliver
Washington, Bushrod
Moore, Alfred
Marshall, John
Johnson, William
Livingston, Henry Brockholst
Todd, Thomas
Duvall, Gabriel
Story, Joseph
Thompson, Smith
Trimble, Robert
McLean, John
Baldwin, Henry
Wayne, James Moore
Taney, Roger Brooke
Barbour, Philip Pendelton
Catron, John
McKinley, John
Daniel, Peter Vivian
Nelson, Samuel
Woodbury, Levi
Grier, Robert Cooper
Curtis, Benjamin Robbins
Campbell, John Archibald
Clifford, Nathan
Swayne, Noah Haynes
Miller, Samuel Freeman
Davis, David
Field, Stephen Johnson
Chase, Salmon Portland
Strong, William
Bradley, Joseph P.
Hunt, Ward
Waite, Morrison Remick
Harlan, John Marshall (I)
Woods, William Burnham
Matthews, Stanley
Gray, Horace
Blatchford, Samuel
Lamar, Lucius Quintus Cincinnatus
Fuller, Melville Weston
Brewer, David Josiah
Brown, Henry Billings
Shiras, George, Jr.
Jackson, H

##### 2.2: Output any Justice's name that has no comma and rename it

In [7]:
for i in scotus["name"]:
    if "," not in i:
        print(i)

Amy Coney Barrett


In [8]:
scotus["name"] = scotus["name"].replace({"Amy Coney Barrett": "Barrett, Amy Coney"})

In [9]:
scotus[scotus["name"] == "Barrett, Amy Coney"]

Unnamed: 0,name,birth_date,birth_st,gender,race,religion,state,party,president,senate_aye,senate_nay,term_start,term_end,death_date
116,"Barrett, Amy Coney",1972-01-28,LA,F,White,roman catholic,IN,Republican,45,52.0,48.0,2020-10-27,,


##### 2.3: Create a function that renames the values in the ```name``` column

In [10]:
def rename_name(justice_name):
    '''Convert "Last, First Middle" to "First Last". Split the name on the comma to
    retrieve the last name [0], then split it on spaces to retrieve the first name [1].
    Return both values joined by a space.'''
    # since there's two Justice Harlans, include their respective number within their name
    if justice_name == "Harlan, John Marshall (I)" or justice_name == "Harlan, John Marshall (II)":
        last_name = justice_name.split(",")[0]
        first_name = justice_name.split(" ")[1]
        num = justice_name.split(" ")[3]
        full_name = first_name + " " + last_name + " " + num
        return full_name
    elif justice_name == "Van Devanter, Willis":
        last_name = justice_name.split(",")[0]
        first_name = justice_name.split(" ")[2]
        full_name = first_name + " " + last_name
        return full_name
    else:
        last_name = justice_name.split(",")[0]
        first_name = justice_name.split(" ")[1]
        full_name = first_name + " " + last_name
        return full_name

##### 2.4: Test the function with various name combinations

In [11]:
# test a name with only a first and last
rename_name("Kagan, Elena")

'Elena Kagan'

In [12]:
# test a name with a middle name
rename_name("White, Byron Raymond")

'Byron White'

In [13]:
# test a name with suffixes 
rename_name("Holmes, Oliver Wendell, Jr")

'Oliver Holmes'

In [14]:
# test one of the Justice Harlans'
rename_name("Harlan, John Marshall (I)")

'John Harlan (I)'

In [15]:
# test Justice Van Devanter
rename_name("Van Devanter, Willis")

'Willis Van Devanter'

##### 2.5: Apply the ```rename_name``` function to the ```name``` column

In [16]:
scotus = scotus.assign(name = scotus["name"].apply(rename_name))

In [17]:
scotus["name"]

0             John Jay
1        John Rutledge
2      William Cushing
3         James Wilson
4          John, Blair
            ...       
112    Sonia Sotomayor
113        Elena Kagan
114       Neil Gorsuch
115    Brett Kavanaugh
116        Amy Barrett
Name: name, Length: 117, dtype: object

##### 2.6: Check if any of the values in the ```name``` column have a ```,```

In [18]:
for i in scotus["name"]:
    if "," in i:
        print(i)

John, Blair
George, Shiras


##### 2.7: Rename the above Justices to the correct format

In [19]:
scotus["name"] = scotus["name"].replace({"John, Blair": "John Blair",
                                        "George, Shiras": "George Shiras"})

In [20]:
scotus[(scotus["name"] == "John Blair")]

Unnamed: 0,name,birth_date,birth_st,gender,race,religion,state,party,president,senate_aye,senate_nay,term_start,term_end,death_date
4,John Blair,1732-04-17,VA,M,White,episcopalian,VA,Federalist,1,,,1790-02-02,1796-01-27,1800-08-31


In [21]:
scotus[(scotus["name"] == "George Shiras")]

Unnamed: 0,name,birth_date,birth_st,gender,race,religion,state,party,president,senate_aye,senate_nay,term_start,term_end,death_date
53,George Shiras,1832-01-26,PA,M,White,presbyterian,PA,Republican,23,,,1892-10-10,1903-02-23,1924-08-02
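
The two manual renames above were needed because a suffix such as `Jr.` sits after a second comma. As a rough sketch (not the notebook's method), a variant of `rename_name` could split on commas first and read the first given name from the middle piece. `rename_name_general` is a hypothetical name, and the sketch assumes every input has the form `Last, First [Middle] [(I)][, Suffix]`.

```python
def rename_name_general(justice_name):
    """Hypothetical variant of rename_name: 'Last, First Middle, Suffix' -> 'First Last'."""
    # Pieces are: last name, given names, and (optionally) a suffix such as "Jr."
    parts = [p.strip() for p in justice_name.split(",")]
    last_name = parts[0]
    given = parts[1].split()
    first_name = given[0]
    # Keep a trailing "(I)"/"(II)" marker so the two Justice Harlans stay distinct
    marker = given[-1] if given[-1].startswith("(") else ""
    return " ".join(filter(None, [first_name, last_name, marker]))


print(rename_name_general("Blair, John, Jr."))
print(rename_name_general("Harlan, John Marshall (I)"))
print(rename_name_general("Van Devanter, Willis"))
```

Because the last name is everything before the first comma, multi-word surnames such as Van Devanter need no special case under this assumption.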


**Task 3: Set the index to the ```name``` column**

In [22]:
scotus["name"].nunique()

115

In [23]:
scotus.shape[0]

117

In [24]:
scotus.duplicated(subset=['name']).sum()

2

In [25]:
scotus.loc[scotus.duplicated(subset = ["name"]), :]

Unnamed: 0,name,birth_date,birth_st,gender,race,religion,state,party,president,senate_aye,senate_nay,term_start,term_end,death_date
8,John Rutledge,1739-09-18,SC,M,White,episcopalian,SC,Federalist,1,,,1795-08-12,1795-12-15,1800-07-18
74,Charles Hughes,1862-04-11,NY,M,White,baptist,NY,Republican,31,52.0,26.0,1930-02-24,1941-07-01,1948-08-27
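
By default `duplicated` flags only the later occurrence of each repeated name. A minimal sketch on a toy frame (the rows are illustrative stand-ins, not the real dataset) shows how `keep=False` lists every row that shares a name, and what a label lookup returns once `name` becomes a non-unique index:

```python
import pandas as pd

# Toy frame standing in for scotus; rows and dates are illustrative only
toy = pd.DataFrame({
    "name": ["John Jay", "John Rutledge", "John Rutledge", "Elena Kagan"],
    "term_start": ["1789-10-19", "1790-02-15", "1795-08-12", "2010-08-07"],
})

# keep=False marks every row whose name repeats, not just the later ones
both_terms = toy[toy.duplicated(subset=["name"], keep=False)]
print(both_terms)

# With name as the index, a label lookup returns one row per term served
indexed = toy.set_index("name")
print(indexed.loc["John Rutledge"])
```

A non-unique index is allowed in pandas, so `set_index` succeeds; the trade-off is that `.loc` on a repeated name yields a DataFrame rather than a single row.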


In [26]:
scotus.set_index("name", inplace = True)

In [27]:
scotus.sample()

name,birth_date,birth_st,gender,race,religion,state,party,president,senate_aye,senate_nay,term_start,term_end,death_date
John McLean,1785-03-11,NJ,M,White,methodist,OH,Democrat,7,,,1830-01-11,1861-04-04,1861-04-04


**Task 4:** Parse the ```birth_date```, ```term_start```, ```term_end```, and ```death_date``` columns as dates and extract the year, month, and day

##### 4.1: Check that the date columns have a data type of ```str```

In [28]:
date_columns = ["birth_date", "term_start", "term_end"]

In [29]:
for i in date_columns:
    # inspect the first value of each date column (not just birth_date)
    data_type = type(scotus[i].iloc[0])
    print(data_type)

<class 'str'>
<class 'str'>
<class 'str'>


##### 4.2: Create functions that extract the year, month, and day from a date

In [30]:
scotus[scotus["birth_date"].isna()]

name,birth_date,birth_st,gender,race,religion,state,party,president,senate_aye,senate_nay,term_start,term_end,death_date


In [31]:
scotus[scotus["term_start"].isna()]

name,birth_date,birth_st,gender,race,religion,state,party,president,senate_aye,senate_nay,term_start,term_end,death_date


In [32]:
scotus[scotus["term_end"].isna()]

name,birth_date,birth_st,gender,race,religion,state,party,president,senate_aye,senate_nay,term_start,term_end,death_date
Clarence Thomas,1948-06-23,GA,M,Black,roman catholic,GA,Republican,41,52.0,48.0,1991-10-23,,
Stephen Breyer,1938-08-15,CA,M,White,jewish,MA,Democrat,42,87.0,9.0,1994-08-03,,
John Roberts,1955-01-27,NY,M,White,roman catholic,MD,Republican,43,78.0,22.0,2005-09-29,,
Samuel Alito,1950-04-01,NJ,M,White,roman catholic,NJ,Republican,43,58.0,42.0,2006-01-31,,
Sonia Sotomayor,1954-06-25,NY,F,Hispanic,roman catholic,NY,Democrat,44,68.0,31.0,2009-09-09,,
Elena Kagan,1960-04-08,NY,F,White,jewish,MA,Democrat,44,63.0,37.0,2010-08-07,,
Neil Gorsuch,1967-08-29,CO,M,White,episcopalian,CO,Republican,45,54.0,45.0,2017-04-10,,
Brett Kavanaugh,1965-02-12,DC,M,White,roman catholic,MD,Republican,45,50.0,48.0,2018-10-06,,
Amy Barrett,1972-01-28,LA,F,White,roman catholic,IN,Republican,45,52.0,48.0,2020-10-27,,


In [33]:
scotus[scotus["death_date"].isna()]

name,birth_date,birth_st,gender,race,religion,state,party,president,senate_aye,senate_nay,term_start,term_end,death_date
Sandra O'Connor,1930-03-26,TX,F,White,episcopalian,AZ,Republican,40,99.0,0.0,1981-09-25,2006-01-31,
Anthony Kennedy,1936-07-23,CA,M,White,roman catholic,CA,Republican,40,97.0,0.0,1988-02-18,2018-07-31,
David Souter,1939-09-17,MA,M,White,episcopalian,NH,Republican,41,90.0,9.0,1990-10-09,2009-06-30,
Clarence Thomas,1948-06-23,GA,M,Black,roman catholic,GA,Republican,41,52.0,48.0,1991-10-23,,
Stephen Breyer,1938-08-15,CA,M,White,jewish,MA,Democrat,42,87.0,9.0,1994-08-03,,
John Roberts,1955-01-27,NY,M,White,roman catholic,MD,Republican,43,78.0,22.0,2005-09-29,,
Samuel Alito,1950-04-01,NJ,M,White,roman catholic,NJ,Republican,43,58.0,42.0,2006-01-31,,
Sonia Sotomayor,1954-06-25,NY,F,Hispanic,roman catholic,NY,Democrat,44,68.0,31.0,2009-09-09,,
Elena Kagan,1960-04-08,NY,F,White,jewish,MA,Democrat,44,63.0,37.0,2010-08-07,,
Neil Gorsuch,1967-08-29,CO,M,White,episcopalian,CO,Republican,45,54.0,45.0,2017-04-10,,


- Since some of the values within the ```term_end``` and ```death_date``` columns are the float value ```nan```, return ```NaT``` (the ```pandas``` marker for a missing date) in that case to keep the functions from raising an error

In [34]:
def get_year(date):
    # a non-string (nan) or malformed date falls back to NaT
    try:
        whole_date = datetime.datetime.fromisoformat(date)
        year = whole_date.year
        return year
    except (TypeError, ValueError):
        return pd.NaT

def get_month(date):
    try:
        whole_date = datetime.datetime.fromisoformat(date)
        month = whole_date.month
        return month
    except (TypeError, ValueError):
        return pd.NaT

def get_day(date):
    try:
        whole_date = datetime.datetime.fromisoformat(date)
        day = whole_date.day
        return day
    except (TypeError, ValueError):
        return pd.NaT

##### 4.3: Test the functions

In [35]:
get_year("1762-06-05"), get_month("1762-06-05"), get_day("1762-06-05")

(1762, 6, 5)

In [36]:
get_year(float("nan")), get_month(float("nan")), get_day(float("nan"))

(NaT, NaT, NaT)

##### 4.4: Apply the functions to the columns from the above task

In [37]:
scotus = scotus.assign(birth_year = scotus.get('birth_date').apply(get_year),
                       birth_month = scotus.get('birth_date').apply(get_month),
                       birth_day = scotus.get('birth_date').apply(get_day)
                      )

scotus = scotus.assign(term_start_year = scotus.get('term_start').apply(get_year),
                       term_start_month = scotus.get('term_start').apply(get_month),
                       term_start_day = scotus.get('term_start').apply(get_day)
                      )

scotus = scotus.assign(term_end_year = scotus.get('term_end').apply(get_year),
                       term_end_month = scotus.get('term_end').apply(get_month),
                       term_end_day = scotus.get('term_end').apply(get_day)
                      )

scotus = scotus.assign(death_year = scotus.get('death_date').apply(get_year),
                       death_month = scotus.get('death_date').apply(get_month),
                       death_day = scotus.get('death_date').apply(get_day)
                      )
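
As an alternative to the twelve `apply` calls, the same year, month, and day columns could be derived with `pd.to_datetime` and the `.dt` accessor. This is only a sketch on a toy frame (two rows standing in for `scotus`), and it assumes all dates are ISO-formatted strings, as they appear in this notebook:

```python
import pandas as pd

# Toy stand-in for scotus with one sitting Justice (missing term_end)
toy = pd.DataFrame({
    "term_start": ["1796-02-04", "1991-10-23"],
    "term_end": ["1811-06-19", None],
})

for col in ["term_start", "term_end"]:
    # errors="coerce" turns missing or unparseable values into NaT
    parsed = pd.to_datetime(toy[col], errors="coerce")
    toy[col + "_year"] = parsed.dt.year
    toy[col + "_month"] = parsed.dt.month
    toy[col + "_day"] = parsed.dt.day

print(toy)
```

Two caveats: columns that contain a missing date come back as floats with `NaN` rather than `NaT`, and with nanosecond-resolution timestamps any date earlier than 1677 would be out of range.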

In [38]:
scotus.sample()

name,birth_date,birth_st,gender,race,religion,state,party,president,senate_aye,senate_nay,...,birth_day,term_start_year,term_start_month,term_start_day,term_end_year,term_end_month,term_end_day,death_year,death_month,death_day
Howell Jackson,1832-04-08,TN,M,White,baptist,TN,Democrat,23,,,...,8,1893,3,4,1895,8,8,1895,8,8


##### 4.5: Replace the missing values in the ```term_end``` and ```death_date``` related columns with the string ```"nan"```

In [39]:
null_columns = ["term_end", "term_end_year", "term_end_month", "term_end_day", 
                "death_date", "death_year", "death_month", "death_day"]

In [40]:
for i in null_columns:
    scotus[i] = scotus[i].fillna("nan")

In [41]:
scotus[scotus["term_end_day"] == "nan"]

name,birth_date,birth_st,gender,race,religion,state,party,president,senate_aye,senate_nay,...,birth_day,term_start_year,term_start_month,term_start_day,term_end_year,term_end_month,term_end_day,death_year,death_month,death_day
Clarence Thomas,1948-06-23,GA,M,Black,roman catholic,GA,Republican,41,52.0,48.0,...,23,1991,10,23,,,,,,
Stephen Breyer,1938-08-15,CA,M,White,jewish,MA,Democrat,42,87.0,9.0,...,15,1994,8,3,,,,,,
John Roberts,1955-01-27,NY,M,White,roman catholic,MD,Republican,43,78.0,22.0,...,27,2005,9,29,,,,,,
Samuel Alito,1950-04-01,NJ,M,White,roman catholic,NJ,Republican,43,58.0,42.0,...,1,2006,1,31,,,,,,
Sonia Sotomayor,1954-06-25,NY,F,Hispanic,roman catholic,NY,Democrat,44,68.0,31.0,...,25,2009,9,9,,,,,,
Elena Kagan,1960-04-08,NY,F,White,jewish,MA,Democrat,44,63.0,37.0,...,8,2010,8,7,,,,,,
Neil Gorsuch,1967-08-29,CO,M,White,episcopalian,CO,Republican,45,54.0,45.0,...,29,2017,4,10,,,,,,
Brett Kavanaugh,1965-02-12,DC,M,White,roman catholic,MD,Republican,45,50.0,48.0,...,12,2018,10,6,,,,,,
Amy Barrett,1972-01-28,LA,F,White,roman catholic,IN,Republican,45,52.0,48.0,...,28,2020,10,27,,,,,,


##### 4.6: Convert the ```birth_month```, ```term_start_month```, and ```term_end_month``` number values to month names

- ```death_month``` is not in ```month_columns``` below, so it keeps its numeric values

In [42]:
months_code = {1: "January", 2: "February", 3: "March", 4: "April", 5: "May", 6: "June",
          7: "July", 8: "August", 9: "September", 10: "October", 11: "November", 
          12: "December", "nan": "nan"}

In [43]:
def code_to_months(code):
    return months_code[code]

In [44]:
code_to_months(7)

'July'

In [45]:
code_to_months("nan")

'nan'

In [46]:
month_columns = ["birth_month", "term_start_month", "term_end_month"]

In [47]:
for i in month_columns:
    scotus[i] = scotus[i].apply(code_to_months)
    print(i, scotus[i].unique(), " ")

birth_month ['December' 'September' 'March' 'April' 'October' 'November' 'June' 'May'
 'January' 'August' 'July' 'February']  
term_start_month ['October' 'February' 'May' 'September' 'March' 'August' 'November'
 'April' 'January' 'June' 'July' 'December']  
term_end_month ['June' 'March' 'September' 'August' 'January' 'October' 'December'
 'November' 'July' 'February' 'April' 'May' 'nan']  
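
The month lookup does not have to be typed out by hand. As a small sketch, the standard library's `calendar.month_name` can build the same dictionary; `months_from_calendar` is a hypothetical name, and the month names follow the active locale (English under Python's default locale):

```python
import calendar

# Build {1: "January", ..., 12: "December"} from the standard library
# (names follow the active locale; English under the default "C" locale)
months_from_calendar = {i: calendar.month_name[i] for i in range(1, 13)}

# Same placeholder entry the notebook uses for missing values
months_from_calendar["nan"] = "nan"

print(months_from_calendar[7])
print(months_from_calendar["nan"])
```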


**---------------------------------------------------------------------------------------------------------------------------------------**

**Task 5: Reorder the columns**

In [48]:
cols = scotus.columns.tolist()
cols

['birth_date',
 'birth_st',
 'gender',
 'race',
 'religion',
 'state',
 'party',
 'president',
 'senate_aye',
 'senate_nay',
 'term_start',
 'term_end',
 'death_date',
 'birth_year',
 'birth_month',
 'birth_day',
 'term_start_year',
 'term_start_month',
 'term_start_day',
 'term_end_year',
 'term_end_month',
 'term_end_day',
 'death_year',
 'death_month',
 'death_day']

In [49]:
len(cols)

25

In [50]:
new_cols = ['birth_date', 'birth_year', 'birth_month', 'birth_day',
            'birth_st', 'gender', 'race', 'religion', 'state', 'party', 'president', 'senate_aye', 'senate_nay', 
            'term_start', 'term_start_year', 'term_start_month', 'term_start_day',  
            'term_end', 'term_end_year', 'term_end_month', 'term_end_day',
            'death_date', 'death_year', 'death_month', 'death_day']

In [51]:
len(new_cols)

25

In [52]:
scotus = scotus[new_cols]

In [53]:
scotus.sample()

name,birth_date,birth_year,birth_month,birth_day,birth_st,gender,race,religion,state,party,...,term_start_month,term_start_day,term_end,term_end_year,term_end_month,term_end_day,death_date,death_year,death_month,death_day
Joseph Lamar,1857-10-14,1857,October,14,GA,M,White,disciples of christ,GA,Democrat,...,January,3,1916-01-02,1916,January,2,1916-01-02,1916,1,2


**Task 6: Replace the ```M``` and ```F``` values within the ```gender``` column with ```Male``` and ```Female``` respectively.**
-  Use the ```map``` function

In [54]:
scotus["gender"] = scotus["gender"].map({"M": "Male",
                                        "F": "Female"})

In [55]:
scotus["gender"].unique()

array(['Male', 'Female'], dtype=object)

**Task 7: Capitalize the values within the ```religion``` column**

In [56]:
scotus["religion"].unique()

array(['episcopalian', 'congregationalist', ' presbyterian', ' unitarian',
       'methodist', ' roman catholic', ' protestant', ' quaker',
       'dutch reform', 'baptist', 'lutheran', 'disciples of christ',
       'jewish', 'roman catholic'], dtype=object)

In [57]:
scotus["religion"] = scotus["religion"].str.title()

- Why the ```title()``` method instead of ```capitalize()```?
    - Since a couple of the values are more than one word (```roman catholic```), it only makes sense to capitalize the first letter of each word, as it's a proper noun
        - As seen below, ```disciples of christ``` is converted to ```Disciples Of Christ``` (with a capitalized "Of")
            - *Let sleeping dogs lie*

In [58]:
scotus["religion"].unique()

array(['Episcopalian', 'Congregationalist', ' Presbyterian', ' Unitarian',
       'Methodist', ' Roman Catholic', ' Protestant', ' Quaker',
       'Dutch Reform', 'Baptist', 'Lutheran', 'Disciples Of Christ',
       'Jewish', 'Roman Catholic'], dtype=object)
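
Several values in the output above still carry a leading space, so ```' Roman Catholic'``` and ```'Roman Catholic'``` count as two different religions. A minimal sketch of one way to handle this, using a handful of the raw values shown earlier rather than the full column, is to strip whitespace before title-casing:

```python
import pandas as pd

# A few of the raw values shown above, including ones with a leading space
religion = pd.Series(["episcopalian", " roman catholic", "roman catholic", " quaker"])

# Strip surrounding whitespace first, then title-case each word
cleaned = religion.str.strip().str.title()
print(cleaned.unique())
```

On the real column this would correspond to `scotus["religion"].str.strip().str.title()`, which should collapse the two spellings of Roman Catholic into one category.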

**Task 8: Replace the state abbreviation values within the ```birth_st``` and ```state``` columns with the full state names**

##### 8.1: Create a dictionary of the state abbreviations (```CA: California```)

In [59]:
births = scotus["birth_st"].unique()
births

array(['NY', 'SC', 'MA', nan, 'VA', 'MD', 'CT', 'NC', 'NJ', 'GA', 'PA',
       'NH', 'KY', 'OH', 'ME', 'TN', 'LA', 'IN', 'MN', 'AL', 'MI', 'TX',
       'CA', 'IL', 'KS', 'CO', 'WI', 'DC'], dtype=object)

In [60]:
reside = scotus["state"].unique()
reside

array(['NY', 'SC', 'MA', 'PA', 'VA', 'NC', 'MD', 'NJ', 'CT', 'KY', 'OH',
       'GA', 'TN', 'AL', 'NH', 'ME', 'IA', 'IL', 'CA', 'MS', 'KS', 'MI',
       'LA', 'WY', 'UT', 'MN', 'TX', 'IN', 'MO', 'CO', 'AZ'], dtype=object)

In [61]:
scotus['birth_st'] = scotus['birth_st'].fillna("nan")

In [62]:
state_codes = {'AL': 'Alabama','AK': 'Alaska','AZ': 'Arizona','AR': 'Arkansas','CA': 'California',
               'CO': 'Colorado','CT': 'Connecticut','DC': "District of Columbia", 'DE': 'Delaware','FL': 'Florida','GA': 'Georgia',
               'HI': 'Hawaii','ID': 'Idaho', 'IL': 'Illinois', 'IN': 'Indiana','IA': 'Iowa','KS': 'Kansas',
               'KY': 'Kentucky','LA': 'Louisiana','ME': 'Maine','MD': 'Maryland','MA': 'Massachusetts',
               'MI': 'Michigan','MN': 'Minnesota','MS': 'Mississippi','MO': 'Missouri','MT': 'Montana',
               'NE': 'Nebraska','NV': 'Nevada','NH': 'New Hampshire','NJ': 'New Jersey','NM': 'New Mexico',
               'NY': 'New York', 'NC': 'North Carolina','ND': 'North Dakota','OH': 'Ohio','OK': 'Oklahoma',
               'OR': 'Oregon','PA': 'Pennsylvania','RI': 'Rhode Island','SC': 'South Carolina',
               'SD': 'South Dakota','TN': 'Tennessee','TX': 'Texas','UT': 'Utah', 'VT': 'Vermont',
               'VA': 'Virginia','WA': 'Washington','WV': 'West Virginia','WI': 'Wisconsin','WY': 'Wyoming', 
               "nan": "nan"}

In [63]:
len(state_codes)

52

##### 8.2: Create a function that converts all of the state abbreviations to state names 

In [64]:
def code_to_state(code):
    return state_codes[code]

In [65]:
code_to_state("LA")

'Louisiana'

##### 8.3: Apply the ```code_to_state``` function to the ```birth_st``` and ```state``` columns

In [66]:
scotus["birth_st"] = scotus["birth_st"].apply(code_to_state)

In [67]:
scotus["birth_st"].unique()

array(['New York', 'South Carolina', 'Massachusetts', 'nan', 'Virginia',
       'Maryland', 'Connecticut', 'North Carolina', 'New Jersey',
       'Georgia', 'Pennsylvania', 'New Hampshire', 'Kentucky', 'Ohio',
       'Maine', 'Tennessee', 'Louisiana', 'Indiana', 'Minnesota',
       'Alabama', 'Michigan', 'Texas', 'California', 'Illinois', 'Kansas',
       'Colorado', 'Wisconsin', 'District of Columbia'], dtype=object)

In [68]:
scotus["state"] = scotus["state"].apply(code_to_state)

In [69]:
scotus["state"].unique()

array(['New York', 'South Carolina', 'Massachusetts', 'Pennsylvania',
       'Virginia', 'North Carolina', 'Maryland', 'New Jersey',
       'Connecticut', 'Kentucky', 'Ohio', 'Georgia', 'Tennessee',
       'Alabama', 'New Hampshire', 'Maine', 'Iowa', 'Illinois',
       'California', 'Mississippi', 'Kansas', 'Michigan', 'Louisiana',
       'Wyoming', 'Utah', 'Minnesota', 'Texas', 'Indiana', 'Missouri',
       'Colorado', 'Arizona'], dtype=object)
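
Because `code_to_state` indexes the dictionary directly, any code missing from `state_codes` raises a `KeyError`, which is why the `"nan"` placeholder entry is needed. As a sketch of an alternative, `Series.map` with a dictionary leaves unmatched values as `NaN` instead. The small `codes` dictionary and the `"XX"` code below are made up for illustration:

```python
import pandas as pd

# Small subset of state_codes, for illustration only
codes = {"NY": "New York", "SC": "South Carolina", "DC": "District of Columbia"}
birth_st = pd.Series(["NY", None, "DC", "XX"])  # "XX" is a made-up unknown code

# map leaves anything not in the dict (including missing values) as NaN
full_names = birth_st.map(codes)

# Codes that were present in the data but not found in the dict
unmapped = birth_st[full_names.isna() & birth_st.notna()]

print(full_names.tolist())
print(unmapped.tolist())
```

Checking `unmapped` after the mapping is a cheap way to catch typos in the abbreviations without the whole cell failing.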

**Task 9: Replace the President number values within the ```president``` column with the Presidents' names**

In [70]:
scotus["president"].unique()

array([ 1,  2,  3,  4,  5,  6,  7,  8, 10, 11, 13, 14, 15, 16, 18, 19, 20,
       21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
       38, 40, 41, 42, 43, 44, 45])

In [71]:
president_codes = {1: "George Washington", 2: "John Adams", 3: "Thomas Jefferson", 4: "James Madison", 
                   5: "James Monroe", 6: "John Quincy Adams", 7: "Andrew Jackson", 8: "Martin Van Buren", 
                   9: "William Henry Harrison", 10: "John Tyler", 11: "James Polk", 12: "Zachary Taylor", 
                   13: "Millard Fillmore", 14: "Franklin Pierce", 15: "James Buchanan", 16: "Abraham Lincoln", 
                   17: "Andrew Johnson", 18: "Ulysses Grant", 19: "Rutherford Hayes", 20: "James Garfield", 
                   21: "Chester Arthur", 22: "Grover Cleveland", 23: "Benjamin Harrison", 24: "Grover Cleveland",
                   25: "William McKinley", 26: "Theodore Roosevelt", 27: "William Taft", 28: "Woodrow Wilson", 
                   29: "Warren Harding", 30: "Calvin Coolidge", 31: "Herbert Hoover", 32: "Franklin Roosevelt",
                   33: "Harry Truman", 34: "Dwight Eisenhower", 35: "John Kennedy", 36: "Lyndon Johnson", 
                   37: "Richard Nixon", 38: "Gerald Ford", 39: "James Carter", 40: "Ronald Reagan", 
                   41: "George H.W. Bush", 42: "Bill Clinton", 43: "George W. Bush", 44: "Barack Obama",
                   45: "Donald Trump", 46: "Joseph Biden"}

In [72]:
president_codes[1]

'George Washington'

In [73]:
def code_to_president(code):
    return president_codes[code]

In [74]:
code_to_president(1)

'George Washington'

In [75]:
scotus["president"] = scotus["president"].apply(code_to_president)

In [76]:
scotus["president"].unique()

array(['George Washington', 'John Adams', 'Thomas Jefferson',
       'James Madison', 'James Monroe', 'John Quincy Adams',
       'Andrew Jackson', 'Martin Van Buren', 'John Tyler', 'James Polk',
       'Millard Fillmore', 'Franklin Pierce', 'James Buchanan',
       'Abraham Lincoln', 'Ulysses Grant', 'Rutherford Hayes',
       'James Garfield', 'Chester Arthur', 'Grover Cleveland',
       'Benjamin Harrison', 'William McKinley', 'Theodore Roosevelt',
       'William Taft', 'Woodrow Wilson', 'Warren Harding',
       'Calvin Coolidge', 'Herbert Hoover', 'Franklin Roosevelt',
       'Harry Truman', 'Dwight Eisenhower', 'John Kennedy',
       'Lyndon Johnson', 'Richard Nixon', 'Gerald Ford', 'Ronald Reagan',
       'George H.W. Bush', 'Bill Clinton', 'George W. Bush',
       'Barack Obama', 'Donald Trump'], dtype=object)

**Task 10: Check the dataframe one final time**

In [77]:
scotus.sample(3)

name,birth_date,birth_year,birth_month,birth_day,birth_st,gender,race,religion,state,party,...,term_start_month,term_start_day,term_end,term_end_year,term_end_month,term_end_day,death_date,death_year,death_month,death_day
Brett Kavanaugh,1965-02-12,1965,February,12,District of Columbia,Male,White,Roman Catholic,Maryland,Republican,...,October,6,,,,,,,,
Melville Fuller,1833-02-11,1833,February,11,Maine,Male,White,Episcopalian,Illinois,Democrat,...,October,8,1910-07-04,1910.0,July,4.0,1910-07-04,1910.0,7.0,4.0
David Souter,1939-09-17,1939,September,17,Massachusetts,Male,White,Episcopalian,New Hampshire,Republican,...,October,9,2009-06-30,2009.0,June,30.0,,,,


In [78]:
# name is the index, so keep the index in the CSV (index = False would drop the names)
scotus.to_csv('supreme_data_revised.csv')
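
Since `name` was set as the index in Task 3, the `index` argument of `to_csv` decides whether the Justices' names reach the file. A small round trip on a toy frame (an in-memory buffer and two rows mirroring values shown above, not the real file) illustrates the difference:

```python
import io

import pandas as pd

# Toy frame indexed by name, like scotus at this point
toy = pd.DataFrame(
    {"gender": ["Female", "Male"], "state": ["Massachusetts", "Colorado"]},
    index=pd.Index(["Elena Kagan", "Neil Gorsuch"], name="name"),
)

with_index = io.StringIO()
toy.to_csv(with_index)                  # index written as the first column
without_index = io.StringIO()
toy.to_csv(without_index, index=False)  # index (the names) left out

print(with_index.getvalue().splitlines()[0])
print(without_index.getvalue().splitlines()[0])

# Reading back with index_col restores the name index
with_index.seek(0)
restored = pd.read_csv(with_index, index_col="name")
print(restored.index.tolist())
```

When the saved file is loaded later, passing `index_col="name"` to `pd.read_csv` also avoids picking up a stray `Unnamed: 0` column like the one dropped in Task 1.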

**Part 2 (EDA) and Part 3 (Statistical Analysis) can be found [here](https://github.com/bzekeria/US-Supreme-Court/tree/main/Demographics) once completed**