## Dataset

The data come from inmate databases acquired from the Nebraska Department of Corrections public records: https://dcs-inmatesearch.ne.gov/Corrections/COR_download.htm

The objective is to combine the active and complete databases from the Nebraska Department of Corrections, delete unnecessary columns and rows, and prepare the new dataset for analysis.

## 1. Load the Data

Load the data using Pandas. 

Pandas 'ExcelFile' opens the workbook, and 'read_excel' then loads the chosen sheet as a Pandas DataFrame object.

In [1]:
import numpy as np
import pandas as pd

xls = pd.ExcelFile('inmateDB_updated.xlsx')
df = pd.read_excel(xls, 'Record Type 1')

## 2. Check the First Few Rows of Data

The DataFrame's first five rows can be viewed using the .head() method.

In [2]:
df.head()

Unnamed: 0,ID NUMBER,COMMITTED LAST NAME,FIRST NAME,MIDDLE NAME,NAME EXTENSION,LEGAL LAST NAME,FIRST NAME2,MIDDLE NAME3,NAME EXTENSION4,DATE OF BIRTH,...,PAROLE ELIGIBILITY DATE,EARLIEST POSSIBLE RELEASE DATE,GOOD TIME LAW,INST RELEASE DATE,INST RELEASE TYPE,PAROLE BOARD NEXT REVIEW DATE(MONTH&YEAR),PAROLE BOARD FINAL HEARING DATE(MONTH&YEAR),PAROLE BOARD STATUS,PAROLE DATE,PAROLE DISCHARGE DESC
0,1702,CLIFFORD,BRADLEY,,,,,,,NaT,...,,,,1986-01-06,MANDATORY DISCHARGE,,NaT,,NaT,
1,6145,KANE,THOMAS,,,,,,,1928-12-21,...,1952-06-20 00:00:00,,2926.0,1952-08-31,ESCAPE,,NaT,,NaT,
2,6452,ATKINS,LARRY,,,,,,,1929-07-26,...,,,,1955-07-20,DISCRETIONARY PAROLE,,NaT,PAROLED,1980-12-09,EARLY DISCHARGE BY PAROLE BRD
3,12444,SHANEYFELT,CHARLEY,,,,,,,1905-04-10,...,,,,1987-12-24,MANDATORY DISCHARGE,,NaT,,NaT,
4,15379,BEADES,JOE,,,,,,,1924-10-12,...,1955-05-02 00:00:00,LFE,2926.0,1989-07-19,DISCRETIONARY PAROLE,,NaT,PAROLED,1993-01-17,EARLY DISCHARGE BY PAROLE BRD


## 3. Description of Data

The DataFrame info() method is used to see helpful descriptions of the data, such as the column name and number of rows. The 'Non-Null Count' is the number of rows that have a value for that particular column. The 'Dtype' is the data type found within each column. An int64 is an integer, an object type is usually written text, and datetime64 is a date time value. 

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 72954 entries, 0 to 72953
Data columns (total 32 columns):
 #   Column                                       Non-Null Count  Dtype         
---  ------                                       --------------  -----         
 0   ID NUMBER                                    72954 non-null  int64         
 1   COMMITTED LAST NAME                          72953 non-null  object        
 2   FIRST NAME                                   72953 non-null  object        
 3   MIDDLE NAME                                  54905 non-null  object        
 4   NAME EXTENSION                               72954 non-null  object        
 5   LEGAL LAST NAME                              1045 non-null   object        
 6   FIRST NAME2                                  1045 non-null   object        
 7   MIDDLE NAME3                                 1045 non-null   object        
 8   NAME EXTENSION4                              1045 non-null   object        


## 4. Creating a New Active Prisoner Distinction Column

A new column will be created that reflects whether each inmate is actively incarcerated or is no longer in prison. The column flags inmates in the full database who also appear in the active inmate database.

### 4.1 Examining the Active Prisoner Database

The active prisoner database will be loaded and examined in the same manner as the full inmate database.

In [4]:
xls = pd.ExcelFile('inmateDownloadActive.xlsx')
df_Active = pd.read_excel(xls, 'Record Type 1')

Some columns in the active prisoner database are named slightly differently from those in the full database (for example, FIRST NAME.1 rather than FIRST NAME2), but it contains the same basic information.

In [5]:
df_Active.head()

Unnamed: 0,ID NUMBER,COMMITTED LAST NAME,FIRST NAME,MIDDLE NAME,NAME EXTENSION,LEGAL LAST NAME,FIRST NAME.1,MIDDLE NAME.1,NAME EXTENSION.1,DATE OF BIRTH,...,PAROLE ELIGIBILITY DATE,EARLIEST POSSIBLE RELEASE DATE,GOOD TIME LAW,INST RELEASE DATE,INST RELEASE TYPE,PAROLE BOARD NEXT REVIEW DATE(MONTH&YEAR),PAROLE BOARD FINAL HEARING DATE(MONTH&YEAR),PAROLE BOARD STATUS,Unnamed: 30,Unnamed: 31
0,6145,KANE,THOMAS,,,,,,,1928-12-21,...,1952-06-20 00:00:00,,2926,1952-08-31,ESCAPE,,NaT,,,
1,20841,ARNOLD,WILLIAM,L,,,,,,1942-08-28,...,1959-06-02 00:00:00,LFE,2926,1967-07-15,ESCAPE,1959-12-01 00:00:00,NaT,INITIAL REVIEW,,
2,25324,WALKER,RICHARD,T,,,,,,1946-12-24,...,1972-12-15 00:00:00,LFE,2926,2008-11-25,DISCRETIONARY PAROLE,,NaT,CONTINUED ON PAROLE,,
3,25565,ALVAREZ,THOMAS,A,,,,,,1947-10-24,...,,,2926,NaT,,2023-05-01 00:00:00,NaT,DEFERRED,,
4,26103,ADAMS,BRIAN,J,,,,,,1949-04-20,...,LFE,LFE,2926,NaT,,2024-03-01 00:00:00,NaT,DEFERRED,,


In [6]:
df_Active.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7453 entries, 0 to 7452
Data columns (total 32 columns):
 #   Column                                       Non-Null Count  Dtype         
---  ------                                       --------------  -----         
 0   ID NUMBER                                    7453 non-null   int64         
 1   COMMITTED LAST NAME                          7453 non-null   object        
 2   FIRST NAME                                   7453 non-null   object        
 3   MIDDLE NAME                                  5464 non-null   object        
 4   NAME EXTENSION                               7453 non-null   object        
 5   LEGAL LAST NAME                              102 non-null    object        
 6   FIRST NAME.1                                 102 non-null    object        
 7   MIDDLE NAME.1                                102 non-null    object        
 8   NAME EXTENSION.1                             102 non-null    object        
 9

### 4.2 Assigning a new 'Active' Inmate Column

The pandas DataFrame assign() method returns a new DataFrame with an added column, 'ACTIVE', that reflects whether a prisoner is active or no longer incarcerated. The Series .isin() method checks whether each ID NUMBER in the full database appears in the active inmate database, and .astype(int) converts the resulting True/False values to integers. The result is stored in a new DataFrame, 'df1', which holds all the previous columns as well as the new ACTIVE column.

In [7]:
df1 = df.assign(ACTIVE=df['ID NUMBER']
                .isin(df_Active['ID NUMBER'])
                .astype(int))

The new column 'ACTIVE' appears as the last column. A 1 means the prisoner is active, and a 0 means they are no longer incarcerated.

In [8]:
df1.head()

Unnamed: 0,ID NUMBER,COMMITTED LAST NAME,FIRST NAME,MIDDLE NAME,NAME EXTENSION,LEGAL LAST NAME,FIRST NAME2,MIDDLE NAME3,NAME EXTENSION4,DATE OF BIRTH,...,EARLIEST POSSIBLE RELEASE DATE,GOOD TIME LAW,INST RELEASE DATE,INST RELEASE TYPE,PAROLE BOARD NEXT REVIEW DATE(MONTH&YEAR),PAROLE BOARD FINAL HEARING DATE(MONTH&YEAR),PAROLE BOARD STATUS,PAROLE DATE,PAROLE DISCHARGE DESC,ACTIVE
0,1702,CLIFFORD,BRADLEY,,,,,,,NaT,...,,,1986-01-06,MANDATORY DISCHARGE,,NaT,,NaT,,0
1,6145,KANE,THOMAS,,,,,,,1928-12-21,...,,2926.0,1952-08-31,ESCAPE,,NaT,,NaT,,1
2,6452,ATKINS,LARRY,,,,,,,1929-07-26,...,,,1955-07-20,DISCRETIONARY PAROLE,,NaT,PAROLED,1980-12-09,EARLY DISCHARGE BY PAROLE BRD,0
3,12444,SHANEYFELT,CHARLEY,,,,,,,1905-04-10,...,,,1987-12-24,MANDATORY DISCHARGE,,NaT,,NaT,,0
4,15379,BEADES,JOE,,,,,,,1924-10-12,...,LFE,2926.0,1989-07-19,DISCRETIONARY PAROLE,,NaT,PAROLED,1993-01-17,EARLY DISCHARGE BY PAROLE BRD,0
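
The membership flag can be illustrated on a pair of tiny frames. This is a sketch only: the names `full`, `active`, and `flagged` are hypothetical stand-ins for the two databases and the result.

```python
import pandas as pd

# Hypothetical stand-ins for the full and active databases.
full = pd.DataFrame({"ID NUMBER": [1702, 6145, 6452]})
active = pd.DataFrame({"ID NUMBER": [6145, 20841]})

# isin() yields True/False per row; astype(int) turns that into 1/0.
flagged = full.assign(
    ACTIVE=full["ID NUMBER"].isin(active["ID NUMBER"]).astype(int)
)
print(flagged)
```

Because assign() returns a new DataFrame, `full` itself is left without an ACTIVE column. IDs that appear only in the active frame (20841 here) are not added to the result.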


## 5. Initial Data Cleaning

Datasets are almost always imperfect and this can hinder future analysis. 

### 5.1 Dropping unneeded columns 

The removed columns contained unnecessary personal information about the inmates or details of their sentences that were not useful for this research's more macro-level lens. The NDCS also described the parole information as fairly incomplete and too difficult to research.

Unneeded columns can be deleted using the DataFrame drop() method.

In [9]:
# Each column label is listed once; axis=1 drops columns rather than rows
df1 = df1.drop(['COMMITTED LAST NAME',
        'FIRST NAME',
        'MIDDLE NAME',
        'NAME EXTENSION',
        'LEGAL LAST NAME',
        'FIRST NAME2',
        'MIDDLE NAME3',
        'NAME EXTENSION4',
        'GUN CLAUSE',
        'MIN MONTH',
        'CURRENT SENTENCE PARDONED OR COMMUTED DATE',
        'MIN DAY',
        'MAX MONTH',
        'MAX DAY',
        'PAROLE ELIGIBILITY DATE',
        'GOOD TIME LAW',
        'EARLIEST POSSIBLE RELEASE DATE',
        'INST RELEASE TYPE',
        'PAROLE BOARD NEXT REVIEW DATE(MONTH&YEAR)',
        'PAROLE BOARD FINAL HEARING DATE(MONTH&YEAR)',
        'PAROLE BOARD STATUS',
        'PAROLE DISCHARGE DESC',
        'PAROLE DATE'],axis=1)

The remaining columns can now be seen.

In [10]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 72954 entries, 0 to 72953
Data columns (total 10 columns):
 #   Column               Non-Null Count  Dtype         
---  ------               --------------  -----         
 0   ID NUMBER            72954 non-null  int64         
 1   DATE OF BIRTH        72940 non-null  datetime64[ns]
 2   RACE DESC            72954 non-null  object        
 3   GENDER               72954 non-null  object        
 4   FACILITY             14080 non-null  object        
 5   SENTENCE BEGIN DATE  71475 non-null  datetime64[ns]
 6   MIN TERM/YEAR        72954 non-null  object        
 7   MAX TERM/YEAR        72954 non-null  object        
 8   INST RELEASE DATE    67994 non-null  datetime64[ns]
 9   ACTIVE               72954 non-null  int64         
dtypes: datetime64[ns](3), int64(2), object(5)
memory usage: 5.6+ MB


In [11]:
df1.head()

Unnamed: 0,ID NUMBER,DATE OF BIRTH,RACE DESC,GENDER,FACILITY,SENTENCE BEGIN DATE,MIN TERM/YEAR,MAX TERM/YEAR,INST RELEASE DATE,ACTIVE
0,1702,NaT,,MALE,,NaT,0,0,1986-01-06,0
1,6145,1928-12-21,WHITE,MALE,NEBRASKA STATE PENITENTIARY,1952-06-20,1,3,1952-08-31,1
2,6452,1929-07-26,WHITE,MALE,NEBRASKA STATE PENITENTIARY,1953-11-25,2,10,1955-07-20,0
3,12444,1905-04-10,WHITE,MALE,,1935-10-15,1,9,1987-12-24,0
4,15379,1924-10-12,WHITE,MALE,,1945-05-02,10,LFE,1989-07-19,0


### 5.2 Dropping All "NA" Missing Features

We can check if any columns have missing data values and count them by using the isnull() method and sum() method. 

In [12]:
df1['SENTENCE BEGIN DATE'].isnull().sum()

1479

Rows with missing values (NA) in certain columns are removed from the data. The DataFrame dropna() method can be used to do this.

We drop all inmates who have an unknown sentence begin date.

In [13]:
df1 = df1.dropna(axis=0, how="any", subset=['SENTENCE BEGIN DATE'])

We can confirm that no missing values remain in this column.

In [14]:
df1['SENTENCE BEGIN DATE'].isnull().sum()

0

We drop all inmates who have an unknown date of birth.

In [15]:
df1['DATE OF BIRTH'].isnull().sum()

6

In [16]:
df1 = df1.dropna(axis=0, how="any", subset=['DATE OF BIRTH'])
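
The count-then-drop pattern used in this section can be sketched on a small made-up frame. Passing both column names in `subset` handles the two checks in one call, while missing values in other columns (FACILITY here) do not cause a row to be dropped.

```python
import pandas as pd

# Made-up rows; None becomes NaT in the date columns.
toy = pd.DataFrame({
    "SENTENCE BEGIN DATE": pd.to_datetime(
        ["1952-06-20", None, "1953-11-25", "1960-01-01"]),
    "DATE OF BIRTH": pd.to_datetime(
        ["1928-12-21", "1930-01-01", None, "1935-05-05"]),
    "FACILITY": [None, "A", "B", None],
})

missing_before = toy.isnull().sum()  # missing values per column

# Only the columns named in subset decide whether a row is dropped.
cleaned = toy.dropna(subset=["SENTENCE BEGIN DATE", "DATE OF BIRTH"])
print(missing_before)
print(cleaned)
```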

### 5.3 Checking for Duplicate Rows

Sometimes data contains duplicate rows of information that may skew future analysis. These can be found by using the DataFrame duplicated() method.

In [17]:
duplicate = df1[df1.duplicated()] 
duplicate

Unnamed: 0,ID NUMBER,DATE OF BIRTH,RACE DESC,GENDER,FACILITY,SENTENCE BEGIN DATE,MIN TERM/YEAR,MAX TERM/YEAR,INST RELEASE DATE,ACTIVE


The duplicate rows are removed by using DataFrame drop_duplicates(). 

In [18]:
df1 = df1.drop_duplicates()

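No duplicate rows were found, so drop_duplicates() leaves the data unchanged. The lower entry count below reflects the 1,485 rows dropped in section 5.2.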

In [19]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 71469 entries, 1 to 72953
Data columns (total 10 columns):
 #   Column               Non-Null Count  Dtype         
---  ------               --------------  -----         
 0   ID NUMBER            71469 non-null  int64         
 1   DATE OF BIRTH        71469 non-null  datetime64[ns]
 2   RACE DESC            71469 non-null  object        
 3   GENDER               71469 non-null  object        
 4   FACILITY             13589 non-null  object        
 5   SENTENCE BEGIN DATE  71469 non-null  datetime64[ns]
 6   MIN TERM/YEAR        71469 non-null  object        
 7   MAX TERM/YEAR        71469 non-null  object        
 8   INST RELEASE DATE    66510 non-null  datetime64[ns]
 9   ACTIVE               71469 non-null  int64         
dtypes: datetime64[ns](3), int64(2), object(5)
memory usage: 6.0+ MB
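
By default duplicated() flags a row only when every column matches an earlier row. The sketch below, on made-up values, contrasts that with a check restricted to one column through the `subset` parameter, which would reveal an ID NUMBER that appears more than once with different sentence dates.

```python
import pandas as pd

toy = pd.DataFrame({
    "ID NUMBER": [1, 1, 2, 2],
    "SENTENCE BEGIN DATE": ["1990-01-01", "1990-01-01",
                            "1995-03-03", "2001-07-07"],
})

full_dupes = toy.duplicated().sum()                    # whole-row repeats
id_dupes = toy.duplicated(subset=["ID NUMBER"]).sum()  # repeated IDs only
deduped = toy.drop_duplicates()                        # removes whole-row repeats

print(full_dupes, id_dupes, len(deduped))
```

Whether repeated IDs should be kept depends on the analysis; this notebook removes only exact whole-row repeats.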


## 6. Adding in Inmate Age Columns for Further Analysis

Some columns will be added using information from the original columns to help expand the analysis.

### 6.1 Changing Column Data Types to Datetime 

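Datetime is a data type used for dates and times. The pandas to_datetime() function converts a column to the datetime64 type. Both columns were already read as datetime64 (see the info() output above), so this step simply copies them into new columns with a DT suffix.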

In [20]:
df1['SENTENCE BEGIN DATE DT'] = pd.to_datetime(df1['SENTENCE BEGIN DATE'])
df1['DATE OF BIRTH DT'] = pd.to_datetime(df1['DATE OF BIRTH'])

### 6.2 Finding Age of Inmates at Time of Incarceration

This age will be helpful to see at what age inmates were incarcerated and how that may have changed over time.

datetime is also a Python standard-library module for working with dates and times. Its timedelta class represents the difference between two dates. The Sentence Begin Age is found by taking the number of days between the sentence begin date and the inmate's date of birth and dividing it by 365 days to get the age in years. Because this ignores leap days, the result slightly overstates the true age.

In [21]:
import datetime as dt 
from datetime import timedelta

df1['SENTENCE BEGIN AGE DAYS'] = df1['SENTENCE BEGIN DATE DT'] - df1['DATE OF BIRTH DT']
df1['SENTENCE BEGIN AGE'] = df1['SENTENCE BEGIN AGE DAYS'] / timedelta(days=365)
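
The effect of the divisor can be checked on a single pair of dates taken from the first rows shown earlier (born 1928-12-21, sentence begun 1952-06-20). This sketch compares the 365-day divisor used here with a 365.25-day divisor, which is a common approximation and not part of the original analysis.

```python
import pandas as pd

birth = pd.to_datetime(pd.Series(["1928-12-21"]))
start = pd.to_datetime(pd.Series(["1952-06-20"]))

elapsed = start - birth  # a timedelta column

# Dividing a timedelta by a timedelta gives a plain float.
age_365 = elapsed / pd.Timedelta(days=365)
age_36525 = elapsed / pd.Timedelta(days=365.25)  # assumed alternative divisor

print(elapsed.dt.days.iloc[0], age_365.iloc[0], age_36525.iloc[0])
```

The difference is a small fraction of a year, so it matters little for broad trends but can shift an inmate across a birthday boundary.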

### 6.3 Finding Current Age of Inmates

This age will be helpful to see how old inmates are now, especially for active ones.

The current date is obtained with pd.Timestamp.now() and truncated to midnight with .normalize(). The current age is the number of days between the current date and the inmate's date of birth, divided by 365 to give an age in years. Because the date is taken when the notebook runs, the CURRENT AGE values change with each run.

In [22]:
# pd.datetime was deprecated in pandas 1.0 and removed in 2.0; pd.Timestamp is the supported equivalent
date = pd.Timestamp.now().normalize()
df1['CURRENT AGE DAYS'] = date - df1['DATE OF BIRTH DT']
df1['CURRENT AGE'] = df1['CURRENT AGE DAYS'] / timedelta(days=365)


We can drop these columns since they were just made temporarily to get the ages in years. 

In [23]:
df1 = df1.drop([
             'CURRENT AGE DAYS',
             'SENTENCE BEGIN AGE DAYS'],axis=1)

### 6.4 Making a New Column for the Sentence Begin Year

This column will be helpful to later separate inmates into different decade cohorts for analysis.

The .dt accessor exposes datetime properties for a column, and its .year attribute extracts just the year portion of each date.

In [24]:
df1['SENTENCE BEGIN YEAR'] = df1['SENTENCE BEGIN DATE DT'].dt.year
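
One way the year column could later be turned into decade cohorts is integer division by ten. This is a sketch only: the column name 'SENTENCE BEGIN DECADE' is hypothetical and does not appear in the dataset.

```python
import pandas as pd

toy = pd.DataFrame({
    "SENTENCE BEGIN DATE DT": pd.to_datetime(
        ["1952-06-20", "1989-07-19", "2008-11-25"]),
})

toy["SENTENCE BEGIN YEAR"] = toy["SENTENCE BEGIN DATE DT"].dt.year

# Floor each year to the start of its decade, e.g. 1989 -> 1980.
# 'SENTENCE BEGIN DECADE' is an illustrative name, not an existing column.
toy["SENTENCE BEGIN DECADE"] = toy["SENTENCE BEGIN YEAR"] // 10 * 10
print(toy)
```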

## 7. Altering Life Sentencing Values

Some inmates receive life sentences, recorded as the value 'LFE' in MIN TERM/YEAR or MAX TERM/YEAR instead of an integer. This non-numeric value prevents calculating averages of the MIN/MAX TERM columns. These values are replaced with a numerical estimate of how many years of life the inmate had remaining at the start of the sentence.

### 7.1 Declaring Average Lifespan Values and Finding Years Between Lifespan and Inmate Sentence Begin Age

lfeM represents the average lifespan of a male Nebraskan, and lfeF represents the average lifespan of a female Nebraskan. These values were obtained from (LINK). The columns LFE SENTENCE M and LFE SENTENCE F are created by subtracting each inmate's age at the start of their incarceration from the respective average lifespan.

In [25]:
lfeM = 77.7
lfeF = 81.89

df1['LFE SENTENCE M'] = lfeM - df1['SENTENCE BEGIN AGE']
df1['LFE SENTENCE F'] = lfeF - df1['SENTENCE BEGIN AGE']

### 7.2 Assigning Found Values to a Life Sentence Column

The lfe_sentence() function checks whether a row describes a male or female inmate and returns the value from the corresponding LFE SENTENCE column. A new column, LFE SENTENCE MIN/MAX, is created by applying lfe_sentence() to each row of the data, and its values are then rounded with .round().

In [26]:
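def lfe_sentence(row):
    # Pick the remaining-life estimate that matches the inmate's recorded gender;
    # any value other than 'MALE' falls through to the female estimate.
    if row['GENDER'] == 'MALE':
        return row['LFE SENTENCE M']
    else:
        return row['LFE SENTENCE F']

df1['LFE SENTENCE MIN/MAX'] = df1.apply(lfe_sentence,axis=1)
df1['LFE SENTENCE MIN/MAX'] = df1['LFE SENTENCE MIN/MAX'].round()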
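
As an alternative to the row-by-row apply() above, the same choice can be sketched in vectorized form with np.where, which avoids calling a Python function once per row. The toy ages below are made up; the lifespan constants are the ones declared in section 7.1.

```python
import numpy as np
import pandas as pd

lfeM, lfeF = 77.7, 81.89  # average lifespans from section 7.1

toy = pd.DataFrame({
    "GENDER": ["MALE", "FEMALE"],
    "SENTENCE BEGIN AGE": [20.0, 30.0],
})

# Choose the lifespan per row, then subtract the age at sentencing.
expectancy = np.where(toy["GENDER"] == "MALE", lfeM, lfeF)
toy["LFE SENTENCE MIN/MAX"] = (expectancy - toy["SENTENCE BEGIN AGE"]).round()
print(toy)
```

This version also makes the intermediate LFE SENTENCE M and LFE SENTENCE F columns unnecessary.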

### 7.3 Replace MIN or MAX TERM/YEAR Values with Life Sentence Age Values

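Now the original MIN or MAX TERM/YEAR value of 'LFE' is replaced with the value in LFE SENTENCE MIN/MAX using the NumPy where() function. where() chooses between two sources based on a condition: where the term equals 'LFE' it takes the life-sentence estimate, and otherwise it keeps the original value.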

In [27]:
df1['MIN TERM/YEAR'] = np.where(df1['MIN TERM/YEAR'] == 'LFE', 
                                df1['LFE SENTENCE MIN/MAX'], df1['MIN TERM/YEAR'])
df1['MAX TERM/YEAR'] = np.where(df1['MAX TERM/YEAR'] == 'LFE', 
                                df1['LFE SENTENCE MIN/MAX'], df1['MAX TERM/YEAR'])

### 7.4 Dropping Old Columns

The columns used to create these life sentence replacement values can now be deleted.

In [28]:
df1 = df1.drop(['LFE SENTENCE M',
                'LFE SENTENCE F',
                'LFE SENTENCE MIN/MAX'],axis=1)

In [29]:
df1.head()

Unnamed: 0,ID NUMBER,DATE OF BIRTH,RACE DESC,GENDER,FACILITY,SENTENCE BEGIN DATE,MIN TERM/YEAR,MAX TERM/YEAR,INST RELEASE DATE,ACTIVE,SENTENCE BEGIN DATE DT,DATE OF BIRTH DT,SENTENCE BEGIN AGE,CURRENT AGE,SENTENCE BEGIN YEAR
1,6145,1928-12-21,WHITE,MALE,NEBRASKA STATE PENITENTIARY,1952-06-20,1,3,1952-08-31,1,1952-06-20,1928-12-21,23.512329,92.238356,1952
2,6452,1929-07-26,WHITE,MALE,NEBRASKA STATE PENITENTIARY,1953-11-25,2,10,1955-07-20,0,1953-11-25,1929-07-26,24.350685,91.643836,1953
3,12444,1905-04-10,WHITE,MALE,,1935-10-15,1,9,1987-12-24,0,1935-10-15,1905-04-10,30.534247,115.953425,1935
4,15379,1924-10-12,WHITE,MALE,,1945-05-02,10,57,1989-07-19,0,1945-05-02,1924-10-12,20.567123,96.432877,1945
6,16657,1929-01-10,WHITE,MALE,,1948-12-22,58,58,2002-12-27,0,1948-12-22,1929-01-10,19.961644,92.183562,1948


## 8. Deleting Death and Independent Values

A small number of MIN or MAX TERM/YEAR values are labelled DTH or IND, which stand for Death and Independent sentences. These labels cannot be averaged with numeric terms, and the affected inmates (fewer than 30) would act as major outliers relative to the rest of the database, so their rows are removed to simplify the data.

### 8.1 Checking to See if IND or DTH Values Exist 

The Series method str.contains() checks each value in a column for the given text, and .any() returns True if at least one value matches.

In [30]:
df1['MAX TERM/YEAR'].str.contains('IND').any()

True

In [31]:
df1['MAX TERM/YEAR'].str.contains('DTH').any()

True

### 8.2 Drop rows where values equal IND or DTH

The drop() method is used again, this time to delete any rows where the MIN or MAX TERM/YEAR value is equal to DTH or IND.

In [32]:
# Build each mask from df1 itself (not the original df) so the mask and the rows share one index
df1.drop(df1.loc[df1['MIN TERM/YEAR']=='DTH'].index, inplace=True)
df1.drop(df1.loc[df1['MIN TERM/YEAR']=='IND'].index, inplace=True)

df1.drop(df1.loc[df1['MAX TERM/YEAR']=='DTH'].index, inplace=True)
df1.drop(df1.loc[df1['MAX TERM/YEAR']=='IND'].index, inplace=True)
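
The same row removal can also be expressed with a single boolean mask built from isin(). This is a sketch on made-up values, and the names `codes`, `bad`, and `kept` are illustrative.

```python
import pandas as pd

toy = pd.DataFrame({
    "MIN TERM/YEAR": [1, "DTH", 5, 2],
    "MAX TERM/YEAR": [3, "DTH", "IND", 10],
})

codes = ["DTH", "IND"]

# A row is flagged if either column holds one of the codes.
bad = toy["MIN TERM/YEAR"].isin(codes) | toy["MAX TERM/YEAR"].isin(codes)
kept = toy[~bad]
print(kept)
```

Summing the mask (`bad.sum()`) gives the number of rows that will be removed before anything is dropped.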

### 8.3 Check Work

In [33]:
df1['MAX TERM/YEAR'].str.contains('IND').any()

False

In [34]:
df1['MAX TERM/YEAR'].str.contains('DTH').any()

False
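
A broader check is to list every value that cannot be read as a number, which would catch codes other than IND or DTH as well. The sketch below uses pd.to_numeric with errors="coerce" on made-up values. It is offered as a complement to str.contains(), which returns NaN rather than True or False for entries that are not strings.

```python
import pandas as pd

terms = pd.Series([1, "DTH", 58.0, "IND", 3], name="MAX TERM/YEAR")

# Values that cannot be parsed become NaN instead of raising an error.
as_number = pd.to_numeric(terms, errors="coerce")

# The original entries behind those NaNs are the leftover text codes.
leftover = terms[as_number.isna()].unique().tolist()
print(leftover)
```

An empty list means the column is ready for numeric conversion.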

### 8.4 Changing the MIN and MAX TERM/YEAR to numeric values

The pandas function to_numeric() converts the MIN and MAX TERM/YEAR values, which are stored as objects, into floating-point numbers (float64). This will make finding future inmate sentencing averages easier.

In [35]:
df1['MIN TERM/YEAR'] = pd.to_numeric(df1['MIN TERM/YEAR'])
df1['MAX TERM/YEAR'] = pd.to_numeric(df1['MAX TERM/YEAR'])

Now the value type change can be seen. 

In [36]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 71440 entries, 1 to 72953
Data columns (total 15 columns):
 #   Column                  Non-Null Count  Dtype         
---  ------                  --------------  -----         
 0   ID NUMBER               71440 non-null  int64         
 1   DATE OF BIRTH           71440 non-null  datetime64[ns]
 2   RACE DESC               71440 non-null  object        
 3   GENDER                  71440 non-null  object        
 4   FACILITY                13570 non-null  object        
 5   SENTENCE BEGIN DATE     71440 non-null  datetime64[ns]
 6   MIN TERM/YEAR           71440 non-null  float64       
 7   MAX TERM/YEAR           71440 non-null  float64       
 8   INST RELEASE DATE       66493 non-null  datetime64[ns]
 9   ACTIVE                  71440 non-null  int64         
 10  SENTENCE BEGIN DATE DT  71440 non-null  datetime64[ns]
 11  DATE OF BIRTH DT        71440 non-null  datetime64[ns]
 12  SENTENCE BEGIN AGE      71440 non-null  float6

## 9. Reducing the Database to Only Contain Inmates Incarcerated After 1979

This is the period needed to study broad trends over time. Earlier incarcerations are treated as outlier data because their digital documentation is incomplete.

The DataFrame can be altered to only keep rows where inmates were incarcerated after 1979.

In [37]:
df1 = df1[df1['SENTENCE BEGIN YEAR'] > 1979]

The DataFrame lost 2,857 rows of inmates (71,440 before, 68,583 after).

In [38]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 68583 entries, 142 to 72953
Data columns (total 15 columns):
 #   Column                  Non-Null Count  Dtype         
---  ------                  --------------  -----         
 0   ID NUMBER               68583 non-null  int64         
 1   DATE OF BIRTH           68583 non-null  datetime64[ns]
 2   RACE DESC               68583 non-null  object        
 3   GENDER                  68583 non-null  object        
 4   FACILITY                11586 non-null  object        
 5   SENTENCE BEGIN DATE     68583 non-null  datetime64[ns]
 6   MIN TERM/YEAR           68583 non-null  float64       
 7   MAX TERM/YEAR           68583 non-null  float64       
 8   INST RELEASE DATE       63663 non-null  datetime64[ns]
 9   ACTIVE                  68583 non-null  int64         
 10  SENTENCE BEGIN DATE DT  68583 non-null  datetime64[ns]
 11  DATE OF BIRTH DT        68583 non-null  datetime64[ns]
 12  SENTENCE BEGIN AGE      68583 non-null  floa
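
The number of rows removed by a filter can be computed rather than read off two info() outputs. A minimal sketch on made-up years follows; the names `recent` and `rows_removed` are illustrative.

```python
import pandas as pd

toy = pd.DataFrame({"SENTENCE BEGIN YEAR": [1952, 1979, 1980, 2005]})

# Strictly greater than 1979, so 1980 is the first year kept.
recent = toy[toy["SENTENCE BEGIN YEAR"] > 1979]
rows_removed = len(toy) - len(recent)
print(rows_removed)
```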

## 10. Converting DataFrame to CSV for Future Use

The DataFrame to_csv() method writes a DataFrame to a CSV file for easier storage and sharing.

In [39]:
df1.to_csv('inmate_updatedClean_demographics.csv', encoding='utf-8', index=False)
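
CSV files store dates as plain text, so date columns need to be parsed again when the file is read back. The sketch below shows the round trip on a one-row made-up frame, using an in-memory buffer in place of a file on disk.

```python
import io

import pandas as pd

toy = pd.DataFrame({
    "ID NUMBER": [6145],
    "DATE OF BIRTH": pd.to_datetime(["1928-12-21"]),
})

buffer = io.StringIO()           # stands in for a CSV file
toy.to_csv(buffer, index=False)
buffer.seek(0)

# parse_dates restores the datetime type that CSV text does not preserve.
restored = pd.read_csv(buffer, parse_dates=["DATE OF BIRTH"])
print(restored.dtypes)
```

Without parse_dates, the date column would come back as text.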