# [ES-21AC] State Prisons and County Jails 

---
### Professor: Victoria Robinson 
### Data Science Fellow: Phillip Pierini

The goal of this project is for students to understand that they can do social work using data. This notebook explores the incarceration trends and impacts of prison realignment in California. 

*Estimated Time: 50 minutes*

---

### Table of Contents

[THE DATA](#sectiondata)<br>


[CONTEXT](#sectioncontext)<br>


[JAILS](#section1)<br>

1. [DATA](#subsection1)<br>
2. [DATA ANALYSIS](#subsection2)<br>
3. [GENERAL: SENTENCED & UNSENTENCED](#subsection3)<br>
4. [GENDER DIVISION](#subsection4)<br>

[Final Survey](#section3)<br>

---

**Please run the cell below before you begin.**

**Dependencies:**

In [None]:
from datascience import *
import numpy as np

import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

import ipywidgets as widgets
%run Data_Cleaning_and_Helper_Functions.ipynb

--- 

## THE DATA <a id='sectiondata'></a>

In this notebook, you will use data from the Jail Profile Survey provided by the Board of State and Community Corrections (BSCC). 


The Jail Profile Survey releases reports on data recorded by local agencies. Some of the valuable information that you will find here includes the total number of unsentenced and sentenced males and females in each of these facilities. This data has been used to determine the needs of each county when distributing state bond monies, and to make future projections for other jail needs. 

If you are interested in learning more please visit:

[Jail Profile Survey](http://www.bscc.ca.gov/downloads/JPSWorkbook.pdf) 

[Jail Profile Survey data](https://app.bscc.ca.gov/joq//jps/QuerySelection.asp)



---
## CONTEXT  <a id='sectioncontext'></a>
---

During the course, we have learned about the policies of realignment, incarceration, and crime trends in California. By exploring these datasets we hope to get a clear picture of the magnitude of prison and county jail overcrowding per facility and region, and the effects of realignment policies in state prisons and county jails.



The key difference between state prisons and jails involves the sentencing process. Prisons are designed for long-term sentences, while jails are for those who are unsentenced or have short-term sentences. (Short-term sentences are generally one year or less.) Prisons are larger and controlled at the state level. In contrast, jails are smaller and handled by a city or county.



The relationship between the two institutions is emphasized by mass incarceration. Through this activity, we will analyze how overcrowding within California’s state prisons influenced the size of jail populations after realignment policies were implemented.


---
# JAILS <a id='section1'></a>
---

## 0. DATA<a id='subsection1'></a>
---

### 0.1  Data Dictionary 

Below you will find a data dictionary for future reference.

|Column Name   | Description |
|--------------|---------|
|Jurisdiction | The unit of government that has legal authority over an inmate (state or federal)|
|Facility | Name of the county jail |
|Year |Year that the data was collected |
|Month | Month that the data was collected |
|Unsentenced males| Non-sentenced inmates are all inmates other than those who have been sentenced on all charges pending * ** |
|Unsentenced females| Non-sentenced inmates are all inmates other than those who have been sentenced on all charges pending * **|
|Sentenced males| Sentenced inmates are those who have been sentenced on all charges and are no longer on trial. This category includes inmates who are being incarcerated pending or during an appeal. * |
|Sentenced females|Sentenced inmates are those who have been sentenced on all charges and are no longer on trial. This category includes inmates who are being incarcerated pending or during an appeal. * |
|Total facility ADP| ADP Total should include all inmates (including those under contract from any agency/jurisdiction) assigned to all single/double and multiple occupancy cells, administrative segregation, disciplinary isolation, and medical and mental health beds.|


\* Note that the counts for sentenced and unsentenced male/female inmates are an *average daily population (ADP) for the given month*.


** For example, if an inmate has been sentenced on three charges but is still being tried on a fourth charge, they should be reported as “non-sentenced.”

** If an inmate is found not to be competent for trial and is detained in a county jail facility, count them in Non-Sentenced (Male/Female & Misdemeanor/Felony). If they are detained in the state hospital, do not count them in any category.


### 0.2 Importing the Data
Let's start by importing our jails data into our Jupyter Notebook so we can analyze it. To do this, we can use the `datascience` package and the `read_table()` function. This function takes in the name of our CSV file (which in our case is "jails_cleaned.csv") and reads the file for us so we can use it here.

In [None]:
jail = Table().read_table("jails_cleaned.csv")

Now let's look at the first 5 entries. Use the function `show` to display the number of desired rows.

In [None]:
jail.show(5)

---
## 1. DATA ANALYSIS<a id='subsection2'></a>
---

### Before you begin!
**Notes:**

1. Throughout the notebook you will encounter ellipses (...). These indicate that you need to replace the ... with your code. 


2. We have placed `raise NotImplementedError` throughout the notebook. These errors are here to inform you that you have forgotten to answer a question. All you need to do after attempting each question is to comment out this error by adding a # before the error (e.g. `# raise NotImplementedError`). 

Don't worry if this seems confusing. We will go over some examples in class. 

### Question 1.

Like we did in the Prisons notebook, the first thing that we want to do is check that the number of months does not exceed 276.

**A)** Group by ***Facility*** using the `group()` function. 

In [None]:
# YOUR CODE HERE 
jail.group(...)

raise NotImplementedError

Now that you have grouped by *Facility*, we see that the `group` function produced a two-column table with unique *Facility* names and the number of times each name appeared in the table. 
 
**B)** Sort by ***count*** using the `sort` function with the `descending` parameter equal to **True**.



In [None]:
# YOUR CODE HERE 
jail.group('Facility').sort(..., descending= True)


raise NotImplementedError

In this case, there do not seem to be any issues with the counts.
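
If it helps to see what grouping and sorting do behind the scenes, here is a minimal sketch in plain Python. It is not how the `datascience` package is implemented; the facility names and rows are made up for illustration, and the only figure taken from this notebook is the limit of 276 months (23 years times 12 months).

```python
from collections import Counter

# Toy rows: one entry per (facility, month). Facility names are invented.
toy_rows = [
    {"Facility": "Jail A", "Month": 1},
    {"Facility": "Jail A", "Month": 2},
    {"Facility": "Jail A", "Month": 3},
    {"Facility": "Jail B", "Month": 1},
    {"Facility": "Jail B", "Month": 2},
    {"Facility": "Jail C", "Month": 1},
]

# "Group": count how many rows each facility contributes.
toy_counts = Counter(row["Facility"] for row in toy_rows)

# "Sort descending": put the largest counts first.
toy_sorted = sorted(toy_counts.items(), key=lambda pair: pair[1], reverse=True)

# 23 years of monthly records gives the upper bound used in this notebook.
max_months = 23 * 12

# Flag any facility that has more rows than the bound allows.
too_many = [name for name, n in toy_sorted if n > max_months]

print(toy_sorted)
print(max_months, too_many)
```

Sorting in descending order puts the largest counts at the top, so a facility with too many rows would be the first thing you see.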

### Question 2.

In the table above, we have a row entry for each month in a given year for 23 years for each jail.
Like we did before, we want to get the **yearly total for each institution** to explore how the total number of inmates in the jail population changed over time. To accomplish this, the first thing we need to do is aggregate some columns. We will break this into steps in the next section.

**A)** Select: Select the columns 'Year','Unsentenced males', 'Unsentenced females','Sentenced males','Sentenced females', 'Total facility ADP' using the function `select`.

**B)**  Assign: Assign them to a variable called 'data_year'.


In [None]:
data_year= jail.select('Year','Unsentenced males', 'Unsentenced females','Sentenced males','Sentenced females', 'Total facility ADP')

**C)** Group: group by ***Year*** using the `group()` function with `collect=` set to ***sum***.
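
To see what grouping with a sum does conceptually, here is a small sketch in plain Python with invented numbers. The column name `ADP` and the values are assumptions for illustration, not the real data, and this is not the `datascience` implementation. Every row that shares a year is added into one yearly total:

```python
from collections import defaultdict

# Toy monthly rows; the numbers are invented for illustration only.
toy_rows = [
    {"Year": 2010, "ADP": 100},
    {"Year": 2010, "ADP": 110},
    {"Year": 2011, "ADP": 90},
    {"Year": 2011, "ADP": 95},
    {"Year": 2011, "ADP": 105},
]

# Add each row's value into the running total for its year.
running = defaultdict(int)
for row in toy_rows:
    running[row["Year"]] += row["ADP"]

# Order the result by year so it reads like a grouped table.
toy_yearly_sum = dict(sorted(running.items()))
print(toy_yearly_sum)
```

When you run the real cell, notice that the resulting column labels end in `sum` (for example, `Total facility ADP sum`), which is the name used in the rest of this notebook.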


In [None]:
# YOUR CODE HERE 
data_year = data_year.group(...,collect= ...)
data_year.show(5)

raise NotImplementedError

**Plotting**: To analyze the data, we are going to create various visualizations! Often times it is more useful to visually inspect the information as it might reveal useful insights and provide a context to the data we are looking at. First, we will take a look at the total facility ADP over the years.

Remember, the format for plotting is: 
* **To draw the plot**:
*data_table*.`plot(`*x_variable*, *y_variable*`)`, where the x and y variables stand for the names of the columns in our data table
* **To label the plot**: We use `plt.xlabel`(*x_axis name*), `plt.ylabel`(*y_axis name*), `plt.title`(*plot_title*)

### Question 3.
 
 **A)**: Plot ***Total facility ADP sum*** over time.
 
 Hint: Think about what two column names you need to use. Do not forget to write the column inside quotations. e.g. "month"

In [None]:
# YOUR CODE HERE 
data_year.plot(...,...)
plt.xlabel(...)
plt.ylabel(...)
plt.title("Total Facility ADP over time")

We will focus on some years that mark important shifts in population. We will mainly focus on 2011 and 2014 by plotting red dots representing those years.

### Question 4.

**A)** First, let's start by getting the data for the years 2011 and 2014. To do this, we can use the `where()` function to limit our data table to the ***Year*** ***2011*** and save this information into a new variable called **data2011**.  
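
Conceptually, filtering by year keeps only the rows whose **Year** matches. The sketch below mimics that idea with NumPy arrays and invented numbers (the `toy_` names and values are assumptions for illustration); it is not the `datascience` implementation.

```python
import numpy as np

# Toy yearly table stored as two parallel columns (values are invented).
toy_years = np.array([2010, 2011, 2012, 2013, 2014])
toy_adp = np.array([500.0, 480.0, 510.0, 530.0, 525.0])

# True only at the position where the year equals 2011.
is_2011 = toy_years == 2011

# Keep the matching entries from each column.
toy_x_2011 = toy_years[is_2011]
toy_y_2011 = toy_adp[is_2011]

print(toy_x_2011, toy_y_2011)
```

Because only one year matches, each result holds a single value. That is why each filtered table in this notebook contributes exactly one red dot to the plot.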

In [None]:
data2011 = data_year.where("Year", 2011)

data2011.show()

**Your turn!**

**B)** Filter your data set to where ***Year*** is equal to ***2014***, and save it into a variable called **data2014**.

In [None]:
# YOUR CODE 
... = data_year.where(..., ...)


raise NotImplementedError

Next, recall that **to add red dots** we need to use: `plt.plot(x_coordinate, y_coordinate, line property)`

Before we use the plot function, we want to determine our **x_coordinates** and **y_coordinates** for our newly created 1-row tables.  

Let's begin by selecting the columns **Year** and **Total facility ADP sum** from our newly created data table **data2011** and assigning them to new variables called **x_coordinate_2011** and **y_coordinate_2011** respectively.

In [None]:
x_coordinate_2011 = data2011.column('Year')
y_coordinate_2011 = data2011.column("Total facility ADP sum")

**Your turn!**

**C)** Save the value for the **Year** column in your **data2014** table into a new variable and call it **x_coordinate_2014**

**D)** Save the value for the **Total facility ADP sum** column in your **data2014** table into a new variable and call it **y_coordinate_2014**

In [None]:
# YOUR CODE 
... = ....column(...)
... = ....column(...)


raise NotImplementedError

Now you are ready to add the dot to the plot.

In [None]:
data_year.plot("Year", "Total facility ADP sum")
plt.xlabel("Year")
plt.ylabel("Total Facility ADP")
plt.title("Total Facility ADP over time")

plt.plot(x_coordinate_2011, y_coordinate_2011, 'ro')
plt.plot(x_coordinate_2014, y_coordinate_2014, 'ro')

### Question 5.

**Looking at the graph produced in the section above, how does it reflect the systematic changes of California's potential jail population? Name a court case that is represented by a red dot on the plot above.**

*double click this cell to type your response*

## 2. GENERAL: SENTENCED & UNSENTENCED<a id='subsection3'></a>

Now, let us go ahead and compare the sentenced and unsentenced population for county jails (over all the years since 1995). We will be creating and looking at the following comparisons:
- overall (male and female) sentenced and (male and female) unsentenced
- male unsentenced vs male sentenced
- female unsentenced vs female sentenced

### 2.1 General: Sentenced vs. Unsentenced

To understand how sentenced and unsentenced jail populations have changed over time, we need to estimate the totals for each of these two categories.  


By looking at our data set, we can notice that it is composed of 5 main columns, but none shows the total for these categories. 

|Sentenced males|Sentenced females|Unsentenced males| Unsentenced females|Total facility ADP|
|--------------|---------|------|----------|-----|

Therefore, we will begin by calculating the **sentenced total** and the **unsentenced total**. This means that we need to aggregate male and female data for each of these two categories into a single new column.

### Question 1.

Let's begin by separating each column into a new variable so that we can extract and later aggregate such values. As a refresher, you can use the function `column` to select the values in a given column. 

**Note:** We are still using the totals per year for each of the categories. Thus, we will continue to use `data_year` data table. 


Let's select all the values for male sentenced and female sentenced and save it into new variables **'m_sentenced'** and **'f_sentenced'** respectively.

In [None]:
m_sentenced = data_year.column("Sentenced males sum")
f_sentenced = data_year.column("Sentenced females sum")

Your turn!

**A)**  Repeat the procedure above, but now select the columns for **unsentenced** males and females, and save them into new variables **'m_unsentenced'** and **'f_unsentenced'** respectively.

In [None]:
# YOUR CODE 
... = data_year.column(...)
... = data_year.column(...)

raise NotImplementedError

Let's see what **m_sentenced** looks like!

In [None]:
m_sentenced

It looks like this is just an array of numbers. This means that we can simply add the values from m_sentenced and f_sentenced, position by position, to get the total sentenced per year.
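
The reason `+` works here is that a column behaves like a NumPy array, so addition happens position by position. Here is a small sketch with invented numbers (the `toy_` names and values are assumptions for illustration) that also shows how this differs from ordinary Python lists:

```python
import numpy as np

# Toy yearly counts for three years (values are invented).
toy_males = np.array([300, 320, 310])
toy_females = np.array([40, 45, 50])

# With arrays, + adds matching positions: one total per year.
toy_total = toy_males + toy_females

# With plain Python lists, + glues the lists together instead.
toy_glued = [300, 320, 310] + [40, 45, 50]

print(toy_total)
print(len(toy_glued))
```

The array result still has one value per year, which is what we want for a new column. The list result has six values and would not line up with the years.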

In [None]:
sentenced_all =  m_sentenced + f_sentenced

### Question 2.

**A)** Add m_unsentenced and f_unsentenced and assign the sum to a new variable called **unsentenced_all**.

In [None]:
# YOUR CODE
unsentenced_all = ...


raise NotImplementedError

We can now go ahead and add these two sets of values to our original data table (called `data_year`) in order to keep track of the data we are calculating. We will use the function `with_column`, which takes in a label for your column and the values that you want to assign to that new column. 
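
Notice that the cells in this section save the result back into `data_year`. That is because `with_column` gives back a new table with the extra column rather than changing the existing one. The sketch below imitates that behavior with a plain Python dictionary of columns; the helper `toy_with_column` and the values are hypothetical and written only for illustration, not taken from the `datascience` package.

```python
def toy_with_column(table, label, values):
    """Hypothetical helper: return a copy of `table` with one extra column."""
    values = list(values)
    # Every column must have one value per row.
    if any(len(values) != len(col) for col in table.values()):
        raise ValueError("new column must match the table's row count")
    new_table = dict(table)      # copy, so the original is left untouched
    new_table[label] = values
    return new_table


# Toy table stored as {column label: list of values}; numbers are invented.
toy_table = {"Year": [2010, 2011], "Sentenced": [340, 365]}

toy_extended = toy_with_column(toy_table, "Unsentenced", [600, 640])

print(list(toy_table))     # original labels are unchanged
print(list(toy_extended))  # new table has the extra label
```

If you forget to assign the result to a variable, the new column is computed and then thrown away.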

In [None]:
data_year = data_year.with_column("Total Sentenced", sentenced_all)

### Question 3.

**A)** Now it's your turn to add the list of values in **unsentenced_all** to a new column on our data table **data_year** . Name the new column **Total Unsentenced**. 

In [None]:
# YOUR CODE 
data_year = data_year.with_column(...,...)


raise NotImplementedError

Now let's look at our table to ensure that everything is in order. Use the `show` function to view the first 5 rows.

In [None]:
data_year.show(5)

Let's then select the relevant columns that we need, which are the ***year column, the total sentenced column, and the total unsentenced column*** and assign them to a new variable and name it **totals**. We will use the function `select` to do this.

In [None]:
totals = data_year.select("Year", "Total Sentenced", "Total Unsentenced")
totals

Similar to how we plotted the Total ADP over Time above, let's now plot the total number of people sentenced versus the total number of people unsentenced.

In [None]:
totals.plot("Year")
plt.title("Total Sentenced vs Total Unsentenced")
plt.xlabel("Year")
plt.ylabel("Number of People")

### Question 4.

**Do you notice anything interesting about the visualization we just plotted? What can you tell about the difference in the two lines before the year 2000? Do you notice anything interesting after 2010?**

double click this cell to type your response

Now let's explore this same data, but as **percentages**. To do this, we can get our **total** using the column **Total facility ADP sum** and then divide the total sentenced and total unsentenced by this value. 
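
As a quick worked example of this calculation, the sketch below uses invented numbers (the `toy_` names and values are assumptions for illustration). In this toy data the total is defined as sentenced plus unsentenced, so the two percentages add up to 100 for every year; the real **Total facility ADP sum** column may not match that sum exactly.

```python
import numpy as np

# Toy yearly totals for three years (values are invented).
toy_sentenced = np.array([340.0, 365.0, 360.0])
toy_unsentenced = np.array([660.0, 635.0, 840.0])

# Assumption for this sketch only: total = sentenced + unsentenced.
toy_total = toy_sentenced + toy_unsentenced

# Divide position by position, then scale to a percentage.
toy_sent_percent = toy_sentenced / toy_total * 100
toy_unsent_percent = toy_unsentenced / toy_total * 100

print(np.round(toy_sent_percent, 1))
print(np.round(toy_unsent_percent, 1))
```

Percentages let you compare years with very different totals: in the toy data the third year has the most unsentenced people and also the largest unsentenced share, but the two do not always move together.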

In [None]:
total_adp = data_year.column("Total facility ADP sum")

sent_percent = sentenced_all / total_adp * 100
unsent_percent = unsentenced_all / total_adp * 100

We have our percentages now so let's repeat the same process as above where we add it to our `data_year` table and select the relevant columns that we want.

In [None]:
data_year = data_year.with_column("Total Sentenced Percent", sent_percent)
data_year = data_year.with_column("Total Unsentenced Percent", unsent_percent)

percent_totals = data_year.select("Year", "Total Sentenced Percent", "Total Unsentenced Percent")
percent_totals

Now let's plot this data using our same line plot method as before!

In [None]:
percent_totals.plot("Year")
plt.title("Total Sentenced Percent vs Total Unsentenced Percent")
plt.xlabel("Year")
plt.ylabel("Percent of People")

Similar to above, let's try plotting dots so we can see what happened at specific years in the jails. Namely, 2011 and 2014. Let's see how these specific years play out on our plot above by plotting red dots representing those years.

First, let's start off by getting the data in the years 2011 and 2014. To do this we can use the `where()` function and specify the column we are looking at and the specific value we want that column to be (in order to get the rest of the data).

In [None]:
data2011 = percent_totals.where("Year", 2011)
data2014 = percent_totals.where("Year", 2014)

Next, we want to get the x and y coordinates of each point using the data we just found. For example, using data2011, we can now assign x_coordinate2011 to the Year column and y_coordinate2011 to the Percent column.

Let's compute the coordinates for 2011. 

In [None]:
x_coordinate2011 = data2011.column('Year')
y_coordinate2011 = data2011.column("Total Sentenced Percent")

### Question 5.

**A)** Now, repeat this procedure, but for the year 2014. 

In [None]:
# YOUR CODE 
x_coordinate2014 = data2014.column(...)
y_coordinate2014 = data2014.column(...)


raise NotImplementedError

**B)**  Now, repeat this procedure, but for the **unsentenced population** for both years. 

In [None]:
# YOUR CODE 
x_coordinate2011_un = data2011.column(...)
y_coordinate2011_un = data2011.column(...)

x_coordinate2014_un = data2014.column(...)
y_coordinate2014_un = data2014.column(...)


raise NotImplementedError

Now add our points to our plot.

In [None]:
percent_totals.plot("Year")
plt.title("Total Sentenced Percent vs Total Unsentenced Percent")
plt.xlabel("Year")
plt.ylabel("Percent of People")

plt.plot(x_coordinate2011, y_coordinate2011, 'ro')
plt.plot(x_coordinate2014, y_coordinate2014, 'ro')
plt.plot(x_coordinate2011_un, y_coordinate2011_un, 'ro')
plt.plot(x_coordinate2014_un, y_coordinate2014_un, 'ro')

### Question 6.

**Is there anything interesting that you see related to the percentages and years? How does looking at percentages and numbers compare?**

double click this cell to type your response

## 3. GENDER DIVISION <a id='subsection4'></a>
---

### 3.1 Males: Sentenced vs. Unsentenced

We just looked at the total number of people who were sentenced and the total number of people who were unsentenced, per year. Next, let's look at just the number of **males who were sentenced vs the number of males who were unsentenced**. Let's start by selecting the relevant columns that we are going to use for our analysis.

Hint: use the function `select`. 

In [None]:
males = data_year.select("Year", "Sentenced males sum", "Unsentenced males sum")
males.show(5)

**Plotting**: We can now use these columns and plot the **total number** of males sentenced and unsentenced per year. 

In [None]:
males.plot("Year")
plt.title("Total Sentenced vs Total Unsentenced (Males)")
plt.xlabel("Year")
plt.ylabel("Number of Males")

We have found and plotted the counts of males but let's try finding the **percentage** of males from the total ADP count. Start by getting the sum of the sentenced males column, the sum of the unsentenced males column, and then dividing both of these columns by the total ADP (see if you can use a variable from earlier on in our analysis!)
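
One thing to keep in mind: because we divide by the total ADP (which counts everyone in the jails), the two male percentages will not add up to 100. Together they give the male share of the whole jail population. Here is a small sketch with invented numbers (the `toy_` names and values are assumptions for illustration):

```python
import numpy as np

# Toy yearly values for two years (invented for illustration).
toy_male_sent = np.array([300.0, 320.0])
toy_male_unsent = np.array([500.0, 520.0])
toy_total_adp = np.array([1000.0, 1050.0])  # includes everyone, not only males

# Each group as a percentage of the whole jail population.
toy_m_sent_percent = toy_male_sent / toy_total_adp * 100
toy_m_unsent_percent = toy_male_unsent / toy_total_adp * 100

# Adding the two gives the male share of the total, which is below 100.
toy_male_share = toy_m_sent_percent + toy_m_unsent_percent

print(np.round(toy_male_share, 1))
```

The remaining share in this toy example corresponds to everyone who is not counted in the two male columns.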

In [None]:
male_sent = males.column("Sentenced males sum")
male_unsent = males.column("Unsentenced males sum")

m_sent_percent = male_sent / total_adp * 100
m_unsent_percent = male_unsent / total_adp * 100

Let's add these new percentage columns to our Male table (since they represent the males), and then select the relevant columns that we are going to plot.

In [None]:
males = males.with_column("Total Male Sentenced Percent", m_sent_percent)
males = males.with_column("Total Male Unsentenced Percent", m_unsent_percent)

Now, let's focus only on the percentages. We will use the function `select` to do this. 

In [None]:
m_percent_totals = males.select("Year", "Total Male Sentenced Percent", "Total Male Unsentenced Percent")
m_percent_totals

**Plotting:** Now let us plot the **percentage** of males who were sentenced vs the percentage of males who were unsentenced.

In [None]:
m_percent_totals.plot("Year")
plt.title("Total Sentenced vs Total Unsentenced Percents (Males)")
plt.xlabel("Year")
plt.ylabel("Percent of Males")

Now, we want to add the red dots on key years as we did on some plots before.

Let's filter by Year 2011 and 2014 and add them to new variables **data2011** and **data2014** respectively. 

In [None]:
data2011 = m_percent_totals.where("Year", 2011)

data2014 = m_percent_totals.where("Year", 2014)

Next, we want to get the x and y coordinates of each point using the data we just found.

**A)** Let's begin with the **Total Male Sentenced Percent** for the **Year** **2011**

In [None]:
x_coordinate2011 = data2011.column('Year')
y_coordinate2011 = data2011.column("Total Male Sentenced Percent")

**B)** Now, we will repeat this selection for the **Total Male Sentenced Percent** for the **Year** **2014**.

In [None]:
x_coordinate2014 = data2014.column('Year')
y_coordinate2014 = data2014.column("Total Male Sentenced Percent")

Finally, repeat **A and B**, but now for the **Total Male Unsentenced Percent**

In [None]:
x_coordinate2011_un = data2011.column('Year')
y_coordinate2011_un = data2011.column("Total Male Unsentenced Percent")

x_coordinate2014_un = data2014.column('Year')
y_coordinate2014_un = data2014.column("Total Male Unsentenced Percent")

We can now add our points to our plot.


In [None]:
# PLOT
m_percent_totals.plot("Year")
plt.title("Total Sentenced vs Total Unsentenced Percents (Males)")
plt.xlabel("Year")
plt.ylabel("Percent of Males")


# POINTS
plt.plot(x_coordinate2011, y_coordinate2011, 'ro')
plt.plot(x_coordinate2014, y_coordinate2014, 'ro')
plt.plot(x_coordinate2011_un, y_coordinate2011_un, 'ro')
plt.plot(x_coordinate2014_un, y_coordinate2014_un, 'ro')

### Question 1.

**Look at the two plots that you made specifically for males. What patterns do you notice? Is there anything interesting you notice related to the years/points we plotted?**

double click this cell to type your response

### 3.2 Females: Sentenced vs. Unsentenced


Above we analyzed just the males that were sentenced and unsentenced. Now let us do the same with females. Let's start off by creating a females variable that will contain the relevant female columns from the original `data_year` table we had earlier. 

### Question 1.

**A)** Select **"Year", "Sentenced females sum", "Unsentenced females sum"** using the function `select`, and assign the results to a new variable called **females**

In [None]:
#YOUR CODE
... = data_year.select(...,...,...)
females


raise NotImplementedError

### Question 2. 

**A)** Using the `females` table, we can plot the number of females who were sentenced and the number of females who were unsentenced over the years. 

Hint: Use the column **Year**.

In [None]:
# YOUR CODE
females.plot(...)
plt.title("Total Sentenced vs Total Unsentenced (Females)")
plt.xlabel("Year")
plt.ylabel("Number of Females")

raise NotImplementedError

### Question 3. 

**What do you notice about this plot compared to the related males plot?**

double click this cell to type your response

### Question 4. 

Similarly, we can calculate the percentages of sentenced females and unsentenced females and plot this relationship. 

**A)** First, get the **sentenced females sum** and **unsentenced females sum** columns. Then assign your output to **female_sent** and **female_unsent** respectively 

Hint: Use the function `column` to select the two columns. 

In [None]:
# YOUR CODE 
female_sent = females.column(...)
female_unsent = females.column(...)

raise NotImplementedError

**B)** Divide **female_sent** and **female_unsent** by **total ADP**. Then multiply by 100 in order to get a percentage. Finally, assign your output to **f_sent_percent** and **f_unsent_percent**, respectively. 

In [None]:
# YOUR CODE 
f_sent_percent = (.../...)*...
f_unsent_percent = (.../...)*...


raise NotImplementedError

**C)** We have our percentages for our females. So now, let's add **f_sent_percent** and **f_unsent_percent** to our `females` table for reference. In the code below, we have already provided the names for the new columns. 

In [None]:
# YOUR CODE 
females = females.with_column("Total Female Sentenced Percent", ...)
females = females.with_column("Total Female Unsentenced Percent", ...)


raise NotImplementedError

**D)** We can then select the relevant columns we want to plot. Use the function `select` to select the following columns: 
* **Year**
* **Total Female Sentenced Percent**
* **Total Female Unsentenced Percent**


In [None]:
# YOUR CODE 
f_percent_totals = females.select(...,...,...)
f_percent_totals


raise NotImplementedError

**E)** Using these three columns, plot the percentages for females, and add the red dots on the year 2011 and 2014.

We will help you with the creation of this plot. 

Run the following 6 cells, and do not worry too much about their implementation. (We already did this before!)

In [None]:
# Select key years
data2011 = f_percent_totals.where("Year", 2011)
data2014 = f_percent_totals.where("Year", 2014)

In [None]:
# Get the coordinates for the Year 2011 for the Total Female Sentenced Percent 
x_coordinate2011 = data2011.column('Year')
y_coordinate2011 = data2011.column("Total Female Sentenced Percent")

In [None]:
# Get the coordinates for the Year 2014 for the Total Female Sentenced Percent 
x_coordinate2014 = data2014.column('Year')
y_coordinate2014 = data2014.column("Total Female Sentenced Percent")

In [None]:
# Get the coordinates for the Year 2011 for the Total Female Unsentenced Percent 
x_coordinate2011_un = data2011.column('Year')
y_coordinate2011_un = data2011.column("Total Female Unsentenced Percent")

In [None]:
# Get the coordinates for the Year 2014 for the Total Female Unsentenced Percent 
x_coordinate2014_un = data2014.column('Year')
y_coordinate2014_un = data2014.column("Total Female Unsentenced Percent")

In [None]:
# Plot 
f_percent_totals.plot("Year")
plt.title("Total Sentenced vs Total Unsentenced Percents (Females)")
plt.xlabel("Year")
plt.ylabel("Percent of Females")

# Add red dots
plt.plot(x_coordinate2011, y_coordinate2011, 'ro')
plt.plot(x_coordinate2014, y_coordinate2014, 'ro')
plt.plot(x_coordinate2011_un, y_coordinate2011_un, 'ro')
plt.plot(x_coordinate2014_un, y_coordinate2014_un, 'ro')

### Question 5.

**Compare the totals plot (completed before) to the one you just created. What kind of story does this plot reveal?**


double click this cell to type your response

### Question 6.

**Go back to the end of the first notebook, and compare your final results to the ones presented here. What story do these two data sets (Prisons and Jails) tell you when you look at them side by side? How do they show the impact of realignment?**

---
# Final Survey <a id='section3'></a>

Congrats! You've finished the final Jupyter Notebook assignment! The Division of Data Sciences and Information would like to ask you to please fill this survey out as a part of your assignment. We would like to improve the module for future semesters, and would really appreciate it if you took the time to fill this out so we can better serve you!

Please make sure you are logged into your Berkeley (.edu) email address to access the form.

[Survey Link](https://goo.gl/forms/kHj3jQapNwP4l5Mf2)

Alternatively, please copy and paste this link into your URL bar: https://goo.gl/forms/kHj3jQapNwP4l5Mf2

---

## Saving the Notebook as a PDF

Congrats on finishing this notebook! As before, you will be submitting this notebook as a PDF file. To turn in this assignment, follow the steps below:

1. **Important:** Click the Save icon located at the far left on the top toolbar. Make sure to do this before following the next steps.
2. Save the webpage as a PDF.
    * For Chrome users:
        1. Click on the rightmost button on the top toolbar
        2. In the drop down, click "Print"
        3. For "Destination", choose "Save as PDF"
    * For Firefox users:
        1. Click on the rightmost button on the top toolbar
        2. In the drop down, click "Print"
        3. Click Print and set the destination as "Adobe PDF" or "Microsoft Print to PDF"
3. Once the file downloads, open it using a PDF reader to make sure that everything looks okay.
4. If any pages are omitted from the output PDF, make sure that the images that you have uploaded to Jupyter are displaying properly in the notebook and that the correct filenames are specified. **Issues in converting the notebook to PDF format usually happen when an image in the notebook is not displayed/embedded properly.**
5. Submit to the Problem Set 3 Assignment on bCourses.

---
Notebook developed by: Ashley Quiterio and Shalini Kunapuli

Data Science Modules: http://data.berkeley.edu/education/modules

