# Introduction

**Incarceration & COVID-19: How Jails Respond to COVID**<br>

This project scrapes daily jail roster information to build a large dataset for analyzing how jail populations have fluctuated in response to COVID-19. The research centers on explaining why county jails in different parts of the United States have responded differently to the pandemic over time.

A separate but related use of this dataset is to analyze the impact of pandemic-related jail population declines on local crime. The focal variable is the daily jail roster population count, and the analysis uses group-based trajectory modeling. Our scraped data will address gaps in the [Vera](https://github.com/vera-institute/jail-population-data) dataset.

We start by comparing Washington and New York because both states dealt with COVID-19 early in the pandemic. The data points below are collected so that the scraped data can be harmonized with the Vera data.
- County Name
- State Name
- Daily Population Counts
- Reporting Jail Name

# Imports

In [149]:
# Import standard libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
from datetime import datetime

# API libraries
import re
import os
from os import system  
import time
from time import sleep
import json
import random
import requests
# from math import floor
# from copy import deepcopy

# Selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

#SQL
import sqlite3
import mysql.connector

# MySQL

In [150]:
# FIGURE OUT PASSWORD STUFF

mydb = mysql.connector.connect(host='localhost',\
                              user='root',\
                              passwd='SolomonGrundy11222',\
                              database='testdb'\
                              )
print(mydb)

<mysql.connector.connection.MySQLConnection object at 0x7ff7a77a68d0>
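One way to resolve the password note in the cell above is to keep credentials out of the notebook and read them from environment variables. The sketch below is an assumption rather than the project's actual setup: the variable names `JAIL_DB_HOST`, `JAIL_DB_USER`, `JAIL_DB_PASSWORD`, and `JAIL_DB_NAME`, and the helper `load_db_config`, are hypothetical.

```python
import os

def load_db_config(env=None):
    """Build mysql.connector.connect() keyword arguments from environment variables.

    The JAIL_DB_* variable names are hypothetical; rename them to suit your setup.
    """
    env = os.environ if env is None else env
    password = env.get("JAIL_DB_PASSWORD")
    if not password:
        # Fail early instead of falling back to a hard-coded password
        raise RuntimeError("Set JAIL_DB_PASSWORD before connecting")
    return {
        "host": env.get("JAIL_DB_HOST", "localhost"),
        "user": env.get("JAIL_DB_USER", "root"),
        "passwd": password,
        "database": env.get("JAIL_DB_NAME", "testdb"),
    }
```

With the variables exported in the shell that launches the notebook, the connection could then be opened with `mysql.connector.connect(**load_db_config())`.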


In [151]:
# Create a database

mycursor = mydb.cursor()

#This line is commented out because it only needs to be run once
# mycursor.execute("CREATE DATABASE testdb")

In [152]:
mycursor.execute("SHOW DATABASES")

for db in mycursor:
    print(db)

('information_schema',)
('mysql',)
('performance_schema',)
('sys',)
('testdb',)


In [154]:
# # Create a table (do not erase)

# mycursor.execute("CREATE TABLE county_jails\
#                  (reporting_jurisdictions VARCHAR(100),\
#                  county_name VARCHAR(100),\
#                  state_name VARCHAR(100),\
#                  Date VARCHAR(100),\
#                  jail_population INTEGER(255))")

In [155]:
mycursor.execute("SHOW TABLES")

for tb in mycursor:
    print(tb)

('county_jails',)


In [156]:
sqlFormula = "INSERT INTO county_jails (Date, reporting_jurisdictions, county_name, \
state_name, jail_population) VALUES (%s, %s, %s, %s, %s)"

In [None]:
#USE FOR ALL COMMITS
# mycursor.execute(sqlFormula, jail1)
# mydb.commit()
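The commit pattern above repeats for every jail, so it could be wrapped in a small helper. The sketch below uses an in-memory `sqlite3` database as a stand-in so that it runs without a MySQL server; note that `sqlite3` uses `?` placeholders where `mysql.connector` uses `%s`. The helper name `record_count` and its duplicate check are assumptions, not part of the original notebook.

```python
import sqlite3

INSERT_SQL = (
    "INSERT INTO county_jails "
    "(Date, reporting_jurisdictions, county_name, state_name, jail_population) "
    "VALUES (?, ?, ?, ?, ?)"
)

def record_count(conn, row):
    """Insert one (date, jail, county, state, population) row unless that jail
    already has a row for that date. Returns True if a row was inserted."""
    date, jail = row[0], row[1]
    cur = conn.cursor()
    cur.execute(
        "SELECT 1 FROM county_jails WHERE Date = ? AND reporting_jurisdictions = ?",
        (date, jail),
    )
    if cur.fetchone() is not None:
        return False  # already recorded today; skip the duplicate
    cur.execute(INSERT_SQL, row)
    conn.commit()
    return True

# Stand-in database with the same columns as the county_jails table
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE county_jails (reporting_jurisdictions TEXT, county_name TEXT, "
    "state_name TEXT, Date TEXT, jail_population INTEGER)"
)
first = record_count(conn, ("2020-06-12", "Whitman County Jail", "Whitman County", "WA", 26))
second = record_count(conn, ("2020-06-12", "Whitman County Jail", "Whitman County", "WA", 26))
total = conn.execute("SELECT COUNT(*) FROM county_jails").fetchone()[0]
```

Checking for an existing row before inserting guards against re-running a commit cell on the same day, which would otherwise store the same jail twice.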

# States

Check whether a jail offers an API before resorting to scraping. The project will include NY, WA, and FL.

## Washington

### Whitman

In [9]:
url = "http://www.whitmancountyjail.org/"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

In [10]:
listy = driver.find_elements_by_css_selector('h4')

# to view full list
# for x in listy[:50]:
#     if len(x.text) > 0:
#         print(x.text)

In [25]:
todays_date = datetime.now().strftime('%Y-%m-%d')
JPWhitman = (len(listy))-10

print('Date = ',todays_date)
print('jail_population = ',JPWhitman)

Date =  2020-06-12
jail_population =  26


In [21]:
#USE FOR ALL COMMITS
Whitman = (todays_date, "Whitman County Jail", "Whitman County", "WA", JPWhitman)
mycursor.execute(sqlFormula, Whitman)
mydb.commit()

In [51]:
driver.close()

### Spokane

In [None]:
url = "https://www.spokanecounty.org/352/Inmate-Roster"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [None]:

# Hidden input type

Sinmate = driver.find_element_by_xpath('//*[@id="tblInmateRoster_info"]').text
print(Sinmate)
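The element id `tblInmateRoster_info` looks like a DataTables summary line, which typically reads something like "Showing 1 to 10 of 250 entries". Assuming that format holds (the sample strings below are made up, not scraped output), the total could be pulled out with a regular expression. `parse_total_entries` is a hypothetical helper.

```python
import re

def parse_total_entries(info_text):
    """Return the total from text like 'Showing 1 to 10 of 250 entries', or None."""
    match = re.search(r"of\s+([\d,]+)\s+entries", info_text, re.IGNORECASE)
    if match is None:
        return None
    # Drop thousands separators before converting to an integer
    return int(match.group(1).replace(",", ""))

# Illustrative strings only; the real roster text may differ.
sample_total = parse_total_entries("Showing 1 to 10 of 1,250 entries")
missing_total = parse_total_entries("No data available")
```

If the format matches, `JPSpokane = parse_total_entries(Sinmate)` would supply the count used in the commit cell; a `None` result signals that the page layout needs another look.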

In [None]:
JPSpokane

In [None]:
#USE FOR ALL COMMITS
Spokane = (todays_date, "Spokane County Jail", "Spokane County", "WA", JPSpokane)
mycursor.execute(sqlFormula, Spokane)
mydb.commit()

In [57]:
driver.close()

### Okanogan

Details can be found in the Daily Jail Inmate Log on the [Okanogan Sheriff website](https://okanogansheriff.org/).

In [56]:
url = "https://okanogansheriff.org/"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [None]:
# Need to read from a pdf name=CD6F9816E7949144F43AB92B6CCADAA8

location = driver.find_element_by_xpath('/html/frameset/frame[2]').text
location

In [None]:
JPOkanogan

In [None]:
#USE FOR ALL COMMITS
# NOTE: the reporting jail name below is assumed; confirm it against the sheriff's website
Okanogan = (todays_date, "Okanogan County Jail", "Okanogan County", "WA", JPOkanogan)
mycursor.execute(sqlFormula, Okanogan)
mydb.commit()

In [54]:
driver.close()

### Jefferson

[Jefferson](https://co.jefferson.wa.us/174/Jail-Inmate-Search)<br> To view the full inmate roster click the Clear button then the Search button.

In [1]:
url = "https://co.jefferson.wa.us/174/Jail-Inmate-Search"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

NameError: name 'webdriver' is not defined

In [None]:
# Hidden input type

inmate = driver.find_elements_by_name('Name')
# Store the count under the name the commit cell expects
JPJefferson = len(inmate)
print(JPJefferson)

In [None]:
#USE FOR ALL COMMITS
Jefferson = (todays_date, "Jefferson County Jail", "Jefferson County", "WA", JPJefferson)
mycursor.execute(sqlFormula, Jefferson)
mydb.commit()

In [None]:
driver.close()

### Grant

[Grant](https://www.grantcountywa.gov/SHERIFF/Corrections/Inmate-Roster.htm) (roster published as a daily PDF)

In [122]:
url = "https://www.grantcountywa.gov/SHERIFF/Corrections/Roster-InmateinmateRoster%20v%206.rpt.pdf"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

print(driver.page_source)

<html><head></head><body style="height: 100%; width: 100%; overflow: hidden; margin:0px; background-color: rgb(82, 86, 89);"><embed name="7C4C3242FAFE7B2DD30E5311C384F3B0" style="position:absolute; left: 0; top: 0;" width="100%" height="100%" src="about:blank" type="application/pdf" internalid="7C4C3242FAFE7B2DD30E5311C384F3B0"></body></html>


In [140]:
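import urllib.request

url = "https://www.grantcountywa.gov/SHERIFF/Corrections/Roster-InmateinmateRoster%20v%206.rpt.pdf"
user_agent = 'Chrome/41.0.2228.0'
headers = { 'User-Agent' : user_agent }

# Pass headers by keyword; the second positional argument of Request is the request body (data)
grant = urllib.request.Request(url, headers=headers)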

In [None]:
# Beautiful Soup is imported from the bs4 package
from bs4 import BeautifulSoup

In [None]:
JPGrant

In [None]:
#USE FOR ALL COMMITS
Grant = (todays_date, "Grant County Jail", "Grant County", "WA", JPGrant)
mycursor.execute(sqlFormula, Grant)
mydb.commit()
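Because Chrome only embeds a PDF viewer (the page source printed above contains no roster text), the file has to be downloaded as raw bytes instead. The following is a minimal sketch using only the standard library; `fetch_bytes` and `looks_like_pdf` are hypothetical helpers, and extracting the inmate count from the PDF would still need a PDF text-extraction library, which is not shown here.

```python
import urllib.request

def fetch_bytes(url, user_agent="Chrome/41.0.2228.0", timeout=30):
    """Download a URL and return the raw response body."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()

def looks_like_pdf(data):
    """PDF files begin with the marker b'%PDF-'."""
    return data.startswith(b"%PDF-")
```

A typical use would be to call `fetch_bytes(url)` on the roster URL, confirm the result with `looks_like_pdf`, and write the bytes to a dated file so that each day's roster is preserved for later parsing.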

### Grays Harbor

[Grays Harbor](http://ghlea.com/JailRosters/GHCJRoster.html)

In [45]:
url = "http://ghlea.com/JailRosters/GHCJRoster.html"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [46]:
GHinmate = driver.find_elements_by_xpath('//*[@id="main-table"]/tbody/tr')
JPGray=(len(GHinmate))
print(JPGray)

172


In [48]:
#USE FOR ALL COMMITS

Gray = (todays_date, "Grays Harbor County Jail", "Grays Harbor County", "WA", JPGray)
mycursor.execute(sqlFormula, Gray)
mydb.commit()

In [125]:
driver.close()

### Ferry

[Ferry](https://www.ferry-county.com/Courts%20and%20Law/Inmate%20Roster/Inmate_Roster_Page.html) (the count appears in the line that reads "MAY 11, 2020 - 8 inmates")

In [37]:
url = "https://www.ferry-county.com/Courts%20and%20Law/Inmate%20Roster/Inmate_Roster_Page.html"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [42]:
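Finmate = driver.find_element_by_xpath('//*[@id="mainContent3"]/p[9]').text
# Parse the count with a regex; a fixed index (Finmate[16]) breaks when the month name or the count changes length
JPFerry = int(re.search(r'(\d+)\s+inmates', Finmate, re.IGNORECASE).group(1))
print(JPFerry)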

8


In [43]:
#USE FOR ALL COMMITS

Ferry = (todays_date, "Ferry County Corrections", "Ferry County", "WA", JPFerry)
mycursor.execute(sqlFormula, Ferry)
mydb.commit()

In [83]:
driver.close()

### Clallam

[Clallam](https://websrv23.clallam.net/NewWorld.InmateInquiry/WA0050000/)

In [29]:
url = "https://websrv23.clallam.net/NewWorld.InmateInquiry/WA0050000/"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver') 
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [34]:
Clallam_inmate = driver.find_elements_by_class_name('Name')
JPClallam = (len(Clallam_inmate))
print(JPClallam)

74


In [35]:
#USE FOR ALL COMMITS

Clallam = (todays_date, "Clallam County Jail", "Clallam County", "WA", JPClallam)
mycursor.execute(sqlFormula, Clallam)
mydb.commit()

In [36]:
driver.close()

### Adams

[View](https://www.co.adams.wa.us/government/jail_roster_and_booking_information/index.php) Jail Roster Information

In [23]:
url = "https://www.co.adams.wa.us/jailrosterout.txt"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [26]:
Adams_text=driver.find_element_by_xpath('/html/body/pre').text
JPAdams = Adams_text.count("Booking")
print('Jail Population = ',JPAdams)

Jail Population =  22


In [27]:
#USE FOR ALL COMMITS

Adams = (todays_date, "Adams County Jail", "Adams County", "WA", JPAdams)
mycursor.execute(sqlFormula, Adams)
mydb.commit()

In [28]:
driver.close()

# Export csv

In [168]:
county_jail_df = pd.read_sql("SELECT * FROM county_jails", con=mydb)
county_jail_df.head()

Unnamed: 0,reporting_jurisdictions,county_name,state_name,Date,jail_population
0,Whitman County Jail,Whitman County,WA,2020-06-12,26
1,Whitman County Jail,Whitman County,CA,2020-06-12,26
2,Adams County Jail,Adams County,WA,2020-06-12,22
3,Clallam County Jail,Clallam County,WA,2020-06-12,74
4,Ferry County Corrections,Ferry County,WA,2020-06-12,8


In [169]:
county_jail_df.to_csv('County_Jail.csv')