# Introduction

**Incarceration & COVID-19: How Jails Respond to COVID**<br>

This project scrapes daily jail roster information to create a large dataset. This dataset is designed to analyze how jail populations have fluctuated in response to COVID-19. Research centers on explaining why county jails in different parts of the United States have responded differently to the pandemic over time. 

A separate but related use of this dataset is to analyze the impact of pandemic-related jail population declines on local crime. The focal variable is the daily jail roster population count, and the analysis uses group-based trajectory modeling. Our scraped data will address gaps in the [Vera](https://github.com/vera-institute/jail-population-data) dataset.

We start by comparing Washington and New York because both states dealt with COVID-19 early in the pandemic. The data points below are collected so that the scraped data can be harmonized with the Vera data.
- County Name
- State Name
- Daily Population Counts
- Reporting Jail Name

# Imports

In [1]:
# Import standard libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# API libraries
import re
import os
from os import system  
import time
from time import sleep
import json
import random
import requests
# from math import floor
# from copy import deepcopy

# Selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# SQL
import sqlite3
import mysql.connector

# MySQL

In [2]:
# TODO: decide how to supply the password securely

mydb = mysql.connector.connect(host='localhost',
                               user='root',
                               passwd='',
                               database='testdb')
print(mydb)

<mysql.connector.connection.MySQLConnection object at 0x7ff954575e10>
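
One way to settle the password question noted in the cell above is to keep the credential out of the notebook and read it from the environment. The sketch below is a minimal illustration under that assumption; the environment variable name `MYSQL_PASSWORD` and the helper `mysql_config` are hypothetical and not part of the project.

```python
import os

def mysql_config(env=None):
    """Build keyword arguments for mysql.connector.connect().

    MYSQL_PASSWORD is a hypothetical environment variable name. When it is
    not set, an empty password is used, matching the cell above.
    """
    env = os.environ if env is None else env
    return {
        "host": "localhost",
        "user": "root",
        "passwd": env.get("MYSQL_PASSWORD", ""),
        "database": "testdb",
    }

# Demonstrate with an explicit mapping instead of the real environment.
config = mysql_config({"MYSQL_PASSWORD": "example-secret"})
print(sorted(config))

# Usage (requires a running MySQL server):
# mydb = mysql.connector.connect(**mysql_config())
```

Passing the mapping as an argument keeps the helper easy to check without touching real credentials.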


In [3]:
# Created a db

mycursor = mydb.cursor()
# mycursor.execute("CREATE DATABASE testdb")

In [4]:
mycursor.execute("SHOW DATABASES")

for db in mycursor:
    print(db)

('information_schema',)
('mysql',)
('performance_schema',)
('sys',)
('testdb',)


In [6]:
# Created a table

# mycursor.execute("CREATE TABLE jails\
#                  (reporting_jurisdictions VARCHAR(100),\
#                  county_name VARCHAR(100),\
#                  state_name VARCHAR(100),\
#                  Date VARCHAR(100),\
#                  jail_population INTEGER(255))")

In [7]:
mycursor.execute("SHOW TABLES")

for tb in mycursor:
    print(tb)

('jails',)


In [11]:
sqlFormula = "INSERT INTO jails (Date, reporting_jurisdictions, county_name, \
state_name, jail_population) VALUES (%s, %s, %s, %s, %s)"

In [12]:
jail1 = ('today', 'Spokane County Jail', 'Spokane County', 'WA', 22)

In [13]:
mycursor.execute(sqlFormula, jail1)

mydb.commit()
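
The insert pattern above can be tried without a MySQL server by using the `sqlite3` module imported earlier. This is a sketch, not the project's schema: it assumes an in-memory database, stores the date as ISO text, and uses SQLite's `?` placeholders in place of MySQL's `%s`. The helper name `insert_count` is hypothetical.

```python
import sqlite3
from datetime import date

def insert_count(conn, jail, county, state, population, day=None):
    """Insert one daily population count, defaulting to today's date."""
    day = day or date.today().isoformat()
    conn.execute(
        "INSERT INTO jails (Date, reporting_jurisdictions, county_name, "
        "state_name, jail_population) VALUES (?, ?, ?, ?, ?)",
        (day, jail, county, state, population),
    )
    conn.commit()

# In-memory database with the same column names as the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jails ("
    "reporting_jurisdictions TEXT, county_name TEXT, state_name TEXT, "
    "Date TEXT, jail_population INTEGER)"
)

insert_count(conn, "Spokane County Jail", "Spokane County", "WA", 22,
             day="2020-06-09")
rows = conn.execute(
    "SELECT Date, county_name, jail_population FROM jails"
).fetchall()
print(rows)
```

Storing the date as `YYYY-MM-DD` text keeps rows sortable by day, which the placeholder string `'today'` would not.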

# States

Be sure to check for APIs in addition to scraping. The project will include NY, WA, and FL.

## Washington

### Whitman

In [None]:
url = "http://www.whitmancountyjail.org/"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

In [None]:
# find_elements(By...) replaces the find_elements_by_* helpers removed in Selenium 4
listy = driver.find_elements(By.CSS_SELECTOR, 'h4')

# View full list
# for x in listy[:50]:
#     if len(x.text) > 0:
#         print(x.text)

In [None]:
location = driver.find_element(By.XPATH, '//*[@id="form1"]/footer').text
location

In [None]:
from datetime import datetime  # current date and time on the local system

In [None]:
JWhitman = location[:19]
CWhitman = location[:7]
SWhitman = location[47:49]
DWhitman = datetime.now().strftime('%Y-%m-%d')
PWhitman = len(listy) - 10

print('reporting_jurisdictions = ',JWhitman)
print('county_name = ',CWhitman)
print('state_name = ',SWhitman)
print('Date = ',DWhitman)
print('jail_population = ',PWhitman)
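
Every county section ends with the same five fields, so a small helper could assemble them in the column order the `INSERT` statement expects. The sketch below is an assumption about how the pieces might be packaged; `make_record` is a hypothetical name, and the example reuses the Spokane values from the MySQL section.

```python
from datetime import datetime

def make_record(jail, county, state, population, when=None):
    """Return (Date, reporting_jurisdictions, county_name, state_name,
    jail_population), the column order used by the INSERT statement."""
    when = when or datetime.now()
    return (when.strftime('%Y-%m-%d'), jail, county, state, int(population))

record = make_record('Spokane County Jail', 'Spokane County', 'WA', 22,
                     when=datetime(2020, 6, 6))
print(record)

# Usage with the values scraped above:
# mycursor.execute(sqlFormula, make_record(JWhitman, CWhitman, SWhitman, PWhitman))
```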

In [None]:
driver.close()

### Spokane

#### Selenium

In [None]:
# url = "https://www.spokanecounty.org/352/Inmate-Roster"
# driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
# driver.implicitly_wait(3)
# driver.get(url)

# print(driver.page_source)

In [None]:
# Outer HTML: <h1 style="margin-top:0px;">Saturday, June 6, 2020</h1>

# date = driver.find_elements(By.XPATH, '//*[@id="aspnetForm"]/div[3]/h1')
# len(date)

In [None]:
# driver.close()  # no Spokane driver is open while the cells above stay commented out

### Okanogan

Details can be found in the Daily Jail Inmate Log on the [Okanogan Sheriff website](https://okanogansheriff.org/).

In [None]:
url = "https://okanogansheriff.org/"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [None]:
# Need to read from a pdf name=CD6F9816E7949144F43AB92B6CCADAA8

# location = driver.find_element(By.XPATH, '/html/body/embed').text
# location

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

In [None]:
driver.close()

### Jefferson

[Jefferson](https://co.jefferson.wa.us/174/Jail-Inmate-Search)<br> To view the full inmate roster click the Clear button then the Search button.

In [None]:
url = "https://co.jefferson.wa.us/174/Jail-Inmate-Search"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [None]:
# Hidden input type

# inmate = driver.find_element(By.XPATH, '//*[@id="Inmate_Index"]').text
# inmate

inmate = driver.find_elements(By.NAME, 'Name')
print(len(inmate))

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

driver.close()

### Grant

[Grant](https://www.grantcountywa.gov/SHERIFF/Corrections/Inmate-Roster.htm), daily PDF

In [None]:
url = "https://www.grantcountywa.gov/SHERIFF/Corrections/Inmate-Roster.htm"
driver = webdriver.Chrome('/Users/meaganlazer/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

print(driver.page_source)

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

driver.close()

### Grays Harbor

[Grays Harbor](http://ghlea.com/JailRosters/GHCJRoster.html)

In [None]:
url = "http://ghlea.com/JailRosters/GHCJRoster.html"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [None]:
inmate = driver.find_elements(By.XPATH, '//*[@id="main-table"]/tbody/tr')
print(len(inmate))

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

driver.close()

### Ferry

[Ferry](https://www.ferry-county.com/Courts%20and%20Law/Inmate%20Roster/Inmate_Roster_Page.html): the count appears in the section that reads "MAY 11, 2020 - 8 inmates".

In [None]:
url = "https://www.ferry-county.com/Courts%20and%20Law/Inmate%20Roster/Inmate_Roster_Page.html"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [None]:
Finmate = driver.find_element(By.XPATH, '//*[@id="mainContent3"]/p[9]').text
Finmate[16]
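
Indexing a single character only works while the count has one digit and the date text keeps the same length. A regular expression is one way to make the extraction less brittle. The sketch below assumes the paragraph text looks like the example quoted in this section ("MAY 11, 2020 - 8 inmates"); the function name `parse_inmate_count` is hypothetical, and the second test string is illustrative rather than scraped.

```python
import re

def parse_inmate_count(text):
    """Return the integer that precedes 'inmate(s)', or None if absent."""
    match = re.search(r'(\d+)\s+inmates?\b', text, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None

print(parse_inmate_count('MAY 11, 2020 - 8 inmates'))   # 8
print(parse_inmate_count('JUNE 9, 2020 - 12 inmates'))  # 12

# Usage with the scraped paragraph:
# PFerry = parse_inmate_count(Finmate)
```

Returning `None` when nothing matches makes a changed page layout visible instead of silently storing a wrong character.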

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

driver.close()

### Clallam

[Clallam](https://websrv23.clallam.net/NewWorld.InmateInquiry/WA0050000/)

In [14]:
url = "https://websrv23.clallam.net/NewWorld.InmateInquiry/WA0050000/"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver') 
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [22]:
Clallam_inmate = driver.find_elements(By.CLASS_NAME, 'Name')
print(len(Clallam_inmate))

77


In [23]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

driver.close()

### Adams

[View Jail Roster Information](https://www.co.adams.wa.us/government/jail_roster_and_booking_information/index.php)

In [26]:
url = "https://www.co.adams.wa.us/jailrosterout.txt"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

print(driver.page_source)

<html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">06/09/20                 ADAMS COUNTY SHERIFF'S OFFICE                         0
06:00               Jail Register - Arrivals Still Confined          Page:     1



ALVAREZ, CODY JAMES            Arrived: 00:00:00 02/01/20     Booking: 40243

        Offense                         Offense Begin Date
        -----------------------------   ------------------
        DWLS 3RD DEGREE                 00:00:00 02/01/20


MALDONADO, ALEJANDRO           Arrived: 23:00:00 02/27/20     Booking: 40327

        Offense                         Offense Begin Date
        -----------------------------   ------------------
        RAPE 1ST                        15:00:00 03/03/20


SILBA-BAUTISTA, RAFAEL         Arrived: 18:00:00 02/28/20     Booking: 40330

        Offense                         Offense Begin Date
        -----------------------------   ------------------
        ADAMS COUNTY WARRANT            16:
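
Because this roster is plain text, the population could be counted without a browser. Each inmate entry in the output above carries one `Arrived: ... Booking:` line, so counting those lines is one possible approach. The sketch below assumes that layout holds for the whole file; `count_bookings` is a hypothetical helper, and the sample text uses placeholder names and booking numbers in the same layout.

```python
import re

def count_bookings(roster_text):
    """Count lines that contain an 'Arrived: ... Booking: <number>' entry."""
    pattern = re.compile(r'Arrived:.*Booking:\s*\d+')
    return sum(1 for line in roster_text.splitlines() if pattern.search(line))

# Placeholder entries that mimic the roster layout (not real records).
sample = """\
DOE, JOHN A                    Arrived: 00:00:00 02/01/20     Booking: 10001

        Offense                         Offense Begin Date
        -----------------------------   ------------------
        EXAMPLE OFFENSE                 00:00:00 02/01/20

ROE, JANE B                    Arrived: 23:00:00 02/27/20     Booking: 10002
POE, SAM C                     Arrived: 18:00:00 02/28/20     Booking: 10003
"""
print(count_bookings(sample))  # 3

# With the live file (network access required):
# text = requests.get("https://www.co.adams.wa.us/jailrosterout.txt").text
# PAdams = count_bookings(text)
```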

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

driver.close()