# Introduction

**Incarceration & COVID-19: How Jails Respond to COVID**<br>

This project scrapes daily jail roster information to create a large dataset. This dataset is designed to analyze how jail populations have fluctuated in response to COVID-19. Research centers on explaining why county jails in different parts of the United States have responded differently to the pandemic over time. 

A separate but related idea for this dataset analyzes the impact of pandemic-related jail population declines on local crime. This project uses daily jail roster population counts as the focal variable and the analysis uses group-based trajectory modeling. Our scraped data will address gaps in the [Vera](https://github.com/vera-institute/jail-population-data) dataset.

We start by comparing Washington and New York states because they dealt with COVID-19 at the early on-set of the pandemic. Below is a list of the data points to collect to harmonize with the Vera data.
- County Name
- State Name
- Daily Population Counts
- Reporting Jail Name

# Imports

In [1]:
# Import standard libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# API libraries
import re
import os
import time
import random
import requests
from os import system   
from math import floor
from copy import deepcopy

# Scraping libraries
from bs4 import BeautifulSoup
from time import sleep
from random import randint
import json
# Selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


import sqlite3
import mysql.connector

In [2]:
pip install mysql-connector

Collecting mysql-connector
  Downloading mysql-connector-2.2.9.tar.gz (11.9 MB)
[K     |████████████████████████████████| 11.9 MB 9.5 MB/s eta 0:00:01
[?25hBuilding wheels for collected packages: mysql-connector
  Building wheel for mysql-connector (setup.py) ... [?25ldone
[?25h  Created wheel for mysql-connector: filename=mysql_connector-2.2.9-cp37-cp37m-macosx_10_9_x86_64.whl size=247955 sha256=90e48808630f3ac81073ae38260938d491dcf357a65d1b7e99741281b57c645a
  Stored in directory: /Users/meaganrossi/Library/Caches/pip/wheels/42/2f/c3/692fc7fc1f0d8c06b9175d94f0fc30f4f92348f5df5af1b8b7
Successfully built mysql-connector
Installing collected packages: mysql-connector
Successfully installed mysql-connector-2.2.9
Note: you may need to restart the kernel to use updated packages.


In [2]:
MySQL

NameError: name 'MySQL' is not defined

In [3]:
mydb = mysql.connector.connect(host='localhost',\
                              user='root',
                              passwd='Smoke4fire**',
                              database='testdb')
print(mydb)

<mysql.connector.connection.MySQLConnection object at 0x7fb0280e3bd0>


In [5]:
mycursor = mydb.cursor()
# Already created
# mycursor.execute("CREATE DATABASE testdb")

In [6]:
mycursor.execute("SHOW DATABASES")

for db in mycursor:
    print(db)

('information_schema',)
('mysql',)
('performance_schema',)
('sys',)
('testdb',)


In [8]:
mycursor.execute("CREATE TABLE jails\
                 (reporting_jurisdictions VARCHAR(100),\
                 county_name VARCHAR(100),\
                 state_name VARCHAR(100),\
                 Date VARCHAR(100),\
                 jail_population INTEGER(255))")


In [13]:
# mycursor.execute("SHOW TABLES")

# for tb in mycursor:
#     print(tb)

In [19]:
sqlFormula = "INSERT INTO jails (Date,reporting_jurisdictions,county_name\
state_name, jail_population) VALUES (%s, %s,%s, %s,%s)"

In [20]:
jail1 = ('today','Spokane County Jail','Spokane County','WA',22)

In [21]:
mycursor.execute(sqlFormula, jail1)

mydb.commit()

InternalError: Unread result found

In [None]:
stop

# States

Be sure to check for APIs in addition to scraping. Will include NY, WA and FL.

## Washington

### Whitman

In [None]:
url = "http://www.whitmancountyjail.org/"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

In [None]:
listy = driver.find_elements_by_css_selector('h4')

#view full list
# for x in listy[:50]:
#     if len(x.text) > 0:
#         print(x.text)

In [None]:
location = driver.find_element_by_xpath('//*[@id="form1"]/footer').text
location

In [None]:
from datetime import datetime # Current date time in local system )

In [None]:
JWhitman = (location[:19])
CWhitman = (location[:7])
SWhitman = (location[47:49])
DWhitman = datetime.now().strftime('%Y-%m-%d')
PWhitman = (len(listy))-10

print('reporting_jurisdictions = ',JWhitman)
print('county_name = ',CWhitman)
print('state_name = ',SWhitman)
print('Date = ',DWhitman)
print('jail_population = ',PWhitman)

In [None]:
# Define pipeline

# conn = sqlite3.connect('new.db')
# curr = conn.cursor()

# curr.execute("""create table(\
# date DWhitman,\
# jail_population PWhitman'\
# county_name CWhitman,\
# state_name SWhitman,\
# reporting_jurisdictions JWhitman\
# )""")

In [None]:
driver.close()

### Spokane

#### Selenium

In [None]:
# url = "https://www.spokanecounty.org/352/Inmate-Roster'"
# driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
# driver.implicitly_wait(3)
# driver.get(url)

# print(driver.page_source)

In [None]:
# Outter html <h1 style="margin-top:0px;">Saturday, June 6, 2020</h1>

# date = driver.find_elements_by_xpath('//*[@id="aspnetForm"]/div[3]/h1')
# len(date)

In [None]:
# class PrisonSpider(scrapy.Spider):
#     name = 'Prison'

# #     def remove_characters(self, value):
# #         return value.strip('\xa0')
    
#     def start_requests(self):
#         yield SeleniumRequest(
#             url='https://www.spokanecounty.org/352/Inmate-Roster',
#             wait_time=3,
#             callback=self.parse
#         )

#     def parse(self, response):
#         products = response.xpath("//*")
#         for product in products:
#             yield {
#                 'Date': product.xpath('//*[@id="aspnetForm"]/div[3]/h1').get(),
#                 'County': product.xpath("/html/head/title").get()\
# #                 'State'
# #                 'Pop_Count'
# #                 'Jail'
#             }

In [None]:
driver.close()

### Okanogan

Details can be found in the Daily Jail Inmate Log on [Okanogan Sherriff Website](https://okanogansheriff.org/).

In [None]:
url = "https://okanogansheriff.org/"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [None]:
# Need to read from a pdf name=CD6F9816E7949144F43AB92B6CCADAA8

# location = driver.find_element_by_xpath('/html/body/embed').text
# location

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

In [None]:
driver.close()

### Jefferson

[Jefferson](https://co.jefferson.wa.us/174/Jail-Inmate-Search)<br> To view the full inmate roster click the Clear button then the Search button.

In [None]:
url = "https://co.jefferson.wa.us/174/Jail-Inmate-Search"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [None]:
#Hidden input type

# inmate = driver.find_element_by_xpath('//*[@id="Inmate_Index"]').text
# inmate

inmate = driver.find_elements_by_name('Name')
print(len(inmate))

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

driver.close()

### Grant

[Grant](ttps://www.grantcountywa.gov/SHERIFF/Corrections/Inmate-Roster.htm), daily pdf

In [None]:
url = "https://www.grantcountywa.gov/SHERIFF/Corrections/Inmate-Roster.htm"
driver = webdriver.Chrome('/Users/meaganlazer/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

print(driver.page_source)

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

driver.close()

### Gray's Harbor

[Gray's Harbor]('http://ghlea.com/JailRosters/GHCJRoster.html')

In [None]:
url = "http://ghlea.com/JailRosters/GHCJRoster.html"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [None]:
inmate = driver.find_elements_by_xpath('//*[@id="main-table"]/tbody/tr')
print(len(inmate))

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

driver.close()

### Ferry

[Ferry](https://www.ferry-county.com/Courts%20and%20Law/Inmate%20Roster/Inmate_Roster_Page.html): in the section that says "MAY 11, 2020 - 8 inmates")

In [None]:
url = "https://www.ferry-county.com/Courts%20and%20Law/Inmate%20Roster/Inmate_Roster_Page.html"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [None]:
Finmate = driver.find_element_by_xpath('//*[@id="mainContent3"]/p[9]').text
Finmate[16]

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

driver.close()

### Clallam

[Clallam](https://websrv23.clallam.net/NewWorld.InmateInquiry/WA0050000/)

In [None]:
url = "https://websrv23.clallam.net/NewWorld.InmateInquiry/WA0050000/"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver') 
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

driver.close()

### Adams

[View](https://www.co.adams.wa.us/government/jail_roster_and_booking_information/index.php) Jail Roster Information

In [None]:
url = "https://www.co.adams.wa.us/government/jail_roster_and_booking_information/index.php"
driver = webdriver.Chrome('/Users/meaganrossi/Projects/Incarceration_COVID/chromedriver')
driver.implicitly_wait(3)
driver.get(url)

# print(driver.page_source)

In [None]:
# print('reporting_jurisdictions = ',JWhitman)
# print('county_name = ',CWhitman)
# print('state_name = ',SWhitman)
# print('Date = ',DWhitman)
# print('jail_population = ',PWhitman)

driver.close()