# Introduction

**Incarceration & COVID-19: How Jails Respond to COVID**

This project scrapes daily jail rosters to build a large dataset for analyzing how jail populations have fluctuated in response to COVID-19. The research aims to explain why county jails in different parts of the United States have responded differently to the pandemic over time.

A separate but related use of this dataset is to analyze the impact of pandemic-related jail population declines on local crime. The project uses daily population counts from the jail rosters as the focal variable, and the analysis uses group-based trajectory modeling. Our scraped data will address gaps in the [Vera](https://github.com/vera-institute/jail-population-data) dataset.

We start by comparing Washington and New York because both states dealt with COVID-19 early in the pandemic. Below are the data points to collect so that our data can be harmonized with the Vera data.
- County Name
- State Name
- Daily Population Counts
- Reporting Jail Name
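
As a minimal sketch, the data points above could be held in one record per jail per day. The class name `JailDailyCount`, the field names, and the `report_date` field are assumptions made for illustration; the Vera dataset's actual column names may differ and should be checked before harmonizing.

```python
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class JailDailyCount:
    """One jail's population on one day (hypothetical field names)."""
    report_date: date
    county_name: str
    state_name: str
    jail_name: str
    population: int


def to_row(record):
    """Convert a record to a plain dict with an ISO-formatted date string."""
    row = asdict(record)
    row["report_date"] = record.report_date.isoformat()
    return row


# Placeholder values for illustration only, not real roster data.
example = JailDailyCount(
    report_date=date(2020, 1, 1),
    county_name="Example County",
    state_name="Washington",
    jail_name="Example County Jail",
    population=0,
)
example_row = to_row(example)
```

A list of such rows can be passed directly to `pd.DataFrame(rows)` once several jails have been scraped.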

# Imports

In [1]:
# Import Libraries
import re
import os
import time
import random
import requests
import numpy as np
import pandas as pd
from os import system   
from math import floor
from copy import deepcopy
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# import mysql.connector 
# from mysql.connector import errorcode
from time import sleep
from random import randint
import json
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# States

## Washington

### Whitman

In [2]:
# http://www.whitmancountyjail.org/
# Need to count unique IDs
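
The note about counting unique IDs can be sketched as follows. This assumes the roster lists one row per charge or booking, so the same ID can appear more than once; the function name `count_unique_ids` and the sample IDs are hypothetical, and the real roster format has not been inspected here.

```python
def count_unique_ids(raw_ids):
    """Return the number of distinct, non-empty IDs after trimming whitespace."""
    cleaned = set()
    for raw in raw_ids:
        if raw is None:
            continue
        value = str(raw).strip()
        if value:
            cleaned.add(value)
    return len(cleaned)


# Hypothetical IDs: "A101" appears twice (once with trailing whitespace).
sample_ids = ["A100", "A101", "A101 ", "A102", "", None]
daily_population = count_unique_ids(sample_ids)
```

Trimming before deduplicating matters because scraped cells often carry stray whitespace, which would otherwise inflate the count.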

### Spokane

In [3]:
# Defining the url of the site
base_site = "https://www.spokanecounty.org/352/Inmate-Roster"

# Making a get request
response = requests.get(base_site)
response.status_code

200

In [4]:
# Extracting the HTML
html = response.content

# Check that the response is HTML by inspecting its first 100 bytes
html[:100]

b'\r\n\r\n<!DOCTYPE html>\r\n<html lang="en">\r\n<head>\r\n\r\n\t<meta http-equiv="Content-type" content="text/html'

In [5]:
# Convert HTML to a BeautifulSoup object.
soup = BeautifulSoup(html, "html.parser")

In [6]:
# Exporting the HTML to a file
with open('Spokane_response.html', 'wb') as file:
    file.write(soup.prettify('utf-8'))

In [8]:
soup.find_all('title')

[<title>Inmate Roster | Spokane County, WA</title>,
 <title>Arrow Left</title>,
 <title>Arrow Right</title>,
 <title>Slideshow Left Arrow</title>,
 <title>Slideshow Right Arrow</title>]

In [33]:
soup.find_all('div', class_ = 'container')
# soup.find_all('li')

[]
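
The empty list means the downloaded HTML contains no `div` with the class `container`. One way to see which classes are actually present is to tally them. The sketch below uses the standard library's `html.parser` rather than BeautifulSoup so that it runs on its own; `ClassTally`, `tally_classes`, and the sample markup are hypothetical names and data, not part of the Spokane page.

```python
from collections import Counter
from html.parser import HTMLParser


class ClassTally(HTMLParser):
    """Count how often each CSS class appears on tags with a given name."""

    def __init__(self, tag_name):
        super().__init__()
        self.tag_name = tag_name
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != self.tag_name:
            return
        for name, value in attrs:
            # A class attribute may hold several space-separated class names.
            if name == "class" and value:
                self.counts.update(value.split())


def tally_classes(html_text, tag_name="div"):
    """Return a Counter of class names found on `tag_name` elements."""
    parser = ClassTally(tag_name)
    parser.feed(html_text)
    return parser.counts


# Hypothetical markup for illustration only.
sample_html = (
    '<div class="row outer"><div class="row"></div></div>'
    '<span class="row"></span>'
)
div_classes = tally_classes(sample_html)
```

In the notebook this could be called as `tally_classes(response.text).most_common(20)`. If nothing resembling a roster shows up, the roster may be loaded by JavaScript or an embedded frame, in which case Selenium (already imported above) may be needed; that is an assumption to verify against the saved `Spokane_response.html`.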

### Okanogan

In [10]:
# https://okanogansheriff.org/
# (under Daily Jail Inmate Log)

### Jefferson

In [11]:
# https://co.jefferson.wa.us/174/Jail-Inmate-Search
# (To view the full inmate roster click the Clear button then the Search button.)

### Grant

In [12]:
# https://www.grantcountywa.gov/SHERIFF/Corrections/Inmate-Roster.htm

### Grays Harbor

In [13]:
# http://ghlea.com/JailRosters/GHCJRoster.html

### Ferry

In [14]:
# https://www.ferry-county.com/Courts%20and%20Law/Inmate%20Roster/Inmate_Roster_Page.html
# (in the section that says "MAY 11, 2020 - 8 inmates")
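
A heading of the form quoted above ("MAY 11, 2020 - 8 inmates") carries both the date and the population count, so a regular expression may be enough to extract them. This is a sketch that assumes the heading keeps this exact shape and an English month name; `HEADING_RE` and `parse_roster_heading` are hypothetical names.

```python
import re
from datetime import datetime

# Matches e.g. "MAY 11, 2020 - 8 inmates" (assumed heading format).
HEADING_RE = re.compile(
    r"([A-Za-z]+ \d{1,2}, \d{4})\s*-\s*(\d+)\s+inmates?",
    re.IGNORECASE,
)


def parse_roster_heading(text):
    """Return (date, inmate_count) from a roster heading, or None if no match."""
    match = HEADING_RE.search(text)
    if match is None:
        return None
    # Normalize "MAY" to "May" before parsing the full month name.
    report_date = datetime.strptime(match.group(1).title(), "%B %d, %Y").date()
    return report_date, int(match.group(2))


parsed = parse_roster_heading("MAY 11, 2020 - 8 inmates")
```

Returning `None` on a failed match, rather than raising, makes it easy to log days on which the page layout changed.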

### Clallam

In [15]:
# https://websrv23.clallam.net/NewWorld.InmateInquiry/WA0050000/

### Adams

In [16]:
# https://www.co.adams.wa.us/government/jail_roster_and_booking_information/index.php
# (View Jail Roster Information)