# Analysis of Virginia Court Data: Jupyter Template
### Jon Kropko (jkropko@codeforcharlottesville.org)
### 3/15/2021

The goal is to answer questions for the [Legal Aid Justice Center](https://www.justice4all.org/) using the data compiled by [Ben Schoenfeld](https://github.com/bschoenfeld). We need to supply the LAJC with easy-to-read documents in HTML format that have the following sections:

* **Question**: the question we are trying to answer
* **Data**: the relevant datasets
* **Data Wrangling**: all the code and steps to load the data and prepare it for analysis
* **Results**: the results
* **Conclusion**: a succinct description of the answer

If you would like to use R, please create a document with all these sections using **R Markdown**. If you are using Python, please create a **Jupyter notebook**.

There are three ways to access the data, and you can use whichever method you feel most comfortable with:

1. You can download CSV files one at a time from https://virginiacourtdata.org

2. You can use the user interface for the makeshift API we've set up. Here, you can enter SQL queries into a text box, execute them, and download the output in either JSON or CSV format: http://132.145.211.20:8001/criminal-court

3. You can work entirely within an R or Python environment by sending SQL queries to the following API endpoint:

http://132.145.211.20:8001/criminal-court.csv?

with one API parameter:

* `sql` set equal to the specific SQL query you want to pass to the API

For example, to extract all the data from the Virginia circuit criminal courts in the year 2000, the SQL code is
```
select * from circuit_criminal_2000_anon_00
```
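Because the query travels as a URL parameter, spaces and special characters such as `*` have to be encoded before they are placed in the URL. The short sketch below uses only Python's standard library to show what the full request URL looks like for the example query. The variable names are illustrative, and the `requests` library handles this encoding for you when the query is passed through its `params` argument.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

endpoint = "http://132.145.211.20:8001/criminal-court.csv?"
my_query = "select * from circuit_criminal_2000_anon_00"

# urlencode escapes the query so it is safe to place in a URL
full_url = endpoint + urlencode({"sql": my_query})
print(full_url)

# Decoding the URL's query string recovers the original SQL
decoded = parse_qs(urlsplit(full_url).query)["sql"][0]
print(decoded)
```

Pasting the printed URL into a browser should be equivalent to issuing the request from code.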
We will need the following libraries:

In [8]:
import numpy as np
import pandas as pd
import requests
import io

To pull this CSV directly into Python, use the following code:

In [9]:
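endpoint = "http://132.145.211.20:8001/criminal-court.csv?"
my_query = "select * from circuit_criminal_2000_anon_00"
# Send the query to the API as the `sql` parameter
r = requests.get(endpoint, params = {'sql': my_query})
r.raise_for_status()  # stop here if the API returned an HTTP error
# Wrap the returned CSV text in a file-like object and parse it
data = io.StringIO(r.text)
df = pd.read_csv(data, sep=",")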

To pull a different selection of the data into Python, simply change the SQL code stored in `my_query`.
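If you change the query often, it can help to compose it from parts rather than editing the string by hand. The sketch below is only one possible approach: `build_query` is a hypothetical helper introduced here, not part of the API, and the `limit` clause assumes the API accepts SQLite-style SQL, which has not been confirmed. The table name and the column names `HearingDate`, `HearingResult`, and `fips` are taken from this notebook.

```python
def build_query(table, columns=None, where=None, limit=None):
    """Compose a simple SELECT statement (hypothetical helper)."""
    # Select every column unless specific ones are requested
    cols = ", ".join(columns) if columns else "*"
    query = f"select {cols} from {table}"
    if where:
        query += f" where {where}"
    if limit is not None:
        # Assumes the API's SQL dialect supports LIMIT
        query += f" limit {int(limit)}"
    return query


my_query = build_query(
    "circuit_criminal_2000_anon_00",
    columns=["HearingDate", "HearingResult", "fips"],
    where="fips = 91",
    limit=100,
)
print(my_query)
```

The resulting `my_query` string can be passed to `requests.get` in the same way as the query above.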

The data are now stored as a data frame:

In [10]:
df

| | HearingDate | HearingResult | HearingJury | HearingPlea | HearingType | HearingRoom | fips | Filed | Commencedby | Locality | ... | DrivingRestrictions | RestrictionEffectiveDate | RestrictionEndDate | VAAlcoholSafetyAction | RestitutionPaid | RestitutionAmount | Military | TrafficFatality | AppealedDate | person_id |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 2000-12-19 | Dismissed | | | Under Advisement | | 91 | 2000-02-16 | General District Court Appeal | COMMONWEALTH OF VA | ... | | | | | | | | | | 227220000000460 |
| 1 | 2000-09-19 | Dismissed | | | Trial | | 91 | 2000-05-19 | J&Dr Appeal | COMMONWEALTH OF VA | ... | | | | | | | | | | 352110000000068 |
| 2 | 2000-09-07 | Sent | | | Trial | | 91 | 2000-02-16 | General District Court Appeal | COMMONWEALTH OF VA | ... | | | | | | | | | | 216180000001247 |
| 3 | 2000-09-07 | Nolle Prosequi | | | Trial | | 91 | 2000-02-16 | General District Court Appeal | COMMONWEALTH OF VA | ... | | | | | | | | | | 216180000001247 |
| 4 | 2000-09-07 | Sent | | | Trial | | 91 | 2000-02-16 | General District Court Appeal | COMMONWEALTH OF VA | ... | | | | | | | | | | 216180000001247 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 995 | 2000-12-06 | Sent | | | Pre-Sentence Report | | 11 | 2000-09-15 | Indictment | COMMONWEALTH OF VA | ... | | | | | | | | | | 139011000000643 |
| 996 | 2000-12-06 | Revoked - Sentence/Probation | | | Show Cause | | 11 | 2000-09-29 | Reinstatement | COMMONWEALTH OF VA | ... | | | | | | | | | | 343120000001044 |
| 997 | 2000-12-06 | Revoked - Sentence/Probation | | | Show Cause | | 11 | 2000-09-29 | Reinstatement | COMMONWEALTH OF VA | ... | | | | | | | | | | 343120000001044 |
| 998 | 2000-12-06 | Dismissed | | | Probation Reporting | | 11 | 1999-10-06 | Direct Indictment | COMMONWEALTH OF VA | ... | | | | | | | | | | 7071000000429 |
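
Once the data are in a data frame, a typical first wrangling step is to convert the date columns and tabulate the outcomes. The sketch below illustrates this on `sample`, a tiny stand-in built here from three columns of the first five rows printed above, so that it runs without a network connection. The same calls should apply to the full `df`, assuming its columns are named as shown.

```python
import pandas as pd

# Tiny stand-in for `df`, copied from the first five rows printed above
sample = pd.DataFrame({
    "HearingDate": ["2000-12-19", "2000-09-19", "2000-09-07",
                    "2000-09-07", "2000-09-07"],
    "HearingResult": ["Dismissed", "Dismissed", "Sent",
                      "Nolle Prosequi", "Sent"],
    "fips": [91, 91, 91, 91, 91],
})

# Convert the date strings to proper datetime values
sample["HearingDate"] = pd.to_datetime(sample["HearingDate"])

# Count how often each hearing result occurs
result_counts = sample["HearingResult"].value_counts()
print(result_counts)

# Earliest and latest hearing dates in the sample
print(sample["HearingDate"].min(), sample["HearingDate"].max())
```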
