# Tool Imports

In [104]:
import pandas as pd


# Overview

This notebook aims to scrape the US Department of Justice's Executive Office for Immigration Review website to obtain current details for each sitting judge.


## Objectives

- Explore what data is available
- Determine whether additional resources are necessary
    - Locate them if so
- Find product-relevant data: judge name, locale (district, court location, etc.), and, as a stretch goal, biography.
- Establish the best way to extract the data into a readable format for the backend endpoint

## Considerations
1) Relevance: How are formerly practicing judges handled? An attorney may upload a case file for a judge who is no longer practicing. How will the system handle this use case?

2) Timeliness: How current is the information provided? 

3) Reliability: How reliable is the data being scraped? Because the data is drawn from a .gov site, there is a strong likelihood that it can be relied upon for accuracy.

4) Scalability: Will the solution in this notebook scale?

Product Scope Note: This product formerly included appellate court cases, that is, cases where the original decision was appealed by either party and elevated to the next step in the immigration court judicial process. Please look for any reference to immigration appeals, BIA, board of appeals, etc. and ensure that information is not included unless otherwise requested by the stakeholder. An example website to exclude: https://www.justice.gov/eoir/board-of-immigration-appeals-bios
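
One way to apply the scope note is a keyword check run over scraped text or URLs before they are stored. The following is only a sketch: the names `APPELLATE_PATTERN`, `is_appellate_reference`, and `drop_appellate` are hypothetical, and the keyword list is an assumption drawn from the terms mentioned in the note, not a stakeholder-approved list.

```python
import re

# Assumed keyword list based on the scope note; extend it as needed.
# Separators may be spaces, hyphens, or underscores so URLs match too.
APPELLATE_PATTERN = re.compile(
    r"\bBIA\b"
    r"|board[\s_-]+of[\s_-]+(?:immigration[\s_-]+)?appeals"
    r"|immigration[\s_-]+appeals"
    r"|appellate",
    re.IGNORECASE,
)


def is_appellate_reference(text: str) -> bool:
    """Return True when the text mentions the appellate side of EOIR."""
    return bool(APPELLATE_PATTERN.search(text))


def drop_appellate(items: list[str]) -> list[str]:
    """Keep only the strings that do not reference immigration appeals."""
    return [item for item in items if not is_appellate_reference(item)]
```

A keyword filter like this can only flag candidates for exclusion; anything it catches would still be worth a manual look before being discarded.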

# Data Establishment

In [105]:
# Read every HTML table on the court listing page into a list of DataFrames
jurisdiction_data = pd.read_html('https://www.justice.gov/eoir/eoir-immigration-court-listing')


In [106]:
# Set max_colwidth to None so cell text is displayed in full instead of truncated
pd.set_option('display.max_colwidth', None)

# Source material for this functionality here
# https://towardsdatascience.com/8-commonly-used-pandas-display-options-you-should-know-a832365efa95


In [107]:
# Brief data content overview

# Number of tables found on the page
print('The length of the district data is', len(jurisdiction_data), '.')

# View the first table (the state navigation index)
print('Jurisdiction data example, line 1:', jurisdiction_data[0])


The length of the district data is 31 .
Jurisdiction data example, line 1:                                                                                                                                                                                                                                                                                                                                                                0
0  Arizona | California | Colorado | Connecticut | Florida | Georgia | Hawaii | Illinois  Louisiana | Maryland | Massachusetts | Michigan | Minnesota | Missouri Nebraska | Nevada | New Jersey | New Mexico | New York | North Carolina Northern Mariana Islands | Ohio | Oregon | Pennsylvania | Puerto Rico | Tennessee  Texas | Utah | Virginia | Washington


In [108]:
# View one of the state tables (index 8 is Illinois)

jurisdiction_data[8]


Unnamed: 0,Illinois,Illinois.1,Illinois.2,Illinois.3
0,Court,Address,Immigration Judges,Court Administrator
1,Chicago,"525 West Van Buren Street, Suite 500 Chicago, IL 60607 312-697-5800 Chicago Detained 536 S. Clark Street, Suite 340 Chicago, IL 60605 312-294-8400","Cole, Samuel B. Crites, Elizabeth Curran, Brendan Defoe, Craig A. Klein, Eliza Klosowsky, Michael P. Luskin, Joshua D. McKenna, Patrick M. Naseem, Samia Peyton, Jennifer I. Rosche, Robin Salovaara, Kaarina Saltzman, Eva S.","Barilla, Jody"


In [109]:
# The selected element appears to be a DataFrame
# Confirm with info()

jurisdiction_data[8].info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 4 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   Illinois    2 non-null      object
 1   Illinois.1  2 non-null      object
 2   Illinois.2  2 non-null      object
 3   Illinois.3  2 non-null      object
dtypes: object(4)
memory usage: 192.0+ bytes


In [110]:
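# Promote the first row (the real column labels) to the header,
# then display the table without that row.
# drop() returns a new DataFrame; illinois_df itself keeps both rows.

illinois_df = jurisdiction_data[8].copy()
illinois_df.columns = illinois_df.iloc[0]
illinois_df.drop([0], axis=0)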

Unnamed: 0,Court,Address,Immigration Judges,Court Administrator
1,Chicago,"525 West Van Buren Street, Suite 500 Chicago, IL 60607 312-697-5800 Chicago Detained 536 S. Clark Street, Suite 340 Chicago, IL 60605 312-294-8400","Cole, Samuel B. Crites, Elizabeth Curran, Brendan Defoe, Craig A. Klein, Eliza Klosowsky, Michael P. Luskin, Joshua D. McKenna, Patrick M. Naseem, Samia Peyton, Jennifer I. Rosche, Robin Salovaara, Kaarina Saltzman, Eva S.","Barilla, Jody"
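
The header promotion shown for Illinois could be wrapped in a small helper and applied to each state table. This is a minimal sketch, assuming every state table follows the Illinois layout with the real column labels in row 0; `promote_header` and the stand-in table `raw` are hypothetical and use abbreviated values from the output above.

```python
import pandas as pd


def promote_header(table: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose column labels are taken from the first row."""
    promoted = table.iloc[1:].copy()
    promoted.columns = table.iloc[0].tolist()
    return promoted.reset_index(drop=True)


# Small stand-in for one state table as returned by pd.read_html.
raw = pd.DataFrame(
    [
        ["Court", "Address", "Immigration Judges", "Court Administrator"],
        ["Chicago", "525 West Van Buren Street", "Cole, Samuel B.", "Barilla, Jody"],
    ],
    columns=["Illinois", "Illinois.1", "Illinois.2", "Illinois.3"],
)

clean = promote_header(raw)
print(clean.columns.tolist())
```

Returning a new DataFrame rather than modifying the input keeps the raw scrape available for comparison.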


In [111]:
# Index into the dataframe for the judge names
illinois_df.iloc[1, 2]


'Cole, Samuel B.  Crites, Elizabeth  Curran, Brendan  Defoe, Craig A.  Klein, Eliza  Klosowsky, Michael P.  Luskin, Joshua D.  McKenna, Patrick M.  Naseem, Samia  Peyton, Jennifer I.  Rosche, Robin  Salovaara, Kaarina  Saltzman, Eva S.'
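
The string above holds all of a court's judges in one cell. A possible way to break it into individual names is sketched below, assuming names are always separated by two or more whitespace characters as in this Illinois output; that separator may not hold for every court. `split_judges` is a hypothetical helper name.

```python
import re


def split_judges(cell: str) -> list[str]:
    """Split a court's judge cell into individual 'Last, First' names."""
    # Runs of two or more whitespace characters separate names; the single
    # spaces inside a name such as "Cole, Samuel B." are left untouched.
    parts = re.split(r"\s{2,}", cell.strip())
    return [part.strip() for part in parts if part.strip()]


cell = (
    "Cole, Samuel B.  Crites, Elizabeth  Curran, Brendan  Defoe, Craig A.  "
    "Klein, Eliza  Klosowsky, Michael P.  Luskin, Joshua D.  McKenna, Patrick M.  "
    "Naseem, Samia  Peyton, Jennifer I.  Rosche, Robin  Salovaara, Kaarina  "
    "Saltzman, Eva S."
)

judges = split_judges(cell)
print(len(judges), judges[0], judges[-1])
```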

# Full Court Coverage List

In [112]:
# Load the court listing data
listing_data = pd.read_html('https://www.justice.gov/eoir/immigration-court-administrative-control-list')

# Number of tables found on the page
print(len(listing_data))

# View of the data
listing_data[1].head()


2


Unnamed: 0,0,1,2
0,Administrative Control Court An administrative control court is one that creates and maintains records of proceedings for Immigration Courts within an assigned geographic area. See 8 C.F.R. § 1003.11.,Assigned Responsibility The administrative control court may have jurisdiction over: charging documents issued by the following DHS district offices or sub-offices; or charging documents relating to individual aliens in custody at the following detention facilities or Service Processing Centers or incarcerated alien inmates in the custody of Departments of Corrections as specified.,Other Hearing Locations Detail cities or other hearing sites which may be serviced by the administrative control court.
1,"Adelanto Immigration Court 10250 Rancho Rd., STE 201A Adelanto, CA 92301 Back to top","Adelanto Detention Facilities, West Adelanto Detention Facilities (East) - female population Adelanto Detention Facilities (West) - male population Lompoc Federal Correctional Institution United States Penitentiary, Lompoc","Clerical Transfers Allowed Adelanto Detention Facilities, West Lompoc Federal Correctional Institution and United States Penitentiary NOTE: The Lancaster Immigration Court has been closed as of November 1, 2012. Any matter relating to a case heard at that court is now under the jurisdiction of the Adelanto Immigration Court."
2,"Arlington Immigration Court 1901 South Bell Street, Suite 200 Arlington, VA 22202 Back to top","ARLINGTON, VA - DHS DISTRICT OFFICE (including any sub-offices)","Virginia Televideo Sites: Farmville Detention Center Farmville, VA Caroline County Detention Facility Bowling Green, VA"
3,"Atlanta Immigration Court Peachtree Summit Federal Building 401 W. Peachtree Street, Suite 2600 Atlanta, GA 30308","ATLANTA, GA - DHS DISTRICT OFFICE (including any sub-offices except North Carolina and South Carolina)",
4,"Atlanta Immigration Court 180 Ted Turner Drive, SW, Suite 241 Atlanta, GA 30303 Back to top","ATLANTA, GA - DHS DISTRICT OFFICE (including any sub-offices except North Carolina and South Carolina) Irwin County Detention Center, Ocilla, GA Georgia Department of Corrections Diagnostic Center, Jackson, GA",
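
The court column above mixes the court name, its address, and a trailing "Back to top" link label. A rough sketch of separating them follows, assuming every court name ends with the words "Immigration Court" as in the rows shown; `parse_court_cell` is a hypothetical helper, and cells that do not fit the pattern (such as the descriptive header in row 0) are returned unsplit.

```python
import re

# Trailing navigation text left over from the web page.
BACK_TO_TOP = re.compile(r"\s*Back to top\s*$", re.IGNORECASE)
# Court name is everything up to the first "Immigration Court".
COURT_NAME = re.compile(r"^(?P<name>.*?Immigration Court)\s+(?P<address>.+)$")


def parse_court_cell(cell: str) -> dict:
    """Split a court cell into its name and address."""
    cleaned = BACK_TO_TOP.sub("", cell.strip())
    match = COURT_NAME.match(cleaned)
    if match is None:
        # Pattern not recognized: keep the text and leave the address unknown.
        return {"court": cleaned, "address": None}
    return {"court": match.group("name"), "address": match.group("address")}


example = parse_court_cell(
    "Adelanto Immigration Court 10250 Rancho Rd., STE 201A Adelanto, CA 92301 Back to top"
)
print(example)
```

Note that Atlanta appears twice in the listing with different addresses, so the court name alone would not be a safe unique key.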
