## The Heart of Capital Punishment: Execution trends in the American South
#### FACE EXTRACTION CROP OF EXECUTED DEATH ROW CONVICTS IN TEXAS

This project aims to visualize trends in the significance and character of Texas's capital punishment system. To humanize the statistics presented, I extracted the images of the individuals executed by the Texas Department of Criminal Justice (TDCJ) from the agency's <a href='https://www.tdcj.texas.gov/death_row/dr_executed_offenders.html'>Death Row Information</a> database.


The final visualization can be found <a href='https://public.flourish.studio/story/622962/'>here</a>.

In [None]:
import cv2
import os
import pandas as pd
import requests
import shutil

I used the <a href='https://chrome.google.com/webstore/detail/web-scraper-free-web-scra/jnhgnonknehpejjnehehllkliplmbmhn'>Web Scraper</a> extension to extract the image links of the convicts from the TDCJ's website. The tool exports the data in CSV format. I manually cleaned the data before loading it into Jupyter.

In [None]:
data = pd.read_csv("executions_texas_images.csv")

urls = list(data.image_url_clean)  # creating list of image urls
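
Because the CSV was cleaned by hand, a quick check of the URL list can catch leftovers before any downloads start. The following is a minimal sketch and not part of the original workflow: `clean_urls` is a hypothetical helper, and the sample values are made up. It drops non-string entries (pandas represents empty cells as `NaN` floats), strips whitespace, and removes duplicates while keeping the original order.

```python
def clean_urls(raw_urls):
    """Return stripped, de-duplicated URL strings in their original order."""
    seen = set()
    cleaned = []
    for url in raw_urls:
        if not isinstance(url, str):  # skips NaN floats and None
            continue
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned


# Hypothetical values for illustration only
sample = [" https://example.com/a.jpg", float("nan"), "https://example.com/a.jpg", ""]
print(clean_urls(sample))
```

In the notebook this could be applied as `urls = clean_urls(data.image_url_clean)`.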

The image URLs led to two different image formats, as shown below.

#### FORMAT 1:
<img src="example_1.jpg" width="100" height="300"/>

#### FORMAT 2:
<img src="example_2.jpg" width="100" height="300"/>

To extract the mugshots from images of format 2, I first downloaded the images from their URLs with `requests` and saved them locally with the `shutil` library.

In [None]:
found = []  # filenames of images that were downloaded
not_found = []  # filenames of images that could not be retrieved

for image_url in urls:
    if image_url[-5:] != "2.jpg":
        filename = image_url.split("/")[-1]  # generating filename
        r = requests.get(image_url, stream=True)  # requesting image
        if r.status_code == 200:  # verifying image existence
            r.raw.decode_content = True  # decoding compressed responses
            found.append(filename)
            with open(filename, "wb") as f:
                shutil.copyfileobj(r.raw, f)  # saving image locally
        else:
            not_found.append(filename)

not_found
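
The download loop writes each file into the working directory, while the face-extraction step reads from an `images` folder, so the files presumably had to be moved in between. As a hedged sketch of one way to write them there directly, the hypothetical helpers below derive the local path from a URL and mirror the loop's `2.jpg` filter. The folder name `images` is taken from the later cell; the helper names and the sample URL are assumptions for illustration.

```python
import os
from urllib.parse import urlsplit


def local_path(image_url, dest_dir="images"):
    """Map an image URL to a file path inside dest_dir (hypothetical helper)."""
    filename = os.path.basename(urlsplit(image_url).path)  # ignores any query string
    return os.path.join(dest_dir, filename)


def should_download(image_url):
    """Mirror the loop's filter: skip URLs whose last five characters are '2.jpg'."""
    return not image_url.endswith("2.jpg")


# Hypothetical URL for illustration only
url = "https://example.com/photos/smithjohn.jpg?v=1"
print(local_path(url), should_download(url))
```

To use this in the loop, the folder would need to exist first (`os.makedirs("images", exist_ok=True)`), and `open(local_path(image_url), "wb")` would replace `open(filename, "wb")`.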

Finally, I used the OpenCV library (`cv2`) to detect the convicts' faces and save each one as a new image.

In [None]:
# loading the pretrained frontal-face Haar cascade once, outside the loop
faceCascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

for filename in os.listdir("images"):
    image = cv2.imread(os.path.join("images", filename))  # reading image
    if image is None:  # skipping files OpenCV cannot read
        continue
    name = os.path.splitext(filename)[0]  # filename without extension
    print(name)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # converting to grayscale

    faces = faceCascade.detectMultiScale(gray, scaleFactor=1.3)  # detecting faces

    print("[INFO] Found {0} Faces!".format(len(faces)))

    for (x, y, w, h) in faces:  # extracting face
        roi_color = image[y:y + h, x:x + w]  # cropping the face region
        print("[INFO] Object found. Saving locally.")
        # if several faces are detected, the last one overwrites the earlier ones
        cv2.imwrite(name + "_face.jpg", roi_color)  # saving locally
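
Every detection for a given image is saved under the same filename, so when the cascade reports more than one face, only the last one survives. One possible refinement, sketched here under assumptions and not taken from the original notebook, is to keep only the largest detection and add a small margin around it. The helpers `largest_face` and `padded_box`, the 20% padding, and the sample boxes are all hypothetical; the code is pure Python so it runs without OpenCV.

```python
def largest_face(faces):
    """Return the (x, y, w, h) box with the largest area, or None if empty."""
    boxes = [tuple(int(v) for v in box) for box in faces]
    if not boxes:
        return None
    return max(boxes, key=lambda box: box[2] * box[3])


def padded_box(box, image_shape, pad=0.2):
    """Grow a box by a fraction of its size, clamped to the image bounds."""
    x, y, w, h = box
    height, width = image_shape[:2]
    dx, dy = int(w * pad), int(h * pad)
    x0, y0 = max(x - dx, 0), max(y - dy, 0)
    x1, y1 = min(x + w + dx, width), min(y + h + dy, height)
    return x0, y0, x1, y1


# Hypothetical detections for an image 300 pixels high and 100 pixels wide
detections = [(10, 20, 30, 30), (40, 50, 50, 60)]
best = largest_face(detections)
print(best, padded_box(best, (300, 100)))
```

With an image loaded by `cv2.imread`, the crop would then be `x0, y0, x1, y1 = padded_box(best, image.shape)` followed by `image[y0:y1, x0:x1]`.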