# Scrape Texas death row inmates

In this exercise, we're going to scrape [a table of Texas death row inmates](http://www.tdcj.state.tx.us/death_row/dr_offenders_on_dr.html) into a CSV file.

First, import the libraries we'll need:

In [None]:
import requests
from bs4 import BeautifulSoup
import csv

Use `requests` to fetch the page and feed its contents to `BeautifulSoup`:

In [None]:
url = 'http://www.tdcj.state.tx.us/death_row/dr_offenders_on_dr.html'

r = requests.get(url)

soup = BeautifulSoup(r.text, 'html.parser')

Now _view source_ on the page and figure out how we're going to isolate that table. There are a couple of ways to do this -- I'm going to target its class.

In [None]:
dr_table = soup.find('table', {'class': 'os'})
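
As a rough sketch of the "couple of ways" mentioned above, here are three lookups that should all land on the same table. The HTML snippet is a made-up stand-in for the real page, and the names `sample_soup`, `by_dict`, `by_keyword` and `by_selector` are just for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: two tables, one with class "os"
html = """
<table class="nav"><tr><td>menu</td></tr></table>
<table class="os">
  <tr><th>ID</th><th>Last Name</th></tr>
  <tr><td>123</td><td>Doe</td></tr>
</table>
"""

sample_soup = BeautifulSoup(html, 'html.parser')

# The dictionary form used above, plus two other spellings of the same lookup
by_dict = sample_soup.find('table', {'class': 'os'})
by_keyword = sample_soup.find('table', class_='os')
by_selector = sample_soup.select_one('table.os')

# Compare the three results
print(by_dict == by_keyword == by_selector)
```

Note the trailing underscore in `class_`: `class` is a reserved word in Python, so BeautifulSoup uses `class_` for the keyword argument.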

- Find all of the rows in the table.
- In a `for` loop, iterate over the rows (skip the first one) and print each one.

In [None]:
dr_rows = dr_table.find_all('tr')[1:]

for row in dr_rows:
    print(row)

Let's look at each cell in each row. Using `find_all`, get a list of cells in each row and print the list:

In [None]:
for row in dr_rows:
    cols = row.find_all('td')
    print(cols)
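
One possible reason to skip the first row: if a table's header row is built from `th` tags rather than `td` tags (an assumption here, so check the page source), `find_all('td')` returns an empty list for it. The sketch below uses a made-up two-row table to show that, along with `get_text(strip=True)` as one way to trim whitespace from cell text.

```python
from bs4 import BeautifulSoup

# Hypothetical two-row table: a header row of <th> cells and one data row
html = """
<table>
  <tr><th>ID</th><th>Last Name</th></tr>
  <tr><td> 999001 </td><td>Doe</td></tr>
</table>
"""

sample_rows = BeautifulSoup(html, 'html.parser').find_all('tr')

header_cells = sample_rows[0].find_all('td')   # no <td> tags in this header row
data_cells = sample_rows[1].find_all('td')

# get_text(strip=True) trims surrounding whitespace from each cell's text
values = [cell.get_text(strip=True) for cell in data_cells]
print(header_cells, values)
```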

Let's assign descriptive names to each cell based on its position in the list.

You will use bracket notation to access items in a list -- keep in mind that counting starts at zero in Python. For instance: `your_list[0]` returns the first item in `your_list`.
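
Here's a quick sketch of bracket notation on a made-up list (the values are placeholders, not real data):

```python
# Hypothetical list standing in for one row's cells
cells = ['999001', 'Doe', 'Jane', 'Travis']

first = cells[0]    # first item -- counting starts at zero
second = cells[1]   # second item
last = cells[-1]    # negative indexes count back from the end

print(first, second, last)
```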

The first item in our list of cells (`[0]`) is the inmate's ID number. Next is a link to the inmate's detail page (which we'll extract by accessing the nested `a` tag's `href` attribute; we'll also prepend the base URL to make it an absolute link), then the last name, etc.
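
Prepending the base URL with `+` works when every `href` is a relative path. As an alternative sketch (not what the cells below use), `urljoin` from the standard library's `urllib.parse` does the same join and also copes with an `href` that is already absolute. The `href` values here are invented for illustration.

```python
from urllib.parse import urljoin

base = 'http://www.tdcj.state.tx.us/death_row/'

# Hypothetical relative href, similar in shape to what a cell's link might hold
relative_href = 'dr_info/doejane.html'
detail_link = urljoin(base, relative_href)

# An href that is already absolute comes back unchanged
absolute_href = 'http://example.com/page.html'
other_link = urljoin(base, absolute_href)

print(detail_link)
print(other_link)
```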

In [None]:
for row in dr_rows:
    cols = row.find_all('td')

    id_number = cols[0].text
    detail_link = 'http://www.tdcj.state.tx.us/death_row/' + cols[1].a['href']
    last_name = cols[2].text
    first_name = cols[3].text
    dob = cols[4].text
    sex = cols[5].text
    race = cols[6].text
    date_received = cols[7].text
    county = cols[8].text
    date_offense = cols[9].text
    print(id_number, detail_link, last_name, first_name, dob, sex, race, date_received, county, date_offense)

Finally, let's write the data to a local CSV file. Same setup as when we opened the practice table, except this time we're going to open a file in `w` (write) mode -- let's call it _tx-death-row.csv_ -- and use a `csv.writer` object to take lists of values and write them to the file, one row at a time.

The loop we just made will go inside the `with` block.

In [None]:
# newline='' keeps the csv module from writing blank lines between rows on Windows
with open('tx-death-row.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    headers = ['id', 'link', 'last', 'first', 'dob', 'sex', 'race',
               'admission_date', 'county', 'offense_date']
    
    writer.writerow(headers)
    
    for row in dr_rows:
        cols = row.find_all('td')

        id_number = cols[0].text
        detail_link = 'http://www.tdcj.state.tx.us/death_row/' + cols[1].a['href']
        last_name = cols[2].text
        first_name = cols[3].text
        dob = cols[4].text
        sex = cols[5].text
        race = cols[6].text
        date_received = cols[7].text
        county = cols[8].text
        date_offense = cols[9].text
        writer.writerow([id_number, detail_link, last_name, first_name, dob, sex,
                         race, date_received, county, date_offense])    
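
To see what `csv.writer` does with each list, here's a small sketch that writes to an in-memory buffer instead of a file. The row values are made up; the comma inside one of them shows the writer adding quotes so the field stays intact.

```python
import csv
import io

# An in-memory text buffer stands in for the output file
buffer = io.StringIO()
writer = csv.writer(buffer)

writer.writerow(['id', 'last', 'first'])
# Made-up values; the comma in the second field forces quoting
writer.writerow(['999001', 'Doe, Jr.', 'John'])

# Split the buffer's contents back into lines to inspect them
lines = buffer.getvalue().splitlines()
print(lines)
```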