# Writing data to file

Typically, the last step in a web scraping job is to write your results to a file. In this case, we're going to write out the results to a CSV file, and to do that we'll use Python's built-in [`csv`](https://docs.python.org/3/library/csv.html) module.

To start with, let's import our dependencies:

In [1]:
import csv

import requests
import bs4

For the sake of time, here's the working code to scrape the death row inmates into lists from the previous notebook -- go ahead and run these three cells:

In [9]:
dr_page = requests.get('https://www.tdcj.state.tx.us/death_row/dr_offenders_on_dr.html')

In [10]:
soup = bs4.BeautifulSoup(dr_page.text, 'html.parser')
table = soup.find('table', {'class': 'tdcj_table'})
rows = table.find_all('tr')

In [11]:
for item in rows[1:]:
    cells = item.find_all('td')
    inmate_id = cells[0].text
    link = 'https://www.tdcj.texas.gov/death_row/' + cells[1].a['href']
    last = cells[2].text
    first = cells[3].text
    dob = cells[4].text
    gender = cells[5].text
    race = cells[6].text
    intake_date = cells[7].text
    county = cells[8].text
    offense_date = cells[9].text
    inmate_data = [inmate_id, link, last, first, dob, gender, race, intake_date, county, offense_date]
    print(inmate_data)

['999614', 'https://www.tdcj.texas.gov/death_row/dr_info/lovekristopher.html', 'Love', 'Kristopher', '03/23/1984', 'M', 'Black', '11/15/2018', 'Dallas', '09/02/2015']
['999613', 'https://www.tdcj.texas.gov/death_row/dr_info/lewishoward.html', 'Lewis', 'Howard', '09/20/1967', 'M', 'Black', '11/09/2018', 'Walker', '07/24/2013']
['999612', 'https://www.tdcj.texas.gov/death_row/dr_info/comptondillion.html', 'Compton', 'Dillion', '07/27/1994', 'M', 'Black', '11/06/2018', 'Jones', '07/16/2016']
['999611', 'https://www.tdcj.texas.gov/death_row/dr_info/irsanali.html', 'Irsan', 'Ali', '12/27/1957', 'M', 'Other', '08/20/2018', 'Harris', '01/15/2018']
['999610', 'https://www.tdcj.texas.gov/death_row/dr_info/delacruzisidro.html', 'Delacruz', 'Isidro', '10/07/1990', 'M', 'Hispanic', '04/26/2018', 'Tom Green', '09/02/2014']
['999609', 'https://www.tdcj.texas.gov/death_row/dr_info/delacerdajason.html', 'Delacerda', 'Jason', '07/26/1977', 'M', 'Hispanic', '03/08/2018', 'Hardin', '08/17/2011']
['999608

To write this data to a CSV using Python's `csv` module, the steps are:
- Open a new file in "write" mode
- Write the headers (which we will define)
- As we loop over the table rows and extract each inmate's data, write that row's list to the file

Let's start by defining a list of strings that will become our header row in the CSV:

In [4]:
headers = ['inmate_id', 'link', 'last', 'first', 'dob', 'gender',
           'race', 'intake_date', 'county', 'offense_date']

To open a new file, we'll use a `with` statement and the `open()` function with a few arguments to specify how we write to file:
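- `'inmate-data.csv'`: The name of the file to write to
- `'w'`: Specifies that we're opening the file in "write" mode
- `newline=''`: Stops Python from translating the line endings that the `csv` module writes itself. Without it you can end up with a bunch of blank extra lines, mostly on Windows 🙄 (the `csv` docs recommend always passing it, whatever your operating system)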
- `encoding='utf-8'`: The file encoding
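
To see why `newline=''` matters, here's a small sketch that writes two rows into an in-memory `io.StringIO` buffer instead of a real file (the buffer and the shortened three-column rows are just stand-ins for illustration). Printing the `repr()` of the result shows the line endings that `csv.writer` adds on its own:

```python
import csv
import io

# io.StringIO stands in for a real file, so nothing is written to disk.
# newline='' tells Python to leave the writer's line endings untouched.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerow(['inmate_id', 'last', 'first'])
writer.writerow(['999614', 'Love', 'Kristopher'])

# repr() makes the normally invisible line-ending characters visible
raw_text = buffer.getvalue()
print(repr(raw_text))
```

Each row ends in `\r\n`, which `csv.writer` supplies by default. If a file is opened on Windows without `newline=''`, Python converts the `\n` into `\r\n` a second time, and that doubled-up ending is what shows up as blank lines between rows.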

We'll also specify a variable to give us a handle to the open file: `in_file`

Next, we'll create a `csv.writer` object that will handle actually writing the data to file. Let's start by just writing the headers to our new file using the `writerow` method:

In [5]:
with open('inmate-data.csv', 'w', newline='', encoding='utf-8') as in_file:
    writer = csv.writer(in_file)
    writer.writerow(headers)

It worked! Now we can whang in the data we scraped out of the page earlier -- copy and paste the code from the earlier cell that begins `for item in rows[1:]:` underneath where the headers are being written. Then it's just a matter of trading out `print()` for `writer.writerow()`.

Proper indentation is key here! Everything under the `for` statement should be indented an extra four spaces, otherwise things will break.
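
As one illustration of what "break" can look like, here's a minimal sketch where a `writerow()` call is left un-indented, so it runs after the `with` block has already closed the file. The scratch file name `indent-demo.csv` and the shortened rows are made up for this demo; it writes to a temporary folder so your real `inmate-data.csv` is left alone.

```python
import csv
import os
import tempfile

# Hypothetical scratch file in a temp folder, used only for this demo
demo_path = os.path.join(tempfile.mkdtemp(), 'indent-demo.csv')

with open(demo_path, 'w', newline='', encoding='utf-8') as in_file:
    writer = csv.writer(in_file)
    writer.writerow(['inmate_id', 'last', 'first'])

# This call is NOT indented under `with`, so the file is already closed
try:
    writer.writerow(['999614', 'Love', 'Kristopher'])
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```

Python raises a `ValueError` because the writer is trying to use a closed file, and only the header row makes it to disk. Keeping the whole `for` loop indented inside the `with` block avoids this.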

In [12]:
with open('inmate-data.csv', 'w', newline='', encoding='utf-8') as in_file:
    writer = csv.writer(in_file)
    writer.writerow(headers)
    for item in rows[1:]:
        cells = item.find_all('td')
        inmate_id = cells[0].text
        link = 'https://www.tdcj.texas.gov/death_row/' + cells[1].a['href']
        last = cells[2].text
        first = cells[3].text
        dob = cells[4].text
        gender = cells[5].text
        race = cells[6].text
        intake_date = cells[7].text
        county = cells[8].text
        offense_date = cells[9].text
        inmate_data = [inmate_id, link, last, first, dob, gender, race, intake_date, county, offense_date]
        writer.writerow(inmate_data)
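
A quick way to sanity-check a CSV is to read it straight back in. The sketch below is self-contained, so it doesn't touch the network or your real output: it writes two of the rows from the printed output above to a hypothetical scratch file (`inmate-data-check.csv` in a temporary folder), then reads them back with `csv.DictReader`, which uses the header row as dictionary keys.

```python
import csv
import os
import tempfile

headers = ['inmate_id', 'link', 'last', 'first', 'dob', 'gender',
           'race', 'intake_date', 'county', 'offense_date']

# Two rows copied from the printed output, standing in for the full scrape
sample_rows = [
    ['999614', 'https://www.tdcj.texas.gov/death_row/dr_info/lovekristopher.html',
     'Love', 'Kristopher', '03/23/1984', 'M', 'Black', '11/15/2018', 'Dallas', '09/02/2015'],
    ['999613', 'https://www.tdcj.texas.gov/death_row/dr_info/lewishoward.html',
     'Lewis', 'Howard', '09/20/1967', 'M', 'Black', '11/09/2018', 'Walker', '07/24/2013'],
]

# Hypothetical scratch file, so the real inmate-data.csv is not overwritten
check_path = os.path.join(tempfile.mkdtemp(), 'inmate-data-check.csv')

with open(check_path, 'w', newline='', encoding='utf-8') as out_file:
    writer = csv.writer(out_file)
    writer.writerow(headers)
    writer.writerows(sample_rows)  # writes every list in sample_rows as a row

# Read the file back; each record is a dict keyed by the header names
with open(check_path, 'r', newline='', encoding='utf-8') as check_file:
    records = list(csv.DictReader(check_file))

print(len(records))
print(records[0]['last'], records[0]['county'])
```

To check the file from the scrape itself, you could point the second `open()` call at `'inmate-data.csv'` instead and compare the number of records with `len(rows[1:])`.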

## Your turn

In groups, [scrape the first page of this table of FDA warning letters](https://www.fda.gov/ICECI/EnforcementActions/WarningLetters/2018/default.htm) into a CSV.