# Filter a CSV

We're going to use built-in Python modules (small programs, really) to download a CSV file from the Internet, save it locally and filter it down to the rows we want.

CSV stands for comma-separated values. It's a common file format that resembles a spreadsheet or database table stored as plain text.
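To make that concrete, here is a minimal sketch that parses a few lines of CSV text held in memory. The bank names and cities are made up for illustration; only the ```csv``` and ```io``` modules from the standard library are used.

```python
import csv
import io

# A tiny, made-up CSV document held in memory instead of on disk.
sample_text = (
    "Bank Name,City,ST\n"
    '"Example Bank, N.A.",Springfield,IL\n'
    "Sample Savings,Fresno,CA\n"
)

# csv.reader turns each line into a list of strings.
# A quoted field may contain a comma without being split in two.
rows = list(csv.reader(io.StringIO(sample_text)))

for row in rows:
    print(row)
```

Each line of text becomes one row, and each comma-separated value becomes one column in that row.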

So first, let's import two built-in Python modules: urllib and csv. 

* ```urllib``` is a module that allows Python to make HTTP requests to URLs on the web and fetch what they return. It contains a submodule called ```request```, and inside it we want a specific function called ```urlretrieve```.

* ```csv``` is a module that helps Python work with tabular data extracted from spreadsheets and databases.

In [87]:
from urllib.request import urlretrieve
import csv

We're going to download a CSV file. What should we name it?

In [88]:
downloaded_file = "banklist.csv"

Now we need a URL to a CSV file out on the Internet.

For this project we're going to download a CSV file that the [FDIC](https://www.fdic.gov/bank/individual/failed/banklist.html) compiles of all the banks that have failed since October 1, 2000.

The file we want is at https://s3.amazonaws.com/datanicar/banklist.csv.

If the Internet is uncooperative, we can also use the local version of the file in the ```project1/data/``` directory, and structure our code a little differently.
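One way that fallback might look is sketched below. The helper name ```fetch_or_fallback```, its ```retrieve``` parameter and the file name ```project1/data/banklist.csv``` are assumptions made for illustration; only the directory is given above.

```python
from pathlib import Path
from urllib.request import urlretrieve


def fetch_or_fallback(url, destination, fallback_path, retrieve=urlretrieve):
    """Return the path of a usable CSV: the fresh download or a local copy.

    Hypothetical helper; `retrieve` can be swapped out, which makes the
    function easy to try without a network connection.
    """
    try:
        retrieve(url, destination)
        return Path(destination)
    except OSError:
        # urllib's URLError and HTTPError are subclasses of OSError,
        # so a failed download lands here and we use the local copy.
        return Path(fallback_path)
```

Calling ```fetch_or_fallback("https://s3.amazonaws.com/datanicar/banklist.csv", "banklist.csv", "project1/data/banklist.csv")``` would then hand back whichever path is safe to open.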

To download the file, we use the ```urlretrieve``` function from the ```urllib``` module. For our purposes starting out, think of it as a way to download a file from the Internet and save it to our project folder.

`urlretrieve` takes two arguments to download a file. First we specify our target URL, and then we give it a name for the file we want to create.

In [89]:
urlretrieve("https://s3.amazonaws.com/datanicar/banklist.csv", downloaded_file)

('banklist.csv', <http.client.HTTPMessage at 0x110c740f0>)

The output shows we successfully downloaded the file and saved it as ```banklist.csv```.
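The pair printed above is what ```urlretrieve``` returns: the saved filename and an object holding the response headers. A minimal sketch of unpacking it follows, assuming a throwaway file in a temporary folder and a ```file://``` URL so it runs without a network connection. Both are stand-ins and not part of this project.

```python
import tempfile
from pathlib import Path
from urllib.request import urlretrieve

with tempfile.TemporaryDirectory() as folder:
    # Stand-in for a remote file: a one-line CSV in a temporary folder.
    source = Path(folder) / "source.csv"
    source.write_text("Bank Name,City,ST\n")

    # A file:// URL lets urlretrieve run without touching the network.
    target = Path(folder) / "copy.csv"
    saved_path, headers = urlretrieve(source.as_uri(), str(target))

    # saved_path is the filename we asked urlretrieve to create.
    print(saved_path)

    # Read the copy back to confirm the contents arrived intact.
    copied_text = target.read_text()
    print(copied_text)
```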

Let's open a new file to hold just the data we want.

In [90]:
# newline='' lets the csv module control line endings (avoids blank rows on Windows)
filtered_file = open('california_banks.csv', 'w', newline='')

We will use the ```csv``` module's ```writer``` function to write data to that file, passing in the open file object as the first argument and the delimiter as the second.
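As a small illustration of those two arguments, the sketch below writes to an in-memory text buffer instead of a real file. The buffer and the sample rows are assumptions for the example.

```python
import csv
import io

# An in-memory text buffer stands in for a file opened with open(..., 'w').
buffer = io.StringIO()

# First argument: the file object. delimiter: the character between columns.
writer = csv.writer(buffer, delimiter=',')
writer.writerow(['Bank Name', 'City', 'ST'])
writer.writerow(['Example Bank, N.A.', 'Fresno', 'CA'])

# Collect everything that was written so we can look at it.
written_text = buffer.getvalue()
print(written_text)
```

Note that the writer adds quotes around a value that itself contains the delimiter, so the row still reads back as three columns.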

Then we will use Python's ```csv``` reader to open the downloaded file and see what is inside.

We specify the name of the file we just downloaded and open it in read mode (```'r'```).
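Before running this on the real file, the read-and-filter pattern can be tried on a few made-up rows. This is only a sketch: the sample data is invented and simply mirrors the column order of the bank list, with the state in the third column.

```python
import csv
import io

# Made-up rows in the same column order as the bank list (name, city, state).
sample_text = (
    "Bank Name,City,ST\n"
    "Example Bank,Springfield,IL\n"
    "Sample Savings,Fresno,CA\n"
    "Placeholder Trust,Napa,CA\n"
)

reader = csv.reader(io.StringIO(sample_text))

# next() pulls the first row off the reader, so the filter below never sees it.
header_row = next(reader)

# Keep only rows whose third column (index 2) is 'CA'.
california_rows = [row for row in reader if row[2] == 'CA']

print(header_row)
print(california_rows)
```

The header is set aside first so it can be written to the output on its own, and it is never compared against ```'CA'```.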

In [91]:
# create our output
output = csv.writer(filtered_file, delimiter=',')

# open our downloaded file
with open(downloaded_file, 'r') as file:
    # use python's csv reader to access the contents
    # and create an object that represents the data
    csv_data = csv.reader(file)
    
    # write our header row to the output csv
    header_row = next(csv_data)
    print(header_row)    
    output.writerow(header_row)
    
    # loop through each row of the csv
    for row in csv_data:

        # now we're going to use an IF statement
        # to find items where the state field
        # is equal to California
        if row[2] == 'CA':
            
            # write the row to the new csv file
            output.writerow(row)          
        
            # and print the row to the terminal
            print(row)

            # print the data type to the terminal
            print(type(row))

            # print the length of the row to the terminal
            print(len(row))            
            
        # otherwise continue on
        else:
            continue
            
# close the output file
filtered_file.close()

['Bank Name', 'City', 'ST', 'CERT', 'Acquiring Institution', 'Closing Date', 'Updated Date']
['Frontier Bank, FSB D/B/A El Paseo Bank', 'Palm Desert', 'CA', '34738', 'Bank of Southern California, N.A.', '7-Nov-14', '10-Nov-16']
<class 'list'>
7
['Palm Desert National Bank', 'Palm Desert', 'CA', '23632', 'Pacific Premier Bank', '27-Apr-12', '7-Dec-15']
<class 'list'>
7
['Citizens Bank of Northern California', 'Nevada City', 'CA', '33983', 'Tri Counties Bank', '23-Sep-11', '7-Jan-18']
<class 'list'>
7
['San Luis Trust Bank, FSB', 'San Luis Obispo', 'CA', '34783', 'First California Bank', '18-Feb-11', '12-Sep-16']
<class 'list'>
7
['Charter Oak Bank', 'Napa', 'CA', '57855', 'Bank of Marin', '18-Feb-11', '29-Jan-19']
<class 'list'>
7
['Canyon National Bank', 'Palm Springs', 'CA', '34692', 'Pacific Premier Bank', '11-Feb-11', '19-Aug-14']
<class 'list'>
7
['First Vietnamese American Bank', 'Westminster', 'CA', '57885', 'Grandpoint Bank', '5-Nov-10', '29-Jan-19']
<class 'list'>
7
['Western C