# Opening Files
For this exercise you will be learning how to open and read in a file. To have files to work with, you must first upload them into the system.
1. Go to [http://tiny.cc/6x5mzz](http://tiny.cc/6x5mzz) to download the file to your local machine.
1. On the left side there is a folder icon that, when clicked, shows the files in the session storage. There is an icon that looks like a file with an up arrow on it. Click on it and select the `state_codes.json` file you downloaded from [http://tiny.cc/6x5mzz](http://tiny.cc/6x5mzz).
1. Repeat the previous steps for [http://tiny.cc/8x5mzz](http://tiny.cc/8x5mzz) (`area-codes-usa.csv`) and [http://tiny.cc/fx5mzz](http://tiny.cc/fx5mzz) (`uszips.csv`).

Once you've uploaded the files into the system you are ready to try and open your first file. You can open and read the contents of a text file using the format:
```python
with open(filename_goes_here,'r') as file_variable_name:
  data_storage_variable = file_variable_name.read()
```
The `filename_goes_here` can be a variable or a literal string containing the name of the file to open. The `'r'` parameter in the `open()` call says that the file is to be opened for reading. The `with` statement binds the result of the expression before the `as` to the variable name after it (in this case `file_variable_name`). It is good practice to open files in a `with` block so that the file is closed automatically when the code within the block completes. The `.read()` method returns the entire contents of the file as a string, and the assignment stores that string in a variable (in this case `data_storage_variable`). The names `filename_goes_here`, `file_variable_name`, and `data_storage_variable` are placeholders; any valid variable names will do.
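
As a rough illustration of the pattern above, the sketch below creates a small throwaway file and reads it back. The file name `example_notes.txt` and all of the `example_...` variable names are invented for this illustration and are not part of the exercise.

```python
import os
import tempfile

# A temporary directory keeps this illustration away from the exercise files.
with tempfile.TemporaryDirectory() as example_dir:
    # Hypothetical file name, used only for this sketch.
    example_filename = os.path.join(example_dir, 'example_notes.txt')

    # Create a small file so there is something to read.
    with open(example_filename, 'w') as example_out:
        example_out.write('first line\nsecond line\n')

    # Open the file for reading; it is closed when the with block ends.
    with open(example_filename, 'r') as example_in:
        example_text = example_in.read()

print(type(example_text))             # <class 'str'>
print(example_in.closed)              # True
print(example_text.split('\n')[:2])   # ['first line', 'second line']
```

Note that `.read()` hands back the whole file as one string, which is why the exercise splits on `'\n'` to look at individual lines.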

1. Create a variable called `state_code_filename` to hold the filename of the `state_codes.json`.
1. Open the `state_codes.json` file and read the contents into a variable named `state_code_raw_data`.
1. You should be able to print the first few lines of the file using the line:
```python
print('\n'.join(state_code_raw_data.split('\n')[:5]))
```
and should see something like:
```
[
  {
    "state": "Alabama",
    "abbrev": "Ala.",
    "code": "AL"
```
1. Verify your code works with the test block afterward.

Data Sources: The data for `state_codes.json` is from [here](https://worldpopulationreview.com/states/state-abbreviations), the data for `area-codes-usa.csv` is from [here](https://dedolist.com/lists/business/area-codes-usa/csv/), and the data for `uszips.csv` is from [here](https://simplemaps.com/data/us-zips).


In [3]:
# Replace ... with your code
state_code_filename = 'state_codes.json'
with open(state_code_filename,'r') as m:
    state_code_raw_data = m.read()
    
print('\n'.join(state_code_raw_data.split('\n')[:5]))

[
  {
    "state": "Alabama",
    "abbrev": "Ala.",
    "code": "AL"


Run the code block below after running your code to check that it is correct. If you see an output of `Correct` below the block everything is correct. If you see something like `AssertionError`, then you have a mistake.

In [4]:
# Don't modify this code
import hashlib
assert 'state_code_filename' in locals()
assert type(state_code_filename) == str
assert state_code_filename == 'state_codes.json'
assert 'state_code_raw_data' in locals()
assert type(state_code_raw_data) == str
assert len(state_code_raw_data) == 3773
assert hashlib.md5(state_code_raw_data.encode('utf-8')).hexdigest() == 'aa6718b8b969137de605a87f67b15331'
print('Correct')

Correct


# Loading JSON Files
For this exercise you will be opening a JSON file and loading it into a Python variable as actual data objects rather than plain text. You will also be creating two translation dictionaries for translating between state names and codes. There are multiple ways to load JSON files, but we will be using the [`json.load()`](https://docs.python.org/3/library/json.html#json.load) function. The [`json.load()`](https://docs.python.org/3/library/json.html#json.load) function takes a file object as a parameter, so instead of the previous line that read in the file:
```python
  data_storage_variable = file_variable_name.read()
```
You will instead use:
```python
  data_storage_variable = json.load(file_variable_name)
```
The `data_storage_variable` will be an object with the JSON file contents instead of just a string of the characters in the file. You will also need to iterate over all values in a list using a `for` loop. The syntax of the `for` loop is:
```python
for item in some_list:
  ... # do something with item here
```
On each iteration of the loop, `item` takes the value of the next element of `some_list`. Now load the JSON file and create a translation dictionary between state names and codes.
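
To make the difference between a raw string and loaded JSON concrete, here is a minimal sketch using made-up data. The `io.StringIO` object stands in for an opened file so that nothing needs to exist on disk; the fruit data and the `example_...` and `name_to_color` names are assumptions for illustration only.

```python
import io
import json

# Hypothetical JSON text standing in for the contents of a file.
example_json_text = '[{"name": "apple", "color": "red"}, {"name": "banana", "color": "yellow"}]'

# io.StringIO behaves like an opened text file, so json.load() accepts it.
example_json_file = io.StringIO(example_json_text)
example_json_data = json.load(example_json_file)

print(type(example_json_text))     # <class 'str'>
print(type(example_json_data))     # <class 'list'>
print(type(example_json_data[0]))  # <class 'dict'>

# Build a lookup dictionary by visiting every item in the list.
name_to_color = dict()
for item in example_json_data:
    name_to_color[item['name']] = item['color']

print(name_to_color['banana'])     # yellow
```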

1. Make sure that your previous exercise works and that it has been run; otherwise create the `state_code_filename` variable.
1. Open the `state_codes.json` file and read JSON data into a variable named `state_code_data`.
1. If you print the variable you should see something like:
```
[{'state': 'Alabama', 'abbrev': 'Ala.', 'code': 'AL'}, {'state': 'Alaska', 'abbrev': 'Alaska', 'code': 'AK'}, ...
```
1. Create an empty dictionary variable called `state_name_to_code`. An empty dictionary is just `dict()`.
1. Iterate through all of the states in the `state_code_data` with a for loop, and insert the state code values mapped to their state name keys into the `state_name_to_code` dictionary. The `'state'` key is the name of the state, and the `'code'` key is the code for each state.
1. Repeat the previous two steps for `state_code_to_name`, which will translate from codes to names.
1. Verify your code works with the test block afterward.

In [15]:
import json
# Replace ... with your code
state_code_filename = 'state_codes.json'
with open(state_code_filename,'r') as m:
    state_code_data = json.load(m)
    
state_name_to_code = dict()
state_code_to_name = dict()

for state in state_code_data:
    state_name = state['state']
    state_code = state['code']
    state_name_to_code[state_name] = state_code
    state_code_to_name[state_code] = state_name




Run the code block below after running your code to check that it is correct. If you see an output of `Correct` below the block everything is correct. If you see something like `AssertionError`, then you have a mistake.

In [16]:
# Don't modify this code
assert 'state_code_data' in locals()
assert type(state_code_data) == list
assert len(state_code_data) == 51
assert 'state_name_to_code' in locals()
assert type(state_name_to_code) == dict
assert len(state_name_to_code) == 51
assert state_name_to_code['California'] == 'CA'
assert 'state_code_to_name' in locals()
assert type(state_code_to_name) == dict
assert len(state_code_to_name) == 51
assert state_code_to_name['CA'] == 'California'
print('Correct')

Correct


# Loading CSV
For this exercise you will be loading the area and zip code CSV files. Similar to loading JSON data, there are functions for loading CSV data. There are multiple ways to read in CSV data, but for this exercise you will be using [`csv.DictReader()`](https://docs.python.org/3/library/csv.html#csv.DictReader) to read in the rows of the CSV file as dictionaries. Like [`json.load()`](https://docs.python.org/3/library/json.html#json.load), [`csv.DictReader()`](https://docs.python.org/3/library/csv.html#csv.DictReader) takes a file object as a parameter; however, it creates a reader object that yields one row at a time, which you will want to convert to a list. So instead of the previous line that read in the file:
```python
  data_storage_variable = file_variable_name.read()
```
You will instead use:
```python
  data_storage_variable = list(csv.DictReader(file_variable_name))
```
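
A small sketch of why the `list(...)` conversion matters is shown below. It uses invented CSV text wrapped in `io.StringIO` in place of a real file; the column names and `example_...` variables are assumptions for illustration.

```python
import csv
import io

# Hypothetical CSV text; the first line is the header row.
example_csv_text = 'name,count\napple,3\nbanana,12\n'
example_csv_file = io.StringIO(example_csv_text)

example_reader = csv.DictReader(example_csv_file)
example_rows = list(example_reader)

print(len(example_rows))          # 2, the header row becomes the keys
print(example_rows[0]['name'])    # apple
print(example_rows[0]['count'])   # 3, but as the string '3', not an int

# The reader can only be consumed once; a second pass finds nothing left.
print(list(example_reader))       # []
```

Because every value comes back as a string, numeric columns need an explicit conversion such as `int(...)` or `float(...)` before doing arithmetic on them.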

1. Create a variable called `area_code_filename` to hold the filename of the `area-codes-usa.csv`.
1. Open the `area-codes-usa.csv` file and read the contents into a variable named `area_code_data`.
1. If you print the first item of the `area_code_data` you should see something like:
```
{'area-code': '201', 'city': 'Bayonne', 'state': 'New Jersey', 'country': 'US', 'latitude': '40.66871', 'longitude': '-74.11431'}
```
1. Create a variable called `zip_code_filename` to hold the filename of the `uszips.csv`.
1. Open the `uszips.csv` file and read the contents into a variable named `zip_code_data`.
1. If you print the first item of the `zip_code_data` you should see something like:
```
{'zip': '00601', 'lat': '18.18027', 'lng': '-66.75266', 'city': 'Adjuntas', 'state_id': 'PR', 'state_name': 'Puerto Rico', ...
```
1. Verify your code works with the test block afterward.

In [21]:
import csv
# Replace ... with your code
area_code_filename = 'area-codes-usa.csv'
zip_code_filename = 'uszips.csv'

with open(area_code_filename, 'r') as a:
    area_code_data = list(csv.DictReader(a))
with open(zip_code_filename, 'r') as z:
    zip_code_data = list(csv.DictReader(z))
    
zip_code_data


[{'zip': '00601',
  'lat': '18.18027',
  'lng': '-66.75266',
  'city': 'Adjuntas',
  'state_id': 'PR',
  'state_name': 'Puerto Rico',
  'zcta': 'TRUE',
  'parent_zcta': '',
  'population': '16773',
  'density': '100.5',
  'county_fips': '72001',
  'county_name': 'Adjuntas',
  'county_weights': '{"72001": 98.73, "72141": 1.27}',
  'county_names_all': 'Adjuntas|Utuado',
  'county_fips_all': '72001|72141',
  'imprecise': 'FALSE',
  'military': 'FALSE',
  'timezone': 'America/Puerto_Rico'},
 {'zip': '00602',
  'lat': '18.36075',
  'lng': '-67.17541',
  'city': 'Aguada',
  'state_id': 'PR',
  'state_name': 'Puerto Rico',
  'zcta': 'TRUE',
  'parent_zcta': '',
  'population': '37083',
  'density': '472.1',
  'county_fips': '72003',
  'county_name': 'Aguada',
  'county_weights': '{"72003": 100}',
  'county_names_all': 'Aguada',
  'county_fips_all': '72003',
  'imprecise': 'FALSE',
  'military': 'FALSE',
  'timezone': 'America/Puerto_Rico'},
 {'zip': '00603',
  'lat': '18.45744',
  'lng': '-67

Run the code block below after running your code to check that it is correct. If you see an output of `Correct` below the block everything is correct. If you see something like `AssertionError`, then you have a mistake.

In [20]:
# Don't modify this code
assert 'area_code_filename' in locals()
assert type(area_code_filename) == str
assert area_code_filename == 'area-codes-usa.csv'
assert 'area_code_data' in locals()
assert type(area_code_data) == list
assert len(area_code_data) == 2766
assert type(area_code_data[0]) == dict
assert 'zip_code_data' in locals()
assert type(zip_code_data) == list
assert len(zip_code_data) == 33788
assert type(zip_code_data[0]) == dict
print('Correct')

Correct


# Combining Data
Now that you have hopefully read in the files successfully, we will try to combine the area code and zip code data into a single dictionary.
1. Create an empty dictionary called `city_state_to_zips`. This will map a tuple of city name and state code, for example `('Davis','CA')`, to a list of zip codes such as `['95616','95618']`.
1. Iterate through the `zip_code_data` and in the loop do the following:
  1. Create a tuple composed of the city name `'city'` and state code `'state_id'` from the zip code dictionaries that will be used as a key in the `city_state_to_zips` dictionary.
  1. If the tuple is `not in` the `city_state_to_zips` dictionary, then assign the value of that tuple to an empty `list()`. This will allow for appending the zip codes in the next step.
  1. Append the zip code `'zip'` from the zip code dictionary to the list for the city/state in the `city_state_to_zips` dictionary using the `.append()` method.
1. If you print the zip codes for Davis, CA with `print(city_state_to_zips[('Davis','CA')])` you should see `['95616', '95618']`.
1. Create an empty dictionary called `area_code_to_zips`. This will map area codes to a list of zip codes.
1. Iterate through the `area_code_data` and in the loop do the following:
  1. If the `'area-code'` is `not in` the `area_code_to_zips` dictionary, then assign the value for that area code to an empty `list()`. This will allow for extending the list of zip codes in a later step.
  1. Create a tuple composed of the city name `'city'` and state code from the area code dictionaries that will be used as a key in the `city_state_to_zips` dictionary. The area code dictionaries only have state name `'state'`. Use the `state_name_to_code` dictionary to convert the state name to state code.
  1. If the tuple is `in` the `city_state_to_zips` dictionary, then extend the `area_code_to_zips` zip code list with the list of zips from `city_state_to_zips` using the `.extend()` method.
1. If you print out the sorted zip codes for '530' with `print(sorted(area_code_to_zips['530']))` you should see something like `['95616', '95618', '95695', '95776', '95926', '95928', '95929', '95969', ...`
1. Verify your code works with the test block afterward.
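
The steps above rely on two details that are easy to mix up: using a tuple as a dictionary key, and the difference between `.append()` and `.extend()`. The sketch below shows both with invented values; the city, state code, zip codes, and `example_...` names are hypothetical.

```python
# Hypothetical city/state key; tuples work as dictionary keys because they are immutable.
example_key = ('Springfield', 'AA')

# Create the list the first time a key is seen, then append to it.
example_city_to_zips = dict()
if example_key not in example_city_to_zips:
    example_city_to_zips[example_key] = list()
example_city_to_zips[example_key].append('10001')
example_city_to_zips[example_key].append('10002')

# .extend() adds each element of another list individually.
example_extended = ['99999']
example_extended.extend(example_city_to_zips[example_key])
print(example_extended)   # ['99999', '10001', '10002']

# .append() adds the other list as a single nested element.
example_appended = ['99999']
example_appended.append(example_city_to_zips[example_key])
print(example_appended)   # ['99999', ['10001', '10002']]
```

For a flat list of zip codes per area code, `.extend()` is the method that gives the intended shape.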
  



In [27]:
area_code_data

[{'area-code': '201',
  'city': 'Bayonne',
  'state': 'New Jersey',
  'country': 'US',
  'latitude': '40.66871',
  'longitude': '-74.11431'},
 {'area-code': '201',
  'city': 'Bergenfield',
  'state': 'New Jersey',
  'country': 'US',
  'latitude': '40.9276',
  'longitude': '-73.99736'},
 {'area-code': '201',
  'city': 'Cliffside Park',
  'state': 'New Jersey',
  'country': 'US',
  'latitude': '40.82149',
  'longitude': '-73.98764'},
 {'area-code': '201',
  'city': 'Englewood',
  'state': 'New Jersey',
  'country': 'US',
  'latitude': '40.89288',
  'longitude': '-73.97264'},
 {'area-code': '201',
  'city': 'Fair Lawn',
  'state': 'New Jersey',
  'country': 'US',
  'latitude': '40.94038',
  'longitude': '-74.13181'},
 {'area-code': '201',
  'city': 'Fort Lee',
  'state': 'New Jersey',
  'country': 'US',
  'latitude': '40.85093',
  'longitude': '-73.97014'},
 {'area-code': '201',
  'city': 'Hackensack',
  'state': 'New Jersey',
  'country': 'US',
  'latitude': '40.88593',
  'longitude': '-

In [33]:
# Replace ... with your code
city_state_to_zips = dict()
area_code_to_zips= dict()

for z in zip_code_data:
    city = z['city']
    state_id = z['state_id']
    zip_val = z['zip']
    if (city, state_id) not in city_state_to_zips:
        city_state_to_zips[(city, state_id)]= list()
    city_state_to_zips[(city, state_id)].append(zip_val)
    
for a in area_code_data:
    area_code = a['area-code']
    city_name = a['city']
    state_Name = a['state']
    state_Code = state_name_to_code[state_Name]
    
    if area_code not in area_code_to_zips:
        area_code_to_zips[area_code]= list()
        
    new_tuple = (city_name,state_Code)
    
    if new_tuple in city_state_to_zips:
        area_code_to_zips[area_code].extend(city_state_to_zips[new_tuple])
    
    
print(sorted(area_code_to_zips['530']))

    

['95616', '95618', '95695', '95776', '95926', '95928', '95929', '95969', '95973', '95991', '95993', '96001', '96002', '96003', '96150', '96155']


Run the code block below after running your code to check that it is correct. If you see an output of `Correct` below the block everything is correct. If you see something like `AssertionError`, then you have a mistake.

In [34]:
# Don't modify this code
assert 'city_state_to_zips' in locals()
assert type(city_state_to_zips) == dict
assert len(city_state_to_zips) == 28113
assert city_state_to_zips[('Davis','CA')] == ['95616', '95618']
assert 'area_code_to_zips' in locals()
assert type(area_code_to_zips) == dict
assert len(area_code_to_zips) == 298
assert sorted(area_code_to_zips['530']) == ['95616', '95618', '95695', '95776', '95926', '95928', '95929', '95969', '95973', '95991', '95993', '96001', '96002', '96003', '96150', '96155']
print('Correct')

Correct


# Challenge (Saving Files)
For this exercise you will be learning how to open a file and save data in it. You can open a file (or create a new one) and write contents to it as text using the format:
```python
with open(filename_goes_here,'w') as file_variable_name:
  file_variable_name.write(data_to_store)
```
**NOTE: When opening a file with the `'w'` flag, if the file exists its contents will be removed! The `'a'` flag can be used to avoid truncating the file; however, the data will then be written to the end of the file.**

The `filename_goes_here` can be a variable or a literal string containing the name of the file to open. The `'w'` parameter in the `open()` call says that the file is to be opened for writing. The `with` statement binds the result of the expression before the `as` to the variable name after it (in this case `file_variable_name`). It is good practice to open files in a `with` block so that the file is closed automatically when the code within the block completes. The `.write()` method writes the contents of `data_to_store` to the file. The names `filename_goes_here`, `file_variable_name`, and `data_to_store` are placeholders; any valid variable names will do.
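
The difference between the `'w'` and `'a'` flags can be seen in a short sketch like the one below. It works in a temporary directory so that no exercise file is touched; the file name `example_log.txt` and the `example_...` variables are invented for this illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as example_dir:
    # Hypothetical file name, used only for this sketch.
    example_path = os.path.join(example_dir, 'example_log.txt')

    # 'w' creates the file if it does not exist.
    with open(example_path, 'w') as example_file:
        example_file.write('one\n')

    # Opening with 'w' again discards the earlier contents.
    with open(example_path, 'w') as example_file:
        example_file.write('two\n')

    # 'a' keeps what is there and writes at the end.
    with open(example_path, 'a') as example_file:
        example_file.write('three\n')

    with open(example_path, 'r') as example_file:
        example_contents = example_file.read()

print(repr(example_contents))   # 'two\nthree\n'
```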

1. Create a variable called `state_code_out_filename` to hold the filename of the `state_codes_copy.json` (you will be copying the file from the first section).
1. Open the `state_codes_copy.json` file and write the contents of your variable `state_code_raw_data` from earlier.
1. You should be able to see that the `state_codes_copy.json` file is in your list of files now (You may have to refresh).
1. Verify your code works with the test block afterward.



In [36]:
# Replace ... with your code
state_code_out_filename = 'state_codes_copy.json'

with open(state_code_out_filename, 'w') as m:
    m.write(str(state_code_raw_data))
    

Run the code block below after running your code to check that it is correct. If you see an output of `Correct` below the block everything is correct. If you see something like `AssertionError`, then you have a mistake.

In [None]:
# Don't modify this code
import os
import filecmp
import hashlib
assert 'state_code_out_filename' in locals()
assert type(state_code_out_filename) == str
assert state_code_out_filename == 'state_codes_copy.json'
assert 'state_code_raw_data' in locals()
assert type(state_code_raw_data) == str
assert len(state_code_raw_data) == 3773
assert hashlib.md5(state_code_raw_data.encode('utf-8')).hexdigest() == 'aa6718b8b969137de605a87f67b15331'
assert os.path.isfile('state_codes.json')
assert os.path.isfile('state_codes_copy.json')
assert filecmp.cmp('state_codes.json','state_codes_copy.json')
print('Correct')

# Challenge (Saving CSV Files)
For this exercise you will be learning how to save data in a CSV formatted file. You will create a file just as you did in the previous section; however, you will be using `csv.writer` to output in CSV format:
```python
with open(filename_goes_here,'w') as file_variable_name:
  csv_file_variable_name = csv.writer(file_variable_name,quoting=csv.QUOTE_ALL)
  csv_file_variable_name.writerow(list_or_tuple_to_write)
```
The `filename_goes_here` can be a variable or a literal string containing the name of the file to open. The `'w'` parameter in the `open()` call says that the file is to be opened for writing. The `with` statement binds the result of the expression before the `as` to the variable name after it (in this case `file_variable_name`). It is good practice to open files in a `with` block so that the file is closed automatically when the code within the block completes. The `csv_file_variable_name` is the variable that holds the writer object returned by [`csv.writer()`](https://docs.python.org/3/library/csv.html#csv.writer). To create the writer, [`csv.writer()`](https://docs.python.org/3/library/csv.html#csv.writer) needs the output file variable, and in this code there is an added specification for the `quoting` of the values. The `csv.QUOTE_ALL` option quotes all values in the CSV file. This results in slightly larger files, but can avoid ambiguous interpretations of the data when it is read later. The [`.writerow()`](https://docs.python.org/3/library/csv.html#csv.csvwriter.writerow) method writes the `list_or_tuple_to_write` values to the file. The most common use of [`csv.writer()`](https://docs.python.org/3/library/csv.html#csv.writer) has one call to [`.writerow()`](https://docs.python.org/3/library/csv.html#csv.csvwriter.writerow) to output the header of the CSV file, followed by a loop with a call to [`.writerow()`](https://docs.python.org/3/library/csv.html#csv.csvwriter.writerow) in it to write each data row of the file. The names `filename_goes_here`, `file_variable_name`, `csv_file_variable_name`, and `list_or_tuple_to_write` are placeholders; any valid variable names will do.
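
The header-then-loop pattern described above might look like the following sketch. It writes into an `io.StringIO` buffer rather than a real file so the result can be inspected directly; the fruit data and the `example_...` names are made up for illustration.

```python
import csv
import io

# io.StringIO stands in for a file opened with 'w' (an assumption for this sketch).
example_buffer = io.StringIO()
example_writer = csv.writer(example_buffer, quoting=csv.QUOTE_ALL)

# One call for the header row, then one call per data row.
example_writer.writerow(['name', 'count'])
for example_row in [('apple', 3), ('banana', 12)]:
    example_writer.writerow(example_row)

example_output = example_buffer.getvalue()
print(example_output)
# "name","count"
# "apple","3"
# "banana","12"

# Reading the text back gives one dictionary per data row, with string values.
example_read_back = list(csv.DictReader(io.StringIO(example_output)))
print(example_read_back[1]['count'])   # 12
```

With `csv.QUOTE_ALL` even the numbers are wrapped in quotes. When writing to a real file, the `csv` module documentation recommends passing `newline=''` to `open()` so that line endings are not translated a second time on some platforms.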

1. Create a variable called `zip_to_area_code_filename` to hold the filename of the `zip_to_area_code.csv` (you will be writing out data from the previous combining data section).
1. Open the `zip_to_area_code.csv` file and create a [`csv.writer()`](https://docs.python.org/3/library/csv.html#csv.writer) called `zip_to_area_code_csv_writer`.
1. Write the header row of `['zip','area code']` out.
1. Iterate through the `area_code_data` and in the loop do the following:
  1. Iterate through each zip code in the area code `area_code_data` and in the loop do the following:
    1. Write the data row of zip code and area code.
1. You should be able to see that the `zip_to_area_code.csv` file is in your list of files now.
1. Verify your code works with the test block afterward.

In [None]:
import csv
# Replace ... with your code
...

Run the code block below after running your code to check that it is correct. If you see an output of `Correct` below the block everything is correct. If you see something like `AssertionError`, then you have a mistake.

In [None]:
# Don't modify this code
import os
import filecmp
import hashlib
assert 'zip_to_area_code_filename' in locals()
assert type(zip_to_area_code_filename) == str
assert zip_to_area_code_filename == 'zip_to_area_code.csv'
assert 'zip_to_area_code_csv_writer' in locals()
assert hasattr(zip_to_area_code_csv_writer,'writerow')
assert callable(zip_to_area_code_csv_writer.writerow)
assert os.path.isfile('zip_to_area_code.csv')
with open('zip_to_area_code.csv','r') as in_file:
  zip_area_dict_list = list(csv.DictReader(in_file))
  assert len(zip_area_dict_list) == 8384
  for zip_area_dict in zip_area_dict_list:
    assert len(zip_area_dict) == 2
    assert 'zip' in zip_area_dict
    assert 'area code' in zip_area_dict
    assert zip_area_dict['area code'] in area_code_to_zips
    assert zip_area_dict['zip'] in area_code_to_zips[zip_area_dict['area code']]
print('Correct')