## Scraping Food Environment Atlas
After exploring the data in ArcGIS, I decided that the best way to gather it would be through automated downloads. The plan:
1. Download the data for a single county and store it in a dataframe
2. Transpose the dataframe so that rows become columns
3. Merge the data into a final dataframe


In [218]:
import pandas as pd
import numpy as np
import webbrowser
import glob
import time

In [219]:
counties = ["Anderson",
            "Bedford",
            "Benton",
            "Bledsoe",
            "Blount",
            "Bradley",
            "Campbell",
            "Cannon",
            "Carroll",
            "Carter",
            "Cheatham",
            "Chester",
            "Claiborne",
            "Clay",
            "Cocke",
            "Coffee",
            "Crockett",
            "Cumberland",
            "Davidson",
            "Decatur",
            "DeKalb", 
            "Dickson", 
            "Dyer", 
            "Fayette", 
            "Fentress", 
            "Franklin", 
            "Gibson", 
            "Giles", 
            "Grainger", 
            "Greene", 
            "Grundy", 
            "Hamblen", 
            "Hamilton", 
            "Hancock", 
            "Hardeman", 
            "Hardin", 
            "Hawkins", 
            "Haywood", 
            "Henderson", 
            "Henry", 
            "Hickman", 
            "Houston", 
            "Humphreys", 
            "Jackson", 
            "Jefferson", 
            "Johnson", 
            "Knox", 
            "Lake", 
            "Lauderdale", 
            "Lawrence", 
            "Lewis", 
            "Lincoln", 
            "Loudon", 
            "McMinn", 
            "McNairy", 
            "Macon", 
            "Madison", 
            "Marion", 
            "Marshall", 
            "Maury", 
            "Meigs", 
            "Monroe", 
            "Montgomery", 
            "Moore", 
            "Morgan", 
            "Obion", 
            "Overton", 
            "Perry", 
            "Pickett", 
            "Polk", 
            "Putnam", 
            "Rhea", 
            "Roane", 
            "Robertson", 
            "Rutherford", 
            "Scott", 
            "Sequatchie", 
            "Sevier", 
            "Shelby", 
            "Smith", 
            "Stewart", 
            "Sullivan", 
            "Sumner", 
            "Tipton", 
            "Trousdale", 
            "Unicoi", 
            "Union", 
            "Van Buren", 
            "Warren", 
            "Washington", 
            "Wayne", 
            "Weakley", 
            "White", 
            "Williamson", 
            "Wilson" ]

In [174]:
county = "Hamilton"  # county spliced into the URL; Hamilton is the county shown in the test output further down
grocery_access_url = "data:text/csv;charset=utf-8,%20%0A%22State%22,%22TN%22%0A%22County%22,%22"+ county +"%22%0A%22Population,%20low%20access%20to%20store,%202010%22,%227667.187882%22%0A%22Population,%20low%20access%20to%20store,%202015%22,%228150.826633%22%0A%22Population,%20low%20access%20to%20store%20(%25%20change),%202010%20-15%22,%226.307903%22%0A%22Population,%20low%20access%20to%20store%20(%25),%202010%22,%2218.676771%22%0A%22Population,%20low%20access%20to%20store%20(%25),%202015%22,%2219.854883%22%0A%22Low%20income%20&%20low%20access%20to%20store,%202010%22,%223127.597765%22%0A%22Low%20income%20&%20low%20access%20to%20store,%202015%22,%222905.61274%22%0A%22Low%20income%20&%20low%20access%20to%20store%20(%25%20change),%202010%20-%2015%22,%22-7.097621%22%0A%22Low%20income%20&%20low%20access%20to%20store%20(%25),%202010%22,%227.618625%22%0A%22Low%20income%20&%20low%20access%20to%20store%20(%25),%202015%22,%227.077884%22%0A%22Households,%20no%20car%20&%20low%20access%20to%20store,%202010%22,%22490.486131%22%0A%22Households,%20no%20car%20&%20low%20access%20to%20store,%202015%22,%22710.07845%22%0A%22Households,%20no%20car%20&%20low%20access%20to%20store%20(%25%20change),%202010%20-%2015%22,%2244.770342%22%0A%22Households,%20no%20car%20&%20low%20access%20to%20store%20(%25),%202010%22,%223.063432%22%0A%22Households,%20no%20car%20&%20low%20access%20to%20store%20(%25),%202015%22,%224.434941%22%0A%22SNAP%20households,%20low%20access%20to%20store,%202015%22,%22617.031408%22%0A%22SNAP%20households,%20low%20access%20to%20store%20(%25),%202015%22,%223.853797%22%0A%22Children,%20low%20access%20to%20store,%202010%22,%221790.365531%22%0A%22Children,%20low%20access%20to%20store,%202015%22,%221869.284048%22%0A%22Children,%20low%20access%20to%20store%20(%25%20change),%202010%20-%2015%22,%224.407956%22%0A%22Children,%20low%20access%20to%20store%20(%25),%202010%22,%224.361214%22%0A%22Children,%20low%20access%20to%20store%20(%25),%202015%22,%224.553454%22%0A%22Seniors,%20low%20ac
cess%20to%20store,%202010%22,%221423.21609%22%0A%22Seniors,%20low%20access%20to%20store,%202015%22,%221542.465898%22%0A%22Seniors,%20low%20access%20to%20store%20(%25%20change),%202010%20-15%22,%228.378897%22%0A%22Seniors,%20low%20access%20to%20store%20(%25),%202010%22,%223.466862%22%0A%22Seniors,%20low%20access%20to%20store%20(%25),%202015%22,%223.757347%22%0A%22White,%20low%20access%20to%20store,%202015%22,%227431.71159%22%0A%22White,%20low%20access%20to%20store%20(%25),%202015%22,%2218.103166%22%0A%22Black,%20low%20access%20to%20store,%202015%22,%22369.368139%22%0A%22Black,%20low%20access%20to%20store%20(%25),%202015%22,%220.899757%22%0A%22Hispanic%20ethnicity,%20low%20access%20to%20store,%202015%22,%22198.378837%22%0A%22Hispanic%20ethnicity,%20low%20access%20to%20store%20(%25),%202015%22,%220.483238%22%0A%22Asian,%20low%20access%20to%20store,%202015%22,%2249.868993%22%0A%22Asian,%20low%20access%20to%20store%20(%25),%202015%22,%220.121478%22%0A%22American%20Indian%20or%20Alaska%20Native,%20low%20access%20to%20store,%202015%22,%2230.148045%22%0A%22American%20Indian%20or%20Alaska%20Native,%20low%20access%20to%20store%20(%25),%202015%22,%220.073439%22%0A%22Hawaiian%20or%20Pacific%20Islander,%20low%20access%20to%20store,%202015%22,%225.503696%22%0A%22Hawaiian%20or%20Pacific%20Islander,%20low%20access%20to%20store%20(%25),%202015%22,%220.013407%22%0A%22Multiracial,%20low%20access%20to%20store,%202015%22,%22264.226188%22%0A%22Multiracial,%20low%20access%20to%20store%20(%25),%202015%22,%220.643638%22%0A"

In [10]:
webbrowser.open_new_tab(grocery_access_url)

True

### Function to automate downloads

In [184]:
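def download(url1, url2, county_list):
    """Open one browser tab per county; the browser is expected to save each data URL as a CSV."""
    for county in county_list:
        # Splice the county name between the two halves of the URL
        url = url1 + county + url2
        webbrowser.open_new_tab(url)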
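Some county names, such as "Van Buren", contain a space. A minimal sketch of building the URLs separately from opening them, with each name percent-encoded, might look like the following. `build_urls` is a hypothetical helper, and the short URL halves are placeholders rather than the real ones.

```python
from urllib.parse import quote


def build_urls(url1, url2, county_list):
    """Return one URL per county, percent-encoding each county name."""
    return [url1 + quote(county) + url2 for county in county_list]


# Placeholder URL halves, much shorter than the real ones
urls = build_urls(
    "data:text/csv;charset=utf-8,%22County%22,%22",
    "%22%0A",
    ["Knox", "Van Buren"],
)
for url in urls:
    print(url)
```

Keeping URL construction apart from `webbrowser.open_new_tab` also makes it possible to inspect the URLs before any tab is opened.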

### Importing files into a df and merging into an empty df

In [118]:
df_test = pd.read_csv("Datasets/download.csv")

In [119]:
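# Use the "State" column as the index, then flip rows into columns
df_test = df_test.set_index("State").transpose().reset_index(drop=True)

In [128]:
df_test.columns.names = ['index']

In [129]:
# Empty dataframe with the same columns, used as the merge target
df2_test = pd.DataFrame(columns=df_test.columns)

In [132]:
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df2_test = pd.concat([df2_test, df_test.iloc[[0]]])

In [133]:
pd.concat([df2_test, df_test.iloc[[0]]])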

index,County,"Population, low access to store, 2010","Population, low access to store, 2015","Population, low access to store (% change), 2010 -15","Population, low access to store (%), 2010","Population, low access to store (%), 2015","Low income & low access to store, 2010","Low income & low access to store, 2015","Low income & low access to store (% change), 2010 - 15","Low income & low access to store (%), 2010",...,"Hispanic ethnicity, low access to store, 2015","Hispanic ethnicity, low access to store (%), 2015","Asian, low access to store, 2015","Asian, low access to store (%), 2015","American Indian or Alaska Native, low access to store, 2015","American Indian or Alaska Native, low access to store (%), 2015","Hawaiian or Pacific Islander, low access to store, 2015","Hawaiian or Pacific Islander, low access to store (%), 2015","Multiracial, low access to store, 2015","Multiracial, low access to store (%), 2015"
0,Hamilton,7667.187882,8150.826633,6.307903,18.676771,19.854883,3127.597765,2905.61274,-7.097621,7.618625,...,198.378837,0.483238,49.868993,0.121478,30.148045,0.073439,5.503696,0.013407,264.226188,0.643638
0,Hamilton,7667.187882,8150.826633,6.307903,18.676771,19.854883,3127.597765,2905.61274,-7.097621,7.618625,...,198.378837,0.483238,49.868993,0.121478,30.148045,0.073439,5.503696,0.013407,264.226188,0.643638
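The reshaping step can be tried without any download by reading a small in-memory sample. The sketch below assumes the same key/value layout as the downloaded file, cut down to two data rows; `sample_csv`, `raw` and `wide` are illustrative names.

```python
import io

import pandas as pd

# Hypothetical sample in the same key/value layout as the downloaded CSV
sample_csv = io.StringIO(
    '"State","TN"\n'
    '"County","Hamilton"\n'
    '"Population, low access to store, 2010","7667.187882"\n'
)

raw = pd.read_csv(sample_csv)  # two columns: "State" and "TN"

# Index by the label column, then transpose so each label becomes a column
wide = raw.set_index("State").transpose().reset_index(drop=True)
wide.columns.names = ["index"]

print(wide)
```

Because the value column mixes text ("Hamilton") with numbers, the numbers come out of `read_csv` as strings here, so a call such as `pd.to_numeric` would be needed before doing arithmetic on them.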


In [220]:
def import_and_merge(files: list):
    """Read each county CSV, transpose it to a single row, and stack the rows."""
    frames = []
    for file in files:
        df = pd.read_csv(file)
        # Rows become columns: one row per county
        df = df.set_index("State").transpose().reset_index(drop=True)
        df.columns.names = ['index']
        frames.append(df)
    # DataFrame.append was removed in pandas 2.0, so concatenate once at the end
    return pd.concat(frames)


In [232]:
def combined_import(url1, url2, counties, path, name):
    download(url1, url2, counties)
    time.sleep(5)  # give the browser a moment to finish writing the downloads
    file_names = glob.glob(path + "*.csv")  # all CSV file names under path
    final_df = import_and_merge(file_names).reset_index(drop=True)
    final_df.to_pickle(path + name)  # cache the merged dataframe next to the CSVs
    return final_df
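A fixed `time.sleep(5)` may be too short when the browser has many tabs to process. One alternative, sketched here under the assumption that the browser saves its downloads into the folder being globbed, is to poll until the expected number of files has appeared. `wait_for_files` is a hypothetical helper, not part of the original notebook.

```python
import glob
import time


def wait_for_files(pattern, expected, timeout=60.0, poll=0.5):
    """Poll until at least `expected` files match `pattern`, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        matches = glob.glob(pattern)
        if len(matches) >= expected:
            return matches
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"only {len(matches)} of {expected} files appeared"
            )
        time.sleep(poll)
```

Inside `combined_import`, the sleep and the glob could then be replaced by something like `wait_for_files(path + "*.csv", expected=len(counties))`.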

## Using the function on grocery data

In [229]:
url1_grocery = "data:text/csv;charset=utf-8,%20%0A%22State%22,%22TN%22%0A%22County%22,%22"

In [230]:
url2_grocery = "%22%0A%22Population,%20low%20access%20to%20store,%202010%22,%227667.187882%22%0A%22Population,%20low%20access%20to%20store,%202015%22,%228150.826633%22%0A%22Population,%20low%20access%20to%20store%20(%25%20change),%202010%20-15%22,%226.307903%22%0A%22Population,%20low%20access%20to%20store%20(%25),%202010%22,%2218.676771%22%0A%22Population,%20low%20access%20to%20store%20(%25),%202015%22,%2219.854883%22%0A%22Low%20income%20&%20low%20access%20to%20store,%202010%22,%223127.597765%22%0A%22Low%20income%20&%20low%20access%20to%20store,%202015%22,%222905.61274%22%0A%22Low%20income%20&%20low%20access%20to%20store%20(%25%20change),%202010%20-%2015%22,%22-7.097621%22%0A%22Low%20income%20&%20low%20access%20to%20store%20(%25),%202010%22,%227.618625%22%0A%22Low%20income%20&%20low%20access%20to%20store%20(%25),%202015%22,%227.077884%22%0A%22Households,%20no%20car%20&%20low%20access%20to%20store,%202010%22,%22490.486131%22%0A%22Households,%20no%20car%20&%20low%20access%20to%20store,%202015%22,%22710.07845%22%0A%22Households,%20no%20car%20&%20low%20access%20to%20store%20(%25%20change),%202010%20-%2015%22,%2244.770342%22%0A%22Households,%20no%20car%20&%20low%20access%20to%20store%20(%25),%202010%22,%223.063432%22%0A%22Households,%20no%20car%20&%20low%20access%20to%20store%20(%25),%202015%22,%224.434941%22%0A%22SNAP%20households,%20low%20access%20to%20store,%202015%22,%22617.031408%22%0A%22SNAP%20households,%20low%20access%20to%20store%20(%25),%202015%22,%223.853797%22%0A%22Children,%20low%20access%20to%20store,%202010%22,%221790.365531%22%0A%22Children,%20low%20access%20to%20store,%202015%22,%221869.284048%22%0A%22Children,%20low%20access%20to%20store%20(%25%20change),%202010%20-%2015%22,%224.407956%22%0A%22Children,%20low%20access%20to%20store%20(%25),%202010%22,%224.361214%22%0A%22Children,%20low%20access%20to%20store%20(%25),%202015%22,%224.553454%22%0A%22Seniors,%20low%20access%20to%20store,%202010%22,%221423.21609%22%0A%22Seniors,%20low%20access%20to%20store,%20
2015%22,%221542.465898%22%0A%22Seniors,%20low%20access%20to%20store%20(%25%20change),%202010%20-15%22,%228.378897%22%0A%22Seniors,%20low%20access%20to%20store%20(%25),%202010%22,%223.466862%22%0A%22Seniors,%20low%20access%20to%20store%20(%25),%202015%22,%223.757347%22%0A%22White,%20low%20access%20to%20store,%202015%22,%227431.71159%22%0A%22White,%20low%20access%20to%20store%20(%25),%202015%22,%2218.103166%22%0A%22Black,%20low%20access%20to%20store,%202015%22,%22369.368139%22%0A%22Black,%20low%20access%20to%20store%20(%25),%202015%22,%220.899757%22%0A%22Hispanic%20ethnicity,%20low%20access%20to%20store,%202015%22,%22198.378837%22%0A%22Hispanic%20ethnicity,%20low%20access%20to%20store%20(%25),%202015%22,%220.483238%22%0A%22Asian,%20low%20access%20to%20store,%202015%22,%2249.868993%22%0A%22Asian,%20low%20access%20to%20store%20(%25),%202015%22,%220.121478%22%0A%22American%20Indian%20or%20Alaska%20Native,%20low%20access%20to%20store,%202015%22,%2230.148045%22%0A%22American%20Indian%20or%20Alaska%20Native,%20low%20access%20to%20store%20(%25),%202015%22,%220.073439%22%0A%22Hawaiian%20or%20Pacific%20Islander,%20low%20access%20to%20store,%202015%22,%225.503696%22%0A%22Hawaiian%20or%20Pacific%20Islander,%20low%20access%20to%20store%20(%25),%202015%22,%220.013407%22%0A%22Multiracial,%20low%20access%20to%20store,%202015%22,%22264.226188%22%0A%22Multiracial,%20low%20access%20to%20store%20(%25),%202015%22,%220.643638%22%0A"

In [233]:
grocery_access_df = combined_import(url1_grocery, 
                                    url2_grocery, 
                                    counties, 
                                    "Datasets/grocery_access/", 
                                    "grocery_access_df")

In [234]:
grocery_access_df

index,County,"Population, low access to store, 2010","Population, low access to store, 2015","Population, low access to store (% change), 2010 -15","Population, low access to store (%), 2010","Population, low access to store (%), 2015","Low income & low access to store, 2010","Low income & low access to store, 2015","Low income & low access to store (% change), 2010 - 15","Low income & low access to store (%), 2010",...,"Hispanic ethnicity, low access to store, 2015","Hispanic ethnicity, low access to store (%), 2015","Asian, low access to store, 2015","Asian, low access to store (%), 2015","American Indian or Alaska Native, low access to store, 2015","American Indian or Alaska Native, low access to store (%), 2015","Hawaiian or Pacific Islander, low access to store, 2015","Hawaiian or Pacific Islander, low access to store (%), 2015","Multiracial, low access to store, 2015","Multiracial, low access to store (%), 2015"
0,Sumner,7667.187882,8150.826633,6.307903,18.676771,19.854883,3127.597765,2905.61274,-7.097621,7.618625,...,198.378837,0.483238,49.868993,0.121478,30.148045,0.073439,5.503696,0.013407,264.226188,0.643638
1,Wilson,7667.187882,8150.826633,6.307903,18.676771,19.854883,3127.597765,2905.61274,-7.097621,7.618625,...,198.378837,0.483238,49.868993,0.121478,30.148045,0.073439,5.503696,0.013407,264.226188,0.643638
2,Marion,7667.187882,8150.826633,6.307903,18.676771,19.854883,3127.597765,2905.61274,-7.097621,7.618625,...,198.378837,0.483238,49.868993,0.121478,30.148045,0.073439,5.503696,0.013407,264.226188,0.643638
3,Crockett,7667.187882,8150.826633,6.307903,18.676771,19.854883,3127.597765,2905.61274,-7.097621,7.618625,...,198.378837,0.483238,49.868993,0.121478,30.148045,0.073439,5.503696,0.013407,264.226188,0.643638
4,Houston,7667.187882,8150.826633,6.307903,18.676771,19.854883,3127.597765,2905.61274,-7.097621,7.618625,...,198.378837,0.483238,49.868993,0.121478,30.148045,0.073439,5.503696,0.013407,264.226188,0.643638
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
90,Shelby,7667.187882,8150.826633,6.307903,18.676771,19.854883,3127.597765,2905.61274,-7.097621,7.618625,...,198.378837,0.483238,49.868993,0.121478,30.148045,0.073439,5.503696,0.013407,264.226188,0.643638
91,McNairy,7667.187882,8150.826633,6.307903,18.676771,19.854883,3127.597765,2905.61274,-7.097621,7.618625,...,198.378837,0.483238,49.868993,0.121478,30.148045,0.073439,5.503696,0.013407,264.226188,0.643638
92,Coffee,7667.187882,8150.826633,6.307903,18.676771,19.854883,3127.597765,2905.61274,-7.097621,7.618625,...,198.378837,0.483238,49.868993,0.121478,30.148045,0.073439,5.503696,0.013407,264.226188,0.643638
93,Anderson,7667.187882,8150.826633,6.307903,18.676771,19.854883,3127.597765,2905.61274,-7.097621,7.618625,...,198.378837,0.483238,49.868993,0.121478,30.148045,0.073439,5.503696,0.013407,264.226188,0.643638


At this point, I realized that the data itself is embedded in the URL: changing the county name only relabels the same numbers, which is why every row in the table above is identical. I went back to the website to look for more information on the API and found a CSV with all the data I needed.
I am keeping this project here for future reference.
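Decoding the URL makes the problem visible. A `data:` URL carries its content inline, so nothing is fetched from a server. The sketch below uses the real URL prefix and only the first data row of the suffix; `decode_data_uri` is a hypothetical helper.

```python
from urllib.parse import unquote

# Real prefix, and a suffix shortened to the first data row
prefix = "data:text/csv;charset=utf-8,%20%0A%22State%22,%22TN%22%0A%22County%22,%22"
suffix = "%22%0A%22Population,%20low%20access%20to%20store,%202010%22,%227667.187882%22%0A"


def decode_data_uri(uri):
    """Return the payload of a non-base64 data: URI as plain text."""
    _header, _, payload = uri.partition(",")  # split at the first comma
    return unquote(payload)


for county in ["Hamilton", "Shelby"]:
    print(decode_data_uri(prefix + county + suffix))
```

Both printed CSVs contain the same population figure; only the county label differs.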