# Checkpoint Three: Cleaning Data

Now you are ready to clean your data. Before starting coding, provide the link to your dataset below.

My dataset:

Import the necessary libraries and create your dataframe(s).

In [2]:
import pandas as pd

mushrooms_df = pd.read_csv("mo_mushrooms.csv")
mushrooms_df.head(5)  # the first five rows show the columns and sample values, confirming the file loaded correctly

| | id | id_raw | foray_key_raw | foray_key | fungi_name | fungi_name_raw | current_fungi_name | current_fungi_name_raw | common_name | common_name_raw | ... | foray_date | foray_date_raw | foray_location | foray_location_raw | id_2 | id_raw_2 | month | month_1 | month_raw | month_raw_1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 2 | 75 | 2002/09/21 - Engelmann Woods | Daedaleopsis confragosa | Daedaleopsis confragosa | Daedaleopsis confragosa var. confragosa | Daedaleopsis confragosa var. confragosa | Blushing bracket | Blushing bracket | ... | 2002-09-21 | 2002-09-21 05:00:00 | Engelmann Woods | Engelmann Woods | 75 | 75 | NaN | NaN | NaN | NaN |
| 1 | 3 | 3 | 75 | 2002/09/21 - Engelmann Woods | Coprinus atramentarius | Coprinus atramentarius | Coprinopsis atramentaria | Coprinopsis atramentaria | NaN | NaN | ... | 2002-09-21 | 2002-09-21 05:00:00 | Engelmann Woods | Engelmann Woods | 75 | 75 | NaN | NaN | NaN | NaN |
| 2 | 4 | 4 | 75 | 2002/09/21 - Engelmann Woods | Galiella rufa | Galiella rufa | Galiella rufa | Galiella rufa | Hairy rubber cup | Hairy rubber cup | ... | 2002-09-21 | 2002-09-21 05:00:00 | Engelmann Woods | Engelmann Woods | 75 | 75 | NaN | NaN | NaN | NaN |
| 3 | 5 | 5 | 75 | 2002/09/21 - Engelmann Woods | Clavicorona pyxidata | Clavicorona pyxidata | Artomyces pyxidatus | Artomyces pyxidatus | Crown-tipped Coral | Crown-tipped Coral | ... | 2002-09-21 | 2002-09-21 05:00:00 | Engelmann Woods | Engelmann Woods | 75 | 75 | NaN | NaN | NaN | NaN |
| 4 | 6 | 6 | 75 | 2002/09/21 - Engelmann Woods | Trichaptum biformis | Trichaptum biformis | NaN | NaN | NaN | NaN | ... | 2002-09-21 | 2002-09-21 05:00:00 | Engelmann Woods | Engelmann Woods | 75 | 75 | NaN | NaN | NaN | NaN |


## Missing Data

Test your dataset for missing data and handle it as needed. Make notes in the form of code comments as to your thought process.

In [2]:
num_null = mushrooms_df.isnull().sum()

num_null

# foray_location has 3 missing values - below I find more data about the mushrooms with no foray location

# common_name has 3819 missing values but is not used in our analysis - so rows missing a common name are kept

# current_fungi_name has 416 missing values - this column indicates when the species/genus designation may have changed
  # To incorporate the most useful data from both columns,
  # below I impute values from 'fungi_name' to replace any missing/incorrect values in 'current_fungi_name'.

# the month column is entirely missing but is important for researching mushroom abundance by month
    # I pull month values from 'foray_date'

id                           0
id_raw                       0
foray_key_raw                0
foray_key                    0
fungi_name                   0
fungi_name_raw               0
current_fungi_name         416
current_fungi_name_raw     416
common_name               3819
common_name_raw           3819
id_1                         0
id_raw_1                     0
site_key_raw                 0
site_key                     0
foray_date                   0
foray_date_raw               0
foray_location               3
foray_location_raw           3
id_2                         0
id_raw_2                     0
month                     8774
month_1                   8774
month_raw                 8774
month_raw_1               8774
dtype: int64
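
Raw counts are easier to judge next to the share of rows they represent. The following is a minimal sketch of one way to build such a summary; `toy_df` and its values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Toy stand-in for mushrooms_df; the rows are invented for illustration.
toy_df = pd.DataFrame({
    "fungi_name": ["Galiella rufa", "Amanita citrina", "Calocera cornea", "Stereum ostrea"],
    "common_name": ["Hairy rubber cup", None, "Club-like tuning fork", None],
    "month": [None, None, None, None],
})

# Count nulls per column and express them as a percentage of all rows.
missing_summary = pd.DataFrame({
    "n_missing": toy_df.isnull().sum(),
    "pct_missing": (toy_df.isnull().mean() * 100).round(2),
})

# Columns that are entirely empty are candidates for dropping or rebuilding.
fully_missing = missing_summary.index[missing_summary["pct_missing"] == 100].tolist()
print(missing_summary)
print(fully_missing)
```

A column that is 100% missing, like `month` here, cannot be imputed from itself and has to be rebuilt from another column or dropped.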

In [3]:
# dropping all columns that are irrelevant to analysis

mushrooms_df = mushrooms_df.drop(columns=['id', 'id_raw', 'id_1', 'id_2', 'id_raw_1', 'id_raw_2',
                                          'site_key_raw', 'site_key', 'month_1', 'month_raw',
                                          'fungi_name_raw', 'current_fungi_name_raw', 
                                          'common_name_raw', 'foray_date_raw', 
                                          'foray_key_raw', 'foray_key',
                                          'foray_location_raw', 'month_raw_1'])

mushrooms_df.head(10)

| | fungi_name | current_fungi_name | common_name | foray_date | foray_location | month |
|---|---|---|---|---|---|---|
| 0 | Daedaleopsis confragosa | Daedaleopsis confragosa var. confragosa | Blushing bracket | 2002-09-21 | Engelmann Woods | NaN |
| 1 | Coprinus atramentarius | Coprinopsis atramentaria | NaN | 2002-09-21 | Engelmann Woods | NaN |
| 2 | Galiella rufa | Galiella rufa | Hairy rubber cup | 2002-09-21 | Engelmann Woods | NaN |
| 3 | Clavicorona pyxidata | Artomyces pyxidatus | Crown-tipped Coral | 2002-09-21 | Engelmann Woods | NaN |
| 4 | Trichaptum biformis | NaN | NaN | 2002-09-21 | Engelmann Woods | NaN |
| 5 | Calocera cornea | Calocera cornea | Club-like tuning fork | 2002-09-21 | Engelmann Woods | NaN |
| 6 | Cortinarius corrugatus | Cortinarius corrugatus | NaN | 2002-10-26 | Pickle Spring | NaN |
| 7 | Gymnopilus spectabilis | Gymnopilus junonius | Spectacular rustgill | 2002-10-26 | Pickle Spring | NaN |
| 8 | Amanita virosa | Amanita virosa | European Destroying Angel | 2002-10-26 | Pickle Spring | NaN |
| 9 | Amanita citrina | Amanita citrina | NaN | 2002-10-26 | Pickle Spring | NaN |


In [4]:
# the month column is entirely missing but is important for researching mushroom abundance by month
# pulling year, month, and day values from 'foray_date' into their own integer columns

# 'foray_date' is a 'YYYY-MM-DD' string, so splitting on '-' yields the three parts
date_parts = mushrooms_df['foray_date'].str.split('-', expand=True).astype(int)

mushrooms_df['year'] = date_parts[0]
mushrooms_df['month'] = date_parts[1]
mushrooms_df['day'] = date_parts[2]
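
Another way to pull the date parts is to parse the strings with `pd.to_datetime` and read the components from the `.dt` accessor. This is only a sketch on a toy frame: `toy_df` and the `parsed` helper are names assumed for illustration, and `foray_date` itself is left as a string.

```python
import pandas as pd

# Toy frame with dates in the same 'YYYY-MM-DD' layout as foray_date.
toy_df = pd.DataFrame({"foray_date": ["2002-09-21", "2002-10-26", "2019-09-07"]})

# Parse once; an explicit format makes malformed dates fail loudly.
parsed = pd.to_datetime(toy_df["foray_date"], format="%Y-%m-%d")

# The .dt accessor exposes each date component as an integer column.
toy_df["year"] = parsed.dt.year
toy_df["month"] = parsed.dt.month
toy_df["day"] = parsed.dt.day
print(toy_df)
```

Parsing also validates the values: a date such as `2002-13-40` would raise an error here, whereas splitting on `-` would accept it silently.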

In [5]:
# shows which mushrooms in our dataset have no foray location listed and when they were found

no_location = []
date_found = []

for i in range(len(mushrooms_df['foray_location'])):
    if pd.isna(mushrooms_df['foray_location'][i]):
        no_location.append(mushrooms_df['current_fungi_name'][i])
        date_found.append(mushrooms_df['foray_date'][i])
        
for i in range(len(no_location)):
    print(f"{no_location[i]} : {date_found[i]}")

Pluteus americanus : 2019-09-07
Pluteus cervinus : 2019-09-07
Polyporus squamosus : 2019-09-07
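
The same lookup can be written without an explicit loop by filtering on a boolean mask. A sketch on toy rows, typed in by hand to mirror the shape of the data:

```python
import pandas as pd

# Toy rows; two of them have no foray location.
toy_df = pd.DataFrame({
    "current_fungi_name": ["Pluteus cervinus", "Galiella rufa", "Polyporus squamosus"],
    "foray_date": ["2019-09-07", "2002-09-21", "2019-09-07"],
    "foray_location": [None, "Engelmann Woods", None],
})

# isna() builds a True/False mask; .loc keeps only the rows where it is True
# and only the two columns of interest.
no_location = toy_df.loc[toy_df["foray_location"].isna(), ["current_fungi_name", "foray_date"]]
print(no_location)
```

Because the result is itself a DataFrame, it keeps the original row labels, which makes it easy to trace the rows back to the full dataset.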


## Irregular Data

Detect outliers in your dataset and handle them as needed. Use code comments to make notes about your thought process.

In [6]:
year_list = []
year_count = {}

for each in mushrooms_df['year']:
    if each not in year_list:
        year_list.append(each)
year_list.sort()
        
for each in year_list:
    year_count[each] = mushrooms_df['year'].value_counts()[each]
    
year_count

# 1987 (the first recorded year, 5 records) and 1994 (59 records) have unusually low mushroom counts; 1990 (24 records) is also low.
  # The 1987 count may be explained by it being the first recorded year; 1994 weather data can be consulted to explain that year's low count.

{1987: 5,
 1990: 24,
 1992: 196,
 1993: 258,
 1994: 59,
 1995: 232,
 1996: 227,
 1997: 134,
 1998: 224,
 1999: 158,
 2000: 163,
 2001: 244,
 2002: 293,
 2003: 308,
 2004: 314,
 2005: 394,
 2006: 263,
 2007: 213,
 2008: 580,
 2009: 470,
 2010: 484,
 2011: 360,
 2012: 383,
 2013: 676,
 2014: 612,
 2015: 630,
 2016: 478,
 2017: 106,
 2018: 182,
 2019: 104}
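
One common rule of thumb for making "unusually low" concrete is to flag values that fall more than 1.5 times the interquartile range (IQR) outside the quartiles. The sketch below applies that rule to the yearly counts copied from the output above. The 1.5 multiplier is a convention, not something prescribed by this assignment.

```python
import pandas as pd

# Yearly record counts copied from the year_count output above.
year_count = pd.Series({
    1987: 5, 1990: 24, 1992: 196, 1993: 258, 1994: 59, 1995: 232,
    1996: 227, 1997: 134, 1998: 224, 1999: 158, 2000: 163, 2001: 244,
    2002: 293, 2003: 308, 2004: 314, 2005: 394, 2006: 263, 2007: 213,
    2008: 580, 2009: 470, 2010: 484, 2011: 360, 2012: 383, 2013: 676,
    2014: 612, 2015: 630, 2016: 478, 2017: 106, 2018: 182, 2019: 104,
})

# Quartiles and the interquartile range of the yearly counts.
q1 = year_count.quantile(0.25)
q3 = year_count.quantile(0.75)
iqr = q3 - q1

# Fences: anything outside [lower_fence, upper_fence] is flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outlier_years = year_count[(year_count < lower_fence) | (year_count > upper_fence)]

print(lower_fence, upper_fence)
print(outlier_years)
```

On these counts the fences are wide enough that no year is flagged. This suggests the low years are better treated as candidates for closer inspection than as statistical outliers to remove.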

In [7]:
location_count = mushrooms_df['foray_location'].value_counts()
location_count[:15]  # the two locations with the most records hold about 40% of the dataset (3,480 of 8,774 rows)

Mingo National Wildlife Refuge          2393
Babler State Park                       1087
Ha Ha Tonka State Park                   512
Hawn State Park                          462
Rockwood Reservation                     421
Meramec State Park                       399
Forest 44 Conservation Area              365
Cuivre River State Park                  305
Hazlet State Park                        213
Pickle Spring                            194
University of Missouri Forestry Camp     190
Creve Coeur County Park                  143
Lower Meramec Park                       138
Camp Latonka                             107
Tyson Research Center                    106
Name: foray_location, dtype: int64
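
A quick way to check how concentrated the records are is to compute a running share of the dataset. The sketch below uses the five largest counts copied from the output above, together with the dataset's 8,774 rows. The variable names are illustrative.

```python
import pandas as pd

# Five largest location counts, copied from the value_counts() output above.
location_count = pd.Series({
    "Mingo National Wildlife Refuge": 2393,
    "Babler State Park": 1087,
    "Ha Ha Tonka State Park": 512,
    "Hawn State Park": 462,
    "Rockwood Reservation": 421,
})
total_rows = 8774  # number of rows in the dataset

# Running total of records, expressed as a percentage of all rows.
cumulative_share = (location_count.cumsum() / total_rows * 100).round(2)
print(cumulative_share)
```

Reading down the result shows how many locations are needed to cover a given share of the data, which is useful context when comparing locations on a map or chart.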

## Inconsistent Data

Check for inconsistent data and address any that arises. As always, use code comments to illustrate your thought process.

In [8]:
mushrooms_df['current_fungi_name'].dtypes # datatype for this column is 'object' - not all of its values are strings

# first, I attempted to .split() current_fungi_name to pull values for other columns
# splitting on " " raises an error because missing values are stored as NaN, which is a float, as seen below:

for each in mushrooms_df['current_fungi_name']:
    each.split(' ')

AttributeError: 'float' object has no attribute 'split'
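
The error comes from the missing entries: pandas stores them as `NaN`, which is a float and has no `.split` method. The `.str` accessor skips missing values instead of raising. A sketch on a toy Series (the `names` values are illustrative):

```python
import pandas as pd

# Toy Series with one missing entry, like current_fungi_name.
names = pd.Series(["Galiella rufa", None, "Daedaleopsis confragosa var. confragosa"])

# .str.split leaves missing entries missing rather than raising AttributeError.
parts = names.str.split(" ")
genus = parts.str[0]
print(genus)

# Row labels that would have triggered the error in a plain Python loop.
missing_rows = names[names.isna()].index.tolist()
print(missing_rows)
```

Listing the missing rows first makes it clear how many values need to be imputed before any string processing is attempted.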

In [None]:
# incorporating useful data from both current_fungi_name and fungi_name for up-to-date, complete data points
# if an entry in current_fungi_name is missing (NaN) or is a placeholder ('-' or ''),
    # impute the entry from fungi_name

needs_fill = mushrooms_df['current_fungi_name'].isna() | mushrooms_df['current_fungi_name'].isin(['-', ''])
mushrooms_df.loc[needs_fill, 'current_fungi_name'] = mushrooms_df.loc[needs_fill, 'fungi_name']

mushrooms_df = mushrooms_df.drop(columns=['fungi_name']) # no longer needed

# the first word of current_fungi_name is the genus and the second is the species;
# names with only one word (e.g. a leftover '-') get NaN for species instead of raising an IndexError
name_parts = mushrooms_df['current_fungi_name'].str.split(' ')
mushrooms_df['genus'] = name_parts.str[0]
mushrooms_df['species'] = name_parts.str[1]

mushrooms_df.head(15)
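
Inconsistent spellings are another thing to check for in text columns. As a sketch, names that differ only in capitalization can be surfaced by grouping on a lowercased key. The toy list below includes two spellings of one location that both occur in this dataset; the number of rows for each is invented.

```python
import pandas as pd

# Toy location values; the first two spellings differ only by letter case.
locations = pd.Series([
    "Labarque Creek Conservation Area",
    "LaBarque Creek Conservation Area",
    "LaBarque Creek Conservation Area",
    "Pickle Spring",
])

# Normalize surrounding whitespace and case to build a comparison key.
key = locations.str.strip().str.lower()

# For each key, count how many distinct raw spellings map to it.
variants = locations.groupby(key).nunique()
inconsistent = variants[variants > 1].index.tolist()
print(inconsistent)
```

Any key returned here points to a group of spellings that should probably be merged into one canonical name before counting or mapping locations.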

In [None]:
mushrooms_df.to_csv("mo_mushrooms_cleaned.csv", index=False) # index=False keeps the row index from being written as an extra unnamed column

## Summarize Your Results

Make note of your answers to the following questions.

1. Did you find all four types of dirty data in your dataset?
2. Did the process of cleaning your data give you new insights into your dataset?
3. Is there anything you would like to make note of when it comes to manipulating the data and making visualizations?

1. I found all types. Because the outlier data I found was still useful to my analysis, I did not alter or remove it.

2. Data cleaning (in particular, pulling data points from the foray_date category) helped me visualize how each category relates to the others.

3. Some of the data will need to be converted to another datatype (e.g. the values in foray_date are strings) for visualization in Tableau.

To improve analysis of the dataset, I added an 'edibility' column based on my own research of each mushroom type in the dataset. The edibility categories are as follows:

- 0: <b>Unknown</b> ---  Edibility of mushrooms in this category is not yet known or fully understood. In this category, I have included mushrooms that have conflicting reports as to their edibility, as well as mushrooms that are edible for many people but commonly cause adverse reactions in others.

- 1: <b>Poisonous</b> --- Mushrooms in this category are known to be poisonous.

- 2: <b>Inedible</b> --- Mushrooms in this category are not edible for many reasons (e.g. extremely bitter taste, tough texture, or small size). 

- 3: <b>Edible</b> --- Mushrooms in this category can be consumed in some form (e.g. cooked whole or rendered into powder for a tea).

- 4: <b>Medicinal</b> --- Mushrooms in this category can be consumed in some form for medicinal benefits. This category includes most (though not all) of Missouri's commonly-known medicinal mushrooms. Inedible mushrooms that are used for medical research are also <i>not</i> included in this category. Information on medicinal benefits is taken from Wikipedia and the Missouri Department of Conservation book <a href="https://www.amazon.com/Missouris-Wild-Mushrooms-Maxine-Stone/dp/1887247742">Missouri's Wild Mushrooms</a>.

- 5: <b>Psychoactive</b> --- Mushrooms in this category are known to produce non-ordinary states of consciousness when consumed.

Information on edibility was taken from Wikipedia and the Missouri Department of Conservation's online field guide, <a href="https://mdc.mo.gov/field-guide/search?fgSpeciesType=1007">seen here</a>.

To categorize the mushrooms, I used the following function:

In [14]:
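def edible(mushroom, edibility):
    # assign the edibility code to every row whose current_fungi_name matches `mushroom`
    # .loc with a boolean mask writes directly to other_df and avoids chained-assignment warnings
    other_df.loc[other_df['current_fungi_name'] == mushroom, 'edibility'] = edibility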
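
When many species need labels, a lookup table can stand in for repeated function calls. The sketch below assumes a hypothetical `edibility_lookup` dict and a toy frame. Only one entry is filled in, using the code from this notebook's own example call for Stereum ostrea; species without an entry fall back to 0 (Unknown).

```python
import pandas as pd

# Toy frame standing in for the dataset being labeled.
toy_df = pd.DataFrame({
    "current_fungi_name": ["Stereum ostrea", "Trametes versicolor", "Stereum ostrea"],
})

# Hypothetical lookup: species name -> edibility code (one entry shown).
edibility_lookup = {"Stereum ostrea": 2}

# map() looks each name up; names without an entry become NaN,
# which fillna(0) turns into the Unknown category.
toy_df["edibility"] = toy_df["current_fungi_name"].map(edibility_lookup).fillna(0).astype(int)
print(toy_df)
```

Keeping the labels in one dict also leaves a single, reviewable record of every categorization decision.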

In [15]:
mushrooms_df['current_fungi_name'].value_counts()

-                                 1075
Stereum ostrea                     172
Trametes versicolor                159
Auricularia auricula-judae         134
Schizophyllum commune              126
                                  ... 
Tarzetta cupularis                   1
Cortinarius gentilis                 1
Geastrum coronatum                   1
Pseudoarmillariella ectypoides       1
Scleroderma bovista                  1
Name: current_fungi_name, Length: 726, dtype: int64

This output let me research each species listed above and pass its name and edibility code to the edible function, as in the commented example below:

In [None]:
# edible('Stereum ostrea', 2)

Once all mushrooms were categorized by edibility, I saved the new dataset as seen in the comment below.

In [None]:
# mushrooms_df.to_csv("mo_mushrooms_cleaned_3.csv")

In [3]:
test_df = pd.read_csv('mo_mushrooms_cleaned_3.csv')

In [85]:
test_df['edibility'].value_counts()

3    2786
2    2583
0    2336
4     639
1     422
5       8
Name: edibility, dtype: int64

In [None]:
# above is the distribution of edibility categories among the mushrooms in the Missouri Mycological Society's dataset.

In [97]:
counts = test_df['edibility'].value_counts()  # computed once and reused below
total = counts.sum()

# each category's share of the dataset, as a percentage
unknown = round((counts[0] / total) * 100, 2)
poisonous = round((counts[1] / total) * 100, 2)
inedible = round((counts[2] / total) * 100)
edible_pct = round((counts[3] / total) * 100, 2)  # named edible_pct so it does not overwrite the edible() function
medicinal = round((counts[4] / total) * 100, 2)
psychoactive = round((counts[5] / total) * 100, 5)


print(f"The Missouri Mycological Society's dataset has {total} mushrooms."
     f"\nUnknown : {unknown} %"
     f"\nPoisonous : {poisonous} %"
     f"\nInedible : {inedible} %"
     f"\nEdible : {edible_pct} %"
     f"\nMedicinal : {medicinal} %"
     f"\nPsychoactive : {psychoactive} %")

The Missouri Mycological Society's dataset has 8774 mushrooms.
Unknown : 26.62 %
Poisonous : 4.81 %
Inedible : 29 %
Edible : 31.75 %
Medicinal : 7.28 %
Psychoactive : 0.09118 %
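
The percentages can also be computed in one pass rather than one variable per category. The sketch below uses the category counts copied from the `value_counts()` output above and the category names defined earlier. On the real frame, `test_df['edibility'].value_counts(normalize=True)` would give the same proportions directly.

```python
import pandas as pd

# Category counts copied from the value_counts() output above.
counts = pd.Series({3: 2786, 2: 2583, 0: 2336, 4: 639, 1: 422, 5: 8})

# Category codes and their names, as defined in the edibility list.
labels = {0: "Unknown", 1: "Poisonous", 2: "Inedible",
          3: "Edible", 4: "Medicinal", 5: "Psychoactive"}

# Convert counts to percentages, order by category code, attach readable names.
percentages = (counts / counts.sum() * 100).sort_index().rename(index=labels).round(2)
print(percentages)
```

Applying the same rounding to every category keeps the figures directly comparable when they are shown side by side.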


I also wanted to create a map to visually demonstrate the foraged mushrooms' origins. To do so, I wrote a function that assigned a zip code to each location, as seen below:

In [6]:
test_df['foray_zip_code'] = ''

def zipcode(location, zip_code):
    for i in range(len(test_df['foray_location'])):
        if test_df['foray_location'][i] == location:
            test_df['foray_zip_code'][i] = zip_code

In [13]:
# from collections import Counter

# loc_list = []

# for location in test_df['foray_location']:
#     loc_list.append(location)
        
# Counter(loc_list).most_common()

In [14]:
zipcode('Mingo National Wildlife Refuge', 63960)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
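
The warning above is raised by chained indexing (`test_df['foray_zip_code'][i] = ...`), where pandas cannot guarantee that the write reaches the original frame. One way to avoid it, sketched on a toy frame, is to select the rows and the column in a single `.loc` call. The `df` parameter and the `zip_lookup` dict are assumptions added for illustration; the two zip codes are the ones passed to `zipcode` in this notebook.

```python
import pandas as pd

# Toy frame standing in for test_df.
toy_df = pd.DataFrame({
    "foray_location": ["Babler State Park", "Mingo National Wildlife Refuge", "Babler State Park"],
})
toy_df["foray_zip_code"] = None  # object column to be filled in


def zipcode(df, location, zip_code):
    # A single .loc call selects rows and column together, so the write
    # goes to df itself rather than to a temporary copy.
    df.loc[df["foray_location"] == location, "foray_zip_code"] = zip_code


# Hypothetical lookup holding the location -> zip code pairs in one place.
zip_lookup = {"Mingo National Wildlife Refuge": 63960, "Babler State Park": 63005}
for location, zip_code in zip_lookup.items():
    zipcode(toy_df, location, zip_code)
print(toy_df)
```

With the pairs stored in a dict, `toy_df["foray_location"].map(zip_lookup)` would fill the column in one step and leave unmatched locations as missing values.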
  


In [15]:
zipcode('Babler State Park', 63005)


In [16]:
zipcode('Ha Ha Tonka State Park', 65020)


In [17]:
zipcode('Hawn State Park', 63670)


In [18]:
zipcode('Rockwood Reservation', 63038)


In [19]:
zipcode('Meramec State Park', 63038)


In [20]:
zipcode('Forest 44 Conservation Area', 63049)


In [21]:
zipcode('Cuivre River State Park', 63379)


In [22]:
zipcode('Hazlet State Park', 62231)


In [23]:
zipcode('Pickle Spring', 63670)


In [24]:
zipcode('University of Missouri Forestry Camp', 63901)


In [25]:
zipcode('Creve Coeur County Park', 63146)


In [26]:
zipcode('Lower Meramec Park', 63128)


In [27]:
zipcode('Camp Latonka', 63966)


In [28]:
zipcode('Tyson Research Center', 63025)


In [29]:
zipcode('Sam A Baker Park', 63956)


In [30]:
zipcode('Engelmann Woods', 63055)


In [31]:
zipcode('Duck Creek Conservation Area', 63960)


In [32]:
zipcode('Charleville Winery', 63670)


In [33]:
zipcode('Labarque Creek Conservation Area', 63069)


In [34]:
zipcode('Coldwater Conservation Area', 63964)


In [35]:
zipcode('Pere Marquette State Park', 62037)


In [36]:
zipcode('Rock Bridge State Park', 65203)


In [37]:
zipcode('Washington State Park', 63020)


In [38]:
zipcode('Confluence Pointe State Park', 63386)


In [39]:
zipcode('Castlewood State Park', 63021)


In [40]:
zipcode('Busiek State Forest and Wildlife Area', 65669)


In [41]:
zipcode('Silver Mines', 65401)


In [42]:
zipcode('Otter Slough Conservation Area', 63841)


In [43]:
zipcode('LaBarque Creek Conservation Area', 63069)


In [44]:
zipcode('August Bush Conservation Area', 63304)


In [45]:
zipcode('Amidon Memorial Conservation Area', 63645)


In [46]:
zipcode('Big Springs State Park', 63965)


In [47]:
zipcode('Greensfelder County Park', 63069)


In [48]:
zipcode('Lake Wappapello State Park', 63967)


In [49]:
zipcode('Forest Park', 63112)


In [50]:
zipcode('Moone Athey Farm, Augusta', 63332)


In [52]:
zipcode('Emmenegger Park', 63122)


In [None]:
zipcode('Graham Cave State Park', 63361)


In [54]:
zipcode('Cape Girardeau Conservation Nature Center', 63701)


In [55]:
zipcode('Shaw Nature Reserve', 63039)


In [56]:
zipcode('Matson Hill Park', 63341)


In [57]:
zipcode('Cupola Pond Natural Area', 65401)


In [58]:
zipcode('Mark Twain State Park', 65283)


In [59]:
zipcode('Cow Creek Park', 65611)


In [60]:
zipcode('Trail of Tears State Park, Missouri', 63755)


In [61]:
zipcode('Tower Tee Golf Course', 63123)


In [62]:
zipcode('West Tyson County Park', 63025)


In [63]:
zipcode('Big Oak State Park', 63845)


In [64]:
zipcode('Burr Oak Woods Nature Center', 64015)


In [65]:
zipcode('Pyramid State Park', 62238)


In [66]:
zipcode('Weldon Springs Conservation', 63304)


In [67]:
zipcode('Blue Pond Natural Area', 63781)


In [68]:
zipcode('Kirkwood Park', 63122)


In [69]:
zipcode('Vadner Farm', 64076)


In [70]:
zipcode('Ted Shanks Wildlife Management', 63433)


In [71]:
zipcode('Litzsinger Road Ecology Center', 63124)


In [72]:
zipcode('Governor Bond Lake, Illinois', 62246)


In [73]:
zipcode('Queeny County Park', 63131)


In [74]:
zipcode('Lake Gillespie', 62033)


In [75]:
zipcode('Coon Island Conservation Area', 63961)


In [76]:
zipcode('Allred Lake Conservation Area', 63954)


In [77]:
zipcode('Green Rock Trail', 63069)


In [78]:
zipcode('Salt Lick Trail, Valmeyer IL', 62295)


In [79]:
zipcode('Peck Ranch State Wildlife Mgmt Area', 63941)


In [80]:
zipcode('Mark Twain National Forest', 65401)


In [81]:
zipcode('Cowards Hollow', 63937)


In [82]:
zipcode('Poplar Bluff', 63901)


In [83]:
zipcode('Murphysboro', 62966)


In [84]:
zipcode('Puxico, Missouri', 63960)


In [85]:
zipcode('Butler County, Missouri', 63901)


In [86]:
zipcode('Pulaski County, Arkansas', 72202)


In [87]:
zipcode('Holly Ridge Conservation Area', 63822)


In [88]:
zipcode('Sand Pond Conservation Area', 63931)


In [89]:
zipcode('Big Cane Conservation Area', 63901)


In [90]:
zipcode('Lost Creek, Missouri', 63383)


In [91]:
zipcode('Johnson Tract', 63934)


In [92]:
zipcode('Southern Illinois University', 62901)
