<table style="font-size: 1em; padding: 0; margin: 0;">

<tr style="vertical-align: top; padding: 0; margin: 0;background-color: #ffffff">
        <td style="vertical-align: top; padding: 0; margin: 0; padding-right: 15px;">
    <p style="background: #182AEB; color:#ffffff; text-align:justify; padding: 10px 25px;">
        <strong style="font-size: 1.0em;"><span style="font-size: 1.2em;"><span style="color: #ffffff;">The Coastal Grain Size Portal (C-GRASP) dataset <br/><em>Will Speiser, Daniel Buscombe, Evan Goldstein</em></span></span></strong><br/><br/>
        <strong>&gt; Assign Locations to Samples </strong><br/>
    </p>                       
        
<p style="border: 1px solid #ff5733; border-left: 15px solid #ff5733; padding: 10px; text-align:justify;">
    <strong style="color: #ff5733">The purpose of this notebook</strong>  
    <br/><font color=grey> This notebook provides simple code to assign an address or location name to each sample in a dataset.</font><br/>
    <br/><font color=grey> To do so, the user first selects a C-GRASP dataset.</font><br/>
    <br/><font color=grey> The notebook then calls the OpenStreetMap (OSM) geocoder API and uses reverse geocoding to assign an address to each latitude/longitude location.</font><br/>
    <br/><font color=grey> The output is a dataframe containing all of the data from the chosen C-GRASP dataset, plus a new field holding the address of each sample. Because the API must be called once per sample, processing time grows with the number of samples and also depends on internet connectivity; if time is a constraint, select data sparingly.</font><br/>
    </p>
        </td>
</tr>
</table>

In [1]:
import pandas as pd
import geocoder
import requests
import ipywidgets

#### Select a dataset

In [2]:
#Dataset collection widget
zen=ipywidgets.Select(
    options=['Entire Dataset', 'Estimated Onshore Data', 'Verified Onshore Data', 'Verified Onshore Post 2012 Data'],
    value='Entire Dataset',
    # rows=10,
    description='Dataset:',
    disabled=False
)

display(zen)

Select(description='Dataset:', options=('Entire Dataset', 'Estimated Onshore Data', 'Verified Onshore Data', '…

#### Download that dataset

In [4]:
url = 'https://zenodo.org/record/6099266/files/' 
if zen.value=='Entire Dataset':
    filename='dataset_10kmcoast.csv'
if zen.value=='Estimated Onshore Data':
    filename='Data_EstimatedOnshore.csv'
if zen.value=='Verified Onshore Data':
    filename='Data_VerifiedOnshore.csv'
if zen.value=='Verified Onshore Post 2012 Data':
    filename='Data_Post2012_VerifiedOnshore.csv'
print("Downloading {}".format(url+filename))   

Downloading https://zenodo.org/record/6099266/files/Data_EstimatedOnshore.csv
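
The chain of `if` statements above can also be written as a dictionary lookup. The sketch below is one possible way to do that, not part of the original notebook; `BASE_URL`, `DATASET_FILES` and `dataset_url` are hypothetical names introduced for illustration, while the record URL and file names are copied from the cell above.

```python
# Zenodo record URL taken from the download cell above
BASE_URL = 'https://zenodo.org/record/6099266/files/'

# Map each widget option to its CSV file name (names copied from the cell above)
DATASET_FILES = {
    'Entire Dataset': 'dataset_10kmcoast.csv',
    'Estimated Onshore Data': 'Data_EstimatedOnshore.csv',
    'Verified Onshore Data': 'Data_VerifiedOnshore.csv',
    'Verified Onshore Post 2012 Data': 'Data_Post2012_VerifiedOnshore.csv',
}


def dataset_url(choice, base_url=BASE_URL):
    """Return the download URL for a dataset choice (hypothetical helper)."""
    if choice not in DATASET_FILES:
        raise ValueError('Unknown dataset: {}'.format(choice))
    return base_url + DATASET_FILES[choice]


print(dataset_url('Estimated Onshore Data'))
```

In the notebook, `dataset_url(zen.value)` would give the same URL that the `if` statements build, and an unexpected option raises an error instead of leaving `filename` undefined.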


The next cell downloads the selected C-GRASP dataset and reads it into a pandas dataframe named `df`.

In [5]:
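# Build the full URL in a new variable so that re-running this cell
# does not append the file name to `url` a second time
data_url = url + filename
print('Retrieving Data, Please Wait')
# Retrieve the data
df = pd.read_csv(data_url)
print('Sediment Data Retrieved!')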

Retrieving Data, Please Wait
Sediment Data Retrieved!




Let's take a quick look at the top of the file

In [6]:
df.head()

| Unnamed: 0.2 | Unnamed: 0 | Unnamed: 0.1 | Sample_ID | Sample_Type_Code | Project | dataset | Date | Location_Type | latitude | longitude | ... | d25 | d30 | d50 | d65 | d75 | d84 | d90 | d95 | Notes | unique_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 610 | 610 | SPIbeach5 | 1.0 | SandSnap, image taken by: | sandsnap | 2021-11-08 | Beach?Y | 26.12871 | -97.16718 | ... | 0.624976 | 0.657068 | 0.785439 | 0.889342 | 1.016927 | 1.131754 | 1.276942 | 1.397932 | | |
| 1 | 611 | 611 | SPI6 | 1.0 | SandSnap, image taken by: | sandsnap | 2021-11-08 | Beach?Y | 26.12899 | -97.16713 | ... | 0.624976 | 0.657068 | 0.785439 | 0.889342 | 1.016927 | 1.131754 | 1.276942 | 1.397932 | | |
| 2 | 612 | 612 | SPI6 | 1.0 | SandSnap, image taken by: | sandsnap | 2021-11-08 | Beach?Y | 26.12899 | -97.16713 | ... | 0.624976 | 0.657068 | 0.785439 | 0.889342 | 1.016927 | 1.131754 | 1.276942 | 1.397932 | | |
| 3 | 853 | 853 | SPIbeach4 | 1.0 | SandSnap, image taken by: | sandsnap | 2021-11-08 | Beach?Y | 26.16883 | -97.17248 | ... | 0.624976 | 0.657068 | 0.785439 | 0.889342 | 1.016927 | 1.131754 | 1.276942 | 1.397932 | | |
| 4 | 854 | 854 | SPIbeach3 | 1.0 | SandSnap, image taken by: | sandsnap | 2021-11-08 | Beach?Y | 26.16885 | -97.17284 | ... | 0.624976 | 0.657068 | 0.785439 | 0.889342 | 1.016927 | 1.131754 | 1.276942 | 1.397932 | | |
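
Because the geocoder is called once per sample, it can help to estimate the run time from the number of rows before starting. The sketch below assumes about one request per second, which is the maximum rate allowed by the public Nominatim (OSM) usage policy; actual throughput depends on connectivity and may be slower. `estimate_runtime_minutes` is a hypothetical helper, not part of the notebook.

```python
def estimate_runtime_minutes(n_samples, seconds_per_request=1.0):
    """Rough lower bound, in minutes, for n_samples sequential requests.

    seconds_per_request is an assumption; 1.0 matches a limit of one
    request per second.
    """
    return n_samples * seconds_per_request / 60.0


# Example: 6,000 samples at one request per second
print(estimate_runtime_minutes(6000))  # 100.0
```

In the notebook, `estimate_runtime_minutes(len(df))` gives the estimate for the loaded dataset. If that is too long, a subset such as `df = df.head(100)` keeps the run short.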


### Add location field
 
This cell adds a new `Location` column containing the address of each sample, retrieved from OpenStreetMap (OSM).

In [None]:
# Add an empty column to hold the addresses
df["Location"] = ""
loc_col = df.columns.get_loc("Location")

# Loop through each sample
count = 0  # number of samples successfully assigned an address
for i in range(len(df)):
    try:
        lat = df['latitude'].iloc[i]
        lon = df['longitude'].iloc[i]
        # Run a reverse geocode on the sample lat/lon using OSM
        g = geocoder.osm([lat, lon], method='reverse')
        # Extract the address from the queried OSM JSON and write it in place.
        # Positional .iat avoids the chained assignment df['Location'].iloc[i] = ...,
        # which raises pandas' SettingWithCopyWarning and may not update df
        df.iat[i, loc_col] = g.json['address']
        count = count + 1
    except Exception:
        pass  # Skip samples whose location cannot be assigned (e.g. offshore samples)

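
Several samples can share the same coordinates (rows 1 and 2 in the preview above, for example), so one way to reduce the number of API calls is to look up each distinct lat/lon pair only once. The sketch below illustrates that idea under stated assumptions: `geocode_unique` is a hypothetical helper, and `fake_reverse_geocode` is a stand-in for the real OSM call so that the example runs offline.

```python
def geocode_unique(coords, reverse_geocode):
    """Look up each distinct (lat, lon) pair once and reuse the result.

    coords: iterable of (lat, lon) pairs.
    reverse_geocode: function taking (lat, lon) and returning an address.
    Failed lookups are stored as an empty string.
    """
    cache = {}
    addresses = []
    for lat, lon in coords:
        key = (lat, lon)
        if key not in cache:
            try:
                cache[key] = reverse_geocode(lat, lon)
            except Exception:
                cache[key] = ''  # e.g. offshore samples with no address
        addresses.append(cache[key])
    return addresses


# Stand-in for the real geocoder call; it records how often it is invoked
calls = []

def fake_reverse_geocode(lat, lon):
    calls.append((lat, lon))
    return 'address for {}, {}'.format(lat, lon)


coords = [(26.12871, -97.16718), (26.12899, -97.16713), (26.12899, -97.16713)]
addresses = geocode_unique(coords, fake_reverse_geocode)
print(len(addresses), 'addresses from', len(calls), 'lookups')
```

In the notebook, the stand-in could be replaced with a function that wraps `geocoder.osm([lat, lon], method='reverse').json['address']`, and the result assigned with `df['Location'] = geocode_unique(zip(df['latitude'], df['longitude']), ...)`.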


Let's view those locations

In [None]:
df['Location']

### Write to file

Finally, define a CSV file name for the output dataframe:

In [None]:
output_csvfile='../data_plus_locations.csv'

Write the data to that CSV file:

In [None]:
df.to_csv(output_csvfile)
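
By default, `DataFrame.to_csv` writes the row index as an extra, unnamed first column, and `pd.read_csv` then reads it back as `Unnamed: 0`. This is a likely origin of the `Unnamed: 0` columns visible in the `df.head()` preview. The sketch below shows the effect on a small made-up dataframe using an in-memory buffer; `round_trip` is a hypothetical helper.

```python
import io

import pandas as pd

# Small made-up dataframe standing in for the C-GRASP data
sample = pd.DataFrame({'d50': [0.785439, 0.657068]})


def round_trip(frame, **to_csv_kwargs):
    """Write a dataframe to an in-memory CSV and read it back."""
    buffer = io.StringIO()
    frame.to_csv(buffer, **to_csv_kwargs)
    buffer.seek(0)
    return pd.read_csv(buffer)


with_index = round_trip(sample)                   # default: index is written
without_index = round_trip(sample, index=False)   # index is omitted

print(list(with_index.columns))
print(list(without_index.columns))
```

If the extra column is not wanted in the output file, `df.to_csv(output_csvfile, index=False)` writes only the data columns.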