### Initial Steps

In [0]:
#Like last time, get the tools we need at the start
import pandas as pd
import folium
import numpy as np

In [0]:
#Like last time, get the data we'll need at the start. I've been nice again and converted their coordinates to latitude and longitude for you. 
#You'll learn to do this yourself later in the course.
pottery = pd.read_csv('https://raw.githubusercontent.com/ropitz/spatialarchaeology/master/data/antikythera_survey_pottery.csv')
small_finds = pd.read_csv('https://raw.githubusercontent.com/ropitz/spatialarchaeology/master/data/antikythera_survey_small_finds.csv')
survey_data = pd.concat([pottery,small_finds], sort=False, ignore_index=True)
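
As a small aside, here is a sketch of what `pd.concat` does with `sort=False` and `ignore_index=True`. The two toy tables and their values are made up for illustration and are not the real survey files. It assumes the two inputs may not share every column, in which case the gaps are filled with `NaN`.

```python
import pandas as pd

# Toy stand-ins for the two survey tables (hypothetical values, not real data)
toy_pottery = pd.DataFrame({"DDLat": [35.87, 35.88], "MRom": [60, 10]})
toy_finds = pd.DataFrame({"DDLat": [35.86], "Material": ["obsidian"]})

# ignore_index=True renumbers the rows 0..n-1 instead of keeping each file's own index
# sort=False keeps the columns in the order they are first encountered
toy_combined = pd.concat([toy_pottery, toy_finds], sort=False, ignore_index=True)
print(toy_combined)
```

Columns that exist in only one table show up as `NaN` in the rows from the other table, which is one reason null values need handling later on.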

In [0]:
#check things loaded in and combined OK
survey_data.head()

In [0]:
#You can also check the individual files as well as the combined one we made
pottery.head()

In [0]:
# now we need to make a single column that has the most likely period listed in it, so that we can easily filter our data by the most likely period...
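# .copy() gives us an independent table, so adding a column later won't trigger a SettingWithCopyWarning
survey_data_time = survey_data[['MNLN', 'FNEB1', 'EB2', 'LPrePal', 'FPal', 'SPal', 'TPal', 'PPalPG', 'Geom', 'Arch', 'Class', 'Hell', 'ERom', 'MRom', 'LRom', 'EByz', 'MByz', 'EVen', 'MVen', 'LVen', 'Recent', 'Other']].copy()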
survey_data_time.head()

In [0]:
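#and get rid of null values and make sure everything is a number
#astype and fillna return a new table, so we assign the result back to keep the change
survey_data_time = survey_data_time.astype('float64').fillna(0)
survey_data_time.head()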
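
To see why the null values matter for the next step, here is a minimal sketch on made-up numbers. It shows two things: `astype` and `fillna` return a new table rather than changing the original, and `np.argmax` picks a `NaN` over any real number if one is present. The toy column values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical period scores with some gaps
toy = pd.DataFrame({
    "ERom": [10.0, np.nan],
    "MRom": [np.nan, 30.0],
    "LRom": [80.0, 70.0],
})

# fillna returns a new DataFrame; toy itself still contains the NaN values
cleaned = toy.astype("float64").fillna(0)
remaining_nulls = int(toy.isna().sum().sum())

# On the first row, argmax lands on the NaN position unless the gaps are filled
raw_pick = int(np.argmax(toy.iloc[0].values))
clean_pick = int(np.argmax(cleaned.iloc[0].values))
print(remaining_nulls, toy.columns[raw_pick], cleaned.columns[clean_pick])
```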


In [0]:
#here we take the columns from all the different periods, find the one with the maximum value, and write that column's name to our new 'colmax' field
def returncolname(row, colnames):
    return colnames[np.argmax(row.values)]

survey_data_time['colmax'] = survey_data_time.apply(lambda x: returncolname(x, survey_data_time.columns), axis=1)
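
If you want to convince yourself what the `apply` line is doing, this sketch runs the same helper on a tiny made-up table and compares it with pandas' built-in `idxmax(axis=1)`, which is an alternative way to get the name of the largest column in each row. The toy values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical period scores for two finds
toy = pd.DataFrame({
    "ERom": [10.0, 0.0],
    "MRom": [60.0, 30.0],
    "LRom": [30.0, 70.0],
})

def returncolname(row, colnames):
    # position of the largest value in the row, looked up in the column names
    return colnames[np.argmax(row.values)]

via_apply = toy.apply(lambda x: returncolname(x, toy.columns), axis=1)

# Built-in alternative: name of the column holding each row's maximum
via_idxmax = toy.idxmax(axis=1)
print(via_apply.tolist(), via_idxmax.tolist())
```

Both approaches return the first column when two periods tie for the maximum.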


In [0]:
#we can check it has all gone well
survey_data_time.head()

In [0]:
#now we can also add our new column back to our original data table by doing a 'merge'
survey_data_maxtime = pd.merge(survey_data, survey_data_time['colmax'], how='inner', left_index=True, right_index=True)
survey_data_maxtime.head()
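
The merge above lines rows up by their index rather than by a shared column. Here is a small sketch of that idea on made-up data, alongside `assign`, which is another way to attach an index-aligned column. The names `toy_full` and `toy_label` are hypothetical.

```python
import pandas as pd

# Hypothetical main table and a labelled Series sharing the same 0..2 index
toy_full = pd.DataFrame({"DDLat": [35.87, 35.88, 35.86], "MRom": [60, 10, 0]})
toy_label = pd.Series(["MRom", "LRom", "Other"], name="colmax")

# left_index/right_index tell merge to match rows on the index of both sides
merged = pd.merge(toy_full, toy_label, how="inner", left_index=True, right_index=True)

# assign also aligns on the index and adds the Series as a new column
assigned = toy_full.assign(colmax=toy_label)
print(merged)
```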

Right now you are justifiably confused. We'll be talking more about the mess that is 'other people's data' next week. For now, have a look at the documentation for these datasets at: https://archaeologydataservice.ac.uk/catalogue/adsdata/arch-1115-2/dissemination/csv/pottery/documentation/pottery.txt

You'll see they explain that many of those weird abbreviations are periods and that the number in each one represents the chance that a given find belongs to that period. Sometimes I wish people wouldn't use abbreviations like this, but they've defined them in their metadata file, so we can't complain too much.

In [0]:
#Now we can get to making some maps. Like last time, we'll use folium and one of its plugins
from folium.plugins import HeatMapWithTime


### Map Visualizations with Folium

Generating the base map that will be used throughout this notebook


In [0]:
#get the survey area centre
location_survey=survey_data_maxtime['DDLat'].mean(), survey_data_maxtime['DDLon'].mean()
print(location_survey)


In [0]:
#define a basemap we can reuse. Use the coordinates for the centre you generated just above to centre the basemap
#This is a variant on how we did things last time...
def generateBaseMap(default_location=[35.870086207930626, 23.301798820980512], default_zoom_start=11):
    base_map = folium.Map(location=default_location, control_scale=True, zoom_start=default_zoom_start)
    return base_map

Arguments:<br><br>
location: Define the default location to zoom at when rendering the map<br>
zoom_start: The zoom level that the map will default to when rendering the map<br>
control_scale: Shows the map scale for a given zoom level

In [0]:
#check the basemap is working
base_map = generateBaseMap()
base_map

**Analysis Question:**<br>
How does the distribution of finds change between different periods?



In [0]:
#let's get the heatmap tool, like last time
from folium.plugins import HeatMap


Let's start by comparing ERom to MRom, that is, Early Roman to Middle Roman, by putting their data in separate layers.

In [0]:
# make a layer for when each period is more than 50% likely, so you have all the sites that are probably in that period
survey_data_MRom = survey_data_maxtime.loc[(survey_data_maxtime['MRom'] > 50)]
survey_data_ERom = survey_data_maxtime.loc[(survey_data_maxtime['ERom'] > 50)]
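
The `.loc[...]` filter works by building a column of True/False values and keeping only the True rows. A minimal sketch on made-up numbers, assuming the scores are on a 0 to 100 scale as the threshold of 50 suggests:

```python
import pandas as pd

# Hypothetical finds with a Middle Roman score each
toy = pd.DataFrame({
    "DDLat": [35.87, 35.88, 35.86],
    "DDLon": [23.30, 23.31, 23.29],
    "MRom": [60, 50, 20],
})

# The comparison produces a boolean Series, one True/False per row
mask = toy["MRom"] > 50

# .loc keeps the rows where the mask is True
toy_MRom = toy.loc[mask]
print(toy_MRom)
```

Note that `>` is strict, so a find scoring exactly 50 is left out.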


In [0]:
# like last time, make heatmaps, but one for each period,  put them in different layers
base_map = generateBaseMap()
mrom = HeatMap(data=survey_data_MRom[['DDLat', 'DDLon', 'MRom']].groupby(['DDLat', 'DDLon']).sum().reset_index().values.tolist(), radius=8, max_zoom=13).add_to(base_map)
erom = HeatMap(data=survey_data_ERom[['DDLat', 'DDLon', 'ERom']].groupby(['DDLat', 'DDLon']).sum().reset_index().values.tolist(), radius=8, max_zoom=13).add_to(base_map)

#give the layers sensible names
mrom.layer_name = 'Middle Roman Distribution'
erom.layer_name = 'Early Roman Distribution'

# add the layer control
folium.LayerControl().add_to(base_map)
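
The long chain passed to `HeatMap` as `data` is easier to follow on a tiny table. This sketch uses made-up coordinates and scores to show how `groupby(...).sum()` adds up the scores of finds that share a location, and how `.values.tolist()` turns the result into the list of `[lat, lon, weight]` rows the heatmap is given.

```python
import pandas as pd

# Hypothetical finds: the first two share the same coordinates
toy = pd.DataFrame({
    "DDLat": [35.87, 35.87, 35.88],
    "DDLon": [23.30, 23.30, 23.31],
    "MRom": [60, 70, 55],
})

# Sum the scores per location, then flatten to plain nested lists
heat_data = (
    toy[["DDLat", "DDLon", "MRom"]]
    .groupby(["DDLat", "DDLon"])
    .sum()
    .reset_index()
    .values.tolist()
)
print(heat_data)
```

The two finds at the shared location collapse into one row whose weight is the sum of their scores.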


In [0]:
base_map

Now try and add some more layers to the map to show other periods!

In [0]:
# make a layer for when the most likely period (colmax) is ERom or MRom, to compare these periods
survey_data_erommax = survey_data_maxtime.loc[(survey_data_maxtime['colmax'] =='ERom')]
survey_data_mrommax = survey_data_maxtime.loc[(survey_data_maxtime['colmax'] =='MRom')]


In [0]:

# like last time, make heatmaps, but one for each period,  put them in different layers
base_map = generateBaseMap()

erommax = HeatMap(data=survey_data_erommax[['DDLat', 'DDLon']].groupby(['DDLat', 'DDLon']).sum().reset_index().values.tolist(), radius=8, max_zoom=13).add_to(base_map)
mrommax = HeatMap(data=survey_data_mrommax[['DDLat', 'DDLon']].groupby(['DDLat', 'DDLon']).sum().reset_index().values.tolist(), radius=8, max_zoom=13).add_to(base_map)

#give the layers sensible names
erommax.layer_name = 'Early Roman Distribution'
mrommax.layer_name = 'Middle Roman Distribution'

# add the layer control
folium.LayerControl().add_to(base_map)
base_map


Thought exercise: The results of these two maps should be similar but slightly different. What is making the difference?

That's all for today. Be sure to save your copy of the notebook in your own repo so I can see it!