<a href="https://colab.research.google.com/github/Andylebrocq/arch-proj-1/blob/master/Spatial%20Archaeology%20Project%20Final.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Spatial Archaeology - Spatial Distances and Central Points**
This notebook is designed to teach you how to use scipy.spatial.distance in order to locate a central space between a number of archaeological points.

We are going to use some data around settlements in the Scottish Highlands during the Clearance Period to try and work out where we think larger nucleated towns and villages should have naturally formed against where new towns were actually built.

We'll be using some free online sources to do a lot of the complicated bits and also the scipy.spatial.distance modelling tools to work this out (don't worry about the terrifying looking maths, it does it all for you... honest!!!).

Oh yeah, we'll get to make some cool maps too! Enjoy.


# Getting our Tools
We're going to need a few different tools to work through this one.

We need to import some, install others and know where to go for the rest.

##Import##
Pandas

Matplotlib

Folium

Numpy

##Install & Import##
Geopandas

Scipy

##Online Calculators##
Converting your spatial co-ordinates:
https://www.bgs.ac.uk/data/webservices/convertForm.cfm#bngToLatLng

Calculating a geospatial central point:
http://www.geomidpoint.com/#using


In [0]:
import pandas as pd
import matplotlib
import folium
import numpy as np

In [0]:
!pip install geopandas
import geopandas as gpd

!pip install scipy
import scipy as sp

OK now we should have everything we need to get started except for ...

##Data!!!##

The core data we will be using today was obtained initially from Canmore and is open source.  The data relates specifically to sites in Sutherland in the Highlands of Scotland, namely settlements cleared during the Highland Clearances of ca. 1750-1850 AD.

The data has been cleaned up in two key ways.  First, the Site Type has been altered to reflect only clearance settlements and not other archaeological data from each site.  Secondly, I have had to use an online tool to manually change the Site Easting and Northing information to decimal Lat/Long co-ordinates.

That tool can be found here and is super simple to use:

https://www.bgs.ac.uk/data/webservices/convertForm.cfm#bngToLatLng

We can read in this initial data as below:


In [0]:
clearance_sites = pd.read_csv("https://raw.githubusercontent.com/Andylebrocq/arch-proj-1/master/Clearance.csv")
clearance_sites.head()

As you should be able to see, we have the spatial information in two forms (Easting/Northing and Lat/Long), along with site names and site types.  Next we want to visualise this on a fairly basic level, just to check that I have done the conversions correctly (i.e. are all these points still in Sutherland, or have I managed to transfer some of them to the North Sea?).


In [0]:
location = clearance_sites['LATITUDE'].mean(), clearance_sites['LONGITUDE'].mean()

m = folium.Map(location=location,zoom_start=10)

for i in range(0,len(clearance_sites)):
    folium.Marker([clearance_sites['LATITUDE'].iloc[i],clearance_sites['LONGITUDE'].iloc[i]]).add_to(m)
        
#To display the map simply ask for the object (called 'm') representation
m

Great news!  Everything is still on dry land, and in such a way that we can easily identify a couple of patterns.

For example, all of these points are geographically inland settlements.  We can also see that a number of them exist in small clusters.
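
If you would rather not rely on eyeballing the map, the same "is it still in Sutherland?" check can be roughed out in code. This is only a sketch: the bounding box limits are approximate guesses at the extent of Sutherland rather than surveyed boundaries, `points_outside_box` is a hypothetical helper, and the example rows are made up. The `LATITUDE` and `LONGITUDE` column names are the ones used in our CSV.

```python
import pandas as pd

# Rough bounding box for Sutherland in decimal degrees.
# ASSUMPTION: these limits are approximate, not official boundaries.
LAT_MIN, LAT_MAX = 57.8, 58.7
LON_MIN, LON_MAX = -5.5, -3.5


def points_outside_box(sites):
    """Return the rows whose LATITUDE/LONGITUDE fall outside the rough box."""
    inside = (
        sites['LATITUDE'].between(LAT_MIN, LAT_MAX)
        & sites['LONGITUDE'].between(LON_MIN, LON_MAX)
    )
    return sites[~inside]


# Made-up example rows: the second has its longitude sign flipped by mistake
example = pd.DataFrame({
    'LATITUDE': [58.21, 58.30],
    'LONGITUDE': [-4.16, 4.20],
})

# Any rows printed here would need their conversion checked again
print(points_outside_box(example))
```

Running `points_outside_box(clearance_sites)` on the real data should come back empty if every conversion landed where it should.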


##What To Do With All This Data?

Now that we have our data read in and checked for issues, what should we do with it?
Well let's ask an archaeological question.

What space should have been occupied by larger settlements had these small clusters been allowed to grow naturally?

So, how do we do this?

There are a number of ways, all of which use very scary and complicated looking mathematical formulae.  Most of these would involve us having to convert our co-ordinates again, into either radians or cartesian versions...

For instance, to convert all of our co-ordinates to cartesian we would need to run the calculation below for each pair (with lat and lon in radians, and R the radius of the Earth):


x = R * cos(lat) * cos(lon)

y = R * cos(lat) * sin(lon)

z = R * sin(lat)

However, we would then need to convert our co-ordinates back again once our various centre points had been found.
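
For the curious, the round trip described above (degrees to cartesian, average, then back to degrees) can be sketched in a few lines of Python. This is a minimal sketch of the usual midpoint-on-a-sphere approach, and not necessarily the exact method the online tool uses. `geographic_midpoint` is a hypothetical helper and the co-ordinates in the example are made up. Because R cancels out when we convert back to latitude and longitude, the sketch works on a unit sphere (R = 1).

```python
import math


def geographic_midpoint(points):
    """Return the (lat, lon) midpoint, in decimal degrees, of (lat, lon) pairs."""
    if not points:
        raise ValueError("need at least one point")

    x = y = z = 0.0
    for lat_deg, lon_deg in points:
        # The trig functions expect radians, not degrees
        lat = math.radians(lat_deg)
        lon = math.radians(lon_deg)
        # Cartesian co-ordinates on a unit sphere (R = 1)
        x += math.cos(lat) * math.cos(lon)
        y += math.cos(lat) * math.sin(lon)
        z += math.sin(lat)

    # Average the cartesian co-ordinates
    n = len(points)
    x, y, z = x / n, y / n, z / n

    # Convert the averaged point back to latitude and longitude
    lon_mid = math.atan2(y, x)
    lat_mid = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat_mid), math.degrees(lon_mid)


# Made-up cluster of three nearby points, purely for illustration
cluster = [(58.20, -4.10), (58.22, -4.15), (58.25, -4.12)]
print(geographic_midpoint(cluster))
```

For a small cluster like this the result sits very close to a plain average of the latitudes and longitudes; the cartesian route matters more as points get further apart.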


The good news for us is someone has kindly put together a tool to do all this for us already and it can be found here:

http://www.geomidpoint.com/#using


**Note**: If you are interested in how this works, the links below explain it:

https://stackoverflow.com/questions/10140029/convert-latitude-longitude-in-degree-radians

https://stackoverflow.com/questions/1185408/converting-from-longitude-latitude-to-cartesian-coordinates



In [0]:
projected_centres = pd.read_csv("https://raw.githubusercontent.com/Andylebrocq/arch-proj-1/master/Projected.csv")
projected_centres.head()

As you can see, I have now read in the data for 7 sites that may have become centres for nucleated settlements based on our original data.

I have had to exclude a number of the settlements as they were either A) too far away from other sites to really be included in their group, or B) on the other side of a loch!  (There's a frustrating number of lochs in Sutherland.)


Let's see where these points sit on a basic map.

In [0]:
location = projected_centres['LATITUDE'].mean(), projected_centres['LONGITUDE'].mean()

m = folium.Map(location=location,zoom_start=10)

for i in range(0,len(projected_centres)):
    folium.Marker([projected_centres['LATITUDE'].iloc[i],projected_centres['LONGITUDE'].iloc[i]]).add_to(m)
        
m

The final data we need, in order to compare projected settlement centres with actual settlement centres, are the towns actually built in the period for those cleared from their settlements.

In [0]:
new_settlements = pd.read_csv("https://raw.githubusercontent.com/Andylebrocq/arch-proj-1/master/New.csv")
new_settlements.head()

This data had to be drawn together from historical information, with the sites and their co-ordinates then located manually.

Let's have a look at where the people cleared from our first set of sites were moved to over the century.

In [0]:
location = new_settlements['LATITUDE'].mean(), new_settlements['LONGITUDE'].mean()

m = folium.Map(location=location,zoom_start=9)

for i in range(0,len(new_settlements)):
    folium.Marker([new_settlements['LATITUDE'].iloc[i],new_settlements['LONGITUDE'].iloc[i]]).add_to(m)
        
m

Visually we can start to see a couple of patterns from our three sets of data.  The pre-clearance settlements were close together, generally inland and central to the region.  The projected settlements reflect this, but the actual newly built settlements seem much more spread apart and sit on the periphery, along the coast and the boundaries.


Let's try and make this a wee bit easier to view.


##Bringing Our Data Together

First things first, we have 3 sets of data, and 3 maps.  This takes a lot of scrolling to see what is what.  Let's join up all our data.  

Now, as I like to keep things easy and I had to create 2 of the tables, I have been able to keep the same headings and formats in each.  We could take the easy route and just copy and paste it all together, but our notebook can do this for us.

We need to use a process called concatenation.

In [0]:
#combine the three datasets together to make a big dataset we call 'all_data'
all_data = pd.concat([clearance_sites,projected_centres,new_settlements], sort=False, ignore_index=True)
all_data.head()

Success!!   Now we can try making a map that helps display what our three previous maps did, but with clearer visuals.

Let's import a couple of other map-making tools.

In [0]:
%matplotlib inline
import seaborn as sns
import matplotlib.pyplot as plt
from folium.plugins import MeasureControl

Let's create a base map, based on the central point of all our co-ordinates (again, easiest just to use the online tool).

In [0]:
def generateBaseMap(default_location=[58.213398,-4.158034 ], default_zoom_start=9):
    base_map = folium.Map(location=default_location, control_scale=True, zoom_start=default_zoom_start)
    return base_map

In [0]:
base_map = generateBaseMap()
base_map

Great, that's working too.  Let's try mapping all of our points (initial pre-clearance settlements, projected centres and new settlements) on the one map.

We need to create 3 categories of site type in this case.

In [0]:
all_data['SITE TYPE'].value_counts()

What I have done here is run a quick count of each type of site. Now to map them.

In [0]:
#codecell_makeabasicmap_ManipulatingyourData_UsingSymbology

#now make a map just like you did before. Note that this time we're adding a scale bar with 'control_scale'
location = all_data['LATITUDE'].mean(), all_data['LONGITUDE'].mean()
m = folium.Map(location=location,zoom_start=10,control_scale = True)

#Assign a different colour to each of the three site categories
for i in range(0,len(all_data)):


    site_type = all_data['SITE TYPE'].iloc[i]
    if site_type == 'New Settlement':
        color = 'blue'
    elif site_type == 'Projected':
        color = 'green'
    else:
        color = 'red'
    
# add the markers to the map, using the locations and colours    
    folium.Marker([all_data['LATITUDE'].iloc[i],all_data['LONGITUDE'].iloc[i]],icon=folium.Icon(color=color)).add_to(m)

#type 'm' for map (the variable you set above) to tell the notebook to display your map
m

We can now see all of our points.  Red for pre-clearance settlements, green for the projected centre points, and finally blue for the actual new settlement sites.  
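
As a side note, the if/elif chain in the cell above can also be written as a dictionary lookup, which keeps the colours in one place if we want to change them later. This is just an alternative sketch, not a change to the map itself: `SITE_COLOURS` and `colour_for` are hypothetical names, and `'Township'` is a made-up stand-in for whatever label the pre-clearance rows carry.

```python
# Colour for each SITE TYPE value; anything not listed falls back to red,
# matching the 'else' branch of the loop in the map cell
SITE_COLOURS = {
    'New Settlement': 'blue',
    'Projected': 'green',
}


def colour_for(site_type):
    """Return the marker colour for a site type, defaulting to red."""
    return SITE_COLOURS.get(site_type, 'red')


# 'Township' is a made-up label standing in for the pre-clearance site type
for site_type in ['New Settlement', 'Projected', 'Township']:
    print(site_type, colour_for(site_type))
```

Inside the mapping loop, the whole if/elif block would then shrink to `color = colour_for(site_type)`.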

This isn't super clear, so what are we trying to achieve?  We want to see the differences between the projected spaces and the new settlements, so let's see if we can improve on that.

Let's put just those two sets of points on one map so we can see our conclusions clearly.

In [0]:
new_projected = pd.concat([projected_centres,new_settlements], sort=False, ignore_index=True)
new_projected.head()

We've taken just the New Settlement and Projected Settlement data out to map this.

In [0]:
location = new_projected['LATITUDE'].mean(), new_projected['LONGITUDE'].mean()
m = folium.Map(location=location,zoom_start=10,control_scale = True)

for i in range(0,len(new_projected)):


    site_type = new_projected['SITE TYPE'].iloc[i]
    if site_type == 'New Settlement':
        color = 'blue'
    elif site_type == 'Projected':
        color = 'green'
    else:
        color = 'red'
      
    folium.Marker([new_projected['LATITUDE'].iloc[i],new_projected['LONGITUDE'].iloc[i]],icon=folium.Icon(color=color)).add_to(m)

m

##Conclusions

We asked if we could see a change in likely settlement centres over the Highland Clearance Period of 1750-1850.  By calculating midpoints of local clusters of cleared settlements, and then mapping them against the settlement centres that were built over the period, we can see that we get some idea of how this changed:

From the centre to the periphery!
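
If you want to put a number on that shift, this is where scipy.spatial.distance can come in. The sketch below uses `cdist` with a haversine (great-circle) function to measure how far each projected centre is from its nearest new settlement. It rests on some assumptions: `haversine_km` and `nearest_distances_km` are hypothetical helpers, the Earth is treated as a sphere of radius 6371 km, and the co-ordinates in the example are made up rather than taken from our CSVs.

```python
import numpy as np
from scipy.spatial.distance import cdist

EARTH_RADIUS_KM = 6371.0  # ASSUMPTION: mean Earth radius, spherical approximation


def haversine_km(u, v):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = np.radians(u)
    lat2, lon2 = np.radians(v)
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def nearest_distances_km(from_points, to_points):
    """For each point in from_points, the distance in km to the closest point in to_points."""
    # cdist builds the full table of pairwise distances using our function
    distances = cdist(np.asarray(from_points, dtype=float),
                      np.asarray(to_points, dtype=float),
                      metric=haversine_km)
    return distances.min(axis=1)


# Made-up (lat, lon) pairs, purely for illustration
projected = [(58.20, -4.10), (58.30, -4.30)]
built = [(58.00, -3.90), (58.55, -4.70)]
print(nearest_distances_km(projected, built))
```

With the real data, the inputs would be `projected_centres[['LATITUDE', 'LONGITUDE']].to_numpy()` and `new_settlements[['LATITUDE', 'LONGITUDE']].to_numpy()`. Large values would back up what the map suggests: the new towns were built a long way from where the clusters were naturally centred.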

Of course, midpoints are only one way of calculating likely settlement locations. Next week we will look at how to factor in geographic and environmental factors, like lochs and big hills... but that's for next time!