# Geographies of *Robinson Crusoe*

Published in 1719, Daniel Defoe's novel recounts the story of a middle-class Englishman ("Robinson Crusoe, of York") who goes to sea and is shipwrecked on what the title page describes as "an un-inhabited Island on the Coast of AMERICA, near the Mouth of the Great River of Oroonoque."


![image](Crusoe-map.png)

A map published in the fourth edition of *Robinson Crusoe* in 1719.


This lesson will introduce two techniques: 

- [1. Mapping Place Names in *Robinson Crusoe*](#1.-Mapping-Place-Names-in-Robinson-Crusoe)
- [2. Place Names in *Robinson Crusoe* According to the spaCy Algorithm](#2.-Place-Names-in-Robinson-Crusoe-According-to-the-spaCy-Algorithm)



## 1. Mapping Place Names in *Robinson Crusoe*

Let's say you've come up with a list of place names in the novel. 

In this next section of the notebook, we'll use two libraries, [GeoPy](https://geopy.readthedocs.io/en/stable/) and [Folium](https://python-visualization.github.io/folium/), to create a map from a list of place names in Daniel Defoe's *Robinson Crusoe* (1719).

Below are the steps used to create our map:

### Install Modules

In [38]:
# !pip install geopy
from geopy.geocoders import Nominatim

# Nominatim asks each application to identify itself with its own user agent
geolocator = Nominatim(user_agent="YOUR NAME's mapping app", timeout=2)

# Import pandas, a library that will let us read in data from a spreadsheet
import pandas as pd
pd.set_option("display.max_rows", 400)
pd.set_option("display.max_colwidth", 400)

# Import folium
import folium

# Define functions
def find_location(row):
    """Geocode the place name in a row; return its address, latitude, longitude, and importance."""
    place = row['place']
    location = geolocator.geocode(place)

    if location is not None:
        return location.address, location.latitude, location.longitude, location.raw['importance']
    else:
        return "Not Found", "Not Found", "Not Found", "Not Found"

def create_map_markers(row, map_name):
    """Add a marker at a row's coordinates to the map, using the place name as the popup."""
    folium.Marker(location=[row['lat'], row['lon']], popup=row['place']).add_to(map_name)

Let's look at an example. Below, we get the coordinates for a single place name, "York":

In [39]:
location = geolocator.geocode("York")
location

Location(York, Yorkshire and the Humber, England, YO1 8SG, United Kingdom, (53.9590555, -1.0815361, 0.0))

### Let's read in our list of place names from *Robinson Crusoe*
Import a spreadsheet with a list of names gathered from the explanatory notes of the Penguin edition.

In [40]:
crusoe_df = pd.read_csv("Cruose-locations-in-Penguin-notes.csv")

Let's check to make sure we imported the right file:

In [41]:
crusoe_df

Unnamed: 0,place
0,Orinoco
1,York
2,Bremen
3,Dunkirk
4,Humber
5,Great Yarmouth
6,Norfolk Coast
7,Salé
8,Strait of Gibraltar
9,Canaries


In [42]:
# Let's use the GeoPy library to look up coordinates for each place name in our list
crusoe_df[['address', 'lat', 'lon', 'importance']] = crusoe_df.apply(find_location, axis="columns", result_type="expand")
crusoe_df

Unnamed: 0,place,address,lat,lon,importance
0,Orinoco,"Río Orinoco, 8002, Venezuela",7.63367,-64.850407,0.579878
1,York,"York, Yorkshire and the Humber, England, YO1 8SG, United Kingdom",53.959055,-1.081536,0.720326
2,Bremen,"Bremen, Deutschland",53.07582,8.807165,0.765049
3,Dunkirk,"Dunkerque, Nord, Hauts-de-France, France métropolitaine, France",51.034771,2.377252,0.601437
4,Humber,"Humber, Hull, Yorkshire and the Humber, England, DN19 7EW, United Kingdom",53.654226,-0.197158,0.553503
5,Great Yarmouth,"Great Yarmouth, Norfolk, East of England, England, United Kingdom",52.607174,1.731485,0.600039
6,Norfolk Coast,"Norfolk Coast AONB, Norfolk, East of England, England, United Kingdom",52.829627,0.435967,0.50026
7,Salé,"Salé سلا, Préfecture de Salé عمالة سلا, Rabat-Salé-Kénitra ⵔⴱⴰⵟ-ⵙⵍⴰ-ⵇⵏⵉⵟⵔⴰ الرباط-سلا-القنيطرة, Maroc / ⵍⵎⵖⵔⵉⴱ / المغرب",34.044889,-6.814017,0.592458
8,Strait of Gibraltar,"Strait of Gibraltar / مضيق جبل طارق / Estrecho de Gibraltar, Ceuta, España",35.973495,-5.638994,0.848856
9,Canaries,"Canarias, España",28.293578,-16.621447,0.653769
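
The cell above sends one request to Nominatim per row. The public Nominatim service asks for no more than one request per second, and a longer list would likely repeat some names. The sketch below shows one possible way to space out and cache lookups. It uses only the standard library; `make_polite_geocoder` and `fake_geocode` are hypothetical names introduced for illustration, and `fake_geocode` stands in for `geolocator.geocode` so the example runs without a network connection.

```python
import time
from functools import lru_cache

def make_polite_geocoder(geocode, min_delay_seconds=1.0):
    """Wrap a geocode-style callable so lookups are cached and spaced out."""
    last_call = [float("-inf")]  # time of the most recent real lookup

    @lru_cache(maxsize=None)
    def polite_geocode(place):
        # Sleep only if the previous real lookup was too recent
        wait = min_delay_seconds - (time.monotonic() - last_call[0])
        if wait > 0:
            time.sleep(wait)
        last_call[0] = time.monotonic()
        return geocode(place)

    return polite_geocode

# Hypothetical stand-in for geolocator.geocode (no network needed)
calls = []
def fake_geocode(place):
    calls.append(place)
    return {"York": (53.959055, -1.081536)}.get(place)

polite = make_polite_geocoder(fake_geocode, min_delay_seconds=0.01)
results = [polite(name) for name in ["York", "York", "Atlantis"]]
print(calls)  # the repeated "York" is served from the cache
```

In the notebook itself, `make_polite_geocoder(geolocator.geocode)` could replace the direct call inside `find_location`. GeoPy also provides a `RateLimiter` helper in `geopy.extra.rate_limiter` that handles the delay (but not the caching).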


In [43]:
# Create a base map
crusoe_map = folium.Map(location=[40.35, -74.65], zoom_start=2)
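
The starting coordinates of the base map are a fixed choice. One alternative, sketched here under the assumption that a simple average is good enough, is to center the map on the mean of the geocoded coordinates. The values are copied by hand from the geocoding output shown earlier, and `mean_center` is a hypothetical helper, not part of Folium.

```python
# (latitude, longitude) pairs copied by hand from the geocoding output
coordinates = [
    (7.63367, -64.850407),    # Orinoco
    (53.959055, -1.081536),   # York
    (53.07582, 8.807165),     # Bremen
    (51.034771, 2.377252),    # Dunkirk
    (53.654226, -0.197158),   # Humber
    (52.607174, 1.731485),    # Great Yarmouth
    (52.829627, 0.435967),    # Norfolk Coast
    (34.044889, -6.814017),   # Salé
    (35.973495, -5.638994),   # Strait of Gibraltar
    (28.293578, -16.621447),  # Canaries
]

def mean_center(points):
    """Return the arithmetic mean latitude and longitude of (lat, lon) pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return sum(lats) / len(lats), sum(lons) / len(lons)

center_lat, center_lon = mean_center(coordinates)
print(round(center_lat, 2), round(center_lon, 2))
```

The resulting pair could be passed to `folium.Map` as `location=[center_lat, center_lon]`. A plain average is only a rough center, and it would mislead for points on both sides of the 180° meridian, which is not the case for this list.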

In [44]:
# Drop any rows whose address was not found
found_crusoe_locations = crusoe_df[crusoe_df['address'] != "Not Found"]
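
Before discarding rows, it can help to see which names failed, since historical spellings often need a modern equivalent. The sketch below applies the same boolean filter in the opposite direction. `sample_df` is a small hand-made stand-in for `crusoe_df`, and "Atlantis" is a made-up name chosen to represent a failed lookup.

```python
import pandas as pd

# Hand-made stand-in for crusoe_df after geocoding
sample_df = pd.DataFrame({
    "place": ["York", "Atlantis"],
    "address": [
        "York, Yorkshire and the Humber, England, YO1 8SG, United Kingdom",
        "Not Found",
    ],
})

# Keep only the rows that the geocoder could not resolve
not_found = sample_df[sample_df["address"] == "Not Found"]
missing_names = not_found["place"].tolist()
print(missing_names)
```

Names on this list could then be corrected in the spreadsheet and geocoded again.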

In [55]:
# plot our coordinates on our map
found_crusoe_locations.apply(create_map_markers, map_name=crusoe_map, axis='columns')
crusoe_map

### To save map as an HTML file

In [46]:
crusoe_map.save("Crusoe-locations-in-Penguin-notes-map.html")

## 2. Place Names in *Robinson Crusoe* According to the spaCy Algorithm

We've created a map of *Robinson Crusoe* based on place names that we determined ourselves. But what if we wanted to do this on a much larger scale?

### Named Entity Recognition
Using a form of machine learning called "named entity recognition," let's see what a natural language processing model identifies as a place. We're going to use spaCy, an open-source natural language processing library that ships with pretrained machine learning models.

In [None]:
# Let's install spaCy
!pip install -U spacy

In [3]:
import spacy
from spacy import displacy
from collections import Counter
import pandas as pd
pd.options.display.max_rows = 600
pd.options.display.max_colwidth = 400

In [None]:
!python -m spacy download en_core_web_sm

In [5]:
import en_core_web_sm
nlp = en_core_web_sm.load()

In [6]:
filepath = "Crusoe.txt"
# Read the full text of the novel, closing the file when done
with open(filepath, encoding='utf-8') as f:
    text = f.read()
document = nlp(text)

In [None]:
displacy.render(document, style="ent")
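
displaCy shows each entity in context, but a tally makes it easier to see which names dominate. Each entity in `document.ents` has a `.text` and a `.label_`, so one option is to count them with the `Counter` class imported earlier. The sketch below uses a short hand-typed list of pairs as a stand-in for real model output, so it runs without spaCy installed.

```python
from collections import Counter

# Hand-typed stand-ins for (entity.text, entity.label_) pairs. With the real
# document these could come from:
#   [(ent.text, ent.label_) for ent in document.ents]
sample_entities = [
    ("London", "GPE"),
    ("Bremen", "GPE"),
    ("England", "GPE"),
    ("London", "GPE"),
    ("Providence", "GPE"),
    ("Robinson Crusoe", "PERSON"),
    ("England", "GPE"),
    ("London", "GPE"),
]

# Count only the entities labeled as geopolitical entities (GPE)
place_counts = Counter(text for text, label in sample_entities if label == "GPE")

for place, count in place_counts.most_common():
    print(place, count)
```

Sorting by frequency also makes likely mislabels easier to spot, because a word that is not a place but appears often will rise to the top of the list.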

### Using Named Entity Recognition to generate a list of place names
The code below prints every span of text in *Robinson Crusoe* that the [spaCy](https://spacy.io/api/entityrecognizer) machine learning model has labeled "GPE" (a geopolitical entity, such as a country, city, or state), which we treat here as a "place."

In [7]:
for named_entity in document.ents:
    if named_entity.label_ == "GPE":
        print(named_entity)

London
Bremen
England
Flanders
London
Humber
Providence
Newcastle
mirth
Cromer
Yarmouth
London
Yarmouth
Providence
London
Guinea
London
Guinea
London
Guinea
the Canary Islands
Portugal
Spain
Morocco
Morocco
Teneriffe
Gambia
Guinea
Brazil
Guinea
Brazil
Santos
London
Lisbon
England
London
London
London
Portugal
London
Lisbon
Providence
St. Salvador
Guinea
Guinea
Spain
Portugal
Guinea
Guinea
England
England
Brazil
Brazil
America
Brazil
Providence
England
pickaxe
England
Lisbon
Providence
Providence
Providence
England
Brazil
thou
Providence
Providence
Providence
Israel
Barbary
England
America
Providence
England
England
BOAT
Brazil
Jerusalem
periagua
England
Providence
Providence
England
wishes—“O
Providence
England
Yorkshire
England
Providence
thou
Providence
America
Spain
Providence
Providence
England
America
Spain
England
Providence
Brazils
Guinea
Peru
Providence
Providence
Providence
Germany
Providence
Trinidad
St. Martha
America
Benamuckee
Benamuckee
Benamuckee
Benamuckee
Benamuckee
Pr