# `stedsans`

This notebook demonstrates the main current capabilities of `stedsans`.
It is strongly recommended to run the notebook in Google Colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/MalteHB/stedsans/blob/main/notebooks/stedsans_demo.ipynb)

If you run the notebook on your local machine, consider installing [Anaconda](https://docs.anaconda.com/anaconda/install/) and then installing `geopandas` with the `conda` package manager from an Anaconda-integrated terminal, which gives you the pre-built binaries:

```bash
conda install geopandas
```

Install `stedsans`:

In [None]:
!pip -q install stedsans

If you are using Google Colab, Linux or macOS, install `geopandas` using `pip`:

In [None]:
!pip -q install geopandas

If you are using Windows, install `geopandas` using `conda`:

```bash
conda install geopandas
```
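
Before continuing, it can be useful to confirm that both packages are importable in the active environment. The following is a minimal sketch that uses only the Python standard library; the helper name `missing_packages` is made up for this illustration and is not part of `stedsans`.

```python
import importlib.util


def missing_packages(names):
    """Return the package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means that both packages are installed.
print(missing_packages(["geopandas", "stedsans"]))
```

If the printed list is not empty, rerun the matching install command above for each listed package.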

__Importing packages__

In [None]:
import os
import sys

from pathlib import Path

import geopandas

from stedsans import stedsans
from stedsans.data.load_data import Articles, GeoData

## Language capabilities of `stedsans`

In [None]:
danish_sentence = "Malte er mit navn, og jeg bor på Testvej 13, Aarhus C"

default_stedsans = stedsans(sentence = danish_sentence)

default_entities = default_stedsans.extract_entities()

print(default_entities)

## Danish

In [None]:
danish_stedsans = stedsans(danish_sentence, language="danish")

danish_entities = danish_stedsans.extract_entities()

print(danish_entities)

In [None]:
new_danish_sentence = "Jakob er min flotte homies navn, og han bor også i Aarhus C"

danish_sentence_entities = default_stedsans.extract_entities(new_danish_sentence)

print(danish_sentence_entities)

## English

In [None]:
english_sentence = "Hello my name is Malte and i live in Aarhus C"

english_stedsans = stedsans(english_sentence, language="english")

english_entities = english_stedsans.extract_entities()

print(english_entities)

In [None]:
english_sentence_entities = english_stedsans.extract_entities("Jakob is the name of my handsome homie, and he also lives in Aarhus C")

print(english_sentence_entities)

## Geographic capabilities of `stedsans`

In [None]:
txt = "Han bor på Tesvej 13 Aarhus C. Jakob bor i Testparken Aarhus C. MCH Arena er et legendarisk sted. Hun bor tæt på Dejbjerglund Efterskole. I Randers laver man shawarma. LEGOLAND er det fedeste sted. Skanderborg Bryghus laver gode øl. AGF er et ringe hold. Han bor på Ingerslevs Boulevard. Fjordgaarden er en lækker restaurant. Knebel ligger på Mols Djursland. Vestebro ligger vest for Østerbro og tæt på Amager. Bruuns Galleri og Dokk1 er steder i Aarhus."

In [None]:
danmark = GeoData.municipalities()
region_m = danmark[danmark["REGIONNAVN"] == "Region Midtjylland"]

In [None]:
geo_demo = stedsans(sentence = txt, language = 'danish')

# Basic functionality: Extract entities marked as locations (LOC) or organisations (ORG) and find their corresponding coordinates

The data format of the output can be adjusted.
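
The filtering step described above can be illustrated in plain Python. This is only a sketch: it assumes, purely for illustration, that entities are available as `(text, label)` pairs. The actual return format of `extract_entities()` may differ, and `keep_geo_entities` is a hypothetical helper, not a `stedsans` function.

```python
# Hypothetical entity list; the real output format of stedsans may differ.
entities = [
    ("Malte", "PER"),
    ("Aarhus C", "LOC"),
    ("AGF", "ORG"),
]


def keep_geo_entities(entities, labels=("LOC", "ORG")):
    """Return the text of every entity whose label is in `labels`."""
    return [text for text, label in entities if label in labels]


print(keep_geo_entities(entities))
```

Only the entities that survive this kind of filter would then be passed on to geocoding.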

In [None]:
coords, df, gdf = geo_demo.get_coordinates()

print(coords)


In [15]:
print(df.head())


# Basic visualisation: Plotting points onto a map

Interactive folium map

In [None]:
#geo_demo.plot_locations()

Plotting onto a map layer passed as an argument (for example, one loaded from a shapefile)

In [None]:
shp_map = geo_demo.plot_locations(layer=danmark)

In [None]:
geo_demo.plot_locations(layer=danmark)

# Perform basic statistical point pattern tests

These Q-statistics functions enable a quick statistical analysis of the distribution of the points by testing for complete spatial randomness.
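
To make the idea concrete, here is a minimal sketch of a quadrat-count chi-square statistic in plain Python. It is a generic textbook version, not the `stedsans` implementation: the function `quadrat_chi2`, its parameters and the toy points are all assumptions made for this illustration. The study area is split into a `squares` x `squares` grid, the points in each cell are counted, and the counts are compared with the uniform expectation under complete spatial randomness.

```python
from collections import Counter


def quadrat_chi2(points, bounds, squares=2):
    """Return (chi-square statistic, degrees of freedom) for a square grid."""
    (xmin, ymin), (xmax, ymax) = bounds
    counts = Counter()
    for x, y in points:
        # Map each point to a grid cell; clamp points lying on the upper edge.
        col = min(int((x - xmin) / (xmax - xmin) * squares), squares - 1)
        row = min(int((y - ymin) / (ymax - ymin) * squares), squares - 1)
        counts[(row, col)] += 1
    n_cells = squares * squares
    # Under complete spatial randomness every cell expects the same count.
    expected = len(points) / n_cells
    chi2 = sum(
        (counts[(r, c)] - expected) ** 2 / expected
        for r in range(squares)
        for c in range(squares)
    )
    return chi2, n_cells - 1


unit_square = ((0.0, 0.0), (1.0, 1.0))
evenly_spread = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
clustered = [(0.1, 0.1), (0.2, 0.2), (0.1, 0.2), (0.2, 0.1)]

print(quadrat_chi2(evenly_spread, unit_square))  # (0.0, 3)
print(quadrat_chi2(clustered, unit_square))      # (12.0, 3)
```

A statistic that is large relative to the chi-square distribution with the given degrees of freedom speaks against complete spatial randomness, as with the clustered toy points.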

In [None]:
# Initialising a stedsans object
example = stedsans(sentence=txt)

# Printing the extracted entities
example.print_entities()


In [None]:
example.get_quad_stats()

In [None]:
# Plotting points with quadrats
example.plot_quad_count(squares = 4)

# Plotting region heatmaps on a given map layer

This tool gives a beautiful visual representation of the distribution of the extracted locations. The level of partitioning can be set using the *group_by* parameter.
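
Conceptually, a choropleth colours each area by the number of points that fall inside it, and the `group_by` column decides which areas are merged before counting. The sketch below shows that aggregation in plain Python. The rows, their counts and the helper `count_by` are invented for illustration; only the column names `DAGI_ID` and `REGIONNAVN` are taken from this notebook, and `stedsans` itself may compute the counts differently.

```python
from collections import defaultdict

# Hypothetical per-municipality point counts (IDs and values are made up).
rows = [
    {"DAGI_ID": "m1", "REGIONNAVN": "Region Midtjylland", "points": 3},
    {"DAGI_ID": "m2", "REGIONNAVN": "Region Midtjylland", "points": 1},
    {"DAGI_ID": "m3", "REGIONNAVN": "Region Hovedstaden", "points": 2},
]


def count_by(rows, group_by):
    """Sum point counts over all rows that share the same `group_by` value."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_by]] += row["points"]
    return dict(totals)


print(count_by(rows, "DAGI_ID"))     # one total per municipality
print(count_by(rows, "REGIONNAVN"))  # municipalities merged into regions
```

Grouping by a coarser column yields fewer, larger areas with higher counts, which is what changes the look of the map.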

By default `plot_choropleth()` plots the world.

In [None]:
geo_demo.plot_choropleth()

One can also use the `layer` argument to specify a GeoPandas `GeoDataFrame` to plot on.

In [None]:
danmark_choropleth = geo_demo.plot_choropleth(layer=danmark)
danmark_choropleth

In [None]:
denmark_heatmap_by_region = geo_demo.plot_choropleth(layer=danmark, group_by='REGIONNAVN')
denmark_heatmap_by_region

In [None]:
region_m_heatmap = geo_demo.plot_choropleth(layer=region_m, title='Region Midtjylland', group_by='DAGI_ID')
region_m_heatmap

# Aarhus article example for exam paper

In [None]:
aarhus_article = Articles.aarhus()

In [None]:
geo_demo = stedsans(file = aarhus_article, language = 'danish')

In [None]:
coords, df, gdf = geo_demo.get_coordinates()

In [None]:
df

In [None]:
geo_demo.plot_heatmap()

In [None]:
geo_demo.plot_heatmap(limit = 'country', limit_area = 'Danmark')

In [None]:
geo_demo.plot_heatmap(bounding_box=((55.859900,7.630005),(56.613931,10.958862)), bounded=True)

In [None]:
geo_demo.plot_heatmap(bounding_box=((55.9,7.6),(56.6,10.9)), bounded=False)

#### Choropleth

In [None]:
geo_demo.plot_choropleth()  # This might take a while :)

In [None]:
danmark_choropleth = geo_demo.plot_choropleth(layer=danmark)
danmark_choropleth

In [None]:
denmark_heatmap_by_region = geo_demo.plot_choropleth(layer=danmark, group_by='REGIONNAVN')
denmark_heatmap_by_region

In [None]:
region_m_heatmap = geo_demo.plot_choropleth(layer=region_m, title='Region Midtjylland', group_by='DAGI_ID')
region_m_heatmap

# Den Store Danske - Jylland

## Reading in the article

In [None]:
jylland_article = Articles.jylland()

## Initialising stedsans object

In [None]:
geo_demo = stedsans(file = jylland_article, language = 'danish')

## Plotting locations

### Plotting on interactive leaflet map

In [None]:
geo_demo.plot_locations()

### Plotting on shapefile layer

In [None]:
geo_demo.plot_locations(layer=danmark, on_map=True)

In [None]:
geo_demo.plot_locations(layer=region_m, on_map=True)

## Plotting heatmaps

### No restrictions

In [None]:
geo_demo.plot_heatmap()

### Plotting only locations in Denmark

In [None]:
geo_demo.plot_heatmap(limit = 'country', limit_area = 'Danmark')

### Bounding search area to Region Midtjylland

In [None]:
geo_demo.plot_heatmap(bounding_box=((55.9,7.6),(56.6,10.9)), bounded=True)
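
The `bounding_box` values used above look like two `(latitude, longitude)` corners, south-west first and north-east second; that reading is an assumption based on the numbers, not on documented behaviour. Under that assumption, the following plain-Python sketch shows what restricting points to the box means. `in_bounding_box` is a hypothetical helper, and `stedsans` may instead hand the box to its geocoder.

```python
def in_bounding_box(point, bounding_box):
    """Return True if a (lat, lon) point lies inside the box (edges included)."""
    (lat_min, lon_min), (lat_max, lon_max) = bounding_box
    lat, lon = point
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max


# Same box as in the cell above (assumed layout: south-west, north-east).
midtjylland_box = ((55.9, 7.6), (56.6, 10.9))

aarhus = (56.16, 10.20)      # approximate coordinates of Aarhus
copenhagen = (55.68, 12.57)  # approximate coordinates of Copenhagen

print(in_bounding_box(aarhus, midtjylland_box))      # True
print(in_bounding_box(copenhagen, midtjylland_box))  # False
```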

## Choropleth maps

### Plotting choropleth map of points bounded to Region Midtjylland on map of Denmark

In [None]:
geo_demo.plot_choropleth(layer=danmark, title='Jylland - Den Store Danske \n Bounded to Region Midtjylland', group_by='DAGI_ID', bounding_box=((55.9,7.6),(56.6, 10.9)), bounded=True)

### Plotting choropleth map grouped by region

In [None]:
geo_demo.plot_choropleth(layer=danmark, title='Jylland - Den Store Danske \n Grouped by Region', group_by='REGIONNAVN', bounding_box=((54.6,7.8),(57.8, 15.2)), bounded=False)

### Plotting choropleth map grouped by municipalities

In [None]:
geo_demo.plot_choropleth(layer=danmark, title='Jylland - Den Store Danske \n Unbounded', group_by='DAGI_ID')

## Quadrat Statistics

In [None]:
geo_demo.get_quad_stats(limit = 'country', limit_area = 'Danmark')

In [None]:
geo_demo.plot_quad_count(limit = 'country', limit_area = 'Danmark')