# Point data


## Check if all packages have been installed correctly

In [None]:
import geopandas as gpd
import folium
import json
import branca
import rasterio
import rasterio.mask
import matplotlib.pyplot as plt
import numpy as np
from rasterio.plot import show

print('All libraries are downloaded and imported correctly')

## Point data formats

We will start our exploration of `geopandas` functionalities by loading geospatial data from GeoJSON and shapefiles using the `read_file` function.

### GeoJSON files

JSON (JavaScript Object Notation, pronounced /ˈdʒeɪsən/) is an open standard file format and data interchange format that uses human-readable text. JSON is used to store and transmit data objects consisting of attribute–value pairs and arrays (or other serializable values).

GeoJSON is another open standard format, based on JSON, designed for representing simple geographical features along with their non-spatial attributes. The features include points, line strings, polygons, and multi-part collections of these types.

Here is what a GeoJSON file looks like:

In [None]:
with open('Data/point/traj.geojson', 'r') as f:
    data = json.load(f)
print(json.dumps(data, indent=4, sort_keys=True))
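
To make the structure easier to recognise in the output above, here is a minimal sketch of a GeoJSON document built by hand with the standard library only. The feature, its coordinates and its `name` property are made up for illustration and are not taken from `traj.geojson`.

```python
import json

# A hand-written FeatureCollection with a single Point feature.
# The values are illustrative only.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [4.46, 51.94]},
            # Non-spatial attributes live under "properties"
            "properties": {"name": "example point"},
        }
    ],
}

# Serialise to text and parse it back, as json.load does for a file
text = json.dumps(feature_collection, indent=4, sort_keys=True)
print(text)

parsed = json.loads(text)
lon, lat = parsed["features"][0]["geometry"]["coordinates"]
print(lon, lat)
```

Each feature pairs one `geometry` with a dictionary of `properties`; this is the pairing that `geopandas` turns into a row with a geometry entry and attribute columns.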

We will start our explorations using a dataset with Point geospatial information for the intersections (kruispunten) of a road network in the Netherlands. 

`geopandas` creates a GeoDataFrame object when loading data from files.

In [None]:
df = gpd.read_file('Data/point/kruispunten.geojson')

In [None]:
type(df)

Notice how closely it resembles a `pandas` DataFrame.

In [None]:
df.head()

Indeed, each column of this GeoDataFrame is a `pandas` Series, such as the "OMSCHR" descriptive column (note: omschrijving = description in Dutch) or "WVK_ID", i.e. the segment id that these junctions belong to.

In [None]:
type(df['OMSCHR'])

However, the last column of the dataset, named "geometry", is a `geopandas` GeoSeries.

In [None]:
type(df['geometry'])

The column's `dtype` reflects this: unlike typical `pandas` dtypes, it is reported as `geometry`.

In [None]:
df.dtypes
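
The same type relationships can be checked without the data files. The following is a minimal sketch on a hand-built GeoDataFrame; the attribute values and coordinates are hypothetical and only the column name `OMSCHR` is borrowed from the dataset.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

# Small stand-in GeoDataFrame (hypothetical values, RD New coordinates)
gdf = gpd.GeoDataFrame(
    {"OMSCHR": ["example A", "example B"]},
    geometry=[Point(155000, 463000), Point(156000, 464000)],
    crs="EPSG:28992",
)

# A GeoDataFrame is a subclass of the pandas DataFrame
print(isinstance(gdf, pd.DataFrame))

# Attribute columns are plain pandas Series, the geometry column is a GeoSeries
print(type(gdf["OMSCHR"]).__name__)
print(type(gdf["geometry"]).__name__)

# The geometry column reports the special "geometry" dtype
print(gdf.dtypes)
```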

As specified in the `geopandas` [documentation](https://geopandas.org/en/stable/docs/user_guide/data_structures.html), this GeoSeries column holds a special status, and is usually referred to as the most important property of a GeoDataFrame. For instance, when a spatial method is applied to a GeoDataFrame (or a spatial attribute like `area` is called), the command always acts on the “geometry” column.

A GeoDataFrame may also contain other columns with geometrical objects, but only one column can be the active geometry at a time. To change which column is the active geometry column, one can use the `GeoDataFrame.set_geometry()` method.

The “geometry” column – no matter its name – can be accessed through the geometry attribute (e.g., `df.geometry`), and the name of the geometry column can be found by typing `df.geometry.name`.

In [None]:
df.geometry

In [None]:
df.geometry.name
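
To illustrate the active geometry column, here is a minimal sketch with two geometry columns. The points, the 100 m buffer distance and the column name `buffered` are assumptions chosen for the example, not part of the kruispunten dataset.

```python
import geopandas as gpd
from shapely.geometry import Point

# Two hypothetical points in the Dutch projected CRS (units are metres)
gdf = gpd.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[Point(155000, 463000), Point(156000, 464000)],
    crs="EPSG:28992",
)

# Add a second geometry column: circles of 100 m radius around each point
gdf["buffered"] = gdf.geometry.buffer(100)

# Spatial attributes act on the active geometry column (the points)
print(gdf.geometry.name)
print(gdf.area.tolist())  # points have zero area

# Switch the active geometry to the buffered polygons
gdf_buf = gdf.set_geometry("buffered")
print(gdf_buf.geometry.name)
print(gdf_buf.area.round().tolist())  # roughly pi * 100**2 each
```

Note that `set_geometry` returns a new GeoDataFrame here, so `gdf` keeps the points as its active geometry.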

`geopandas` geometric objects are (usually) `shapely` objects. [Shapely](https://shapely.readthedocs.io/en/stable/manual.html) is a Python library used for manipulation and analysis of planar geometric objects.



In [None]:
type(df.loc[0,'geometry'])

While entries in a GeoSeries need not be of the same `shapely` geometric type, certain export operations will fail if this is not the case.
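
As a quick illustration, the sketch below builds a GeoSeries that mixes a point and a line (both made up) and checks whether all entries share one geometry type, which is a simple test to run before exporting.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point

# A GeoSeries may hold different geometry types side by side
mixed = gpd.GeoSeries([Point(0, 0), LineString([(0, 0), (1, 1)])])
print(mixed.geom_type.tolist())

# True only when every entry has the same geometry type
is_homogeneous = mixed.geom_type.nunique() == 1
print(is_homogeneous)
```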

When a `shapely` object is the last expression in a notebook cell, it is rendered as a basic plot. We can show the coordinates of the object instead by using the `print` function.

In [None]:
df.geometry[0]

In [None]:
print(df.geometry[0])
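
The text printed for a geometry is its well-known text (WKT) representation. A minimal sketch on a stand-alone `shapely` point (with made-up coordinates) shows how to get that text and the raw coordinates programmatically:

```python
from shapely.geometry import Point

# Hypothetical point, not taken from the dataset
point = Point(4.46, 51.94)

# Well-known text, the same form that print() shows
print(point.wkt)

# Individual coordinates and the full coordinate sequence
print(point.x, point.y)
print(list(point.coords))
```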

Information on the coordinate reference system (CRS) of the data is accessed with `df.crs`. As you can see, the projection is specific to the [Netherlands](https://epsg.io/28992) (EPSG:28992, Amersfoort / RD New).

In [None]:
df.crs
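
Coordinates in this projection are expressed in metres rather than degrees. As a minimal sketch, assuming a single illustrative point near the origin of the Dutch grid, `to_crs` can reproject from EPSG:28992 to longitude and latitude (EPSG:4326):

```python
import geopandas as gpd
from shapely.geometry import Point

# One illustrative point in RD New coordinates (metres)
rd = gpd.GeoSeries([Point(155000, 463000)], crs="EPSG:28992")
print(rd.crs.to_epsg())

# Reproject to WGS 84 longitude/latitude
wgs84 = rd.to_crs("EPSG:4326")
lon, lat = wgs84.iloc[0].x, wgs84.iloc[0].y
print(round(lon, 3), round(lat, 3))
```

The reprojected point should fall close to Amersfoort, in the centre of the Netherlands.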

### Shapefiles

We can load the same dataset from shapefiles. As described in the [introduction](0_Introduction.pdf), the shapefile format is a geospatial vector data format for geographic information system (GIS) software. To load shapefile geospatial data, we load the file with the ".shp" extension.

In [None]:
df = gpd.read_file('Data/point/kruispunten.shp')

In [None]:
df.head()

## Point data visualisation with `folium`

Geospatial information can be easily plotted on a map using [Folium](https://python-visualization.github.io/folium/). We usually start by creating a *base map* that serves as the background for our geospatial dataset.

The `folium.Map` class creates a base map of given width and height with either default tilesets or a custom tileset URL. You can think of a *tileset* as a collection of adjacent images that can be joined together to display a map (more efficient than using a single large image).

`folium` employs [OpenStreetMap](https://www.openstreetmap.org/) as the default tileset, although others can be specified with the `tiles` argument.

The following code creates a map centered in South Holland by specifying appropriate latitude and longitude as `location` and starting zoom level in `zoom_start`.

In [None]:
# create
sh_map = folium.Map(
    location=[51.94, 4.46],
    zoom_start=10)

# show
sh_map

We can visualise our geospatial data by adding a GeoJSON layer to the base map using the information contained in our `geopandas` GeoDataFrame. See `folium.features.GeoJson` in the [folium documentation](https://python-visualization.github.io/folium/modules.html#folium.features.GeoJsonTooltip) for more information about this class.

In [None]:
# create
gjson = folium.features.GeoJson(
    df,
).add_to(sh_map)

# show
sh_map

At the moment, the map just shows the locations of our points. To visualise more information on the map, we can use the `GeoJsonPopup` class, which creates "pop-up" windows displaying selected attributes of the point being clicked on. We do this by specifying the `fields` of interest in the GeoDataFrame; we use `aliases` to replace the obscure column names with readable labels.

In [None]:
# create
folium.features.GeoJsonPopup(
    fields=['OMSCHR', 'RIJRTNGHRB'],
    aliases=['Description', 'Information']    
).add_to(gjson)

# show
sh_map