# Geospatial mapping with Pyinaturalist and Folium

Here we will use Pyinaturalist and Folium to create interactive HTML maps that plot iNaturalist observations. In this example we will create a simple map where each observation can be clicked to open a pop-up with a thumbnail image, basic observation information, and a link to the observation page on iNaturalist.

[Folium](https://python-visualization.github.io/folium/latest/index.html) is a mapping library that takes care of background details such as downloading and stitching map tiles together. It also lets you choose from several different base map tile providers.

The first thing to do is import all the necessary libraries. The non-standard libraries used in this notebook are Folium, [Dateutil](https://dateutil.readthedocs.io/en/stable/), [Pandas](https://pandas.pydata.org/), and, of course, Pyinaturalist, so you will have to install them using `pip` or `conda`, depending on your Python installation.

In [1]:
import json
from datetime import datetime
from os.path import exists

from dateutil import parser, tz
import pandas as pd
import folium
from folium.plugins import HeatMap

from pyinaturalist import (
    get_observations,
    get_places_autocomplete,
    Observation,
    Taxon,
    get_taxa,
    get_taxa_autocomplete,
    pprint,
)

### Get location and taxon IDs, and edit customizations

For this example, we will create a map of all bee observations in the town of Osoyoos, BC. You are free to run the code as is, or you can choose your own favorite taxa and location and produce a map for those instead. If so, the annotated source code explains where to make changes.

To get observations for a certain geographic area, you need to know how iNaturalist encodes that area in terms of a `place_id`. All geographic places, small or large, are assigned a unique number, and this number can be [non-trivial to find](https://forum.inaturalist.org/t/is-there-a-place-where-i-can-go-to-get-a-list-of-inaturalist-place-ids/4016). Luckily, iNat (and Pyinaturalist) provide an API to search by plain-text place name.

In [2]:
# Search for a place ID by name
response = get_places_autocomplete(q='Osoyoos')
pprint(response)
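The pretty-printed output is convenient for browsing, but it can also help to pull out just the IDs and names. The following is a minimal sketch, assuming the response is a dictionary with a `results` list whose entries carry `id`, `name`, and `display_name` keys. The helper `summarize_places` and the sample data (apart from the Osoyoos ID used in this notebook) are hypothetical.

```python
def summarize_places(response):
    """Return (id, label) pairs from a places-autocomplete style response."""
    return [
        (place["id"], place.get("display_name") or place.get("name", ""))
        for place in response.get("results", [])
    ]


# Hypothetical sample shaped like the assumed response structure;
# the second entry and all display names are made up for illustration
sample_response = {
    "results": [
        {"id": 121320, "name": "Osoyoos", "display_name": "Osoyoos, BC, CA"},
        {"id": 999999, "name": "Osoyoos Lake", "display_name": "Osoyoos Lake (example)"},
    ]
}

for place_id, label in summarize_places(sample_response):
    print(place_id, label)
```

With the real `response` from the cell above, the same loop would list the candidate places so you can pick the ID that matches your area of interest.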

If you already know the taxon ID, scientific name, or common name of the taxon (or taxa) you are interested in, then you can pass these directly to the `taxon_name=` or `taxon_id=` arguments in the call to `get_observations()`. Otherwise, there is a similar API for retrieving taxon IDs by searching on common or scientific names. You can use the optional `rank=` argument to limit responses to specific taxon ranks.

In [3]:
response = get_taxa_autocomplete(q='bees', rank=['family', 'epifamily'])
pprint(response)

From the two calls above, we can see that the `place_id` for Osoyoos is `121320` and the `taxon_id` for all bees is `630955` (epifamily Anthophila). Let's assign those values to some variables. Note that if you want to include multiple taxa, you can combine their IDs in a list and assign that list to `TAXON_ID`. We'll also create a variable named `DATASET_NAME`, which is used as the base name for both the JSON response we will save locally and the HTML file for the completed map.

In [4]:
# Change to something appropriate
DATASET_NAME = "osoyoos_bees"

DATASET_FILENAME = f"{DATASET_NAME}.json"
DATASET_MAPNAME = f"{DATASET_NAME}.html"

# Place id (Choose from results of get_places_autocomplete call above...)
PLACE_ID = 121320

# Use either taxon name or taxon id
# If you are interested in multiple taxa, you can add them in a list:
# TAXON_ID=[6933, 558438]
TAXON_NAME = "Epifamily Anthophila"    # Bees!
TAXON_ID = 630955                      # Bees!
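Since only one of the two taxon settings is needed, one way to keep that choice in a single place is a small helper that builds the keyword arguments. This is a sketch only: `taxon_filter` is a hypothetical helper, not part of Pyinaturalist, and it simply prefers the ID when both are given.

```python
def taxon_filter(taxon_id=None, taxon_name=None):
    """Build keyword arguments selecting a taxon by ID(s) or, failing that, by name."""
    if taxon_id is not None:
        # A single ID or a list of IDs is passed through unchanged
        return {"taxon_id": taxon_id}
    if taxon_name:
        return {"taxon_name": taxon_name}
    raise ValueError("Provide a taxon_id or a taxon_name")


print(taxon_filter(taxon_id=630955))
print(taxon_filter(taxon_id=[6933, 558438]))
print(taxon_filter(taxon_name="Epifamily Anthophila"))
```

The resulting dictionary could then be unpacked into the request, for example `get_observations(**taxon_filter(taxon_id=TAXON_ID), place_id=PLACE_ID)`.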

### Get the data from iNat

Now we will get the actual observation data using Pyinaturalist. It is important to keep in mind that this call places real demands on the underlying iNaturalist server infrastructure, so it is best practice not to make the call excessively broad. For example, getting all mallard observations in Canada would be a ridiculously large request. When experimenting with this code, try to create a query with limited geographic scope, specialized and rarer taxa, or both.

To further save on network resources, the following code only makes the API request if there is no local file named `osoyoos_bees.json` (or whatever file name results from the value you assigned to `DATASET_NAME`). This ensures that subsequent runs of the notebook reload the observations from the locally saved copy rather than performing another API request.

In [5]:
if not exists(DATASET_FILENAME):
    print("Making API call...")
    
    # The API call. See pyinaturalist documentation for additional arguments
    # you can pass to this call. For example, you can filter based on temporal
    # values, research grade, observations that belong to individual iNat projects,
    # specific users, specific identifiers, and many, many more.
    observations = get_observations(
        #taxon_name=TAXON_NAME,
        taxon_id=TAXON_ID,
        photos=True,
        geo=True,
        geoprivacy='open',
        place_id=PLACE_ID,
        page='all',
    )
    
    # Save results for future usage
    with open(DATASET_FILENAME, 'w') as f:
        json.dump(observations, f, indent=4, sort_keys=True, default=str)
else:
    print("Using local copy of data...")
    

Making API call...
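The same fetch-once-then-reuse pattern can be written as a reusable function. The following is a minimal sketch using only the standard library; `load_or_fetch` and `fake_fetch` are hypothetical names, and the stub stands in for the real `get_observations()` call so that the example runs without touching the network.

```python
import json
import os
import tempfile
from pathlib import Path


def load_or_fetch(path, fetch):
    """Return cached JSON from `path`, calling `fetch()` only if the file is missing."""
    path = Path(path)
    if path.exists():
        with path.open() as f:
            return json.load(f)
    data = fetch()
    with path.open("w") as f:
        # default=str stringifies values (such as datetimes) that JSON cannot encode
        json.dump(data, f, indent=4, sort_keys=True, default=str)
    return data


def fake_fetch():
    """Stand-in for an API call; returns made-up data."""
    print("Making API call...")
    return {"results": [{"id": 1}]}


with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = os.path.join(tmp_dir, "example.json")
    first = load_or_fetch(cache_path, fake_fetch)   # calls fake_fetch and writes the file
    second = load_or_fetch(cache_path, fake_fetch)  # reads the file, no call made
```

One caveat of this design: on the first call the function returns the fetched object as-is, while later calls return what was read back from disk, so any values stringified by `default=str` will differ between the two. Always reloading from the file, as the notebook does in the next cell, avoids that inconsistency.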


### Read data into a Pandas DataFrame, and extract needed features

In this cell we load the JSON response from disk and flatten it into a format that can be used to make a Pandas DataFrame. As iNaturalist observations are recorded in UTC, we normalize date values to the local timezone, then extract the observation coordinates for use on the map. Nothing in this cell needs to be customized; it should run as-is for any set of observations.

In [6]:
# Read the local JSON data
with open(DATASET_FILENAME) as f:
    d = json.load(f)

# Flatten nested JSON value, and load data into a Pandas DataFrame
df = pd.json_normalize(d["results"])

# Normalize timezones
local_tz = tz.tzlocal()

def to_local_tz_from_str(s):
    try:
        dt = parser.isoparse(s)
        return dt.astimezone(local_tz)
    except (TypeError, ValueError):
        return None

df['observed_on_local'] = df['observed_on'].apply(to_local_tz_from_str)

# Extract lat/long
df[['lon', 'lat']] = pd.DataFrame(df['geojson.coordinates'].to_list(), index=df.index)
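Two details in the cell above are easy to get wrong: GeoJSON stores coordinates as `[longitude, latitude]`, while Folium expects `[latitude, longitude]`, and timezone conversion changes the wall-clock time but not the instant. The sketch below illustrates both with the standard library only. The helper `split_lon_lat`, the coordinates, the timestamp, and the fixed UTC-7 offset are all made up for illustration; the notebook itself uses `dateutil` and the machine's local timezone.

```python
from datetime import datetime, timedelta, timezone


def split_lon_lat(coordinates):
    """Turn a GeoJSON [lon, lat] pair into (lat, lon), or (None, None) if missing."""
    if not coordinates or len(coordinates) < 2:
        return (None, None)
    lon, lat = coordinates[0], coordinates[1]
    return (lat, lon)


# Made-up point: longitude first, as GeoJSON stores it
lat, lon = split_lon_lat([-119.44, 49.03])
print(lat, lon)

# Convert a made-up UTC timestamp to a fixed UTC-7 offset
utc_time = datetime(2024, 6, 1, 18, 30, tzinfo=timezone.utc)
local_time = utc_time.astimezone(timezone(timedelta(hours=-7)))
print(local_time.isoformat())

# Both objects describe the same instant
print(utc_time == local_time)
```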

### Create the interactive html map with data plotted

And now the code for the actual map. Folium requires a set of coordinates to center the map tiles on, so we calculate this using the maxima and minima of the lat/long data itself. In this example, we create two base map layers. The first uses the CartoDB Positron map tiles, and the second uses OpenTopoMap tiles. I am a big fan of the OpenTopoMap tiles for naturalist data, as they do a great job of displaying local trails through parks and other natural areas, and, of course, the topographic lines have their own intrinsic value.

After setting up the base layers, we loop through the observations and create a marker for each one, including the HTML for the pop-ups. We also add a heat map overlay, which can be toggled on and off. This can be useful for visualizing observation hot spots. We then fit the map to a bounding box, again using the bounds of the observation data, to give the map a reasonable initial zoom level. Finally, we include a widget to select between the base layers, then save the map to a local HTML file.
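The centering and bounding-box arithmetic described above can be sketched in plain Python. The helpers `bounding_box` and `center_of` and the sample points are hypothetical. Note that a simple min/max midpoint like this assumes the data does not straddle the antimeridian (longitude ±180°), which holds for a small area such as a single town.

```python
def bounding_box(points):
    """Return ((south, west), (north, east)) for a list of (lat, lon) pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (min(lats), min(lons)), (max(lats), max(lons))


def center_of(box):
    """Return the (lat, lon) midpoint of a ((south, west), (north, east)) box."""
    (south, west), (north, east) = box
    return ((south + north) / 2, (west + east) / 2)


# Made-up observation coordinates as (lat, lon)
points = [(49.0, -119.5), (49.5, -119.0), (49.25, -119.25)]

box = bounding_box(points)
center = center_of(box)
print(box)
print(center)
```

The two corner pairs in `box` are in the same `[[south, west], [north, east]]` order that the map cell passes to `fit_bounds()`.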

In [7]:
# -----------------------------------------------
# Required: DataFrame with 'lat', 'lon', and iNat data already parsed
# Example assumes these fields are available:
# - 'taxon.name' (scientific)
# - 'taxon.preferred_common_name' (common name)
# - 'observed_on_local' (datetime object)
# - 'user.login' (observer username)
# - 'uri' (link to the iNaturalist observation)
# -----------------------------------------------

# Create the base map centered on your area of interest
# Get the bounding box of your data
south = df['lat'].min()
north = df['lat'].max()
west = df['lon'].min()
east = df['lon'].max()

# Calculate rough center of data
center_lat = (south + north) / 2
center_lon = (west + east) / 2

m = folium.Map(
    location=[center_lat, center_lon],
    zoom_start=13,
    tiles=None  # We'll add a custom tile layer next
)

# Add CartoDB basemap
folium.TileLayer(
    tiles='CartoDB Positron',
    name='Carto DB'
).add_to(m)

# Add OpenTopoMap basemap
folium.TileLayer(
    tiles='https://{s}.tile.opentopomap.org/{z}/{x}/{y}.png',
    attr='Map data © OpenStreetMap, SRTM | Map style © OpenTopoMap (CC-BY-SA)',
    name='OpenTopoMap'
).add_to(m)

# Format the datetime as a nice string for popups
def format_datetime(dt):
    try:
        return dt.strftime("%a, %B %d, %Y, %I:%M %p")
    except Exception:
        return "Unknown date/time"

# Loop through observations and add a marker for each one
for _, row in df.iterrows():
    lat = row['lat']
    lon = row['lon']

    scientific = row.get('taxon.name', 'Unknown')
    common = row.get('taxon.preferred_common_name', 'Unknown')
    observer = row.get('user.login', 'Unknown')
    observed = format_datetime(row.get('observed_on_local'))
    link = row.get('uri', '#')
    # This is the iNaturalist default taxon photo
    img_url = row.get('taxon.default_photo.square_url', '')
    # To use the actual observation photo, comment out the above line,
    # and uncomment the line below. 
    #img_url = row['observation_photos'][0]['photo']['url']

    # Construct HTML popup content
    popup_html = f"""
    <img src='{img_url}' width='100'><br>
    <strong>{common}</strong> (<em>{scientific}</em>)<br>
    Observed on: {observed}<br>
    By: <a href="{link}" target="_blank">{observer}</a>
    """

    # Add marker
    folium.CircleMarker(
        location=[lat, lon],
        radius=2,
        color='black',
        fill=True,
        fill_opacity=0.8,
        popup=folium.Popup(popup_html, max_width=300)
    ).add_to(m)

# Auto-fit the map bounds to the data
# This will ensure the zoom level of the initial map is reasonable
m.fit_bounds([[south, west], [north, east]])

# Add a toggleable heat map overlay
heat_data = df[['lat', 'lon']].values.tolist()
HeatMap(heat_data, radius=20, name="heatmap").add_to(m)

# Add layer control (useful for multiple basemaps or overlays)
folium.LayerControl().add_to(m)

# Save the map to HTML
m.save(DATASET_MAPNAME)
print(f"Map saved to {DATASET_MAPNAME}")


Map saved to osoyoos_bees.html
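The popup template above interpolates observation fields directly into HTML. Because names and usernames come from an external source, and because missing values in a DataFrame row typically come back as `NaN` rather than the `row.get()` default, it may be worth escaping and defaulting each field first. The following is a sketch using only the standard library; `text_or` and `build_popup_html` are hypothetical helpers, the sample values (including the URL) are made up, and only `None` and float `NaN` are treated as missing.

```python
from html import escape


def text_or(value, default="Unknown"):
    """Return `value` as a string, or `default` when it is None or NaN."""
    # NaN is the only value that is not equal to itself
    if value is None or value != value:
        return default
    return str(value)


def build_popup_html(common, scientific, observed, observer, link, img_url):
    """Build popup HTML with every interpolated field escaped."""
    common, scientific, observed, observer = (
        escape(text_or(v)) for v in (common, scientific, observed, observer)
    )
    link = escape(text_or(link, "#"), quote=True)
    img_url = escape(text_or(img_url, ""), quote=True)
    return (
        f'<img src="{img_url}" width="100"><br>'
        f"<strong>{common}</strong> (<em>{scientific}</em>)<br>"
        f"Observed on: {observed}<br>"
        f'By: <a href="{link}" target="_blank">{observer}</a>'
    )


popup = build_popup_html(
    common="Bumble <b>bee</b>",      # markup in the data is escaped, not rendered
    scientific="Bombus",
    observed="Sat, June 01, 2024",
    observer="some_user",
    link="https://www.inaturalist.org/observations/0",  # made-up URL
    img_url=float("nan"),            # missing value falls back to an empty string
)
print(popup)
```

The returned string could be passed to `folium.Popup(...)` in place of the inline f-string used in the loop.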


While the map has been saved to an HTML file, which can be opened and viewed in any web browser, it is also possible to render the map directly inside the notebook by making the map object the last expression in a cell:

In [8]:
m

There is a great deal you can do to customize the look and feel of maps created with Folium, from the underlying base layer map tiles, to the size and shape of the map markers, to the information shown in the pop-ups. Consult the Folium documentation for more information on customizations. The map created here is just the beginning of what you can do.