# Polygons

## Day 3 - 30 Day Map Challenge

For this map I wanted to integrate the [Felt API](https://feltmaps.notion.site/Getting-Started-With-The-Felt-API-69c8b02b7d8e436daa657a04a2dbaffa#32523405d0e64aafbc7cfde5fe13803c) with [Wherobots Cloud](https://wherobots.com) so I could do some geospatial analysis using Spatial SQL and SedonaDB, then publish the results of my analysis to Felt's beautiful web-based mapping tooling.

I decided to use data from [Bird Buddy](https://mybirdbuddy.com/), which publishes data about bird sightings at its smart bird feeders, to find the range of some of my favorite bird species.

![](https://raw.githubusercontent.com/johnymontana/30-day-map-challenge/a865cfc0524e97ef10e68debfe009088054133fb/img/03-polygons-final.png)

You can follow along by creating a free account on [Wherobots Cloud](https://www.wherobots.services).

In [None]:
from sedona.spark import *
import geopandas
import requests

In [None]:
config = SedonaContext.builder().appName('bird-buddy-etl').config("spark.hadoop.fs.s3a.bucket.wherobots-geodata.aws.credentials.provider","org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider").getOrCreate()
sedona = SedonaContext.create(config)

## Bird Buddy & Wherobots Cloud File Management

Our data comes from [Bird Buddy](https://live.mybirdbuddy.com/), which makes a smart bird feeder that can identify bird species and (optionally) report their location.

![](https://raw.githubusercontent.com/johnymontana/30-day-map-challenge/a865cfc0524e97ef10e68debfe009088054133fb/img/bird_buddy1.png)

Bird Buddy publishes its data as CSV files, so we'll download the latest file and upload it to our Wherobots Cloud instance via the "Files" tab. The free tier of Wherobots Cloud includes free data storage in AWS S3, which we can access from within the Wherobots notebook environment.

![](https://raw.githubusercontent.com/johnymontana/30-day-map-challenge/a865cfc0524e97ef10e68debfe009088054133fb/img/files.png)

Once you've uploaded a file, you can click the copy file icon to copy the file's S3 path, which is how we'll access the file from the Wherobots notebook environment. Note that these files are private to your Wherobots organization, so the S3 URL below won't be accessible to anyone outside my organization.


In [None]:
S3_URL = "s3://wherobots-geodata/all_metadata_september.csv"

Now we'll load the Bird Buddy CSV data into a SedonaDB DataFrame so we can use Spatial SQL to find the range of each species.

In [None]:
bb_df = sedona.read.format('csv').option('header','true').option('delimiter', ',').load(S3_URL)
bb_df.show(5)

```
+-------------------+--------------------+----------+-----------------+----------------+
|anonymized_latitude|anonymized_longitude| timestamp|      common_name| scientific_name|
+-------------------+--------------------+----------+-----------------+----------------+
|          45.441235|          -122.51253|2023-09...|  dark eyed junco|      junco h...|
|           41.75291|            -83.6242|2023-09...|northern cardinal|cardinalis ca...|
|            43.8762|            -78.9261|2023-09...|northern cardinal|cardinalis ca...|
|            33.7657|            -84.2951|2023-09...|northern cardinal|cardinalis ca...|
|            30.4805|            -84.2243|2023-09...|northern cardinal|cardinalis ca...|
+-------------------+--------------------+----------+-----------------+----------------+
only showing top 5 rows
```

## Spatial SQL With SedonaDB

Now we're ready to use the power of Spatial SQL to analyze our Bird Buddy data. We want to find the range of each species, but first let's explore the data.

We'll start by converting our latitude and longitude fields into Point geometries. Note that `ST_Point` takes the x coordinate (longitude) first, followed by the y coordinate (latitude).

In [None]:
bb_df = bb_df.selectExpr('ST_Point(CAST(anonymized_longitude AS Decimal(24,20)), CAST(anonymized_latitude AS Decimal(24,20))) AS location', 'timestamp', 'common_name', 'scientific_name')
bb_df.createOrReplaceTempView('bb')
bb_df.show(5)

```
+--------------------+--------------------+-----------------+--------------------+
|            location|           timestamp|      common_name|     scientific_name|
+--------------------+--------------------+-----------------+--------------------+
|POINT (-122.51253...|2023-09-01 00:00:...|  dark eyed junco|      junco hyemalis|
|POINT (-83.6242 4...|2023-09-01 00:00:...|northern cardinal|cardinalis cardin...|
|POINT (-78.9261 4...|2023-09-01 00:00:...|northern cardinal|cardinalis cardin...|
|POINT (-84.2951 3...|2023-09-01 00:00:...|northern cardinal|cardinalis cardin...|
|POINT (-84.2243 3...|2023-09-01 00:00:...|northern cardinal|cardinalis cardin...|
+--------------------+--------------------+-----------------+--------------------+
only showing top 5 rows
```

In [None]:
bb_df.count()

```
13972003
```

To find all observations of juncos in the data, we can write a SQL query to filter the results and then visualize the observations on a map using `SedonaKepler`, the SedonaDB integration for Kepler.gl.

In [None]:
junco_df = sedona.sql("SELECT * FROM bb WHERE common_name LIKE '%junco' ")
junco_df.show(5)

```
+--------------------+--------------------+---------------+---------------+
|            location|           timestamp|    common_name|scientific_name|
+--------------------+--------------------+---------------+---------------+
|POINT (-122.51253...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
|POINT (-94.5916 3...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
|POINT (-85.643 31...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
|POINT (-87.7645 3...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
|POINT (-122.16346...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
+--------------------+--------------------+---------------+---------------+
only showing top 5 rows
```

In [None]:
SedonaKepler.create_map(df=junco_df, name='Juncos')

![](https://raw.githubusercontent.com/johnymontana/30-day-map-challenge/a865cfc0524e97ef10e68debfe009088054133fb/img/juncos_map.png)

Based on the map above it looks like Juncos have quite a large range throughout North America. Next, we'll filter the dataset to a few of my favorite bird species, then use the power of Spatial SQL with a `GROUP BY` operation to create convex hulls (polygon geometries) from the individual observations (point geometries) of each species.

In [None]:
range_df = sedona.sql("""
    SELECT common_name, COUNT(*) AS num, ST_ConvexHull(ST_Union_aggr(location)) AS geometry 
    FROM bb 
    WHERE common_name IN ('california towhee', 'steller’s jay', 'mountain chickadee', 'eastern bluebird') 
    GROUP BY common_name 
    ORDER BY num DESC
""")
range_df.show()

```
+------------------+-----+--------------------+
|       common_name|  num|            geometry|
+------------------+-----+--------------------+
|  eastern bluebird|65971|POLYGON ((-80.345...|
|     steller’s jay|37864|POLYGON ((-110.26...|
| california towhee|22007|POLYGON ((-117.05...|
|mountain chickadee| 4102|POLYGON ((-110.99...|
+------------------+-----+--------------------+
```
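
To make the convex hull step more concrete, here is a minimal pure-Python sketch that computes a convex hull over a handful of made-up (longitude, latitude) points using Andrew's monotone chain algorithm. It only illustrates what a convex hull is; it is not how SedonaDB implements `ST_ConvexHull`, and the `convex_hull` helper and the sample points are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (ring not closed)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower hull, left to right
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper hull, right to left
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each half is the first point of the other half
    return lower[:-1] + upper[:-1]


# Made-up observation points as (longitude, latitude) tuples
observations = [
    (-122.5, 45.4),
    (-83.6, 41.8),
    (-78.9, 43.9),
    (-84.3, 33.8),
    (-90.0, 40.0),
]
hull = convex_hull(observations)
print(hull)
```

Points that fall inside the outer boundary are dropped from the result, which is why a species with thousands of observations still produces a single compact polygon.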

The Felt API supports file uploads in a variety of formats, but we'll use GeoJSON. We'll convert our SedonaDB DataFrame into a GeoPandas GeoDataFrame and then export to a GeoJSON file so we can upload it to the Felt API.

In [None]:
range_gdf = geopandas.GeoDataFrame(range_df.toPandas(), geometry="geometry")
range_gdf.to_file('birdbuddy_range.geojson', driver='GeoJSON')

Our GeoJSON file looks a bit like this (we've omitted some lines):

```
{
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {
                "common_name": "eastern bluebird",
                "num": 65971
            },
            "geometry": {
                "type": "Polygon",
                "coordinates": [
                    [
                        [
                            -80.3452,
                            25.6062
                        ],
                        [
                            -98.2271,
                            26.2516
                        ],
                        ...
                        [
                            -80.3452,
                            25.6062
                        ]
                    ]
                ]
            }
        },
        ...
    ]
}
```
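
Before uploading, it can be worth running a quick sanity check on the exported file. The sketch below uses only the standard library to confirm that every polygon ring is closed, meaning it has at least four positions and its first and last positions match, as the GeoJSON specification requires. The `rings_are_closed` helper and the inline sample with made-up coordinates are hypothetical; in the notebook you would load `birdbuddy_range.geojson` with `json.load` instead.

```python
import json


def rings_are_closed(feature_collection):
    """Return True if every Polygon ring has >= 4 positions and is closed."""
    for feature in feature_collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue  # this sketch only inspects Polygon geometries
        for ring in geometry["coordinates"]:
            if len(ring) < 4 or ring[0] != ring[-1]:
                return False
    return True


# Inline sample with made-up coordinates, standing in for the exported file
sample = json.loads("""
{
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"common_name": "example bird", "num": 3},
            "geometry": {
                "type": "Polygon",
                "coordinates": [
                    [[-80.0, 25.0], [-98.0, 26.0], [-90.0, 45.0], [-80.0, 25.0]]
                ]
            }
        }
    ]
}
""")

print(rings_are_closed(sample))
```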

## Felt Maps API

If you haven't already, create a free [Felt](https://felt.com/) account, then generate a new access token in your account settings so you'll be able to create maps and upload data via the Felt API.

![](https://raw.githubusercontent.com/johnymontana/30-day-map-challenge/a865cfc0524e97ef10e68debfe009088054133fb/img/felt_api.png)

In [None]:
FELT_TOKEN = '<YOUR_TOKEN_HERE>'
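
If you plan to share the notebook, one option is to keep the token out of the file and read it from an environment variable instead. This is a minimal sketch: the `get_felt_token` helper and the `FELT_TOKEN` variable name are assumptions for illustration, not something Felt or Wherobots requires.

```python
import os


def get_felt_token(env_var="FELT_TOKEN"):
    """Read the Felt access token from an environment variable (name is an assumption)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Felt access token"
        )
    return token
```

With the variable set in your environment, `FELT_TOKEN = get_felt_token()` would take the place of the hard-coded assignment.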

The function below will create a new map in Felt, then create a new layer and upload our GeoJSON file to this layer. See the [Felt API docs](https://feltmaps.notion.site/Felt-Public-API-reference-c01e0e6b0d954a678c608131b894e8e1) for more examples of what's possible with the Felt API.

In [None]:
def create_felt_map(access_token, filename, map_title, layer_name):
    
    # First create a new map using the /maps endpoint
    create_map_response = requests.post(
        "https://felt.com/api/v1/maps",
        headers={
            "authorization": f"Bearer {access_token}",
            "content-type": "application/json",
        },
        json={"title": map_title},
    )
    create_map_data = create_map_response.json()
    map_id = create_map_data['data']['id']
    map_url = create_map_data['data']['attributes']['url']
    print(create_map_data)
    
    # Next, we'll create a new layer and get a presigned upload url so we can upload our GeoJSON file
    layer_response = requests.post(
        f"https://felt.com/api/v1/maps/{map_id}/layers",
        headers={
            "authorization": f"Bearer {access_token}",
            "content-type": "application/json",
        },
        json={"file_names": [filename], "name": layer_name},
    )
    
    # This endpoint will return a pre-signed URL that we use to upload the file to Felt
    presigned_upload = layer_response.json()
    url = presigned_upload["data"]["attributes"]["url"]
    presigned_attributes = presigned_upload["data"]["attributes"]["presigned_attributes"]
    
    # A 204 response indicates that the upload was successful
    with open(filename, "rb") as file_obj:
        output = requests.post(
            url,
            # Order is important, file should come at the end
            files={**presigned_attributes, "file": file_obj},
        )
    layer_id = presigned_upload['data']['attributes']['layer_id']
    print(output)
    print(layer_id)
    print(presigned_upload)
    
    # Finally, we call the /maps/:map_id/layers/:layer_id/finish_upload endpoint to complete the process
    finish_upload = requests.post(
        f"https://felt.com/api/v1/maps/{map_id}/layers/{layer_id}/finish_upload",
        headers={
            "authorization": f"Bearer {access_token}",
            "content-type": "application/json",
        },
        json={"filename": filename, "name": layer_name},
    )
    print(finish_upload.json())

    # Return the map URL so the caller can open or embed the new map
    return map_url

In [None]:
create_felt_map(FELT_TOKEN, "birdbuddy_range.geojson", "North American Bird Ranges", "My Favorite Birds")
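
The function above prints each response but never checks whether a step actually succeeded. One way to fail fast is a small status check applied after each request. This is a sketch only: the `ensure_status` helper is hypothetical, it relies on nothing beyond the `status_code` and `text` attributes that `requests` responses expose, and it is demonstrated here on stand-in objects rather than real Felt responses. The only success code mentioned above is 204 for the file upload, so the expected codes are left to the caller.

```python
from types import SimpleNamespace


def ensure_status(response, expected, step):
    """Raise if the response status code is not one of the expected codes."""
    if response.status_code not in expected:
        raise RuntimeError(
            f"{step} failed with HTTP {response.status_code}: {response.text[:200]}"
        )
    return response


# Stand-in objects that mimic the two attributes used above
fake_upload = SimpleNamespace(status_code=204, text="")
ensure_status(fake_upload, {204}, "file upload")

fake_failure = SimpleNamespace(status_code=403, text="forbidden")
try:
    ensure_status(fake_failure, {204}, "file upload")
except RuntimeError as err:
    print(err)
```

Inside `create_felt_map`, the same call could wrap `output` right after the upload, for example `ensure_status(output, {204}, "file upload")`.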

We've now created a new map in Felt and uploaded our GeoJSON data as a new layer. We can also embed the map in our Jupyter notebook:

In [None]:
from IPython.display import HTML
HTML('<iframe width="1600" height="600" frameborder="0" title="My Favorite Bird Ranges" src="https://felt.com/embed/map/North-American-Bird-Ranges-a4c5cOCaRMiL64KK5N27TA"></iframe>')

![](https://raw.githubusercontent.com/johnymontana/30-day-map-challenge/a865cfc0524e97ef10e68debfe009088054133fb/img/felt_embed.png)