## Introduction to the dataset

The spotted lanternfly (*Lycorma delicatula*), or SLF, is an invasive planthopper that was first detected in Pennsylvania in 2014. Since then, it has spread rapidly through North America. Besides being an overwhelmingly common roadside pest, it is a serious threat to agriculture and forestry.

This project draws on **33307** observations of the spotted lanternfly in North America, recorded from 2015 to 2025. I retrieved the dataset from **GBIF.org**, which in turn pulls its data from **iNaturalist's Research-grade Observations**. Citation: GBIF.org (19 March 2025) GBIF Occurrence Download https://doi.org/10.15468/dl.cmm22t.

My dataset contains all North American SLF observations ever recorded on iNaturalist, including concerning observations as far away as California, Nova Scotia, and Mexico City. While my data covers the entire continent, I have chosen to focus my visualizations on the Eastern US, which is the epicenter of the SLF's spread. Feel free to zoom out on the first visualization, though: scattered observations appear across the whole map.

In [None]:
! pip install plotly
! pip install pandas

In [9]:
# imports
import pandas as pd
import plotly.express as px

# load dataframe
df = pd.read_csv("SLF_NA.tsv", sep='\t')
df = df[df['year'] <= 2024]     # filter out 2025 data bc it's incomplete
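
The load-and-filter step above can be checked on a tiny in-memory table before running it on the full download. This is only a sketch: the three rows below are made up, and only the column names (`year`, `stateProvince`, `decimalLatitude`, `decimalLongitude`) are taken from the cells in this notebook.

```python
import io
import pandas as pd

# hypothetical three-row stand-in for SLF_NA.tsv (all values are made up)
sample_tsv = (
    "year\tstateProvince\tdecimalLatitude\tdecimalLongitude\n"
    "2023\tPennsylvania\t40.0\t-75.2\n"
    "2024\tNew Jersey\t40.7\t-74.2\n"
    "2025\tNew York\t40.8\t-73.9\n"
)

# read tab-separated text the same way the real file is read
sample = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

# same filter as above: keep observations up to and including 2024
sample = sample[sample["year"] <= 2024]

print(len(sample))           # number of rows kept
print(sample["year"].max())  # latest year remaining
```

If the filter behaves as intended, the 2025 row is gone and the latest remaining year is 2024.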



## Observations Map

This first visualization shows how the annual range of the SLF has changed in the last 10 years. You can zoom around the map, press the play button to see change over time, or hover over points to get specific location data for each observation.

In [12]:
# bubble scatter map similar to gapminder's (you can zoom out!)

# aggregate data by year and location
slf_annual_range = df.groupby(["year", "decimalLatitude", "decimalLongitude"]).size().reset_index(name="count")

# color palette for bubbles
pink_scale = [(0, "#F5A4B0"), (0.2, "#F27491"), (0.5, "#D14D76"), (0.8, "#A1265D"), (1, "#7A003D")]

# map
fig = px.scatter_mapbox(
    slf_annual_range,
    lat="decimalLatitude",
    lon="decimalLongitude",
    size="count",
    color="count",
    animation_frame="year",
    hover_data={"count": True, "decimalLatitude": True, "decimalLongitude": True, "year": False},
    title="Annual Range of Spotted Lanternflies",
    mapbox_style="carto-positron",
    size_max=25,
    color_continuous_scale=pink_scale,
    zoom=6,
    center={"lat": 40, "lon": -78},
    height=800,
    width=1100,
)

# hover tooltip formatting (bubble size stays driven by "count")
fig.update_traces(hoverlabel=dict(font=dict(size=16)))

# make bubbles translucent
fig.update_traces(marker=dict(opacity=0.5))

# remove the legend bc it's unhelpful (scale changes year to year)
fig.update_layout(
    coloraxis_showscale=False,
    showlegend=False
)

# title formatting and margins
fig.update_layout(
    title=dict(
        font=dict(size=28, weight="bold")  # increase font size for title
    ),
    margin=dict(l=20, r=20, t=100, b=20)  # Add more top margin
)

# keep visualization box size consistent across frames
fig.update_layout(
    autosize=False,  # prevent resizing
)

fig.show()
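
To make the aggregation behind the map concrete, here is a small sketch of the same `groupby(...).size()` pattern on made-up coordinates. The helper name `count_by_year_and_location` is hypothetical. The rounding step is an optional variation, not what the cell above does: it assumes you might want to pool nearby observations that differ only in the last decimals.

```python
import pandas as pd

# made-up observations; two of the 2020 rows share exact coordinates
toy = pd.DataFrame({
    "year": [2020, 2020, 2020, 2021],
    "decimalLatitude": [40.01, 40.02, 40.01, 40.01],
    "decimalLongitude": [-75.00, -75.01, -75.00, -75.00],
})

def count_by_year_and_location(frame):
    """Count rows per (year, latitude, longitude), as in the map cell."""
    return (
        frame.groupby(["year", "decimalLatitude", "decimalLongitude"])
        .size()
        .reset_index(name="count")
    )

# exact coordinates: only identical points are pooled
exact = count_by_year_and_location(toy)

# optional variation (an assumption, not the notebook's method):
# round to 1 decimal place so nearby points fall into the same bubble
rounded = count_by_year_and_location(
    toy.assign(
        decimalLatitude=toy["decimalLatitude"].round(1),
        decimalLongitude=toy["decimalLongitude"].round(1),
    )
)

print(exact)
print(rounded)
```

With exact coordinates, each bubble counts only observations at the very same point. Rounding trades positional precision for larger, easier-to-compare bubbles.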



## Filled Area Plot

My second visualization organizes the number of annual SLF observations by state. It gives a rough sense of how each state's infestation, and its pest control measures, have fared over time, though observation counts also depend on how many people are reporting. You can click on the legend to toggle each state's visibility.

In [70]:
# filled area plot: you can click on states in the legend to toggle visibility!

# rename regions
df['stateProvince'] = df['stateProvince'].replace({'Distrito Federal': 'Mexico City', 'District of Columbia': 'Washington D.C.'})

# group observations by year and region
slf_state_year = df.groupby(["year", "stateProvince"]).size().reset_index(name="count")

# drop year/region rows with 5 or fewer observations
slf_state_year = slf_state_year[slf_state_year["count"] > 5]

# filled area plot code (template from plotly gallery: https://plotly.com/python/filled-area-plots/)
fig = px.area(
    slf_state_year, 
    x="year", 
    y="count", 
    color="stateProvince", 
    title="Spotted Lanternfly Observations by Year and Region",
    labels={"count": "Num. of observations"}  # edits some labels
)

# text and bg color formatting
fig.update_layout(
    plot_bgcolor='white',
    
    title=dict(
        font=dict(size=28, weight="bold", color="black"),
        xanchor="left"
    ),
    
    legend_title=None, # remove legend title (self-evident that these are regions / states)
    legend_tracegroupgap=9,  # legend line spacing
    
    xaxis_title=None, # remove x-axis title (self-evident that these are years)
    xaxis=dict(
        range=[2018, 2024],  # clip default time range to 2018-2024 (rapid rise doesn't start till 2018, incomplete data for 2025)
    ),
    yaxis_title=dict(
        font=dict(size=15)
    ),
    margin=dict(l=50, r=50, t=130, b=50),  # adjust margins
    height=600,
    hoverlabel=dict(
        font=dict(size=16)  # size of hover labels
    ),
)

# axes and color formatting
fig.update_xaxes(
    ticks='outside',
    linecolor='black',
    gridcolor='lightgrey'
)
fig.update_yaxes(
    ticks='outside',
    showline=True,
    linecolor='black',
    gridcolor='lightgrey'
)

# Show plot
fig.show()
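
One detail of the filtering step is worth illustrating: `count > 5` is applied to each year/region row, so a region can lose its quiet early years while keeping its busy later ones. The sketch below uses made-up counts to compare that row-level filter with a region-level alternative based on each region's total. The alternative is an assumption about what might be wanted, not the method used in the cell above.

```python
import pandas as pd

# made-up yearly counts for two regions
toy_counts = pd.DataFrame({
    "year": [2022, 2023, 2022, 2023],
    "stateProvince": ["Pennsylvania", "Pennsylvania", "Ohio", "Ohio"],
    "count": [4, 50, 2, 3],
})

# row-level filter (as in the cell above): Pennsylvania's 2022 row is dropped too
row_filtered = toy_counts[toy_counts["count"] > 5]

# region-level alternative: keep every year for regions whose total exceeds 5
region_totals = toy_counts.groupby("stateProvince")["count"].transform("sum")
region_filtered = toy_counts[region_totals > 5]

print(row_filtered)
print(region_filtered)
```

The region-level version keeps each retained region's full time series, which avoids gaps in a stacked area chart. The row-level version is simpler and hides low-count years.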

