# Tutorial: BirdCLEF Visualizations with Plotly

Visualize some of the geospatial data for this competition (latitude/longitude of the recordings) using Plotly. Good for beginners - enjoy!

...More to come 🚧👷‍♂️

## 1 Plotting with Plotly

Plotly's open-source graphing libraries are great for making interactive charts and maps in Python, R, Julia, JavaScript, ggplot2, F#, MATLAB®, and Dash. They are quickly becoming a standard for rapid development and for sharing results with colleagues. The next step with these visualizations is usually to put them in a dashboard or webpage for others to see and interact with; for that you can use Dash. Find out more at [Plotly.com](https://plotly.com).

In [None]:
pip install geocoder

In [None]:
import pandas as pd
import numpy as np
import random
import geocoder
import time
import plotly.express as px

In [None]:
path = '/kaggle/input/birdclef-2022/'
train_metadata = pd.read_csv(path + 'train_metadata.csv')

In [None]:
train_metadata.head()

* A tad unnecessary but good fun: let's also fuzz the latitude and longitude values. Some recordings share exactly the same coordinates, so their markers would sit on top of each other on the map. Adding a small random offset to each latitude and longitude avoids that overlap. On a larger dataset this would help even more.

In [None]:
#because some recordings share identical latitude/longitude points, let's jitter them a bit to avoid overlap on the map
def fuzzy_lat_long(num):
    """Add a small random offset between 0 and 0.00009 to a latitude or longitude. Assumes random is imported.

    Args:    num (float): the latitude or longitude value to 4 decimal places

    Returns:    new_num (float): the value plus the random offset (the offset is rounded to 8 decimal places)"""

    random_value = round(random.uniform(0, 0.00009), 8)
    new_num = num + random_value

    return new_num

In [None]:
#apply the jitter to each column directly; no row-wise lambda is needed
train_metadata['newlat'] = train_metadata['latitude'].apply(fuzzy_lat_long)
train_metadata['newlong'] = train_metadata['longitude'].apply(fuzzy_lat_long)

In [None]:
train_metadata.head()
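If the row-by-row approach ever feels slow, the same jitter can be added to a whole column at once with NumPy. This is only a sketch: `add_jitter` is a hypothetical helper (not part of the original notebook), and the toy coordinates are made up so the cell runs on its own.

```python
import numpy as np
import pandas as pd

def add_jitter(values, max_offset=0.00009, seed=None):
    """Add a uniform random offset in [0, max_offset) to every value at once."""
    rng = np.random.default_rng(seed)
    return values + rng.uniform(0, max_offset, size=len(values))

# Toy coordinates (made up for illustration), including one exact duplicate
toy = pd.DataFrame({
    "latitude": [12.3910, 12.3910, -33.9000],
    "longitude": [-1.4930, -1.4930, 18.4000],
})

# Different seeds so latitude and longitude get different offsets
toy["newlat"] = add_jitter(toy["latitude"], seed=0)
toy["newlong"] = add_jitter(toy["longitude"], seed=1)
print(toy)
```

Passing a `seed` makes the jitter reproducible between runs; leave it as `None` to get fresh offsets each time.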

### Plotly - Interactive Map
* Play around with the different parameters of px.scatter_geo (remember we used: import plotly.express as px)
* Try different interactive map actions:
 * Drag, zoom, hover, click, download as png, box select, etc. 

In [None]:
#show all birds
fig = px.scatter_geo(train_metadata, #the dataframe containing the data to plot on the map
                    lat = train_metadata['newlat'], #jittered latitude (train_metadata['latitude'] holds the original values)
                    lon = train_metadata['newlong'], #jittered longitude (train_metadata['longitude'] holds the original values)
                    color = "primary_label", #dataframe column used to set the color of the markers
                    hover_name = "common_name", #dataframe column shown as the title of the hover box
                    projection = "natural earth")
fig.show()

* **What can we learn from this?**
 * For one thing, notice how different bird species are found in different geographic locations (obvious, right?).
 * Because some bird species are more common than others in a given region, you want to make sure your ML models aren't predicting a bird that is highly unlikely to be in the area where the sound was recorded. For example, if the recording is from South Africa, it's unlikely to be a Eurasian Skylark!
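One simple way to turn that idea into something checkable is a crude bounding box per species. The sketch below is an assumption-heavy illustration, not a method from this notebook: `in_extent` and the 5-degree `margin` are hypothetical, and the toy rows are made up (only the column names match `train_metadata`).

```python
import pandas as pd

# Toy rows (made up for illustration); column names mirror train_metadata
toy = pd.DataFrame({
    "primary_label": ["skylar", "skylar", "houfin", "houfin"],
    "latitude": [51.5, 48.9, 34.0, 40.7],
    "longitude": [-0.1, 2.3, -118.2, -74.0],
})

# Min/max latitude and longitude per species: a crude bounding box
extent = toy.groupby("primary_label").agg(
    lat_min=("latitude", "min"),
    lat_max=("latitude", "max"),
    lon_min=("longitude", "min"),
    lon_max=("longitude", "max"),
    n_recordings=("latitude", "size"),
)

def in_extent(label, lat, lon, margin=5.0):
    """True if (lat, lon) lies inside the species' box, padded by `margin` degrees."""
    box = extent.loc[label]
    return bool(
        box.lat_min - margin <= lat <= box.lat_max + margin
        and box.lon_min - margin <= lon <= box.lon_max + margin
    )

print(extent)
print(in_extent("skylar", 50.0, 1.0))     # a point near the toy skylar recordings
print(in_extent("skylar", -33.9, 18.4))   # a point far away in the southern hemisphere
```

A box like this ignores real range maps and wraps badly around the date line, so treat it as a sanity check on predictions rather than a hard filter.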

In [None]:
#Let's also visualize only the scored_birds in the map to see what it looks like
scored_birds = ['akiapo', 'aniani', 'apapan', 'barpet', 'crehon', 'elepai', 'ercfra', 'hawama', 'hawcre', 'hawgoo', 'hawhaw', 'hawpet1', 'houfin', 'iiwi', 'jabwar', 'maupar', 'omao', 'puaioh', 'skylar', 'warwhe1', 'yefcan']
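#.copy() gives an independent dataframe, so columns can be added to train later without a SettingWithCopyWarning
train = train_metadata[train_metadata['primary_label'].isin(scored_birds)].copy()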

* Let's practice extracting the country from a pair of latitude and longitude coordinates! This can help with making visualizations and analyzing bird population distributions by country!

In [None]:
def geo_reverse(lat,long):
    """Reverse geolocate country from a latitude and longitude. Assumes geocoder and time are imported.

    Args:   lat (float): the latitude value to 4 decimal places
            long (float): the longitude value to 4 decimal places

    Returns:    country (str): the country based on latitude and longitude, or 'no country' if none is found"""

    time.sleep(0.04) #sleep for 40ms between requests to avoid errors
    g = geocoder.osm([lat, long], method='reverse').json
    if g:
        #fall back to 'no country' when the response has no country field
        return g.get('country', 'no country')
    return 'no country'

In [None]:
#check that it works
geo_reverse(12.3910,-1.4930)

In [None]:
#run for all the rows
#train['country'] = train.apply(lambda x: geo_reverse(x['latitude'],x['longitude']), axis=1)
#train.head(10)
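Running the lookup for every row sends one request per recording, even though many recordings share the same coordinates. A possible sketch is to geocode each unique latitude/longitude pair once and merge the result back. `add_country` is a hypothetical helper, and `fake_lookup` is a stand-in for `geo_reverse` so the cell runs offline; the toy rows and the countries it returns are made up.

```python
import pandas as pd

def add_country(df, lookup):
    """Call lookup(lat, lon) once per unique coordinate pair, then merge back."""
    unique = df[["latitude", "longitude"]].drop_duplicates().copy()
    unique["country"] = [
        lookup(lat, lon)
        for lat, lon in zip(unique["latitude"], unique["longitude"])
    ]
    # A left merge keeps every original row, in the original order
    return df.merge(unique, on=["latitude", "longitude"], how="left")

# Stand-in for geo_reverse: records each call instead of hitting the network
calls = []
def fake_lookup(lat, lon):
    calls.append((lat, lon))
    return "Burkina Faso" if lat > 0 else "South Africa"

# Toy rows (made up for illustration), with one duplicated coordinate pair
toy = pd.DataFrame({
    "latitude": [12.391, 12.391, -33.9],
    "longitude": [-1.493, -1.493, 18.4],
})

result = add_country(toy, fake_lookup)
print(result)
print(len(calls), "lookups for", len(toy), "rows")
```

With the real data you would pass `geo_reverse` in place of `fake_lookup`, using the original `latitude` and `longitude` columns rather than the jittered ones, since the jitter makes every pair unique.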

In [None]:
#show only the birds in the scored_birds list
fig = px.scatter_geo(train, 
                    lat=train['newlat'],
                    lon=train['newlong'],
                    color="primary_label",
                    hover_name="common_name",
                    projection="natural earth")
fig.show()

* **What can we learn from this?**
 * First off, notice that no birds from the scored_birds list were recorded in South America!
 * Also, notice that almost all of the bird recordings within a given continent come from the same species.
 * We should check the distribution of birds in the dataset.
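Checking the distribution can start with a simple count of recordings per species. The sketch below uses a made-up toy column so it runs on its own; on the real data you would call the same methods on `train['primary_label']`.

```python
import pandas as pd

# Toy labels (made up for illustration); the real column is train['primary_label']
toy = pd.DataFrame({
    "primary_label": ["skylar", "skylar", "skylar", "houfin", "iiwi", "iiwi"],
})

# Recordings per species, most common first
counts = toy["primary_label"].value_counts()

# Each species' share of all recordings
share = (counts / counts.sum()).round(3)

summary = pd.DataFrame({"n_recordings": counts, "share": share})
print(summary)
```

A heavily skewed table here is a hint that some species may need special handling, for example resampling or class weights, when training a model.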

## 2 Bonus! Plotting with Folium
* Try plotting with Folium below. It's another great way to visualize geospatial data! Uncomment the code blocks and run to see how it looks.

## [Tutorial: Play Bird Audio on Map with Folium](https://www.kaggle.com/alexteboul/tutorial-play-bird-audio-on-map-with-folium)

<iframe src='https://xeno-canto.org/122399/embed?simple=1' scrolling='no' frameborder='0' width='340' height='115'></iframe>