### 1. Introduction

We take a quick look at the metadata of birdsong recordings from Romania, using the **pandas_profiling** Python package. 
We then map the geographical distribution of these recordings with **folium** and its **HeatMap** plugin.


### 2. Load packages and data

In [None]:
import pandas as pd
import numpy as np
from pandas_profiling import ProfileReport
import os
import matplotlib.pyplot as plt
import seaborn as sns 
import datetime as dt
import folium
from folium.plugins import HeatMap, HeatMapWithTime
%matplotlib inline

### 3. Data analysis   

After loading the data, we build a **ProfileReport** with **pandas_profiling**. The report shows where data is missing, how each feature is distributed, and which values are unique or rare. It also gives an indication of how usable each feature is (for models) and calculates the correlations between features.

In [None]:
data_df = pd.read_csv("/kaggle/input/xenocanto-birds-from-romania/birds_romania.csv")
profile = ProfileReport(data_df, title="Pandas Profiling Report")

In [None]:
profile
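Some of the checks summarized in the report can be approximated with plain pandas, which helps when reading the report. The following is a minimal sketch on a small hypothetical DataFrame: the `toy_df` name, its values, and the `species` column are assumptions made for illustration, not the real dataset or the way the profiling package computes its statistics.

```python
import pandas as pd

# Hypothetical toy data standing in for the real metadata (values are made up).
toy_df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "lat": [46.1, 46.1, 44.4, None, 47.2],
    "lng": [25.0, 25.0, 26.1, 26.1, 23.6],
    "species": ["merula", "merula", "major", "major", "merula"],
})

# Share of missing values per column.
missing_share = toy_df.isna().mean()

# Number of distinct non-missing values per column.
unique_counts = toy_df.nunique()

# Pairwise Pearson correlation between the numeric columns.
correlations = toy_df.select_dtypes(include="number").corr()

print(missing_share)
print(unique_counts)
print(correlations)
```

On the real data, the same three calls could be applied to `data_df` for a quick cross-check of the report.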

### 4. Geographical data distribution

To show the geographical distribution of the birdsong recordings, we:   
* aggregate the data on **latitude** and **longitude** and count the number of recordings at each geographical position;  
* initialize a folium **Map**, centered on Romania and with a suitable zoom level;  
* show the count of recordings at each position using a HeatMap layer.  

In [None]:
# Count the recordings at each (lat, lng) position
aggregated_df = data_df.groupby(["lat", "lng"])["id"].count().reset_index()
aggregated_df.columns = ['lat', 'lng', 'count']
# Base map centered roughly on Romania
m = folium.Map(location=[46, 26], zoom_start=7)
# Heat map layer weighted by the number of recordings at each position
HeatMap(data=aggregated_df[['lat', 'lng', 'count']],
        radius=30, max_zoom=18).add_to(m)
m
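The aggregation step can be checked in isolation on a few made-up rows. The sketch below is only an illustration: `toy_df` and its values are hypothetical, and scaling the counts to the range 0 to 1 is an optional variation, not something the map above requires.

```python
import pandas as pd

# Hypothetical recordings: two share the same position (values are made up).
toy_df = pd.DataFrame({
    "id": [101, 102, 103, 104],
    "lat": [46.0, 46.0, 44.4, 47.2],
    "lng": [25.0, 25.0, 26.1, 23.6],
})

# Count recordings per (lat, lng) position, as in the aggregation above.
counts = toy_df.groupby(["lat", "lng"])["id"].count().reset_index(name="count")

# Optional: scale the counts so the busiest position has weight 1.0.
counts["weight"] = counts["count"] / counts["count"].max()

# Rows of [lat, lng, weight], the point format a heat map layer expects.
heat_points = counts[["lat", "lng", "weight"]].values.tolist()
print(heat_points)
```

By default, `groupby` drops rows whose grouping keys are missing, so recordings without coordinates do not appear in the aggregated table.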