# 🐧 The Evolution of Puffin's Population Regarding Climate Change

Welcome to the main Jupyter Notebook of our _Python project for Data Science_! 
We are students at the engineering school ENSAE, Institut Polytechnique de Paris, and this project is part of the 2nd-year course supervised by Lino Galiana (Insee) and Romain Avouac (Insee). 

We share a passion for a clumsy but reaaaally cute seabird that is very common in Iceland: **the Atlantic puffin**. Unfortunately, its population is declining, and not by chance. Numerous high-quality scientific studies carried out over the years have documented the challenges this seabird species faces. Its biggest enemy to date is **climate change**, which affects its food supply (fish), reproduction and nesting. 

With this notebook, we set out to (re)demonstrate the causal link between global warming and the decline in the Atlantic puffin population. **The aim at the end of this page is to establish medium-term predictions for the evolution of this species, taking into account the different climate scenarios envisaged. Graphical visualization tools will support our modeling results.** 

Are you ready to see the disastrous consequences of climate change on such adorable birds? Let's get started! 
🐧🐧🐧🐧🐧🐧

## 0.1 : Setting up the work environment 

- data access through the cloud MinIO Client (files are in the folder 'diffusion')
- required packages 
- organized working directories 

In [1]:
import pandas as pd
import s3fs
import os

# Geographical visualization packages
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Polygon, LineString, Point
import matplotlib
import contextily as ctx
import folium
from folium import LayerControl
from IPython.display import display
import webbrowser

# Creation of two folders to centralize data and results
data_dir = "./data"  
os.makedirs(data_dir, exist_ok=True) 
results_dir = "./results"
os.makedirs(results_dir, exist_ok=True)


# Connection settings for the MinIO cloud storage (Eve's bucket)
fs = s3fs.S3FileSystem(client_kwargs={"endpoint_url": "https://minio.lab.sspcloud.fr"})
MY_BUCKET = "esam"

## xx : Map visualization of Atlantic puffin distribution worldwide 

The Atlantic puffin is a migratory bird of the North Atlantic and Arctic polar zones. BirdLife International's data enable us to visualize the presence of these birds across the globe. 

### 2.1 MinIO Client cloud data retrieval and overview

_Nota bene: we encountered difficulties working with files directly as S3 streams (shapefile, .nc, etc.). The program below therefore downloads the files from MinIO to local storage first, so that pandas and geopandas can read them reliably._ 

In [3]:
# Creation of a subfolder to store Shapefile files and downloading of needed files
local_shapefile_subfolder = os.path.join(data_dir, "local_shapefile_files")
os.makedirs(local_shapefile_subfolder, exist_ok=True)

shapefile_elements = ["F_arctica.shp", "F_arctica.shx", "F_arctica.dbf"]
for element in shapefile_elements:
    remote_path = f"{MY_BUCKET}/diffusion/puffin_data/{element}"
    local_path = os.path.join(local_shapefile_subfolder, element)
    with fs.open(remote_path, "rb") as remote_file:
        with open(local_path, "wb") as local_file:
            local_file.write(remote_file.read())

# Read the main shapefile
local_shapefile_path = os.path.join(local_shapefile_subfolder, "F_arctica.shp")
gdf = gpd.read_file(local_shapefile_path)
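
The download loop above reads each file fully into memory with `remote_file.read()`, which is fine for small shapefile components. For larger files (such as the .nc files mentioned in the note), a chunked copy may be gentler on memory. The sketch below is one possible way to do it, using only the standard library; `download_file` and its `open_remote` parameter are hypothetical names, and `open_remote` is assumed to be any callable that returns a binary file-like object, such as `fs.open`.

```python
import os
import shutil


def download_file(open_remote, remote_path, local_path, chunk_size=1024 * 1024):
    """Copy one remote file to local_path in chunks of chunk_size bytes.

    open_remote is assumed to behave like fs.open: it takes a path and a
    mode and returns a binary file-like object usable as a context manager.
    """
    # Create the destination folder if it does not exist yet
    os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
    with open_remote(remote_path, "rb") as remote_file, open(local_path, "wb") as local_file:
        # copyfileobj reads and writes chunk by chunk instead of loading everything
        shutil.copyfileobj(remote_file, local_file, length=chunk_size)
    return local_path
```

With the objects defined in this notebook, a call could look like `download_file(fs.open, remote_path, local_path)` inside the loop over `shapefile_elements`.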

### 2.2 Preparing files for graphical display

Geographic data and geometries need a correctly declared coordinate system to be displayed properly on a map.
The Coordinate Reference System (CRS) EPSG:4326 expresses positions as longitude and latitude in degrees (WGS 84). Its main uses are: 
- satellite data 
- global cartography 
- geographic data reference system
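
To make the idea of a CRS more concrete, here is a small worked example, given as an illustration only. EPSG:4326 stores angles in degrees, while web map tiles are usually drawn in Web Mercator (EPSG:3857), which stores metres. The function name `lonlat_to_web_mercator` is hypothetical; it applies the standard spherical Mercator formulas and is not part of the notebook's pipeline, since geopandas and folium handle such conversions themselves.

```python
import math

# Sphere radius used by Web Mercator (equal to the WGS 84 semi-major axis), in metres
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert EPSG:4326 degrees to EPSG:3857 metres (spherical Mercator)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# A point off the Icelandic coast (illustrative coordinates)
x, y = lonlat_to_web_mercator(-20.0, 64.0)
print(f"x = {x:.0f} m, y = {y:.0f} m")
```

The same point therefore has very different numeric coordinates depending on the CRS, which is why a wrongly declared CRS puts geometries in the wrong place on the map.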

In [8]:
from shapely.ops import transform

# Make sure the data use EPSG:4326 (longitude/latitude in degrees)
if gdf.crs is None:
    # No CRS declared: attach EPSG:4326 without changing the coordinates
    gdf = gdf.set_crs(epsg=4326)
    print("No CRS was declared; it has been set to EPSG:4326.")
elif gdf.crs.to_epsg() != 4326:
    # A different CRS is declared: reproject the coordinates
    print("The initial CRS is :", gdf.crs)
    gdf = gdf.to_crs(epsg=4326)
    print("The data have been reprojected to EPSG:4326.")

# Simplification of the geometries (tolerance in CRS units, here degrees)
# if len(gdf) > 1000:  # For example, for large files only
gdf["geometry"] = gdf["geometry"].simplify(tolerance=0.01)
print("Geometries have been simplified.")

# Conversion to 2D geometries
def convert_to_2d(geom):
    """Return the geometry without its Z coordinate, if it has one."""
    if geom is not None and geom.has_z:
        # Rebuild the geometry keeping only x and y
        return transform(lambda x, y, z=None: (x, y), geom)
    return geom

gdf["geometry"] = gdf["geometry"].apply(convert_to_2d)

# Check the validity of the geometries
invalid_count = (~gdf.is_valid).sum()
print(f"Number of invalid geometries before correction : {invalid_count}")

# Repair invalid geometries (buffer(0) rebuilds them)
if invalid_count > 0:
    gdf["geometry"] = gdf["geometry"].buffer(0)

Geometries have been simplified.
Number of invalid geometries before correction : 0
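
The `tolerance` passed to `simplify` is expressed in the units of the CRS (degrees for EPSG:4326). Shapely documents its simplification as based on the Douglas-Peucker algorithm. The pure-Python sketch below only illustrates that idea on a list of (x, y) points; it is not shapely's implementation, and the function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment [a, b]."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Position of the projection of p on the segment, clamped to [0, 1]
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Drop points that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    distances = [point_segment_distance(p, first, last) for p in points[1:-1]]
    max_dist = max(distances)
    if max_dist <= tolerance:
        # Every intermediate point is close enough: keep only the endpoints
        return [first, last]
    # Otherwise keep the farthest point and simplify both halves recursively
    split = distances.index(max_dist) + 1
    left = douglas_peucker(points[: split + 1], tolerance)
    right = douglas_peucker(points[split:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.001), (2, 0), (3, 1), (4, 0)]
print(douglas_peucker(line, 0.01))  # [(0, 0), (2, 0), (3, 1), (4, 0)]
```

With a tolerance of 0.01, the tiny 0.001 wiggle is removed while the larger bump at (3, 1) is kept. A larger tolerance gives lighter but coarser shapes.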


### 2.3 Creation of an interactive folium map to visualize the worldwide presence of Atlantic puffins

The Atlantic puffin lives on the high seas for most of the year and returns to land only to breed. This map shows the breeding and non-breeding areas of the species, using reference data from BirdLife International.

The detailed puffin observation databases we use were compiled in breeding areas. 

In [None]:
# Definition of both layers to display on the map
breeding_zones = gdf[gdf["seasonal"] == 2]
non_breeding_zones = gdf[gdf["seasonal"] == 3]


# Creation of the Folium map and its layers 
m = folium.Map(location=[55, 0], zoom_start=3, tiles="CartoDB Positron")

def create_folium_layer(layer_gdf, name, color):
    """Create a GeoJSON layer for a given GeoDataFrame."""
    return folium.GeoJson(
        layer_gdf,
        name=name,
        tooltip=folium.GeoJsonTooltip(
            fields=["sci_name", "presence", "seasonal"], 
            aliases=["Scientific name", "Presence", "Seasonal"], 
            localize=True,
        ),
        # Same style for every feature of the layer; only the fill color varies
        style_function=lambda feature: {
            'fillColor': color,
            'color': 'black',
            'weight': 1,
            'fillOpacity': 0.5,
        },
    )

breeding_layer = create_folium_layer(breeding_zones, "Breeding zones", "green")
non_breeding_layer = create_folium_layer(non_breeding_zones, "Non-breeding zones", "lime")

breeding_layer.add_to(m)
non_breeding_layer.add_to(m)


<folium.features.GeoJson at 0x7f1f9dac0320>

### 2.4 Display of the folium map

Two options are possible: 
- display the folium map in the notebook 
    * keeps all results in this single notebook 
    * makes the notebook considerably heavier
- save the map as an HTML file in the folder ./results, then start a local server from the terminal to open it as a web page 

In [None]:
# 1st option : Displaying the map in the notebook 
display(m)

In [13]:
# 2nd option : HTML file served by a local server

m.save(os.path.join(results_dir, "puffin_distribution_map.html"))
print(os.path.abspath(results_dir))

# Then copy the cell output and run in a bash terminal:
#   cd <absolute path you just copied> && python -m http.server 8000
#
# Open http://localhost:8000 in a browser and select "puffin_distribution_map.html".

/home/onyxia/work/projet_python_2024_ENSAE/results

### About local server  
A **local server** is a web server environment that runs solely on your computer. It uses your machine as a “server” to deliver files, HTML content or web applications to a browser via a local URL (such as http://localhost:8000).
- localhost: A special address that refers to your own machine. 
- Port: A number used to distinguish different services on your machine. By default, a local server often uses port 8000 or 8080.

_Step 1: Launch a local server_
- Open a terminal and navigate to the directory containing your files.
- Run the following command:
    _python -m http.server 8000_

_Step 2: Access the server in a browser_
- Open a browser.
- Go to http://localhost:8000.
You'll see the files in the directory as if you were browsing a website.

Additional information:
- Limited access: by default, _python -m http.server_ listens on all network interfaces, so other machines on the same network may be able to reach it. Add _--bind 127.0.0.1_ to make it accessible from your own machine only.
- Stopping the server: press Ctrl+C in the terminal.
- Applications: web development (testing sites or applications locally before deploying them online) and data visualization (serving files such as Folium maps or interactive graphics). 
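
As an alternative to the terminal command, the same kind of server can be started from Python with the standard library. The sketch below is a minimal illustration under stated assumptions: `start_local_server` is a hypothetical helper, it binds to 127.0.0.1 so that only your own machine can connect, and it runs in a background thread so that the notebook stays usable.

```python
import functools
import http.server
import threading


def start_local_server(directory, port=8000):
    """Serve `directory` on http://127.0.0.1:<port> in a background thread.

    Passing port=0 lets the operating system pick a free port; the chosen
    port is then available as server.server_address[1].
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    # daemon=True: the thread stops when the Python process exits
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

For example, `server = start_local_server("./results")` would make the saved map reachable at http://localhost:8000/puffin_distribution_map.html, and `server.shutdown()` followed by `server.server_close()` stops it.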