# 📚 Table of Contents

- [1 - INTRODUCTION](#1---introduction)
- [1.1 - Dataset Overview](#11---dataset-overview)
- [2 - Understanding the Biodiversity Data](#2---understanding-the-biodiversity-data)

# 1 - INTRODUCTION

According to the United Nations (UN), biodiversity - or biological diversity - is defined as the "variety of life on Earth, in all its forms, from genes and bacteria to entire ecosystems such as forests or coral reefs" **[1]**. Biodiversity plays a vital role in sustaining life on Earth: it underpins food production, clean water, medicine, climate stability, and economic growth through ecosystem services such as pollination, water purification, and climate regulation. Over half of global GDP depends on nature, and forests and oceans can act as carbon sinks, absorbing more than half of all carbon emissions **[1]**.

However, this delicate and invaluable richness is increasingly threatened by human activities. Land use — particularly agriculture — is the leading driver of biodiversity loss, while climate change continues to accelerate species extinction and ecosystem collapse **[1]**.


This project aims to develop practical skills in data analysis and interpretation using Python libraries such as Pandas, Matplotlib, and Seaborn. The datasets, provided by Codecademy, focus on biodiversity, a critical component of ecological resilience and sustainability. Two CSV files are used: `species_info.csv`, which contains metadata about various species, and `observations.csv`, which records species sightings across national parks. Through exploratory analysis and visualization, the project seeks to uncover patterns in species distribution and conservation status.

Additionally, the analysis aims to explore ecological representation across protected areas, identifying trends in species richness and conservation priorities.


## 1.1 - Dataset Overview

**`species_info.csv` contains:**
- `category`: Broad classification of the species (e.g., Mammal, Bird, Reptile);
- `scientific_name`: Latin name of the species;
- `common_names`: Common names used to refer to the species;
- `conservation_status`: Indicates if the species is endangered, threatened, or of least concern.

**`observations.csv` contains:**
- `scientific_name`: Latin name of the species observed;
- `park_name`: Name of the national park where the observation was recorded;
- `observations`: Number of times the species was observed in that park.
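
Both files share the `scientific_name` column, which suggests it can serve as a join key between sightings and species metadata. The sketch below illustrates that idea under stated assumptions: the frames `species` and `sightings` are hypothetical stand-ins for the two CSV files, and the observation counts are made up for illustration only.

```python
import pandas as pd

# Toy stand-in for species_info.csv (subset of columns, two rows).
species = pd.DataFrame({
    "category": ["Mammal", "Mammal"],
    "scientific_name": ["Bos bison", "Ovis aries"],
    "conservation_status": [None, None],
})

# Toy stand-in for observations.csv; the counts are invented for this sketch.
sightings = pd.DataFrame({
    "scientific_name": ["Bos bison", "Bos bison", "Ovis aries"],
    "park_name": [
        "Yosemite National Park",
        "Bryce National Park",
        "Yosemite National Park",
    ],
    "observations": [10, 20, 30],
})

# Left join keeps every sighting row and attaches the species metadata.
# validate="many_to_one" raises an error if scientific_name is not unique
# in the species table, which would otherwise duplicate sighting rows.
merged = sightings.merge(
    species, on="scientific_name", how="left", validate="many_to_one"
)

# Total observations per category and park.
totals = merged.groupby(["category", "park_name"])["observations"].sum()
print(totals)
```

With the real files, the `validate` check is worth keeping, since a species table with repeated scientific names would silently inflate the observation totals after the join.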

___
# 2 - Understanding the Biodiversity Data

## 2.1 - Loading necessary Python libraries

In [7]:
import pandas as pd

## 2.2 - Loading data files and first data explorations

In [10]:
# Loading datasets

observ = pd.read_csv("observations.csv")
spec_info = pd.read_csv("species_info.csv")

In [None]:
# Preview the data

print(observ.head())
print(spec_info.head())

            scientific_name                            park_name  observations
0        Vicia benghalensis  Great Smoky Mountains National Park            68
1            Neovison vison  Great Smoky Mountains National Park            77
2         Prunus subcordata               Yosemite National Park           138
3      Abutilon theophrasti                  Bryce National Park            84
4  Githopsis specularioides  Great Smoky Mountains National Park            85
  category                scientific_name  \
0   Mammal  Clethrionomys gapperi gapperi   
1   Mammal                      Bos bison   
2   Mammal                     Bos taurus   
3   Mammal                     Ovis aries   
4   Mammal                 Cervus elaphus   

                                        common_names conservation_status  
0                           Gapper's Red-Backed Vole                 NaN  
1                              American Bison, Bison                 NaN  
2  Aurochs, Aurochs, Domestic 
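
Beyond `head()`, a first exploration often checks the size of each table, the column types, and how many values are missing. The following is a minimal sketch of those checks, not necessarily the steps this notebook takes. The `demo` frame is a hypothetical stand-in for `spec_info`, built from the first two rows of the preview above so the snippet runs without the CSV files.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for spec_info, using rows visible in the preview.
demo = pd.DataFrame({
    "category": ["Mammal", "Mammal"],
    "scientific_name": ["Clethrionomys gapperi gapperi", "Bos bison"],
    "common_names": ["Gapper's Red-Backed Vole", "American Bison, Bison"],
    "conservation_status": [np.nan, np.nan],
})

# Number of rows and columns.
print(demo.shape)

# Data type of each column.
print(demo.dtypes)

# Count of missing values per column.
missing = demo.isna().sum()
print(missing)

# Number of distinct species names, useful for spotting duplicate rows.
n_unique = demo["scientific_name"].nunique()
print(n_unique)
```

The same calls can be applied to `observ` and `spec_info` once the files are loaded.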

___

# 7 - References

**[1]** United Nations. (n.d.). *Biodiversity - our strongest natural defense against climate change*. Climate Action. Retrieved October 24, 2025, from https://www.un.org/en/climatechange/science/climate-issues/biodiversity
