# Biodiversity in National Parks

## Overview
This analysis looks for patterns in the conservation status of species across national parks and identifies which species and parks are most at risk. The aim is to give the National Park Service actionable insights for focusing its conservation efforts.

## Project Goals
- Investigate the distribution of species across parks.
- Analyze conservation statuses and identify trends.
- Visualize data to highlight patterns and insights.
- Provide actionable recommendations for conservation efforts.

In [None]:
# Import the required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Configure seaborn and matplotlib so the generated plots are easier to read
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (10, 6)




In [4]:
# Load the CSV files with pandas.

species = pd.read_csv('species_info.csv')
observations = pd.read_csv('observations.csv')

# Preview each dataset with head() to confirm that it loaded properly.
print("Species dataset preview: ")
display(species.head())

print("\nObservations Dataset Preview:")
display(observations.head())

Species dataset preview: 


Unnamed: 0,category,scientific_name,common_names,conservation_status
0,Mammal,Clethrionomys gapperi gapperi,Gapper's Red-Backed Vole,
1,Mammal,Bos bison,"American Bison, Bison",
2,Mammal,Bos taurus,"Aurochs, Aurochs, Domestic Cattle (Feral), Dom...",
3,Mammal,Ovis aries,"Domestic Sheep, Mouflon, Red Sheep, Sheep (Feral)",
4,Mammal,Cervus elaphus,Wapiti Or Elk,



Observations Dataset Preview:


Unnamed: 0,scientific_name,park_name,observations
0,Vicia benghalensis,Great Smoky Mountains National Park,68
1,Neovison vison,Great Smoky Mountains National Park,77
2,Prunus subcordata,Yosemite National Park,138
3,Abutilon theophrasti,Bryce National Park,84
4,Githopsis specularioides,Great Smoky Mountains National Park,85
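
Beyond reading the first rows by eye, a few programmatic checks can help confirm that the load worked. The following is a minimal sketch and not part of the original analysis: it assumes a handful of rows copied from the previews above, held as in-memory CSV text so that it runs without the files. It reports which columns the two tables share and how many values are missing per column. The names `species_csv`, `observations_csv`, `shared_columns`, and `missing_per_column` are illustrative.

```python
import io

import pandas as pd

# A few rows copied from the previews above (index column omitted).
# Assumption: the real files follow the same layout.
species_csv = """category,scientific_name,common_names,conservation_status
Mammal,Clethrionomys gapperi gapperi,Gapper's Red-Backed Vole,
Mammal,Bos bison,"American Bison, Bison",
Mammal,Cervus elaphus,Wapiti Or Elk,
"""

observations_csv = """scientific_name,park_name,observations
Vicia benghalensis,Great Smoky Mountains National Park,68
Neovison vison,Great Smoky Mountains National Park,77
Prunus subcordata,Yosemite National Park,138
"""

species = pd.read_csv(io.StringIO(species_csv))
observations = pd.read_csv(io.StringIO(observations_csv))

# Columns present in both tables are candidate join keys.
shared_columns = sorted(set(species.columns) & set(observations.columns))
print("Shared columns:", shared_columns)

# Empty CSV fields are read as NaN, so isna() counts them as missing.
missing_per_column = species.isna().sum()
print(missing_per_column)
```

On the real data, the same two checks would run directly on the `species` and `observations` DataFrames loaded from the CSV files.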


In [7]:
# Let's now look at the unique values in selected columns.
# The .unique() method returns an array of the distinct values in a column.

print("Unique Species Categories:")
print(species['category'].unique(),"\n")

print("Conservation Statuses:")
print(species['conservation_status'].unique(),"\n")

print("Park Names:")
print(observations["park_name"].unique(),"\n")

Unique Species Categories:
['Mammal' 'Bird' 'Reptile' 'Amphibian' 'Fish' 'Vascular Plant'
 'Nonvascular Plant'] 

Conservation Statuses:
[nan 'Species of Concern' 'Endangered' 'Threatened' 'In Recovery'] 

Park Names:
['Great Smoky Mountains National Park' 'Yosemite National Park'
 'Bryce National Park' 'Yellowstone National Park'] 
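
The `nan` entry in the conservation statuses marks rows where no status is recorded, which matches the empty `conservation_status` cells in the species preview. The following is a minimal sketch of how missing values could be counted alongside the named statuses. It assumes a small made-up Series (`statuses`) whose frequencies are purely illustrative and do not reflect the real dataset.

```python
import numpy as np
import pandas as pd

# Illustrative values only: the labels come from the output above,
# but how often each appears here is made up for the example.
statuses = pd.Series(
    [np.nan, "Species of Concern", "Endangered", np.nan,
     "Threatened", "In Recovery", np.nan],
    name="conservation_status",
)

# value_counts() drops NaN by default; dropna=False keeps it as its own row.
counts_with_missing = statuses.value_counts(dropna=False)
print(counts_with_missing)

# Number of rows without a recorded status.
n_missing = int(statuses.isna().sum())
print("Rows without a status:", n_missing)

# nunique() also ignores NaN unless told otherwise.
n_named_statuses = statuses.nunique()
n_distinct_including_missing = statuses.nunique(dropna=False)
print("Named statuses:", n_named_statuses)
```

Applied to `species['conservation_status']`, the same calls would show how many species have no recorded status compared with each named category.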

