# Introduction
The Association of Zoos and Aquariums (AZA) is a non-profit, independent accrediting organization representing more than 250 facilities in the United States and internationally. These facilities may participate in the AZA’s cooperatively managed Species Survival Plan (SSP) programs, which aim to manage a species’ ex situ population (that is, the population living “outside of its natural habitat”, e.g. in zoos).

The International Union for Conservation of Nature (IUCN), composed of both governmental and non-governmental organizations, is the global authority on the status of the natural world. The IUCN’s Red List of Threatened Species is the world’s most comprehensive source of information on the global extinction risk status of animal, fungus, and plant species. Evaluated species are classified into one of eight categories: 
- Extinct (EX)
- Extinct in the Wild (EW)
- Critically Endangered (CR)
- Endangered (EN)
- Vulnerable (VU)
- Near Threatened (NT)
- Least Concern (LC)
- Data Deficient (DD)

Additionally, the direction of change of a species’ population size over time is assessed and categorized as one of the following:

- Increasing
- Decreasing
- Stable
- Unknown
- Unspecified

This project aims to analyze animal species with dedicated AZA SSPs and their IUCN Red List global extinction risk statuses to determine if SSPs prioritize the world’s most vulnerable species. 

# Load Data

In [627]:
import pandas as pd #Python library for data manipulation and analysis


SSP dataset created with data retrieved from the Association of Zoos and Aquariums' Animal Program Database. 

AZA. 2024. Animal Program Database. https://www.aza.org.

In [None]:
ssp = pd.read_csv('0225_aza_ssp.csv') #Read in SSP data
ssp.head() #Display first 5 rows

Red List dataset created with data retrieved from the International Union for Conservation of Nature's Red List of Threatened Species. 

IUCN. 2024. The IUCN Red List of Threatened Species. Version 2024-2. https://www.iucnredlist.org.

In [None]:
redlist = pd.read_csv('0225_iucn_assessments.csv') #Read in Red List data
redlist.head() #Display first 5 rows

# Clean and Merge Data

In [None]:
print(redlist.dtypes, ssp.dtypes) #Check for dtype errors, none found

In [None]:
ssp.drop( #Drop unneeded SSP columns
    ['program_type',
     'genus_name',
     'species_name',
     'subspecies'],
    axis=1, #Axis=1 is columns, axis=0 would be rows
    inplace=True) #inplace=True modifies the existing dataframe; inplace=False would return a new dataframe
ssp.head() #Display first 5 rows

In [None]:
redlist.drop( #Drop unneeded Red List columns
    ['taxon',
     'common_name',
     'genus_name',
     'species_name',
     'assessment_scope'],
    axis=1, #Axis=1 is columns, axis=0 would be rows
    inplace=True) #inplace=True modifies the existing dataframe; inplace=False would return a new dataframe
redlist.head() #Display first 5 rows

In [None]:
print(ssp.isnull().any()) #Check for null values, none found
print(redlist.isnull().any()) #Check for null values, none found

In [None]:
redlist = redlist.set_index('scientific_name')
ssp = ssp.set_index('scientific_name')
ssp_vs_redlist = ssp.join(redlist, lsuffix='_SSP', rsuffix='_RL') #Combine data at column 'scientific_name'
print(ssp_vs_redlist.shape)
ssp_vs_redlist.head() #Display first 5 rows
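
Because `DataFrame.join` performs a left join by default, an SSP species whose `scientific_name` has no exact match in the Red List data is kept with missing Red List values rather than dropped. The following is a minimal sketch of one way to surface such rows, using `merge` with `indicator=True`. The two small frames and the names in them are made-up placeholders, not rows from the project's CSV files.

```python
import pandas as pd

# Hypothetical stand-in data; the notebook itself reads these from CSV files
ssp_demo = pd.DataFrame({
    "scientific_name": ["Genus alpha", "Genus beta", "Genus gamma"],
    "common_name": ["Alpha", "Beta", "Gamma"],
})
redlist_demo = pd.DataFrame({
    "scientific_name": ["Genus alpha", "Genus beta"],
    "assessment": ["VU", "CR"],
})

# how="left" keeps every SSP row; indicator=True adds a "_merge" column
merged_demo = ssp_demo.merge(
    redlist_demo, on="scientific_name", how="left", indicator=True
)

# Rows marked "left_only" had no matching Red List record
unmatched = merged_demo[merged_demo["_merge"] == "left_only"]
print(unmatched["scientific_name"].tolist())
```

If the printed list is empty for the real data, every SSP species found a Red List match.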

In [None]:
#Strip whitespace and normalize case in all columns 
ssp_vs_redlist['taxon'] = ssp_vs_redlist['taxon'].str.strip().str.upper()
ssp_vs_redlist['common_name'] = ssp_vs_redlist['common_name'].str.strip().str.upper()
ssp_vs_redlist['assessment'] = ssp_vs_redlist['assessment'].str.strip().str.upper()
ssp_vs_redlist['population_trend'] = ssp_vs_redlist['population_trend'].str.strip().str.upper()
ssp_vs_redlist.head() #Display first 5 rows
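
The same clean-up can be written as a loop so that each column name appears only once. This is a small sketch on a made-up frame; the values and the `count` column are illustrative assumptions, not project data.

```python
import pandas as pd

# Hypothetical stand-in frame with untidy text values
demo = pd.DataFrame({
    "assessment": [" cr", "En ", "vu"],
    "population_trend": ["Decreasing ", " stable", "UNKNOWN"],
    "count": [1, 2, 3],
})

# List the text columns explicitly so numeric columns are left untouched
text_columns = ["assessment", "population_trend"]

for column in text_columns:
    # Strip surrounding whitespace, then convert to upper case
    demo[column] = demo[column].str.strip().str.upper()

print(demo)
```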

In [None]:
print(ssp_vs_redlist.shape) #Size before removing duplicate values
duplicates = ssp_vs_redlist[ssp_vs_redlist.duplicated()] #Check for duplicate values
print(duplicates) #Show duplicate values
ssp_vs_redlist = ssp_vs_redlist.drop_duplicates() #Drop duplicate values
print(ssp_vs_redlist.shape) #Size after removing duplicate values

In [None]:
print(ssp_vs_redlist.describe())
ssp_vs_redlist.info() #info() prints its summary directly and returns None, so no print() is needed

# Analysis

In [None]:
assessment_count = ssp_vs_redlist.groupby('assessment').assessment.count() #Group ssp_vs_redlist by assessment
assessment_count = assessment_count.sort_values()
print(assessment_count)
print(assessment_count.describe()) #Descriptive statistics
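
Sorting by count mixes the risk categories together. As an alternative, the counts can be shown in the order the introduction lists the Red List categories. This is a minimal sketch that assumes the `assessment` column holds the two-letter codes; the sample values are made up.

```python
import pandas as pd

# Red List categories in the order given in the introduction
risk_order = ["EX", "EW", "CR", "EN", "VU", "NT", "LC", "DD"]

# Hypothetical stand-in values for the assessment column
assessments = pd.Series(["LC", "CR", "LC", "VU", "EN", "LC", "CR"])

counts = assessments.value_counts()

# Keep only the categories that actually occur, in risk order
present = [code for code in risk_order if code in counts.index]
ordered_counts = counts.reindex(present)
print(ordered_counts)
```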

In [None]:
population_trend_count = ssp_vs_redlist.groupby('population_trend').population_trend.count() #Group ssp_vs_redlist by population_trend
population_trend_count = population_trend_count.sort_values()
print(population_trend_count)
print(population_trend_count.describe()) #Descriptive statistics

In [None]:
assessment_and_trend = ssp_vs_redlist.groupby(['assessment', 'population_trend']).assessment.count() #Group ssp_vs_redlist by assessment AND population_trend
print(assessment_and_trend)
print(assessment_and_trend.describe()) #Descriptive statistics
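
The combined grouping can also be laid out as a table of percentages, which avoids computing each share by hand. This is a sketch using `pd.crosstab` on made-up rows; the column names match the notebook, but the data are placeholders.

```python
import pandas as pd

# Hypothetical stand-in data with the notebook's column names
demo = pd.DataFrame({
    "assessment": ["CR", "CR", "EN", "LC", "LC", "LC", "LC", "VU"],
    "population_trend": [
        "DECREASING", "DECREASING", "STABLE", "STABLE",
        "DECREASING", "INCREASING", "STABLE", "DECREASING",
    ],
})

# Rows are assessments, columns are trends;
# normalize="all" divides every cell by the grand total
share_table = pd.crosstab(
    demo["assessment"], demo["population_trend"], normalize="all"
) * 100

print(share_table.round(1))
```

Each cell is the percentage of all rows that fall into that assessment and trend combination, so the whole table sums to 100.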

# Visualization

In [641]:
import matplotlib.pyplot as plt #Python library for data visualization

In [None]:
labels = 'CR', 'EN', 'VU', 'NT', 'LC', 'DD' #Define data
values = [46, 53, 44, 24, 133, 1] #Retrieved from analysis and manually entered
colors = ['#008080', '#e19464', '#ced7b1', '#7eab98', '#ca562c', '#f1cf9e'] #Hex codes for custom colors
fig, ax = plt.subplots(figsize=(10, 10)) #Create pie chart
# fig is the figure, ax is the axes object drawn on it
# plt.subplots() preferred over plt.figure() because it gives more control
ax.pie(values, labels=labels, colors=colors, autopct='%1.1f%%')
# ax.pie preferred over plt.pie because it gives more control
# autopct formats percentage labels, %1.1f%% specifies 1 decimal place
ax.set_title('Species Distribution by Extinction Risk Status')
plt.show() #Display pie chart

In [None]:
labels = 'Increasing', 'Decreasing', 'Stable', 'Unknown', 'Unspecified' #Define data
values = [26, 199, 55, 19, 2] #Retrieved from analysis and manually entered
colors = ['#008080', '#e19464', '#ced7b1', '#7eab98', '#ca562c'] #Hex codes for custom colors
fig, ax = plt.subplots(figsize=(10, 10)) #Create pie chart
ax.pie(values, labels=labels, autopct='%1.1f%%', colors=colors)
ax.set_title('Species Distribution by Population Trend')
plt.show() #Display pie chart

In [None]:
labels = ['CR - Increasing', 'CR - Decreasing', 'CR - Stable', 'CR - Unknown', 'CR - Unspecified', 
              'EN - Increasing', 'EN - Decreasing', 'EN - Stable', 'EN - Unknown',
              'VU - Increasing', 'VU - Decreasing', 'VU - Stable', 'VU - Unknown', 
              'NT - Decreasing', 'NT - Unknown', 
              'LC - Increasing', 'LC - Decreasing', 'LC - Stable', 'LC - Unknown', 
              'DD - Decreasing',] #Define data
values = [3/285*100, 32/285*100, 2/285*100, 1/285*100, 2/285*100, 
          4/285*100, 37/285*100, 3/285*100, 1/285*100, 
          2/285*100, 37/285*100, 1/285*100, 2/285*100, 
          23/285*100, 1/285*100,
          17/285*100, 53/285*100, 49/285*100, 14/285*100,
          1/285*100] #Retrieved from analysis and entered manually, assessment-trend value / total number of species * 100 to make a percentage
colors = ['#008080', '#008080', '#008080', '#008080', '#008080',
          '#e19464', '#e19464', '#e19464', '#e19464', 
          '#ced7b1', '#ced7b1', '#ced7b1', '#ced7b1', 
          '#7eab98', '#7eab98',
          '#ca562c', '#ca562c', '#ca562c', '#ca562c', 
          '#f1cf9e'] #Hex codes for custom colors
fig, ax = plt.subplots(figsize=(10, 10))
bars = ax.bar(labels, values, color=colors) #Create bar chart
plt.xlabel('Extinction Risk Status and Population Trend')
plt.ylabel('Percentage of Species')
plt.title('Species Distribution by Extinction Risk Status and Population Trend')
plt.xticks(rotation=45, # Adjust layout
           ha='right') #Aligns labels to avoid overlap
plt.tight_layout()
for bar in bars: #Add labels on each bar
    height = bar.get_height()
    label = f'{height:.1f}%' #Create label and format to have 1 decimal place
    plt.text(bar.get_x() + bar.get_width() / 2, #Specifies label location as center of bar
             height, label, ha='center', va='bottom', fontsize=10, rotation=60)
plt.show() #Display bar chart

# Discussion

### Summary 

The purpose of this project was to analyze animal species with dedicated AZA SSPs and their IUCN Red List global extinction risk statuses to determine if SSPs prioritize the world’s most vulnerable species.

To complete this analysis, I created two datasets: one with data retrieved from the Association of Zoos and Aquariums’ Animal Program Database and one with data retrieved from the International Union for Conservation of Nature’s Red List of Threatened Species.

This data was read, cleaned, manipulated, and analyzed using Pandas. All visualizations were created using Matplotlib. A data dictionary for each dataset is included in the repository, and instructions on how to run the project using a virtual environment can be found in the README.

### Findings

As the pie chart titled ‘Species Distribution by Extinction Risk Status’ shows, over half of all species with dedicated SSPs are classified as Near Threatened or Least Concern, the two categories least at risk of extinction. At the same time, the populations of over half of the species with dedicated SSPs are classified as decreasing, as shown in the pie chart titled ‘Species Distribution by Population Trend’. The bar chart titled ‘Species Distribution by Extinction Risk Status and Population Trend’ unites these results and reveals what percentage of SSP species falls into each unique combination of extinction risk status and population trend.
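
The two "over half" figures can be checked directly from the counts entered in the pie chart cells. The sketch below only repeats that arithmetic. Note that these counts total 301, whereas the bar chart cell divides by 285, so the shares here use the pie chart totals.

```python
# Counts copied from the two pie chart cells in this notebook
assessment_counts = {"CR": 46, "EN": 53, "VU": 44, "NT": 24, "LC": 133, "DD": 1}
trend_counts = {
    "Increasing": 26, "Decreasing": 199, "Stable": 55,
    "Unknown": 19, "Unspecified": 2,
}

total_assessed = sum(assessment_counts.values())
total_trend = sum(trend_counts.values())

# Share of species classified as Near Threatened or Least Concern
low_risk_share = (assessment_counts["NT"] + assessment_counts["LC"]) / total_assessed * 100

# Share of species whose population trend is decreasing
decreasing_share = trend_counts["Decreasing"] / total_trend * 100

print(f"NT + LC: {low_risk_share:.1f}%")       # NT + LC: 52.2%
print(f"Decreasing: {decreasing_share:.1f}%")  # Decreasing: 66.1%
```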

### Implications 

The results of my analysis suggest that species with dedicated SSPs are likely to be faced with a low risk of extinction and a population that is decreasing in size. Whether these parameters are taken into account when determining for which species to establish dedicated SSPs remains unknown.

It is important to note that SSPs are not the only avenue by which AZA facilities protect vulnerable species. Though a species may not have its own SSP, it is likely safeguarded by at least one of the AZA’s many other conservation initiatives.

### Future Plans

I plan to continue working on and improving this project in the coming months. My first course of action will be to rework instances in which I manually entered data rather than referencing an object. I would also like to explore other methods of creating visualizations, perhaps creating a Tableau dashboard. Ultimately, I would like to convert all tasks into custom functions to improve performance.
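
As a rough illustration of the first and last plans, a small helper could derive chart labels and values from a dataframe column instead of typing them in. The function name `count_by` and the sample data are hypothetical; this is a sketch of the idea, not the project's implementation.

```python
import pandas as pd

def count_by(df, column):
    """Return (labels, values) for one column, sorted from most to least common."""
    counts = df[column].value_counts()
    return counts.index.tolist(), counts.tolist()

# Hypothetical stand-in data
demo = pd.DataFrame({
    "population_trend": [
        "DECREASING", "STABLE", "DECREASING",
        "INCREASING", "DECREASING", "STABLE",
    ]
})

labels, values = count_by(demo, "population_trend")
print(labels, values)
```

The returned lists could then replace the hand-typed `labels` and `values` in a call such as `ax.pie(values, labels=labels, autopct='%1.1f%%')`.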