# DATA271 Final Project - Weather and Animal Migration Patterns in California

---

## Research Details

---

### Introducing the Problem
This project follows a statistical investigative process: it analyzes animal migration data together with meteorological datasets to identify environmental factors that influence how species move across California. We aim to understand how temperature, precipitation, and other atmospheric conditions affect when and where animals migrate, and whether these patterns are shifting in response to broader climate variability.

By pairing ecological tracking data from Movebank with detailed weather data from NOAA and NASA’s Daymet, this project investigates correlations between environmental change and behavioral shifts in migratory species. The goal is to use spatial and temporal analysis to uncover trends and stressors that can inform environmental monitoring, conservation planning, and scientific understanding of wildlife ecology.

---

### Addressing the Problem
The approach joins the datasets on geographic coordinates and dates to support spatiotemporal analysis. We’ll examine time series of both weather conditions and migration activity to determine whether patterns emerge, such as species arriving earlier after warmer winters or retreating from drought-affected areas.
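One way to join on coordinates and dates is to snap both tables to a shared day and grid cell, then merge on those keys. The following is a minimal sketch under stated assumptions: the column names, the 0.1-degree grid size, and the sample rows are all hypothetical and do not reflect the actual Movebank, NOAA, or Daymet schemas.

```python
import pandas as pd

GRID_DEG = 0.1  # assumed grid resolution in degrees; not taken from the project


def add_join_keys(df, time_col, lat_col, lon_col, grid_deg=GRID_DEG):
    """Return a copy of df with a calendar-day key and integer grid-cell keys."""
    out = df.copy()
    out["day"] = pd.to_datetime(out[time_col]).dt.floor("D")
    out["lat_cell"] = (out[lat_col] / grid_deg).round().astype(int)
    out["lon_cell"] = (out[lon_col] / grid_deg).round().astype(int)
    return out


# Tiny synthetic stand-ins for the tracking and weather tables.
tracks = pd.DataFrame({
    "timestamp": ["2021-03-01 06:15", "2021-03-01 18:40", "2021-03-02 07:05"],
    "lat": [36.61, 36.58, 36.92],
    "lon": [-121.89, -121.91, -121.76],
    "animal_id": ["A1", "A1", "A1"],
})
weather = pd.DataFrame({
    "date": ["2021-03-01", "2021-03-02"],
    "lat": [36.6, 36.9],
    "lon": [-121.9, -121.8],
    "tmax_c": [17.5, 19.0],
    "prcp_mm": [0.0, 2.4],
})

keys = ["day", "lat_cell", "lon_cell"]
tracks_k = add_join_keys(tracks, "timestamp", "lat", "lon")
weather_k = add_join_keys(weather, "date", "lat", "lon")[keys + ["tmax_c", "prcp_mm"]]

# Left join keeps every tracking fix; validate= raises if a weather cell-day repeats.
joined = tracks_k.merge(weather_k, on=keys, how="left", validate="many_to_one")
print(joined[["animal_id", "day", "tmax_c", "prcp_mm"]])
```

A left join keeps tracking fixes that have no matching weather record, so gaps in coverage show up as missing values rather than silently dropped rows.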

Additionally, we'll explore the possibility of species migrations being driven not just by weather, but also by ecological relationships such as predator-prey dynamics. By cross-referencing species presence and timing, we can identify potential cases where weather-driven migration may be influenced—or confounded—by avoidance of other species.
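Cross-referencing presence and timing could start with a table of counts per day and grid cell, with one column per species. The sketch below is illustrative only: the `cell` labels, the `prey` and `predator` species names, and the rows are invented, and a real analysis would need far more data before any comparison means much.

```python
import pandas as pd

# Synthetic sightings; species labels and roles are made up for illustration.
sightings = pd.DataFrame({
    "day": pd.to_datetime([
        "2021-03-01", "2021-03-01", "2021-03-02",
        "2021-03-02", "2021-03-03", "2021-03-03",
    ]),
    "cell": ["c1", "c1", "c1", "c2", "c2", "c1"],
    "species": ["prey", "predator", "prey", "prey", "prey", "predator"],
})

# One row per (day, cell), one column per species, values are fix counts.
presence = (
    sightings.groupby(["day", "cell", "species"]).size()
    .unstack("species", fill_value=0)
)

# Flag cell-days where both species were recorded.
presence["overlap"] = (presence["prey"] > 0) & (presence["predator"] > 0)

# Compare mean prey counts on cell-days with and without the predator.
predator_flag = (presence["predator"] > 0).map({True: "present", False: "absent"})
prey_by_predator = presence.groupby(predator_flag)["prey"].mean()
print(prey_by_predator)
```

A lower prey count where the predator is present would be consistent with avoidance, but it would not by itself separate avoidance from weather effects acting on both species.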

This work contributes toward a better understanding of how climate variability and atmospheric anomalies impact wildlife ecosystems, particularly within a climate-sensitive and biodiversity-rich region like California.

---

### Analysis Breakdown
Before starting the exploratory data analysis, we pose the following questions:

- What weather patterns or atmospheric variables (e.g., temperature, precipitation, wind patterns) correlate most strongly with migration timing or intensity for species in California?
- Are there identifiable climate thresholds or seasonal shifts that serve as predictors for migratory events?
- Can we spatially and temporally map migration patterns alongside climate variables to detect meaningful trends or changes over time?
- How might ongoing climate variability affect the predictability and consistency of migration behaviors in the near future?
- Are any migratory shifts better explained by predator-prey relationships than by weather factors, and can predator presence be used as a control for confounding when assessing causality?
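To illustrate what treating predator presence as a control could look like, the sketch below fits an ordinary least squares model of arrival day on winter temperature with a predator indicator as a second covariate. Everything here is an assumption made for illustration: the variable names, the synthetic data, and the effect sizes are invented, and adjusting for a measured covariate does not establish causality in observational data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic data only: values and effect sizes are invented for this sketch.
winter_temp_c = rng.normal(10.0, 2.0, n)
predator_present = rng.integers(0, 2, n).astype(float)
arrival_doy = (
    120.0
    - 3.0 * winter_temp_c       # warmer winters -> earlier arrival
    + 5.0 * predator_present    # predator presence -> later arrival
    + rng.normal(0.0, 1.0, n)   # noise
)

# Design matrix: intercept, temperature, predator indicator.
X = np.column_stack([np.ones(n), winter_temp_c, predator_present])
coef, *_ = np.linalg.lstsq(X, arrival_doy, rcond=None)
intercept, temp_effect, predator_effect = coef

print(f"temperature effect: {temp_effect:.2f} days per degree C")
print(f"predator effect:    {predator_effect:.2f} days")
```

Comparing the temperature coefficient with and without the predator column in `X` gives a rough sense of how much the weather estimate depends on that control.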

Our analysis will be broken down into the following stages:

1. **Explore Individual Datasets**  
   Clean and summarize each dataset—checking for null values, date and location alignment, and variable consistency. Generate summary statistics and visualizations to understand baseline structure and behavior.

2. **Analyze Combined Datasets**  
   Merge animal movement and climate datasets on time and location. Create layered time series visualizations and heat maps to understand migratory behavior in relation to environmental variables.

3. **Evaluate Significance and Observational Limitations**  
   Evaluate both the correlation strength and ecological plausibility of observed patterns. Recognize the observational nature of the data and avoid overextending conclusions where causality cannot be proven.

4. **Answer Research Questions**  
   Revisit initial questions in light of findings. Highlight where results support or refute assumptions about how atmospheric conditions drive migration, and discuss the role of other ecological pressures.

5. **Recommendations & Further Exploration**  
   Suggest data-driven implications for conservation or climate adaptation strategies. Propose new directions for data collection (e.g., more granular predator presence data), and identify gaps or limitations in the current analysis.
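The checks described in the first stage can be sketched as a small report function. This is a minimal example, not the project's actual cleaning code: the column names, the sample rows, and the rough California bounding box are assumptions chosen for illustration.

```python
import pandas as pd

# Rough bounding box for California; approximate values chosen for this sketch.
CA_BOUNDS = {"lat": (32.5, 42.1), "lon": (-124.5, -114.1)}


def quality_report(df, time_col="timestamp", lat_col="lat", lon_col="lon"):
    """Summarize nulls, date coverage, and out-of-region coordinates."""
    times = pd.to_datetime(df[time_col], errors="coerce")
    # Rows with a missing coordinate evaluate to False and count as outside.
    in_box = (
        df[lat_col].between(*CA_BOUNDS["lat"])
        & df[lon_col].between(*CA_BOUNDS["lon"])
    )
    return {
        "rows": len(df),
        "nulls_per_column": df.isna().sum().to_dict(),
        # Values that were present but could not be parsed as timestamps.
        "unparseable_times": int(times.isna().sum() - df[time_col].isna().sum()),
        "first_time": times.min(),
        "last_time": times.max(),
        "outside_region": int((~in_box).sum()),
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Synthetic rows with deliberate problems.
raw = pd.DataFrame({
    "timestamp": ["2021-03-01 06:15", "not a date", None, "2021-03-04 09:00"],
    "lat": [36.6, 36.7, 36.8, 47.6],   # last point lies outside the box
    "lon": [-121.9, -121.8, None, -122.3],
})
report = quality_report(raw)
for name, value in report.items():
    print(f"{name}: {value}")
```

Running the same report on each source dataset before merging makes mismatched date ranges or coordinate conventions easier to spot early.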

---

### Datasets
1. [NOAA National Centers for Environmental Information](https://www.ncei.noaa.gov/) – historical and real-time climate data  
2. [NASA's Daymet](https://daymet.ornl.gov) – gridded daily weather and climatology variables  
3. [Movebank](https://www.movebank.org/) – openly shared animal movement data across species  

---

### Libraries & Modules
- **pandas:** Time series wrangling and merging datasets  
- **NumPy:** Core scientific computing and vectorized calculations  
- **Matplotlib & Seaborn:** Visualizations for temporal and distributional trends  
- **Plotly:** Interactive visualizations and geographic mapping  
- **GeoPandas:** Spatial joins and mapping of migration paths and weather zones  

---

### Project Resources
- [GitHub Repository](https://github.com/toritotony/Data271FinalProject)


## Collecting Data

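```python
# Standard library
import random

# Data handling and numerics
import numpy as np
import pandas as pd

# Plotting
import matplotlib.pyplot as plt
import seaborn as sns
from plotnine import *  # wildcard import, as in the original notebook

# Socrata Open Data API client
from sodapy import Socrata
```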


## Clean Data

## Gather Statistics

## Analyze Statistics 

## Apply Inferential Statistics and Prediction

## Answer Questions and Conclude Findings

## References and Citations