# Exploring Maritime Activities and the S\&P 500

This notebook examines the relationship between AIS maritime traffic and S\&P 500 index prices. The analysis is organized into four main steps:

1. **Data Collection**
2. **Data Cleaning**
3. **Data Analysis**
4. **Presentation**

---

## Table of Contents

1. [Project Overview](#project-overview)
2. [Installation and Setup](#installation-and-setup)
3. [Database Connection](#database-connection)
4. [Data Loading](#data-loading)
5. [Dataframe Summaries](#dataframe-summaries)
6. [Helper Functions](#helper-functions)
7. [Main Analysis and Visualizations](#main-analysis-and-visualizations)
8. [Subset Analysis](#subset-analysis)
9. [Conclusion](#conclusion)

---

## 1. Project Overview {#project-overview}

We explore how daily vessel counts derived from AIS (Automatic Identification System) data correlate with the daily and monthly closing prices of the S\&P 500 index. The workflow has four components:

* **AIS Processor**: Processes AIS data in seven-day batches. For each batch it downloads and cleans the data, removes duplicate records and non-commercial vessels, maps each vessel to its nearest port, and stores the cleaned result.
* **Financial Fetcher**: Uses `yfinance` to retrieve historical closing prices for the S\&P 500 index (`^GSPC`) and its individual constituents, scraping the current ticker list from Wikipedia.
* **Database Storage**: Saves cleaned datasets in a PostgreSQL database for efficient querying and reproducibility.
* **Pipeline Class**: The `Pipeline` class (imported from `Pipeline.pipeline`) encapsulates all data retrieval, cleaning, and storage logic.
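
As a rough illustration of the AIS Processor steps listed above, the following sketch removes duplicate position reports, keeps only cargo and tanker ship types, assigns each report to its nearest port by great-circle (haversine) distance, and then counts distinct vessels per day. It is not the project's actual implementation: the column names (`mmsi`, `timestamp`, `lat`, `lon`, `vessel_type`), the `clean_ais` and `haversine_km` helpers, the approximate port coordinates, and the choice to treat AIS ship-type codes 70-89 as "commercial" are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km. Inputs are in degrees and may be arrays."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def clean_ais(ais, ports):
    """Hypothetical cleaning step: dedupe, filter, and tag the nearest port."""
    # Drop repeated reports of the same vessel at the same timestamp.
    ais = ais.drop_duplicates(subset=["mmsi", "timestamp"])
    # Assumption: "commercial" means cargo (70-79) or tanker (80-89) type codes.
    ais = ais[ais["vessel_type"].between(70, 89)].copy()
    # Distance matrix of shape (n_reports, n_ports) via broadcasting.
    dist = haversine_km(
        ais["lat"].to_numpy()[:, None], ais["lon"].to_numpy()[:, None],
        ports["lat"].to_numpy()[None, :], ports["lon"].to_numpy()[None, :],
    )
    ais["nearest_port"] = ports["name"].to_numpy()[dist.argmin(axis=1)]
    return ais.reset_index(drop=True)


# Illustrative data only; port coordinates are approximate.
ports = pd.DataFrame({
    "name": ["Los Angeles", "New York"],
    "lat": [33.74, 40.67],
    "lon": [-118.26, -74.04],
})
ais = pd.DataFrame({
    "mmsi": [111, 111, 222, 333],
    "timestamp": pd.to_datetime([
        "2024-01-01 00:00", "2024-01-01 00:00",
        "2024-01-01 06:00", "2024-01-01 09:00",
    ]),
    "lat": [33.70, 33.70, 40.60, 40.50],
    "lon": [-118.30, -118.30, -74.00, -73.90],
    "vessel_type": [70, 70, 80, 37],
})

cleaned = clean_ais(ais, ports)
# Daily vessel count: number of distinct vessels (MMSI) seen per calendar day.
daily_counts = cleaned.groupby(cleaned["timestamp"].dt.date)["mmsi"].nunique()
print(cleaned[["mmsi", "nearest_port"]])
print(daily_counts)
```

A brute-force distance matrix like this is adequate for a handful of ports; with a long port list, a spatial index would likely scale better.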

---

## 2. Installation and Setup {#installation-and-setup}

In [None]:
# Optional: create and activate a virtual environment in a shell
# before starting Jupyter:
# python -m venv .venv && source .venv/bin/activate

# Install dependencies into the running kernel's environment
%pip install -r requirements.txt

# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import altair as alt
from Pipeline.pipeline import Pipeline