# Natural Gas Price Analysis & Forecasting
### JPMorgan Chase – Quantitative Research Style Project

**Objective:** Analyze historical monthly natural gas prices, identify trends and seasonality, and build a model to estimate prices for any past date and extrapolate one year into the future.

**Data Range:** 31 Oct 2020 – 30 Sep 2024

## 1. Import Required Libraries
We import the standard Python stack for quantitative data analysis: pandas and NumPy for data handling, Matplotlib for plotting, and scikit-learn for regression.

In [None]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime
from sklearn.linear_model import LinearRegression


## 2. Load and Inspect the Data
The dataset contains monthly natural gas prices at month-end. We load the CSV and convert the date column to datetime format.

In [None]:

df = pd.read_csv('/mnt/data/Nat_Gas.csv')
df['Dates'] = pd.to_datetime(df['Dates'])
df = df.sort_values('Dates')
df.head()
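Before modeling, it can help to confirm that the loaded data matches the description above (month-end dates, no gaps in the price column). The sketch below runs those checks on a tiny in-memory CSV; the three rows and the name `demo` are made up for illustration and are not taken from `Nat_Gas.csv`.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the real file: three rows, deliberately unsorted.
csv_text = "Dates,Prices\n2020-11-30,10.3\n2020-10-31,10.1\n2020-12-31,11.0\n"

demo = pd.read_csv(StringIO(csv_text))
demo["Dates"] = pd.to_datetime(demo["Dates"])
demo = demo.sort_values("Dates").reset_index(drop=True)

# Basic sanity checks on the assumptions stated in the text.
all_month_end = bool(demo["Dates"].dt.is_month_end.all())  # every date is a month-end
no_missing = bool(demo["Prices"].notna().all())            # no missing prices
no_duplicate_dates = bool(demo["Dates"].is_unique)         # one row per date

print(all_month_end, no_missing, no_duplicate_dates)
```

The same three checks can be run on `df` directly once the real file is loaded.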


## 3. Exploratory Data Analysis (EDA)
We visualize the historical price movement to understand volatility, trend direction, and possible seasonality.

In [None]:

plt.figure(figsize=(12,6))
plt.plot(df['Dates'], df['Prices'], marker='o')
plt.title('Monthly Natural Gas Prices')
plt.xlabel('Date')
plt.ylabel('Price')
plt.grid(True)
plt.show()


### Observations
- Prices fluctuate significantly over time
- Sharp increases can be linked to supply constraints or geopolitical events
- Seasonal demand (winter heating) may impact prices
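One way to test the seasonality observation is to remove a linear trend and average what remains by calendar month. The sketch below does this on synthetic prices with a built-in winter peak; the values and the names `demo` and `seasonal_profile` are illustrative assumptions, not the notebook's data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df: 48 month-ends with an upward trend and a winter peak.
dates = pd.date_range("2020-10-31", periods=48, freq=pd.offsets.MonthEnd())
months = dates.month.to_numpy()
t = np.arange(len(dates))
prices = 10.0 + 0.02 * t + 0.5 * np.cos(2 * np.pi * (months - 1) / 12)
demo = pd.DataFrame({"Dates": dates, "Prices": prices})

# Fit and subtract a straight line, then average the residual per calendar month.
slope, intercept = np.polyfit(t, demo["Prices"], 1)
demo["Detrended"] = demo["Prices"] - (slope * t + intercept)
seasonal_profile = demo.groupby(demo["Dates"].dt.month)["Detrended"].mean()

print(seasonal_profile.round(3))
```

If the real data shows a similar profile, with winter months consistently above the trend line, that supports treating seasonality as a separate model component.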

## 4. Feature Engineering
We convert each date to its ordinal (the number of days since 0001-01-01 in the proleptic Gregorian calendar) so the regression can treat time as a single numeric feature.

In [None]:

df['Date_Ordinal'] = df['Dates'].map(datetime.toordinal)
X = df[['Date_Ordinal']]
y = df['Prices']
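To make the ordinal encoding concrete, the sketch below converts three month-end dates and shows that consecutive ordinals differ by the number of days between them. The `days_since_start` column is an alternative encoding added here for illustration; it is not used elsewhere in the notebook.

```python
from datetime import datetime

import pandas as pd

dates = pd.Series(pd.to_datetime(["2020-10-31", "2020-11-30", "2020-12-31"]))

# Ordinal: day count from 0001-01-01, so differences are calendar-day gaps.
ordinals = dates.map(datetime.toordinal)
gaps = ordinals.diff().dropna()

# Alternative (assumption): days since the first observation. The slope is the
# same as with ordinals, but the intercept becomes the price at the start date.
days_since_start = (dates - dates.min()).dt.days

print(list(gaps), list(days_since_start))
```

Either encoding gives the same fitted line; only the interpretation of the intercept changes.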


## 5. Model Selection – Linear Regression
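We use linear regression on the date ordinal as a baseline model. It is simple and interpretable, and it extrapolates naturally by extending the fitted line. It captures only the long-run linear trend, not the seasonal pattern noted in the observations.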

In [None]:

model = LinearRegression()
model.fit(X, y)
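After fitting, the slope and the in-sample R² give a quick read on what the baseline has learned. The sketch below fits the same kind of model to synthetic prices that rise by exactly 0.001 per day; the data and the names `demo_model`, `X_demo`, and `y_demo` are illustrative assumptions.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in: 48 month-ends with a perfectly linear price path.
dates = pd.date_range("2020-10-31", periods=48, freq=pd.offsets.MonthEnd())
X_demo = pd.DataFrame({"Date_Ordinal": [d.toordinal() for d in dates]})
y_demo = 10.0 + 0.001 * (X_demo["Date_Ordinal"] - X_demo["Date_Ordinal"].iloc[0])

demo_model = LinearRegression().fit(X_demo, y_demo)

slope_per_day = demo_model.coef_[0]            # price change per calendar day
slope_per_year = slope_per_day * 365.25        # same trend expressed per year
r_squared = demo_model.score(X_demo, y_demo)   # in-sample R^2

print(slope_per_day, slope_per_year, r_squared)
```

On the real data, `model.coef_[0]` and `model.score(X, y)` provide the same diagnostics; an R² well below 1 would indicate how much price variation the straight line leaves unexplained.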


## 6. Price Estimation Function
This function accepts a date (as a string or datetime-like object) and returns the price on the fitted trend line for that date. It covers historical dates as well as dates up to one year beyond the end of the data.

In [None]:

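def estimate_gas_price(input_date):
    """Return the linear-trend price estimate for a date (string or datetime-like)."""
    ordinal_date = pd.to_datetime(input_date).toordinal()
    # Use a DataFrame with the training column name so scikit-learn
    # does not warn about missing feature names.
    features = pd.DataFrame({'Date_Ordinal': [ordinal_date]})
    return float(model.predict(features)[0])

# Example usage
estimate_gas_price('2025-06-30')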
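Because a linear model will return a number for any date, a caller may want an explicit guard on the supported range. The following is a minimal sketch, assuming the valid window runs from the first observation to one year after the last; the synthetic training data and the name `estimate_gas_price_checked` are hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the fitted model (linear prices, 0.001 per day).
dates = pd.date_range("2020-10-31", periods=48, freq=pd.offsets.MonthEnd())
train = pd.DataFrame({"Date_Ordinal": [d.toordinal() for d in dates]})
train["Prices"] = 10.0 + 0.001 * (train["Date_Ordinal"] - train["Date_Ordinal"].min())
demo_model = LinearRegression().fit(train[["Date_Ordinal"]], train["Prices"])

first_date = dates.min()
last_allowed = dates.max() + pd.DateOffset(years=1)  # one year past the data

def estimate_gas_price_checked(input_date):
    """Estimate a price, rejecting dates outside the supported window."""
    ts = pd.to_datetime(input_date)
    if not (first_date <= ts <= last_allowed):
        raise ValueError(
            f"{ts.date()} is outside {first_date.date()} to {last_allowed.date()}"
        )
    features = pd.DataFrame({"Date_Ordinal": [ts.toordinal()]})
    return float(demo_model.predict(features)[0])

print(estimate_gas_price_checked("2025-06-30"))
```

Raising an error instead of silently extrapolating keeps far-future requests from being mistaken for supported estimates.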


## 7. One-Year Extrapolation
We project the fitted trend over the 12 month-ends following the last observation and plot the projection alongside the historical prices.

In [None]:

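# 13 month-end dates: the last observed date plus the next 12 months.
# pd.offsets.MonthEnd() avoids the 'M' alias, which recent pandas versions deprecate.
future_dates = pd.date_range(df['Dates'].max(), periods=13, freq=pd.offsets.MonthEnd())
future_ordinals = future_dates.map(datetime.toordinal)
# Keep the training column name so predict() sees matching feature names.
future_prices = model.predict(pd.DataFrame({'Date_Ordinal': future_ordinals}))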

plt.figure(figsize=(12,6))
plt.plot(df['Dates'], df['Prices'], label='Historical')
plt.plot(future_dates, future_prices, '--', label='Forecast (1 Year)')
plt.legend()
plt.title('Natural Gas Price Forecast')
plt.xlabel('Date')
plt.ylabel('Price')
plt.grid(True)
plt.show()


## 8. Why This Approach Was Used
- **Monthly data** → Avoids noise from daily volatility
- **Regression model** → Transparent and easy to explain to stakeholders
- **Extrapolation** → Suitable for indicative long-term pricing

### Factors Affecting Gas Prices
- Seasonal demand (winter heating)
- Global supply and storage levels
- Geopolitical events
- Energy policy changes
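Since seasonal demand is listed as a price driver, one possible extension of the baseline is to add an annual sine/cosine pair to the date ordinal. This is a sketch of that idea on synthetic data, not the model used in this notebook; `make_features`, `Sin_Year`, and `Cos_Year` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def make_features(dates):
    """Build trend + annual-cycle features from a sequence of dates."""
    dates = pd.DatetimeIndex(dates)
    days = np.array([d.toordinal() for d in dates], dtype=float)
    angle = 2 * np.pi * dates.dayofyear.to_numpy() / 365.25
    return pd.DataFrame(
        {"Date_Ordinal": days, "Sin_Year": np.sin(angle), "Cos_Year": np.cos(angle)}
    )

# Synthetic prices: linear trend plus a winter-peaking annual cycle.
dates = pd.date_range("2020-10-31", periods=48, freq=pd.offsets.MonthEnd())
feats = make_features(dates)
prices = (
    10.0
    + 0.001 * (feats["Date_Ordinal"] - feats["Date_Ordinal"].iloc[0])
    + 0.5 * feats["Cos_Year"]
)

seasonal_model = LinearRegression().fit(feats, prices)
trend_only = LinearRegression().fit(feats[["Date_Ordinal"]], prices)

r2_seasonal = seasonal_model.score(feats, prices)
r2_trend = trend_only.score(feats[["Date_Ordinal"]], prices)
print(r2_trend, r2_seasonal)
```

The model stays linear in its parameters, so it remains as easy to explain as the baseline while letting winter and summer estimates differ.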

## 9. Conclusion
This notebook demonstrates a structured quantitative approach to commodity price analysis using Python. The linear-trend model returns a quick, indicative price estimate for any date within the data range or up to one year beyond it, which can serve as a first input to decisions on long-term storage contracts.