# Basic Decline Curve Analysis

This notebook demonstrates the fundamentals of decline curve analysis using Arps models.

## What You'll Learn
- Load and visualize production data
- Fit Arps decline models (exponential, harmonic, hyperbolic)
- Generate production forecasts
- Evaluate model performance
- Create professional visualizations

In [7]:
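%pip install pandas matplotlib
# pydca is imported from the parent directory (see sys.path.insert below).
# The PyPI package of the same name pins scipy==1.3.1 and failed to build
# in this environment, so it is not installed here.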

# Import required libraries
import sys
sys.path.insert(0, '..')
import logging
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pydca import dca
from pydca.logging_config import configure_logging, get_logger

# Configure logging
configure_logging(level=logging.INFO)
logger = get_logger(__name__)

# Set plotting style
plt.style.use('seaborn-v0_8-darkgrid')
%matplotlib inline


## 1. Load Production Data

We'll start by loading sample production data from a single well.

In [None]:
# Load sample well data
df = pd.read_csv('data/sample_well_data.csv')
df['date'] = pd.to_datetime(df['date'])
df = df.set_index('date')

# Log a summary of the data, then display the first few rows
logger.info("Production Data Summary")
logger.info(f"Period: {df.index[0].strftime('%Y-%m')} to {df.index[-1].strftime('%Y-%m')}")
logger.info(f"Data points: {len(df)}")
logger.info(f"Peak production: {df['oil_bbl'].max():.0f} bbl/month")
logger.info(f"Current production: {df['oil_bbl'].iloc[-1]:.0f} bbl/month")
logger.info(f"Cumulative production: {df['oil_bbl'].sum():.0f} bbl")

df.head()
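
If the sample file is not at hand, a stand-in with the same two columns (`date`, `oil_bbl`) can be generated. The sketch below is an assumption about the file layout based only on the columns used above; the helper names and the decline parameters are hypothetical and not part of pydca.

```python
import csv
import io
from datetime import date

def make_sample_rows(n_months=36, qi=1200.0, di=0.06, b=0.5, start=date(2021, 1, 1)):
    """Return (ISO date, monthly oil volume) rows that follow a hyperbolic decline."""
    rows = []
    for t in range(n_months):
        # Step the calendar forward one month at a time
        year = start.year + (start.month - 1 + t) // 12
        month = (start.month - 1 + t) % 12 + 1
        rate = qi / (1.0 + b * di * t) ** (1.0 / b)
        rows.append((date(year, month, 1).isoformat(), round(rate, 1)))
    return rows

def rows_to_csv(rows):
    """Serialize the rows as CSV text with a `date,oil_bbl` header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "oil_bbl"])
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = rows_to_csv(make_sample_rows())
print("\n".join(csv_text.splitlines()[:3]))
```

Writing `csv_text` to a path of your choice and passing that path to `pd.read_csv` should give a DataFrame that the rest of the notebook can use in place of the real well data.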

## 2. Visualize Historical Production

In [None]:
# Plot production history
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(df.index, df['oil_bbl'], 'o-', linewidth=2, markersize=4, label='Historical Production')
ax.set_xlabel('Date', fontsize=12)
ax.set_ylabel('Oil Production (bbl/month)', fontsize=12)
ax.set_title('Well Production History', fontsize=14, fontweight='bold')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## 3. Fit Arps Decline Models

We'll compare three types of Arps decline models:
- **Exponential**: Constant decline rate (b=0)
- **Harmonic**: Decline rate falls in proportion to the production rate (b=1)
- **Hyperbolic**: Decline rate falls over time at a pace set by the exponent b (0<b<1)
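
The three cases share one rate equation, q(t) = qi / (1 + b·Di·t)^(1/b), which reduces to qi·exp(-Di·t) as b approaches 0. A minimal sketch of that equation follows; the function name `arps_rate` and the example parameters are illustrative assumptions, not pydca's API.

```python
import math

def arps_rate(t, qi, di, b):
    """Arps production rate at time t.

    qi: initial rate, di: initial nominal decline per unit time, b: Arps exponent.
    """
    if b == 0:
        # Exponential decline: the limit of the general form as b -> 0
        return qi * math.exp(-di * t)
    # Hyperbolic form; b = 1 gives the harmonic case
    return qi / (1.0 + b * di * t) ** (1.0 / b)

# Compare the three model types after 12 months for the same qi and di
for label, b in [("exponential", 0.0), ("hyperbolic", 0.5), ("harmonic", 1.0)]:
    print(f"{label:12s} b={b}: {arps_rate(12, 1000.0, 0.1, b):.1f}")
```

For the same initial rate and initial decline, a larger b gives a slower decline, so the exponential forecast is the most conservative of the three.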

In [None]:
# Create series for forecasting
series = df['oil_bbl']

# Forecast with different Arps models
models = {
    'Exponential': 'exponential',
    'Harmonic': 'harmonic',
    'Hyperbolic': 'hyperbolic'
}

forecasts = {}
metrics = {}

for name, kind in models.items():
    # Generate 12-month forecast
    forecast = dca.forecast(series, model='arps', kind=kind, horizon=12)
    forecasts[name] = forecast
    
    # Evaluate on historical data
    metric = dca.evaluate(series, forecast)
    metrics[name] = metric
    
    logger.info(f"{name} Model:")
    logger.info(f"  RMSE: {metric['rmse']:.2f} bbl/month")
    logger.info(f"  MAE: {metric['mae']:.2f} bbl/month")
    logger.info(f"  SMAPE: {metric['smape']:.2f}%")
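
For reference, the three metrics can be computed by hand as sketched below. These are the common textbook definitions; how `dca.evaluate` defines them internally (SMAPE in particular has several variants) is not confirmed here, so treat the functions and the sample numbers as illustrative assumptions.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def smape(actual, predicted):
    """Symmetric MAPE in percent; pairs where both values are zero contribute 0."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        if denom > 0:
            total += 2.0 * abs(a - p) / denom
    return 100.0 * total / len(actual)

# Made-up monthly volumes, only to exercise the functions
actual = [1000.0, 900.0, 820.0]
fitted = [990.0, 910.0, 800.0]
print(f"RMSE={rmse(actual, fitted):.2f}  MAE={mae(actual, fitted):.2f}  SMAPE={smape(actual, fitted):.2f}%")
```

RMSE penalizes large misses more heavily than MAE, while SMAPE is scale-free, which makes it easier to compare across wells with different production levels.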

## 4. Compare Model Forecasts

In [None]:
# Plot all forecasts together
fig, ax = plt.subplots(figsize=(14, 7))

# Plot historical data
ax.plot(series.index, series.values, 'o-', 
        color='blue', linewidth=2, markersize=5, label='Historical', zorder=5)

# Plot forecasts
colors = ['red', 'green', 'orange']
for (name, forecast), color in zip(forecasts.items(), colors):
    # Extract forecast portion only
    forecast_part = forecast.iloc[len(series):]
    ax.plot(forecast_part.index, forecast_part.values, '--', 
            color=color, linewidth=2, label=f'{name} Forecast', alpha=0.8)

# Add vertical line at forecast start
ax.axvline(x=series.index[-1], color='gray', linestyle=':', linewidth=2, label='Forecast Start')

ax.set_xlabel('Date', fontsize=12)
ax.set_ylabel('Oil Production (bbl/month)', fontsize=12)
ax.set_title('Arps Model Comparison', fontsize=14, fontweight='bold')
ax.legend(loc='best')
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## 5. Model Performance Comparison

In [None]:
# Create metrics comparison DataFrame
metrics_df = pd.DataFrame(metrics).T
# Log the table on its own lines so the columns stay aligned
logger.info("Model Performance Metrics:\n%s", metrics_df.round(2))

# Find best model
best_model = metrics_df['rmse'].idxmin()
logger.info(f"✓ Best performing model: {best_model} (lowest RMSE)")
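
Ranking models by error on the same months they were fit to tends to favor the most flexible model. One way to cross-check the ranking is to hold out the last few months, fit on the rest, and score the held-out period. The sketch below does this for the exponential case only, using a log-linear least-squares fit; it is a stand-in technique, not pydca's fitting routine, and the function names and synthetic history are assumptions.

```python
import math

def fit_exponential(rates):
    """Least-squares fit of ln(q) = ln(qi) - di * t for t = 0, 1, 2, ...

    Requires strictly positive rates. Returns (qi, di).
    """
    n = len(rates)
    ts = range(n)
    ys = [math.log(q) for q in rates]
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    qi = math.exp(y_mean - slope * t_mean)
    return qi, -slope

def holdout_rmse(rates, n_holdout):
    """Fit on all but the last n_holdout points and return RMSE on the held-out points."""
    train, test = rates[:-n_holdout], rates[-n_holdout:]
    qi, di = fit_exponential(train)
    start = len(train)
    preds = [qi * math.exp(-di * (start + k)) for k in range(n_holdout)]
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(preds, test)) / n_holdout)

# Synthetic, noise-free history with a 5% nominal monthly decline
history = [1000.0 * math.exp(-0.05 * t) for t in range(24)]
qi_fit, di_fit = fit_exponential(history)
print(f"qi={qi_fit:.1f}, di={di_fit:.4f}, holdout RMSE={holdout_rmse(history, 6):.4f}")
```

With real data, the same split could be applied to `series` (for example `series.iloc[:-6]` for fitting) before calling `dca.forecast`, assuming the forecast index lines up with the held-out months.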

In [None]:
# Visualize metrics comparison
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

for idx, metric in enumerate(['rmse', 'mae', 'smape']):
    metrics_df[metric].plot(kind='bar', ax=axes[idx], color=['red', 'green', 'orange'])
    axes[idx].set_title(f'{metric.upper()} Comparison', fontsize=12, fontweight='bold')
    axes[idx].set_ylabel(metric.upper())
    axes[idx].set_xlabel('Model')
    axes[idx].grid(True, alpha=0.3, axis='y')
    axes[idx].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

## 6. Generate Final Forecast with Best Model

In [None]:
# Use best model for final forecast
best_kind = models[best_model]
final_forecast = dca.forecast(series, model='arps', kind=best_kind, horizon=24)

logger.info(f"Final Forecast using {best_model} Model:")
logger.info("Forecast for next 6 months:")
forecast_only = final_forecast.iloc[len(series):len(series)+6]
for date, value in forecast_only.items():
    logger.info(f"  {date.strftime('%Y-%m')}: {value:.0f} bbl/month")

# Calculate the expected percentage drop in rate over the next 6 months
current_rate = series.iloc[-1]
forecast_6m = forecast_only.iloc[5]
decline_pct = ((current_rate - forecast_6m) / current_rate) * 100
logger.info(f"Expected 6-month decline: {decline_pct:.1f}%")
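
The 6-month percentage drop can be restated as the decline rates more commonly quoted in reports. The sketch below converts it to a nominal monthly decline and an effective annual decline, assuming the rate falls by a constant percentage over the window (exponential behavior); a harmonic or hyperbolic fit will not follow this exactly, so the result is an approximation. The function name and the 19% input are illustrative.

```python
import math

def exponential_equivalents(decline_fraction, months):
    """Convert a fractional rate drop over `months` into exponential-decline equivalents.

    Returns (nominal monthly decline, effective annual decline), both as fractions.
    """
    if not 0.0 <= decline_fraction < 1.0:
        raise ValueError("decline_fraction must be in [0, 1)")
    remaining = 1.0 - decline_fraction
    nominal_monthly = -math.log(remaining) / months
    effective_annual = 1.0 - remaining ** (12.0 / months)
    return nominal_monthly, effective_annual

# Example: a 19% drop over 6 months
nominal, annual = exponential_equivalents(0.19, 6)
print(f"Nominal monthly decline: {nominal:.4f}")
print(f"Effective annual decline: {annual:.1%}")
```

In the notebook, `decline_pct / 100` would be the `decline_fraction` argument.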

## 7. Professional Visualization

In [None]:
# Create publication-ready plot
dca.plot(series, final_forecast, 
         title=f'Production Forecast - {best_model} Arps Model',
         filename='production_forecast.png')

logger.info("Plot saved as 'production_forecast.png'")

## Summary

In this notebook, we:
1. Loaded and visualized production data
2. Fitted three types of Arps decline models
3. Generated and compared forecasts
4. Evaluated model performance
5. Selected the best model based on metrics
6. Created professional visualizations

## Next Steps

- **Notebook 02**: Learn economic evaluation and reserves estimation
- **Notebook 03**: Analyze multiple wells simultaneously
- **Notebook 04**: Explore advanced ML forecasting models