# Gas Prices (Henry Hub)

**How gas prices affect energy stocks**

The effect depends on the type of company: producers sell gas, while refiners and utilities mostly buy it as an input.

#### If gas prices rise:
**Gas-focused producers (E&Ps) → 📈 Strong positive impact**
- Higher revenue, higher margins

**Integrated majors → 📈 Mild positive**
- Gas is only one part of a diversified portfolio

**Refiners → 😐 Neutral / mixed**
- Gas is more of an input cost than a revenue driver

**Utilities / power generators → 📉 Often negative**
- Higher fuel costs squeeze margins

#### If gas prices fall:
- Producers → 📉 Revenue declines
- Utilities → 📈 Fuel costs fall

#### Why the impact can be muted
- Many firms hedge their price exposure
- Long-term contracts smooth out volatility
- Stock prices reflect expected future gas prices, not the spot price alone
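
The hedging point can be made concrete with a little arithmetic. The sketch below is illustrative only: `realized_price` is a hypothetical helper, and the spot prices, hedge price, and hedged share are assumed numbers, not data from this notebook. It blends a fixed hedge price with the spot price to show how a large spot move shrinks once part of the volume is hedged.

```python
def realized_price(spot, hedge_price, hedged_fraction):
    """Average price received per MMBtu when part of the volume is hedged.

    Hypothetical helper: the hedged share is sold at a fixed price,
    and the remainder is sold at spot.
    """
    if not 0.0 <= hedged_fraction <= 1.0:
        raise ValueError("hedged_fraction must be between 0 and 1")
    return hedged_fraction * hedge_price + (1.0 - hedged_fraction) * spot


# Assumed inputs, for illustration only
spot_before, spot_after = 3.00, 4.50  # spot rises by 50%
hedge_price = 3.20                    # assumed fixed hedge price
hedged_fraction = 0.60                # assumed share of volumes hedged

before = realized_price(spot_before, hedge_price, hedged_fraction)
after = realized_price(spot_after, hedge_price, hedged_fraction)

spot_change = spot_after / spot_before - 1
realized_change = after / before - 1

print(f"Spot change:     {spot_change:.1%}")
print(f"Realized change: {realized_change:.1%}")
```

With these assumed inputs, a 50% rise in spot becomes a rise of roughly 19% in the realized price, which is one reason a producer's stock may react less than the spot move suggests.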

In [1]:
import pandas as pd
from pathlib import Path
from pandas_datareader import data as web

# Configuration
BASE_DIR = Path(r"D:\MS_Data_Science_Thesis\Data_Extraction")
OUTPUT_DIR = BASE_DIR / "Raw_Data_Folder"
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

# FRED series ID for Henry Hub Natural Gas Spot Price
SERIES_ID = "DHHNGSP"
START_DATE = "2010-01-01"

print(f"Data will be saved to: {OUTPUT_DIR}")

Data will be saved to: D:\MS_Data_Science_Thesis\Data_Extraction\Raw_Data_Folder


In [3]:
def fetch_and_clean_gas_prices(series_id, start_date):
    """
    Fetches FRED data, removes missing values, and standardizes columns in one go.
    """
    print(f"Fetching {series_id} from FRED...")
    
    try:
        # Fetch data
        df = web.DataReader(series_id, "fred", start=start_date)
        
        # Efficient cleaning: Drop NAs, reset index, and rename
        df = df.dropna().reset_index()
        df.columns = ["date", "gas_price_dollars_per_mmbtu"]
        
        # Ensure correct datatypes
        df["date"] = pd.to_datetime(df["date"])
        df["gas_price_dollars_per_mmbtu"] = pd.to_numeric(df["gas_price_dollars_per_mmbtu"])
        
        return df.sort_values("date")
    
    except Exception as e:
        print(f"Error fetching data: {e}")
        return pd.DataFrame()
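
Because the function returns an empty frame on failure, it may help to run a few sanity checks on the result before saving it. The sketch below is one possible approach, not part of the original pipeline: `validate_gas_prices` is a hypothetical helper, and it is exercised on a small hand-built frame so it runs without a network call.

```python
import pandas as pd


def validate_gas_prices(df):
    """Return a list of problems found; an empty list means the frame looks sane.

    Hypothetical helper that assumes the column names produced by
    fetch_and_clean_gas_prices.
    """
    expected = ["date", "gas_price_dollars_per_mmbtu"]
    if list(df.columns) != expected:
        return [f"unexpected columns: {list(df.columns)}"]
    if df.empty:
        return ["frame is empty"]

    problems = []
    if df["gas_price_dollars_per_mmbtu"].isna().any():
        problems.append("missing prices")
    if df["date"].duplicated().any():
        problems.append("duplicate dates")
    if not df["date"].is_monotonic_increasing:
        problems.append("dates not sorted ascending")
    return problems


# Small hand-built frame in the same shape as the cleaned output
sample = pd.DataFrame({
    "date": pd.to_datetime(["2010-01-04", "2010-01-05", "2010-01-06"]),
    "gas_price_dollars_per_mmbtu": [6.09, 6.19, 6.47],
})

print(validate_gas_prices(sample))
```

In the notebook, the same call could be applied to `gas_price_df` before writing the CSV.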

In [5]:
# 1. Run the pipeline
gas_price_df = fetch_and_clean_gas_prices(SERIES_ID, START_DATE)

# 2. Save if data was retrieved
if not gas_price_df.empty:
    file_path = OUTPUT_DIR / "gas_price_daily.csv"
    gas_price_df.to_csv(file_path, index=False)
    
    # 3. Quick QA
    print(f"✅ Successfully saved {len(gas_price_df)} rows.")
    print(f"📍 Location: {file_path}")
    display(gas_price_df.head())
else:
    print("❌ Process failed. Check your internet connection or FRED Series ID.")

Fetching DHHNGSP from FRED...
✅ Successfully saved 4060 rows.
📍 Location: D:\MS_Data_Science_Thesis\Data_Extraction\Raw_Data_Folder\gas_price_daily.csv


|   | date | gas_price_dollars_per_mmbtu |
|---|------|-----------------------------|
| 0 | 2010-01-04 | 6.09 |
| 1 | 2010-01-05 | 6.19 |
| 2 | 2010-01-06 | 6.47 |
| 3 | 2010-01-07 | 7.51 |
| 4 | 2010-01-08 | 6.56 |
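
To compare the gas series with stock returns, price levels are commonly converted to returns or aggregated to a lower frequency first. The following is a minimal sketch of that step, not something the notebook itself does. It uses the five rows shown above as a stand-in for the full file, and the names `log_returns` and `monthly_mean` are illustrative.

```python
import numpy as np
import pandas as pd

# The five rows from the preview above, used as a stand-in for the full CSV
sample = pd.DataFrame({
    "date": pd.to_datetime([
        "2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07", "2010-01-08",
    ]),
    "gas_price_dollars_per_mmbtu": [6.09, 6.19, 6.47, 7.51, 6.56],
})

prices = sample.set_index("date")["gas_price_dollars_per_mmbtu"]

# Daily log returns: the first row has no previous price, so it is dropped
log_returns = np.log(prices).diff().dropna().rename("gas_log_return")

# Monthly average price, labelled by the first day of each month
monthly_mean = prices.resample("MS").mean()

print(log_returns)
print(monthly_mean)
```

On the real data, `sample` would be replaced by the frame read back from `gas_price_daily.csv`.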
