# Historical Market Data API

This notebook creates a Flask API to fetch historical market data from Yahoo Finance.

## Features:
- Handles JSON POST requests
- Fetches data from Yahoo Finance
- Returns comprehensive market data including prices, volume, dividends, and splits
- Supports multiple interval granularities (`1d`, `1wk`, `1mo`, `5d`)

## To Run:
1. Install required packages: `pip install flask yfinance pandas`
2. Run all cells in this notebook
3. The Flask API will start on http://localhost:5000

In [1]:
# Install required packages
!pip install flask yfinance pandas requests



In [2]:
import yfinance as yf
import pandas as pd
from flask import Flask, request, jsonify
from datetime import datetime
import json
import threading
import time

In [3]:
# Initialize Flask app
app = Flask(__name__)

def get_company_symbol(company_name):
    """
    Try to get the stock symbol for a company name.
    This is a simplified approach - in production, you'd use a proper symbol lookup service.
    """
    # Common company name to symbol mappings
    symbol_map = {
        'apple': 'AAPL',
        'microsoft': 'MSFT',
        'google': 'GOOGL',
        'alphabet': 'GOOGL',
        'amazon': 'AMZN',
        'tesla': 'TSLA',
        'meta': 'META',
        'facebook': 'META',
        'netflix': 'NFLX',
        'nvidia': 'NVDA',
        'intel': 'INTC',
        'ibm': 'IBM',
        'oracle': 'ORCL',
        'salesforce': 'CRM',
        'adobe': 'ADBE'
    }

    # Check if it's already a symbol (uppercase, short)
    if company_name.isupper() and len(company_name) <= 5:
        return company_name

    # Look up in our mapping
    return symbol_map.get(company_name.lower(), company_name.upper())

def format_market_data(ticker_data, symbol, company_name, start_date, end_date, interval):
    """
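    Fetch historical data for `symbol` from Yahoo Finance and format it
    into the response structure.

    `ticker_data` is unused (callers pass None). `start_date` and `end_date`
    are 'YYYY-MM-DD' strings. Returns a dict holding either the formatted
    data or a single "error" key.

    Note: `adjusted_close_price` is read from the same `Close` column as
    `close_price`, so the two fields always carry the same value.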
    """
    try:
        # Get company info
        ticker = yf.Ticker(symbol)
        info = ticker.info

        # Get historical data
        hist = ticker.history(start=start_date, end=end_date, interval=interval, actions=True)

        if hist.empty:
            return {"error": f"No data found for {symbol} in the specified date range"}

        # Format the data
        formatted_data = []

        for date, row in hist.iterrows():
            data_point = {
                "symbol": symbol,
                "company_name": company_name,
                "company_short_name": info.get('shortName', symbol),
                "date": date.strftime('%Y-%m-%dT%H:%M:%S'),
                "open_price": round(float(row['Open']), 2) if pd.notna(row['Open']) else None,
                "high_price": round(float(row['High']), 2) if pd.notna(row['High']) else None,
                "low_price": round(float(row['Low']), 2) if pd.notna(row['Low']) else None,
                "close_price": round(float(row['Close']), 2) if pd.notna(row['Close']) else None,
                "adjusted_close_price": round(float(row['Close']), 2) if pd.notna(row['Close']) else None,
                "volume": int(row['Volume']) if pd.notna(row['Volume']) else 0,
                "dividends": round(float(row['Dividends']), 4) if pd.notna(row['Dividends']) and row['Dividends'] > 0 else None,
                "stock_splits": float(row['Stock Splits']) if pd.notna(row['Stock Splits']) and row['Stock Splits'] > 0 else None
            }
            formatted_data.append(data_point)

        return {
            "status": "success",
            "symbol": symbol,
            "company_name": company_name,
            "company_short_name": info.get('shortName', symbol),
            "start_date": start_date,
            "end_date": end_date,
            "interval": interval,
            "data_points": len(formatted_data),
            "data": formatted_data
        }

    except Exception as e:
        return {"error": f"Error fetching data: {str(e)}"}

In [4]:
@app.route('/historical-data', methods=['POST'])
def get_historical_data():
    """
    API endpoint to get historical market data

    Expected JSON payload:
    {
        "company_name": "Apple",
        "start_date": "01/01/2023",
        "end_date": "31/12/2023",
        "interval": "1d"  // Optional: 1d, 1wk, 1mo, 5d
    }
    """
    try:
        # Get JSON data from request
        data = request.get_json()

        if not data:
            return jsonify({"error": "No JSON data provided"}), 400

        # Extract required fields
        company_name = data.get('company_name')
        start_date_str = data.get('start_date')
        end_date_str = data.get('end_date')
        interval = data.get('interval', '1d')  # Default to daily

        # Validate required fields
        if not all([company_name, start_date_str, end_date_str]):
            return jsonify({
                "error": "Missing required fields: company_name, start_date, end_date"
            }), 400

        # Parse dates (expecting DD/MM/YYYY format)
        try:
            start_date = datetime.strptime(start_date_str, '%d/%m/%Y').strftime('%Y-%m-%d')
            end_date = datetime.strptime(end_date_str, '%d/%m/%Y').strftime('%Y-%m-%d')
        except ValueError:
            return jsonify({
                "error": "Invalid date format. Please use DD/MM/YYYY format"
            }), 400

        # Validate interval
        valid_intervals = ['1d', '1wk', '1mo', '5d']
        if interval not in valid_intervals:
            return jsonify({
                "error": f"Invalid interval. Must be one of: {valid_intervals}"
            }), 400

        # Get company symbol
        symbol = get_company_symbol(company_name)

        # Fetch and format market data
        result = format_market_data(None, symbol, company_name, start_date, end_date, interval)

        if "error" in result:
            return jsonify(result), 404

        return jsonify(result), 200

    except Exception as e:
        return jsonify({"error": f"Internal server error: {str(e)}"}), 500

@app.route('/health', methods=['GET'])
def health_check():
    """Health check endpoint"""
    return jsonify({"status": "healthy", "message": "Historical Market Data API is running"}), 200

@app.route('/', methods=['GET'])
def home():
    """Home endpoint with API documentation"""
    return jsonify({
        "message": "Historical Market Data API",
        "endpoints": {
            "POST /historical-data": "Get historical market data",
            "GET /health": "Health check",
            "GET /": "This documentation"
        },
        "example_request": {
            "company_name": "Apple",
            "start_date": "01/01/2023",
            "end_date": "31/01/2023",
            "interval": "1d"
        }
    }), 200

In [5]:
# Interactive user input function
def get_user_input_and_fetch_data():
    """
    Interactive function to get user input and fetch market data
    """
    print("=== Historical Market Data Fetcher ===")
    print()

    # Get company name
    company_name = input("Enter the company name: ").strip()

    if not company_name:
        print("Error: Company name cannot be empty")
        return

    print("\nDate range format: DD/MM/YYYY")
    print("Enter the Date Range:")

    # Get date range
    start_date_str = input("Start Date: ").strip()
    end_date_str = input("End Date: ").strip()

    if not start_date_str or not end_date_str:
        print("Error: Both start and end dates are required")
        return

    # Get interval
    print("\nAvailable intervals:")
    print("1. 1d (Daily)")
    print("2. 1wk (Weekly)")
    print("3. 1mo (Monthly)")
    print("4. 5d (5 Days)")

    interval_choice = input("\nSelect interval (1-4, default is 1 for daily): ").strip()

    interval_map = {
        '1': '1d',
        '2': '1wk',
        '3': '1mo',
        '4': '5d'
    }

    interval = interval_map.get(interval_choice, '1d')

    try:
        # Parse dates
        start_date = datetime.strptime(start_date_str, '%d/%m/%Y').strftime('%Y-%m-%d')
        end_date = datetime.strptime(end_date_str, '%d/%m/%Y').strftime('%Y-%m-%d')

        print(f"\nFetching data for {company_name} from {start_date} to {end_date} with {interval} interval...")

        # Get symbol and fetch data
        symbol = get_company_symbol(company_name)
        result = format_market_data(None, symbol, company_name, start_date, end_date, interval)

        if "error" in result:
            print(f"Error: {result['error']}")
            return

        # Display results
        print(f"\n=== Results ===")
        print(f"Symbol: {result['symbol']}")
        print(f"Company: {result['company_short_name']}")
        print(f"Data Points: {result['data_points']}")
        print(f"Interval: {result['interval']}")

        # Show first few data points
        print("\nFirst 5 data points:")
        for i, data_point in enumerate(result['data'][:5]):
            print(f"\nDate: {data_point['date']}")
            print(f"  Open: ${data_point['open_price']}")
            print(f"  High: ${data_point['high_price']}")
            print(f"  Low: ${data_point['low_price']}")
            print(f"  Close: ${data_point['close_price']}")
            print(f"  Volume: {data_point['volume']:,}")
            if data_point['dividends']:
                print(f"  Dividend: ${data_point['dividends']}")
            if data_point['stock_splits']:
                print(f"  Stock Split: {data_point['stock_splits']}")

        if len(result['data']) > 5:
            print(f"\n... and {len(result['data']) - 5} more data points")

        # Save to JSON file
        filename = f"{symbol}_{start_date}_{end_date}_{interval}.json"
        with open(filename, 'w') as f:
            json.dump(result, f, indent=2)
        print(f"\nData saved to: {filename}")

    except ValueError:
        print("Error: Invalid date format. Please use DD/MM/YYYY format")
    except Exception as e:
        print(f"Error: {str(e)}")

In [6]:
# Function to run Flask app in a separate thread
def run_flask_app():
    """Run Flask app in a separate thread"""
    app.run(host='0.0.0.0', port=5000, debug=False, use_reloader=False)

# Start Flask API server
print("Starting Flask API server...")
flask_thread = threading.Thread(target=run_flask_app, daemon=True)
flask_thread.start()

# Wait a moment for server to start
time.sleep(2)
print("Flask API server started on http://localhost:5000")
print("\nAPI Endpoints:")
print("- GET  /          : API documentation")
print("- GET  /health    : Health check")
print("- POST /historical-data : Get historical market data")
print("\nExample POST request to /historical-data:")
print(json.dumps({
    "company_name": "Apple",
    "start_date": "01/01/2023",
    "end_date": "31/01/2023",
    "interval": "1d"
}, indent=2))

Starting Flask API server...
 * Serving Flask app '__main__'
 * Debug mode: off


 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:5000
 * Running on http://172.28.0.12:5000
INFO:werkzeug:Press CTRL+C to quit


Flask API server started on http://localhost:5000

API Endpoints:
- GET  /          : API documentation
- GET  /health    : Health check
- POST /historical-data : Get historical market data

Example POST request to /historical-data:
{
  "company_name": "Apple",
  "start_date": "01/01/2023",
  "end_date": "31/01/2023",
  "interval": "1d"
}


In [7]:
# Interactive mode - Run this cell to use the interactive input
print("=== Interactive Mode ===")
print("Run the cell below to start interactive data fetching")
print("Or use the Flask API endpoints above for programmatic access")

=== Interactive Mode ===
Run the cell below to start interactive data fetching
Or use the Flask API endpoints above for programmatic access


In [8]:
# Run this cell for interactive user input
get_user_input_and_fetch_data()

=== Historical Market Data Fetcher ===

Enter the company name: google

Date range format: DD/MM/YYYY
Enter the Date Range:
Start Date: 14/01/2023
End Date: 14/01/2024

Available intervals:
1. 1d (Daily)
2. 1wk (Weekly)
3. 1mo (Monthly)
4. 5d (5 Days)

Select interval (1-4, default is 1 for daily): 4

Fetching data for google from 2023-01-14 to 2024-01-14 with 5d interval...

=== Results ===
Symbol: GOOGL
Company: Alphabet Inc.
Data Points: 50
Interval: 5d

First 5 data points:

Date: 2023-01-19T00:00:00
  Open: $90.1
  High: $92.97
  Low: $90.01
  Close: $92.41
  Volume: 37,000,400

Date: 2023-01-24T00:00:00
  Open: $97.43
  High: $98.93
  Low: $96.53
  Close: $97.03
  Volume: 33,078,500

Date: 2023-02-03T00:00:00
  Open: $102.22
  High: $107.07
  Low: $101.88
  Close: $104.06
  Volume: 65,309,300

Date: 2023-02-08T00:00:00
  Open: $101.35
  High: $102.43
  Low: $97.37
  Close: $98.69
  Volume: 94,743,500

Date: 2023-02-13T00:00:00
  Open: $94.09
  High: $94.55
  Low: $93.2
  Close: $

## Supported Companies

The built-in `symbol_map` covers these company names (matched case-insensitively):
- Apple, Microsoft, Google/Alphabet, Amazon, Tesla
- Meta/Facebook, Netflix, NVIDIA, Intel, IBM
- Oracle, Salesforce, Adobe

You can also pass a stock symbol directly (e.g., "AAPL" or "MSFT"). Any name that is not in the mapping is uppercased and sent to Yahoo Finance as a ticker symbol.

## Data Fields Returned

- Symbol / Company Name
- Company Short Name
- Date / Timestamp (ISO 8601 format)
- Open Price, High Price, Low Price, Close Price
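- Adjusted Close Price (read from the same `Close` column as Close Price, so the two values are identical)
- Volume (number of shares traded; `0` when missing)
- Dividends (per share, on ex-dividend dates; `null` otherwise)
- Stock Splits (split ratio on split dates; `null` otherwise)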
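
As an illustration of how a client might consume these fields, the sketch below parses the `date` string of each record and sorts the records chronologically. The helper name `parse_records` is hypothetical and not part of the notebook. The two sample records are trimmed to a few fields, with values copied from the interactive run shown in this notebook.

```python
from datetime import datetime

def parse_records(records):
    """Parse each record's `date` string and return the records sorted by date.

    Hypothetical helper: assumes dates use the 'YYYY-MM-DDTHH:MM:SS' layout
    that the API emits.
    """
    parsed = []
    for record in records:
        item = dict(record)  # copy so the caller's data is left untouched
        item["date"] = datetime.strptime(record["date"], "%Y-%m-%dT%H:%M:%S")
        parsed.append(item)
    return sorted(parsed, key=lambda item: item["date"])

# Deliberately out of order to show the sort.
sample = [
    {"date": "2023-01-24T00:00:00", "open_price": 97.43, "close_price": 97.03, "volume": 33078500},
    {"date": "2023-01-19T00:00:00", "open_price": 90.1, "close_price": 92.41, "volume": 37000400},
]

for item in parse_records(sample):
    change = item["close_price"] - item["open_price"]
    print(f"{item['date']:%Y-%m-%d}  open={item['open_price']}  "
          f"close={item['close_price']}  change={change:+.2f}")
```

Price fields can be `null` in a real response, so production code would need to guard the arithmetic against `None`.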

## Interval Options

- 1d: Daily data
- 1wk: Weekly data
- 1mo: Monthly data
- 5d: 5-day intervals
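
To tie the interval options to the request format, here is a minimal client-side sketch. It builds the JSON payload that `/historical-data` expects (dates as DD/MM/YYYY) and, in `post_historical_data`, sends it using the standard library's `urllib`. The names `build_payload` and `post_historical_data` are hypothetical. The base URL assumes the server from this notebook is running locally on port 5000. The start/end ordering check is an extra client-side assumption that the server itself does not enforce.

```python
import json
from datetime import date
from urllib import request

VALID_INTERVALS = ("1d", "1wk", "1mo", "5d")  # mirrors the server-side list

def build_payload(company_name, start, end, interval="1d"):
    """Build the request body; `start` and `end` are datetime.date objects."""
    if interval not in VALID_INTERVALS:
        raise ValueError(f"interval must be one of {VALID_INTERVALS}")
    if start > end:
        # Extra client-side check (assumption); the server does not do this.
        raise ValueError("start must not be after end")
    return {
        "company_name": company_name,
        "start_date": start.strftime("%d/%m/%Y"),
        "end_date": end.strftime("%d/%m/%Y"),
        "interval": interval,
    }

def post_historical_data(payload, base_url="http://localhost:5000"):
    """POST the payload and return the decoded JSON (needs the server running)."""
    req = request.Request(
        f"{base_url}/historical-data",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

payload = build_payload("Apple", date(2023, 1, 1), date(2023, 1, 31), "1wk")
print(json.dumps(payload, indent=2))
# With the server running: result = post_historical_data(payload)
```

Note that `urlopen` raises `urllib.error.HTTPError` for the API's 400, 404, and 500 responses, so a caller that wants the JSON error body would need to catch that exception and read it from there.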