
# How to build an Automated Quant Database

Quants at firms like Jane Street, Man Group, and Goldman Sachs analyze market data, often from internal databases. Building your own stock price database lets you keep stock prices, economic data, and custom analytics in one place, which makes market research faster and easier to repeat.

Free market data is now widely available, so it is a good time to start storing and analyzing it. This issue shows you how to:

- Use SQLite to build a database
- Download stock data for free
- Store the data in a database
- Automate the entire process

Sources:
- https://www.sqlite.org/
- https://docs.python.org/3/library/sqlite3.html
- https://www.pyquantnews.com/the-pyquant-newsletter/how-to-build-an-automated-quant-database
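
Before downloading anything, it can help to see how little code SQLite needs. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table name `prices`, its columns, and the sample values are illustrative assumptions, not the schema used later in this issue.

```python
import sqlite3

# ":memory:" creates a throwaway database that lives only in RAM.
con = sqlite3.connect(":memory:")

# Hypothetical table, for illustration only.
con.execute(
    """
    CREATE TABLE prices (
        symbol TEXT NOT NULL,
        date   TEXT NOT NULL,
        close  REAL
    )
    """
)

# Made-up rows; "?" placeholders let sqlite3 bind the values.
rows = [
    ("AAA", "2024-01-02", 10.0),
    ("AAA", "2024-01-03", 10.5),
    ("BBB", "2024-01-02", 20.0),
]
con.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
con.commit()

aaa_rows = con.execute(
    "SELECT date, close FROM prices WHERE symbol = ? ORDER BY date", ("AAA",)
).fetchall()
print(aaa_rows)
con.close()
```

Swapping `":memory:"` for a file path gives a database that persists between runs, which is what the rest of this issue relies on.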

In [None]:
%pip install yfinance

In [2]:
import sqlite3

import pandas as pd
import yfinance as yf

In [3]:
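def get_stock_data(symbol, start, end):
    """Download daily prices for `symbol` and return a tidy DataFrame."""
    data = yf.download(symbol, start=start, end=end)
    # Some yfinance versions return two-level columns (field, ticker);
    # keep only the field level so the rename below applies.
    if isinstance(data.columns, pd.MultiIndex):
        data.columns = data.columns.get_level_values(0)
    data.reset_index(inplace=True)
    # Keys missing from the download (e.g. 'Adj Close') are ignored.
    data.rename(columns={
        'Date': 'date',
        'Open': 'open',
        'High': 'high',
        'Low': 'low',
        'Close': 'close',
        'Adj Close': 'adj_close',
        'Volume': 'volume'
    }, inplace=True)
    data['symbol'] = symbol
    return data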
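
The column clean-up can be checked without a network call. The sketch below applies the same idea (flatten, lower-case, tag with the symbol) to a hand-built DataFrame. The helper name `tidy_columns`, the ticker `AAA`, and all values are hypothetical, and the two-level column layout is an assumption about what some yfinance versions return.

```python
import pandas as pd


def tidy_columns(raw, symbol):
    """Flatten two-level columns, normalize names, and tag the symbol."""
    df = raw.copy()
    if isinstance(df.columns, pd.MultiIndex):
        # Keep the field level ("Close", "Volume", ...) and drop the ticker.
        df.columns = df.columns.get_level_values(0)
    df = df.reset_index()
    # "Adj Close" -> "adj_close", "Date" -> "date", and so on.
    df.columns = [str(c).lower().replace(" ", "_") for c in df.columns]
    df["symbol"] = symbol
    return df


# Hand-built stand-in for a download result (made-up values).
index = pd.DatetimeIndex(["2024-01-02", "2024-01-03"], name="Date")
columns = pd.MultiIndex.from_product([["Close", "Volume"], ["AAA"]])
raw = pd.DataFrame([[10.0, 100], [10.5, 120]], index=index, columns=columns)

tidy = tidy_columns(raw, "AAA")
print(tidy.columns.tolist())
```

Testing the transformation on synthetic input like this makes it easier to notice when a library upgrade changes the shape of the downloaded data.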

In [4]:
def save_data_range(symbol, start, end, con):
    data = get_stock_data(symbol, start, end)
    data.to_sql('stock_data', 
                con, 
                if_exists='append', 
                index=False)
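
Because `if_exists='append'` adds rows unconditionally, saving an overlapping date range twice stores the same rows twice. One possible way to make the save step repeatable is a primary key on `(symbol, date)` combined with SQLite's `INSERT OR IGNORE`. The sketch below is a minimal version of that idea; the table `demo_prices`, the helper `append_unique`, and the sample data are hypothetical.

```python
import sqlite3

import pandas as pd


def append_unique(df, con):
    """Insert rows, skipping any (symbol, date) pair already stored."""
    con.execute(
        """
        CREATE TABLE IF NOT EXISTS demo_prices (
            date   TEXT NOT NULL,
            close  REAL,
            symbol TEXT NOT NULL,
            PRIMARY KEY (symbol, date)
        )
        """
    )
    rows = [
        (pd.Timestamp(d).strftime("%Y-%m-%d"), float(c), s)
        for d, c, s in df[["date", "close", "symbol"]].itertuples(index=False)
    ]
    # OR IGNORE silently drops rows that would violate the primary key.
    con.executemany(
        "INSERT OR IGNORE INTO demo_prices (date, close, symbol) VALUES (?, ?, ?)",
        rows,
    )
    con.commit()


con = sqlite3.connect(":memory:")
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-02", "2024-01-03"]),
    "close": [10.0, 10.5],
    "symbol": ["AAA", "AAA"],
})
append_unique(df, con)
append_unique(df, con)  # the second call finds nothing new to insert

row_count = con.execute("SELECT COUNT(*) FROM demo_prices").fetchone()[0]
print(row_count)
```

The trade-off is that the schema has to be declared up front instead of being inferred by `to_sql`.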

In [5]:
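def save_last_trading_session(symbol, con):
    """Append today's session for `symbol`, if any data is available."""
    today = pd.Timestamp.today().normalize()
    # yfinance treats `end` as exclusive, so request [today, tomorrow).
    tomorrow = today + pd.Timedelta(days=1)
    data = get_stock_data(symbol, today, tomorrow)
    if not data.empty:
        data.to_sql('stock_data',
                    con,
                    if_exists='append',
                    index=False)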
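
For a scheduled job, it can be more robust to ask the database which day comes next than to assume the job runs every day. The sketch below is one way to do that, assuming a `stock_data` table with `date` and `symbol` columns like the one above. The helper name `next_start_date`, its `default` argument, and the sample rows are hypothetical.

```python
import sqlite3
from datetime import date, timedelta


def next_start_date(con, symbol, default="2020-01-01"):
    """Return the day after the latest stored date for `symbol`."""
    row = con.execute(
        "SELECT MAX(date(date)) FROM stock_data WHERE symbol = ?", (symbol,)
    ).fetchone()
    if row[0] is None:
        # Nothing stored yet for this symbol: fall back to the default.
        return default
    last = date.fromisoformat(row[0])
    return (last + timedelta(days=1)).isoformat()


# Tiny in-memory table with made-up rows to exercise the helper.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE stock_data (date TEXT, symbol TEXT)")
con.executemany(
    "INSERT INTO stock_data VALUES (?, ?)",
    [("2024-01-02 00:00:00", "AAA"), ("2024-01-03 00:00:00", "AAA")],
)

start_known = next_start_date(con, "AAA")
start_new = next_start_date(con, "ZZZ")
print(start_known, start_new)
con.close()
```

The returned string could then be passed as the `start` of a download, so a job that misses a few days catches up on its next run.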

In [None]:
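def retrieve_stockdata(symbol, start_date, end_date, con):
    """Return stored rows for `symbol` between two dates (inclusive)."""
    # "?" placeholders let sqlite3 bind the values instead of formatting
    # them into the SQL string. pandas stores timestamps as text with a
    # time part, so date() trims it and the end date is included.
    query = """
    SELECT * FROM stock_data
    WHERE symbol = ?
    AND date(date) BETWEEN ? AND ?
    """
    return pd.read_sql_query(query, con,
                             params=(symbol, start_date, end_date))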
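
A date-range read can be tried end to end on an in-memory database. The sketch below writes three made-up rows with `to_sql`, then reads two of them back using bound parameters and `parse_dates`. It assumes date bounds are passed as `YYYY-MM-DD` strings, and it wraps the column in SQLite's `date()` so that the end date itself is included.

```python
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")

# Made-up rows standing in for downloaded prices.
demo = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]),
    "close": [10.0, 10.5, 11.0],
    "symbol": ["AAA", "AAA", "AAA"],
})
demo.to_sql("stock_data", con, index=False)

query = """
SELECT * FROM stock_data
WHERE symbol = ? AND date(date) BETWEEN ? AND ?
ORDER BY date
"""
result = pd.read_sql_query(
    query,
    con,
    params=("AAA", "2024-01-02", "2024-01-03"),
    parse_dates=["date"],  # convert the stored text back to datetimes
)
con.close()
print(result)
```

Binding values through `params` also means a symbol typed by a user is never interpreted as SQL.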

In [None]:
# Main Execution block
if __name__ == '__main__':
    con = sqlite3.connect("08_market_data.sqlite")

    # User input
    symbol = input("Enter stock symbol: ").strip().upper()
    start_date = input("Enter start date (YYYY-MM-DD): ").strip()
    end_date = input("Enter end date (YYYY-MM-DD): ").strip()

    # Save data range
    save_data_range(symbol, start_date, end_date, con)
    print(f"Data for {symbol} from {start_date} to {end_date} saved to database.")

    # Save last trading session data
    save_last_trading_session(symbol, con)
    print(f"Last trading session data for {symbol} saved to database.")

    # Retrieve and display data
    query = "SELECT * FROM stock_data WHERE symbol = ?"
    df_stock = pd.read_sql_query(query, con, params=(symbol,))

    # Close the connection
    con.close()

    # Display the data
    print("\nStock Data:")
    print(df_stock.head())
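
The block above prompts for input, which suits a notebook but not an unattended job. To automate the process, one option is to read the same values from command-line arguments. The sketch below uses the standard-library `argparse` module. The argument names and the helper `parse_args` are hypothetical, and the default database path simply reuses the file name from the block above.

```python
import argparse


def parse_args(argv=None):
    """Parse the symbol, date range, and database path from the command line."""
    parser = argparse.ArgumentParser(
        description="Update the stock price database."
    )
    # str.upper normalizes the ticker, mirroring .upper() in the prompt version.
    parser.add_argument("symbol", type=str.upper, help="ticker symbol")
    parser.add_argument("--start", help="start date (YYYY-MM-DD)")
    parser.add_argument("--end", help="end date (YYYY-MM-DD)")
    parser.add_argument(
        "--db", default="08_market_data.sqlite", help="SQLite file path"
    )
    return parser.parse_args(argv)


# Passing a list simulates a command line; omit it to read sys.argv.
args = parse_args(["aaa", "--start", "2024-01-01", "--end", "2024-01-31"])
print(args.symbol, args.start, args.end, args.db)
```

With the functions above saved in a script that calls `parse_args()`, a scheduler such as cron or Windows Task Scheduler could run it after each market close without any prompts.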