
# How to build an Automated Quant Database

Quants analyze market data, often using internal databases at firms like Jane Street, Man Group, and Goldman Sachs. Building your own stock price database lets you combine stock prices, economic trends, and custom analytics in one place, which makes market research more efficient.

With free market data now widely available, it is a good time to start storing and analyzing it. This issue shows how to:

- Use SQLite to build a database
- Download stock data for free
- Store the data in a database
- Automate the entire process

Sources:
- https://www.sqlite.org/
- https://docs.python.org/3/library/sqlite3.html
- https://www.pyquantnews.com/the-pyquant-newsletter/how-to-build-an-automated-quant-database
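
As a quick orientation for the first and third steps listed above, here is a minimal sketch of creating a SQLite database and writing a DataFrame into it. It uses an in-memory database and a made-up two-row frame (the symbol `XYZ` and its prices are hypothetical), so it runs without any download.

```python
import sqlite3

import pandas as pd

# In-memory database for illustration; passing a file path instead
# would persist the data to disk.
con = sqlite3.connect(":memory:")

# Hypothetical two-row price frame standing in for downloaded data.
prices = pd.DataFrame(
    {
        "date": ["2024-01-02", "2024-01-03"],
        "close": [100.0, 101.5],
        "volume": [1_000, 1_200],
        "symbol": ["XYZ", "XYZ"],
    }
)

# to_sql creates the table on first use and appends on later calls.
prices.to_sql("stock_data", con, if_exists="append", index=False)

row_count = con.execute("SELECT COUNT(*) FROM stock_data").fetchone()[0]
print(row_count)
con.close()
```

The cells below follow the same pattern, with yfinance supplying the DataFrame.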

In [5]:
pip install yfinance

Note: you may need to restart the kernel to use updated packages.





In [6]:
from sys import argv

import pandas as pd
import yfinance as yf
import sqlite3

In [7]:
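def get_stock_data(symbol, start, end):
    # Download daily bars; yfinance treats `end` as exclusive.
    data = yf.download(symbol, start=start, end=end)
    # Recent yfinance versions may return MultiIndex columns (field, ticker)
    # even for a single symbol; keep only the field level in that case.
    if isinstance(data.columns, pd.MultiIndex):
        data.columns = data.columns.get_level_values(0)
    data.reset_index(inplace=True)
    # rename ignores keys that are absent from the frame
    # (e.g. 'Adj Close' when prices are auto-adjusted).
    data.rename(columns={
        'Date': 'date',
        'Open': 'open',
        'High': 'high',
        'Low': 'low',
        'Close': 'close',
        'Adj Close': 'adj_close',
        'Volume': 'volume'
    }, inplace=True)
    data['symbol'] = symbol
    return data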

In [8]:
def save_data_range(symbol, start, end, con):
    data = get_stock_data(symbol, start, end)
    data.to_sql('stock_data', 
                con, 
                if_exists='append', 
                index=False)
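
Once rows are in `stock_data`, they can be read back into pandas. The sketch below is one possible way to do that with a parameterized query; `load_symbol` is a hypothetical helper (not part of the original notebook), and the sample rows are made up so the example runs against an in-memory database.

```python
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")

# Made-up rows for two hypothetical symbols.
sample = pd.DataFrame(
    {
        "date": ["2024-01-02", "2024-01-03", "2024-01-02"],
        "close": [10.0, 10.5, 50.0],
        "symbol": ["XYZ", "XYZ", "ABC"],
    }
)
sample.to_sql("stock_data", con, if_exists="append", index=False)


def load_symbol(symbol, con):
    """Return date and close for one symbol, oldest row first."""
    query = (
        "SELECT date, close FROM stock_data "
        "WHERE symbol = ? ORDER BY date"
    )
    # The `?` placeholder lets sqlite3 bind the value safely.
    return pd.read_sql_query(query, con, params=(symbol,))


xyz = load_symbol("XYZ", con)
print(xyz)
```

Binding the symbol through `params` rather than formatting it into the SQL string avoids quoting problems and SQL injection.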

In [9]:
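def save_last_trading_session(symbol, con):
    today = pd.Timestamp.today().normalize()
    # yfinance treats `end` as exclusive, so request [today, tomorrow)
    # to get today's bar; the frame is empty on non-trading days.
    data = get_stock_data(symbol, today, today + pd.Timedelta(days=1))
    if not data.empty:
        data.to_sql('stock_data', 
                    con, 
                    if_exists='append', 
                    index=False)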
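
Because both functions append unconditionally, running the same command twice stores the same session twice. The sketch below shows one possible guard, not the notebook's own method: a separate table with a composite primary key on `(symbol, date)` plus SQLite's `INSERT OR IGNORE`. The table name `stock_data_unique` and the helper `append_unique` are hypothetical, only three columns are kept for brevity, and dates are assumed to be stored as strings.

```python
import sqlite3

import pandas as pd


def append_unique(data, con):
    """Insert rows, skipping (symbol, date) pairs that are already stored."""
    con.execute(
        """
        CREATE TABLE IF NOT EXISTS stock_data_unique (
            date   TEXT NOT NULL,
            symbol TEXT NOT NULL,
            close  REAL,
            PRIMARY KEY (symbol, date)
        )
        """
    )
    rows = list(
        data[["date", "symbol", "close"]].itertuples(index=False, name=None)
    )
    # OR IGNORE drops any row that would violate the primary key.
    con.executemany(
        "INSERT OR IGNORE INTO stock_data_unique (date, symbol, close) "
        "VALUES (?, ?, ?)",
        rows,
    )
    con.commit()


con = sqlite3.connect(":memory:")
session = pd.DataFrame(
    {"date": ["2024-01-02"], "symbol": ["XYZ"], "close": [100.0]}
)

# Saving the same session twice leaves a single row.
append_unique(session, con)
append_unique(session, con)

stored = con.execute("SELECT COUNT(*) FROM stock_data_unique").fetchone()[0]
print(stored)
```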

In [10]:
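if __name__ == '__main__':
    # Intended to run as a script: <script> bulk <symbol> <start> <end>
    # or <script> last <symbol>. Inside a notebook, argv holds the
    # kernel's own arguments, so the final branch is the one that runs.
    con = sqlite3.connect("08_market_data.sqlite")

    command = argv[1] if len(argv) > 1 else None
    if command == "bulk" and len(argv) >= 5:
        symbol, start, end = argv[2:5]
        save_data_range(symbol, start, end, con)
        print(f"Data for {symbol} from {start} to {end} saved to database.")
    elif command == "last" and len(argv) >= 3:
        symbol = argv[2]
        save_last_trading_session(symbol, con)
        print(f"Last trading session data for {symbol} saved to database.")
    else:
        print("Invalid command. Use 'bulk' or 'last'.")

    # Release the database file handle.
    con.close()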

Invalid command. Use 'bulk' or 'last'.
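
Reading `argv` by position works, but it gives no help text and fails awkwardly when an argument is missing. As an alternative sketch (an assumption, not the notebook's approach), the same two commands can be declared with the standard library's `argparse`; `build_parser` is a hypothetical helper, and the symbol and dates passed below are example values only.

```python
import argparse


def build_parser():
    """Declare the 'bulk' and 'last' subcommands and their arguments."""
    parser = argparse.ArgumentParser(description="Save stock data to SQLite.")
    sub = parser.add_subparsers(dest="command", required=True)

    bulk = sub.add_parser("bulk", help="save a date range for one symbol")
    bulk.add_argument("symbol")
    bulk.add_argument("start")
    bulk.add_argument("end")

    last = sub.add_parser("last", help="save the latest session for one symbol")
    last.add_argument("symbol")
    return parser


# Parse an explicit list so the example also runs inside a notebook.
args = build_parser().parse_args(["bulk", "SPY", "2022-01-01", "2022-10-20"])
print(args.command, args.symbol, args.start, args.end)
```

In a script, calling `parse_args()` with no list reads the real command line. A scheduler such as cron or Windows Task Scheduler could then invoke the `last` subcommand after each market close to keep the database current.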
