# Real Time Data Stream

This project sets up a real-time data feed and builds the infrastructure needed to maintain it.

The data stream carries financial data, specifically crypto prices, because crypto markets trade 24/7.

### Outline:
1) Database
2) API
3) Data Storage
4) Analysis
5) Visualization

## Initialization

In [5]:
import pandas as pd
import numpy as np
import sqlite3 #needed for database
#datastream imports
import websocket
import json
import threading
import time
from datetime import datetime

## Database

A SQLite database stores the price data.

In [6]:
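import os
os.makedirs('db', exist_ok=True) #make sure the db/ folder exists, otherwise connect() fails
conn = sqlite3.connect('db/pricing_data.db') #establish connection to the database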
c = conn.cursor() #create a cursor object

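#creating the pricing data table with predefined schema
#column definitions must be comma-separated; volume is REAL because trade sizes are fractional
c.execute('''
    CREATE TABLE IF NOT EXISTS price_ticks(
          timestamp TEXT,
          symbol TEXT,
          price REAL,
          volume REAL,
          recieved_at TEXT)
    ''')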

conn.commit() #save these changes
conn.close() #close the connection
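
As a rough illustration of how rows could later be written to and read from this table, here is a minimal sketch. It runs against an in-memory database so it does not touch `db/pricing_data.db`. The helper name `insert_ticks` and the sample values are made up for this example, and the demo table simply mirrors the columns above.

```python
import sqlite3
from datetime import datetime, timezone

def insert_ticks(conn, ticks):
    """Insert (timestamp, symbol, price, volume) tuples, stamping each with the receive time."""
    received = datetime.now(timezone.utc).isoformat()
    rows = [(ts, symbol, price, volume, received) for ts, symbol, price, volume in ticks]
    # Parameterized placeholders keep values out of the SQL string itself
    conn.executemany(
        "INSERT INTO price_ticks (timestamp, symbol, price, volume, recieved_at) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)

# In-memory stand-in for the real database file (assumption for this sketch)
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("""
    CREATE TABLE IF NOT EXISTS price_ticks(
        timestamp TEXT,
        symbol TEXT,
        price REAL,
        volume REAL,
        recieved_at TEXT)
""")

sample_ticks = [  # hypothetical values, not real market data
    ("2025-01-01T00:00:00", "BINANCE:BTCUSDT", 100.0, 0.5),
    ("2025-01-01T00:00:01", "BINANCE:BTCUSDT", 101.0, 0.25),
]
inserted = insert_ticks(demo_conn, sample_ticks)

# ISO-8601 strings sort chronologically, so ORDER BY on the TEXT column works
latest = demo_conn.execute(
    "SELECT timestamp, price, volume FROM price_ticks ORDER BY timestamp DESC LIMIT 1"
).fetchone()
print(inserted, latest)
```

Using `executemany` with one commit per batch should keep write overhead low when many trades arrive between refreshes.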

## API

The next step is selecting a data provider that offers real-time data that is free to publish.

Of the options considered, I chose Finnhub. The other contender was Alpha Vantage.
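
To give a feel for what working with Finnhub's data involves, here is a minimal sketch that turns one WebSocket trade message into rows shaped like the `price_ticks` table. It assumes a trade message has `type` equal to `trade` and a `data` list whose entries carry `t` (Unix time in milliseconds), `p` (price), `v` (volume), and `s` (symbol). The function name `trades_to_rows` and the sample values are made up for illustration.

```python
import json
from datetime import datetime, timezone

def trades_to_rows(message, received_at=None):
    """Convert one raw message into (timestamp, symbol, price, volume, recieved_at) tuples."""
    payload = json.loads(message)
    if payload.get("type") != "trade":
        return []  # non-trade messages carry nothing to store
    if received_at is None:
        received_at = datetime.now(timezone.utc).isoformat()
    rows = []
    for trade in payload.get("data", []):
        # 't' is assumed to be the trade time in Unix milliseconds
        ts = datetime.fromtimestamp(trade["t"] / 1000, tz=timezone.utc).isoformat()
        rows.append((ts, trade.get("s"), trade["p"], trade["v"], received_at))
    return rows

# Hypothetical message with made-up price and volume
sample = json.dumps({
    "type": "trade",
    "data": [{"s": "BINANCE:BTCUSDT", "p": 100.5, "t": 1700000000000, "v": 0.25}],
})
rows = trades_to_rows(sample, received_at="2023-11-14T22:13:21+00:00")
print(rows)
```

Keeping the parsing in a pure function like this makes it easy to check without opening a network connection.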

In [7]:
# import finnhub

# # Setup client
# finnhub_client = finnhub.Client(api_key="YOUR_FINNHUB_API_KEY")

# quote = finnhub_client.quote('BINANCE:BTCUSDT')
# print(quote)

In [8]:
# import requests #needed for the REST call below
# API_KEY = 'YOUR_FINNHUB_API_KEY'
# SYMBOL = 'BINANCE:BTCUSDT'
# URL = f'https://finnhub.io/api/v1/quote?symbol={SYMBOL}&token={API_KEY}'

# while True:
#     try:
#         response = requests.get(URL)
#         data = response.json()
#         print(f"[{datetime.utcnow().isoformat()}] Price: {data['c']}")
#         time.sleep(5)
#     except Exception as e:
#         print("Error:", e)
#         time.sleep(5)
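
The polling loop above samples a single quote every five seconds. A streaming feed instead pushes many trades between refreshes, so they have to be held somewhere until the next refresh, with one thread filling the buffer and another emptying it. The following is a minimal sketch of that pattern with no network involved. The class name `TradeBuffer` and the fake trades are assumptions made for this example.

```python
import threading

class TradeBuffer:
    """Thread-safe list that one thread appends to and another drains."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def extend(self, trades):
        with self._lock:
            self._items.extend(trades)

    def drain(self):
        """Return everything buffered so far and leave the buffer empty, as one atomic step."""
        with self._lock:
            drained, self._items = self._items, []
        return drained

buffer = TradeBuffer()

def producer(n):
    # Stand-in for a message callback: pushes n fake trades one at a time
    for i in range(n):
        buffer.extend([{"p": 100.0 + i, "v": 0.001}])

threads = [threading.Thread(target=producer, args=(1000,)) for _ in range(4)]
for th in threads:
    th.start()

collected = []
while any(th.is_alive() for th in threads):
    collected.extend(buffer.drain())  # drain repeatedly while producers are running
for th in threads:
    th.join()
collected.extend(buffer.drain())  # pick up anything left after the last producer finished

print(len(collected))
```

Swapping the list under a lock means a trade that arrives between reading and clearing cannot be dropped, which is the risk when copy and clear are done as two unguarded steps.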

In [None]:
import os
from datetime import datetime, timezone

API_KEY = os.environ['FINNHUB_API_KEY'] #read the token from an environment variable (name chosen here) instead of hard-coding it
SYMBOL = 'BINANCE:BTCUSDT'
trade_buffer = [] #temporary storage for trades, clears at predetermined intervals
buffer_lock = threading.Lock() #guards trade_buffer, which is shared between two threads
seconds_per_refresh = 5 #how often the trade_buffer is refreshed, in seconds

# receives every message, makes sure it's a trade, and then appends that information to the trade buffer
def on_message(ws, message):
    data = json.loads(message)
    if data.get('type') == 'trade':
        with buffer_lock:
            trade_buffer.extend(data['data'])

# prints errors
def on_error(ws, error):
    print("WebSocket error:", error)

# indicates the connection is closed (websocket-client 1.0+ also passes the status code and message)
def on_close(ws, close_status_code, close_msg):
    print("WebSocket closed")

# indicates the connection has opened and subscribes to the symbol
def on_open(ws):
    print("WebSocket connection opened")
    ws.send(json.dumps({
        "type": "subscribe",
        "symbol": SYMBOL
    }))

# every seconds_per_refresh seconds, drains the buffer and prints the trades collected since the last refresh
def process_trades():
    while True:
        time.sleep(seconds_per_refresh)
        with buffer_lock: #copy and clear together so no trade is lost in between
            trades = trade_buffer.copy()
            trade_buffer.clear()

        now = datetime.now(timezone.utc).isoformat() #timezone-aware replacement for the deprecated utcnow()
        if trades:
            print(f"\n[{now}] {len(trades)} trades:")
            for trade in trades:
                ts = datetime.fromtimestamp(trade['t'] / 1000, tz=timezone.utc).isoformat()
                print(f"  Time: {ts} | Price: {trade['p']} | Volume: {trade['v']}")
        else:
            print(f"\n[{now}] No new trades.")

websocket.enableTrace(False)
ws = websocket.WebSocketApp(f"wss://ws.finnhub.io?token={API_KEY}",
                            on_open=on_open,
                            on_message=on_message,
                            on_error=on_error,
                            on_close=on_close)

#run the printer in a background daemon thread so it stops when the notebook kernel stops
t = threading.Thread(target=process_trades)
t.daemon = True
t.start()

ws.run_forever()

WebSocket connection opened


[2025-05-03T18:46:06.551996] 8 trades:
  Time: 2025-05-03T18:46:02.105000 | Price: 96100 | Volume: 0.01558
  Time: 2025-05-03T18:46:02.960000 | Price: 96100.01 | Volume: 0.00034
  Time: 2025-05-03T18:46:03.115000 | Price: 96100 | Volume: 0.00013
  Time: 2025-05-03T18:46:03.202000 | Price: 96100.01 | Volume: 0.00023
  Time: 2025-05-03T18:46:03.437000 | Price: 96100.01 | Volume: 0.00065
  Time: 2025-05-03T18:46:04.018000 | Price: 96100 | Volume: 0.0006
  Time: 2025-05-03T18:46:04.481000 | Price: 96100.01 | Volume: 0.00238
  Time: 2025-05-03T18:46:04.859000 | Price: 96100.01 | Volume: 0.00038

[2025-05-03T18:46:11.553515] 32 trades:
  Time: 2025-05-03T18:46:05.108000 | Price: 96100.01 | Volume: 0.00029
  Time: 2025-05-03T18:46:05.360000 | Price: 96100.01 | Volume: 0.00014
  Time: 2025-05-03T18:46:05.737000 | Price: 96100 | Volume: 7e-05
  Time: 2025-05-03T18:46:07.798000 | Price: 96100.01 | Volume: 0.00104
  Time: 2025-05-03T18:46:08.247000 | Price: 96100.01 | Volume: 0.00014
  Time: 202


[2025-05-03T18:46:26.556850] 18 trades:
  Time: 2025-05-03T18:46:20.263000 | Price: 96100.01 | Volume: 0.00104
  Time: 2025-05-03T18:46:20.267000 | Price: 96100.01 | Volume: 0.00021
  Time: 2025-05-03T18:46:20.340000 | Price: 96100.01 | Volume: 0.0001
  Time: 2025-05-03T18:46:20.417000 | Price: 96100.01 | Volume: 0.00052
  Time: 2025-05-03T18:46:20.460000 | Price: 96100.01 | Volume: 7e-05
  Time: 2025-05-03T18:46:21.036000 | Price: 96100 | Volume: 0.0005
  Time: 2025-05-03T18:46:20.861000 | Price: 96100 | Volume: 8e-05
  Time: 2025-05-03T18:46:20.996000 | Price: 96100.01 | Volume: 0.00059
  Time: 2025-05-03T18:46:21.249000 | Price: 96100.01 | Volume: 0.0001
  Time: 2025-05-03T18:46:21.400000 | Price: 96100.01 | Volume: 0.0001
  Time: 2025-05-03T18:46:21.418000 | Price: 96100.01 | Volume: 0.00052
  Time: 2025-05-03T18:46:21.429000 | Price: 96100.01 | Volume: 0.00263
  Time: 2025-05-03T18:46:22.395000 | Price: 96100.01 | Volume: 6e-05
  Time: 2025-05-03T18:46:22.763000 | Price: 96100.01

[2025-05-03T18:46:31.558197] No new trades.

[2025-05-03T18:46:36.559712] No new trades.

[2025-05-03T18:46:41.560615] No new trades.

[2025-05-03T18:46:46.561280] No new trades.

[2025-05-03T18:46:51.563291] No new trades.

[2025-05-03T18:46:56.563811] No new trades.

[2025-05-03T18:47:01.564854] No new trades.

[2025-05-03T18:47:06.565321] No new trades.

[2025-05-03T18:47:11.566300] No new trades.

[2025-05-03T18:47:16.567054] No new trades.

[2025-05-03T18:47:21.567347] No new trades.

[2025-05-03T18:47:26.568269] No new trades.

[2025-05-03T18:47:31.569438] No new trades.

[2025-05-03T18:47:36.570463] No new trades.

[2025-05-03T18:47:41.570964] No new trades.

[2025-05-03T18:47:46.576009] No new trades.

[2025-05-03T18:47:51.577202] No new trades.

[2025-05-03T18:47:56.577780] No new trades.

[2025-05-03T18:48:01.578754] No new trades.

[2025-05-03T18:48:06.579597] No new trades.

[2025-05-03T18:48:11.580553] No new trades.

[2025-05-03T18:48:16.581314] No new trades.

[2025-05-