# Abstract

Financial markets now move at machine speed, and algorithmic trading has changed how both institutional and retail investors operate. This shift depends on well-designed Application Programming Interfaces (APIs) that let automated strategies connect to diverse market venues, access real-time data, and execute orders programmatically. In this proposal, we present a team-based academic project to design and implement a flexible, object-oriented trading API. The project focuses on architectural patterns (API gateways), generic connectivity, flight search as an illustrative data model, and critical mechanisms such as throttling (rate limiting) and authentication.
This proposal targets students and developers who want to understand modern API-driven systems built on the principles of object-oriented programming (OOP). We emphasize how OOP supports scalable, maintainable, and extensible APIs that can serve algorithmic trading and adapt to broader data domains such as travel aggregation. The approach follows current industry practice, draws on academic and technical sources, and provides a roadmap for implementation, teamwork, and testing.

### Objectives

Our project will address the following main objectives:
1. Access financial data through an API

    •   For now, we will use the Finnhub Stock API, since it has detailed documentation and its basic functionality is free to use

    •   Finnhub defines its own error-handling scheme, so we will follow it

    •   [Finnhub API documentation](https://finnhub.io/docs/api/introduction)

2. Thorough exploratory data analysis

    •   Since we will most likely work with time-series data, econometric methods will play a significant role

    •   We also plan to visualize the data in other forms (histogram, heatmap, etc.); the specific plots are still to be decided

3. Develop a trading strategy

    •   Further details to be discussed

4. Backtest

    •   TBD
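
As a starting point for objective 1, the sketch below shows one possible way to request a quote from Finnhub's REST API using only the standard library. It assumes the `/quote` endpoint and its short response keys (`c`, `h`, `l`, `o`, `pc`, `t`) as described in the linked documentation. The helper names `build_quote_url`, `parse_quote`, and `fetch_quote` are our own placeholders, not part of any Finnhub client.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://finnhub.io/api/v1"  # base path assumed from the Finnhub docs

# Assumed mapping from Finnhub's short quote keys to readable names.
QUOTE_FIELDS = {
    "c": "current",
    "h": "high",
    "l": "low",
    "o": "open",
    "pc": "previous_close",
    "t": "timestamp",
}


def build_quote_url(symbol: str, token: str) -> str:
    """Build the request URL for a single-symbol quote."""
    query = urllib.parse.urlencode({"symbol": symbol.upper(), "token": token})
    return f"{BASE_URL}/quote?{query}"


def parse_quote(payload: dict) -> dict:
    """Rename the short keys; keys missing from the payload become None."""
    return {name: payload.get(key) for key, name in QUOTE_FIELDS.items()}


def fetch_quote(symbol: str, token: str, timeout: float = 10.0) -> dict:
    """Perform the HTTP request (needs network access and a valid API key)."""
    url = build_quote_url(symbol, token)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_quote(json.load(resp))
```

Keeping URL construction and response parsing separate from the network call means both can be unit-tested without an API key; only `fetch_quote` touches the network.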

# Midway Report

### Current Progress

So far, we have decided which API to use and designed a workflow for each team member. Tasks are approximately equally weighted, so each team member can contribute to the project.

### Challenges & Problems

The biggest challenge so far was that we were heading in the wrong direction. Initially, we thought the main goal of this project was to build our own API, and we assigned tasks based on this misconception.
As a result, our early effort was not effective, and we had to discard everything we had built once we recognized this strategic mistake.
Another issue, whose impact we cannot yet judge, is that most Finnhub API functionality is not free, so we may run into limitations if we continue on the free tier.
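
One way to reduce the risk of hitting a free-tier quota is to throttle requests on the client side. The sketch below is a simple sliding-window limiter. The class name `SlidingWindowLimiter` is hypothetical, and any limit passed to it (for example 60 calls per 60 seconds) is an assumed placeholder rather than a figure taken from Finnhub's documentation.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any window of `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so tests do not have to wait
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def _evict(self, now):
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()

    def wait_time(self):
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self._clock()
        self._evict(now)
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def acquire(self):
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            self._sleep(delay)
        now = self._clock()
        self._evict(now)
        self._calls.append(now)
```

Calling `limiter.acquire()` before each API request would keep the request rate under the configured limit; the real quota should be read from the provider's documentation before choosing the parameters.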

In [5]:
%pip install seaborn

In [6]:
import numpy as np
import pandas as pd
import seaborn as sns
import statsmodels.api as sm
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, classification_report, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
import scipy.stats as stats
from scipy.stats import ttest_ind, chisquare, normaltest

In [1]:
# engine file

import datetime
from datetime import date
import pandas as pd

try:
    import yfinance as yf
except ImportError:
    print("yfinance not installed. 'yahoo' source will not work.")
    yf = None


class MarketDataQuery:
    def __init__(self,
                 symbol: str,  # symbol means permno
                 time_frame: str,
                 start_date: str,
                 end_date: str,
                 frequency: str = '1d', # '1m', '2m', '5m', '15m', '30m', '60m', '90m', '1h', '1d', '5d', '1wk', '1mo', '3mo'
                 source: str = 'yahoo'): # default source is yahoo, it will ask yfinance to provide the data
        self.symbol = symbol
        self.time_frame = time_frame
        self.start_date = datetime.datetime.strptime(start_date, '%Y-%m-%d')  # convert string to datetime, opposite to strftime
        self.end_date = datetime.datetime.strptime(end_date, '%Y-%m-%d')
        self.frequency = frequency
        self.source = source

        self._validate()

    # Validate the query parameters once, when the query is constructed
    def _validate(self):
        if self.start_date > self.end_date:
            raise ValueError('Start date must be before end date')
        valid_frequencies = ['1m', '2m', '5m', '15m', '30m', '60m', '90m', '1h', '1d', '5d', '1wk', '1mo', '3mo']

        if self.frequency not in valid_frequencies:
            raise ValueError('Frequency must be one of {}'.format(valid_frequencies))

    def fetch(self):
        if self.source == 'test':

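            # Map yfinance-style intervals to pandas offset aliases.
            # 'min' and 'h' replace the deprecated 'T' and 'H' aliases.
            freq_map = {'1m': 'min',
                        '2m': '2min',
                        '5m': '5min',
                        '15m': '15min',
                        '30m': '30min',
                        '60m': '60min',
                        '90m': '90min',
                        '1h': 'h',
                        '1d': 'D',
                        '5d': '5D',
                        '1wk': 'W',   # weekly bars anchored on Sunday
                        '1mo': 'MS',  # month start
                        '3mo': '3MS'}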
            pd_freq = freq_map.get(self.frequency)  # get() returns None for an unmapped frequency
            if not pd_freq:
                raise ValueError(f'Test source does not support frequency: {self.frequency}')

            dates = pd.date_range(start=self.start_date, end=self.end_date, freq=pd_freq)
            if len(dates) == 0 and self.start_date <= self.end_date:  # no anchor date (e.g. week end or month start) falls in the range
                dates = pd.to_datetime([self.start_date])  # fall back to a single bar at start_date

            prices = [100 + i*0.5 for i in range(len(dates))]  # synthetic prices: +0.5 per bar

            df = pd.DataFrame({'date': dates, 'price': prices})
            return df

        elif self.source == 'yahoo':
            if yf is None:
                raise ImportError("yfinance is not installed. Cannot use 'yahoo' source.")

            df = yf.download(  # pass the query parameters to yfinance
                self.symbol,
                start=self.start_date.strftime('%Y-%m-%d'),  # convert the datetime object back to a string for the API
                end=self.end_date.strftime('%Y-%m-%d'),
                interval=self.frequency,
            )
            df.reset_index(inplace=True)
            return df

        else:
            raise ValueError(f'Unknown source: {self.source}')


# PriceBar holds one OHLCV bar of fetched market data
class PriceBar:
    def __init__(self, date, open_price, close_price, high_price, low_price, volume):
        self.date = date
        self.open = open_price
        self.close = close_price
        self.high = high_price
        self.low = low_price
        self.volume = volume

    def mid_price(self):
        return (self.high + self.low) / 2

    def is_bullish(self):
        return self.open < self.close

    def is_bearish(self):
        return self.open > self.close

    def __repr__(self):
        return (f'PriceBar(date={self.date}, open={self.open}, close={self.close}, '
                f'high={self.high}, low={self.low}, volume={self.volume})')


class TradeOrder:
    def __init__(self, symbol, side, quantity, order_type='market', price=None):
        self.symbol = symbol.upper()
        self.side = side.upper()  # 'BUY' or 'SELL'
        self.quantity = quantity
        self.order_type = order_type.lower()
        self.price = price
        # These are now set internally, not passed as args
        self.timestamp = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        self.status = 'pending'

    def cancel(self):
        if self.status == 'pending':
            self.status = 'cancelled'

    def __repr__(self):
        return (f'TradeOrder(symbol={self.symbol}, side={self.side}, quantity={self.quantity}, '
                f'type={self.order_type}, price={self.price}, status={self.status})')


class MarketOrder(TradeOrder):
    def __init__(self, symbol, side, quantity):
        super().__init__(symbol, side, quantity, order_type='market')

    def execute(self, market_price):
        self.price = market_price
        self.status = 'filled'


class LimitOrder(TradeOrder):
    def __init__(self, symbol, side, quantity, limit_price):
        super().__init__(symbol, side, quantity, order_type='limit', price=limit_price)

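    def execute(self, market_price):
        # Fill only when the market price is at or better than the limit.
        # Note: self.price keeps the limit price, so fills are recorded at the
        # limit rather than at market_price.
        if self.side == 'BUY' and market_price <= self.price:
            self.status = 'filled'
        elif self.side == 'SELL' and market_price >= self.price:
            self.status = 'filled'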


class OrderReceipt:
    def __init__(self, symbol, side, order, timestamp, executed_price=None, executed_quantity=0, status='pending'):
        self.order_id = id(order)
        self.symbol = symbol.upper()
        self.side = side.upper()
        self.original_quantity = order.quantity  # order here is an object from TradeOrder
        self.executed_quantity = executed_quantity
        self.executed_price = executed_price
        self.timestamp = timestamp  # Use passed timestamp
        self.status = status

    def __repr__(self):
        return (f'OrderReceipt(symbol={self.symbol.upper()}, side={self.side.upper()}, '
                f'executed_qty={self.executed_quantity}/{self.original_quantity}, '
                f'executed_price={self.executed_price}, status={self.status}, '
                f'timestamp={self.timestamp})')


class IConnector:  # defines a contract for how your system talks to any broker.
    def getMarketData(self, symbol, start_date, end_date):
        raise NotImplementedError('Subclasses must implement getMarketData')

    def submitOrder(self, order):
        raise NotImplementedError('Subclasses must implement submitOrder')

    def getAccountInfo(self):
        raise NotImplementedError('Subclasses must implement getAccountInfo')


class MockBrokerConnector(IConnector):  # provides a dummy broker for testing
    def __init__(self):
        self.cash_balance = 100000.0
        self.positions = {}  # e.g., {'AAPL': 100}
        self.order_history = []
        self.current_market_price = 100.0  # Added for predictable testing

    def getMarketData(self, symbol, start_date, end_date):
        dates = pd.date_range(start=start_date, end=end_date, freq='D')
        prices = [100 + i * 0.5 for i in range(len(dates))]
        df = pd.DataFrame({'date': dates, 'symbol': symbol, 'price': prices})
        return df

    def submitOrder(self, order):
        # Use the broker's current market price
        order.execute(self.current_market_price)
        self.order_history.append(order)

        if order.status == 'filled':
            if order.side == 'BUY':
                cost = order.price * order.quantity
                self.cash_balance -= cost
                self.positions[order.symbol] = self.positions.get(order.symbol, 0) + order.quantity
            elif order.side == 'SELL':
                revenue = order.price * order.quantity
                self.cash_balance += revenue
                self.positions[order.symbol] = self.positions.get(order.symbol, 0) - order.quantity

        # Return a receipt
        receipt = OrderReceipt(
            symbol=order.symbol,
            side=order.side,
            order=order,
            timestamp=datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
            executed_price=order.price if order.status == 'filled' else None,
            executed_quantity=order.quantity if order.status == 'filled' else 0,
            status=order.status
        )
        return receipt

    def getAccountInfo(self):
        return {
            'cash_balance': self.cash_balance,
            'positions': self.positions,
            'order_history': self.order_history}

yfinance not installed. 'yahoo' source will not work.
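
The `IConnector` class above raises `NotImplementedError` only when an unimplemented method is actually called. A possible alternative, sketched here with the standard-library `abc` module, rejects incomplete connectors as soon as they are instantiated. `AbstractConnector`, `IncompleteConnector`, and `can_instantiate` are hypothetical names used only for illustration; they are not part of the engine file.

```python
from abc import ABC, abstractmethod


class AbstractConnector(ABC):
    """Hypothetical ABC variant of the IConnector contract."""

    @abstractmethod
    def getMarketData(self, symbol, start_date, end_date):
        ...

    @abstractmethod
    def submitOrder(self, order):
        ...

    @abstractmethod
    def getAccountInfo(self):
        ...


class IncompleteConnector(AbstractConnector):
    """Implements only one of the three required methods."""

    def getAccountInfo(self):
        return {"cash_balance": 0.0, "positions": {}, "order_history": []}


def can_instantiate(cls) -> bool:
    """Return True if `cls` can be constructed with no arguments."""
    try:
        cls()
    except TypeError:  # raised by ABCs with unimplemented abstract methods
        return False
    return True
```

With this variant, a connector that forgets `submitOrder` fails at construction time instead of at the first order submission, which tends to surface the mistake earlier in testing.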


In [None]:
# test engine file

import pytest
import pandas as pd
import datetime
from trading_system import (
    MarketDataQuery,
    PriceBar,
    TradeOrder,
    MarketOrder,
    LimitOrder,
    OrderReceipt,
    MockBrokerConnector
)


# --- Tests ---

def test_market_data_query_test_source_daily():
    """Tests the MarketDataQuery with the 'test' source."""
    query = MarketDataQuery(  # build the query from all six parameters
        symbol='TEST',
        time_frame='D1',
        start_date='2023-01-01',
        end_date='2023-01-05',
        frequency='1d',
        source='test'
    )
    df = query.fetch()

    # assert raises an AssertionError if the expression evaluates to False
    assert isinstance(df, pd.DataFrame)
    assert 'date' in df.columns
    assert 'price' in df.columns
    assert len(df) == 5
    assert df['price'].iloc[0] == 100.0
    assert df['price'].iloc[-1] == 102.0  # 100 + 4 * 0.5


def test_market_data_query_test_source_weekly():
    """Tests the MarketDataQuery with the 'test' source for weekly freq."""
    query = MarketDataQuery(
        symbol='TEST',
        time_frame='W1',
        start_date='2023-01-01',
        end_date='2023-01-31',
        frequency='1wk',  # Test a different valid freq
        source='test'
    )
    df = query.fetch()

    # isinstance checks that df is a pandas DataFrame
    assert isinstance(df, pd.DataFrame)
    assert 'date' in df.columns
    assert len(df) == 5  # 5 Sundays in Jan 2023 (1, 8, 15, 22, 29)
    assert df['price'].iloc[0] == 100.0
    assert df['price'].iloc[-1] == 102.0  # 100 + 4 * 0.5


def test_market_data_query_validation():
    """Tests the validation logic in MarketDataQuery."""
    # Test invalid date range
    with pytest.raises(ValueError, match='Start date must be before end date'):
        MarketDataQuery('TEST', 'D1', '2023-01-05', '2023-01-01')

    # Test invalid frequency
    with pytest.raises(ValueError, match="Frequency must be one of .* '3mo'"):
        MarketDataQuery('TEST', 'D1', '2023-01-01', '2023-01-05', frequency='yearly')

    # Test invalid source
    with pytest.raises(ValueError, match='Unknown source: bloomberg'):
        query = MarketDataQuery('TEST', 'D1', '2023-01-01', '2023-01-05', source='bloomberg')
        query.fetch()


def test_price_bar():
    """Tests the PriceBar class methods."""
    # Bullish bar
    bull_bar = PriceBar('2023-01-01', 100, 110, 115, 95, 1000)
    assert bull_bar.is_bullish()
    assert not bull_bar.is_bearish()
    assert bull_bar.mid_price() == (115 + 95) / 2

    # Bearish bar
    bear_bar = PriceBar('2023-01-02', 110, 100, 115, 95, 1000)
    assert not bear_bar.is_bullish()
    assert bear_bar.is_bearish()

    # Doji (neutral)
    doji_bar = PriceBar('2023-01-03', 100, 100, 105, 95, 1000)
    assert not doji_bar.is_bullish()
    assert not doji_bar.is_bearish()


def test_trade_order():
    """Tests the base TradeOrder class."""
    order = TradeOrder('AAPL', 'buy', 100)

    # Test init
    assert order.symbol == 'AAPL'
    assert order.side == 'BUY'
    assert order.status == 'pending'
    assert order.order_type == 'market'
    assert isinstance(order.timestamp, str)

    # Test cancel pending
    order.cancel()
    assert order.status == 'cancelled'

    # Test cancel filled (should not change)
    order.status = 'filled'
    order.cancel()
    assert order.status == 'filled'

    # Test repr
    assert 'TradeOrder' in repr(order)
    assert 'AAPL' in repr(order)


def test_market_order():
    """Tests the MarketOrder class."""
    order = MarketOrder('MSFT', 'sell', 50)

    # Test init
    assert order.symbol == 'MSFT'
    assert order.side == 'SELL'
    assert order.quantity == 50
    assert order.order_type == 'market'
    assert order.status == 'pending'

    # Test execute
    order.execute(market_price=123.45)
    assert order.status == 'filled'
    assert order.price == 123.45


def test_limit_order():
    """Tests the LimitOrder class execution logic."""
    # Test BUY limit
    buy_order = LimitOrder('GOOG', 'buy', 10, 150.00)

    # Case 1: Market price is above limit
    buy_order.execute(market_price=151.00)
    assert buy_order.status == 'pending'

    # Case 2: Market price is at limit
    buy_order.execute(market_price=150.00)
    assert buy_order.status == 'filled'

    # Case 3: Market price is below limit
    buy_order.status = 'pending'  # Reset
    buy_order.execute(market_price=149.00)
    assert buy_order.status == 'filled'

    # Test SELL limit
    sell_order = LimitOrder('TSLA', 'sell', 5, 200.00)

    # Case 1: Market price is below limit
    sell_order.execute(market_price=199.00)
    assert sell_order.status == 'pending'

    # Case 2: Market price is at limit
    sell_order.execute(market_price=200.00)
    assert sell_order.status == 'filled'

    # Case 3: Market price is above limit
    sell_order.status = 'pending'  # Reset
    sell_order.execute(market_price=201.00)
    assert sell_order.status == 'filled'


def test_order_receipt():
    """Tests the OrderReceipt data container."""
    order = MarketOrder('AMD', 'buy', 20)
    order.execute(99.0)

    receipt = OrderReceipt(
        symbol=order.symbol,
        side=order.side,
        order=order,
        # Fixed timestamp; the test only checks that the receipt stores it
        timestamp='2023-01-01 12:00:00',
        executed_price=order.price,
        executed_quantity=order.quantity,
        status=order.status
    )

    assert receipt.order_id == id(order)
    assert receipt.symbol == 'AMD'
    assert receipt.original_quantity == 20
    assert receipt.executed_quantity == 20
    assert receipt.executed_price == 99.0
    assert receipt.status == 'filled'
    assert 'OrderReceipt' in repr(receipt)


# --- Test Class for MockBroker ---

class TestMockBroker:

    def setup_method(self, method):
        """Provides a fresh MockBrokerConnector for each test method."""
        self.broker = MockBrokerConnector()

    def test_mock_broker_init(self):
        """Tests the initial state of the MockBrokerConnector."""
        assert self.broker.cash_balance == 100000.0
        assert self.broker.positions == {}
        assert self.broker.order_history == []

    def test_mock_broker_market_data(self):
        """Tests the mock market data generation."""
        df = self.broker.getMarketData('TEST', '2023-01-01', '2023-01-05')
        assert isinstance(df, pd.DataFrame)
        assert len(df) == 5
        assert 'price' in df.columns

    def test_mock_broker_submit_market_buy(self):
        """Tests submitting a successful market BUY order."""
        order = MarketOrder('AAPL', 'buy', 10)
        receipt = self.broker.submitOrder(order)

        assert order.status == 'filled'
        assert order.price == 100.0  # Mock broker's price
        assert self.broker.cash_balance == 100000.0 - (100.0 * 10)
        # Positions track share quantity (10 shares), not cost (1000.0)
        assert self.broker.positions['AAPL'] == 10
        assert order in self.broker.order_history
        assert receipt.status == 'filled'
        assert receipt.executed_quantity == 10

    def test_mock_broker_submit_market_sell(self):
        """Tests submitting a successful market SELL order."""
        order = MarketOrder('MSFT', 'sell', 5)
        receipt = self.broker.submitOrder(order)

        assert order.status == 'filled'
        assert order.price == 100.0
        assert self.broker.cash_balance == 100000.0 + (100.0 * 5)
        # Positions track share quantity (-5 shares, a short position), not cost (-500.0)
        assert self.broker.positions['MSFT'] == -5
        assert order in self.broker.order_history
        assert receipt.status == 'filled'
        assert receipt.executed_quantity == 5

    def test_mock_broker_submit_limit_pending(self):
        """Tests a limit order that does NOT fill."""
        # BUY order with limit 95, market is 100
        order = LimitOrder('TSLA', 'buy', 10, 95.0)
        receipt = self.broker.submitOrder(order)

        assert order.status == 'pending'
        assert self.broker.cash_balance == 100000.0  # No change
        assert 'TSLA' not in self.broker.positions
        assert order in self.broker.order_history
        assert receipt.status == 'pending'
        assert receipt.executed_quantity == 0

    def test_mock_broker_account_info(self):
        """Tests the getAccountInfo method."""
        self.broker.cash_balance = 5000.0
        self.broker.positions = {'XYZ': 1234.5}

        info = self.broker.getAccountInfo()

        assert info['cash_balance'] == 5000.0
        assert info['positions']['XYZ'] == 1234.5
        assert info['order_history'] == []