Part 2: Web-Based Trading System
This section focuses on developing a fully functional web-based trading system that
integrates the machine learning model and API wrapper into an interactive user interface.
The goal is to give users a seamless way to analyze market trends and view
the signals generated by the ML backend. Additionally, the system must
be deployed to a cloud platform to ensure accessibility and scalability. Emphasis will be
placed on user experience and efficient data retrieval.
2.1 Python API Wrapper for SimFin
Students will build a Python API wrapper to interact with SimFin. The goal is to let users
work with the API entirely from Python. The wrapper consists of a set of classes
that retrieve share prices and any other useful information.
Requirements
• Write object-oriented code to extract data from SimFin.
• The library must allow users to retrieve stock price data for specified companies
over a specified period.
Instructions and suggestions
1. Define the structure of the API. The code below is an example of a simple Python
wrapper for the API. Treat it as a suggestion and simplify or extend it as you
see fit.
   1. Build a class PySimFin with the following capabilities.
      1. Constructor. Creates a new instance of the class to interact
      with the API. In the constructor, you can initialize the base endpoint
      and the header with the “api-key”.
      2. get_share_prices(ticker: str, start: str, end: str). Returns a
      DataFrame with all prices for the given ticker in the
      given time range.
      3. get_financial_statement(ticker: str, start: str, end: str). Returns a
      DataFrame with the financial statements for the given ticker
      in the given time range.


In [2]:
import requests
import pandas as pd

class PySimFin:

    BASE_URL = "https://backend.simfin.com/api/v3/"  # Base API URL

    def __init__(self, api_key: str):
        """
        Initialize the PySimFin client with the provided API key.
        """
        self.api_key = api_key
        self.headers = {"api-key": self.api_key}

    def _get(self, endpoint: str, params: dict):
        """
        Internal method to send a GET request to SimFin API.
        """
        params["api-key"] = self.api_key  # Add API key to request
        response = requests.get(self.BASE_URL + endpoint, params=params)

        if response.status_code == 200:
            return response.json()
        else:
            print(f"Error {response.status_code}: {response.text}")
            return None

    def get_sim_id(self, ticker: str):
        """
        Fetch SimFin ID for a given ticker.
        """
        endpoint = "companies/general/verbose"
        params = {"ticker": ticker}
        data = self._get(endpoint, params)

        if data and "id" in data:
            return data["id"]
        return None

    def get_share_prices(self, ticker: str, start: str, end: str):
        """
        Fetch historical share prices for a given ticker and time range.
        Returns a DataFrame.
        """
        sim_id = self.get_sim_id(ticker)
        if not sim_id:
            print(f"Error: Could not find SimFin ID for ticker {ticker}")
            return None

        endpoint = "companies/shares/prices"
        params = {"simId": sim_id, "start": start, "end": end}
        data = self._get(endpoint, params)

        if data and "data" in data:
            return pd.DataFrame(data["data"])  # Convert to DataFrame
        return None


In [5]:
import requests
import pandas as pd

class PySimFin():
    BASE_URL = "https://backend.simfin.com/api/v3/"  # Correct Base API URL

    def __init__(self, api_key: str):
        """
        Initialize the PySimFin client with the provided API key.
        """
        self.api_key = api_key
        self.headers = {"Authorization": self.api_key}  # Raw API key in the Authorization header (no "Bearer" prefix)

    def _get(self, endpoint: str, params: dict = None):
        """
        Internal method to send a GET request to SimFin API.
        """
        url = self.BASE_URL + endpoint
        response = requests.get(url, headers=self.headers, params=params)

        if response.status_code == 200:
            return response.json()
        else:
            print(f"Error {response.status_code}: {response.text}")
            return None

    def get_sim_id(self, ticker: str):
        """
        Fetch SimFin ID for a given ticker.
        """
        endpoint = "companies/general/verbose"
        params = {"ticker": ticker}
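        data = self._get(endpoint, params)
        # The endpoint returns a list of matching companies; a failed request or unknown ticker yields nothing
        if not data:
            return None
        company = data[0]
        print(company)
        return company.get("id")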

    def get_share_prices(self, ticker: str, start: str, end: str):
        """
        Fetch historical share prices for a given ticker and time range.
        Returns a DataFrame.
        """
        sim_id = self.get_sim_id(ticker)
        if not sim_id:
            print(f"Error: Could not find SimFin ID for ticker {ticker}")
            return None

        endpoint = "companies/prices/compact"
        params = {"id": sim_id, "start": start, "end": end}
        data = self._get(endpoint, params)
        print(data)
        if not data:
            print(f"Error: No price data returned for ticker {ticker}")
            return None

        # Extract columns and rows from the first (only) company in the response
        columns = data[0]["columns"]
        rows = data[0]["data"]

        # Create DataFrame
        df = pd.DataFrame(rows, columns=columns)

        # Keep only the 'Date' and 'Last Closing Price' columns (copy to avoid SettingWithCopyWarning)
        df = df[["Date", "Last Closing Price"]].copy()

        # Convert 'Date' column to datetime format
        df["Date"] = pd.to_datetime(df["Date"])

        return df

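    def get_financial_statement(self, ticker: str, start: str, end: str):
        """
        Fetch financial statements (Income, Balance Sheet, Cash Flow) for a given ticker and time range.
        Returns a DataFrame with one row per statement and reporting period.
        """
        # Same endpoint and query parameters as the raw requests calls later in this notebook
        endpoint = "companies/statements/compact"
        params = {"ticker": ticker, "statements": "BS,CF,PL", "period": "", "start": start, "end": end}
        data = self._get(endpoint, params)
        if not data:
            print(f"Error: No statement data returned for ticker {ticker}")
            return None

        # The response is a list of companies, each holding a list of statements
        rows = []
        for stmt in data[0].get("statements", []):
            for values in stmt["data"]:
                row = {"Statement": stmt["statement"]}
                row.update(zip(stmt["columns"], values))
                rows.append(row)
        return pd.DataFrame(rows)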
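
The compact endpoints return a list with one entry per company, each carrying a `columns` list and a `data` list of rows, as the AAPL price response printed in this notebook shows. The following is a minimal sketch of a helper that turns such a payload into a DataFrame without touching the network, which makes the parsing step easy to check in isolation. The name `compact_to_frame` is hypothetical, and the sample payload is abbreviated from that printed response.

```python
import pandas as pd

def compact_to_frame(payload):
    """Convert a 'compact' price payload (list of company dicts) to a DataFrame."""
    if not payload:
        # None (failed request) or an empty list (no matching company)
        return pd.DataFrame()
    company = payload[0]
    df = pd.DataFrame(company["data"], columns=company["columns"])
    if "Date" in df.columns:
        df["Date"] = pd.to_datetime(df["Date"])
    return df

# Sample abbreviated from the AAPL response printed in this notebook
sample = [{
    "columns": ["Date", "Last Closing Price", "Trading Volume"],
    "ticker": "AAPL",
    "data": [
        ["2019-04-01", 47.81, 111447856],
        ["2019-04-02", 48.51, 91062928],
    ],
}]

prices = compact_to_frame(sample)
print(prices)
```

Keeping the parsing separate from the HTTP call means the wrapper methods can share it, and it can be tested with a saved response instead of a live request.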


In [6]:
test = PySimFin("YOUR_API_KEY")  # Replace with your valid API key
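
A key typed directly into a notebook cell is easy to leak when the notebook is shared or deployed. One possible alternative, sketched here under the assumption that the key is stored in an environment variable, is to read it at runtime. The variable name `SIMFIN_API_KEY` and the helper `load_api_key` are hypothetical, not part of SimFin.

```python
import os

def load_api_key(env_var: str = "SIMFIN_API_KEY") -> str:
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your SimFin API key")
    return key

# Demonstration only: set a dummy value so the example runs as-is
os.environ.setdefault("SIMFIN_API_KEY", "dummy-key-for-demo")
api_key = load_api_key()
print(len(api_key) > 0)
```

The client could then be created with `PySimFin(load_api_key())`, and the same variable can be set in the cloud platform's configuration when the system is deployed.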

In [11]:
aapl = test.get_share_prices("AAPL", "2019-04-01", "2019-06-01")
f = test.get_share_prices("F", "2019-04-01", "2019-06-01")

{'id': 111052, 'name': 'APPLE INC', 'ticker': 'AAPL', 'sectorCode': 101001, 'industryName': 'Technology', 'sectorName': 'Computer Hardware', 'endFy': 9, 'numEmployees': 147000, 'companyDescription': 'Apple Inc is an American multinational technology company. It designs, manufactures, and markets mobile communication and media devices, personal computers, and portable digital music players.', 'market': 'US', 'isin': 'US0378331005'}
[{'columns': ['Date', 'Dividend Paid', 'Common Shares Outstanding', 'Last Closing Price', 'Adjusted Closing Price', 'Highest Price', 'Lowest Price', 'Opening Price', 'Trading Volume'], 'name': 'APPLE INC', 'id': 111052, 'ticker': 'AAPL', 'currency': 'USD', 'isin': 'US0378331005', 'data': [['2019-04-01', None, 18429136000, 47.81, 45.81, 47.92, 47.09, 47.91, 111447856], ['2019-04-02', None, 18429136000, 48.51, 46.47, 48.62, 47.76, 47.77, 91062928], ['2019-04-03', None, 18429136000, 48.84, 46.79, 49.12, 48.29, 48.31, 93087320], ['2019-04-04', None, 18429136000, 

In [12]:
f

        Date  Last Closing Price
0 2019-04-01                8.98
1 2019-04-02                9.01
2 2019-04-03                9.13
3 2019-04-04                9.24
4 2019-04-05                9.25
5 2019-04-08                9.30
6 2019-04-09                9.21
7 2019-04-10                9.33
8 2019-04-11                9.39
9 2019-04-12                9.45
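
The feature list used later in this notebook includes `SMA_5`, `SMA_10` and `RSI_14`, which the API does not return and which therefore have to be derived from the price series. A minimal sketch of one way to compute them from the DataFrame above follows. The helper name `add_indicators` is hypothetical, and the RSI here uses plain rolling averages of gains and losses rather than Wilder's smoothing, which is an assumption and may differ from the variant used to train the model.

```python
import pandas as pd

def add_indicators(df, price_col="Last Closing Price"):
    """Return a copy of df with SMA_5, SMA_10 and RSI_14 columns added."""
    out = df.copy()
    close = out[price_col]
    out["SMA_5"] = close.rolling(window=5).mean()
    out["SMA_10"] = close.rolling(window=10).mean()

    # RSI_14 from simple (not Wilder-smoothed) averages of gains and losses
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(window=14).mean()
    avg_loss = delta.clip(upper=0).abs().rolling(window=14).mean()
    rs = avg_gain / avg_loss
    out["RSI_14"] = 100 - 100 / (1 + rs)
    return out

# First ten closing prices of F, taken from the table above
f_sample = pd.DataFrame({
    "Date": pd.to_datetime([
        "2019-04-01", "2019-04-02", "2019-04-03", "2019-04-04", "2019-04-05",
        "2019-04-08", "2019-04-09", "2019-04-10", "2019-04-11", "2019-04-12",
    ]),
    "Last Closing Price": [8.98, 9.01, 9.13, 9.24, 9.25, 9.30, 9.21, 9.33, 9.39, 9.45],
})

with_ind = add_indicators(f_sample)
print(with_ind[["Date", "SMA_5", "SMA_10"]].tail(3))
```

With only ten rows the `RSI_14` column is entirely NaN, since the 14-period window needs at least fifteen prices before it produces a value.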


In [15]:
import requests

url = "https://backend.simfin.com/api/v3/companies/statements/compact?ticker=AAPL&statements=BS,CF&period="

headers = {
    "accept": "application/json",
    "Authorization": "YOUR_API_KEY"  # Replace with your valid API key
}

response = requests.get(url, headers=headers)

print(response.json())

[{'template': 'GENERAL', 'name': 'APPLE INC', 'id': 111052, 'ticker': 'AAPL', 'currency': 'USD', 'isin': 'US0378331005', 'statements': [{'statement': 'BS', 'columns': ['Fiscal Period', 'Fiscal Year', 'Report Date', 'Publish Date', 'Restated', 'Source', 'TTM', 'Value Check', 'Data Model', 'Cash, Cash Equivalents & Short Term Investments', 'Cash & Cash Equivalents', 'Short Term Investments', 'Accounts & Notes Receivable', 'Accounts Receivable, Net', 'Notes Receivable, Net', 'Unbilled Revenues', 'Inventories', 'Raw Materials', 'Work In Process', 'Finished Goods', 'Other Inventory', 'Other Short Term Assets', 'Prepaid Expenses', 'Derivative & Hedging Assets (Short Term)', 'Assets Held-for-Sale', 'Deferred Tax Assets (Short Term)', 'Income Taxes Receivable', 'Discontinued Operations (Short Term)', 'Miscellaneous Short Term Assets', 'Total Current Assets', 'Property, Plant & Equipment, Net', 'Property, Plant & Equipment', 'Accumulated Depreciation', 'Long Term Investments & Receivables', 'Lo

In [31]:
# Features to keep (duplicates removed, sorted alphabetically)
features_to_include = [
    'Adj. Close', 'Cash from (Repurchase of) Equity', 'Change in Accounts Receivable',
    'Change in Fixed Assets & Intangibles', 'Change in Other', 'Change in Working Capital',
    'Close', 'Depreciation & Amortization', 'Dividend', 'High',
    'Income (Loss) from Continuing Operations', 'Income Tax (Expense) Benefit, Net',
    'Inventories', 'Net Cash from Acquisitions & Divestitures', 'Net Cash from Financing Activities',
    'Net Cash from Investing Activities', 'Net Cash from Operating Activities',
    'Net Change in Cash', 'Net Income', 'Net Income (Common)', 'Net Income/Starting Line',
    'Non-Cash Items', 'Open', 'Operating Expenses', 'Operating Income (Loss)',
    'Other Long Term Assets', 'RSI_14', 'SMA_10', 'SMA_5', 'Selling, General & Administrative',
    'Share Capital & Additional Paid-In Capital', 'Shares (Diluted)', 'Shares (Diluted)_cashflow',
    'Shares (Diluted)_income', 'Short Term Debt', 'Total Current Liabilities',
    'Total Noncurrent Liabilities', 'Volume',
]


import pandas as pd
import requests

# API endpoint
url = "https://backend.simfin.com/api/v3/companies/statements/compact?ticker=AAPL&statements=BS,CF,PL&period=&fyear=2021&start=2020-01-01&end=2023-12-31"

# API headers with authorization
headers = {
    "accept": "application/json",
    "Authorization": "YOUR_API_KEY"  # Replace with your valid API key
}

# Fetch data from the API
response = requests.get(url, headers=headers)
json_data = response.json()

# features_to_include (defined at the top of this cell) lists the columns to keep

# Create an empty list to store the data
statement_data = []

# Extract statement and data from JSON
for company in json_data:
    for stmt in company['statements']:
        statement_type = stmt['statement']
        columns = stmt['columns']
        # Iterate over all rows in the data list
        for data_row in stmt['data']:
            # Create a dictionary with all data for this row
            full_dict = {'Statement': statement_type}
            full_dict.update(dict(zip(columns, data_row)))
            # Filter to only include desired features
            row_dict = {key: full_dict[key] for key in features_to_include if key in full_dict}
            statement_data.append(row_dict)

# Create DataFrame
df = pd.DataFrame(statement_data)

# Display the first few rows
print(df.head())

# Display the number of rows and columns
print(f"\nNumber of rows: {len(df)}")
print(f"Number of columns: {len(df.columns)}")

# Optional: Display column names
print("\nColumn names:")
print(list(df.columns))

# Optional: Save to CSV
df.to_csv('filtered_statement_data.csv', index=False)

   Total Current Liabilities  Share Capital & Additional Paid-In Capital  \
0               1.323330e+11                                5.174400e+10   
1               1.062803e+11                                5.420300e+10   
2               1.076636e+11                                5.498900e+10   
3               1.254810e+11                                5.736500e+10   
4                        NaN                                         NaN   

    Inventories  Short Term Debt  Total Noncurrent Liabilities  \
0  4.973000e+09     1.277573e+10                  5.604200e+10   
1  5.219000e+09     1.301225e+10                  5.295300e+10   
2  5.178000e+09     1.602464e+10                  5.205400e+10   
3  6.580000e+09     1.715008e+10                  1.624310e+11   
4           NaN              NaN                           NaN   

   Other Long Term Assets  Non-Cash Items  Change in Working Capital  \
0            4.327000e+10             NaN                        NaN   
1 

In [22]:
import requests
import pandas as pd

# API endpoint
url = "https://backend.simfin.com/api/v3/companies/statements/compact?ticker=AAPL&statements=BS,CF&period="

# API headers with authorization
headers = {
    "accept": "application/json",
    "Authorization": "YOUR_API_KEY"  # Replace with your valid API key
}

# Fetch data from the API
response = requests.get(url, headers=headers)

# Check if request was successful
if response.status_code == 200:
    data = response.json()
    
    # Inspect the structure of the response
    if isinstance(data, dict) and 'statements' in data:
        statements = data['statements']
        
        # Flatten the JSON into a list of dictionaries
        records = []
        for statement_type in statements:
            for entry in statements[statement_type]:
                records.append(entry)

        # Convert the extracted data into a DataFrame
        df = pd.DataFrame(records)

        # Define the features to keep
        features = {
            'Adj. Close', 'Cash from (Repurchase of) Equity', 'Change in Accounts Receivable',
            'Change in Fixed Assets & Intangibles', 'Change in Other', 'Change in Working Capital',
            'Close', 'Depreciation & Amortization', 'Dividend', 'High',
            'Income (Loss) from Continuing Operations', 'Income Tax (Expense) Benefit, Net',
            'Inventories', 'Net Cash from Acquisitions & Divestitures', 'Net Cash from Financing Activities',
            'Net Cash from Investing Activities', 'Net Cash from Operating Activities',
            'Net Change in Cash', 'Net Income', 'Net Income (Common)', 'Net Income/Starting Line',
            'Non-Cash Items', 'Open', 'Operating Expenses', 'Operating Income (Loss)',
            'Other Long Term Assets', 'RSI_14', 'SMA_10', 'SMA_5', 'Selling, General & Administrative',
            'Share Capital & Additional Paid-In Capital', 'Shares (Diluted)', 'Shares (Diluted)_cashflow',
            'Shares (Diluted)_income', 'Short Term Debt', 'Total Current Liabilities',
            'Total Noncurrent Liabilities', 'Volume'
        }

        # Filter dataframe using the defined features (only keeping columns that exist in df)
        filtered_df = df[[col for col in features if col in df.columns]]

        # Save the filtered data to a CSV file
        filtered_df.to_csv("filtered_data.csv", index=False)

        # Display the first few rows
        print(filtered_df.head())

    else:
        print("Unexpected API response format:", data)

else:
    print(f"Error: Unable to fetch data (Status Code: {response.status_code})")
    print(response.text)


Unexpected API response format: [{'template': 'GENERAL', 'name': 'APPLE INC', 'id': 111052, 'ticker': 'AAPL', 'currency': 'USD', 'isin': 'US0378331005', 'statements': [{'statement': 'BS', 'columns': ['Fiscal Period', 'Fiscal Year', 'Report Date', 'Publish Date', 'Restated', 'Source', 'TTM', 'Value Check', 'Data Model', 'Cash, Cash Equivalents & Short Term Investments', 'Cash & Cash Equivalents', 'Short Term Investments', 'Accounts & Notes Receivable', 'Accounts Receivable, Net', 'Notes Receivable, Net', 'Unbilled Revenues', 'Inventories', 'Raw Materials', 'Work In Process', 'Finished Goods', 'Other Inventory', 'Other Short Term Assets', 'Prepaid Expenses', 'Derivative & Hedging Assets (Short Term)', 'Assets Held-for-Sale', 'Deferred Tax Assets (Short Term)', 'Income Taxes Receivable', 'Discontinued Operations (Short Term)', 'Miscellaneous Short Term Assets', 'Total Current Assets', 'Property, Plant & Equipment, Net', 'Property, Plant & Equipment', 'Accumulated Depreciation', 'Long Term

In [27]:
import pandas as pd
import requests

# API endpoint
url = "https://backend.simfin.com/api/v3/companies/statements/compact?ticker=AAPL&statements=BS,CF&period=&start=2020-01-01&end=2023-12-31"

# API headers with authorization
headers = {
    "accept": "application/json",
    "Authorization": "YOUR_API_KEY"  # Replace with your valid API key
}

# Fetch data from the API
response = requests.get(url, headers=headers)
json_data = response.json()

# Create an empty list to store the data
statement_data = []

# Extract statement and data from JSON
for company in json_data:
    for stmt in company['statements']:
        statement_type = stmt['statement']
        columns = stmt['columns']
        # Iterate over all rows in the data list
        for data_row in stmt['data']:
            # Create a dictionary with statement type and its data
            row_dict = {'Statement': statement_type}
            row_dict.update(dict(zip(columns, data_row)))
            statement_data.append(row_dict)

# Create DataFrame
df = pd.DataFrame(statement_data)

# Display the first few rows
print(df.head())

# Display the number of rows and columns
print(f"\nNumber of rows: {len(df)}")
print(f"Number of columns: {len(df.columns)}")

# Optional: Display column names
print("\nColumn names:")
print(list(df.columns))

# Optional: Save to CSV
df.to_csv('statement_data.csv', index=False)

  Statement Fiscal Period  Fiscal Year Report Date Publish Date  Restated  \
0        BS            Q1         2021  2020-12-31   2021-01-28         1   
1        BS            Q1         2022  2021-12-31   2022-01-28         1   
2        BS            Q1         2023  2022-12-31   2023-02-03         1   
3        BS            Q1         2024  2023-12-31   2024-02-02         1   
4        BS            Q2         2020  2020-03-31   2020-05-01         1   

                                              Source  TTM  Value Check  \
0  https://www.sec.gov/Archives/edgar/data/320193...    0            1   
1  https://www.sec.gov/Archives/edgar/data/320193...    0            1   
2  https://www.sec.gov/Archives/edgar/data/320193...    0            1   
3  https://www.sec.gov/Archives/edgar/data/320193...    0            1   
4  https://www.sec.gov/Archives/edgar/data/320193...    0            1   

   Data Model  ...  Increase in Capital Stock  Decrease in Capital Stock  \
0      0.3182  .
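
The DataFrame built in the cell above stacks balance-sheet and cash-flow rows, so each row is populated only in the columns of its own statement. That is why the filtered output earlier in this notebook contains rows of NaN values. If the model expects one row per reporting period, the stacked rows could be collapsed as in the minimal sketch below. The helper name `one_row_per_period` is hypothetical, the sample values are illustrative rather than real data, and the sketch assumes the `Fiscal Year` and `Fiscal Period` columns identify a period uniquely.

```python
import pandas as pd

def one_row_per_period(df, keys=("Fiscal Year", "Fiscal Period")):
    """Collapse stacked statement rows into a single row per fiscal period."""
    # Drop the statement label, then keep the first non-null value in each column
    values = df.drop(columns=["Statement"], errors="ignore")
    return values.groupby(list(keys), as_index=False).first()

nan = float("nan")

# Illustrative values only, shaped like the stacked DataFrame above
stacked = pd.DataFrame([
    {"Statement": "BS", "Fiscal Year": 2021, "Fiscal Period": "Q1", "Inventories": 10.0, "Net Change in Cash": nan},
    {"Statement": "CF", "Fiscal Year": 2021, "Fiscal Period": "Q1", "Inventories": nan, "Net Change in Cash": 2.5},
    {"Statement": "BS", "Fiscal Year": 2021, "Fiscal Period": "Q2", "Inventories": 12.0, "Net Change in Cash": nan},
    {"Statement": "CF", "Fiscal Year": 2021, "Fiscal Period": "Q2", "Inventories": nan, "Net Change in Cash": -1.0},
])

wide = one_row_per_period(stacked)
print(wide)
```

`GroupBy.first()` skips missing values by default, so each output row takes the balance-sheet figure from the BS row and the cash-flow figure from the CF row of the same period.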