# Gain Your Edge by Screening Stocks Using Any Document 🚀
### From Pixels to Ratios: Calculate Financial Ratios Directly From Images 📈
Valuable financial data doesn't always live in clean, structured tables. It's often locked away in PDFs 📄, Excel, Word, Google Sheets, investor reports, and even screenshots 🖼️ — mostly unstructured, inaccessible, and expensive to process.

For decades, the answer was either costly contracts with data vendors 💸 or countless hours of error-prone manual data entry 😫. But what if you could bypass all of that?

## What if you could treat an image of a balance sheet just like a database table? 🤯

This notebook demonstrates the new frontier of financial analysis. We will use Snowflake's built-in **AISQL** capabilities 🤖 to feed a multimodal Large Language Model an image of a company's balance sheet. With a simple prompt, we will direct the AI to:

- ### See and understand the contents of the image. 👀
- ### Extract the specific values for Total Liabilities and Total Equity. 🔢
- ### Calculate the Debt-to-Equity ratio on the fly. 🧮

### No manual transcription. No complex data pipelines. Just a picture, a prompt, and powerful insights delivered in seconds. ⚡

Ready to unlock the data trapped in your documents? Let's dive in and run the cells below! 🚀


# Goals
- ## Calculate the Debt-to-Equity Ratio Using a Balance Sheet Image. 
- ## Screen for Stocks with a Certain Debt-to-Equity Ratio. 
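
As a quick reference for the first goal, the ratio itself is simply total liabilities divided by total equity. The following is a minimal Python sketch of that arithmetic; the function name `debt_to_equity` is hypothetical, and the figures are the illustrative ones used in the prompt's example later in this notebook, not values from any real balance sheet.

```python
from typing import Optional


def debt_to_equity(total_liabilities: float, total_equity: float) -> Optional[float]:
    """Return total liabilities / total equity rounded to 3 places, or None if undefined."""
    if total_equity == 0:
        return None  # the ratio is undefined when equity is zero
    return round(total_liabilities / total_equity, 3)


# Illustrative figures only (the same ones used in the prompt's example)
print(debt_to_equity(100_000, 50_000))  # 2.0
```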

# Enter Snowflake's Cortex AISQL
Snowflake Cortex **AISQL** allows users to run powerful analytics on unstructured text and image data using simple SQL functions. It provides built-in, serverless access to a suite of industry-leading Large Language Models (LLMs) from OpenAI, Anthropic, Meta, Mistral AI, and others.

## Key Use Cases:

- ### Data Enrichment: 
    - Extract entities (names, dates, locations) to enhance metadata and streamline data validation processes.
- ### Customer Insights: 
    - Perform sentiment and aspect-based analysis on customer tickets and feedback to drive service improvements.
- ### Content Intelligence: 
    - Filter, classify, and summarize content using natural language commands.
- ### Support Global Operations: 
    - Translate and localize multilingual content for international audiences.
- ### Advanced AI Pipelines: 
    - Parse and chunk documents to prepare them for analytics or to be used in Retrieval-Augmented Generation (RAG) applications.

## Core Benefits:

- ### Unified & Secure: 
    - Models are hosted within the Snowflake ecosystem. Your data stays in place, ensuring maximum security, governance, and compliance.
- ### Performance at Scale: 
    - Leverage Snowflake's scalable compute engine for high-performance inference without managing complex infrastructure.
- ### Simplicity: 
    - No need for separate AI/ML tools or data movement. If you know SQL, you can leverage state-of-the-art AI.


# AISQL Functions
Task-specific functions are purpose-built, managed functions that automate routine tasks, such as simple summaries and quick translations, that don’t require any customization.

## AI_COMPLETE: 
- Generates a completion for a given text string or image using a selected LLM. Use this function for most generative AI tasks.
- This is the updated version of COMPLETE (SNOWFLAKE.CORTEX).

## AI_CLASSIFY: 
- Classifies text or images into user-defined categories.
- This is the updated version of CLASSIFY_TEXT (SNOWFLAKE.CORTEX) with support for multi-label and image classification.

## AI_FILTER: 
- Returns True or False for a given text or image input, allowing you to filter results in SELECT, WHERE, or JOIN ... ON clauses.

## AI_AGG: 
- Aggregates a text column and returns insights across multiple rows based on a user-defined prompt. This function isn’t subject to context window limitations.

## AI_SUMMARIZE_AGG: 
- Aggregates a text column and returns a summary across multiple rows. This function isn’t subject to context window limitations.

## AI_SIMILARITY: 
- Calculates the embedding similarity between two inputs.

## PARSE_DOCUMENT (SNOWFLAKE.CORTEX): 
- Extracts text (using OCR mode) or text with layout information (using LAYOUT mode) from documents in an internal or external stage.

## TRANSLATE (SNOWFLAKE.CORTEX): 
- Translates text between supported languages.

## SENTIMENT (SNOWFLAKE.CORTEX): 
- Extracts sentiment scores from text.

## EXTRACT_ANSWER (SNOWFLAKE.CORTEX): 
- Extracts the answer to a question from unstructured data, provided that the relevant data exists.

## SUMMARIZE (SNOWFLAKE.CORTEX): 
- Returns a summary of the text that you’ve specified.

### Source: [Snowflake Documentation](https://docs.snowflake.com/en/user-guide/snowflake-cortex/aisql#aisql-functions)

# Demo - Calculate Financial Ratios From Images of Balance Sheets.  


In [None]:
# Streamlit renders the balance sheet images inside the notebook
import streamlit as st
from snowflake.snowpark.context import get_active_session

# Optional imports, not used in this notebook:
# Snowpark pandas API and its modin plugin
# import modin.pandas as spd
# import snowflake.snowpark.modin.plugin
# import matplotlib.pyplot as plt

# Get the active Snowpark session
session = get_active_session()

## Balance Sheets for Real Estate Investment Trusts (REIT)
### FY 2024 - Year-Ending December 2024.  
- Welltower Inc. (NYSE: [WELL](https://www.google.com/finance/quote/WELL:NYSE?window=1Y))
- Prologis Inc. (NYSE: [PLD](https://www.google.com/finance/quote/PLD:NYSE?window=1Y))
- Equinix Inc. (NASDAQ: [EQIX](https://www.google.com/finance/quote/EQIX:NASDAQ?window=1Y))
- Digital Realty Trust (NYSE: [DLR](https://www.google.com/finance/quote/DLR:NYSE?window=1Y))
- American Tower Corporation (NYSE: [AMT](https://www.google.com/finance/quote/AMT:NYSE?window=1Y))

In [None]:
image = session.file.get_stream("@FINANCIAL_STMTS_IMAGES_INT_STG/DLR_BALANCE_SHEET_ENDING_DECEMBER_2024.png" , decompress=False).read() 
# Display the image
st.image(image)

In [None]:
image = session.file.get_stream("@FINANCIAL_STMTS_IMAGES_INT_STG/EQIX_BALANCE_SHEET_ENDING_DECEMBER_2024.png" , decompress=False).read() 
# Display the image
st.image(image)

In [None]:
image = session.file.get_stream("@FINANCIAL_STMTS_IMAGES_INT_STG/AMT_BALANCE_SHEET_ENDING_DECEMBER_2024.png" , decompress=False).read() 
# Display the image
st.image(image)

In [None]:
image = session.file.get_stream("@FINANCIAL_STMTS_IMAGES_INT_STG/WELL_BALANCE_SHEET_ENDING_DECEMBER_2024.png" , decompress=False).read()
# Display the image
st.image(image)

## Prerequisites
- ### Create a Stage in Snowflake.  
```
CREATE STAGE FINANCIAL_STMTS_IMAGES_INT_STG 
	DIRECTORY = ( ENABLE = true ) 
	ENCRYPTION = ( TYPE = 'SNOWFLAKE_SSE' ) 
	COMMENT = 'A Snowflake-managed stage to store the images of balance sheets for various Real Estate Investment Trusts.';
```
- ### Download the [REIT_Balance_Sheet_Image_Files.zip](https://github.com/rrprasan/Finance/tree/main/Snowflake/Notebooks/Company_Financials/Stock_Screening_Using_Images) from GitHub.  
- ### Unzip the file and load the image files into the stage.  

In [None]:
CREATE STAGE FINANCIAL_STMTS_IMAGES_INT_STG 
	DIRECTORY = ( ENABLE = true ) 
	ENCRYPTION = ( TYPE = 'SNOWFLAKE_SSE' ) 
	COMMENT = 'A Snowflake-managed stage to store the images of balance sheets for various Real Estate Investment Trusts.';

## List the Balance Sheet Image Files in the Stage.  

In [None]:
LIST @FINANCIAL_STMTS_IMAGES_INT_STG

## The Prompt
### This prompt returns only the debt-to-equity ratio, without the source data.  
From the provided balance sheet image, perform the following steps:
1.  Identify and extract the numerical value for 'Total Liabilities' for the period ending December 31, 2024.
2.  Identify and extract the numerical value for 'Total Stockholders' Equity' (or 'Total Equity') for the same period.
3.  Calculate the Debt-to-Equity ratio using the formula: Total Liabilities / Total Stockholders' Equity.

After calculating, your entire response must contain ONLY the final numerical value, formatted to three decimal places. Do not include any text, explanation, currency signs, or preamble.

EXAMPLE:
If Total Liabilities is 100,000 and Total Equity is 50,000, the calculation is 100,000 / 50,000 = 2.
Your required output is:
2.000


If either of the required values cannot be found on the balance sheet, your entire response must be the single word: NULL.
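
Because the prompt asks for a bare number or the single word NULL, the reply still arrives as text and has to be converted before it can be compared or sorted. Below is a minimal Python sketch of that conversion, assuming the model follows the format instructions. `parse_ratio_response` is a hypothetical helper, not part of Snowflake or this notebook.

```python
from typing import Optional


def parse_ratio_response(response: str) -> Optional[float]:
    """Convert the model's reply ("2.000" or "NULL") into a float or None."""
    text = response.strip()
    if text.upper() == "NULL":
        return None  # the model could not find one of the required values
    try:
        return float(text)
    except ValueError:
        # The model ignored the format instructions; treat the value as missing
        return None


print(parse_ratio_response("2.000"))           # 2.0
print(parse_ratio_response("NULL"))            # None
print(parse_ratio_response("The ratio is 2"))  # None
```

Treating an off-format reply as missing, rather than raising an error, keeps one bad response from stopping a batch of files.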

In [None]:
SELECT
    relative_path AS FILE_NAME,
    SPLIT_PART(relative_path, '_', 1) AS TICKER_SYMBOL,
    AI_COMPLETE('claude-3-5-sonnet',
        'From the provided balance sheet image, perform the following steps:
1.  Identify and extract the numerical value for \'Total Liabilities\' for the period ending December 31, 2024.
2.  Identify and extract the numerical value for \'Total Stockholders\' Equity\' (or \'Total Equity\') for the same period.
3.  Calculate the Debt-to-Equity ratio using the formula: Total Liabilities / Total Stockholders\' Equity.
After calculating, your entire response must contain ONLY the final numerical value, formatted to three decimal places. Do not include any text, explanation, currency signs, or preamble.
---
**EXAMPLE:**
If Total Liabilities is 100,000 and Total Equity is 50,000, the calculation is 100,000 / 50,000 = 2.
Your required output is:
2.000
---
If either of the required values cannot be found on the balance sheet, your entire response must be the single word: NULL.',
        TO_FILE(file_url)
    ) AS DEBT_TO_EQUITY
FROM
    DIRECTORY(@DEMODB.EQUITY_RESEARCH.FINANCIAL_STMTS_IMAGES_INT_STG);
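
The query above derives `TICKER_SYMBOL` with `SPLIT_PART(relative_path, '_', 1)`, so it relies on every file being named `<TICKER>_...`. The same rule is sketched in Python below so the naming convention is easy to check before uploading files. `ticker_from_path` is a hypothetical helper, and the subfolder path in the second call is made up for illustration.

```python
def ticker_from_path(relative_path: str) -> str:
    """Mirror SPLIT_PART(relative_path, '_', 1): the text before the first underscore."""
    return relative_path.split("_", 1)[0]


print(ticker_from_path("DLR_BALANCE_SHEET_ENDING_DECEMBER_2024.png"))  # DLR

# Caveat: a file stored in a subfolder keeps the folder in the result
print(ticker_from_path("reits/DLR_BALANCE_SHEET_ENDING_DECEMBER_2024.png"))  # reits/DLR
```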

## The Prompt that also returns the source data extracted from the images.
### The extracted Total Liabilities and Total Equity are returned in a JSON object along with the debt-to-equity result.  
From the provided balance sheet image, perform the following steps:
1.  Identify and extract the numerical value for 'Total Liabilities' for the period ending December 31, 2024.
2.  Identify and extract the numerical value for 'Total Stockholders' Equity' (or 'Total Equity') for the same period.
3.  Calculate the Debt-to-Equity ratio using the formula: Total Liabilities / Total Stockholders' Equity.

After calculating, your entire response must contain ONLY a single, minified JSON object with three keys: "total_liabilities", "total_equity", and "debt_to_equity_ratio".
- Do not use thousands separators in the numbers.
- Do not include any text, explanation, or preamble outside the JSON object.


**EXAMPLE:**
- If Total Liabilities is 100,000 and Total Equity is 50,000, the calculation is 100,000 / 50,000 = 2.
- Your required output is:
{"total_liabilities":100000,"total_equity":50000,"debt_to_equity_ratio":2.000}

If any of the required values cannot be found, return a JSON object with null values.

**Failure Example:**
{"total_liabilities":100000,"total_equity":null,"debt_to_equity_ratio":null}
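
Returning the source values alongside the ratio makes it possible to double-check the model's arithmetic. Below is a minimal Python sketch of such a check, assuming the reply is the minified JSON object described above. `check_extraction` and its `tolerance` parameter are hypothetical, and the sample strings are the prompt's own illustrative examples.

```python
import json
from typing import Optional


def check_extraction(response: str, tolerance: float = 0.001) -> Optional[bool]:
    """Return True/False if the reported ratio matches the extracted values, None if unusable."""
    try:
        data = json.loads(response)
    except json.JSONDecodeError:
        return None  # the reply was not valid JSON
    if not isinstance(data, dict):
        return None
    liabilities = data.get("total_liabilities")
    equity = data.get("total_equity")
    ratio = data.get("debt_to_equity_ratio")
    values = (liabilities, equity, ratio)
    # Null or non-numeric values, or zero equity, cannot be checked
    if not all(isinstance(v, (int, float)) for v in values) or equity == 0:
        return None
    # Recompute the ratio and allow for rounding to three decimal places
    return abs(liabilities / equity - ratio) <= tolerance


ok = '{"total_liabilities":100000,"total_equity":50000,"debt_to_equity_ratio":2.000}'
missing = '{"total_liabilities":100000,"total_equity":null,"debt_to_equity_ratio":null}'
print(check_extraction(ok))       # True
print(check_extraction(missing))  # None
```

A `False` result means the model read the two values but miscalculated the ratio, which is a cue to recompute the ratio from the extracted values.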

In [None]:
SELECT
    relative_path AS FILE_NAME,
    SPLIT_PART(relative_path, '_', 1) AS TICKER_SYMBOL,
    AI_COMPLETE('claude-3-5-sonnet',
        'From the provided balance sheet image, perform the following steps:
1.  Identify and extract the numerical value for \'Total Liabilities\' for the period ending December 31, 2024.
2.  Identify and extract the numerical value for \'Total Stockholders\' Equity\' (or \'Total Equity\') for the same period.
3.  Calculate the Debt-to-Equity ratio using the formula: Total Liabilities / Total Stockholders\' Equity.

After calculating, your entire response must contain ONLY a single, minified JSON object with three keys: "total_liabilities", "total_equity", and "debt_to_equity_ratio".
- Do not use thousands separators in the numbers.
- Do not include any text, explanation, or preamble outside the JSON object.

---
**EXAMPLE:**
If Total Liabilities is 100,000 and Total Equity is 50,000, the calculation is 100,000 / 50,000 = 2.
Your required output is:
{"total_liabilities":100000,"total_equity":50000,"debt_to_equity_ratio":2.000}
---

If any of the required values cannot be found, return a JSON object with null values.
**Failure Example:**
{"total_liabilities":100000,"total_equity":null,"debt_to_equity_ratio":null}',
        TO_FILE(file_url) -- pass each staged image to the model as a FILE value
    ) AS DEBT_TO_EQUITY
FROM
    DIRECTORY(@DEMODB.EQUITY_RESEARCH.FINANCIAL_STMTS_IMAGES_INT_STG);

## Screen for Stocks with a Debt-to-Equity Ratio Below 1
### The same query with a WHERE clause that keeps only the rows whose extracted ratio is less than 1.  

In [None]:
SELECT
    relative_path AS FILE_NAME,
    SPLIT_PART(relative_path, '_', 1) AS TICKER_SYMBOL,
    AI_COMPLETE('claude-3-5-sonnet',
        'From the provided balance sheet image, perform the following steps:
1.  Identify and extract the numerical value for \'Total Liabilities\' for the period ending December 31, 2024.
2.  Identify and extract the numerical value for \'Total Stockholders\' Equity\' (or \'Total Equity\') for the same period.
3.  Calculate the Debt-to-Equity ratio using the formula: Total Liabilities / Total Stockholders\' Equity.

After calculating, your entire response must contain ONLY a single, minified JSON object with three keys: "total_liabilities", "total_equity", and "debt_to_equity_ratio".
- Do not use thousands separators in the numbers.
- Do not include any text, explanation, or preamble outside the JSON object.

---
**EXAMPLE:**
If Total Liabilities is 100,000 and Total Equity is 50,000, the calculation is 100,000 / 50,000 = 2.
Your required output is:
{"total_liabilities":100000,"total_equity":50000,"debt_to_equity_ratio":2.000}
---

If any of the required values cannot be found, return a JSON object with null values.
**Failure Example:**
{"total_liabilities":100000,"total_equity":null,"debt_to_equity_ratio":null}',
        TO_FILE(file_url) -- pass each staged image to the model as a FILE value
    ) AS DEBT_TO_EQUITY
FROM
    DIRECTORY(@DEMODB.EQUITY_RESEARCH.FINANCIAL_STMTS_IMAGES_INT_STG)
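WHERE
    -- TRY_PARSE_JSON returns NULL instead of raising an error if the model's reply is not valid JSON
    TRY_PARSE_JSON(DEBT_TO_EQUITY):debt_to_equity_ratio::NUMBER(10,3) < 1;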
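
The `WHERE` clause above keeps only the rows whose reported ratio is below 1. If the JSON replies have already been fetched into Python, the same screen can be applied client-side. The sketch below is one possible way to do that. `screen_by_ratio` is a hypothetical helper, and the tickers and figures in `sample` are made up for illustration, not real results.

```python
import json
from typing import Dict, List


def screen_by_ratio(responses: Dict[str, str], max_ratio: float = 1.0) -> List[str]:
    """Return the tickers whose reported debt-to-equity ratio is below max_ratio."""
    selected = []
    for ticker, response in responses.items():
        try:
            ratio = json.loads(response).get("debt_to_equity_ratio")
        except (json.JSONDecodeError, AttributeError):
            continue  # skip replies that are not a JSON object
        # Null ratios (values not found on the balance sheet) are skipped too
        if isinstance(ratio, (int, float)) and ratio < max_ratio:
            selected.append(ticker)
    return sorted(selected)


# Made-up replies for hypothetical tickers, not real results
sample = {
    "AAA": '{"total_liabilities":40,"total_equity":100,"debt_to_equity_ratio":0.400}',
    "BBB": '{"total_liabilities":300,"total_equity":100,"debt_to_equity_ratio":3.000}',
    "CCC": '{"total_liabilities":100,"total_equity":null,"debt_to_equity_ratio":null}',
}
print(screen_by_ratio(sample))  # ['AAA']
```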

## Download the Notebook from GitHub:
https://github.com/rrprasan/Finance/tree/main/Snowflake/Notebooks/Company_Financials/Stock_Screening_Using_Images

## Try it today!  