# Capstone Project

Welcome to the final project.  
Here you will use everything you have learned so far, and more!


![image.png](attachment:image.png)

In [None]:
import os

from datasets import load_dataset
from dotenv import load_dotenv

In [None]:
# Load environment variables from the .env file
load_dotenv()

# Model name and Azure OpenAI deployment name
llm_model_name = "gpt-5-nano"
llm_deploy_name = "gpt-5-nano"

# Azure OpenAI connection settings; the API key is read from OPENAI_API_KEY
aoai_api_key = os.getenv("OPENAI_API_KEY")
aoai_endpoint = "https://aa-dsa-training-msca.openai.azure.com/"
aoai_api_version = "2024-12-01-preview"
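
Note that `os.getenv` returns `None` when a variable is missing, so a forgotten `.env` entry only shows up later as an authentication error. A minimal sketch of failing early instead, assuming a hypothetical helper named `require_env` (it is not part of this notebook or of any library):

```python
import os


def require_env(name: str) -> str:
    """Return the value of the environment variable `name`, or fail loudly.

    Hypothetical helper for illustration only.
    """
    value = os.getenv(name)
    if not value:
        # Covers both an unset variable and an empty string.
        raise RuntimeError(
            f"Environment variable {name!r} is not set. "
            "Add it to your .env file before running the notebook."
        )
    return value
```

It could be used in place of the plain lookup, for example `aoai_api_key = require_env("OPENAI_API_KEY")`.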

## Your Task
[FinanceBench](https://github.com/patronus-ai/financebench/tree/main#) is a benchmark that tests LLMs on a large pool of open-book financial question answering (QA) tasks. According to its authors' research, GPT-4-Turbo used with a retrieval system **incorrectly answered** or **refused to answer** 81% of questions.

To answer these questions, they used [SEC filings from EDGAR](https://www.sec.gov/search-filings). The researchers mostly tested the LLMs in two setups: passing in the whole content (sometimes more than 150 pages of SEC filings), and using a simple retrieval (open book). However, we are certain that using AI agents can improve on these poor results. By a lot!

Your task is, therefore, to make the impossible possible, and the improbable probable:  
Make an agent that can answer questions such as "What is Amazon's FY2017 days payable outstanding (DPO)? DPO is defined as: $$ \frac{365 * (\text{average accounts payable between FY2016 and FY2017})} {(\text{FY2017 COGS + change in inventory between FY2016 and FY2017})} $$ Round your answer to two decimal places. Address the question by using the line items and information shown within the balance sheet and the P&L statement."
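
To make the arithmetic in this example question concrete, here is a minimal sketch of the DPO formula as a plain Python function. The function and parameter names are illustrative assumptions, and the numbers in the usage line are placeholders, not values from any real filing. An agent would still have to locate the actual line items in the balance sheet and the P&L statement.

```python
def days_payable_outstanding(
    accounts_payable_prior: float,
    accounts_payable_current: float,
    cogs_current: float,
    inventory_prior: float,
    inventory_current: float,
) -> float:
    """Compute DPO as defined in the example question, rounded to 2 decimals."""
    # Average accounts payable between the prior and the current fiscal year.
    average_ap = (accounts_payable_prior + accounts_payable_current) / 2
    # Denominator: current-year COGS plus the change in inventory.
    denominator = cogs_current + (inventory_current - inventory_prior)
    if denominator == 0:
        raise ValueError("COGS plus change in inventory is zero; DPO is undefined.")
    return round(365 * average_ap / denominator, 2)


# Placeholder figures for illustration only (NOT from a real filing).
example = days_payable_outstanding(100.0, 200.0, 1000.0, 50.0, 100.0)
print(example)  # 365 * 150 / 1050, rounded to two decimals: 52.14
```

Exposing deterministic calculations like this as a tool, instead of letting the LLM do the arithmetic itself, is one possible design choice for the agent.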

In [None]:
financebench_content_ds = load_dataset("Liadmagen/financebench_content")
financebench_qa_ds = load_dataset("Liadmagen/financebench_QA")
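
Before building anything, it helps to check which splits and columns the two datasets actually contain. The sketch below assumes only that `load_dataset` returned a `DatasetDict`, which behaves like a dictionary of splits whose values expose `num_rows` and `column_names`. The helper name `summarize_splits` is hypothetical, and no column names are assumed here.

```python
def summarize_splits(dataset_dict) -> list[str]:
    """Return one summary line per split of a DatasetDict-like mapping.

    Hypothetical helper: relies only on `.items()`, `.num_rows`
    and `.column_names`.
    """
    lines = []
    for split_name, split in dataset_dict.items():
        lines.append(
            f"{split_name}: {split.num_rows} rows, columns={split.column_names}"
        )
    return lines
```

For example, `print("\n".join(summarize_splits(financebench_qa_ds)))` lists the splits of the QA dataset, and the same call works for `financebench_content_ds`.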

### Tools
Feel free to use *any tool you like*. You can use an existing retrieval framework, such as [llama-index](https://developers.llamaindex.ai/python/framework/optimizing/advanced_retrieval/advanced_retrieval/) with its [agentic tools](https://developers.llamaindex.ai/python/llamaagents/overview/), or use SmolAgents with their [text2sql](https://huggingface.co/docs/smolagents/en/examples/text_to_sql) example for metadata extraction and conversion... Or maybe orchestrate all of these agents with [CrewAI](https://www.crewai.com/open-source)...

The choice is yours!