## 📊 **Financial Report Generation with Economic Indicators**

### **Overview**
This project focuses on creating a concise financial report for companies or stocks using the latest economic and market data. By leveraging open-source tools and APIs, we aim to simplify the process without relying on training or fine-tuning large language models (LLMs) or machine learning models.

---

### **Objectives**
- Build a financial report using real-time economic indicators from the **Financial Modeling Prep API**.
- Streamline data processing and retrieval to produce accurate and actionable insights.
- Avoid the computational overhead of training custom AI models by utilizing pre-trained open-source models.

---

### **Methodology**
1. **Data Retrieval**:  
   Fetch the latest company metrics and market economic indicators using the Financial Modeling Prep API.

2. **Data Preprocessing**:  
   Process the retrieved data using Python and save it in a structured CSV format.

3. **Vector Database**:  
   Load the processed data into a vector database using an embedding model from Hugging Face.

4. **RAG QA Chain**:  
   Build a Retrieval-Augmented Generation (RAG) architecture with **LangChain** and the **Falcon 7B LLM**.

5. **Evaluation**:  
   Query the RAG system and evaluate the quality and relevance of the responses.


### Installing Dependencies and Packages

#### Dependencies


- Install Anaconda from [Anaconda](https://www.anaconda.com/download/success)
- Create a conda virtual environment with `conda create -n finance-venv`
- Activate the conda virtual environment with `conda activate finance-venv`
- Install Rust from [Rust](https://rustup.rs/) 
- Install transformers from conda with `conda install -c huggingface transformers`
- Install sentence-transformers from conda with `conda install -c conda-forge sentence-transformers`


#### Python Packages
- langchain
- langchain-community
- langchain-core
- pandas
- python-dotenv
- torch
- torchvision
- torchaudio
- chromadb
- sentence-transformers

In [None]:
%pip install langchain langchain-community langchain-core pandas python-dotenv chromadb

In [None]:
%pip install --upgrade --force-reinstall torch torchvision torchaudio

### Importing Packages

In [8]:
from urllib.request import urlopen
import json
import pandas as pd
from urllib.error import URLError, HTTPError
import ssl
from dotenv import load_dotenv
import os
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.document_loaders import CSVLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain_community.llms import HuggingFaceHub
from IPython.display import display, Markdown
import warnings
warnings.filterwarnings('ignore')


### Settings for Financial Modeling Prep

- Create an account on [Financial Modeling Prep](https://site.financialmodelingprep.com/)
- Create a file named **.env** in the project folder
- Set the API key in this file as `FINANCIAL_MODELING_PREP_API_KEY=YOUR_KEY`

In [2]:
load_dotenv()

API_KEY = os.getenv("FINANCIAL_MODELING_PREP_API_KEY")
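`os.getenv` returns `None` without complaint when the variable is missing, and the failure then only shows up later as an HTTP error from the API. A minimal sketch of failing fast instead; `require_env` is a hypothetical helper, not part of python-dotenv or this notebook:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; call it after load_dotenv().
    """
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to the .env file before running the notebook."
        )
    return value
```

With this in place, the assignment above could read `API_KEY = require_env("FINANCIAL_MODELING_PREP_API_KEY")`.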

### Data Retrieval
This step fetches the latest quote data for a single stock ticker from the **Financial Modeling Prep API**. Tickers on the NSE exchange are looked up through the search endpoint, and all other tickers go through the quote endpoint. The JSON response is then loaded into a pandas DataFrame for further processing.


In [3]:
TICKER = "NVDA"
EXCHANGE = "US"

def get_economic_data(ticker, exchange):
    """Fetch data for `ticker`; return the parsed JSON, or None on failure."""
    # NSE tickers use the search endpoint; all other tickers use the quote endpoint
    if exchange == "NSE":
        url = f"https://financialmodelingprep.com/api/v3/search?query={ticker}&exchange=NSE&apikey={API_KEY}"
    else:
        url = f"https://financialmodelingprep.com/api/v3/quote/{ticker}?apikey={API_KEY}"

    try:
        # Create SSL context
        ssl_context = ssl.create_default_context()

        # Fetch and decode data
        with urlopen(url, context=ssl_context) as response:
            data = response.read().decode("utf-8")
            return json.loads(data)

    except HTTPError as e:
        print(f"HTTP Error: {e.code} - {e.reason}")
    except URLError as e:
        print(f"URL Error: {e.reason}")
    except json.JSONDecodeError as e:
        print(f"JSON Decode Error: {e.msg}")
    except Exception as e:
        print(f"Unexpected error: {str(e)}")

    # Reached only when one of the handlers above caught an error
    return None


economic_data_json = get_economic_data(TICKER, EXCHANGE)
economic_data_df = pd.DataFrame(economic_data_json)
economic_data_df

Unnamed: 0,symbol,name,price,changesPercentage,change,dayLow,dayHigh,yearHigh,yearLow,marketCap,...,exchange,volume,avgVolume,open,previousClose,eps,pe,earningsAnnouncement,sharesOutstanding,timestamp
0,NVDA,NVIDIA Corporation,137.01,-2.0868,-2.92,134.71,139.02,152.89,47.32,3355374900000,...,NASDAQ,169431279,224910079,138.555,139.93,2.54,53.94,2025-02-26T21:00:00.000+0000,24490000000,1735333202
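`get_economic_data` interpolates the ticker straight into the URL, so a ticker containing a space or `&` would produce a malformed request. A minimal sketch of building the same two URLs with percent-encoding from the standard library; `build_quote_url` and `BASE_URL` are hypothetical names, and the endpoint paths are the ones used above:

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://financialmodelingprep.com/api/v3"


def build_quote_url(ticker: str, exchange: str, api_key: str) -> str:
    """Build the request URL for a ticker (hypothetical helper)."""
    if exchange == "NSE":
        # Search endpoint: every value travels as an encoded query parameter
        params = {"query": ticker, "exchange": "NSE", "apikey": api_key}
        return f"{BASE_URL}/search?{urlencode(params)}"
    # Quote endpoint: the ticker is a path segment, so it is quoted separately
    return f"{BASE_URL}/quote/{quote(ticker, safe='')}?{urlencode({'apikey': api_key})}"


print(build_quote_url("NVDA", "US", "demo"))
# https://financialmodelingprep.com/api/v3/quote/NVDA?apikey=demo
```

Keeping URL construction in its own function also makes it possible to check the request format without a network call or a real API key.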


### Preprocessing Data

Convert the `timestamp` and `earningsAnnouncement` columns of the DataFrame to datetime values.

In [4]:
def preprocess_economic_data(df):
    # `timestamp` is a Unix epoch in seconds; without unit='s' pandas reads it as nanoseconds
    df['timestamp'] = pd.to_datetime(df['timestamp'], unit='s')
    df['earningsAnnouncement'] = pd.to_datetime(df['earningsAnnouncement'])
    return df

preprocessed_economic_data_df = preprocess_economic_data(economic_data_df)
preprocessed_economic_data_df

Unnamed: 0,symbol,name,price,changesPercentage,change,dayLow,dayHigh,yearHigh,yearLow,marketCap,...,exchange,volume,avgVolume,open,previousClose,eps,pe,earningsAnnouncement,sharesOutstanding,timestamp
0,NVDA,NVIDIA Corporation,137.01,-2.0868,-2.92,134.71,139.02,152.89,47.32,3355374900000,...,NASDAQ,169431279,224910079,138.555,139.93,2.54,53.94,2025-02-26 21:00:00+00:00,24490000000,2024-12-27 21:00:02
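The raw `timestamp` value returned for the quote is the integer `1735333202`. Its magnitude suggests a Unix epoch in seconds (an assumption here, not something the API response states). When pandas parses an integer without a unit it treats it as nanoseconds, which lands a fraction of a second after 1970-01-01. A small standard-library sketch of the two readings:

```python
from datetime import datetime, timedelta, timezone

raw_timestamp = 1735333202  # value of the `timestamp` field in the quote

# Reading 1: seconds since the Unix epoch (the assumption made here)
as_seconds = datetime.fromtimestamp(raw_timestamp, tz=timezone.utc)

# Reading 2: nanoseconds since the epoch, truncated to microseconds
# because datetime cannot hold finer resolution
as_nanoseconds = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(
    microseconds=raw_timestamp // 1000
)

print(as_seconds.isoformat())      # 2024-12-27T21:00:02+00:00
print(as_nanoseconds.isoformat())  # 1970-01-01T00:00:01.735333+00:00
```

In pandas, the seconds reading corresponds to `pd.to_datetime(df['timestamp'], unit='s')`.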


### Storing Preprocessed Data

Storing the preprocessed data as a CSV file

In [7]:
os.makedirs("data/processed", exist_ok=True)  # create the output folder if it is missing
preprocessed_economic_data_df.to_csv("data/processed/eco_ind.csv")

### Embeddings

Initializing Embeddings

In [5]:
# Using the LangChain CSV document loader to turn each CSV row into a document
loader_eco = CSVLoader('data/processed/eco_ind.csv')
documents_eco = loader_eco.load()

# Initializing text splitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=50, chunk_overlap=5)

# Splitting documents with text splitter
texts_eco = text_splitter.split_documents(documents_eco)

# Initializing Embeddings
embeddings = HuggingFaceEmbeddings()
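It helps to see what the text being embedded looks like. The sketch below approximates, with the standard library only, how a CSV row becomes one `column: value` text per row; it is an illustration of the idea, not the `CSVLoader` implementation, and `rows_to_texts` is a hypothetical name. The sample CSV keeps the unnamed index column that `to_csv` writes by default:

```python
import csv
import io

# Shortened sample in the same shape as eco_ind.csv (empty first header = index column)
sample_csv = ",symbol,name,price\n0,NVDA,NVIDIA Corporation,137.01\n"


def rows_to_texts(csv_text: str) -> list[str]:
    """Approximate how a CSV loader turns each row into one text document."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return ["\n".join(f"{key}: {value}" for key, value in row.items()) for row in reader]


texts = rows_to_texts(sample_csv)
print(texts[0])
# : 0
# symbol: NVDA
# name: NVIDIA Corporation
# price: 137.01
```

The leading `: 0` line comes from the unnamed index column. With `chunk_size=50`, a full row of this kind is likely split into several small chunks, so each retrieved chunk may carry only a few fields.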



### Vector Database

Initializing a vector database and storing the document embeddings in it

In [7]:
persist_directory = 'docs/chroma_rag/'

economic_langchain_chroma = Chroma.from_documents(
    documents=texts_eco, 
    collection_name="economic_data",
    embedding=embeddings,
    persist_directory=persist_directory
)
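Conceptually, a vector store keeps one embedding per chunk and answers a query by returning the chunks whose embeddings are closest to the query embedding. The toy sketch below ranks by cosine similarity in plain Python. The three-dimensional vectors are made up for illustration, and Chroma's actual distance metric and indexing may differ:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], store: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k stored texts most similar to the query vector."""
    scored = [(cosine_similarity(query, vector), text) for text, vector in store.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Made-up embeddings; a real embedding model produces hundreds of dimensions
store = {
    "symbol: NVDA": [0.9, 0.1, 0.0],
    "price: 137.01": [0.1, 0.9, 0.1],
    "pe: 53.94": [0.2, 0.8, 0.3],
}
query_vector = [0.0, 1.0, 0.2]

print(top_k(query_vector, store, k=2))  # ['price: 137.01', 'pe: 53.94']
```

The `k` argument plays the same role as the `search_kwargs={"k": 2}` setting passed to the retriever later in the notebook.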

### Settings for Hugging Face Hub API

In [10]:
load_dotenv()

HUGGINGFACEHUB_API_KEY = os.getenv("HUGGINGFACEHUB_API_KEY")

### RAG Pipeline

Building the Retrieval-Augmented Generation pipeline

In [15]:
# Initializing the LLM model
llm = HuggingFaceHub(
    repo_id="tiiuae/falcon-7b-instruct",
    model_kwargs={"temperature": 0.1},
    huggingfacehub_api_token = HUGGINGFACEHUB_API_KEY
)

# Initializing the retriever for the RAG pipeline
retriever_eco = economic_langchain_chroma.as_retriever(search_kwargs={"k":2})

# Template prompt for RAG pipeline
template = """You are a Financial Market Expert and Get the Market Economic Data and Market News about Company and Build the Financial Report for me. Understand this Market Information {context} and Answer the Query for this Company {question}. I just need the data into Tabular Form as well."""

# Initializing prompt template
PROMPT = PromptTemplate(input_variables=["context","question"], template=template)

# Initializing the retrieval QA chain
qa_with_sources = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    chain_type_kwargs={"prompt": PROMPT},
    retriever=retriever_eco,
    return_source_documents=True,
)

user_prompt = "Nvidia(NVDA) Financial Report"
llm_response = qa_with_sources.invoke({"query": user_prompt})
llm_response

{'query': 'Nvidia(NVDA) Financial Report',
 'result': "You are a Financial Market Expert and Get the Market Economic Data and Market News about Company and Build the Financial Report for me. Understand this Market Information : 0\nsymbol: NVDA\nname: NVIDIA Corporation\n\nearningsAnnouncement: 2025-02-26 21:00:00+00:00 and Answer the Query for this Company Nvidia(NVDA) Financial Report. I just need the data into Tabular Form as well.\n<p>NVIDIA Corporation (NVDA) is a leading provider of visual computing technologies. The company designs and develops graphics processing units (GPUs), which are used in gaming, professional visualization, and other applications. NVIDIA's GPUs are used in gaming, professional visualization, and other applications. The company's products are used in gaming, professional visualization, and other applications. The company's products are used in gaming, professional visualization, and other applications. The company'",
 'source_documents': [Document(metadata=
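The `result` string begins with the filled-in prompt, followed by the model's own text. The sketch below shows, in plain Python, how a "stuff"-style chain could assemble that prompt from the retrieved chunks, and one way to cut the echoed prompt off the response. Both functions are hypothetical helpers rather than LangChain APIs, the `"\n\n"` separator is inferred from the output shown here, and the template is shortened to its last two sentences:

```python
TEMPLATE = (
    "Understand this Market Information {context} and Answer the Query "
    "for this Company {question}. I just need the data into Tabular Form as well."
)
TEMPLATE_TAIL = "I just need the data into Tabular Form as well."


def stuff_prompt(template: str, chunks: list[str], question: str) -> str:
    """Join all retrieved chunks into one context string and fill the template."""
    context = "\n\n".join(chunks)
    return template.format(context=context, question=question)


def strip_prompt_echo(result: str, template_tail: str = TEMPLATE_TAIL) -> str:
    """Drop everything up to and including the template's final sentence."""
    _, found, answer = result.partition(template_tail)
    return answer.strip() if found else result.strip()


chunks = [
    ": 0\nsymbol: NVDA\nname: NVIDIA Corporation",
    "earningsAnnouncement: 2025-02-26 21:00:00+00:00",
]
prompt = stuff_prompt(TEMPLATE, chunks, "Nvidia(NVDA) Financial Report")

# Simulate a response that echoes the prompt before the generated text
raw_result = prompt + "\n<p>NVIDIA Corporation (NVDA) is a leading provider"
print(strip_prompt_echo(raw_result))
# <p>NVIDIA Corporation (NVDA) is a leading provider
```

With only two short chunks in the context, the model sees the symbol, the name, and the earnings date but no price or valuation fields, which may explain why the generated text is generic rather than a report built from the retrieved data.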

In [14]:
Markdown(llm_response['result'])

You are a Financial Market Expert and Get the Market Economic Data and Market News about Company and Build the Financial Report for me. Understand this Market Information : 0
symbol: NVDA
name: NVIDIA Corporation

earningsAnnouncement: 2025-02-26 21:00:00+00:00 and Answer the Query for this Company Nvidia(NVDA) Financial Report. I just need the data into Tabular Form as well.
<p>NVIDIA Corporation (NVDA) is a leading provider of visual computing technologies. The company designs and develops graphics processing units (GPUs), which are used in gaming, professional visualization, and other applications. NVIDIA's GPUs are used in gaming, professional visualization, and other applications. The company's products are used in gaming, professional visualization, and other applications. The company's products are used in gaming, professional visualization, and other applications. The company'