# AMS-520 Final project

### Team members:
1. Rashtardeep Sandhu
2. Atharva Kulkarni
3. Virti Jain

# Finance Chatbot: Sentiment Analysis and Investment Advice
Welcome to the **Finance Chatbot**! This chatbot uses a **LLaMA 3** model fine-tuned with **LoRA (Low-Rank Adaptation)** to provide:
- **Sentiment Analysis**: Understand the sentiment behind financial news and statements.
- **Investment Advice**: Offer recommendations based on stock fundamentals and sentiment.
- **Question-Answering**: Respond to basic financial queries.



# Objectives
1. Load financial datasets for sentiment analysis and investment advice.
2. Fine-tune the LLaMA 3 model on financial data using LoRA.
3. Test the chatbot to analyze sentiments, provide investment advice, and answer financial questions.

### **Installing Dependencies and Setting API Keys**

In [1]:
%%capture
# Installs Unsloth, Xformers (Flash Attention) and all other packages
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers trl peft accelerate bitsandbytes

### **Adding HuggingFace Token**

In [2]:
# HuggingFace token: read it from the environment instead of hard-coding a secret in the notebook
import os
hf_token = os.environ.get("HF_TOKEN")

In [3]:
# Fine Tuning Related Packages
from unsloth import FastLanguageModel
import torch
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported
import pandas as pd
from datasets import Dataset

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!


## Fine-Tuning LLaMA 3 with Unsloth
In this section, we fine-tune the pre-trained **LLaMA 3** model using the **Unsloth** framework. Fine-tuning allows us to adapt the model to specific financial tasks, such as sentiment analysis, investment advice, and financial question-answering.

### Why Fine-Tuning?
While LLaMA 3 is a powerful pre-trained model, it requires adaptation to the domain of finance. Fine-tuning ensures the model understands financial language, nuances, and datasets, enabling it to perform better in sentiment classification and investment recommendations.

### Why Unsloth?
**Unsloth** is a lightweight framework designed for efficient fine-tuning of large language models:
- Simplifies integrating **LoRA (Low-Rank Adaptation)**.
- Optimizes resource usage by enabling parameter-efficient fine-tuning.
- Provides easy-to-use APIs for training and evaluation.

By leveraging Unsloth, we ensure a streamlined and efficient fine-tuning process, even with limited computational resources.


### **Initializing Pre-Trained Model and Tokenizer**

We are using Meta's LLaMA 3 8B Instruct Model, a large-scale language model fine-tuned for instruction-based tasks. This model serves as the foundation for our finance chatbot.

In [11]:
# Load the model and tokenizer from the pre-trained FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
    # Specify the pre-trained model to use
    model_name = "meta-llama/Meta-Llama-3-8B-Instruct",
    # Specifies the maximum number of tokens (words or subwords) that the model can process in a single forward pass
    max_seq_length = 512,
    # Data type for the model. None means auto-detection based on hardware, Float16 for specific hardware like Tesla T4
    dtype = None,
    # Enable 4-bit quantization. Storing the weights in 4 bits instead of the usual 16 or 32 significantly reduces the memory needed to hold them, so larger models can run on hardware with limited memory.
    load_in_4bit = True,
    # Access token for gated models; required for authentication to download models like Meta-Llama-3-8B-Instruct
    token = hf_token,
)


==((====))==  Unsloth 2024.12.4: Fast Llama patching. Transformers:4.46.3.
   \\   /|    GPU: NVIDIA L4. Max memory: 22.168 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu121. CUDA: 8.9. CUDA Toolkit: 12.1. Triton: 3.1.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.28.post3. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!



**Adding in LoRA adapters for parameter efficient fine tuning**

LoRA, or Low-Rank Adaptation, is a technique used in machine learning to fine-tune large models more efficiently. It works by adding a small, additional set of parameters to the existing model instead of retraining all the parameters from scratch. This makes the fine-tuning process faster and less resource-intensive. Essentially, LoRA helps tailor a pre-trained model to specific tasks or datasets without requiring extensive computational power or memory.
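
To make the idea concrete, here is a minimal NumPy sketch of a single LoRA-adapted linear layer. It is an illustration under assumptions rather than Unsloth's implementation: the toy sizes, the variable names, and the `lora_forward` helper are all hypothetical, and the `alpha / r` scaling follows the standard LoRA formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes chosen for illustration only (real layers are far larger)
d_out, d_in, r, alpha = 64, 64, 4, 4

W = rng.normal(size=(d_out, d_in))          # frozen pre-trained weight
A = rng.normal(scale=0.01, size=(r, d_in))  # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init so the update starts at zero


def lora_forward(x, W, A, B, alpha, r):
    """Compute y = W x + (alpha / r) * B (A x); only A and B would be trained."""
    return W @ x + (alpha / r) * (B @ (A @ x))


x = rng.normal(size=(d_in,))
y0 = lora_forward(x, W, A, B, alpha, r)

full_params = W.size           # parameters touched by full fine-tuning
lora_params = A.size + B.size  # parameters touched by LoRA
print(full_params, lora_params)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and training only has to move the much smaller `A` and `B` matrices.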

In [None]:
# Apply LoRA (Low-Rank Adaptation) adapters to the model for parameter-efficient fine-tuning
model = FastLanguageModel.get_peft_model(
    model,
    # Rank of the adaptation matrix. Higher values can capture more complex patterns. Suggested values: 8, 16, 32, 64, 128
    r = 16,
    # Specify the model layers to which LoRA adapters should be applied
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    # Scaling factor for LoRA. Controls the weight of the adaptation. Typically a small positive integer
    lora_alpha = 16,
    # Dropout rate for LoRA. A value of 0 means no dropout, which is optimized for performance
    lora_dropout = 0,
    # Bias handling in LoRA. Setting to "none" is optimized for performance, but other options can be used
    bias = "none",
    # Enables gradient checkpointing to save memory during training. "unsloth" is optimized for very long contexts
    use_gradient_checkpointing = "unsloth",
    # Seed for random number generation to ensure reproducibility of results
    random_state = 3407,
)

Unsloth 2024.12.4 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.



1. **r**: The rank of the low-rank adaptation matrix. This determines the capacity of the adapter to capture additional information. Higher ranks allow capturing more complex patterns but also increase computational overhead.

2. **target_modules**: List of module names within the model to which the LoRA adapters should be applied. These modules typically include the projections within transformer layers (e.g., query, key, value projections) and other key transformation points.
  - **q_proj**: Projects input features to query vectors for attention mechanisms.
  - **k_proj**: Projects input features to key vectors for attention mechanisms.
  - **v_proj**: Projects input features to value vectors for attention mechanisms.
  - **o_proj**: Projects the output of the attention mechanism to the next layer.
  - **gate_proj**: Applies gating mechanisms to modulate the flow of information.
  - **up_proj**: Projects features to a higher dimensional space, often used in feed-forward networks.
  - **down_proj**: Projects features to a lower dimensional space, often used in feed-forward networks.

These layers are typically involved in transformer-based models, facilitating various projections and transformations necessary for the attention mechanism and feed-forward processes.

3. **lora_alpha**: A scaling factor for the LoRA adapter. In the standard LoRA formulation the low-rank update is scaled by `lora_alpha / r`, so with `lora_alpha = 16` and `r = 16` the scale is 1.

4. **lora_dropout**: Dropout rate applied to the LoRA adapters. Dropout helps regularize the model; a value of 0 disables it, which is the setting Unsloth optimizes for.

5. **bias**: Specifies which bias terms, if any, are trained alongside the adapters. Setting it to "none" trains no bias terms, which is the setting Unsloth optimizes for, although other options are available depending on the use case.

6. **use_gradient_checkpointing**: Enables gradient checkpointing, which helps to save memory during training by not storing all intermediate activations. "unsloth" is a setting optimized for very long contexts, but it can also be set to True.

7. **random_state**: A seed for the random number generator to ensure reproducibility. This makes sure that the results are consistent across different runs of the code.
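
As a sanity check on these settings, the number of trainable LoRA parameters can be worked out by hand. The sketch below assumes the publicly documented Llama 3 8B shapes (hidden size 4096, MLP size 14336, key/value projection size 1024, 32 layers); those dimensions and the `lora_param_count` helper are not part of this notebook. Each adapted module of shape `(d_in, d_out)` adds `r * (d_in + d_out)` parameters.

```python
r = 16

# Assumed Llama 3 8B dimensions (taken from the public model config, not from this notebook)
hidden, intermediate, kv_dim, num_layers = 4096, 14336, 1024, 32

# (input features, output features) for every module listed in target_modules
module_shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}


def lora_param_count(d_in, d_out, rank):
    """A is (rank x d_in) and B is (d_out x rank), so the pair adds rank * (d_in + d_out)."""
    return rank * (d_in + d_out)


per_layer = sum(lora_param_count(d_in, d_out, r) for d_in, d_out in module_shapes.values())
total = per_layer * num_layers
print(f"{total:,}")  # 41,943,040
```

Under these assumptions the total agrees with the "Number of trainable parameters" figure that Unsloth prints when training starts.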

### **Fine Tuning Dataset**



The code below formats each dataset entry into the training prompt defined at the top of the cell, taking care to include the special tokens. In this case our end-of-sequence (EOS) token is <|eot_id|>.

In [None]:
# Defining the expected prompt
ft_prompt=""" <|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a financial assistant capable of handling different tasks:
1. General Financial Questions: Answer the user's question based on the given context or general financial knowledge if no context is provided.
2. Sentiment Analysis: Identify the sentiment (Positive, Negative, Neutral) of the query and provide a brief explanation.
If the query involves both sentiment analysis and a financial question, perform both tasks. If the user’s intent is unclear, ask clarifying questions to ensure helpful responses. <|eot_id|>

<|start_header_id|>user<|end_header_id|>

### Question:
{}

### Context:
{}

<|eot_id|>

### Response: <|start_header_id|>assistant<|end_header_id|>
{}"""

# Grabbing the end-of-sequence (EOS) special token
EOS_TOKEN = tokenizer.eos_token

# Function for formatting above prompt with information from Financial QA dataset
def formatting_prompts_func(examples):
    questions = examples["question"]
    contexts       = examples["context"]
    responses      = examples["answer"]
    texts = []
    for question, context, response in zip(questions, contexts, responses):
        # Must add EOS_TOKEN, otherwise your generation will go on forever!
        text = ft_prompt.format(question, context, response) + EOS_TOKEN
        texts.append(text)
    return { "text" : texts, }

# Load CSV
data_path = "/content/QA_plus_sentiment_12k.csv"
df = pd.read_csv(data_path)

# Convert to Hugging Face Dataset
dataset = Dataset.from_pandas(df)
dataset = dataset.map(formatting_prompts_func, batched = True,)

Map:   0%|          | 0/11929 [00:00<?, ? examples/s]

In [None]:
dataset[0]

{'context': 'Data Center The NVIDIA computing platform is focused on accelerating the most compute-intensive workloads, such as AI, data analytics, graphics and scientific computing, across hyperscale, cloud, enterprise, public sector, and edge data centers. The platform consists of our energy efficient GPUs, data processing units, or DPUs, interconnects and systems, our CUDA programming model, and a growing body of software libraries, software development kits, or SDKs, application frameworks and services, which are either available as part of the platform or packaged and sold separately.',
 'answer': 'The NVIDIA computing platform includes energy-efficient GPUs, data processing units (DPUs), interconnects, systems, the CUDA programming model, and a suite of software libraries, SDKs, application frameworks, and services.',
 'question': 'What are the key components of the NVIDIA computing platform?',
 'text': " <|begin_of_text|><|start_header_id|>system<|end_header_id|>\nYou are a fina

### **Defining the Trainer Arguments**

We use the [Supervised Fine-Tuning Trainer](https://huggingface.co/docs/trl/sft_trainer) from Hugging Face's TRL (Transformer Reinforcement Learning) library.

**Supervised fine-tuning** is a process in machine learning where a pre-trained model is further trained on a specific dataset with labeled examples. During this process, the model learns to make predictions or classifications based on the labeled data, improving its performance on the specific task at hand. This technique leverages the general knowledge the model has already acquired during its initial training phase and adapts it to perform well on a more targeted set of examples. Supervised fine-tuning is commonly used to customize models for specific applications, such as sentiment analysis, object recognition, or language translation, by using task-specific annotated data.

In [None]:
trainer = SFTTrainer(
    # The model to be fine-tuned
    model = model,
    # The tokenizer associated with the model
    tokenizer = tokenizer,
    # The dataset used for training
    train_dataset = dataset,
    # The field in the dataset containing the text data
    dataset_text_field = "text",
    # Maximum sequence length for the training data
    max_seq_length = 1024,
    # Number of processes to use for data loading
    dataset_num_proc = 2,
    # Whether to use sequence packing, which can speed up training for short sequences
    packing = False,
    args = TrainingArguments(
        # Batch size per device during training
        per_device_train_batch_size = 4,
        # Number of gradient accumulation steps to perform before updating the model parameters
        gradient_accumulation_steps = 4,
        # Number of warmup steps for learning rate scheduler
        warmup_steps = 5,
        # Total number of training steps
        # max_steps = 60,
        # Number of training epochs; can be used instead of max_steps. For this notebook it is 2,235 steps given the dataset
        num_train_epochs = 3,
        # Learning rate for the optimizer
        learning_rate = 2e-4,
        # Use 16-bit floating point precision for training if bfloat16 is not supported
        fp16 = not is_bfloat16_supported(),
        # Use bfloat16 precision for training if supported
        bf16 = is_bfloat16_supported(),
        # Number of steps between logging events
        logging_steps = 1,
        # Optimizer to use (in this case, AdamW with 8-bit precision)
        optim = "adamw_8bit",
        # Weight decay to apply to the model parameters
        weight_decay = 0.01,
        # Type of learning rate scheduler to use
        lr_scheduler_type = "linear",
        # Seed for random number generation to ensure reproducibility
        seed = 3407,
        # Directory to save the output models and logs
        output_dir = "outputs",
    ),
)


Map (num_proc=2):   0%|          | 0/11929 [00:00<?, ? examples/s]

1. **model**: The model to be fine-tuned. This is the pre-trained model that will be adapted to the specific training data.

2. **tokenizer**: The tokenizer associated with the model. It converts text data into tokens that the model can process.

3. **train_dataset**: The dataset used for training. This is the collection of labeled examples that the model will learn from during the fine-tuning process.

4. **dataset_text_field**: The field in the dataset containing the text data. This specifies which part of the dataset contains the text that the model will be trained on.

5. **max_seq_length**: The maximum sequence length for the training data. This limits the number of tokens per input sequence to ensure they fit within the model's processing capacity.

6. **dataset_num_proc**: The number of processes to use for data loading. This can speed up data loading by parallelizing it across multiple processes.

7. **packing**: A boolean indicating whether to use sequence packing. Sequence packing can speed up training by concatenating multiple short examples into a single sequence of up to `max_seq_length` tokens. It is disabled here.

8. **args**: A set of training arguments that configure the training process. These include various hyperparameters and settings:

    - **per_device_train_batch_size**: The batch size per device during training. This determines how many examples are processed together in one forward/backward pass.
    
    - **gradient_accumulation_steps**: The number of gradient accumulation steps to perform before updating the model parameters. This allows for effectively larger batch sizes without requiring more memory.
    
    - **warmup_steps**: The number of warmup steps for the learning rate scheduler. During these steps, the learning rate increases gradually from zero to the configured `learning_rate`.
    
    - **max_steps**: The total number of training steps (optimizer updates). It is commented out in the example; when set, it takes precedence over `num_train_epochs`.
    
    - **num_train_epochs**: The number of training epochs (used in the example instead of `max_steps`). This defines how many times the entire training dataset will be passed through the model.
    
    - **learning_rate**: The learning rate for the optimizer. This controls how much to adjust the model's weights with respect to the gradient during training.
    
    - **fp16**: A boolean indicating whether to use 16-bit floating point precision for training if bfloat16 is not supported. This can speed up training and reduce memory usage.
    
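    - **bf16**: A boolean indicating whether to use bfloat16 precision for training if supported. This can also speed up training and reduce memory usage. bfloat16 keeps the same exponent range as 32-bit floats, so it is less prone to overflow than fp16, at the cost of fewer mantissa bits (lower precision).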
    
    - **logging_steps**: The number of steps between logging events. This controls how frequently training progress and metrics are logged.
    
    - **optim**: The optimizer to use. In this case, AdamW with 8-bit optimizer states, which reduces the memory the optimizer needs for large models.
    
    - **weight_decay**: The weight decay to apply to the model parameters. This is a regularization technique to prevent overfitting by penalizing large weights.
    
    - **lr_scheduler_type**: The type of learning rate scheduler to use. This controls how the learning rate changes over time during training.
    
    - **seed**: The seed for random number generation. This ensures reproducibility of results by controlling the randomness in training.
    
    - **output_dir**: The directory to save the output models and logs. This specifies where the trained model and training logs will be stored.
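
The batch size, accumulation, epoch, and warmup settings above interact, so a short worked calculation may help. The sketch below is an approximation under assumptions: it mirrors how the Hugging Face `Trainer` appears to count optimizer steps on a single GPU and the usual linear-with-warmup formula. The `linear_lr` helper is hypothetical and is not the library's scheduler.

```python
import math

# Values taken from this notebook's dataset and TrainingArguments
num_examples = 11_929
per_device_batch = 4
grad_accum = 4
epochs = 3
warmup_steps = 5
base_lr = 2e-4

# Each optimizer update consumes per_device_batch * grad_accum examples
effective_batch = per_device_batch * grad_accum

batches_per_epoch = math.ceil(num_examples / per_device_batch)
updates_per_epoch = max(batches_per_epoch // grad_accum, 1)
total_steps = math.ceil(epochs * updates_per_epoch)


def linear_lr(step, base_lr, warmup, total):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at the final step."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch, total_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_lr(step, base_lr, warmup_steps, total_steps))
```

With these inputs the calculation gives an effective batch size of 16 and 2,235 total steps, which matches the figures in the training banner.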

# **Training**

In [None]:
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 11,929 | Num Epochs = 3
O^O/ \_/ \    Batch size per device = 4 | Gradient Accumulation steps = 4
\        /    Total batch size = 16 | Total steps = 2,235
 "-____-"     Number of trainable parameters = 41,943,040


| Step | Training Loss |
|------|---------------|
| 1 | 3.965 |
| 2 | 3.8191 |
| 3 | 4.1081 |
| 4 | 3.5189 |
| 5 | 3.0116 |
| 6 | 2.6941 |
| 7 | 2.2913 |
| 8 | 1.9282 |
| 9 | 1.6587 |
| 10 | 1.4126 |


---
### **Saving Fine-Tuned Model Locally**

In [None]:
model.save_pretrained("/content/lora_model_final/") # Local saving
tokenizer.save_pretrained("/content/lora_model_final/")

('/content/lora_model_final/tokenizer_config.json',
 '/content/lora_model_final/special_tokens_map.json',
 '/content/lora_model_final/tokenizer.json')

### **Loading the Fine-Tuned Model Later and Adjusting the Prompt for Inference**

In [4]:
ft_prompt="""<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a professional financial advisor designed to assist users with financial queries in detail. Your responsibilities include:
1. Providing actionable financial advice based on the user's query and context, if provided.
2. Relying on your expertise and financial principles to offer well-reasoned advice when context is missing.
3. Explaining your recommendations in a detailed manner.
4. Identifying ambiguous or incomplete queries and asking for clarification when necessary.

Your responses should:
- Provide a comprehensive analysis in multiple paragraphs.
- Be accurate, logical, and based on sound financial principles.
- Avoid speculation or unverified information.
- Always remain professional, empathetic, and helpful.

<|eot_id|>

<|start_header_id|>user<|end_header_id|>

### Query:
{}

### Context (if provided):
{}

<|eot_id|>

### Response: <|start_header_id|>assistant<|end_header_id|>
{}"""

if True: # set to False to skip reloading the model
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "/content/lora_model_final", # Folder where the LoRA adapter and tokenizer were saved
        max_seq_length = 2048, # Note: larger than the 512 used when the base model was first loaded
        dtype = None,
        load_in_4bit = True,
    )
    FastLanguageModel.for_inference(model)



==((====))==  Unsloth 2024.12.4: Fast Llama patching. Transformers:4.46.3.
   \\   /|    GPU: NVIDIA L4. Max memory: 22.168 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu121. CUDA: 8.9. CUDA Toolkit: 12.1. Triton: 3.1.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.28.post3. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!



Unsloth 2024.12.4 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.


### **Setting Up Functions for Running Inference**

**Inference** refers to the process of using a trained machine learning model to make predictions or generate outputs based on new, unseen input data. It involves feeding the input data into the model and obtaining the model's predictions, classifications, or generated text, depending on the task the model is designed for. Inference is the application phase of the model, as opposed to the training phase.

In [5]:
# Main Inference Function, handles generating and decoding tokens
def inference(question, context):
  inputs = tokenizer(
  [
      ft_prompt.format(
          question,
          context,
          "", # output - leaving this blank for generation
      )
  ], return_tensors = "pt").to("cuda")

  # Generate tokens for the input prompt using the model, with a maximum of 512 new tokens.
  # The `use_cache` parameter enables faster generation by reusing previously computed values.
  # The `pad_token_id` is set to the EOS token to handle padding properly.
  outputs = model.generate(**inputs, max_new_tokens = 512, use_cache = True, pad_token_id=tokenizer.eos_token_id)
  response = tokenizer.batch_decode(outputs) # Decoding tokens into english words
  return response

In [6]:
# Function for extracting just the language model generation from the full response
def extract_response(text):
    text = text[0]
    start_token = "### Response: <|start_header_id|>assistant<|end_header_id|>"
    end_token = "<|eot_id|>"

    # str.find returns -1 when the marker is missing, so check before adding the marker length
    start_index = text.find(start_token)
    if start_index == -1:
        return None
    start_index += len(start_token)

    end_index = text.find(end_token, start_index)
    if end_index == -1:
        return None

    return text[start_index:end_index].strip()
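
The marker-based extraction can be tried without a model. The sketch below uses a hypothetical `extract_between` helper and a hand-written sample string, neither of which is part of the notebook. One detail to watch is that `str.find` returns -1 when a marker is absent, so that check has to happen before the marker length is added to the index.

```python
def extract_between(text, start_token, end_token):
    """Return the text between the two markers, or None if either marker is missing."""
    start = text.find(start_token)
    if start == -1:  # check before adding len(start_token)
        return None
    start += len(start_token)

    end = text.find(end_token, start)
    if end == -1:
        return None
    return text[start:end].strip()


START = "### Response: <|start_header_id|>assistant<|end_header_id|>"
END = "<|eot_id|>"

# Hand-written stand-in for a decoded generation (not real model output)
sample = "<|eot_id|>\n\n" + START + "\nHello there.<|eot_id|>"

print(extract_between(sample, START, END))             # Hello there.
print(extract_between("no markers here", START, END))  # None
```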

## General Queries

In [7]:
FastLanguageModel.for_inference(model)
# Testing it out!
context = ""
question = "List the first 15 numbers of fibonacci sequence"

resp = inference(question, context)
parsed_response = extract_response(resp)
print(parsed_response)

The first 15 numbers of the Fibonacci sequence are: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, and 377.


In [8]:
print(resp)

["<|begin_of_text|><|begin_of_text|><|start_header_id|>system<|end_header_id|>\nYou are a professional financial advisor designed to assist users with financial queries in detail. Your responsibilities include:\n1. Providing actionable financial advice based on the user's query and context, if provided.\n2. Relying on your expertise and financial principles to offer well-reasoned advice when context is missing.\n3. Explaining your recommendations in a detailed manner.\n4. Identifying ambiguous or incomplete queries and asking for clarification when necessary.\n\nYour responses should:\n- Provide a comprehensive analysis in multiple paragraphs.\n- Be accurate, logical, and based on sound financial principles.\n- Avoid speculation or unverified information.\n- Always remain professional, empathetic, and helpful.\n\n<|eot_id|>\n\n<|start_header_id|>user<|end_header_id|>\n\n### Query:\nList the first 15 numbers of fibonacci sequence\n\n### Context (if provided):\n\n\n<|eot_id|>\n\n### Resp

# Real-Time Functionality

In [9]:
import yfinance as yf
import requests


# Function to fetch real-time metrics
def get_real_time_metrics(ticker):
    try:
        stock = yf.Ticker(ticker)
        info = stock.info
        real_time_metrics = {
            "Current Price": info.get("currentPrice"),
            "P/E Ratio": info.get("trailingPE"),


            # Valuation and Market Data
            "Market Cap": info.get("marketCap"),
            "Enterprise Value": info.get("enterpriseValue"),
            "Trailing P/E": info.get("trailingPE"),
            "Forward P/E": info.get("forwardPE"),
            "PEG Ratio": info.get("pegRatio"),
            "Price to Book": info.get("priceToBook"),
            "Enterprise to Revenue": info.get("enterpriseToRevenue"),
            "Enterprise to EBITDA": info.get("enterpriseToEbitda"),
            "Long Business Summary": info.get("longBusinessSummary"),

            # Price, Volume, and Range Data
            "Day High": info.get("dayHigh"),
            "Day Low": info.get("dayLow"),
            "Previous Close": info.get("previousClose"),
            "Open": info.get("open"),
            "Fifty Two Week High": info.get("fiftyTwoWeekHigh"),
            "Fifty Two Week Low": info.get("fiftyTwoWeekLow"),
            "Fifty Day Average": info.get("fiftyDayAverage"),
            "Two Hundred Day Average": info.get("twoHundredDayAverage"),
            "Volume": info.get("volume"),
            "Average Volume": info.get("averageVolume"),
            "Beta": info.get("beta"),
        }
        return real_time_metrics
    except Exception as e:
        return f"Error fetching real-time metrics: {e}"

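import os  # needed to read the API key from the environment

# Function to fetch real-time news
def get_real_time_news(ticker):
    # Read the NewsAPI key from the environment instead of hard-coding a secret
    # (NEWSAPI_KEY is an assumed variable name; use whatever name you export)
    api_key = os.environ.get("NEWSAPI_KEY")
    url = "https://newsapi.org/v2/everything"
    try:
        # Passing the query as params URL-encodes it; the timeout avoids hanging on a stalled request
        response = requests.get(url, params={"q": ticker, "apiKey": api_key}, timeout=10)
        response.raise_for_status()
        articles = response.json().get("articles", [])
        news_summary = [
            f"Title: {article['title']} | Source: {article['source']['name']} | Published: {article['publishedAt']}"
            for article in articles[:5]  # Limit to 5 articles
        ]
        return "\n".join(news_summary)
    except Exception as e:
        return f"Error fetching real-time news: {e}"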

# Main inference function
def inference(question, context):
    inputs = tokenizer([
        ft_prompt.format(
            question,
            context,
            "",  # output - leaving this blank for generation
        )
    ], return_tensors="pt").to("cuda")

    outputs = model.generate(
        **inputs,
        max_new_tokens=2048,
        use_cache=True,
        pad_token_id=tokenizer.eos_token_id
    )
    response = tokenizer.batch_decode(outputs)
    return response

# Function to extract response
def extract_response(text):
    text = text[0]
    start_token = "### Response: <|start_header_id|>assistant<|end_header_id|>"
    end_token = "<|eot_id|>"

    # str.find returns -1 when the marker is missing, so check before adding the marker length
    start_index = text.find(start_token)
    if start_index == -1:
        return None
    start_index += len(start_token)

    end_index = text.find(end_token, start_index)
    if end_index == -1:
        return None

    return text[start_index:end_index].strip()

# Function to build context
def build_context():
    final_context = ""  # Initialize the final context

    # Step 1: Ask if the user wants to mention a stock
    use_stock = input("Do you want to mention a stock? (y/n): ").strip().lower()
    if use_stock == "y":
        # Ask for the stock ticker
        ticker = input("Enter the stock ticker (e.g., AAPL): ").strip().upper()

        # Step 2: Ask if user wants real-time metrics
        want_metrics = input(f"Do you want real-time metrics for {ticker}? (y/n): ").strip().lower()
        if want_metrics == "y":
            real_time_data = get_real_time_metrics(ticker)
            if isinstance(real_time_data, dict):
                metrics_context = f"Real-time metrics for {ticker}:\n"
                for key, value in real_time_data.items():
                    metrics_context += f"- {key}: {value}\n"
                # print(metrics_context)
                final_context += metrics_context + "\n"
            else:
                print(real_time_data)

        # Step 3: Ask if user wants real-time news
        want_news = input(f"Do you want real-time news for {ticker}? (y/n): ").strip().lower()
        if want_news == "y":
            news_data = get_real_time_news(ticker)
            news_context = f"Real-time news for {ticker}:\n{news_data}\n"
            # print(news_context)
            final_context += news_context + "\n"

    # Step 4: Ask if they have additional context
    has_additional_context = input("Do you have any additional context to add? (y/n): ").strip().lower()
    if has_additional_context == "y":
        additional_context = input("Please provide the additional context: ").strip()
        final_context += f"Additional context:\n{additional_context}\n"

    if not final_context:
        print("No context provided.")
    else:
        print("Final Context:\n", final_context)

    return final_context
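
Because `build_context` relies on `input()`, it is awkward to test directly. One possible refactor, sketched here as an assumption rather than as part of the notebook, is to move the string formatting into a pure helper. The `format_metrics_context` name is hypothetical, and unlike the loop above it skips metrics whose value is `None`.

```python
def format_metrics_context(ticker, metrics):
    """Render a metrics dict as the bullet-style context block, skipping missing values."""
    lines = [f"Real-time metrics for {ticker}:"]
    lines += [f"- {key}: {value}" for key, value in metrics.items() if value is not None]
    return "\n".join(lines) + "\n"


# A few values copied from the TSLA run shown in this notebook
sample_metrics = {
    "Current Price": 418.1,
    "PEG Ratio": None,
    "Price to Book": 19.173622,
}

context = format_metrics_context("TSLA", sample_metrics)
print(context)
```

Dropping `None` entries keeps lines such as `PEG Ratio: None` out of the prompt. Whether that helps the model is untested here.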

# Main Program for testing

In [10]:
# Main program
def main():
    final_context = build_context()
    if final_context:  # If context exists, allow the user to ask questions
        while True:
            question = input("Ask a question based on the context (or type 'x' to exit): ").strip()
            if question.lower() == "x":
                break
            response = inference(question, final_context)
            parsed_response = extract_response(response)
            print(f"Finance Agent: {parsed_response}")
            print("-----\n")
    else:
        print("No context was provided. Exiting.")

# Execute the program
main()

Do you want to mention a stock? (y/n): y
Enter the stock ticker (e.g., AAPL): TSLA
Do you want real-time metrics for TSLA? (y/n): y
Do you want real-time news for TSLA? (y/n): y
Do you have any additional context to add? (y/n): n
Final Context:
 Real-time metrics for TSLA:
- Current Price: 418.1
- P/E Ratio: 114.54794
- Market Cap: 1342126161920
- Enterprise Value: 1343451037696
- Trailing P/E: 114.54794
- Forward P/E: 127.72613
- PEG Ratio: None
- Price to Book: 19.173622
- Enterprise to Revenue: 13.829
- Enterprise to EBITDA: 101.438
- Long Business Summary: Tesla, Inc. designs, develops, manufactures, leases, and sells electric vehicles, and energy generation and storage systems in the United States, China, and internationally. The company operates in two segments, Automotive, and Energy Generation and Storage. The Automotive segment offers electric vehicles, as well as sells automotive regulatory credits; and non-warranty after-sales vehicle, used vehicles, body shop and parts, sup

---
# Disclaimer

The information provided in this notebook is for educational purposes only and does not constitute financial advice. The purpose of this project is purely to demonstrate the influence LLMs can have on financial decisions. We are not responsible for any losses or damages that may arise from the use of the information provided on this platform. Always consider your individual financial situation and objectives before making any investment or financial decision.
