# Data Analysis and Q&A Project Using a Local LLM

## Project Overview

This project requires you to perform a comprehensive analysis of a company's stock data using only the provided data sources and a local LLM. Your analysis should answer the six questions below strictly from the supplied data and documents; no external data is allowed. Every generated answer must be grounded in the provided data, with no fabricated content. Your reasoning must be clear, and any cause you attribute to a price move or other event must be explicitly linked to evidence in the data.

---

## Provided Data

You will be provided with the following data sets:

#### Stock Price Data (JSON format)
* Timeframe: Jan 22 to Feb 5
* Fields: Open, High, Low, Close, Volume

#### Quarterly Earnings Data for the Past Year (JSON format)
* Contains key financial indicators (e.g., revenue, eps) for each quarter.

#### Full Earnings Call Transcript
* The complete transcript of the earnings call, including management discussions and Q&A.

#### Balance Sheet Data for the Past Year (JSON format)
* Includes assets, liabilities, and shareholders' equity information.

#### News Articles
* Full text of 10 news articles related to the company during the analysis period.
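
As a rough illustration of how the price file might be summarized before prompting the model, the sketch below computes the period return and the largest single-day gain from a list of daily records. The record layout (a `Date` key next to the listed fields), the helper name `summarize_prices`, and the sample values are all assumptions made for illustration; the actual JSON structure may differ.

```python
import json

# Hypothetical sample in an assumed layout; these are NOT the real prices.
sample_json = """
[
  {"Date": "Jan 22", "Open": 100.0, "High": 104.0, "Low": 99.0, "Close": 102.0, "Volume": 1000},
  {"Date": "Jan 23", "Open": 102.0, "High": 103.0, "Low": 97.0, "Close": 98.0, "Volume": 1200},
  {"Date": "Jan 24", "Open": 98.0, "High": 106.0, "Low": 98.0, "Close": 105.0, "Volume": 1500}
]
"""

def summarize_prices(records):
    """Summarize a list of daily OHLCV records (assumed sorted by date)."""
    closes = [r["Close"] for r in records]
    # Period return measured from the first close to the last close.
    period_return_pct = (closes[-1] - closes[0]) / closes[0] * 100
    # Close-to-close percentage change for each day after the first.
    daily = [
        (records[i]["Date"], (closes[i] - closes[i - 1]) / closes[i - 1] * 100)
        for i in range(1, len(records))
    ]
    best_day, best_change = max(daily, key=lambda d: d[1])
    return {
        "first_close": closes[0],
        "last_close": closes[-1],
        "period_return_pct": round(period_return_pct, 2),
        "period_high": max(r["High"] for r in records),
        "period_low": min(r["Low"] for r in records),
        "largest_gain_day": best_day,
        "largest_gain_pct": round(best_change, 2),
    }

records = json.loads(sample_json)
summary = summarize_prices(records)
print(summary)
```

A summary like this can be pasted into a prompt as plain facts, so the model describes numbers that were computed in Python rather than estimating them itself.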

---

## Questions
Using the provided data and a local LLM, you need to answer the following six questions:

1. How did Tesla stock perform during this period (Jan 22 to Feb 5)?

2. Why did the price increase on Jan 30? Please provide potential factors.

3. How does this quarter's performance compare with previous quarters?

4. With unsupervised Full Self Driving scheduled to launch in limited markets like Austin by June, what regulatory challenges does Tesla foresee for a nationwide or international rollout, and how is the company strategically preparing to address these hurdles?

5. What insights can be concluded from the earnings call?

6. Which key news events influenced the stock performance, and what insights do they offer?
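
One possible way to keep answers tied to the supplied material is to place the relevant data directly in the prompt and tell the model to refuse when the data is insufficient. The sketch below shows that idea; the function name `build_grounded_messages` and the prompt wording are illustrative assumptions, not a prescribed solution.

```python
def build_grounded_messages(question, context):
    """Build chat messages that restrict the model to the supplied context."""
    system_prompt = (
        "You are a financial analyst. Answer using only the data provided in the "
        "user message. If the data does not contain the answer, say so instead of guessing."
    )
    user_prompt = (
        f"Data:\n{context}\n\n"
        f"Question: {question}\n\n"
        "Cite the specific figures or statements from the data that support each claim."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

grounded_messages = build_grounded_messages(
    question="Why did the price increase on Jan 30? Please provide potential factors.",
    context="(paste the relevant price rows and news excerpts here)",
)
print(grounded_messages[1]["content"])
```

Keeping the context short and relevant to each question matters with small local models, whose context window and attention to long inputs are limited.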

---

## Project Requirements
- #### Data Source Restriction:
Only use the provided data and documents. No external data or information is allowed.

- #### Answer Generation:
All generated answers must strictly be based on the provided data and documents. The LLM should not "invent" information.

- #### Clear Logic and Causal Relationships:
For each question, your answers must clearly demonstrate logical reasoning, and any attribution of cause must be explicitly linked to events in the data.

- #### Prompt Design:
You must design your own prompts for calling the local LLM to ensure that the responses are generated strictly based on the analysis results.

- #### Result Evaluation:
After generating the answers, implement an evaluation step to assess whether the responses meet the above requirements in terms of data reliance, logical clarity, and correct causation.

- #### Please put the answers to these six questions in a dict at the end of your submitted Python notebook file.

For example:
```code
{ "Q1 answer": "Answer1", "Q2 answer": "Answer2", "Q3 answer": "Answer3", "Q4 answer": "Answer4", "Q5 answer": "Answer5", "Q6 answer": "Answer6"}
```
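
Before submitting, a quick structural check of the final dict may catch missing or misspelled keys. The sketch below assumes every key follows the `Qn answer` pattern; the helper name `check_answers` is hypothetical.

```python
EXPECTED_KEYS = [f"Q{i} answer" for i in range(1, 7)]

def check_answers(answers):
    """Return a list of problems found in the answers dict (empty if none)."""
    problems = []
    for key in EXPECTED_KEYS:
        value = answers.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty: {key}")
    for key in answers:
        if key not in EXPECTED_KEYS:
            problems.append(f"unexpected key: {key}")
    return problems

# Placeholder answers, used only to exercise the check.
answers = {key: "placeholder answer" for key in EXPECTED_KEYS}
print(check_answers(answers))
```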

# How to load and run a local LLM

In [3]:
# You may need to install the following packages:
%pip install transformers torch accelerate

Collecting accelerate
  Downloading accelerate-1.6.0-py3-none-any.whl.metadata (19 kB)
Downloading accelerate-1.6.0-py3-none-any.whl (354 kB)
Installing collected packages: accelerate
Successfully installed accelerate-1.6.0
Note: you may need to restart the kernel to use updated packages.


In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# You can choose any available model from Hugging Face: https://huggingface.co/models
# For example, you can use "meta-llama/Llama-3.2-1B"
model_path = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True, # Trust the remote code for the model
    torch_dtype="auto",
    device_map="auto", # Automatically map the model to available devices (CPU/GPU), if you have GPU&CUDA, you can set it to 'cuda'
)
tokenizer = AutoTokenizer.from_pretrained(model_path) # Load the tokenizer

# Create a pipeline for text generation; it uses the model and tokenizer loaded above
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 600, # Limit the number of tokens generated
    "return_full_text": False, # Return only the generated text
    "do_sample": True, # Use sampling to generate text
    "temperature": 0.1 # Control the randomness of the output
    # "top_p": 0.9, # Control the diversity of the output
    # "top_k": 50, # Control the diversity of the output
}

Some parameters are on the meta device because they were offloaded to the disk.
Device set to use mps



In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_path = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

# 1) Pick the best available device: Apple-silicon GPU (MPS) if present, otherwise CPU
if torch.backends.mps.is_available():
    device = "mps"
    dtype  = torch.float16
else:
    device = "cpu"
    dtype  = torch.float32

# 2) Load the model and move it to that device
#    (half precision on MPS, full precision on CPU)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    torch_dtype=dtype,
).to(device)

tokenizer = AutoTokenizer.from_pretrained(model_path)

# 3) Build the pipeline on the same device as the model
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device=device,
)



Okay, so I need to describe a sunny day as a weather report. The user mentioned they're a cheerful weather reporter, so I should keep the tone positive and enthusiastic. Let me think about the key elements that should be included.

First, I should start with a greeting, maybe "Today is" to set the scene. Then, I need to mention the weather conditions. Since it's sunny, I'll use words like "sunny," "bright," and "lively." It's


In [None]:
generation_args = {
    "max_new_tokens": 600, # Limit the number of tokens generated
    "return_full_text": False, # Return only the generated text
    "do_sample": True, # Use sampling to generate text
    "temperature": 0.1 # Control the randomness of the output
    # "top_p": 0.9, # Control the diversity of the output
    # "top_k": 50, # Control the diversity of the output
}

messages = [
    # One system message sets both the role and the scenario;
    # some chat templates honor only a single system message.
    {"role": "system", "content": "You are a cheerful weather reporter who answers questions about the weather. Imagine today is a sunny day."},
    {"role": "user", "content": "Describe to me what the weather is like today"},
]

# Run the pipeline on the chat messages
out = pipe(messages, **generation_args)
print(out[0]["generated_text"])
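
DeepSeek-R1 distilled models typically write out their reasoning first and close it with a `</think>` tag before giving the final answer. A small helper along the following lines could separate the two. It is a sketch that assumes this tag convention; when the tag is absent (for example, because `max_new_tokens` cut generation short), it treats the whole text as unfinished reasoning. The name `split_reasoning` is hypothetical.

```python
def split_reasoning(generated_text, end_tag="</think>"):
    """Split generated text into (reasoning, answer) at the closing think tag."""
    if end_tag not in generated_text:
        # No closing tag: assume generation stopped before the answer began.
        return generated_text.strip(), ""
    reasoning, answer = generated_text.split(end_tag, 1)
    # The opening tag may or may not be part of the generated text.
    reasoning = reasoning.replace("<think>", "").strip()
    return reasoning, answer.strip()

example = "<think>Okay, the user wants a sunny report.</think>Today is bright and sunny!"
reasoning, answer = split_reasoning(example)
print(answer)
```

If the answer part comes back empty, raising `max_new_tokens` is one thing to try, since the reasoning alone can use up the token budget.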

# Learning to Write System Prompts and User Prompts

Welcome! In this tutorial, we’ll learn how to craft **System Prompts** and **User Prompts** to interact with generative AI models (like language models). Prompts are the instructions you give to the model, and they directly impact the quality of the output.

## Goals
- Understand the definition and difference between System Prompts and User Prompts
- Learn how to write clear and specific prompts
- Master combining them through experimentation

## What Are System Prompts and User Prompts?

### System Prompt
- **Definition**: A System Prompt is a “global instruction” that defines the model’s role, tone, behavior, or context. It’s typically set at the start and influences the model’s overall responses.
- **Characteristics**: Acts like a “background setting”—not directly part of the user’s input but guides how the model behaves.
- **Examples**:  
  - "You are a helpful assistant who explains things in simple terms."
  - "You are a funny weather reporter."

### User Prompt
- **Definition**: A User Prompt is the specific request or question the user provides, telling the model what task to perform.
- **Characteristics**: Short and specific, usually paired with a System Prompt.
- **Examples**:  
  - "Tell me about a sunny day."
  - "Write a short sentence about rain."

### Differences
| **Type**       | **Purpose**                  | **Example**                          |
|----------------|------------------------------|--------------------------------------|
| System Prompt | Sets the model’s role/style  | "You are a cheerful teacher."        |
| User Prompt   | Specifies the task/question  | "Explain what rain is."              |
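
To see how the two prompt types interact, it can help to hold the User Prompt fixed and vary only the System Prompt. The sketch below builds one message list per System Prompt for side-by-side comparison; the helper name `prompt_variants` is an assumption for illustration.

```python
def prompt_variants(system_prompts, user_prompt):
    """Pair one user prompt with several system prompts for comparison."""
    return [
        [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ]
        for system_prompt in system_prompts
    ]

variants = prompt_variants(
    ["You are a cheerful teacher.", "You are a funny weather reporter."],
    "Explain what rain is.",
)
for variant in variants:
    print(variant[0]["content"], "->", variant[1]["content"])
    # With the pipeline loaded earlier, each variant could be run as:
    # print(pipe(variant, **generation_args)[0]["generated_text"])
```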

---

### Add a System Prompt

Let’s set a System Prompt that defines the model’s role, then combine it with a User Prompt.

In [None]:
messages = [
    {"role": "system", "content": "You are a cheerful weather reporter."},
    {"role": "user", "content": "Describe a sunny day: Today is"},
]

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])

Okay, so I need to describe a sunny day as if I'm a cheerful weather reporter. Let me think about how to approach this. First, I should probably start with a greeting, something like "Today is sunny, bringing a warm welcome to the sunny skies of the day ahead." That sets a positive tone.

Next, I should mention the weather itself. Since it's sunny, I can talk about the temperature. Maybe something like "The temperature is pleasant, with a warm 75 degrees Fahrenheit." That gives a clear idea of the weather conditions.

Then, I should talk about the atmosphere. People are happy and excited, so I can say, "The atmosphere is lively, with everyone looking forward to the bright day ahead." This shows that the mood is positive and encouraging.

I should also mention the benefits of the sunny day. People are more productive, so I can say, "People are more productive, with everyone looking forward to the bright day ahead." This highlights the practical benefits of a sunny day.

I should also to

### Optimize the System Prompt

A good System Prompt should clearly define the role and rules. Let’s try a more detailed version.

In [None]:
messages = [
    {"role": "system", "content": "You are a funny weather reporter who uses humor and exaggerations."},
    {"role": "user", "content": "Tell me about a rainy day."},
]

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])

Okay, so I need to write a funny weather report about a rainy day. The user provided an example, so I should probably use that as a template. Let me read through the example first to understand the style and tone.

The example starts with a greeting, then mentions the weather, and uses some exaggerated language. It includes elements like "rainy day" and "rainy clouds," which are funny. The example also uses humor with phrases like "rainy day" and "rainy clouds," and maybe some exaggeration with words like "bogus" or "overblown."

I should make sure my response is similar in style. Maybe I can start with a greeting, mention the rain, use some exaggerated language, and include some humor. I should keep it light and funny, not too serious.

Let me think about the structure. The example had a greeting, a description of the weather, some humor, and a conclusion. I can follow that structure. Maybe start with a greeting, mention the rain, use some exaggerated words, add a funny twist, and end

## Creating a Finance-Related Prompt Template
### Key Points:
- Template Design: Use placeholders to insert specific financial data such as company name, revenue, net income, debt ratios, etc.
- Clear Instructions: Define specific questions that help the model analyze various aspects (e.g., profitability, balance sheet health, future opportunities and risks).
- Flexibility: Ensure that the template can be adapted to different companies and scenarios by filling in the placeholders.

### Example User Prompt Template:

```
Based on the following financial data, please analyze the company's financial health and provide insights on potential risks and opportunities for future growth.

Company Name: {CompanyName}
Time Period: {TimePeriod}
Key Financial Data:
- Revenue: {Revenue} million USD
- Net Income: {NetIncome} million USD
- Debt-to-Equity Ratio: {DebtEquityRatio}%
- Earnings Per Share (EPS): {EPS}

Answer the following questions:
1. How strong is the company's profitability? Please explain the main factors.
2. Is the company's debt level sustainable? Are there any financial risks?
3. What potential risks and opportunities do you foresee based on the current data?

Please detail your analysis process and provide clear conclusions and recommendations.
```
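
When a template has many placeholders, it is easy to forget one, and `str.format` then raises a `KeyError`. A small check like the following lists the missing fields up front. It is a sketch built on the standard library's `string.Formatter`; the helper name `missing_fields` and the shortened template are illustrative.

```python
from string import Formatter

def missing_fields(template, values):
    """List placeholder names in the template that have no entry in values."""
    names = {name for _, name, _, _ in Formatter().parse(template) if name}
    return sorted(names - set(values))

# Shortened version of the template above, used only for this example.
template = (
    "Company Name: {CompanyName}\n"
    "Time Period: {TimePeriod}\n"
    "- Revenue: {Revenue} million USD\n"
)
values = {"CompanyName": "ABC Corporation", "Revenue": "5000"}

print(missing_fields(template, values))  # ['TimePeriod']
```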


## Integrating the Prompts

In [None]:
system_prompt = (
    "You are a professional financial analyst with deep expertise in stocks, bonds, mutual funds, and derivatives. "
    "Your responses should be data-driven, professionally rigorous, and provide clear, step-by-step explanations. "
    "Include cautionary advice regarding potential risks, but do not offer direct investment advice."
)

# Define the user prompt template
user_prompt_template = (
    "Based on the following financial data, please analyze the company's financial health and provide insights on potential risks and opportunities for future growth.\n\n"
    "Company Name: {CompanyName}\n"
    "Time Period: {TimePeriod}\n"
    "Key Financial Data:\n"
    "- Revenue: {Revenue} million USD\n"
    "- Net Income: {NetIncome} million USD\n"
    "- Debt-to-Equity Ratio: {DebtEquityRatio}%\n"
    "- Earnings Per Share (EPS): {EPS}\n\n"
    "Answer the following questions:\n"
    "1. How strong is the company's profitability? Please explain the main factors.\n"
    "2. Is the company's debt level sustainable? Are there any financial risks?\n"
    "3. What potential risks and opportunities do you foresee based on the current data?\n\n"
    "Please detail your analysis process and provide clear conclusions and recommendations."
)

# Fill in the template with specific data
filled_user_prompt = user_prompt_template.format(
    CompanyName="ABC Corporation",
    TimePeriod="Q1 2023",
    Revenue="5000",
    NetIncome="800",
    DebtEquityRatio="45",
    EPS="1.2"
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": filled_user_prompt},
]

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])

Okay, so I'm trying to figure out how to analyze ABC Corporation's financial health based on the data provided. Let me start by going through each question one by one.

First, the question is about the company's profitability. The key data here is revenue at 5000 million USD and net income at 800 million USD. I know that profitability can be measured in several ways, like gross profit, net profit margin, and return on equity (ROE). Since I don't have those numbers, I'll focus on the revenue and net income provided.

Revenue is a good indicator because it shows the company's ability to generate income. A revenue of 5000 million USD is quite high, especially if it's the first quarter of the year. Net income is 800 million, which is a significant portion of the revenue. To find the net profit margin, I can divide net income by revenue. So, 800 / 5000 equals 0.16, or 16%. That's a pretty high margin, which suggests good profitability.

Next, the debt-to-equity ratio is 45%. I remember that
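
The trace above works out the net profit margin by hand (800 / 5000 = 16%). Because small models can slip on arithmetic, one option is to compute such ratios in Python and pass them into the prompt as given facts. The helper name below is hypothetical, and the extra line it produces is not part of the original template.

```python
def net_profit_margin_pct(net_income, revenue):
    """Net income as a percentage of revenue."""
    if revenue == 0:
        raise ValueError("revenue must be non-zero")
    return net_income / revenue * 100

margin = net_profit_margin_pct(net_income=800, revenue=5000)
# A line that could be appended to the "Key Financial Data" list in the prompt.
metrics_line = f"- Net Profit Margin: {margin:.1f}%"
print(metrics_line)  # - Net Profit Margin: 16.0%
```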

# Prompt Tutorial: Validating Generated Results

This tutorial explains how to create prompts that ensure the generated output meets specified requirements. We will cover constructing both a system prompt and a user prompt template, along with an example of integrating them in a Python application.

---

## 1. Understanding the Requirements

Before accepting a generated response, it's essential to verify that the output:
- Follows the specified format and template.
- Includes all required sections or key points.
- Contains clear, logical, and accurate explanations.
- Uses professional and concise language.

To achieve this, we design prompts that instruct the model to perform a self-evaluation after generating the answer.

---

## 2. Creating the System Prompt

The system prompt sets the model's role and behavior. For validation, the prompt instructs the model to:
- Generate the main answer.
- Perform a comprehensive check to ensure all criteria are met.
- Provide a checklist summary indicating whether each criterion is satisfied. If any criterion is not met, include a brief explanation.

### Example System Prompt:
```plaintext
You are an expert validator specialized in reviewing and verifying generated responses. After generating your answer, please perform a comprehensive check to ensure the following criteria are met:
1. The output follows the specified format and template.
2. All required sections or key points are included.
3. The explanation is clear, logical, and accurate.
After your answer, provide a checklist summary indicating whether each criterion is satisfied. If any criterion is not met, include a brief explanation.
```

## 3. Creating the User Prompt Template
The user prompt provides the specific task and data for analysis. It also instructs the model to include a validation checklist after the main answer.

### Example User Prompt Template:
```
Please analyze the following information and provide a detailed answer. After your analysis, include a checklist verifying that:
- The answer includes a clear introduction, analysis, and conclusion.
- All necessary data points mentioned in the input are addressed.
- The formatting matches the specified template.
- The language is professional and concise.

Here is the input data:
{input_data}

Provide your detailed analysis followed by the verification checklist.
```

## 4. Integration Example in Python
Below is a Python code snippet that demonstrates how to combine the system prompt and the user prompt template in a call to the local text-generation pipeline:


In [None]:

system_prompt = (
    "You are an expert validator specialized in reviewing and verifying generated responses. "
    "After generating your answer, please perform a comprehensive check to ensure the following criteria are met:\n"
    "1. The output follows the specified format and template.\n"
    "2. All required sections or key points are included.\n"
    "3. The explanation is clear, logical, and accurate.\n"
    "After your answer, provide a checklist summary indicating whether each criterion is satisfied. "
    "If any criterion is not met, include a brief explanation."
)

user_prompt_template = (
    "Please analyze the following information and provide a detailed answer. After your analysis, include a checklist verifying that:\n"
    "- The answer includes a clear introduction, analysis, and conclusion.\n"
    "- All necessary data points mentioned in the input are addressed.\n"
    "- The formatting matches the specified template.\n"
    "- The language is professional and concise.\n\n"
    "Here is the input data:\n"
    "{input_data}\n\n"
    "Provide your detailed analysis followed by the verification checklist."
)

# Fill in the template with specific data
filled_user_prompt = user_prompt_template.format(
    input_data="Company: XYZ Ltd.\nRevenue: 3000 million USD\nNet Income: 500 million USD\nDebt-to-Equity Ratio: 30%"
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": filled_user_prompt},
]

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])
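
A model's self-reported checklist can itself be wrong, so it may be worth pairing it with a simple programmatic check. The sketch below flags numbers that appear in an answer but nowhere in the input data. It is a rough heuristic under stated assumptions: it matches plain digit strings only (no thousands separators), and it will also flag legitimately derived figures such as computed ratios, so flagged values are candidates for manual review rather than proof of fabrication. The name `unsupported_numbers` and the sample answer are hypothetical.

```python
import re

NUMBER_PATTERN = re.compile(r"\d+(?:\.\d+)?")

def unsupported_numbers(answer, source_text):
    """Return numbers that appear in the answer but nowhere in the source text."""
    source_numbers = set(NUMBER_PATTERN.findall(source_text))
    return [n for n in NUMBER_PATTERN.findall(answer) if n not in source_numbers]

source = (
    "Company: XYZ Ltd.\nRevenue: 3000 million USD\n"
    "Net Income: 500 million USD\nDebt-to-Equity Ratio: 30%"
)
# Made-up answer containing one figure (650) that is not in the source.
sample_answer = "Revenue was 3000 million USD and net income was 650 million USD."

print(unsupported_numbers(sample_answer, source))  # ['650']
```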
