## Finance ChatBot

The `TinyLlama-1.1B-Chat-v1.0` model is fine-tuned on the [finance_alpaca.json](https://github.com/JoyM268/Intel-Unnati-Industrial-Training-Program-2024/blob/main/finance%20chatbot/finance_alpaca.json) dataset so that it can answer finance-related questions. The fine-tuned model is then prompted with questions.

## Prepare Environment

Install the `intel_extension_for_transformers` library.

In [1]:
!pip install intel-extension-for-transformers

Collecting intel-extension-for-transformers
  Downloading intel_extension_for_transformers-1.4.2-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (26 kB)
Collecting schema (from intel-extension-for-transformers)
  Downloading schema-0.7.7-py2.py3-none-any.whl.metadata (34 kB)
Collecting neural-compressor (from intel-extension-for-transformers)
  Downloading neural_compressor-2.6-py3-none-any.whl.metadata (15 kB)
Collecting pycocotools (from neural-compressor->intel-extension-for-transformers)
  Downloading pycocotools-2.0.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.1 kB)
Downloading intel_extension_for_transformers-1.4.2-cp310-cp310-manylinux_2_28_x86_64.whl (45.3 MB)
Downloading neural_compressor-2.6-py3-none-any.whl (1.5 MB)

Clone the `intel-extension-for-transformers` git repository.

In [2]:
!git clone https://github.com/intel/intel-extension-for-transformers.git

Cloning into 'intel-extension-for-transformers'...
remote: Enumerating objects: 1682035, done.
remote: Counting objects: 100% (116683/116683), done.
remote: Compressing objects: 100% (13586/13586), done.
remote: Total 1682035 (delta 63043), reused 113533 (delta 60254), pack-reused 1565352
Receiving objects: 100% (1682035/1682035), 595.02 MiB | 14.33 MiB/s, done.
Resolving deltas: 100% (898941/898941), done.
Updating files: 100% (3217/3217), done.


Install the dependencies listed in the [requirements.txt](https://github.com/JoyM268/Intel-Unnati-Industrial-Training-Program-2024/blob/main/requirements/requirements.txt) file.

In [3]:
%cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/
!pip install -r requirements.txt
%cd ../../../

/kaggle/working/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat
Collecting cchardet (from -r requirements.txt (line 2))
  Downloading cchardet-2.1.7.tar.gz (653 kB)
  Preparing metadata (setup.py) ... done
Collecting einops (from -r requirements.txt (line 3))
  Downloading einops-0.8.0-py3-none-any.whl.metadata (12 kB)
Collecting evaluate (from -r requirements.txt (line 4))
  Downloading evaluate-0.4.2-py3-none-any.whl.metadata (9.3 kB)
Collecting fschat==0.2.35 (from -r requirements.txt (line 6))
  Downloading fschat-0.2.35-py3-none-any.whl.metadata (19 kB)
Collecting intel_extension_for_pytorch==2.3.0 (from -r requirements.txt (line 8))
  Downloading intel_extension_for_pytorch-2.3.0-cp310-cp310-manylinux2014_x86_64.whl.metadata (7.0 kB)
Collecting lm-eval (from -r requirements.txt (line 9))
  Downloading lm_eval-0.4.
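
Once both install cells have finished, it can be worth confirming that the key packages are actually visible to the notebook kernel before moving on. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical, and the package names are taken from the pip output above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Package names as they appear in the pip logs above.
for name in (
    "intel-extension-for-transformers",
    "neural-compressor",
    "intel_extension_for_pytorch",
):
    print(f"{name}: {installed_version(name) or 'NOT INSTALLED'}")
```

If any line reports `NOT INSTALLED`, re-running the corresponding install cell is usually the first thing to try.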

## Log in to Hugging Face
Log in to Hugging Face with a User Access Token. Before running this cell, save the token as a Kaggle secret named `Intel`.

In [4]:
from kaggle_secrets import UserSecretsClient
from huggingface_hub import login
user_secrets = UserSecretsClient()
secret_value_0 = user_secrets.get_secret("Intel")
login(token=secret_value_0)

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /root/.cache/huggingface/token
Login successful
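
The login cell above only works inside Kaggle, because `kaggle_secrets` is not available elsewhere. A minimal sketch of a more portable lookup follows. It assumes the token may instead be supplied through an `HF_TOKEN` environment variable, and the helper name `get_hf_token` is hypothetical, not part of the original notebook.

```python
import os


def get_hf_token(environ=os.environ, secret_name="Intel"):
    """Look up a Hugging Face token without hard-coding it in the notebook."""
    # Assumption: outside Kaggle the token is provided via HF_TOKEN.
    token = environ.get("HF_TOKEN")
    if token:
        return token
    try:
        # Only importable inside a Kaggle notebook session.
        from kaggle_secrets import UserSecretsClient
    except ImportError as exc:
        raise RuntimeError(
            "Set HF_TOKEN, or run inside Kaggle with the secret configured"
        ) from exc
    return UserSecretsClient().get_secret(secret_name)


# Usage (left commented out because it needs network access):
# from huggingface_hub import login
# login(token=get_hf_token())
```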


## Finetuning The Model
The [TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) model is fine-tuned on the [finance_alpaca.json](https://github.com/JoyM268/Intel-Unnati-Industrial-Training-Program-2024/blob/main/finance%20chatbot/finance_alpaca.json) dataset. Upload the dataset before fine-tuning; the cell below expects it at `/kaggle/input/finance/finance_alpaca.json`. The fine-tuned model is saved to the directory given by `output_dir`, which is `./finetuned_model` here.
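
Before starting a long training run, a quick sanity check of the uploaded file can save time. The sketch below assumes the dataset is a JSON list of Alpaca-style records with at least `instruction` and `output` fields; that layout is inferred from the file name and is not confirmed by the notebook. The helper `check_alpaca_file` is hypothetical.

```python
import json
from pathlib import Path

# Assumed Alpaca-style fields; an "input" field may also be present.
REQUIRED_KEYS = {"instruction", "output"}


def check_alpaca_file(path):
    """Load a JSON list of records and confirm each has the assumed fields.

    Returns the number of records, or raises ValueError on the first problem.
    """
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON list of records")
    for i, record in enumerate(records):
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"record {i} is missing {sorted(missing)}")
    return len(records)
```

On Kaggle this would be called as `check_alpaca_file("/kaggle/input/finance/finance_alpaca.json")`, using the same path that is passed to `DataArguments`.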

In [6]:
from transformers import TrainingArguments
from intel_extension_for_transformers.neural_chat.config import (
    ModelArguments,
    DataArguments,
    FinetuningArguments,
    TextGenerationFinetuningConfig,
)
from intel_extension_for_transformers.neural_chat.chatbot import finetune_model
model_args = ModelArguments(model_name_or_path="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
data_args = DataArguments(train_file="/kaggle/input/finance/finance_alpaca.json", validation_split_percentage=1)
training_args = TrainingArguments(
    output_dir='./finetuned_model',
    do_train=True,
    do_eval=True,
    num_train_epochs=3,
    overwrite_output_dir=True,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,
    save_strategy="no",
    log_level="info",
    save_total_limit=2,
    bf16=True,
)
finetune_args = FinetuningArguments()
finetune_cfg = TextGenerationFinetuningConfig(
    model_args=model_args,
    data_args=data_args,
    training_args=training_args,
    finetune_args=finetune_args,
)
finetune_model(finetune_cfg)

[INFO|training_args.py:2048] 2024-07-15 09:23:52,880 >> PyTorch: setting up devices
[INFO|training_args.py:1751] 2024-07-15 09:23:52,908 >> The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).
[INFO|configuration_utils.py:733] 2024-07-15 09:23:53,253 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--TinyLlama--TinyLlama-1.1B-Chat-v1.0/snapshots/fe8a4ea1ffedaf415f4da2f062534de366a451e6/config.json
[INFO|configuration_utils.py:800] 2024-07-15 09:23:53,255 >> Model config LlamaConfig {
  "_name_or_path": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 2048,
  "i

trainable params: 1,126,400 || all params: 1,101,174,784 || trainable%: 0.10229075496156657
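
The trainable-parameter count in this log line can be reproduced by hand. The sketch below assumes the default `FinetuningArguments()` applies LoRA with rank 8 to the `q_proj` and `v_proj` attention projections; the notebook does not state this, so treat it as an assumption that happens to match the logged number. The model dimensions come from the `LlamaConfig` printed in the logs.

```python
# Values from the LlamaConfig shown in the notebook output.
hidden_size = 2048
num_attention_heads = 32
num_key_value_heads = 4
num_hidden_layers = 22

# Assumption: LoRA rank 8 on q_proj and v_proj (not stated in the notebook).
lora_rank = 8

head_dim = hidden_size // num_attention_heads
kv_dim = num_key_value_heads * head_dim  # output width of v_proj


def lora_params(in_features, out_features, rank):
    """A LoRA adapter adds two matrices: (rank x in) and (out x rank)."""
    return rank * (in_features + out_features)


per_layer = (
    lora_params(hidden_size, hidden_size, lora_rank)  # q_proj
    + lora_params(hidden_size, kv_dim, lora_rank)     # v_proj
)
trainable = per_layer * num_hidden_layers

all_params = 1_101_174_784  # "all params" from the log line
print(f"trainable params: {trainable:,}")
print(f"trainable%: {100 * trainable / all_params}")
```

Under these assumptions the count comes to 1,126,400, the same figure the log reports, which is roughly 0.1% of the model's parameters.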


[INFO|trainer.py:642] 2024-07-15 09:23:57,546 >> Using auto half precision backend
[INFO|trainer.py:2128] 2024-07-15 09:23:58,046 >> ***** Running training *****
[INFO|trainer.py:2129] 2024-07-15 09:23:58,047 >>   Num examples = 3,733
[INFO|trainer.py:2130] 2024-07-15 09:23:58,047 >>   Num Epochs = 3
[INFO|trainer.py:2131] 2024-07-15 09:23:58,048 >>   Instantaneous batch size per device = 4
[INFO|trainer.py:2134] 2024-07-15 09:23:58,049 >>   Total train batch size (w. parallel, distributed & accumulation) = 8
[INFO|trainer.py:2135] 2024-07-15 09:23:58,051 >>   Gradient Accumulation steps = 2
[INFO|trainer.py:2136] 2024-07-15 09:23:58,051 >>   Total optimization steps = 1,401
[INFO|trainer.py:2137] 2024-07-15 09:23:58,054 >>   Number of trainable parameters = 1,126,400
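
The figures in this training summary are related by simple arithmetic. The sketch below is one way to reproduce them, assuming a single device and a dataloader that keeps the final partial batch; it is a worked check, not the trainer's actual code.

```python
import math

# Values from the training log and TrainingArguments above.
num_examples = 3733
per_device_train_batch_size = 4
gradient_accumulation_steps = 2
num_train_epochs = 3
num_devices = 1  # assumption: single-device run

# Effective batch size per optimizer update.
total_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Mini-batches per epoch, keeping the last partial batch.
batches_per_epoch = math.ceil(num_examples / per_device_train_batch_size)

# Optimizer updates per epoch; leftover mini-batches are floored away.
updates_per_epoch = max(batches_per_epoch // gradient_accumulation_steps, 1)

total_steps = math.ceil(num_train_epochs * updates_per_epoch)

print(total_batch_size, batches_per_epoch, updates_per_epoch, total_steps)
```

With these inputs the result is an effective batch size of 8 and 1,401 optimization steps, matching the log.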


| Step | Training Loss |
|------|---------------|
| 500  | 0.8514        |
| 1000 | 0.8295        |


[INFO|trainer.py:2383] 2024-07-15 10:47:59,242 >> 

Training completed. Do not forget to share your model on huggingface.co/models =)


[INFO|trainer.py:3478] 2024-07-15 10:47:59,248 >> Saving model checkpoint to ./finetuned_model
[INFO|tokenization_utils_base.py:2574] 2024-07-15 10:47:59,292 >> tokenizer config file saved in ./finetuned_model/tokenizer_config.json
[INFO|tokenization_utils_base.py:2583] 2024-07-15 10:47:59,293 >> Special tokens file saved in ./finetuned_model/special_tokens_map.json
[INFO|trainer.py:3788] 2024-07-15 10:47:59,299 >> 
***** Running Evaluation *****
[INFO|trainer.py:3790] 2024-07-15 10:47:59,299 >>   Num examples = 38
[INFO|trainer.py:3793] 2024-07-15 10:47:59,300 >>   Batch size = 4


***** eval metrics *****
  epoch                   =        3.0
  eval_loss               =     1.0709
  eval_ppl                =      2.918
  eval_runtime            = 0:00:08.20
  eval_samples            =         38
  eval_samples_per_second =      4.629
  eval_steps_per_second   =      1.218
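
The `eval_ppl` value appears to be the perplexity derived from `eval_loss`, that is, the exponential of the mean cross-entropy loss. A small check using the rounded numbers from the table above:

```python
import math

eval_loss = 1.0709  # from the eval metrics above (already rounded)

# Perplexity is the exponential of the average per-token cross-entropy loss.
perplexity = math.exp(eval_loss)
print(round(perplexity, 3))
```

A perplexity near 2.9 means that, on average, the model is about as uncertain as if it were choosing uniformly among roughly three tokens at each step of the validation text.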


## Inference with the finetuned model
The fine-tuned model's output files are stored in `./finetuned_model`, so `peft_model_path` is set to `./finetuned_model` and `base_model_path` to `TinyLlama/TinyLlama-1.1B-Chat-v1.0`. The question to ask is stored in the `query` variable.

In [10]:
from intel_extension_for_transformers.neural_chat.models.model_utils import load_model, predict_stream
from transformers import set_seed
set_seed(27)

base_model_path = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
peft_model_path = "./finetuned_model"

load_model(
    model_name=base_model_path,
    tokenizer_name=base_model_path,
    peft_path=peft_model_path,
)

template = """
### System:
- You are a helpful finance chatbot.
- You answer questions.
- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- You give answers to finance related questions</s>
### User:
{}</s>
### Assistant:
"""

query = "what is meant by cost of equity?"

params = {
    "prompt": template.format(query),
    "model_name": base_model_path,
    "use_cache": True,
    "repetition_penalty": 1.0,
    "temperature": 0.1,
    "top_k": 10,
    "top_p": 0.75,
    "num_beams": 1,
    "max_new_tokens": 1000,
}

for new_text in predict_stream(**params):
    print(new_text, end="", flush=True)
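
To ask several questions without editing the cell each time, the prompt assembly above can be pulled into a small helper. The sketch below reuses the same template text; the function name `build_prompt` and the empty-query check are illustrative additions, not part of the original notebook.

```python
TEMPLATE = """
### System:
- You are a helpful finance chatbot.
- You answer questions.
- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- You give answers to finance related questions</s>
### User:
{}</s>
### Assistant:
"""


def build_prompt(query):
    """Insert a user question into the chat template used for inference."""
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    return TEMPLATE.format(query)


prompt = build_prompt("what is meant by cost of equity?")
print(prompt)
```

The resulting string can be passed as the `"prompt"` entry of `params` in place of `template.format(query)`.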


Loading model TinyLlama/TinyLlama-1.1B-Chat-v1.0


[INFO|configuration_utils.py:733] 2024-07-15 10:51:05,080 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--TinyLlama--TinyLlama-1.1B-Chat-v1.0/snapshots/fe8a4ea1ffedaf415f4da2f062534de366a451e6/config.json
[INFO|configuration_utils.py:800] 2024-07-15 10:51:05,082 >> Model config LlamaConfig {
  "_name_or_path": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 2048,
  "initializer_range": 0.02,
  "intermediate_size": 5632,
  "max_position_embeddings": 2048,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 22,
  "num_key_value_heads": 4,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.42

Cost of equity is a financial metric used to evaluate the cost of capital for a company. It is calculated by dividing the total return on equity (ROE) by the cost of equity (COE). The COE is a measure of the cost of capital for a company, which is the amount of money required to finance the company's operations.

The cost of equity is important for investors because it provides a measure of the risk associated with investing in a company. It takes into account the cost of capital, which includes interest rates, taxes, and other expenses associated with investing in the company. By comparing the cost of equity to the ROE, investors can determine whether the company's investment is profitable or not.

The cost of equity is typically calculated using a discounted cash flow (DCF) model. This model assumes that the company will continue to generate future cash flows at a constant rate, and that the cash flows are discounted to their present value using a discount rate. The discount rate is 