# Fine-tuning Gemma using Ludwig on GST-FAQs dataset


🙌 Welcome to this hands-on tutorial exploring the capabilities of [Ludwig](https://ludwig.ai/latest/) 0.8 for building a Question Answering model for FAQs (Frequently Asked Questions) on GST (Goods and Services Tax) in India.

Ludwig is an open-source package for training machine learning models through declarative config files. It is used here in two modes: Encoder-Combiner-Decoder (ECD) and fine-tuning LLMs via instruction tuning.

A bit more info about GST: GST is a single tax structure that replaces the multitude of taxes that existed before in India, such as service tax, central excise duty, VAT, and more. It is an all-in-one tax that streamlines the entire tax process in India. The transition from a multi-tax system to a single-tax system raises a lot of queries. These queries, along with their answers, are available as FAQs. An ML model or a fine-tuned LLM trained on them could serve as the basis for a chatbot-like application.

👉👉 Step-by-step explanation of the solution is available [here](https://medium.com/analytics-vidhya/how-to-fine-tune-llms-without-coding-41cf8d4b5d23).

## Installation 🧰

This notebook needs a HuggingFace API token, access approval for Gemma-7b-it, and a GPU with at least 12 GiB of VRAM. A T4 GPU is used here. You **MUST** connect to a T4 GPU runtime for the following steps to work.
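Before running the installation, it can help to confirm that the runtime really exposes a GPU with enough memory. The following is a minimal sketch, assuming PyTorch is already present in the runtime (as it usually is on Colab). The helper `enough_vram` and the constant `MIN_VRAM_GIB` are illustrative names, not part of Ludwig.

```python
MIN_VRAM_GIB = 12  # minimum stated in the requirements above


def enough_vram(total_bytes, min_gib=MIN_VRAM_GIB):
    """Return True if total_bytes is at least min_gib GiB."""
    return total_bytes / 2**30 >= min_gib


try:
    import torch
except ImportError:  # torch not installed yet in this runtime
    torch = None

if torch is not None and torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, {props.total_memory / 2**30:.1f} GiB")
    print("Enough VRAM:", enough_vram(props.total_memory))
else:
    print("No CUDA GPU visible; switch the runtime to a GPU before continuing.")
```

If the check reports no GPU, change the runtime type and rerun it before installing anything.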

In [1]:
!pip uninstall -y tensorflow --quiet
!pip install Cython # do this before installing torch which is inside ludwig, to avoid "_C" error
!pip install ludwig
!pip install ludwig[llm]
!pip install accelerate
from accelerate.utils import write_basic_config; write_basic_config(mixed_precision='fp16')
!pip install -i https://pypi.org/simple/ bitsandbytes  # latest
# !pip install bitsandbytes==0.41.3 --prefer-binary --extra-index-url=https://jllllll.github.io/bitsandbytes-windows-webui # overriding 0.40.2 which comes with Ludwig
#You are calling `save_pretrained` to a 4-bit converted model, but your `bitsandbytes` version doesn't support it. If you want to save 4-bit models, make sure to have `bitsandbytes>=0.41.3` installed.

Collecting sentence-transformers (from ludwig[llm])
  Using cached sentence_transformers-2.5.1-py3-none-any.whl (156 kB)
Collecting faiss-cpu (from ludwig[llm])
  Using cached faiss_cpu-1.8.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (27.0 MB)
Collecting loralib (from ludwig[llm])
  Using cached loralib-0.1.2-py3-none-any.whl (10 kB)
Collecting peft (from ludwig[llm])
  Using cached peft-0.9.0-py3-none-any.whl (190 kB)
Installing collected packages: loralib, faiss-cpu, sentence-transformers, peft
Successfully installed faiss-cpu-1.8.0 loralib-0.1.2 peft-0.9.0 sentence-transformers-2.5.1
Looking in indexes: https://pypi.org/simple/


Enable text wrapping so we don't have to scroll horizontally, and create a function to flush the CUDA cache.

In [2]:
from IPython.display import HTML, display

def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))

get_ipython().events.register('pre_run_cell', set_css)

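def clear_cache():
  # Imported lazily so this cell does not depend on the later import cell.
  import gc
  import torch
  # Drop references to the model at the call site (e.g. `del model`) first;
  # assigning `model = None` inside this function would only affect a local name.
  gc.collect()
  if torch.cuda.is_available():
    torch.cuda.empty_cache()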

Sometimes the error `NameError: name '_C' is not defined` appears; see https://github.com/pytorch/pytorch/issues/1633 for the solution.

-> **Setup Your HuggingFace Token** 🤗

We'll be using Gemma-7b-it, a model released by Google. However, the model is not openly accessible: you need to request access, which is then tied to your HuggingFace token.

Obtain a [HuggingFace API Token](https://huggingface.co/settings/tokens) and request access to [gemma-7b-it](https://huggingface.co/google/gemma-7b-it) before proceeding. You may need to sign up on HuggingFace if you don't already have an account: https://huggingface.co/join

In [3]:
import getpass
# import locale; locale.getpreferredencoding = lambda: "utf-8"
import logging
import os
import torch
import yaml

from ludwig.api import LudwigModel


os.environ["HUGGING_FACE_HUB_TOKEN"] = getpass.getpass("Token:")
assert os.environ["HUGGING_FACE_HUB_TOKEN"]

Token:··········


In [4]:
# os.environ["HF_TOKEN"] = getpass.getpass("Token:")
# assert os.environ["HF_TOKEN"] #give same as HUGGING_FACE_HUB_TOKEN
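The cells above set `HUGGING_FACE_HUB_TOKEN` and leave `HF_TOKEN` commented out, while the warning printed later during training mentions `HF_TOKEN`. A small helper can fill both names from a single prompt. This is only a sketch: `ensure_hf_token` is a hypothetical helper, and which variable is actually read depends on the installed `huggingface_hub` version.

```python
import getpass
import os


def ensure_hf_token(prompt="Token:"):
    """Reuse a token already in the environment, else prompt for one once."""
    token = os.environ.get("HF_TOKEN") or os.environ.get("HUGGING_FACE_HUB_TOKEN")
    if not token:
        token = getpass.getpass(prompt)
    if not token:
        raise ValueError("A HuggingFace token is required to download the gated model.")
    # Set both names so either lookup finds the same token.
    os.environ["HF_TOKEN"] = token
    os.environ["HUGGING_FACE_HUB_TOKEN"] = token
    return token
```

Calling `ensure_hf_token()` once at the top of the notebook replaces both token cells.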

## Dataset
The data is available as a CSV file on GitHub [here](https://raw.githubusercontent.com/yogeshhk/Sarvadnya/master/src/ludwig/data/cbic-gst_gov_in_fgaq.csv). Download it once with `wget` from the location given below and keep it in the `data` folder, then comment out this cell for further executions.

In [5]:
# !pip install wget
# import wget

# # Replace the URL with the raw URL of the file on GitHub
# url = "https://raw.githubusercontent.com/yogeshhk/Sarvadnya/master/src/ludwig/data/cbic-gst_gov_in_fgaq.csv"

# # Download the file
# wget.download(url, 'cbic-gst_gov_in_fgaq.csv')
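Instead of commenting the cell in and out, the download can be made idempotent so that it is skipped when the file is already present. The sketch below uses only the standard library rather than the `wget` package; `download_once` and the `data/` destination in the usage line are illustrative choices.

```python
import os
import urllib.request

DATA_URL = (
    "https://raw.githubusercontent.com/yogeshhk/Sarvadnya/master/"
    "src/ludwig/data/cbic-gst_gov_in_fgaq.csv"
)


def download_once(url, dest_path):
    """Download url to dest_path unless the file already exists."""
    if os.path.exists(dest_path):
        return dest_path  # already downloaded, nothing to do
    os.makedirs(os.path.dirname(dest_path) or ".", exist_ok=True)
    urllib.request.urlretrieve(url, dest_path)
    return dest_path


# Example usage (needs network access):
# download_once(DATA_URL, "data/cbic-gst_gov_in_fgaq.csv")
```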

-> Mounting Google Drive needs permission. Change the Drive location below to the folder where the CSV file needed for the notebook resides.

In [6]:
from google.colab import drive
drive.mount('/content/drive')

import os
os.chdir('/content/drive/MyDrive/ImpDocs/Work/AICoach/Notebooks')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [7]:
from google.colab import data_table; data_table.enable_dataframe_formatter()
import numpy as np; np.random.seed(123)
import pandas as pd

Change the path below to where the CSV file needed for the notebook resides.

In [8]:
df = pd.read_csv('/content/drive/MyDrive/ImpDocs/Work/AICoach/Notebooks/data/cbic-gst_gov_in_fgaq.csv', encoding='cp1252')
df.head()


| Unnamed: 0 | Question | Answer |
|---|---|---|
| 0 | Does aggregate turnover include value of inwar... | Refer Section 2(6) of CGST Act. Aggregate turn... |
| 1 | What if the dealer migrated with wrong PAN as ... | New registration would be required as partners... |
| 2 | A taxable person’s business is in many states.... | He is liable to register if the aggregate turn... |
| 3 | Can we use provisional GSTIN or do we get new ... | Provisional GSTIN (PID) should be converted in... |
| 4 | Whether trader of country liquor is required t... | If the person is involved in 100% supply of go... |


A crucial step is compiling a dataset that mirrors the real-world questions taxpayers grapple with, so this is a Question Answering dataset. Each row in the dataset consists of:
- `Question`, which describes a query
- `Answer`, which describes the corresponding answer
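Since training relies on both columns being present and filled, a quick sanity check on the raw file can catch problems early. The sketch below uses the standard `csv` module on CSV text so that it stays self-contained; `find_incomplete_rows` is a hypothetical helper and the sample text is made up for illustration. On the loaded DataFrame, `df[["Question", "Answer"]].isna()` would give a similar view.

```python
import csv
import io


def find_incomplete_rows(csv_text, required=("Question", "Answer")):
    """Return 0-based indices of data rows with a missing or blank required field."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing_cols = [c for c in required if c not in (reader.fieldnames or [])]
    if missing_cols:
        raise ValueError(f"Missing columns: {missing_cols}")
    bad_rows = []
    for i, row in enumerate(reader):
        # Short rows yield None for absent fields, hence the `or ""`.
        if any(not (row[c] or "").strip() for c in required):
            bad_rows.append(i)
    return bad_rows


# Made-up sample with the same column layout as the GST FAQ file.
sample = ",Question,Answer\n0,What is GST?,A single tax.\n1,Empty answer?,\n"
print(find_incomplete_rows(sample))
```

For the real file, read the text with `open(path, encoding="cp1252").read()` to match the encoding used in `pd.read_csv` above.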

## Running Ludwig: Training

Ludwig's declarative nature lets us define the architecture clearly, making the training process transparent and insightful.

`LudwigModel` is instantiated with the fine-tuning config `instruction_tuning_yaml`.

The config below defines instruction fine-tuning for the Gemma 7B model. It is based
on [this](https://predibase.com/blog/fine-tuning-mistral-7b-on-a-single-gpu-with-ludwig) tutorial, with a changed prompt.

In [9]:
instruction_tuning_yaml = yaml.safe_load("""
model_type: llm
base_model: google/gemma-7b-it

quantization:
 bits: 4

adapter:
 type: lora

prompt:
  template: |
    ### Instruction:
    You are a taxation expert on Goods and Services Tax used in India.
    Take the Input given below which is a Question. Give Answer for it as a Response.

    ### Input:
    {Question}

    ### Response:

input_features:
 - name: Question
   type: text
   preprocessing:
      max_sequence_length: 1024

output_features:
 - name: Answer
   type: text
   preprocessing:
      max_sequence_length: 384

trainer:
  type: finetune
  epochs: 5
  batch_size: 1
  eval_batch_size: 2
  gradient_accumulation_steps: 16  # effective batch size = batch size * gradient_accumulation_steps
  learning_rate: 2.0e-4
  enable_gradient_checkpointing: true
  learning_rate_scheduler:
    decay: cosine
    warmup_fraction: 0.03
    reduce_on_plateau: 0

generation:
  temperature: 0.1
  max_new_tokens: 512

backend:
 type: local
""")

The model is then trained on the DataFrame loaded from the GST CSV.
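The `prompt.template` in the config above contains a `{Question}` placeholder that is filled with the value of the `Question` column for each row, as the `Input:` lines in the training log show. The sketch below imitates that substitution with plain `str.format` to preview what the model receives. It is an illustration only, not Ludwig's internal implementation, and `render_prompt` is a hypothetical helper.

```python
PROMPT_TEMPLATE = """### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input:
{Question}

### Response:
"""


def render_prompt(question):
    """Fill the {Question} placeholder the way the template implies."""
    return PROMPT_TEMPLATE.format(Question=question)


print(render_prompt("Can we use provisional GSTIN or do we get new GSTIN?"))
```

Previewing a rendered prompt this way makes it easier to spot template mistakes before a long training run.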




In [10]:
model_instruction_tuning = LudwigModel(config=instruction_tuning_yaml,  logging_level=logging.INFO)
results_instruction_tuning = model_instruction_tuning.train(dataset=df)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
INFO:ludwig.utils.print_utils:
INFO:ludwig.utils.print_utils:╒════════════════════════╕
INFO:ludwig.utils.print_utils:│ EXPERIMENT DESCRIPTION │
INFO:ludwig.utils.print_utils:╘════════════════════════╛
INFO:ludwig.utils.print_utils:
INFO:ludwig.api:╒══════════════════╤═════════════════════════════════════════════════════════════════════════════════════════╕
│ Experiment name  │ api_experiment                                                                          │
├──────────────────┼─────────────────────────────────────────────────────────────────────────────────────────┤
│ Mode

Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]

You are calling `save_pretrained` to a 4-bit converted model, but your `bitsandbytes` version doesn't support it. If you want to save 4-bit models, make sure to have `bitsandbytes>=0.41.3` installed.
INFO:ludwig.models.llm:Done.
INFO:ludwig.utils.tokenizers:Loaded HuggingFace implementation of google/gemma-7b-it tokenizer
INFO:ludwig.models.llm:Trainable Parameter Summary For Fine-Tuning
INFO:ludwig.models.llm:Fine-tuning with adapter: lora
INFO:ludwig.trainers.trainer:Gradient checkpointing enabled for training.
INFO:ludwig.utils.print_utils:
INFO:ludwig.utils.print_utils:╒══════════╕
INFO:ludwig.utils.print_utils:│ TRAINING │
INFO:ludwig.utils.print_utils:╘══════════╛
INFO:ludwig.utils.print_utils:


trainable params: 3,211,264 || all params: 8,540,892,160 || trainable%: 0.037598695075901765


INFO:ludwig.trainers.trainer:Creating fresh model training run.
INFO:ludwig.trainers.trainer:Training for 540 step(s), approximately 5 epoch(s).
INFO:ludwig.trainers.trainer:Early stopping policy: 5 round(s) of evaluation, or 540 step(s), approximately 5 epoch(s).

INFO:ludwig.trainers.trainer:Starting with step 0, epoch: 0


Training:  20%|██        | 108/540 [02:01<08:07,  1.13s/it, loss=4.93]

INFO:ludwig.trainers.trainer:
Running evaluation for step: 108, epoch: 1


Evaluation valid: 100%|██████████| 8/8 [00:05<00:00,  1.47it/s]

INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input:
Whether taxi aggregators can issue any other document in lieu of invoice and issue consolidated invoice for such services?

### Response:
INFO:ludwig.trainers.trainer_llm:Output: Please are given software consultant and a and Services Tax ( in India. ExplainPlease a following Tax below and includes the list and
 the in the. GST GST.

** Input:
The GST servicesators like charge invoices GST GST than lieu of invoice for receipt GST bill for multiple services.

### Answer:
Whether GST mandates for to issue service providers to issue alternative or other invoices instead a year of the date of service of service. If for services insurance services providers, all providers are as taxi aggregators can not have to option to issue consolidated invoices., they option to consolidate


Evaluation test : 100%|██████████| 15/15 [00:10<00:00,  1.48it/s]

INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input:
How to comply with 9(4) of CGST Act if POS is in another State of the unregistered supplier

### Response:
INFO:ludwig.trainers.trainer_llm:Output: Please are given software consultant and a and Services Tax ( in India. ExplainPlease a following Tax below and includes the list and
 the in the. GST GST.

** Input:
The much calculate with GST1%2) of theST Act, a is not a state than India Union business?** Response:
The person who supply-State supplies of to complysorily register a under collect comply such cases, the 9(4) of apply apply into play.**
INFO:ludwig.trainers.trainer_llm:--------------------
INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Gi


Training:  40%|████      | 216/540 [07:29<06:20,  1.18s/it, loss=2.16]

INFO:ludwig.trainers.trainer:
Running evaluation for step: 216, epoch: 2


Evaluation valid: 100%|██████████| 8/8 [00:05<00:00,  1.48it/s]

INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input:
Whether taxi aggregators can issue any other document in lieu of invoice and issue consolidated invoice for such services?

### Response:
INFO:ludwig.trainers.trainer_llm:Output: ** are given software consultant and a and Services Tax ( in India. You a case Tax below and contains the list and
 the which each. GST GST.

** Input:
 GST servicesators are charge invoices GST invoices than lieu of invoice for receipt GST bill for multiple services.

### Answer:
 GST mandates for to levy service providers to issue any or other invoices in the year of the date of service. service. for services insurance services providers, all providers whose as airlines aggregators, not have to option to issue consolidated invoices., they consolidated to consolidated flexibility invoices is suc


Evaluation test : 100%|██████████| 15/15 [00:10<00:00,  1.49it/s]

INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input:
How to comply with 9(4) of CGST Act if POS is in another State of the unregistered supplier

### Response:
INFO:ludwig.trainers.trainer_llm:Output: ** are given software consultant and a and Services Tax ( in India. You a case Tax below and contains the list and
 the which each. GST GST.

** Input:
 much calculate with GST1%2) of GSTST Act, turnover is notoperative state? Union Union business? Answer:
 supplies who supplies-State supplies of to appointsorily register GST. collect obtain such cases, reverse 9(4) of apply apply into play.
INFO:ludwig.trainers.trainer_llm:--------------------
INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer f


Training:  60%|██████    | 324/540 [14:32<04:01,  1.12s/it, loss=3.21]

INFO:ludwig.trainers.trainer:
Running evaluation for step: 324, epoch: 3


Evaluation valid: 100%|██████████| 8/8 [00:05<00:00,  1.54it/s]

INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input:
Whether taxi aggregators can issue any other document in lieu of invoice and issue consolidated invoice for such services?

### Response:
INFO:ludwig.trainers.trainer_llm:Output: ** have given software consultant and a and Services Tax ( to Oman. You a case Tax below and is the list and
 the in the. GST GST.

** Input:
 the servicesators are charge invoices invoices invoices other lieu of invoice

 whether tax bill. the services.

 Answer:
 GST is for to the service providers to issue a, other invoices.  year of the date of service. service. in service insurance services,, the providers are as airlines aggregators, not have to option to issue consolidated invoices., the other to consolidated flexibility invoices is such service providers is be implemented inThe
INFO:ludwi


Evaluation test : 100%|██████████| 15/15 [00:09<00:00,  1.53it/s]

INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input:
How to comply with 9(4) of CGST Act if POS is in another State of the unregistered supplier

### Response:
INFO:ludwig.trainers.trainer_llm:Output: ** have given software consultant and a and Services Tax ( to Oman. You a case Tax below and is the list and
 the in the. GST GST.

** Input:
 much calculate with GST102) of GSTST Act, a is notoperative state

 Union Union taxable. Answer:
 registered who taxable-State taxable of to appointsorily register GST. collect obtain such cases, the 9(4) is apply apply into play.
INFO:ludwig.trainers.trainer_llm:--------------------
INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.





INFO:ludwig.trainers.trainer:Evaluation took 16.7105s

INFO:ludwig.utils.metrics_printed_table:╒═══════════════════════╤═════════════╤══════════════╤═════════════╕
│                       │       train │   validation │        test │
╞═══════════════════════╪═════════════╪══════════════╪═════════════╡
│ bleu                  │      0.0283 │       0.0294 │      0.0276 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ char_error_rate       │      2.0717 │       2.0910 │      2.0550 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ loss                  │     53.0248 │      54.8123 │     51.8379 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ next_token_perplexity │ 182020.6719 │  186605.5625 │ 183132.5625 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ perplexity            │ 247850.0000 │  246450.1719 │ 246424.7969 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ rouge1

Training:  80%|████████  | 432/540 [24:13<02:03,  1.14s/it, loss=2.32]

INFO:ludwig.trainers.trainer:
Running evaluation for step: 432, epoch: 4


Evaluation valid: 100%|██████████| 8/8 [00:05<00:00,  1.46it/s]

INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input:
Whether taxi aggregators can issue any other document in lieu of invoice and issue consolidated invoice for such services?

### Response:
INFO:ludwig.trainers.trainer_llm:Output: ** have given cashier consultant and the and Services Tax ( to the.

Draft a following Tax in and is a  and
 возле in возле {; a возлеچنینDenote<strong>Denote:
Whether the,ators are charge invoices invoice invoice, lieu of invoice./ of refund in a supply.

Denote ()=>:
 IG is for to the persons providers to charge a, other invoices in  of of the date of the service in services other services,, no providers, as the-ators, not have to option to issue a invoices., in said to the such invoices is  other providers, be taken inDenote
INFO:ludwig.trainers.trainer_llm:--------------------
INFO:ludwig.tra


Evaluation test : 100%|██████████| 15/15 [00:10<00:00,  1.49it/s]

INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input:
How to comply with 9(4) of CGST Act if POS is in another State of the unregistered supplier

### Response:
INFO:ludwig.trainers.trainer_llm:Output: ** have given cashier consultant and the and Services Tax ( to the.

Draft a following Tax in and is a  and
 возле in возле {; a возлеچنینDenote<strong>Denote:
Whether will get with the202) of CGST/, a is notoperative state

 Union Union... ()=>:
 other who supplies-State supplies of to takesorily get a. charge, the case, the 9(4) of apply apply into playDenote
INFO:ludwig.trainers.trainer_llm:--------------------
INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input




INFO:ludwig.trainers.trainer:Evaluation took 17.2930s

INFO:ludwig.utils.metrics_printed_table:╒═══════════════════════╤═════════════╤══════════════╤═════════════╕
│                       │       train │   validation │        test │
╞═══════════════════════╪═════════════╪══════════════╪═════════════╡
│ bleu                  │      0.0535 │       0.0000 │      0.0131 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ char_error_rate       │      1.8157 │       2.3676 │      2.5010 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ loss                  │     48.7508 │      43.7285 │     41.5017 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ next_token_perplexity │ 188558.4062 │  199410.3438 │ 195377.0000 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ perplexity            │ 249038.6875 │  246506.8281 │ 246050.0000 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ rouge1

Training: 100%|██████████| 540/540 [34:04<00:00,  1.19s/it, loss=1.65]

INFO:ludwig.trainers.trainer:
Running evaluation for step: 540, epoch: 5


Evaluation valid: 100%|██████████| 8/8 [00:05<00:00,  1.49it/s]

INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input:
Whether taxi aggregators can issue any other document in lieu of invoice and issue consolidated invoice for such services?

### Response:
INFO:ludwig.trainers.trainer_llm:Output: ** have given cashier officer and the and Services Tax Act to the.

 a case of by: is a rate and
 the/ the in per hyper.
** ()=>:
 GST-ators are charge invoice invoice invoice than lieu of invoice u receipt of bill for a supply.

<h3> ()=>:
 said does for to the operator providers to issue any, other invoice in the accounting of the date of service. service. in service insurance services,. in providers shall as airlines-ators, not have to option to issue invoice invoice., the said to the the invoice is the other providers has be taken.
INFO:ludwig.trainers.trainer_llm:--------------------
INFO:lu


Evaluation test : 100%|██████████| 15/15 [00:09<00:00,  1.53it/s]

INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input:
How to comply with 9(4) of CGST Act if POS is in another State of the unregistered supplier

### Response:
INFO:ludwig.trainers.trainer_llm:Output: ** have given cashier officer and the and Services Tax Act to the.

 a case of by: is a rate and
 the/ the in per hyper.
** ()=>:
 to get with GST201) rule CGST Rules Rules turnover is notoperative state. Union Union taxable. ()=>:
 person who supply-State supplies of to obtainsorily register registration in obtain, such case, the 2(4) of not be into force.
INFO:ludwig.trainers.trainer_llm:--------------------
INFO:ludwig.trainers.trainer_llm:Input: ### Instruction:
You are a taxation expert on Goods and Services Tax used in India.
Take the Input given below which is a Question. Give Answer for it as a Response.

### Input:
Is




INFO:ludwig.trainers.trainer:Evaluation took 16.8555s

INFO:ludwig.utils.metrics_printed_table:╒═══════════════════════╤═════════════╤══════════════╤═════════════╕
│                       │       train │   validation │        test │
╞═══════════════════════╪═════════════╪══════════════╪═════════════╡
│ bleu                  │      0.0175 │       0.0304 │      0.0248 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ char_error_rate       │      2.0932 │       2.1743 │      2.2569 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ loss                  │     40.3270 │      29.8554 │     28.4905 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ next_token_perplexity │ 184274.9531 │  185313.7656 │ 183398.9375 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ perplexity            │ 242438.2500 │  240000.2656 │ 239901.1719 │
├───────────────────────┼─────────────┼──────────────┼─────────────┤
│ rouge1

Training: 100%|██████████| 540/540 [40:06<00:00,  4.46s/it, loss=1.65]


INFO:ludwig.utils.print_utils:
INFO:ludwig.utils.print_utils:╒═════════════════╕
INFO:ludwig.utils.print_utils:│ TRAINING REPORT │
INFO:ludwig.utils.print_utils:╘═════════════════╛
INFO:ludwig.utils.print_utils:
INFO:ludwig.api:╒══════════════════════════════╤════════════════════╕
│ Validation feature           │ Answer             │
├──────────────────────────────┼────────────────────┤
│ Validation metric            │ loss               │
├──────────────────────────────┼────────────────────┤
│ Best model step              │ 540                │
├──────────────────────────────┼────────────────────┤
│ Best model epoch             │ 6                  │
├──────────────────────────────┼────────────────────┤
│ Best model's validation loss │ 29.855409622192383 │
├──────────────────────────────┼────────────────────┤
│ Best model's test loss       │ 28.49051856994629  │
╘══════════════════════════════╧════════════════════╛
INFO:ludwig.api:
Finished: api_experiment_run
INFO:ludwig.api:Saved to
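The numbers in the log can be related back to the trainer config with a little arithmetic. The sketch below assumes that each logged step corresponds to one batch of `batch_size` rows; that is an assumption about how steps are counted, not something the log states.

```python
import math

# Values from the trainer config above.
batch_size = 1
gradient_accumulation_steps = 16
epochs = 5

# Value reported in the training log ("Training for 540 step(s)").
total_steps = 540

effective_batch_size = batch_size * gradient_accumulation_steps
steps_per_epoch = total_steps // epochs

# Assumption: one step = one batch, so rows per epoch = steps * batch_size.
implied_training_rows = steps_per_epoch * batch_size
updates_per_epoch = math.ceil(steps_per_epoch / gradient_accumulation_steps)

print("effective batch size:", effective_batch_size)    # 16
print("steps per epoch:", steps_per_epoch)              # 108
print("implied training rows:", implied_training_rows)  # 108
print("weight updates per epoch:", updates_per_epoch)   # 7
```

Under that assumption the model sees only about a hundred training rows and makes only a handful of weight updates per epoch, which is worth keeping in mind when judging the results.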

The following config is for the ECD approach to the Question Answering problem, on top of the Gemma model.

In [11]:
qna_tuning_config_dict = {
    "model_type": "llm",
    "base_model": "google/gemma-7b-it",
    "input_features": [
        {
            "name": "Question",
            "type": "text",
            "encoder": {
                # "type": "auto_transformer",
                "pretrained_model_name_or_path": "google/gemma-7b-it",
                "trainable": False
            },
            "preprocessing": {
                "cache_encoder_embeddings": True
            }
        }
    ],
    "output_features": [
        {
            "name": "Answer",
            "type": "text"
        }
    ]
}

In [12]:
# model_ecd = LudwigModel(config=qna_tuning_config_dict)
# results_ecd = model_ecd.train(dataset=df)

## Running Ludwig: Inferencing
The test (inference) dataset has just a couple of questions for which answers are sought.

In [13]:
import pandas as pd
test_df = pd.DataFrame([
    {
        "Question": "If I am not an existing taxpayer and wish to newly register under GST, when can I do so?"
    },
    {
        "Question": "Does aggregate turnover include value of inward supplies received on which RCM is payable?"
    },
])


With Ludwig's training complete, we put the model to the test by feeding it a set of GST questions, to see the declarative AI framework in action.

**Predictions on fine-tuned model**

In [14]:
predictions_instruction_tuning_df, output_directory = model_instruction_tuning.predict(dataset=test_df)
print(predictions_instruction_tuning_df["Answer_response"].tolist())

INFO:ludwig.utils.tokenizers:Loaded HuggingFace implementation of google/gemma-7b-it tokenizer
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


Prediction: 100%|██████████| 1/1 [02:12<00:00, 132.84s/it]


INFO:ludwig.utils.tokenizers:Loaded HuggingFace implementation of google/gemma-7b-it tokenizer
  return np.sum(np.log(sequence_probabilities))
INFO:ludwig.api:Finished predicting in: 144.53s.


[[''], ['']]


In [15]:
# predictions_ecd_df, output_directory = model_ecd.predict(dataset=test_df)
# print(predictions_ecd_df["Answer_response"].tolist())

These answers are not acceptable: both responses are empty. There could be many reasons, such as the quality of the LLM or the training parameters; above all, a far bigger dataset is needed for fine-tuning.
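Empty responses like these are easy to miss in a longer test set, so it can help to pair each question with its prediction and flag the blanks. The sketch below works on plain lists shaped like the `Answer_response` output printed above; `summarize_predictions` is a hypothetical helper.

```python
def summarize_predictions(questions, responses):
    """Pair each question with its predicted text and flag empty predictions."""
    rows = []
    for question, response in zip(questions, responses):
        # Each response is a list of strings in the output shown above.
        if isinstance(response, (list, tuple)):
            text = " ".join(str(part) for part in response).strip()
        else:
            text = str(response).strip()
        rows.append({"question": question, "answer": text, "empty": not text})
    return rows


questions = [
    "If I am not an existing taxpayer and wish to newly register under GST, when can I do so?",
    "Does aggregate turnover include value of inward supplies received on which RCM is payable?",
]
responses = [[""], [""]]  # the output printed by the prediction cell

for row in summarize_predictions(questions, responses):
    status = "EMPTY" if row["empty"] else row["answer"]
    print(f"{row['question']} -> {status}")
```

In the notebook, the two inputs would come from `test_df["Question"].tolist()` and `predictions_instruction_tuning_df["Answer_response"].tolist()`.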

## **Observations** 🔎

In this run the fine-tuned model did not produce usable answers, as the empty predictions above show. Even so, Ludwig's declarative approach provides a clear and concise methodology for building machine learning models, which makes it a useful tool for complex domains. It becomes extremely easy to switch between these approaches, change the base LLM, and so on.
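Because the whole setup lives in a config dictionary, trying another base LLM amounts to changing one value. The sketch below copies a trimmed-down version of the config and swaps `base_model`. The helper `with_base_model` is hypothetical, the Mistral model id is only an example, and whether the remaining settings suit another model has not been verified here.

```python
import copy


def with_base_model(config, base_model):
    """Return a copy of config that points at a different base LLM."""
    new_config = copy.deepcopy(config)  # leave the original config untouched
    new_config["base_model"] = base_model
    return new_config


# Trimmed-down version of the instruction-tuning config used above.
gemma_config = {
    "model_type": "llm",
    "base_model": "google/gemma-7b-it",
    "quantization": {"bits": 4},
    "adapter": {"type": "lora"},
}

# Example model id only; any model you have access to could go here.
mistral_config = with_base_model(gemma_config, "mistralai/Mistral-7B-v0.1")

print(gemma_config["base_model"])
print(mistral_config["base_model"])
```

The resulting dictionary can be passed to `LudwigModel(config=...)` in the same way as `instruction_tuning_yaml`.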

## **Resources** 🧺
- How to Efficiently Fine-Tune Gemma-7B with Open-Source Ludwig https://predibase.com/blog/how-to-efficiently-fine-tune-gemma-7b-with-open-source-ludwig
- Fine-tuning Mistral 7B on a Single GPU with Ludwig https://predibase.com/blog/fine-tuning-mistral-7b-on-a-single-gpu-with-ludwig
- Efficient Fine-Tuning for Llama-v2-7b on a Single GPU https://www.youtube.com/watch?v=g68qlo9Izf0
- If you're new to LLMs, check out this webinar where Daliana Liu discusses the 10 things to know about LLMs: https://www.youtube.com/watch?v=fezMHMk7u5o&t=2027s&ab_channel=Predibase
- Ludwig 0.8 Release Blogpost for the full set of new features: https://predibase.com/blog/ludwig-v0-8-open-source-toolkit-to-build-and-fine-tune-custom-llms-on-your-data
- Ludwig Documentation: https://ludwig.ai/latest/