RAG 2.0 refers to fine-tuning and optimizing both the LLM and the retriever end-to-end for better RAG.
Each training example consists of:
- Question
- Answer (a normal answer or a chain-of-thought answer)
- Relevant_docs + Irrelevant_docs

so that the LLM learns how to extract and compose relevant answers from the mess of chunks.
Dataset
The dataset requires a question, an answer, and the context from which the question is answered, called the oracle_context, plus distractors (chunks randomly sampled from the rest of the corpus). The oracle_context and distractors are randomly interleaved, and the question is appended at the end; documents + question constitute the instruction for the LLM (a construction sketch follows the parameter list below).
Parameters of the dataset:
- p_value: the fraction of data points in which the oracle_context is included; this affects model behaviour to a significant extent. The dataset above has p_value = 1.0, meaning 100% of examples include the oracle_context.
- num_distractors: the number of distractor documents to include.
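For concreteness, here is a minimal sketch of how one such example might be assembled. The function and field names are illustrative, not the project's actual code:

```python
import random

def build_raft_example(question, answer, oracle_context, corpus_chunks,
                       p_value=1.0, num_distractors=3):
    """Assemble one RAFT-style training example (illustrative sketch)."""
    # Sample distractor chunks from the corpus, excluding the oracle itself.
    pool = [c for c in corpus_chunks if c != oracle_context]
    docs = random.sample(pool, min(num_distractors, len(pool)))

    # With probability p_value, include the oracle_context as well.
    if random.random() < p_value:
        docs.append(oracle_context)

    # Randomly interleave oracle and distractors, then append the question.
    random.shuffle(docs)
    context_block = "\n\n".join(
        f"Document {i + 1}:\n{d}" for i, d in enumerate(docs)
    )
    instruction = f"{context_block}\n\nQuestion: {question}"

    return {"instruction": instruction, "output": answer}
```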
Fine-tuning overview
Mixtral 7B was used as the base model for fine-tuning, and LoRA with 4-bit quantization was used as the fine-tuning technique.
Initially, data containing only question-answer pairs was used to fine-tune Mixtral 7B for around 2000 epochs, which showed a significant decrease in training loss and eval loss. The model was then fine-tuned further on data containing context + question + answer for around 200 epochs.
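A minimal sketch of a 4-bit LoRA (QLoRA-style) setup with Hugging Face transformers and peft; the model id and LoRA hyperparameters below are placeholders, since the exact values used are not stated in this issue:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit (NF4) quantization config for QLoRA-style fine-tuning.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # placeholder id; the issue says "Mixtral 7B"
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters on the attention projections (a typical choice; the
# hyperparameters here are illustrative, not the experiment's actual ones).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```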
Performance comparison
Inference on the fine-tuned model and the base model was run on 250 samples randomly drawn from the test set, and the outputs were then quantitatively evaluated using metrics from the RAGAS library and the Samagra llm_evaluator (an evaluation sketch follows the list). Metrics include:
- answer correctness
- answer relevancy
- answer similarity
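A minimal sketch of the RAGAS side of this evaluation, assuming the pre-1.0 `ragas.evaluate` API; `eval_samples` is a hypothetical list holding the 250 test samples with model outputs attached:

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_correctness, answer_relevancy, answer_similarity

# eval_samples: hypothetical list of dicts with the 250 test samples.
# Column names follow the RAGAS schema (question/answer/contexts/ground_truth).
eval_ds = Dataset.from_dict({
    "question": [s["question"] for s in eval_samples],
    "answer": [s["model_answer"] for s in eval_samples],      # model output
    "contexts": [s["retrieved_docs"] for s in eval_samples],  # list[str] per row
    "ground_truth": [s["reference_answer"] for s in eval_samples],
})

result = evaluate(
    eval_ds,
    metrics=[answer_correctness, answer_relevancy, answer_similarity],
)
print(result)
```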
The fine-tuned model performed relatively better than the base model for RAG and also adds explainability to its answers, which both the base LLM's answers and even the ground-truth answers lack.
**model 2 is the fine-tuned model**
Future Plans for Improvement:
The initial data used p_value = 1.0; in further iterations, different p_values may yield a better fine-tuned model, and a lower p_value also reduces overfitting.
Chain-of-thought answers will also be used instead of normal answers for fine-tuning, which can lead to better fine-tuned models (illustrated below).
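For illustration, a chain-of-thought target in the RAFT paper's style cites spans from the oracle document with ##begin_quote##/##end_quote## markers before giving the final answer; the concrete example below is hypothetical:

```python
# Hypothetical CoT-style target answer in the RAFT paper's format
# (marker syntax taken from the RAFT paper; content is made up).
cot_answer = (
    "##Reason: The context states ##begin_quote##the scheme covers all "
    "smallholder farmers##end_quote##, so eligibility is not limited by "
    "land size. ##Answer: All smallholder farmers are eligible."
)
```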
A comparison will also be conducted among:
- base model (QA)
- fine-tuned model on QA without RAG
- fine-tuned model on QA with RAG
- GPT-3.5/4 without RAG
- GPT-3.5/4 with RAG
- RAFT fine-tuned without CoT, without RAG
- RAFT fine-tuned without CoT, with RAG
- RAFT fine-tuned with CoT, with RAG

Ideally, query-doc-answer RAG and query-doc-cot-answer RAG should beat the larger models like GPT-3.5/4.
Fine-tuned two models, mixtral-base and mixtral-instruct, on the RAFT data format.
Performed a comparison among RAG+gpt3.5, RAG+finetune_base, and RAG+finetune_instruct.
The fine-tuned mixtral base and instruct models perform comparably to gpt3.5 on some metrics, such as answer similarity, and outperform it on others, such as answer relevancy.
The instruct fine-tune's answers show better control than the base fine-tune's.
cc @GautamR-Samagra
cc @ChakshuGautam
References
RAFT research paper: https://github.com/ShishirPatil/gorilla/blob/gh-pages/assets/RAFT.pdf