Fine-tuning LLMs for Various E-commerce Tasks

In this repo we fine-tune llama2-7B with LoRA on four e-commerce tasks: classification, named entity recognition (NER), description generation, and summarization.

We use a different baseline for each task: BERT for classification, BERT for NER, GPT-2/BART/T5 for description generation, and BART/T5 for summarization.
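
As a reminder of what LoRA changes in the model, the sketch below applies a low-rank update W' = W + (alpha / r) * B A to a small frozen weight matrix and compares trainable parameter counts. It is illustrative only: the helper names (`matmul`, `lora_update`, `lora_param_count`), the rank r = 8, and the toy matrices are assumptions, not values taken from this repo's training scripts. The 4096 x 4096 shape corresponds to an attention projection in llama2-7B.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_update(w, lora_b, lora_a, alpha, r):
    """Return W + (alpha / r) * (B @ A); W itself stays frozen during training."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def lora_param_count(d_out, d_in, r):
    """Trainable parameters in B (d_out x r) plus A (r x d_in)."""
    return r * (d_out + d_in)


# Toy example (hypothetical values): a 2 x 3 weight with a rank-1 adapter.
w = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
lora_b = [[1.0], [2.0]]        # shape d_out x r
lora_a = [[0.5, 0.0, -0.5]]    # shape r x d_in
w_adapted = lora_update(w, lora_b, lora_a, alpha=2, r=1)

# Parameter count for one 4096 x 4096 projection, assuming rank r = 8.
full_params = 4096 * 4096
lora_params = lora_param_count(4096, 4096, r=8)
print(f"full: {full_params}, LoRA: {lora_params}, ratio: {lora_params / full_params:.4%}")
```

Under these assumptions the adapter trains 65,536 parameters per projection instead of 16,777,216, which is why LoRA makes it practical to train one adapter per task.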

The questions we want to explore:

How many training samples does fine-tuning llama2-7B need to reach SOTA results on each task?
Is an LLM a good choice for e-commerce tasks compared with traditional baseline models?
Which performs better: fine-tuning one LLM on a mixed training dataset covering all tasks, or fine-tuning a separate LLM for each task?
Do we get a performance gain by merging LoRA weights trained on different tasks? Is LoRA merging a direction worth exploring?
What are the correlations between the LoRA weights of each pair of tasks, and between the tasks themselves?
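
The last two questions can be made concrete with a small sketch. One simple way to merge task adapters is a weighted average of their weight updates (delta W = B A), and one simple way to compare two adapters is the cosine similarity of their flattened updates. Both choices are assumptions made for illustration and are not necessarily the approach used in this repo; the function names and the toy matrices are hypothetical.

```python
import math


def merge_deltas(deltas, weights):
    """Weighted sum of same-shaped weight updates (each a list of rows)."""
    if len(deltas) != len(weights):
        raise ValueError("need exactly one weight per delta")
    rows, cols = len(deltas[0]), len(deltas[0][0])
    return [[sum(w * d[i][j] for d, w in zip(deltas, weights))
             for j in range(cols)] for i in range(rows)]


def cosine_similarity(delta_a, delta_b):
    """Cosine similarity between two updates, treated as flat vectors."""
    flat_a = [x for row in delta_a for x in row]
    flat_b = [x for row in delta_b for x in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(x * x for x in flat_a))
    norm_b = math.sqrt(sum(x * x for x in flat_b))
    return dot / (norm_a * norm_b)


# Hypothetical weight updates for two tasks (e.g. classification and NER).
delta_cls = [[1.0, 0.0], [0.0, 1.0]]
delta_ner = [[1.0, 0.0], [0.0, -1.0]]

merged = merge_deltas([delta_cls, delta_ner], weights=[0.5, 0.5])
similarity = cosine_similarity(delta_cls, delta_ner)
print("merged update:", merged)
print("cosine similarity:", similarity)
```

Note that averaging the products B A is not the same as averaging A and B separately and then multiplying, so the two merging strategies can give different results. An all-zero update would also need a guard against division by zero in `cosine_similarity`.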
