Automotive_NER

Fine tune a large language model (LLM) for performing named entity recognition (NER) task on an automotive dataset. NER task involves processing unstructured text data to extract useful information/entities.

HuggingFace Link of Fine Tuned Model: https://huggingface.co/Souvik2807/Llama-2-7b-Automotive-finetune-NER/tree/main

Dataset Link1: https://drive.google.com/drive/folders/1ohFCqEKD_J55KUbpxvbCarvywWkyeZP5?usp=sharing Note: Dataset is stored in data/FLAT_RCL.txt,

Dataset Link2: https://www.nhtsa.gov/nhtsa-datasets-and-apis#recalls Note: Download NHTSA Recall Dataset

This assignment is broken down in three tasks:

First task involves analyzing the data and identifying what are some automotive entities that can be extracted from this data. We are interested in entities related to automotive domain. Some examples could be component, failure issue, vehicle model, corrective action etc.
Second task is to use an open source LLM and write the prompt to extract the automotive domain entities from given dataset. We have used Llama2-7b and Flan-T5 in our project. The Llama2-7b LLM has been trained using Zero Shot Learning Technique and Flan-T5 was trained using both Few Shot Learning and Zero Shot Learning technique but responses of the model on Few Shot Learning performed better.
Final task is to fine tune the selected LLM on a subset of provided dataset. We have chose Llama2-7b for Fine Tuning. The model was fine-tuned against held out dataset obtained by pre-processing the Few Shot Learning model response of Flan-T5.

File Description:

Llama2FT.ipynb: Contains the code for Fine Tuning the Llama2 model

test.py: Run test.py with corresponding input in the console to obtain the output returned by the Fine Tuned Model

Data_Preprocessing.ipynb: Notebook that contains code to pre-process the held out data used for Fine Tuning Llama2

model_response.txt: Flan-T5 response after Few Shot Learning

Name		Name	Last commit message	Last commit date
Latest commit History 60 Commits
Flan_T5		Flan_T5
Llama2		Llama2
Automotive_NER_Report.pdf		Automotive_NER_Report.pdf
Data_PreProcessing.ipynb		Data_PreProcessing.ipynb
LICENSE		LICENSE
LLAMA2FT.ipynb		LLAMA2FT.ipynb
Llama2FT_output.txt		Llama2FT_output.txt
README.md		README.md
Train_Epoch_graph.png		Train_Epoch_graph.png
Train_Learning-Rate_Graph.png		Train_Learning-Rate_Graph.png
Train_Loss_Graph.png		Train_Loss_Graph.png
model_response.txt		model_response.txt
test.py		test.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Automotive_NER

About

Uh oh!

Releases

Packages

Languages

License

Souvik-prog/Automotive_NER

Folders and files

Latest commit

History

Repository files navigation

Automotive_NER

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages