## Introduction: Smarter Driving with LLMs and RAG

As automotive technology continues to advance, so does the need for **intelligent, real-time driver assistance**. With dashboards becoming increasingly complex, drivers often face challenges in understanding unfamiliar warning indicators and technical alerts. To address this, we explore the integration of **Large Language Models (LLMs)** with vehicle manuals using a technique called **Retrieval-Augmented Generation (RAG)**, resulting in a **context-aware chatbot** that can assist users in plain language.

In this project, we act on behalf of a **well-known car manufacturer** looking to embed LLMs into its vehicles. The task is to build a proof of concept that enhances the traditional user manual experience by allowing drivers to **ask natural-language questions** and get **accurate, actionable answers** based directly on the car’s official documentation.

The focus is the **MG ZS compact SUV**, whose **warning message manual** (stored as `mg-zs-warning-messages.html`) contains detailed descriptions of dashboard alerts and their recommended actions. Using tools like **LangChain**, **OpenAI's GPT-4o-mini**, and **vector-based retrieval with Chroma**, we:

- Load and process the HTML manual.
- Embed the text into a searchable vector database.
- Implement a RAG pipeline that matches user queries with relevant manual content.
- Generate informative, concise answers using a chat-based LLM.
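
The retrieval step in the list above can be illustrated without any external services. The sketch below is a toy, standard-library-only stand-in: it uses bag-of-words vectors and cosine similarity in place of OpenAI embeddings and Chroma, and the manual snippets are invented for illustration rather than taken from the MG ZS manual. The function names (`tokenize`, `cosine_similarity`, `retrieve`) are hypothetical helpers, not part of LangChain.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, chunks, k=1):
    """Return the k chunks whose word counts are most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine_similarity(query_vec, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]

# Hypothetical manual snippets, invented for illustration only.
toy_chunks = [
    "Tyre pressure low: check and inflate the tyres to the recommended pressure.",
    "Engine coolant temperature high: stop the vehicle safely and let the engine cool.",
    "Key battery low: replace the battery in the remote key.",
]

top = retrieve("What should I do when the engine coolant temperature is high?", toy_chunks)
print(top[0])
```

Real embeddings capture meaning rather than exact word overlap, so the actual pipeline can match a query to a passage even when the wording differs; the ranking-by-similarity idea is the same.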

This project not only serves as a **smart assistant** for understanding car warnings but also paves the way for **voice-enabled, AI-driven interfaces** in vehicles. The final goal is to connect this chatbot with **text-to-speech software**, offering a fully **hands-free driver support system** in future automotive applications.


In [1]:
# Run this cell to install the necessary packages
import subprocess
import sys
from importlib.metadata import PackageNotFoundError, version as installed_version

def install_if_needed(package, version):
    '''Install the pinned version of a package unless it is already installed.'''
    try:
        if installed_version(package) == version:
            return  # The pinned version is already present
    except PackageNotFoundError:
        pass  # Not installed yet; fall through to the install step
    # Use the current interpreter's pip so the package lands in this kernel's environment
    subprocess.check_call([sys.executable, "-m", "pip", "install", f"{package}=={version}"])

install_if_needed("langchain-core", "0.3.18")
install_if_needed("langchain-openai", "0.2.8")
install_if_needed("langchain-community", "0.3.7")
install_if_needed("unstructured", "0.14.4")
install_if_needed("langchain-chroma", "0.1.4")
install_if_needed("langchain-text-splitters", "0.3.2")

In [2]:
# Set your API key to a variable
import os
openai_api_key = os.environ["OPENAI_API_KEY"]

# Import the required packages
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.document_loaders import UnstructuredHTMLLoader
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma

In [3]:
# Load the HTML manual with a LangChain document loader
loader = UnstructuredHTMLLoader(file_path="data/mg-zs-warning-messages.html")
car_docs = loader.load()
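
Before the loaded text is embedded, it is split into overlapping chunks. As a rough illustration of what the `chunk_size` and `chunk_overlap` parameters control, the sketch below uses a simplified fixed-window character splitter. It is an assumption-laden stand-in and not the algorithm used by `RecursiveCharacterTextSplitter`, which also tries to break on separators such as paragraphs and sentences; `split_with_overlap` is a hypothetical helper name.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that share an overlap.

    Simplified illustration only; not RecursiveCharacterTextSplitter's algorithm.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
# ['abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz']
```

Each chunk repeats the last four characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.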

In [4]:
# Initialize RecursiveCharacterTextSplitter to make chunks of HTML text
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

# Split the car manual documents into chunks
splits = text_splitter.split_documents(car_docs)

# Initialize Chroma vectorstore with documents as splits and using OpenAIEmbeddings
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings(openai_api_key=openai_api_key))

# Setup vectorstore as retriever
retriever = vectorstore.as_retriever()

# Define RAG prompt
prompt = ChatPromptTemplate.from_template("You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:")

# Initialize chat-based LLM with 0 temperature and using gpt-4o-mini
model = ChatOpenAI(openai_api_key=openai_api_key, model_name="gpt-4o-mini", temperature=0)

# Setup the chain
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
)

# Initialize query
query = "The Gasoline Particulate Filter Full warning has appeared. What does this mean and what should I do about it?"

# Invoke the query
answer = rag_chain.invoke(query).content
print(answer)
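
In the chain above, the retriever's output (a list of document objects) is passed straight into the prompt's `{context}` slot, so the prompt receives the list's string representation, metadata included. A common refinement is to join only the text of each retrieved chunk. The sketch below shows that idea with a minimal stand-in class; `Doc` and `format_docs` are hypothetical names, and only the `page_content` attribute of a LangChain document is modeled here.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    """Minimal stand-in for a LangChain document (only page_content is modeled)."""
    page_content: str

def format_docs(docs):
    """Join the text of each retrieved document, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)

# Placeholder text, invented for illustration only.
retrieved = [Doc("First chunk of manual text."), Doc("Second chunk of manual text.")]
context = format_docs(retrieved)
print(context)
```

Assuming the standard LangChain runnable composition, a function like this could be placed after the retriever in the chain (for example `retriever | format_docs` as the `"context"` entry), which keeps the prompt shorter and free of metadata.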



## Conclusion: Driving into the Future with AI-Powered Assistance

This project demonstrates a compelling proof of concept for integrating **LLMs with technical vehicle documentation** using **Retrieval-Augmented Generation (RAG)**. By leveraging the MG ZS warning message manual and OpenAI’s GPT-4o-mini, we created a **context-aware chatbot** capable of interpreting complex automotive alerts and providing **clear, human-like responses**.

We successfully:

- Parsed and embedded car manual content into a vector database.
- Enabled similarity-based retrieval of relevant manual passages using LangChain and Chroma.
- Built a natural-language interface that understands and responds to driver queries.

This system lays the groundwork for more advanced in-car technologies. In the near future, this chatbot could be extended with **text-to-speech capabilities**, enabling it to **read answers aloud**, enhancing accessibility and reducing driver distraction. Additionally, the approach can scale to other manuals, languages, and brands, forming a **modular support tool for any smart vehicle system**.

Ultimately, this project shows how **modern AI and NLP techniques** can turn static documentation into **dynamic, interactive, and context-aware support systems**, with the potential to improve both the user experience and driver safety.
