In [1]:
%pip install -q langchain_community protobuf==3.20.3 chromadb==0.4.22 langchain-ollama

Note: you may need to restart the kernel to use updated packages.


In [5]:
from langchain_community.document_loaders import TextLoader
from langchain_ollama import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma

In [7]:
loader = TextLoader("Transcripts/transcript.txt")
data = loader.load()
print(data[0].page_content)

 Welcome to the Ultimate Docker course.
 In this course, I'm going to take you on a journey and teach you everything you need
 to know about Docker from the basics to my advanced concepts.
 So by the end of this course, you'll be able to use it like a pro as part of your software
 Devonement workflow.
 If you're looking for a comprehensive and highly practical course that takes you from 0 to
 0, this is the Docker course for you.
 We're going to start off with a really simple project so you understand the basics.
 Then we'll use Docker, run and deploy a full stack application with a front end, back
 end and a database.
 So you learn all the necessary techniques and apply them to your own projects.
 I'm Mosh Hamadani and I've taught millions of people how to advance their software engineering
 skills through my YouTube channel and online school code with mosh.com.
 If you're new here, be sure to subscribe as you upload new videos all the time.
 Now let's jump in and get started.
 Let's 

In [8]:
!ollama list

NAME                       ID              SIZE      MODIFIED          
nomic-embed-text:latest    0a109f422b47    274 MB    About an hour ago    
llama3.1:8b                42182419e950    4.7 GB    5 weeks ago          
llama3.2:3b                a80c4f17acd5    2.0 GB    5 weeks ago          


In [26]:
# Split the transcript into overlapping chunks (up to 500 characters each, 100-character overlap)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
chunks = text_splitter.split_documents(data)
print(len(chunks))


135
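
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size windowing with overlap. It is a simplification, not the algorithm `RecursiveCharacterTextSplitter` uses (the real splitter tries to break on separators such as paragraphs and lines before falling back to characters). The function name `sliding_window_chunks` and the sample string are made up for illustration.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Hypothetical sample text; the notebook itself splits the Docker transcript.
sample = "abcdefghijklmnopqrstuvwxyz"
toy_chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=4)
print(toy_chunks)
```

Each window repeats the last four characters of the previous one, which is the same reason the notebook keeps a 100-character overlap: a sentence cut at a chunk boundary still appears whole in at least one chunk.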


In [27]:
# Embed every chunk with the local nomic-embed-text model and index the vectors in Chroma.
# OllamaEmbeddings and Chroma were already imported at the top of the notebook.
vector_db = Chroma.from_documents(
    documents=chunks,
    embedding=OllamaEmbeddings(model="nomic-embed-text"),
    collection_name="localvectorstore"
)
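
For intuition about what the vector store does at query time, the sketch below ranks a few stored vectors by cosine similarity to a query vector. This is an assumption-laden toy: the three-dimensional vectors and the names `toy_store` and `top_k` are invented, and Chroma's actual index and distance metric may differ from plain cosine similarity.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings; real ones come from the embedding model and have many more dimensions.
toy_store = {
    "docker containers": [0.9, 0.1, 0.0],
    "git commands": [0.1, 0.9, 0.0],
    "linux files": [0.0, 0.2, 0.9],
}


def top_k(query_vector: list[float], store: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k stored texts whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in store.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:k]]


toy_results = top_k([0.8, 0.3, 0.0], toy_store, k=2)
print(toy_results)
```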

## RETRIEVAL

In [28]:
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_ollama.chat_models import ChatOllama
from langchain_core.runnables import RunnablePassthrough
from langchain.retrievers.multi_query import MultiQueryRetriever

In [29]:
# LLM from Ollama
local_model = "llama3.2:3b"
llm = ChatOllama(model=local_model)

In [30]:
QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI language model assistant. Your task is to generate five
    different versions of the given user question to retrieve relevant documents from
    a vector database. By generating multiple perspectives on the user question, your
    goal is to help the user overcome some of the limitations of the distance-based
    similarity search. Provide these alternative questions separated by newlines.
    Original question: {question}""",
)

In [20]:
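# Use the LLM to rephrase the question several times, then search the vector store with each variant
retriever = MultiQueryRetriever.from_llm(
    retriever=vector_db.as_retriever(),
    llm=llm,
    prompt=QUERY_PROMPT,
)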

# RAG prompt
template = """Answer the question based ONLY on the following context:
{context}
Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
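
The multi-query retriever defined in this cell can be pictured as: generate several phrasings of the question, run a search for each, and keep the unique union of the hits. The sketch below shows only that merge step. Everything in it is a stand-in: `fake_generate_queries` replaces the LLM call, `FAKE_HITS` replaces the vector store, and the chunk ids are invented.

```python
def fake_generate_queries(question: str) -> list[str]:
    """Stand-in for the LLM call that rewrites the question (hypothetical variants)."""
    return [question, "What does Docker do?", "Why use Docker?"]


# Hypothetical search results per query; a real retriever would return documents.
FAKE_HITS = {
    "What is Docker?": ["chunk-3", "chunk-7"],
    "What does Docker do?": ["chunk-7", "chunk-12"],
    "Why use Docker?": ["chunk-3", "chunk-20"],
}


def fake_retrieve(query: str) -> list[str]:
    """Stand-in for a similarity search against the vector store."""
    return FAKE_HITS.get(query, [])


def multi_query_retrieve(question: str) -> list[str]:
    """Search once per query variant and merge the hits, dropping duplicates in order."""
    seen = set()
    merged = []
    for query in fake_generate_queries(question):
        for doc_id in fake_retrieve(query):
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


merged_hits = multi_query_retrieve("What is Docker?")
print(merged_hits)
```

The union is what lets a rephrased query surface chunks that the original wording would have missed.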

In [21]:
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
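
The `|` pipeline above passes the output of each step to the next. As a rough mental model only, the same flow can be written with plain functions. All names here (`fake_retriever`, `fake_llm`, `run_chain`, and so on) are hypothetical stand-ins, and joining the retrieved texts with blank lines is an assumption of this sketch; the notebook's chain hands the retriever output to the prompt as-is.

```python
TEMPLATE = """Answer the question based ONLY on the following context:
{context}
Question: {question}
"""


def fake_retriever(question: str) -> list[str]:
    """Stand-in for the retriever; returns made-up context snippets."""
    return ["Docker is a platform for building, running and shipping applications."]


def build_inputs(question: str) -> dict:
    """Mirror of the dict step: fetch context and pass the question through unchanged."""
    return {"context": "\n\n".join(fake_retriever(question)), "question": question}


def format_prompt(inputs: dict) -> str:
    """Fill the template placeholders with the context and the question."""
    return TEMPLATE.format(**inputs)


def fake_llm(prompt_text: str) -> dict:
    """Stand-in for the chat model; returns a message-like dict."""
    return {"content": f"stub answer ({len(prompt_text)} prompt characters)"}


def parse_output(message: dict) -> str:
    """Mirror of the output parser: keep only the text of the reply."""
    return message["content"]


def run_chain(question: str) -> str:
    """Apply the steps in order, feeding each result into the next step."""
    value = question
    for step in (build_inputs, format_prompt, fake_llm, parse_output):
        value = step(value)
    return value


toy_answer = run_chain("What is Docker?")
print(toy_answer)
```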

In [32]:
response = chain.invoke("What is this context about?")
print(response)

Based on the provided text, the context appears to be a transcription of an educational course or tutorial about Docker, a containerization platform. The transcript includes discussions on basic Git commands, GitHub repositories, and file management in a Linux environment, as well as an introduction to the architecture and workflow of Docker. The tone is instructional and encouraging, with the instructor emphasizing the importance of hands-on practice and active learning.


In [33]:
response = chain.invoke("What is Docker a platform for?")
print(response)

building, running and shipping applications in a consistent manner.
