In [1]:
%pip install langchain sentence-transformers pgvector psycopg2-binary text-generation beautifulsoup4

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


In [10]:
import os
from langchain.chains import RetrievalQA
from langchain.document_loaders import WebBaseLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import HuggingFaceTextGenInference
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores.pgvector import PGVector

In [21]:
# Test the Llama 2 inference endpoint through LangChain

llm = HuggingFaceTextGenInference(
    inference_server_url="http://localhost:8080/",
    max_new_tokens=512,
    top_k=10,
    top_p=0.95,
    typical_p=0.95,
    temperature=0.01,
    repetition_penalty=1.03,
)
# Build the prompt from adjacent string literals so that no stray indentation ends up in it
llm(
    "[INST] <<SYS>>\n"
    "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, "
    "while being safe, and keep your responses under 200 words.\n"
    "<</SYS>>\n\n"
    "How to deploy a container on K8s? [/INST]"
)

" To deploy a container on Kubernetes (K8s), you can follow these steps:\n\n1. Create a Docker image of your container using the `docker build` command.\n2. Push the image to a container registry, such as Docker Hub or Google Container Registry.\n3. Create a Kubernetes deployment YAML file that defines the container and its configuration.\n4. Use the `kubectl apply` command to deploy the YAML file to your Kubernetes cluster.\n5. Use the `kubectl get` command to verify that the deployment was successful.\n\nHere's an example deployment YAML file:\n```\napiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: my-container\nspec:\n  selector:\n    matchLabels:\n      app: my-container\n  replicas: 1\n  template:\n    metadata:\n      labels:\n        app: my-container\n    spec:\n      containers:\n      - name: my-container\n        image: <image-name>\n        ports:\n        - containerPort: 80\n```\nReplace `<image-name>` with the name of your Docker image.\n\nYou can also use a Kube

In [11]:
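# Load the list of URLs (pages under kubernetes.io/docs/concepts/)
with open('./data/k8s-urls.txt', 'r') as url_file:
    # Strip trailing newlines and skip blank lines so each entry is a clean URL
    urls = [line.strip() for line in url_file if line.strip()]

loader = WebBaseLoader(urls)
documents = loader.load()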

In [16]:
# Chunk all the Kubernetes concept documents
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=20)
docs = text_splitter.split_documents(documents)

print(f"{len(docs)} chunks in {len(documents)} pages")

4282 chunks in 135 pages
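
The two splitter parameters above can be illustrated with a plain sliding window. This is only a sketch of what `chunk_size` and `chunk_overlap` mean: `chunk_text` is a hypothetical helper, and the real `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraphs and sentences, so its chunk boundaries will differ.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=20):
    """Split text into fixed-size windows that overlap by chunk_overlap characters.

    Hypothetical helper for illustration; not the LangChain implementation.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


# Made-up sample text of 2500 characters
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print(len(chunks), [len(c) for c in chunks])
```

With an overlap of 20, the last 20 characters of each chunk reappear at the start of the next one, which helps keep a sentence that straddles a boundary retrievable from either side.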


In [3]:
# Load sentence transformer embeddings
model_name = "sentence-transformers/all-mpnet-base-v2"
model_kwargs = {}

embeddings = HuggingFaceEmbeddings(model_name=model_name, model_kwargs=model_kwargs)

In [18]:
# Connection string for connecting to Postgres
CONNECTION_STRING = PGVector.connection_string_from_db_params(
    driver=os.environ.get("PGVECTOR_DRIVER", "psycopg2"),
    host=os.environ.get("PGVECTOR_HOST", "localhost"),
    port=int(os.environ.get("PGVECTOR_PORT", "5432")),
    database=os.environ.get("PGVECTOR_DATABASE", "postgres"),
    user=os.environ.get("PGVECTOR_USER", "postgres"),
    password=os.environ.get("PGVECTOR_PASSWORD", "secretpassword"),
)
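
For reference, the value produced above is an SQLAlchemy-style database URL. The sketch below shows that shape with a hypothetical `build_connection_string` function. It is not the library's implementation, and the URL-encoding of the credentials is an addition made here on the assumption that a password may contain characters such as `@` or `/`.

```python
from urllib.parse import quote_plus


def build_connection_string(driver, host, port, database, user, password):
    """Assemble a postgresql+<driver>:// URL (hypothetical helper, for illustration)."""
    # URL-encode the credentials so characters such as "@" or "/" do not break the URL
    return (
        f"postgresql+{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Same fallback values as the cell above
url = build_connection_string(
    "psycopg2", "localhost", 5432, "postgres", "postgres", "secretpassword"
)
print(url)
```

The `secretpassword` fallback is only suitable for a local test database; in any shared environment the `PGVECTOR_PASSWORD` variable should be set instead.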

In [19]:
COLLECTION_NAME = "kubernetes_concepts"

db = PGVector.from_documents(
    embedding=embeddings,
    documents=docs,
    collection_name=COLLECTION_NAME,
    connection_string=CONNECTION_STRING,
)

In [32]:
query = "What is a deployment?"  # alternative: "Should I use gateway API in my app?"
# Use a separate name so the chunk list in `docs` is not overwritten
retrieved_docs = db.similarity_search(query)
print(f"Query: {query}")
print(f"Retrieved documents: {len(retrieved_docs)}")
for doc in retrieved_docs:
    # Document exposes metadata and page_content directly
    print("Source: ", doc.metadata['source'])
    print("Text: ", doc.page_content, "\n")

Query: What is a deployment?
Retrieved documents: 4
Source:  https://kubernetes.io/docs/concepts/workloads/controllers/deployment/

Text:  Create an issue
 Print entire sectionUse CaseCreating a DeploymentPod-template-hash labelUpdating a DeploymentRollover (aka multiple updates in-flight)Label selector updatesRolling Back a DeploymentChecking Rollout History of a DeploymentRolling Back to a Previous RevisionScaling a DeploymentProportional scalingPausing and Resuming a rollout of a DeploymentDeployment statusProgressing DeploymentComplete DeploymentFailed DeploymentOperating on a failed deploymentClean up PolicyCanary DeploymentWriting a Deployment SpecPod TemplateReplicasSelectorStrategyProgress Deadline SecondsMin Ready SecondsRevision History LimitPausedWhat's nextDocumentation
Blog
Training
Partners
Community 

Source:  https://kubernetes.io/docs/concepts/workloads/controllers/deployment/

Text:  Advanced contributing
Viewing Site Analytics
Docs smoke test pageKubernetes Documenta
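
The search above returns the four chunks whose embeddings are closest to the query embedding. A minimal sketch of that ranking follows, assuming cosine similarity as the distance measure. The three-dimensional vectors and the `top_k` helper are made up for illustration; the real `all-mpnet-base-v2` embeddings have 768 dimensions, and the ranking runs inside Postgres.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=4):
    """Return the texts of the k entries most similar to query_vec (hypothetical helper)."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy index: (chunk text, embedding) pairs with invented values
toy_index = [
    ("deployment overview", [0.9, 0.1, 0.0]),
    ("site navigation menu", [0.7, 0.7, 0.1]),
    ("gateway api", [0.0, 0.2, 0.9]),
]
results = top_k([1.0, 0.0, 0.0], toy_index, k=2)
print(results)
```

In the toy data the navigation chunk still ranks second because it shares vocabulary with the query. The recorded output shows a similar effect: the retrieved text is the page's table of contents and site menu rather than the explanatory body.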

In [25]:
from langchain.prompts import PromptTemplate
prompt_template = """[INST] <<SYS>>\nYou are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.
        Use the following pieces of context to answer the question. If you don't know the answer, just say that you don't know, don't try to make up an answer.
        
        {context}

        \n<</SYS>>\n
        Question: {question}
        Answer:[/INST]"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

chain_type_kwargs = {"prompt": PROMPT}
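
The template above follows the Llama 2 chat layout: a system block wrapped in `<<SYS>>` markers, then the user turn, all inside one `[INST] ... [/INST]` pair. The sketch below builds the same layout from two strings. `build_llama2_prompt` is a hypothetical helper, and it assumes the inference server adds any beginning-of-sequence token itself.

```python
def build_llama2_prompt(system_message, user_message):
    """Wrap a system and a user message in the Llama 2 chat markers (hypothetical helper)."""
    return (
        f"[INST] <<SYS>>\n{system_message.strip()}\n<</SYS>>\n\n"
        f"{user_message.strip()} [/INST]"
    )


example_prompt = build_llama2_prompt("Be brief.", "What is a Pod?")
print(example_prompt)
```

Building the string this way avoids the leading spaces that an indented triple-quoted literal carries into the prompt.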

In [26]:
retriever = db.as_retriever()

qa = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=retriever,
    chain_type_kwargs=chain_type_kwargs,
    verbose=True
)

In [33]:
result = qa.run("When should I use the Gateway API?")
print(result)



> Entering new RetrievalQA chain...


GenerationError: Request failed during generation: Server error: CUDA out of memory. Tried to allocate 448.00 MiB (GPU 0; 21.99 GiB total capacity; 19.03 GiB already allocated; 257.69 MiB free; 21.99 GiB allowed; 20.47 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
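
The error above is raised by the inference server, not by LangChain. One possible mitigation, assuming that prompt length contributes to the memory pressure (the notebook does not establish this), is to cap how much retrieved text is placed in `{context}`. The `pack_context` helper and the 2000-character budget are hypothetical.

```python
def pack_context(chunks, max_chars=2000):
    """Keep chunks in ranked order until the character budget would be exceeded."""
    selected = []
    used = 0
    for chunk in chunks:
        if used + len(chunk) > max_chars:
            break  # adding this chunk would exceed the budget
        selected.append(chunk)
        used += len(chunk)
    return "\n\n".join(selected)


# Made-up chunks of 1000, 900 and 500 characters, best match first
ranked_chunks = ["a" * 1000, "b" * 900, "c" * 500]
context = pack_context(ranked_chunks)
print(len(context))
```

A simpler variant of the same idea is to retrieve fewer chunks, for example with `db.as_retriever(search_kwargs={"k": 2})`, or to lower `max_new_tokens` on the LLM. Whether either is enough depends on the GPU and the server configuration.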

In [29]:
query = "What is the nation's economic status? Summarize. Keep it under 200 words."
test_rag(qa, query)

Query: What is the nation's economic status? Summarize. Keep it under 200 words.



> Entering new RetrievalQA chain...

> Finished chain.

Result:   Based on the State of the Union address, the nation's economic status is strong and resilient. The President highlighted the country's ability to turn every crisis into an opportunity, citing the nation's history of overcoming challenges and emerging stronger. The address emphasized the importance of protecting freedom and liberty, expanding fairness and opportunity, and saving democracy. The President expressed optimism about the nation's future, citing the strength and resilience of the American people. The address did not provide a detailed summary of the nation's economic status, but it emphasized the importance of meeting the challenges of the present moment and working together as one people to build a better future.
