This version uses Milvus through Docker Compose, so you must have Docker installed to run this notebook. Milvus is started with `docker compose up -d`, as shown in the block below.

In [1]:
# ! pip install pymilvus milvus langchain sentence-transformers tiktoken octoai-sdk
# ! docker compose up -d
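
Once the containers are up, it can help to confirm that Milvus is accepting connections before running the rest of the notebook. The following is a minimal sketch, assuming Milvus listens on `localhost:19530` (the host and port used later in `connection_args`); the helper name `is_port_open` is hypothetical and not part of any library.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # Assumption: Milvus is exposed on the port used later in this notebook.
    print("Milvus reachable:", is_port_open("localhost", 19530))
```

An open port only shows that something is listening; it does not prove that Milvus has finished initializing.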

In [2]:
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_community.llms.octoai_endpoint import OctoAIEndpoint

In [3]:
from dotenv import load_dotenv
import os

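load_dotenv()
# load_dotenv() already copies values from .env into os.environ, so only check that the token is present.
if not os.getenv("OCTOAI_API_TOKEN"):
    raise RuntimeError("OCTOAI_API_TOKEN is not set; add it to .env or export it.")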

In [4]:
template = """Below is an instruction that describes a task. Write a response that appropriately completes the request.\n Instruction:\n{question}\n Response: """
prompt = PromptTemplate.from_template(template)

In [5]:
llm = OctoAIEndpoint(
    endpoint_url="https://text.octoai.run/v1/chat/completions",
    model_kwargs={
        "model": "mixtral-8x7b-instruct-fp16",
        "max_tokens": 128,
        "presence_penalty": 0,
        "temperature": 0.01,
        "top_p": 0.9,
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant. Keep your responses limited to one short paragraph if possible.",
            },
        ],
    },
)

In [15]:
question = "Who was Leonardo da Vinci?"

llm_chain = LLMChain(prompt=prompt, llm=llm)

print(llm_chain.invoke(question)["text"])

 Leonardo da Vinci (1452-1519) was an Italian polymath who is often regarded as one of the greatest painters in history. He is also celebrated for his technological ingenuity, scientific curiosity, and philosophical wisdom. Da Vinci is widely known for his masterpieces such as 'The Last Supper' and 'Mona Lisa.' As an artist, scientist, mathematician, engineer, inventor, anatomist, geologist, cartographer, botanist, musician, and writer, da Vinci embodied the Renaissance ideal. His thirst for


In [7]:
from langchain_community.embeddings import OctoAIEmbeddings
from langchain_community.vectorstores import Milvus

In [8]:
embeddings = OctoAIEmbeddings(endpoint_url="https://text.octoai.run/v1/embeddings")

In [9]:
from langchain.text_splitter import CharacterTextSplitter
from langchain.schema import Document
import os

In [28]:
files = os.listdir("./ascii")

In [29]:
file_texts = []

In [30]:
# Build the splitter once; chunk sizes are measured in tiktoken tokens.
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=512, chunk_overlap=64
)

for file in files:
    with open(f"./ascii/{file}", encoding="utf-8") as f:
        file_text = f.read()
    texts = text_splitter.split_text(file_text)
    # Keep the source file name and chunk index as metadata for each chunk.
    for i, chunked_text in enumerate(texts):
        file_texts.append(
            Document(
                page_content=chunked_text,
                metadata={"doc_title": file.split(".")[0], "chunk_num": i},
            )
        )

Created a chunk of size 1007, which is longer than the specified 512
Created a chunk of size 599, which is longer than the specified 512
Created a chunk of size 672, which is longer than the specified 512
Created a chunk of size 657, which is longer than the specified 512
Created a chunk of size 799, which is longer than the specified 512
Created a chunk of size 876, which is longer than the specified 512
Created a chunk of size 1111, which is longer than the specified 512
Created a chunk of size 933, which is longer than the specified 512
Created a chunk of size 1341, which is longer than the specified 512
Created a chunk of size 1339, which is longer than the specified 512
Created a chunk of size 1287, which is longer than the specified 512
Created a chunk of size 1538, which is longer than the specified 512
Created a chunk of size 1674, which is longer than the specified 512
Created a chunk of size 593, which is longer than the specified 512
Created a chunk of size 526, which is lon
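
The warnings above appear because `CharacterTextSplitter` splits only on its separator (a blank line by default) and then merges the pieces up to the size limit; a piece with no separator inside it cannot be cut further, so it is emitted whole. The following is a simplified sketch of that split-then-merge idea, not the library's implementation: it counts characters rather than tokens, ignores `chunk_overlap`, and the function name `split_and_merge` is hypothetical.

```python
def split_and_merge(text: str, chunk_size: int, separator: str = "\n\n") -> list[str]:
    """Split on `separator`, then greedily merge pieces up to `chunk_size` characters.

    A piece that is itself longer than `chunk_size` is kept whole, because
    there is no separator inside it to split on.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for piece in text.split(separator):
        if not piece:
            continue
        # Length this piece adds, including a separator if the chunk is non-empty.
        added = len(piece) + (len(separator) if current else 0)
        if current and current_len + added > chunk_size:
            # The current chunk is full: emit it and start a new one.
            chunks.append(separator.join(current))
            current, current_len = [], 0
            added = len(piece)
        current.append(piece)
        current_len += added
    if current:
        chunks.append(separator.join(current))
    return chunks


text = "aaaa\n\nbbbb\n\n" + "c" * 30
chunks = split_and_merge(text, chunk_size=12)
print([len(c) for c in chunks])  # → [10, 30]
```

The 30-character piece exceeds the limit of 12 but is still returned as one chunk, which mirrors the oversized chunks reported in the warnings. ASCII art with few blank lines is a likely cause here.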

In [31]:
vector_store = Milvus.from_documents(
    file_texts,
    embedding=embeddings,
    connection_args={"host": "localhost", "port": 19530},
    collection_name="cities"
)

In [32]:
file_texts[0]

Document(page_content='Celtic cross by Joan Stark\n\n         _..._\n       .-|>X<|-.\n     _//`|oxo|`\\\\_  \n    /xo=._\\X/_.=ox\\\n    |<>X<>(_)<>X<>|\n    \\xo.=\'/X\\\'=.ox/\n      \\\\_/oxo\\_//\n       \';<>X<>;\'\n        |=====|\n        |<>X<>|\n        |oxoxo|\n        |<>X<>|\n       _|oxoxo|_\njgs.--\' `"""""` \'--.\n\nCeltic knots\n\n   /\\  /\\\n  /  \\/  \\\n / /\\ \\/\\ \\\n \\ \\/\\ \\/ /\n  \\/ /\\/ /\n  / /\\/ /\\\n / /\\  /\\ \\\n/ /  \\/  \\ \\\n\\ \\  /\\  / /\n \\ \\/  \\/ /\n  \\/ /\\/ /\n  / /\\/ /\\\n / /\\ \\/\\ \\\n \\ \\/\\ \\/ /\n  \\  /\\  /\n   \\/  \\/', metadata={'doc_title': 'celtic', 'chunk_num': 0})

In [19]:
retriever = vector_store.as_retriever()
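
Conceptually, the retriever embeds the query and returns the stored chunks whose vectors are closest to it. The following is a minimal pure-Python sketch of top-k retrieval. Cosine similarity is an assumption made for illustration (the metric Milvus actually uses depends on how the index is configured), the default `k=4` is also an assumption, and `cosine_similarity` and `top_k` are hypothetical helpers, not library functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy 2-dimensional "embeddings", made up for illustration.
docs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
print(top_k([1.0, 0.0], docs, k=2))  # → [0, 2]
```

A vector database performs the same ranking with an index so that it does not have to scan every stored vector.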

In [33]:
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = PromptTemplate.from_template(template)

In [35]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
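
In the chain above, the retriever's output (a list of documents) is placed directly into the `{context}` slot, so the prompt receives the list's string representation, metadata included. A common alternative is to join only the chunk text first. The following is a minimal sketch of that idea using plain Python; `Doc` is a hypothetical stand-in for LangChain's `Document`, and `format_docs`, `build_prompt`, and `fake_retrieve` are illustrative names with made-up data.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for LangChain's Document, for illustration only."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    """Join the text of retrieved chunks, dropping metadata and the list wrapper."""
    return "\n\n".join(d.page_content for d in docs)


TEMPLATE = (
    "Answer the question based only on the following context:\n"
    "{context}\n\n"
    "Question: {question}\n"
)


def build_prompt(question, retrieve):
    """Mimic the first two chain steps: retrieve context, then fill the template."""
    return TEMPLATE.format(context=format_docs(retrieve(question)), question=question)


def fake_retrieve(question):
    # Made-up chunks standing in for the vector store's results.
    return [Doc("chunk one", {"chunk_num": 0}), Doc("chunk two", {"chunk_num": 1})]


print(build_prompt("What is in the chunks?", fake_retrieve))
```

In the notebook's chain, the equivalent change would be to use `retriever | format_docs` as the `"context"` entry, assuming `format_docs` is defined to read `page_content` from real `Document` objects.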

In [40]:
chain.invoke("can you create an ascii representation of a village in the mountains?")

" Sure, here's a simple ASCII representation of a village in the mountains:\n\n    _/_/\n   /   \\\n__/     \\__\n\\         /\\\n_\\       _/\n  \\     /\n___\\ /_/__\n\\_____/\n \nThis representation features a mountain range with a village nestled in the valley. Please note that this is a very basic representation and can be further customized based on your specific needs."

In [23]:
# Let's make this a bit more fun and showcase the multilingual capabilities of Mixtral, which really outshine other open-source models.

# Our vector DB is populated with entries from English text, and the embedding model we're using here, GTE-Large,
# works best on English text. However, Mixtral has good multilingual capabilities in French, German, Spanish and Italian.
# So we ask the assistant to answer only in French, in both the system prompt and the user prompt. Retrieval is performed
# on English text, but when producing the response, the Mixtral LLM generates tokens in a different language (here, French).
french_llm = OctoAIEndpoint(
    endpoint_url="https://text.octoai.run/v1/chat/completions",
    model_kwargs={
        "model": "mixtral-8x7b-instruct-fp16",
        "max_tokens": 128,
        "presence_penalty": 0,
        "temperature": 0.1,
        "top_p": 0.9,
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant who responds in French and not in English.",
            },
        ],
    },
)

french_template = """Answer the question in French based only on the following context:
{context}

Question: {question}
"""
french_prompt = PromptTemplate.from_template(french_template)

In [24]:
french_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | french_prompt
    | french_llm
    | StrOutputParser()
)

In [26]:
fr_1 = french_chain.invoke("How big is the city of Seattle?")

In [27]:
from pprint import pprint
pprint(fr_1)

(' La ville de Seattle est assez grande avec une population de 749 256 '
 "habitants en 2022. C'est la ville la plus peuplée de l'État de Washington et "
 "de la région du Nord-Ouest Pacifique de l'Amérique du Nord. L'aire "
 "métropolitaine de Seattle compte 4,02 millions d'habitants, ce qui en fait "
 'la 15e plus importante aux États-Unis. La croissance de la population de '
 'Seattle a été rapide, avec une augmentation de 21,1%')
