## Step 1 - Add your OpenAI API key
The key must come from a registered OpenAI account.
Keys are managed at https://platform.openai.com/account/api-keys.

In [10]:
%env OPENAI_API_KEY YOUR_OPENAI_API_KEY

env: OPENAI_API_KEY=YOUR_OPENAI_API_KEY
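Writing the key into a notebook cell makes it easy to leak when the notebook is shared. As a minimal sketch, assuming the key is either already exported in the shell or typed in interactively, the helper below reads it without storing it in the notebook. The name `get_openai_api_key` is hypothetical and not part of any library.

```python
import os
from getpass import getpass


def get_openai_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key from the environment, prompting only if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # getpass hides the typed value, so it does not end up in the cell output.
        key = getpass(f"Enter {env_var}: ").strip()
        if not key:
            raise ValueError(f"{env_var} is empty")
        # Export it so libraries that read the variable can find it.
        os.environ[env_var] = key
    return key
```

Calling `get_openai_api_key()` once at the top of the notebook would replace the `%env` line above.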


## Step 2 - Installing necessary libraries
The libraries below are needed for:
- Loading text data from files and preparing it;
- Creating embeddings via LangChain;
- Storing the content together with its embeddings;
- Fetching the most relevant content by using embeddings;
- Prompting.

In [3]:
!pip install langchain
!pip install faiss-cpu
!pip install pinecone-client


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip available: [0m[31;49m22.3.1[0m[39;49m -> [0m[32;49m23.1.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip available: [0m[31;49m22.3.1[0m[39;49m -> [0m[32;49m23.1.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip available: [0m[31;49m22.3.1[0m[39;49m -> [0m[32;49m23.1.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


## Step 3 - Libraries import
The imported libraries are used throughout the notebook.

In [4]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

## Step 4 - Instantiating tools for creating embeddings
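This step instantiates the loader, the text splitter, and the embeddings client, then splits the provided document, `docs/preprocessed_with_gpt/what_is_microservice_architecture.txt`, into chunks. The embeddings themselves are computed in Step 5, when the store is created.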

The document was prepared with GPT using the prompt provided in the [content chunking](./content_chunking.ipynb) notebook.

In [5]:
loader = TextLoader(file_path="./docs/preprocessed_with_gpt/what_is_microservice_architecture.txt")
text_splitter = CharacterTextSplitter(
    separator="\n\n",
    chunk_size=100,
    chunk_overlap=10,
    length_function=len
)
embeddings = OpenAIEmbeddings()

docs = loader.load_and_split(text_splitter=text_splitter)

docs[0]

Created a chunk of size 409, which is longer than the specified 100
Created a chunk of size 848, which is longer than the specified 100
Created a chunk of size 1077, which is longer than the specified 100
Created a chunk of size 2231, which is longer than the specified 100
Created a chunk of size 2425, which is longer than the specified 100
Created a chunk of size 1525, which is longer than the specified 100


Document(page_content='General title: Microservice Architecture Style\nMicroservice Architecture Style - Overview\nA microservices architecture consists of a collection of small, autonomous services. Each service is self-contained and should implement a single business capability within a bounded context. A bounded context is a natural division within a business and provides an explicit boundary within which a domain model exists.', metadata={'source': './docs/preprocessed_with_gpt/what_is_microservice_architecture.txt'})
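The "longer than the specified 100" warnings above appear because the splitter only cuts at the separator: a paragraph with no `\n\n` inside it stays whole, however long it is. The sketch below is a simplified stand-in for that behaviour, not LangChain's implementation; it ignores `chunk_overlap`, and the function name `split_on_separator` is hypothetical.

```python
from typing import List


def split_on_separator(text: str, separator: str = "\n\n", chunk_size: int = 100) -> List[str]:
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size.

    Simplified: no overlap handling, and a piece is never cut in the middle,
    so a single long paragraph produces an oversized chunk.
    """
    pieces = [p.strip() for p in text.split(separator) if p.strip()]
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for piece in pieces:
        # Length added if this piece joins the chunk being built.
        extra = len(piece) + (len(separator) if current else 0)
        if current and current_len + extra > chunk_size:
            # The chunk is full: emit it and start a new one with this piece.
            chunks.append(separator.join(current))
            current, current_len = [], 0
            extra = len(piece)
        current.append(piece)
        current_len += extra
    if current:
        chunks.append(separator.join(current))
    return chunks


paragraphs = ["a" * 40, "b" * 40, "c" * 250]
chunks = split_on_separator("\n\n".join(paragraphs))
print([len(c) for c in chunks])  # [82, 250]
```

The two short paragraphs are merged into one 82-character chunk, while the 250-character paragraph exceeds `chunk_size` and is kept as is. A splitter that falls back to finer separators would be needed to enforce a hard size limit.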

## Step 5 - Creating storage
This step creates a store for the content and its embeddings and saves it to the local filesystem.

The result is a folder containing the encoded data and the FAISS index.

In [7]:
store = FAISS.from_documents(docs, embedding=embeddings)
store.save_local("what_is_microservice_architecture.faiss_index")

"""
What the above operation will do under the hood
"""
# faiss.write_index(store.index, "what_is_microservice_architecture.index")
# store.index = None
#
# with open("what_is_microservice_architecture.pkl", "wb") as faiss_store_dump:
#     pickle.dump(store, faiss_store_dump)

'\nWhat the above operation will do under the hood\n'

## Step 6 - Loading data into index
This step loads the saved data into a FAISS index and creates a store that can be used for searching.

Saving and loading are kept as separate steps to avoid wasting tokens. Once the content has been embedded and stored locally, it can be reused for any number of queries.

In [6]:
faiss_store = FAISS.load_local("what_is_microservice_architecture.faiss_index", embeddings)

"""
What the above operation will do under the hood
"""
# index = faiss.read_index("what_is_microservice_architecture.index")
#
# with open("what_is_microservice_architecture.pkl", "rb") as faiss_store_dump:
#     faiss_store = pickle.load(faiss_store_dump)
#
# faiss_store.index = index

'\nWhat the above operation will do under the hood\n'
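The split between building and loading can be wrapped in a small helper so that embeddings are requested only when no saved index exists. This is a sketch under assumptions: `load_or_build` and its callable parameters are hypothetical names, and the existence of the folder is taken as proof that the saved index is complete.

```python
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(
    index_dir: str,
    build: Callable[[], T],
    load: Callable[[str], T],
    save: Callable[[T, str], None],
) -> T:
    """Reuse a saved index when present; otherwise build it once and save it."""
    if Path(index_dir).exists():
        return load(index_dir)
    store = build()  # the only branch that would call the embeddings API
    save(store, index_dir)
    return store


# With the objects from the cells above, the call could look like this:
# faiss_store = load_or_build(
#     "what_is_microservice_architecture.faiss_index",
#     build=lambda: FAISS.from_documents(docs, embedding=embeddings),
#     load=lambda path: FAISS.load_local(path, embeddings),
#     save=lambda store, path: store.save_local(path),
# )
```

Passing the operations in as callables keeps the helper independent of the vector store, so the same pattern would work for other local stores.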

Both methods below produce similar search results, and both return more documents than the context of the question requires.

Because these methods return a score with each document, the documents can be sorted after they are retrieved from the store, and only the most relevant ones used in the prompt.

## Step 7 - Data fetching test
The result contains each document found together with its match score. The lower the score, the better the match between the query and the document.

In [8]:
query = "What problems does microservice architecture have?"
documents = faiss_store.similarity_search_with_score(query)

documents

[(Document(page_content="Microservice Architecture Style - Challenges\nThe benefits of microservices don't come for free. Here are some of the challenges to consider before embarking on a microservices architecture. Complexity. A microservices application has more moving parts than the equivalent monolithic application. Each service is simpler, but the entire system as a whole is more complex. Development and testing. Writing a small service that relies on other dependent services requires a different approach than a writing a traditional monolithic or layered application. Existing tools are not always designed to work with service dependencies. Refactoring across service boundaries can be difficult. It is also challenging to test service dependencies, especially when the application is evolving quickly. Lack of governance. The decentralized approach to building microservices has advantages, but it can also lead to problems. You might end up with so many different languages and framewo

In [9]:
embedded_query = embeddings.embed_query(query)
documents = faiss_store.similarity_search_with_score_by_vector(embedded_query)

documents

[(Document(page_content="Microservice Architecture Style - Challenges\nThe benefits of microservices don't come for free. Here are some of the challenges to consider before embarking on a microservices architecture. Complexity. A microservices application has more moving parts than the equivalent monolithic application. Each service is simpler, but the entire system as a whole is more complex. Development and testing. Writing a small service that relies on other dependent services requires a different approach than a writing a traditional monolithic or layered application. Existing tools are not always designed to work with service dependencies. Refactoring across service boundaries can be difficult. It is also challenging to test service dependencies, especially when the application is evolving quickly. Lack of governance. The decentralized approach to building microservices has advantages, but it can also lead to problems. You might end up with so many different languages and framewo
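Since both calls return `(document, score)` pairs and a lower score means a closer match, trimming the result before building a prompt is a sort plus a cut-off. The sketch below is one way to do it; the helper name `most_relevant`, the `top_k` and `max_score` defaults, and the sample scores are all made up for illustration, with plain strings standing in for `Document` objects.

```python
from typing import List, Tuple, TypeVar

D = TypeVar("D")


def most_relevant(scored: List[Tuple[D, float]], top_k: int = 2, max_score: float = 0.5) -> List[D]:
    """Keep at most top_k documents whose distance score is <= max_score, best first."""
    ranked = sorted(scored, key=lambda pair: pair[1])  # lower score = closer match
    return [doc for doc, score in ranked if score <= max_score][:top_k]


# Stand-in data: strings instead of Document objects, invented scores.
sample = [("challenges", 0.31), ("benefits", 0.47), ("overview", 0.42), ("unrelated", 0.90)]
print(most_relevant(sample))  # ['challenges', 'overview']
```

A sensible `max_score` depends on the embedding model and the data, so it would need to be chosen by inspecting real scores such as the ones returned above.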

## Step 8 - Work with hosted vector DB
The code below does the same job as the code above, but uses a hosted, production-ready vector DB: `Pinecone`.

The current ecosystem offers a large number of vector storage solutions. `Pinecone` is probably one of the most widely mentioned and one of the simplest to use.

In [29]:
import pinecone
from langchain.vectorstores import Pinecone

pinecone.init(
    api_key="YOUR_PINECONE_API_KEY",
    environment="us-west1-gcp-free"
)

pinecone_db = Pinecone.from_documents(documents=docs, embedding=embeddings, index_name="llm-general-data-cosine-v2")

In [30]:
"""
Pinecone produces results of similar quality to FAISS
"""

result = pinecone_db.similarity_search(query)

result

[Document(page_content="Microservice Architecture Style - Challenges\nThe benefits of microservices don't come for free. Here are some of the challenges to consider before embarking on a microservices architecture. Complexity. A microservices application has more moving parts than the equivalent monolithic application. Each service is simpler, but the entire system as a whole is more complex. Development and testing. Writing a small service that relies on other dependent services requires a different approach than a writing a traditional monolithic or layered application. Existing tools are not always designed to work with service dependencies. Refactoring across service boundaries can be difficult. It is also challenging to test service dependencies, especially when the application is evolving quickly. Lack of governance. The decentralized approach to building microservices has advantages, but it can also lead to problems. You might end up with so many different languages and framewor
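One possible reason the two stores agree is worth spelling out. Assuming the Pinecone index uses the cosine metric (its name suggests so, but the notebook does not confirm it), that the FAISS store ranks by L2 distance, and that the embedding vectors have unit length, the two rankings coincide because for unit vectors ‖a − b‖² = 2 − 2·cos(a, b). The sketch below checks this on small made-up vectors.

```python
import math
from typing import List, Sequence


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def l2_squared(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # For unit-length vectors the dot product equals the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


# Made-up 3-dimensional vectors standing in for real embeddings.
query_vec = normalize([1.0, 2.0, 0.5])
candidates = {
    "a": normalize([1.0, 2.1, 0.4]),
    "b": normalize([-1.0, 0.5, 2.0]),
    "c": normalize([2.0, 1.0, 0.0]),
}

# L2: smaller is better. Cosine: larger is better.
by_l2 = sorted(candidates, key=lambda k: l2_squared(query_vec, candidates[k]))
by_cos = sorted(candidates, key=lambda k: cosine(query_vec, candidates[k]), reverse=True)
print(by_l2, by_cos)  # ['a', 'c', 'b'] ['a', 'c', 'b']
```

Note the opposite directions of the two scores: a distance is better when lower, a similarity when higher, so a threshold tuned for one store cannot be reused for the other as is.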