# Ollama embeddings verification

see: https://python.langchain.com/docs/integrations/text_embedding/ollama

https://ollama.com/library/codellama

https://continue.dev/docs/walkthroughs/codebase-embeddings

In [1]:
from langchain_community.document_loaders.generic import GenericLoader
from langchain_community.document_loaders.parsers import LanguageParser
from langchain_text_splitters import Language, RecursiveCharacterTextSplitter


In [2]:
REPO_BASIC = '/Dev/_2/DDD.cs/Domain-Driven-BASIC_CS'
REPO_MODULAR_MONOLITH = '/Dev/_2/DDD.cs/Modular-monolith-by-example/Src'
REPO_PATH = REPO_MODULAR_MONOLITH
FILE_EXTENSION = '.cs'
CHROMA_DB_SAVE_PATH = '/Temp/embeddings/ollama'

In [4]:
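from typing import Optional


def load_documents(repo_path: str, file_extensions: list[str], exclude_files: Optional[list[str]] = None):
    # Load every file under repo_path whose suffix matches and parse it as C#.
    # parser_threshold=500: only files with at least 500 lines are segmented
    # into separate documents; shorter files are loaded as a single document.
    loader = GenericLoader.from_filesystem(
        repo_path,
        glob="**/*",
        suffixes=file_extensions,
        exclude=exclude_files or [],  # avoid a mutable default argument
        parser=LanguageParser(language=Language.CSHARP, parser_threshold=500),
    )
    documents = loader.load()
    print(f' ** loaded files:  {len(documents)}')
    return documents


documents = load_documents(REPO_PATH, [FILE_EXTENSION])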

 ** loaded files:  847
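
The number printed above is the count of loaded documents, which can be higher than the number of `.cs` files on disk if the parser segments large files. A minimal stdlib sketch for cross-checking against the file system follows; the helper name `count_source_files` and the excluded `bin`/`obj` build folders are assumptions for illustration and are not part of the original notebook.

```python
from pathlib import Path


def count_source_files(repo_path, suffixes, exclude_dirs=("bin", "obj")):
    """Count files under repo_path with one of the given suffixes.

    Hypothetical helper: files inside any directory named in exclude_dirs
    (assumed here to be .NET build output folders) are skipped.
    """
    root = Path(repo_path)
    count = 0
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # Only the directory components are checked, not the file name itself.
        parent_parts = path.relative_to(root).parts[:-1]
        if any(part in exclude_dirs for part in parent_parts):
            continue
        count += 1
    return count
```

Calling `count_source_files(REPO_PATH, [FILE_EXTENSION])` and comparing the result with `len(documents)` gives a quick indication of whether any files were segmented or skipped.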


In [5]:

# HINT: split code files into chunks of an arbitrary size, with some overlap between chunks
csharp_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.CSHARP, chunk_size=2000, chunk_overlap=200
)
texts = csharp_splitter.split_documents(documents)
print(f' ** split code files:  {len(texts)}')

 ** split code files:  955
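
To illustrate how `chunk_size` and `chunk_overlap` interact, here is a deliberately simplified sliding-window splitter. It is not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to break on language-aware separators; the function name `sliding_chunks` is hypothetical and only shows the overlap arithmetic.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap.

    Simplified illustration only: each new chunk starts
    (chunk_size - chunk_overlap) characters after the previous one.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2):
        print(chunk)
```

With the notebook's settings (`chunk_size=2000`, `chunk_overlap=200`), a purely fixed-size window would advance 1800 characters per chunk; the real splitter produces variable-sized chunks because it respects code structure.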


In [6]:
from langchain_community.vectorstores import Chroma

# from langchain_openai import OpenAIEmbeddings
# embedding_model = OpenAIEmbeddings(disallowed_special=())

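from langchain_community.embeddings import OllamaEmbeddings
# No model is specified, so the class default is used; this assumes a local Ollama server is running.
embedding_model = OllamaEmbeddings()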

db = Chroma.from_documents(
    documents=texts, 
    embedding=embedding_model,
    persist_directory=CHROMA_DB_SAVE_PATH
)
retriever = db.as_retriever(
    search_type="mmr",  # Also test "similarity"
    search_kwargs={"k": 8},
)
db.persist()
print(f' ** db collection counts:  {db._collection.count()}')

 ** db collection counts:  955
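
The retriever above uses `search_type="mmr"` (maximal marginal relevance), which trades relevance to the query against redundancy among the selected results. The following is a small pure-Python sketch of that selection idea on toy vectors; it is not LangChain's or Chroma's implementation, and the names `cosine` and `mmr_select` are hypothetical. In LangChain, the MMR search first fetches a larger candidate pool (`fetch_k`) and then selects `k` results from it.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query, candidates, k, lambda_mult=0.5):
    """Return indices of up to k candidates chosen greedily by MMR.

    lambda_mult=1.0 ranks by relevance only; lower values penalise
    candidates that are similar to the ones already selected.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i):
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    query = [1.0, 0.3]
    candidates = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
    print(mmr_select(query, candidates, k=2))
    print(mmr_select(query, candidates, k=2, lambda_mult=1.0))
```

In the toy data, the first two candidates are near-duplicates, so the balanced setting skips one of them in favour of the less similar third vector, while `lambda_mult=1.0` behaves like a plain similarity search.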


In [7]:
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationSummaryMemory
# from langchain_openai import ChatOpenAI
# llm = ChatOpenAI(model_name="gpt-4")

from langchain_community.llms import Ollama
llm = Ollama(model = "codellama:7b-instruct")

memory = ConversationSummaryMemory(
    llm=llm, memory_key="chat_history", return_messages=True
)
qa = ConversationalRetrievalChain.from_llm(llm, retriever=retriever, memory=memory)

In [9]:
from textwrap import dedent

# Indent every line equally so that dedent() actually strips the leading whitespace.
question = dedent("""\
    Please list all the classes implementing IAggregateRoot,
    such as Inquiry (from the `Divstack.Company.Estimation.Tool.Inquiries.Domain.Inquiries` namespace).
    What are the other classes from the Divstack Estimation Tool?""")
result = qa(question)
result["answer"]

"It seems like you are asking for a list of classes that might be part of the Divstack Estimation Tool. Here's what I can tell you about those based on the information provided:\n\n1. `AskedServiceDto`: This class is used to represent a requested service in the estimation tool. It contains an ID, name, description, category ID, and a list of attributes associated with the service.\n2. `AttributeDto`: This class is used to represent an attribute associated with a service in the estimation tool. It contains an ID and a value ID.\n3. `Service`: This class represents a service in the estimation tool. It has an ID, name, description, category ID, and a list of attributes associated with it.\n4. `InquiryMadeEvent`: This class is used to represent an event that occurs when a user makes an inquiry in the estimation tool. It contains an ID for the inquiry.\n5. `IQuery<TDto>`: This interface is used as a base class for queries in the estimation tool. It provides a generic type parameter `TDto` t

In [9]:
question = "Please write xunit test code that creates a Customer object and tests it succeeded."
result = qa(question)
result["answer"]

Number of requested results 20 is greater than number of elements in index 12, updating n_results = 12


'```\nusing System;\nusing Xunit;\nusing eCommerce.DomainModelLayer.Customers;\nusing eCommerce.Helpers.Domain;\n\nnamespace eCommerce.Tests\n{\n    public class CustomerTests\n    {\n        [Fact]\n        public void Create_ValidFirstNameLastNameEmailCountry_CreatesCustomer()\n        {\n            // Arrange\n            string firstname = "John";\n            string lastname = "Doe";\n            string email = "johndoe@example.com";\n            Country country = new Country("United States", 1);\n\n            // Act\n            Customer customer = Customer.Create(firstname, lastname, email, country);\n\n            // Assert\n            Assert.NotNull(customer);\n            Assert.Equal(Guid.Empty, customer.Id);\n            Assert.Equal("John", customer.FirstName);\n            Assert.Equal("Doe", customer.LastName);\n            Assert.Equal("johndoe@example.com", customer.Email);\n            Assert.Equal(1, customer.CountryId);\n        }\n    }\n}\n```\nThis unit test c

Code generated:
```csharp
using System;
using Xunit;
using eCommerce.DomainModelLayer.Customers;
using eCommerce.Helpers.Domain;

namespace eCommerce.Tests
{
    public class CustomerTests
    {
        [Fact]
        public void Create_ValidFirstNameLastNameEmailCountry_CreatesCustomer()
        {
            // Arrange
            string firstname = "John";
            string lastname = "Doe";
            string email = "johndoe@example.com";
            Country country = new Country("United States", 1);
            
            // Act
            Customer customer = Customer.Create(firstname, lastname, email, country);
            
            // Assert
            Assert.NotNull(customer);
            Assert.Equal(Guid.Empty, customer.Id);
            Assert.Equal("John", customer.FirstName);
            Assert.Equal("Doe", customer.LastName);
            Assert.Equal("johndoe@example.com", customer.Email);
            Assert.Equal(1, customer.CountryId);
            
        }
    }
}
```
This unit test code verifies that creating a `Customer` object with valid first name, last name, email, and country values results in the creation of a new instance of the `Customer` class. The test case also checks for the correctness of the properties of the `Customer` object being created.