# Augmented-Dev: using Vertex AI

A Jupyter notebook that tests the Vertex AI embedding models and LLMs.

HINT: the local machine must be configured first, including the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, which must point to the credentials file.

see: https://cloud.google.com/vertex-ai/docs/start/client-libraries
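
A quick pre-flight check can catch a missing or mistyped credentials path before the first Vertex AI call fails. The sketch below is only an illustration: `check_credentials` is a hypothetical helper name, and it verifies only that the variable is set and that the file exists, not that the key itself is valid. If the variable is defined in the `.env` file, run the check after that file has been loaded.

```python
import os
from pathlib import Path


def check_credentials(env=None):
    """Return the credentials path if the variable is set and the file exists.

    Hypothetical helper: it does not validate the contents of the key file.
    """
    env = os.environ if env is None else env
    value = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(value).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Credentials file not found: {path}")
    return path
```

Calling `check_credentials()` with no argument reads the real process environment; passing a plain dictionary makes the check easy to try without touching it.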

In [1]:
from langchain_community.document_loaders.generic import GenericLoader
from langchain_community.document_loaders.parsers import LanguageParser
from langchain_community.vectorstores import Chroma

from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationSummaryMemory

from langchain_text_splitters import Language, RecursiveCharacterTextSplitter

from textwrap import dedent
from IPython.display import display, HTML, Markdown

In [2]:
# load environment variables -> with override
%load_ext dotenv
%dotenv ../.env -o

In [3]:
REPO_BASIC = '/Temp/augmented.dev/src/Domain-Driven-BASIC_CS'
REPO_MODULAR_MONOLITH = '/Temp/augmented.dev/src/Modular-monolith-by-example/Src'
REPO_PATH = REPO_MODULAR_MONOLITH
FILE_EXTENSION = '.cs'
CHROMA_DB_SAVE_PATH = '/Temp/augmented.dev/embeddings/vertex-ai'

### Vertex AI specific imports

**TODO**: there are several code models - test them:

1. `code-bison`: "A model fine-tuned to generate code based on a natural language description of the desired code. For example, it can generate a unit test for a function."
2. `codechat-bison`: "A model fine-tuned for chatbot conversations that help with code-related questions."
3. `code-gecko`: "A model fine-tuned to suggest code completion based on the context in code that's written."
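
One way to approach the TODO is to send the same prompt to each model and compare the answers side by side. The sketch below is an assumption about how that could be organized, not part of the notebook: `compare_models` and its `make_llm` factory argument are hypothetical names, and the factory is expected to return an object with an `invoke(prompt)` method, as LangChain LLM wrappers provide.

```python
from typing import Callable


def compare_models(prompt: str, model_names: list[str], make_llm: Callable) -> dict[str, str]:
    """Run one prompt against several models and collect the answers.

    `make_llm` is a hypothetical factory: it takes a model name and returns
    an object exposing `invoke(prompt)`.
    """
    answers = {}
    for name in model_names:
        try:
            answers[name] = str(make_llm(name).invoke(prompt))
        except Exception as exc:
            # Record the failure and continue, so one failing model
            # does not stop the whole comparison.
            answers[name] = f"ERROR: {exc}"
    return answers
```

With the Vertex AI wrapper imported in the next cell, a call might look like `compare_models(question, ["code-bison", "codechat-bison"], lambda name: VertexAI(model_name=name))`.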

In [4]:
from langchain_community.embeddings import VertexAIEmbeddings
from langchain_community.llms import VertexAI

VERTEXAI_EMBEDDING_MODEL = "textembedding-gecko@001"

VERTEXAI_MODEL_BISON = "text-bison@001" # initial, not bad - better than gemini-1.0-pro (sic!)
VERTEXAI_MODEL_CODE_BISON = "code-bison"
VERTEXAI_MODEL_CODECHAT_BISON = "codechat-bison"
VERTEXAI_MODEL_CODE_GECKO = "code-gecko"

# HINT: clean that later - for now, just ignore
import warnings
warnings.filterwarnings('ignore')

embedding_model = VertexAIEmbeddings(model_name=VERTEXAI_EMBEDDING_MODEL)

llm = VertexAI(model_name=VERTEXAI_MODEL_CODE_BISON)

In [5]:
def load_split_documents(repo_path: str, file_extensions: list[str], exclude_files: tuple[str, ...] = ()):
    """Loads all source files with the given extensions and splits them into chunks for embedding."""
    def load_documents():
        # uses the arguments of the enclosing function, not the module-level constants
        loader = GenericLoader.from_filesystem(
            repo_path,
            glob="**/*",
            suffixes=file_extensions,
            exclude=exclude_files,
            parser=LanguageParser(language=Language.CSHARP, parser_threshold=500),
        )
        documents = loader.load()
        print(f' ** loaded files:  {len(documents)}')
        return documents

    # HINT: split code files into chunks of an arbitrary size, with some overlap between neighbouring chunks
    code_splitter = RecursiveCharacterTextSplitter.from_language(
        language=Language.CSHARP, chunk_size=2000, chunk_overlap=200
    )
    documents = load_documents()
    texts = code_splitter.split_documents(documents)
    print(f' ** loaded documents: {len(documents)}; split into chunks:  {len(texts)}')
    return texts
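
To see what `chunk_size` and `chunk_overlap` mean in practice, the following deliberately naive sketch cuts a string into fixed-size windows that share a few characters with their neighbours. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers language-specific separators such as class and method boundaries); `chunk_with_overlap` is a hypothetical name used only for illustration.

```python
def chunk_with_overlap(text: str, chunk_size: int = 2000, chunk_overlap: int = 200) -> list[str]:
    """Cut `text` into windows of `chunk_size` characters that overlap by `chunk_overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

The overlap means that a statement cut at a chunk boundary still appears whole in at least one of the two neighbouring chunks, at the cost of embedding some characters twice.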

### Load or create embeddings

In [6]:
# HINT: try loading existing embeddings from ChromaDB
db = Chroma(
    persist_directory=CHROMA_DB_SAVE_PATH,
    embedding_function=embedding_model
)

if db._collection.count() > 0:
    print(f"Embeddings' database was already initialized: {CHROMA_DB_SAVE_PATH}")
else:
    # the collection is empty, so re-create it from the repository sources
    print("Embeddings' database is empty - re-creating it")
    texts = load_split_documents(REPO_PATH, [FILE_EXTENSION])
    db = Chroma.from_documents(
        documents=texts,
        embedding=embedding_model,
        persist_directory=CHROMA_DB_SAVE_PATH
    )

retriever = db.as_retriever(
    search_type="mmr",  # Also test "similarity"
    search_kwargs={"k": 8},
)
db.persist()
print(f' ** db collection counts:  {db._collection.count()}')

Embeddings' database was already initialized: /Temp/embeddings/vertex-ai
 ** db collection counts:  955
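
The retriever above uses `search_type="mmr"` (maximal marginal relevance), which trades relevance to the query against redundancy among the chunks already picked. The pure-Python sketch below illustrates the idea on plain lists of floats; it is a simplified assumption about the technique, not the code Chroma or LangChain actually runs, and `cosine` and `mmr_select` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query, candidates, k=8, lambda_mult=0.5):
    """Return indices of up to `k` candidates chosen by maximal marginal relevance.

    lambda_mult=1.0 ranks by relevance only; lower values favour diversity.
    """
    selected = []
    remaining = list(range(len(candidates)))

    def score(i):
        relevance = cosine(query, candidates[i])
        # Redundancy: similarity to the closest candidate already selected.
        redundancy = max(
            (cosine(candidates[i], candidates[j]) for j in selected), default=0.0
        )
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a plain `"similarity"` search, near-duplicate chunks of the same file can fill most of the `k` slots; MMR tends to spread the selection over more distinct chunks, which is worth comparing on the questions below.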


### Create conversation-retrieval chain

Some example questions:
```
* Question:
Please write a summary of the Estimation tool: what's the purpose of the project, main technology stack etc. Use at least 5 sentences.
* Question:
Please list all the classes implementing IAggregateRoot, such as e.g. Inquiry (from `Divstack.Company.Estimation.Tool.Inquiries.Domain.Inquiries` namespace). What are the other classes from Divstack Estimation Tool?
* Question:
You are a helpful, experienced software developer, eager to help other people understand code.
What is the purpose of the Inquiry class. What are the use cases (e.g. events) involved with it?
* Question:
You are a helpful, experienced software developer, who writes clean, self-explainable object-oriented C# code.
Please write xunit tests of the Inquiry class, that tests domain events InquiryMadeDomainEvent. Mock the necessary objects when necessary.
* Question:
Please explain the code in Inquiry.cs file using 7 bullet points.
```

In [7]:
memory = ConversationSummaryMemory(
    llm=llm, memory_key="chat_history", return_messages=True
)
qa = ConversationalRetrievalChain.from_llm(llm, retriever=retriever, memory=memory)

In [8]:
question = dedent("""Please write a summary of the Estimation tool: what's the purpose of the project, main technology stack etc.
Use at least 5 sentences.
""")
result = qa(question)
display(Markdown(result['answer']))

 The provided code snippets do not contain any information about the purpose or technology stack of the Estimation tool project.
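
The dedent-ask-display pattern from the cell above recurs for every question, so it could be wrapped in a small helper. This is only a sketch: `ask` is a hypothetical name, and it assumes the chain is callable with a question string and returns a mapping with an `"answer"` key, as `qa` does in this notebook.

```python
from textwrap import dedent


def ask(chain, question: str) -> str:
    """Send a (possibly indented, multi-line) question to the chain and return the answer text.

    Hypothetical helper: `chain` is any callable returning a mapping with an "answer" key.
    """
    result = chain(dedent(question).strip())
    return result["answer"]
```

In the notebook this would be used as `display(Markdown(ask(qa, question)))`.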

### Interactive extension

In [9]:
%run _interactive.ipynb


### Previous results

In [8]:
question = dedent("""Please list all the classes implementing IAggregateRoot, 
    such as e.g. Inquiry (from `Divstack.Company.Estimation.Tool.Inquiries.Domain.Inquiries` namespace).
    What are the other classes from Divstack Estimation Tool?""")
result = qa(question)
display(Markdown(result["answer"]))

Without seeing more context, it is difficult to determine the specific classes from the `Divstack.Company.Estimation.Tool` namespace. However, based on the given context, we can infer that there are classes related to inquiries, valuations, priorities, services, and attributes within this namespace.

In [9]:
question = dedent("""You are a helpful, experienced software developer, eager to help other people understand code.
What is the purpose of the Inquiry class. What are the use cases (e.g. events) involved with it?""")
result = qa(question)
display(Markdown(result["answer"]))

 The Inquiry class is a domain entity that represents a customer inquiry for services. It is used to create an inquiry, validate the services and client, and publish an event when the inquiry is made. The involved use cases for this class include creating an inquiry, validating services, and publishing an event.

In [10]:

question = dedent("""You are a helpful, experienced software developer, who writes clean, self-explainable object-oriented C# code.
Please write xunit tests of the Inquiry class, that tests domain events InquiryMadeDomainEvent. Mock the necessary objects when necessary.""")
result = qa(question)
display(Markdown(result["answer"]))


Sure, here's an example of how you could write tests for the Inquiry class, using xUnit and NSubstitute for mocking:

```
public class InquiryTests
{
    // The Inquiry class requires a service existing checker and a client object in its constructor, so we'll mock those using NSubstitute.
    private readonly IServiceExistingChecker _serviceExistingChecker = Substitute.For<IServiceExistingChecker>();
    private readonly Client _client = Substitute.For<Client>();

    // We'll also create a fake list of services to use in our tests.
    private readonly List<Service> _services = new List<Service>
    {
        new Service("Service 1"),
        new Service("Service 2")
    };

    // The MakeAsync method is the main method we want to test, so we'll start with a test for that.
    [Fact]
    public async Task Given_ValidParameters_Then_InquiryIsMade()
    {
        // Arrange
        // We'll use the SetupServiceExistingChecker method to set up our mock service existing checker to return true.
        SetupServiceExistingChecker(true);

        // Act
        // We'll call the MakeAsync method on the Inquiry class, passing in our mock objects.
        var inquiry = await Inquiry.MakeAsync(_services
```

In [11]:
question = "Please write xunit test code that creates a Inquiry object and tests it succeeded."
result = qa(question)
display(Markdown(result["answer"]))



Unfortunately, without access to the code for the Inquiry class, it is not possible to provide an example of how to write xunit tests for it. Additionally, the provided context does not include any information about the specific functionality of the Inquiry class, so it is not possible to accurately write a test for its success. It is recommended to review the documentation for xunit and the specific functionality of the Inquiry class to determine the appropriate way to write tests for it.

In [13]:
question = dedent("""Please explain the code in Inquiry.cs file using 7 bullet points.""")
result = qa(question)
display(Markdown(result["answer"]))


1. The Inquiry.cs file is a part of the Divstack.Company.Estimation.Tool.Inquiries domain and it contains the Inquiry class.
2. The Inquiry class is a specific type of entity that represents an inquiry made by a client for a particular service.
3. The class contains properties such as Id, ClientFirstName, ClientLastName, and ClientEmail that store information about the inquiry.
4. The class also has a private constructor and a public factory method that is used to create a new instance of the Inquiry class.
5. The class has a private Service property that stores information about the service requested in the inquiry.
6. The class also has a private InquiryItemId property that stores the unique identifier for the inquiry item.
7. Finally, the class has a Create method that is used to create a new instance of the Inquiry class.

## Some other code

Code generated:
```
using System;
using Xunit;
using eCommerce.DomainModelLayer.Customers;
using eCommerce.Helpers.Domain;

namespace eCommerce.Tests
{
    public class CustomerTests
    {
        [Fact]
        public void Create_ValidFirstNameLastNameEmailCountry_CreatesCustomer()
        {
            // Arrange
            string firstname = "John";
            string lastname = "Doe";
            string email = "johndoe@example.com";
            Country country = new Country("United States", 1);
            
            // Act
            Customer customer = Customer.Create(firstname, lastname, email, country);
            
            // Assert
            Assert.NotNull(customer);
            Assert.Equal(Guid.Empty, customer.Id);
            Assert.Equal("John", customer.FirstName);
            Assert.Equal("Doe", customer.LastName);
            Assert.Equal("johndoe@example.com", customer.Email);
            Assert.Equal(1, customer.CountryId);
            
        }
    }
}
```
This unit test code verifies that creating a `Customer` object with valid first name, last name, email, and country values results in the creation of a new instance of the `Customer` class. The test case also checks for the correctness of the properties of the `Customer` object being created.