# Conversation Intelligence: Gong.io Open-Source Alternative AI Sales Assistant

In this lesson, we will explore how LangChain, Deep Lake, and GPT-4 can be used to build a sales assistant that gives advice to salespeople while taking internal guidelines into consideration.

## Introduction

This article provides an in-depth look at SalesCopilot, a sales call assistant that connects you to a chatbot that understands the context of your conversation. A key feature of SalesCopilot is its ability to detect potential objections from customers and deliver recommendations on how best to address them.

The article walks through the challenges faced and the solutions discovered during the creation of the project. You'll learn about two distinct approaches that didn't work and how these failures paved the way for an effective solution.

Firstly, the authors tried to rely solely on the LLM, but they encountered issues such as response inconsistency and slow response times with GPT-4. Secondly, they naively split the custom knowledge base into chunks, but this approach led to context misalignment and inefficient results.
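To make the context-misalignment problem concrete, here is a minimal sketch of naive fixed-size chunking. The two-entry knowledge base and the `naive_chunks` helper are made up for illustration; they are not taken from SalesCopilot.

```python
# Hypothetical two-entry knowledge base.
text = (
    '1. Objection: "It is too expensive." Response: Compare the cost to time saved. '
    '2. Objection: "We use a competitor." Response: Ask what they would improve.'
)


def naive_chunks(text, size):
    """Cut the text every `size` characters, ignoring its structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


chunks = naive_chunks(text, 60)
for chunk in chunks:
    print(repr(chunk))
```

With a chunk size of 60, the second chunk starts in the middle of the first response and runs into the second objection, so a similarity search could return a passage that mixes guidance for two different objections.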

After these unsuccessful attempts, a more intelligent way of splitting the knowledge base based on its structure was adopted. This change greatly improved the quality of responses and ensured better context grounding for LLM responses. This process is explained in detail, helping you grasp how to navigate similar challenges in your own AI projects.

Next, the article explores how SalesCopilot was integrated with Deep Lake. This integration enhanced SalesCopilot's capabilities by retrieving the most relevant responses from a custom knowledge base, thereby creating a persistent, efficient, and highly adaptable solution for handling customer objections.

### Did Work: Intelligent Splitting

In our example text, there is a set structure to each individual objection and its recommended response. Rather than split the text based on size, why don’t we split the text based on its structure? We want each chunk to begin with the objection, and end before the “Objection” of the next chunk. Here’s how we could do it:
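A minimal sketch of structure-based splitting, assuming each entry in the knowledge base starts with a number followed by a period and a space (for example `1. Objection: ...`). The sample text and the `split_by_structure` name are hypothetical and only serve the illustration.

```python
import re

# Hypothetical sample; the real knowledge base is not shown here.
SAMPLE = (
    '1. Objection: "It\'s too expensive."\n'
    'Response: Break the price down into cost per user.\n'
    '2. Objection: "We already use a competitor."\n'
    'Response: Ask what they would change about their current tool.\n'
)


def split_by_structure(text):
    """Split before each numbered entry so every chunk holds one objection."""
    # The lookahead keeps the number with its entry; the lookbehind avoids
    # splitting inside multi-digit numbers such as "10. ".
    chunks = re.split(r'(?<!\d)(?=\d+\. )', text)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


chunks = split_by_structure(SAMPLE)
for chunk in chunks:
    print(chunk)
    print('---')
```

Each chunk now begins with its objection and ends just before the next one, so the objection and its recommended response stay together.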

*First, we take our knowledge base and embed it, storing the embeddings in a Deep Lake vector database. Then, when we detect an objection in the transcript, we embed the objection and use it to search our database, retrieving the most similar guidelines. We then pass those guidelines along with the objection to the LLM and send the result to the user.*
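The flow described above can be sketched without any external service. This example is only an illustration: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, the in-memory `index` list stands in for the Deep Lake vector store, and the guidelines are made up.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r'[a-z]+', text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Made-up guidelines, one per objection.
guidelines = [
    'Objection: the price is too expensive. Response: break down the cost per user.',
    'Objection: we already use a competitor. Response: ask what they would change.',
    'Objection: now is not a good time. Response: offer a short follow-up call.',
]

# Step 1: embed the knowledge base once and keep the vectors.
index = [(toy_embed(g), g) for g in guidelines]


def retrieve(objection, k=1):
    """Step 2: embed the objection and return the k most similar guidelines."""
    query = toy_embed(objection)
    ranked = sorted(index, key=lambda item: cosine(query, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]


# Step 3 (not shown): pass the objection and the retrieved guidelines to the LLM.
top = retrieve('This is too expensive for our budget')
print(top[0])
```

A real embedding model captures meaning rather than word overlap, but the retrieval step has the same shape: embed the query, rank stored vectors by similarity, and keep the top results.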

## Creating, Loading, and Querying Our Database for AI

We’re going to define a class that handles the database creation, database loading, and database querying.

import os
import re
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import DeepLake

class DeepLakeLoader:
    def __init__(self, source_data_path):
        self.source_data_path = source_data_path
        self.file_name = os.path.basename(source_data_path) # What we'll name our database 
        self.data = self.split_data()
        if self.check_if_db_exists():
            self.db = self.load_db()
        else:
            self.db = self.create_db()

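    def split_data(self):
        """
        Preprocess the data by splitting it into passages.

        If using a different data source, this function will need to be modified.

        Returns:
            split_data (list): List of passages.
        """
        with open(self.source_data_path, 'r') as f:
            content = f.read()
        # Split before each numbered entry ("1. ", "2. ", ...). The lookbehind
        # prevents a split inside multi-digit numbers such as "10. ".
        split_data = re.split(r'(?<!\d)(?=\d+\. )', content)
        if split_data[0] == '':
            split_data.pop(0)
        # Drop fragments that are too short to be a real entry
        split_data = [entry for entry in split_data if len(entry) >= 30]
        return split_data

    def check_if_db_exists(self):
        """
        Check if the database already exists.

        Not shown in the original listing; assumed here to look for the local
        dataset directory that load_db and create_db use.
        """
        return os.path.exists(f'deeplake/{self.file_name}')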

There are a few things happening here. First, the data is processed by a method called split_data.

Since we know the structure of our knowledge base, we use this method to split it into individual entries, each representing an example of a customer objection. When we run our similarity search using the detected customer objection, this will improve the results, as outlined above.

After preprocessing the data, we check if we’ve already created a database for this data. One of the great things about Deep Lake is that it provides us with persistent storage, so we only need to create the database once. If you restart the app, the database doesn’t disappear!
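The create-once, load-afterwards pattern can be illustrated with plain files. This sketch is a stand-in built on assumptions: `load_or_create`, `build_store`, the JSON file, and the temporary directory are hypothetical and only mimic the idea of a persistent store; they are not Deep Lake APIs.

```python
import json
import os
import tempfile

build_calls = 0  # Counts how often the expensive build step runs.


def build_store(passages):
    """Stand-in for the costly step (embedding every passage)."""
    global build_calls
    build_calls += 1
    return {'passages': passages}


def load_or_create(path, passages):
    """Load the store from disk if present; otherwise build and persist it."""
    if os.path.exists(path):
        with open(path, 'r', encoding='utf-8') as f:
            return json.load(f)
    store = build_store(passages)
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(store, f)
    return store


with tempfile.TemporaryDirectory() as tmp:
    store_path = os.path.join(tmp, 'salestesting.json')
    first = load_or_create(store_path, ['1. Objection: too expensive.'])
    # The second call simulates restarting the app: the store is loaded, not rebuilt.
    second = load_or_create(store_path, ['1. Objection: too expensive.'])

print(build_calls)
```

The build step runs only on the first call, which is the property that makes persistent storage attractive when the build involves paid embedding requests.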

Creating and loading the database is super easy. Both steps are methods of DeepLakeLoader:

def load_db(self):  
    """  
    Load the database if it already exists.  

    Returns:  
        DeepLake: DeepLake object.  
    """  
    return DeepLake(dataset_path=f'deeplake/{self.file_name}', embedding_function=OpenAIEmbeddings(), read_only=True)  

def create_db(self):  
    """  
    Create the database if it does not already exist.  

    Databases are stored in the deeplake directory.  

    Returns:  
        DeepLake: DeepLake object.  
    """  
    return DeepLake.from_texts(self.data, OpenAIEmbeddings(), dataset_path=f'deeplake/{self.file_name}')



Just like that, our knowledge base becomes a vector database that we can now query:

def query_db(self, query):  
    """  
    Query the database for passages that are similar to the query.  

    Args:  
        query (str): Query string.  

    Returns:  
        content (list): List of passages that are similar to the query.  
    """  
    results = self.db.similarity_search(query, k=3)  
    content = []  
    for result in results:  
        content.append(result.page_content)  
    return content



## Connecting Our Database to GPT-4

Now, all we need to do is connect our LLM to the database. First, we need to create a DeepLakeLoader instance with the path to our data.

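# Load the database from Deep Lake (or create it on the first run)
db = DeepLakeLoader('data/salestesting.txt')
# Retrieve the guidelines most similar to the detected objection
# (detected_objection is a string holding the objection found in the transcript)
results = db.query_db(detected_objection)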

To have our LLM generate a message based on these results and the objection, we use LangChain. In this example, we use a placeholder for the prompt; to see the prompts used in SalesCopilot, check out the prompts.py file.

from langchain.chat_models import ChatOpenAI
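from langchain.schema import SystemMessage, HumanMessage

# ChatOpenAI() uses the library's default model; pass model_name='gpt-4' to use GPT-4
chat = ChatOpenAI()
# objection_prompt is a placeholder for the system prompt text
system_message = SystemMessage(content=objection_prompt)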
human_message = HumanMessage(content=f'Customer objection: {detected_objection} | Relevant guidelines: {results}')

response = chat([system_message, human_message])


# Print the results
print(response.content)
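Because `results` is a Python list, interpolating it directly into the f-string sends the list's `repr` (brackets, quotes, escaped newlines) to the model. One possible alternative, shown here as a sketch, is to format the guidelines as a labeled list first. The `build_human_prompt` helper and the sample strings are hypothetical and not part of SalesCopilot.

```python
def build_human_prompt(objection, guidelines):
    """Combine the detected objection and retrieved guidelines into one prompt string."""
    if guidelines:
        listed = '\n'.join(
            f'[{i}] {text.strip()}' for i, text in enumerate(guidelines, start=1)
        )
    else:
        listed = '(no matching guidelines found)'
    return f'Customer objection: {objection}\n\nRelevant guidelines:\n{listed}'


# Made-up inputs standing in for detected_objection and the query_db results.
prompt = build_human_prompt(
    'It costs too much.',
    ['Objection: price.\nResponse: show the return on investment.\n'],
)
print(prompt)
```

The returned string could then be passed as the `content` of the `HumanMessage` in place of the inline f-string.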