
# What Are Connectors?
Connectors are independent REST APIs that can be used in a RAG workflow to provide secure, real-time access to private data.

In enterprises, data lives in many different places. The ability of enterprises to realize the full value of RAG rests on their ability to bring these data sources together. Cohere’s build-your-own connectors framework enables developers to develop a connector to any datastore that offers an accompanying search API.

At a high level, here’s what connectors do. When the Chat endpoint calls a connector, it sends a query to that connector’s search endpoint. The connector then returns the list of documents it deems most relevant to the query.

The build-your-own connectors framework allows developers to build any logic behind a connector. For example, you can define the retrieval implementation—whether it’s running a semantic similarity search over a vector database, searching over an existing full-text search engine, or utilizing the existing search APIs of platforms like Google Drive or Notion.
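To make the idea of "any logic behind a connector" concrete, here is a minimal sketch of the retrieval function a connector's search endpoint might wrap. It is not Cohere's implementation: the in-memory `DOCUMENTS` list, the `tokenize` and `search` names, the keyword-overlap ranking, and the `{"results": [...]}` return shape are all assumptions chosen for illustration. A real connector would call a vector database, a full-text engine, or a platform search API instead.

```python
import re

# Hypothetical in-memory datastore; a real connector would query a search API instead.
DOCUMENTS = [
    {"id": "doc-1", "title": "Vacation policy", "text": "Employees accrue vacation days every month."},
    {"id": "doc-2", "title": "Expense policy", "text": "Submit expense reports within 30 days."},
    {"id": "doc-3", "title": "Onboarding", "text": "New employees receive a laptop on day one."},
]


def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query, top_k=2):
    """Return up to top_k documents, ranked by how many query words they share."""
    query_tokens = tokenize(query)
    scored = []
    for doc in DOCUMENTS:
        overlap = len(query_tokens & tokenize(doc["title"] + " " + doc["text"]))
        if overlap > 0:
            scored.append((overlap, doc))
    # Highest overlap first; ties keep their original order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Assumed response shape: a dict with a "results" list of documents
    return {"results": [doc for _, doc in scored[:top_k]]}


print(search("vacation days"))
```

Swapping the body of `search` for a different retrieval strategy leaves the calling side unchanged, which is the point of putting the logic behind a connector.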

Additionally, in connector mode, most of the RAG building blocks are taken care of by the endpoint. This includes deciding whether to retrieve information, generating queries, retrieving documents, chunking and reranking documents (post-retrieval), and generating the response.

Recall that in the previous chapter (document mode), we implemented the following steps.

- Step 1: Get the user message
- Step 2: Call the Chat endpoint in query-generation mode
- If at least one query is generated:
  - Step 3: Retrieve and rerank relevant documents
  - Step 4: Call the Chat endpoint in document mode to generate a grounded response with citations
- If no query is generated:
  - Step 4: Call the Chat endpoint in normal mode to generate a direct response

In connector mode, this is simplified to the following two steps.

- Step 1: Get the user message
- Step 2: Call the Chat endpoint in connector mode to generate a response (this can be either a grounded response with citations or a direct response)
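The difference between the two flows can be sketched as plain control flow. This is only an illustration: `generate_queries`, `retrieve`, and `chat` are hypothetical callables standing in for the real endpoint calls and retrieval code, and the function names are made up for this sketch.

```python
def document_mode_turn(message, generate_queries, retrieve, chat):
    """One chat turn in document mode: the application orchestrates every step."""
    queries = generate_queries(message)       # Step 2: query generation
    if queries:
        documents = retrieve(queries)         # Step 3: retrieve and rerank
        return chat(message, documents=documents)  # Step 4: grounded response
    return chat(message)                      # Step 4: direct response


def connector_mode_turn(message, chat, connectors):
    """One chat turn in connector mode: a single call, the endpoint does the rest."""
    return chat(message, connectors=connectors)
```

In connector mode the branching disappears from the application code because the endpoint decides whether to retrieve at all.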

# Step-by-Step Guide

We’ll build a RAG chatbot that can search the web, retrieve relevant results to a user query, and generate grounded responses to the query.

In [None]:
!pip install cohere -q

In [None]:
import cohere
import uuid
from cohere import ChatConnector

In [None]:
co = cohere.Client("COHERE_API_KEY") # Get your free API key: https://dashboard.cohere.com/api-keys

# Create the Chatbot Component

The change from document mode to connector mode requires just one change to the Chat endpoint, which is swapping the documents parameter with the connectors parameter.

Here’s how it looks with the web search connector. We supply the connector ID, `web-search`, as an argument to the `connectors` parameter.



> response = co.chat_stream(message="What is LLM university",
     connectors=[ChatConnector(id="web-search")])



The one line of code above is enough to get a full RAG-enabled response—the response text, the citations, and the source documents, which in this case are snippets from the most relevant information available on the web based on a given user message.
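For a single, non-streamed request, the three parts of the response can be read directly from the returned object. The helper below is a sketch under assumptions: `ask` is a hypothetical name, and `client` is assumed to be a v1-style Cohere client whose `chat` method accepts `message` and `connectors` and returns an object with `text`, `citations`, and `documents` attributes.

```python
def ask(client, message, connectors):
    """Send one message in connector mode and return the parts of the RAG response."""
    response = client.chat(message=message, connectors=connectors)
    # citations and documents may be empty or None when the model answers directly
    return response.text, response.citations, response.documents


# Example usage with the client and connector class from this notebook (needs an API key):
# text, citations, documents = ask(
#     co, "What is LLM university", [ChatConnector(id="web-search")]
# )
```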

But in order to run this in a multi-turn chatbot scenario, we need to build the chatbot component. The good news is that we can adapt the chatbot we built in the previous chapter.

There are a few changes to make, including:

- Remove the query generation logic (done by the endpoint)
- Remove the retrieval logic (done by the endpoint)
- Change the Chatbot initialization to use connectors instead
- Use the connectors parameter instead of documents in the Chat endpoint call

In [None]:
class Chatbot:
  def __init__(self, connectors : list[str]):
        """
        Initializes an instance of the Chatbot class.

        Parameters:
        connectors (list[str]): The IDs of the connectors to use, such as "web-search".

        """

        self.conversation_id = str(uuid.uuid4())
        self.connectors = [ ChatConnector(id = connector) for connector in connectors]

  def run(self):
        """
        Runs the chatbot application.

        """

        while True:
          # Get the user message
          message = input("user: ")

          # Typing "quit" ends the conversation
          if message.lower() == "quit":
            print("Ending Chat... ")
            break

          # else:                       # If using Google Colab, remove this line to avoid printing the same thing twice
          #   print(f"User: {message}") # If using Google Colab, remove this line to avoid printing the same thing twice

          # Generate response
          response = co.chat_stream(
                    message=message,
                    model="command-r-plus",
                    conversation_id=self.conversation_id,
                    connectors=self.connectors,
            )

          # Print the chatbot response, citations, and documents
          print("\nChatbot:")
          citations = []
          cited_documents = []

          # Display response
          for event in response:
            if event.event_type == "text-generation":
              print(event.text, end = "")
            elif event.event_type == "citation-generation":
              citations.extend(event.citations)
            elif event.event_type == "stream-end":
              cited_documents = event.response.documents

          # Display citations and source documents
          if citations:
            print("\n\nCITATIONS:")
            for citation in citations:
              print(citation)

            print("\nDOCUMENTS:")
            for document in cited_documents:
              print({'id': document['id'],
                      'snippet': document['snippet'][:400] + '...',
                      'title': document['title'],
                      'url': document['url']})

          print(f"\n{'-'*100}\n")

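The event-handling loop inside `run()` can also be pulled out into a small function, which makes it possible to try the logic without calling the API. This is a sketch, not part of the original chatbot: `collect_stream` is a hypothetical helper, and the `fake_events` list is hand-written stand-in data that mimics only the attributes the loop reads, not real API output.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Gather the response text, citations, and documents from streamed events."""
    text_parts, citations, documents = [], [], []
    for event in events:
        if event.event_type == "text-generation":
            text_parts.append(event.text)
        elif event.event_type == "citation-generation":
            citations.extend(event.citations)
        elif event.event_type == "stream-end":
            # Fall back to an empty list if no documents were retrieved
            documents = event.response.documents or []
    return "".join(text_parts), citations, documents


# Hand-written stand-ins for streamed events (illustrative only)
fake_events = [
    SimpleNamespace(event_type="stream-start"),
    SimpleNamespace(event_type="text-generation", text="Hello"),
    SimpleNamespace(event_type="text-generation", text=" world"),
    SimpleNamespace(
        event_type="citation-generation",
        citations=[{"start": 6, "end": 11, "text": "world", "document_ids": ["doc_0"]}],
    ),
    SimpleNamespace(
        event_type="stream-end",
        response=SimpleNamespace(documents=[{"id": "doc_0", "title": "Example"}]),
    ),
]

text, citations, documents = collect_stream(fake_events)
print(text)
```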



We can now run the chatbot. For this, we create an instance of `Chatbot` using Cohere's managed web-search connector. Then we run the chatbot by invoking the `run()` method.

The format of each citation is:
- `start`: The starting point of a span where one or more documents are referenced
- `end`: The ending point of a span where one or more documents are referenced
- `text`: The text representing this span
- `document_ids`: The IDs of the documents being referenced; each matches the `id` of a document returned with the response (in document mode, `doc_0` is the ID of the first document passed to the `documents` parameter in the endpoint call, and so on)
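As an illustration of how these fields fit together, the sketch below inserts the referenced document IDs after each cited span. It rests on assumptions: `annotate` is a hypothetical helper, citations are represented as plain dicts (the SDK may return objects exposing the same field names as attributes), `start` and `end` are treated as character offsets with `end` exclusive, and the sample text and citations are made up.

```python
def annotate(text, citations):
    """Insert "[doc ids]" markers after each cited span of the response text."""
    annotated = text
    # Work from the last span to the first so earlier offsets stay valid
    for citation in sorted(citations, key=lambda c: c["end"], reverse=True):
        marker = "[" + ", ".join(citation["document_ids"]) + "]"
        annotated = annotated[:citation["end"]] + marker + annotated[citation["end"]:]
    return annotated


# Made-up response text and citations, for illustration only
sample_text = "LLM University is a set of courses."
sample_citations = [
    {"start": 0, "end": 14, "text": "LLM University", "document_ids": ["doc_0"]},
    {"start": 20, "end": 34, "text": "set of courses", "document_ids": ["doc_0", "doc_1"]},
]

print(annotate(sample_text, sample_citations))
# prints: LLM University[doc_0] is a set of courses[doc_0, doc_1].
```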

In [None]:
# Define the connector
connector = ['web-search']

# Create an instance of the Chatbot class
chatbot = Chatbot(connector)

# Run the chatbot
chatbot.run()

user: hello 

Chatbot:
Hello! How can I help you today?
----------------------------------------------------------------------------------------------------

user: how are you doing today?

Chatbot:
I don't have feelings, so I don't have good or bad days. How are you doing today?
----------------------------------------------------------------------------------------------------

user: tell me about yourself?

Chatbot:
I am an AI assistant chatbot trained to assist human users by providing thorough responses. I am designed to be helpful and harmless. I am always learning and evolving based on the data I receive. My goal is to make your life easier and more efficient by providing you with the information you need when you need it.
----------------------------------------------------------------------------------------------------

user: can you tell me briefly about what you can help me with? what is your knowledge base?

Chatbot:
I can help you with a wide range of tasks and topics. My