# VectorShift Chatbot

## Installing the Library

To use the VectorShift Python library, you should be using Python 3.10 or newer.

The SDK is built upon our API. To access much of the functionality, such as saving and downloading pipelines, you should already have an API key ready.

Our Python SDK is available as the vectorshift package on PyPI. Make sure pip is installed, then install the package by running the following command in your terminal of choice:

In [63]:
! pip install vectorshift --upgrade






## VectorShift Chatbot: Add Your Company Knowledge Base to Chat

The pipeline takes a user question about VectorShift as input (the input node). The question is used to query a vector store that contains the VectorShift documentation; a vector store is a database that supports semantic queries and returns the most relevant pieces of information. The query results are fed into an LLM prompt, together with the user question and the chat memory.

The overall pipeline should look like the figure below:
![alt text](images/vectorshift_chatbot/1-overview.png "Overall Pipeline")
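
To make the retrieval step described above more concrete, here is a minimal, self-contained sketch of the idea. It is an illustration only and does not use the VectorShift SDK: the bag-of-words `embed` function stands in for a learned embedding model, and the names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical helpers invented for this example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (a real store uses a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context: list[str], history: list[str]) -> str:
    """Combine retrieved context, chat history, and the question into one prompt."""
    return (
        "Context:\n" + "\n".join(context)
        + "\n\nConversation history:\n" + "\n".join(history)
        + "\n\nUser question:\n" + question
    )


docs = [
    "pipelines are built from nodes connected by edges",
    "the knowledge base stores documents for semantic search",
    "api keys are created in the dashboard",
]
question = "how does the knowledge base search documents"
context = retrieve(question, docs)
prompt = build_prompt(question, context, history=[])
```

The pipeline built in the rest of this notebook follows the same shape: retrieve relevant context, then combine it with the chat history and the question before calling the LLM.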

As a first step, let's import the SDK and set the API key.

In [47]:
import vectorshift
from vectorshift.node import InputNode, URLLoaderNode, TextNode, VectorQueryNode, OpenAILLMNode, OutputNode, ChatMemoryNode
from vectorshift.pipeline import Pipeline
from vectorshift.knowledge_base import *

In [20]:
vectorshift.api_key = "YOUR_API_KEY"
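
Rather than pasting the key into the notebook, one option is to read it from an environment variable. The sketch below assumes a variable named `VECTORSHIFT_API_KEY`; that name and the `load_api_key` helper are our own choices for illustration, not SDK conventions.

```python
import os


def load_api_key(env_var: str = "VECTORSHIFT_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# Possible usage in this notebook (the variable name is an assumption):
# vectorshift.api_key = load_api_key()
```

This keeps the key out of saved notebooks and version control.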

### Input
Our pipeline takes in one input, which is of type text (the user question). Correspondingly, there's an InputNode class that we can use to represent this input, which requires a name and data type.

![alt text](images/vectorshift_chatbot/2-input_node.png "Input Node")

The data type is more than a constructor argument here. Behind the scenes, node outputs are tagged with different types (e.g. LLMs produce textual output), which can help catch issues with pipelines before they're saved to the VectorShift platform. We list the expected types of different nodes' inputs and outputs in node-specific documentation.

In [21]:
input_node = InputNode(name="User_question", input_type="text")
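
The idea of type-tagged outputs can be pictured with a small toy model. This is not how the SDK is implemented; `TypedOutput` and `check_input` are hypothetical names used only to show how a type tag lets a mismatch surface before a pipeline is saved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedOutput:
    """Hypothetical stand-in for a node output tagged with a data type."""
    source: str
    data_type: str


def check_input(node_name: str, expected_type: str, value: TypedOutput) -> None:
    """Raise if an edge connects an output to an input of a different type."""
    if value.data_type != expected_type:
        raise TypeError(
            f"{node_name} expects '{expected_type}' but {value.source} "
            f"produces '{value.data_type}'"
        )


question = TypedOutput(source="User_question", data_type="text")
check_input("llm", expected_type="text", value=question)  # matching types: no error
```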

### Chat Memory
Chat memory lets the chatbot recall earlier turns of the conversation, for example the last n messages.

![alt text](images/vectorshift_chatbot/3-memory_node.png "Chat Memory Node")

In [55]:
chat_history = ChatMemoryNode(memory_type='Full - Formatted')
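
As a rough mental model of a "last n turns" memory, the sketch below keeps a sliding window with `collections.deque`. It is a standalone illustration, not the ChatMemoryNode implementation; the `WindowMemory` class is hypothetical.

```python
from collections import deque


class WindowMemory:
    """Toy chat memory that keeps only the most recent n turns."""

    def __init__(self, n: int) -> None:
        # A deque with maxlen drops the oldest turn automatically.
        self.turns: deque[tuple[str, str]] = deque(maxlen=n)

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))

    def formatted(self) -> str:
        """Render the remembered turns as 'role: text' lines for a prompt."""
        return "\n".join(f"{role}: {text}" for role, text in self.turns)


memory = WindowMemory(n=2)
memory.add("user", "What is VectorShift?")
memory.add("assistant", "A platform for building AI pipelines.")
memory.add("user", "Does it have a Python SDK?")
history = memory.formatted()  # only the two most recent turns remain
```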

### Knowledge Base Node
A knowledge base lets VectorShift store information about your product. In this demo, we use docs.vectorshift.ai as the source of documentation.

![alt text](images/vectorshift_chatbot/4-knowledge_node.png "Knowledge Node")

Create the knowledge base:

In [54]:
knowledge_base = KnowledgeBase(name="Vectorshift Doc", description="Knowledge Base for SDK related questions")
knowledge_base.save()

{'id': {'id': '667b1b2c7e47d29bce0ce899',
  'name': 'Vectorshift Doc',
  'description': 'Knowledge Base for SDK related questions',
  'chunkSize': 400,
  'chunkOverlap': 0,
  'isHybrid': False,
  'userID': 'auth0|64cbd237b160e37c8c3510d4',
  'orgID': 'Personal',
  'vectorCount': 0,
  'createdDate': '2024-06-25T19:31:56.839678',
  'lastSynced': None,
  'selectedIntegrations': [],
  'vector_db_details': {'vector_db_provider': 'qdrant',
   'collection_name': 'text-embedding-3-small',
   'embedding_model': 'text-embedding-3-small',
   'embedding_provider': 'openai',
   'dimension': None,
   'is_hybrid': False,
   'sparse_embedding_model': None,
   'query_embedding_model': None,
   'sparse_query_embedding_model': None},
  'documents': None,
  'integration_metadata': None,
  'folderId': None,
  'integrationNode': False,
  'fileProcessingImplementation': None,
  'apifyKey': None,
  'conversation_id': None,
  'hide_from_owner': False}}
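
The `chunkSize` and `chunkOverlap` fields in the response suggest that documents are split into chunks before being embedded. The sketch below shows one common way such parameters are interpreted. It assumes the sizes are measured in characters, which the output above does not confirm, and `chunk_text` is a hypothetical helper rather than an SDK function.

```python
def chunk_text(text: str, chunk_size: int = 400, chunk_overlap: int = 0) -> list[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


document = "x" * 1000
chunks = chunk_text(document)  # defaults mirror chunkSize=400, chunkOverlap=0
```

With an overlap of 0 the chunks are disjoint; a positive overlap repeats the tail of each chunk at the start of the next, which can help when a relevant passage straddles a boundary.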

In [None]:
knowledge_base = KnowledgeBase.fetch(base_name="Vectorshift Doc")
knowledge_base.load_documents(document="cl.pdf", document_type="File", document_name="cv")


In [62]:
query_result = knowledge_base.query(query=input_node.output())
query_result



Exception: Internal Server Error

### LLM Node
The LLM node generates the chatbot's answer. It receives a system prompt, supplied here through a TextNode, together with the knowledge base query results, the chat memory, and the user question.

![alt text](images/vectorshift_chatbot/5-llm_node.png "LLM Node")

In [34]:
system_text_raw = """You are a helpful assistant that answers User Question based on Context and Conversational History.

If you are unable to answer the question or if the user requests, direct them to these support resources:
1. Documentation: https://docs.vectorshift.ai/vectorshift/
2. Book a meeting:
https://calendly.com/albert_mao/vectorshift-intro-chat
3. Discord:
https://discord.gg/3bpkv4AX"""
system_text = TextNode(text=system_text_raw)

In [58]:
llm = OpenAILLMNode(
    model="gpt-4", 
    system_input=system_text.output(), 
    prompt_input=[query_result, chat_history.output(), input_node.output()],
)

NameError: name 'query_result' is not defined

### Output

The output of the entire pipeline should be the chatbot's answer, which is produced by the llm node. We can just take that node's output() and package it in an OutputNode, which determines the overall returned value of the pipeline.

Remember that OutputNode is a node that represents, in the pipeline's computation graph, the final value produced. We pass in the output() of llm, which is a NodeOutput, as the input to that node. An OutputNode is a kind of node; a NodeOutput defines what a node returns.

![alt text](images/vectorshift_chatbot/6-output.png "Output Node")

In [14]:
output = OutputNode(
    name="Output", 
    output_type="text", 
    input=llm.output()
)

These are all the nodes we need! The overall structure of the nodes closely follows that of the no-code example. Each node block in the no-code editor became its own object in Python, and each edge between nodes has been represented by the output of one node being passed into the constructor of another.

### Creating and Deploying the Pipeline

Once nodes have been defined, creating a pipeline object is fairly simple, since the node objects themselves already encode the edges between them.

A Pipeline object can be initialized by passing in a list of all nodes, a name, and a description. The list of nodes can be passed in any order.

In [15]:
vectorshift_chatbot = [
    input_node, chat_history, llm, system_text, output, knowledge_base, query_result
]

In [16]:
vectorshift_chatbot_pipeline = Pipeline(
    name="Vectorshift Chatbot",
    description="Answer questions about VectorShift using its documentation",
    nodes=vectorshift_chatbot
)
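
One way to see why the order of the node list does not matter: if every node records which outputs it consumes, a valid execution order can always be recovered with a topological sort. The sketch below uses the standard library's `graphlib` and a hand-written dependency map that loosely mirrors this notebook's nodes. It is an illustration, not the SDK's internals, and the `dependencies` mapping and `execution_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical edge map: each node name maps to the nodes whose outputs it consumes.
dependencies = {
    "input_node": set(),
    "chat_history": set(),
    "system_text": set(),
    "llm": {"system_text", "chat_history", "input_node"},
    "output": {"llm"},
}


def execution_order(deps: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which every node runs after its inputs."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)

# Listing the same nodes in reverse still yields a valid order.
shuffled = dict(reversed(list(dependencies.items())))
shuffled_order = execution_order(shuffled)
```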

A Pipeline object has a few handy methods. Printing it gives a representation of its constituent nodes. To generate code that shows how the object could be constructed, use construction_str(), which assigns generated IDs as variable names for each node.

In [None]:
print(vectorshift_chatbot_pipeline)

In [None]:
print(vectorshift_chatbot_pipeline.construction_str())

To save the pipeline to the VectorShift platform, we create a Config object with our API key and then pass the pipeline object to it.

In [19]:
config = vectorshift.deploy.Config(
    api_key="YOUR_API_KEY",
)

In [None]:
config.save_new_pipeline(vectorshift_chatbot_pipeline)

The constructed pipeline should look like the figure below. You can check it via VectorShift Dashboard -> Pipeline.
![alt text](images/7-combined.png "Overall Pipeline")

### Running a Pipeline

To run a pipeline, fetch it by name and then execute it with pipeline.run:

In [21]:
pipeline = Pipeline.fetch(pipeline_name='Vectorshift Chatbot')

In [22]:
response = pipeline.run(
    inputs = {"input_1": "https://www.vectorshift.ai/"},
    api_key="YOUR_API_KEY"
)

In [23]:
print(response)

{'output_1': "Hello,\nWe are XYZ consulting, specializing in crafting growth strategies.\n\nOur consulting firm can strategically help VectorShift by streamlining their workflow automation processes, effectively leveraging AI technologies across varied data formats. We can offer customized solutions to automate the creation of marketing copy, personalized outbound emails, and call summaries. We can further assist in enhancing their document, video, and audio file analysis capabilities through advanced AI integration. Additionally, we can guide VectorShift to efficiently use their pre-built templates, improve their application's architecture, and make the transfer of work seamless between no-code and their Python SDK. Our expertise can significantly contribute to enhancing their enterprise solutions, report generation, chatbot volume, outbound emails, RFPs, and proposal generators, ultimately driving the company's growth.\n\nAre you available anytime later this week to chat?\n\nBest,\nX