### What this notebook is for
This notebook examines ChatGPT's ability to select the right function to call, given a user's prompt.

*BIG IDEA*: The main idea is best illustrated with an example.

Given the prompt "Give me the 10 most qualified react developers in Lagos", I expect a function related to talent search on the Loubby platform, specifically `getTalents`, to be selected, along with the necessary parameters.

This notebook documents the process of obtaining that result and explores how well ChatGPT can produce it.

In [2]:
import pymongo
import datetime
import key
import csv

In [5]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, AIMessage, SystemMessage
from langchain.prompts import PromptTemplate
import openai
import json

from langchain.vectorstores.mongodb_atlas import MongoDBAtlasVectorSearch
from langchain.chains import RetrievalQA
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
    MessagesPlaceholder
)
from langchain.schema import HumanMessage, SystemMessage

In [6]:
embeddings = OpenAIEmbeddings()
chat = ChatOpenAI(temperature=0.7, model='gpt-4-1106-preview', api_key=key.OPENAI_API_KEY)

In [6]:
# Convert a string into its embedding vector
def generate_embedding(text: str) -> list[float]:
    query_vector = embeddings.embed_query(text)
    return query_vector
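
The vectors returned by `generate_embedding` are useful because texts with similar meaning tend to get vectors that point in similar directions. Below is a minimal sketch of the kind of comparison a vector index performs, using cosine similarity on small made-up vectors. Real embeddings have far more dimensions, and the similarity metric actually used depends on how the index was configured.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for real embeddings (assumption).
query_vec = [1.0, 0.0, 1.0]
close_vec = [2.0, 0.0, 2.0]  # same direction as the query
far_vec = [0.0, 1.0, 0.0]    # orthogonal to the query

print(cosine_similarity(query_vec, close_vec))
print(cosine_similarity(query_vec, far_vec))
```

A score near 1 means the two texts are likely about the same thing; a score near 0 means they are unrelated.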

In [7]:
# Creating connection to the loubbydb using the connection string
client = pymongo.MongoClient(key.Loubby_string)
client.admin.command('ping')

{'ok': 1}

In [8]:
# Accessing the loubby database and collection 
db = client.loubbyDb
action_collection = db.actionSpaceText

RAG only works here when the content is available as plain text (the conventional text dump, added manually). However, the actionSpace currently exists as a collection of documents (much like a table). So the experiment splits into two branches:

1. Continue with the current schema and perform RAG manually.
2. Turn the actionSpace information into a list of strings containing all the information we currently have.
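
For the second branch, each action document has to become a single string before it can be embedded. Here is a minimal sketch of that flattening step, assuming a simple `key: value` layout. The preprocessing actually used for the collection may differ, and `action_to_text` is a hypothetical helper name.

```python
def action_to_text(action: dict) -> str:
    """Flatten one action document into a single 'key: value' string."""
    parts = []
    for field, value in action.items():
        if field == "_id":  # skip MongoDB's internal identifier
            continue
        parts.append(f"{field}: {value}")
    return "\n".join(parts)

# Abbreviated example document; the "_id" value is a placeholder.
sample_action = {
    "_id": "placeholder-id",
    "task": "Talent search",
    "function_name": "getTalents",
    "rest_api": "GET",
}

action_texts = [action_to_text(sample_action)]
print(action_texts[0])
```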

## Experiment 1: Continuing with current schema and manually performing RAG

In [9]:
explanation = """

{'task': 'Talent search', 'function_name': 'getTalents', 'frontend_url': '/dashboard/talents', 'backend_url': '/user/info', 'parameters/args': 
'jobType, professionalLevel, country,  experienceYears, degrees, skill, role, state ', 'Parameter Description': "['jobType' -> Takes a JSON object with the 
following keys (fullTime, partTime, contract, internship, remote) and boolean values (example {'fullTime': True, 'partTime': True, 'contract': False, 'internship': 
False, 'remote': False});\n\n'professionalLevel' -> Takes a JSON object with the following keys (entryLevel, juniorLevel, middleLevel, seniorLevel, directorLevel,
 VPabove) and boolean values (example {'entryLevel': True, 'juniorLevel': True, 'middleLevel': False, 'seniorLevel': False, 'directorLevel': False, 
 'VPabove': True});\n\n'country' -> Takes a string corresponding country names example 'Nigeria', 'United States', 'China';\n\n'experienceYears' -> Takes a string 
 from one of '0-2 experience years', '2-5 experience years', '5-10 experience years';\n\n'degrees' -> Takes a string from one of 'Bachelor's Degree', 'Master's 
 Degree', 'PhD';\n\n'skills' -> Takes a list of strings highlighting the candidate(s)'s skills example ['Python', 'Java', 'React', 'Node', 'JavaScript', 'SQL', 
 'Tableau', 'Laveral', 'Flutter'];\n\n'role' -> A string highlighting the job headline, example 'Data Analyst', 'Software Engineer', 'Chef';\n\n'state' -> A 
 string highlighting the state in the country where the candidate(s) reside(s) and the state code in parentheses example ['Abuja (ABJ)', 'Texas (TX)', 'London 
 (LDN)']", 'descriptions': 'returns information about candidates / job seekers / talents based on the arguments passed', 'rest_api': 'GET', 'front_endpoints': '
 https://https://app.loubby.ai/dashboard/talents', 'back_endpoints': 'https://api.loubby.ai/api/v1/user/info'}

"""

template = 'Explanation: {explanation} \n\nquery: {query}'

prompt = PromptTemplate(
    input_variables=['query'],
    template=template,
    partial_variables={'explanation': explanation}
)

chat = ChatOpenAI(temperature=0.7, model='gpt-4-1106-preview')

ourConvo = chat([
    SystemMessage(content="You are a bot for the Loubby AI product, and you are called Jabbari. For the query, you are to output a JSON containing: 1) a text response from you which could either be a follow up question (if the user's query is VAGUE or NOT CLOSE TO SPECIFIC) or just a response to tell the user that you've done what they've asked; 2) The function name; 3) The appropriate values of each parameter, also given in JSON format; 4) front end URL 5) back end URL"),
    HumanMessage(content=prompt.format(query='Find me 10 data scientists in Lagos'))
])

print(ourConvo.content)

```json
{
  "text_response": "I have executed the 'getTalents' function to search for 10 data scientists in Lagos.",
  "function_name": "getTalents",
  "parameters": {
    "jobType": {"fullTime": true, "partTime": false, "contract": false, "internship": false, "remote": false},
    "professionalLevel": {"entryLevel": false, "juniorLevel": true, "middleLevel": true, "seniorLevel": true, "directorLevel": false, "VPabove": false},
    "country": "Nigeria",
    "experienceYears": "Not specified",
    "degrees": "Not specified",
    "skills": ["Data Science"],
    "role": "Data Scientist",
    "state": "Lagos (LOS)"
  },
  "front_end_url": "https://app.loubby.ai/dashboard/talents",
  "back_end_url": "https://api.loubby.ai/api/v1/user/info"
}
```
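
The reply arrives as text wrapped in a Markdown code fence, so it has to be unwrapped before it can be used as data. Below is a minimal sketch of that step, assuming the reply is either bare JSON or JSON inside a single fence. `parse_model_json` is a hypothetical helper, not part of LangChain.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built indirectly to keep this example readable

# Optional fence (with an optional "json" tag) around the payload.
FENCE_RE = re.compile(r"^`{3}(?:json)?\s*(.*?)\s*`{3}$", re.DOTALL)

def parse_model_json(reply: str) -> dict:
    """Strip an optional Markdown code fence and decode the JSON inside."""
    text = reply.strip()
    match = FENCE_RE.match(text)
    if match:
        text = match.group(1)
    return json.loads(text)

# Shortened stand-in for a model reply (assumption, not a recorded output).
sample_reply = (
    FENCE + "json\n"
    + '{"function_name": "getTalents", "parameters": {"country": "Nigeria"}}\n'
    + FENCE
)
parsed = parse_model_json(sample_reply)
print(parsed["function_name"])
```

`json.loads` raises `json.JSONDecodeError` on malformed replies, which is a convenient place to trigger a retry or a fallback message.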


### A. Try implementing RAG straight away

In [10]:
vectorStore = MongoDBAtlasVectorSearch(collection=action_collection, embedding=embeddings, index_name='action-description-index', text_key='func_doc', embedding_key='embedded_description')
retriever = vectorStore.as_retriever()

The legacy completion models appear to have been deprecated by OpenAI, so the plain completions route is not an option here. Let's try the chat models instead.

In [11]:
RAG_PROMPT = """
CONTEXT:
{context}

QUERY:
{question}

Use the provided context to answer the user query. Only use the provided context to answer the query. If you do not know the answer, respond with "I don't know"
"""

rag_prompt = ChatPromptTemplate.from_template(RAG_PROMPT)

In [12]:
rag_prompt

ChatPromptTemplate(input_variables=['context', 'question'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template='\nCONTEXT:\n{context}\n\nQUERY:\n{question}\n\nUse the provided context to answer the user query. Only use the provided context to answer the query. If you do not know the answer, respond with "I don\'t know"\n'))])


## Experiment 2: Turn the actionSpace into a list of strings and perform RAG

In [13]:
vectorStore = MongoDBAtlasVectorSearch(collection=action_collection, embedding=embeddings, index_name='action-description-index', text_key='func_doc', embedding_key='embedded_description')
retriever = vectorStore.as_retriever(search_type = 'mmr', search_kwargs ={'fetch_k': 4, 'k': 2})
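
The `mmr` search type stands for maximal marginal relevance: fetch `fetch_k` candidates by similarity, then pick `k` of them one at a time, trading relevance to the query against redundancy with what has already been picked. Below is a minimal pure-Python sketch of that selection rule on toy vectors. It illustrates the idea and is not the library's implementation; the vectors and the `lambda_mult` weight of 0.5 are assumptions.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def mmr_select(query_vec, candidate_vecs, k, lambda_mult=0.5):
    """Return the indices of k candidates chosen by maximal marginal relevance."""
    selected = []
    remaining = list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in remaining:
            relevance = cosine(query_vec, candidate_vecs[i])
            # Similarity to the closest already-selected candidate (0 if none yet).
            redundancy = max(
                (cosine(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected

query = [1.0, 0.0]
candidates = [      # four fetched candidates, mirroring fetch_k=4
    [0.9, 0.1],     # 0: most similar to the query
    [0.9, 0.12],    # 1: near-duplicate of candidate 0
    [0.8, -0.6],    # 2: less similar, but different from candidate 0
    [0.0, 1.0],     # 3: unrelated
]
picked = mmr_select(query, candidates, k=2)
print(picked)
```

A plain top-2 similarity search would return candidates 0 and 1 here. MMR skips the near-duplicate in favour of candidate 2, which is the behaviour that matters when several action descriptions overlap heavily.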

#### Refer to Paul's notebook for the data preprocessing and collection creation

In [14]:
rag_chain = RetrievalQA.from_chain_type(chat, chain_type="stuff", retriever=retriever)

rag_chain.run('What function can I call to get information on candidates')

'To get information on candidates, you can call the `getTalents` function. This function retrieves talent/candidates information from the Loubby AI database based on specified parameters such as job type, professional level, country, experience years, degrees, skills, role, and state.'
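
The `stuff` chain type places all retrieved documents into one prompt alongside the question. Below is a minimal sketch of that idea in plain Python. The prompt wording and the `build_stuff_prompt` helper are illustrative assumptions rather than LangChain's actual template, and the two documents are abbreviated stand-ins for retrieved action descriptions.

```python
def build_stuff_prompt(question: str, documents: list[str]) -> str:
    """Concatenate every retrieved document into a single context block."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"CONTEXT:\n{context}\n\n"
        f"QUESTION:\n{question}"
    )

# Abbreviated stand-ins for the k=2 documents returned by the retriever.
retrieved = [
    "task: Talent search; function_name: getTalents",
    "task: Create Job; function_name: createJob",
]
stuffed = build_stuff_prompt(
    "What function can I call to get information on candidates", retrieved
)
print(stuffed)
```

Because everything goes into one prompt, this approach only suits a small number of short documents, which is why `k` is kept low.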

#### It works, so let's add conversational memory to it

In [15]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

In [16]:
memory = ConversationBufferMemory(
    return_messages=True, memory_key='chat_history'
)

In [17]:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import SystemMessagePromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate

memory.clear()

qa_system_template = """You are a bot for the Loubby AI product, and your name is Jabbari. Given context and a query, you are to output a JSON containing: 
1) a text response from you [Key name: text_response], (as if in conversation with the user) which may be a follow up question (if you need more clarity) or 
any other appropriate response; 2) The function name [Key name: function_name]; 3) The appropriate values of each parameter [Key name: params], also given in 
JSON format, if a parameter is not implied by the user, then do not include it; 4) front end endpoint [Key name: front_endpoint]; 5) back end point [Key name: 
back_endpoint]. If the query does not correlate to the context, then, output ONLY the response in item 1)"""
messages = [
    SystemMessagePromptTemplate.from_template(qa_system_template),
    HumanMessagePromptTemplate.from_template("context:\n{context} \n\nquery:\n{question}"),
]
qa_system_prompt = ChatPromptTemplate.from_messages(messages)
qa = ConversationalRetrievalChain.from_llm(
    chat,
    vectorStore.as_retriever(search_kwargs={"k": 2}),
    memory=memory,
    verbose=True,
    combine_docs_chain_kwargs={"prompt": qa_system_prompt},
)
                                            
question = "Siri, give me 5 data scientists"

qa({'question': question})



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
System: You are a bot for the Loubby AI product, and your name is Jabbari. Given context and a query, you are to output a JSON containing: 1) a text response from you [Key name: text_response], (as if in conversation with the user) which may be a follow up question (if you need more clarity) or any other appropriate response; 2) The function name [Key name: function_name]; 3) The appropriate values of each parameter [Key name: params], also given in JSON format, if a parameter is not implied by the user, then do not include it; 4) front end endpoint [Key name: front_endpoint]; 5) back end point [Key name: back_endpoint]. If the query does not correlate to the context, then, output ONLY the response in item 1)
Human: context:
{'task': 'Create Job', 'function_name': 'createJob', 'frontend_url': '/dashboard/jobs/joblisting', 'backend_url': '/employer/listi

{'question': 'Siri, give me 5 data scientists',
 'chat_history': [HumanMessage(content='Siri, give me 5 data scientists'),
  AIMessage(content='```json\n{\n  "text_response": "Just to clarify, are you looking for 5 data scientists available for job opportunities, or do you want to create a job listing for a data scientist position?",\n  "function_name": "",\n  "params": {},\n  "front_endpoint": "",\n  "back_endpoint": ""\n}\n```')],
 'answer': '```json\n{\n  "text_response": "Just to clarify, are you looking for 5 data scientists available for job opportunities, or do you want to create a job listing for a data scientist position?",\n  "function_name": "",\n  "params": {},\n  "front_endpoint": "",\n  "back_endpoint": ""\n}\n```'}

In [18]:
print(memory.chat_memory.messages[1].content)

```json
{
  "text_response": "Just to clarify, are you looking for 5 data scientists available for job opportunities, or do you want to create a job listing for a data scientist position?",
  "function_name": "",
  "params": {},
  "front_endpoint": "",
  "back_endpoint": ""
}
```


In [20]:
question = "Give me 14 front end developers in Nigeria"

qa({'question': question})



> Entering new LLMChain chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Human: Siri, give me 5 data scientists
Assistant: ```json
{
  "text_response": "Just to clarify, are you looking for 5 data scientists available for job opportunities, or do you want to create a job listing for a data scientist position?",
  "function_name": "",
  "params": {},
  "front_endpoint": "",
  "back_endpoint": ""
}
```
Human: Give me 14 front end developers in Nigeria
Assistant: ```json
{
  "text_response": "Sure, I can help you with that. Would you like to specify any particular skills, experience level, or professional level for these front-end developers?",
  "function_name": "getTalents",
  "params": {
    "jobType": {"fullTime": true, "partTime": true, "contract": true, "internship": false, "remote": true},
    "professionalLevel": 

{'question': 'Give me 14 front end developers in Nigeria',
 'chat_history': [HumanMessage(content='Siri, give me 5 data scientists'),
  AIMessage(content='```json\n{\n  "text_response": "Just to clarify, are you looking for 5 data scientists available for job opportunities, or do you want to create a job listing for a data scientist position?",\n  "function_name": "",\n  "params": {},\n  "front_endpoint": "",\n  "back_endpoint": ""\n}\n```'),
  HumanMessage(content='Give me 14 front end developers in Nigeria'),
  AIMessage(content='```json\n{\n  "text_response": "Sure, I can help you with that. Would you like to specify any particular skills, experience level, or professional level for these front-end developers?",\n  "function_name": "getTalents",\n  "params": {\n    "jobType": {"fullTime": true, "partTime": true, "contract": true, "internship": false, "remote": true},\n    "professionalLevel": {"entryLevel": true, "juniorLevel": true, "middleLevel": true, "seniorLevel": true, "direct
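
The verbose log above shows the extra step a conversational chain performs. The stored history and the new question are first merged into a prompt asking the model for a standalone question, which is then used for retrieval. Below is a minimal sketch of how such a prompt can be assembled. The instruction sentence is copied from the log, while the `Follow Up Input:` and `Standalone question:` labels, the helper name, and the shortened assistant turn are assumptions.

```python
CONDENSE_INSTRUCTION = (
    "Given the following conversation and a follow up question, rephrase the "
    "follow up question to be a standalone question, in its original language."
)

def build_condense_prompt(history: list[tuple[str, str]], follow_up: str) -> str:
    """Render (speaker, text) turns plus the new question into one prompt."""
    turns = "\n".join(f"{speaker}: {text}" for speaker, text in history)
    return (
        f"{CONDENSE_INSTRUCTION}\n\n"
        f"Chat History:\n{turns}\n"
        f"Follow Up Input: {follow_up}\n"
        "Standalone question:"
    )

history = [
    ("Human", "Siri, give me 5 data scientists"),
    ("Assistant", "(clarifying question, shortened for this sketch)"),
]
condense_prompt = build_condense_prompt(
    history, "Give me 14 front end developers in Nigeria"
)
print(condense_prompt)
```

One consequence worth keeping in mind: every stored turn, including the full JSON replies, is replayed in this prompt, so long conversations grow the token count quickly.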

In [21]:
print(memory.chat_memory.messages[3].content)

```json
{
  "text_response": "Sure, I can help you with that. Would you like to specify any particular skills, experience level, or professional level for these front-end developers?",
  "function_name": "getTalents",
  "params": {
    "jobType": {"fullTime": true, "partTime": true, "contract": true, "internship": false, "remote": true},
    "professionalLevel": {"entryLevel": true, "juniorLevel": true, "middleLevel": true, "seniorLevel": true, "directorLevel": false, "VPabove": false},
    "country": "Nigeria",
    "role": "Front-end Developer"
  },
  "front_endpoint": "https://https://app.loubby.ai/dashboard/talents",
  "back_endpoint": "https://api.loubby.ai/api/v1/user/info"
}
```
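
Before a reply like this is acted on, it is worth checking it against the action description. The model can return unknown parameter names or copy through data problems such as the doubled `https://` in `front_endpoint`. Below is a minimal sketch of such a check. `ALLOWED_PARAMS` follows the parameter descriptions in the explanation string (which use `skills`, although the argument list there says `skill`), and `check_call` is a hypothetical helper.

```python
ALLOWED_PARAMS = {
    "jobType", "professionalLevel", "country", "experienceYears",
    "degrees", "skills", "role", "state",
}

def check_call(call: dict) -> list[str]:
    """Return a list of human-readable problems found in a parsed reply."""
    problems = []
    if call.get("function_name") != "getTalents":
        problems.append(f"unexpected function: {call.get('function_name')!r}")
    # Parameter names the action description does not define.
    unknown = set(call.get("params", {})) - ALLOWED_PARAMS
    for name in sorted(unknown):
        problems.append(f"unknown parameter: {name}")
    # A URL should contain exactly one scheme separator.
    for key in ("front_endpoint", "back_endpoint"):
        url = call.get(key, "")
        if url.count("://") > 1:
            problems.append(f"malformed URL in {key}: {url}")
    return problems

# Parsed form of the reply shown above, with the params shortened.
reply = {
    "function_name": "getTalents",
    "params": {"country": "Nigeria", "role": "Front-end Developer"},
    "front_endpoint": "https://https://app.loubby.ai/dashboard/talents",
    "back_endpoint": "https://api.loubby.ai/api/v1/user/info",
}
print(check_call(reply))
```

Here the only problem reported is the doubled scheme, which the model copied from the stored action document, so the fix belongs in the collection rather than in the prompt.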


In [6]:
import json

data_list = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': 25},
    {'name': 'Charlie', 'age': 35}
]

output = []

for item in data_list:
    if isinstance(item, dict):
        output.append({
            'keys': list(item.keys()),
            'values': list(item.values())
        })

# Print the output in JSON format
print(json.dumps(output, indent=4))


[
    {
        "keys": [
            "name",
            "age"
        ],
        "values": [
            "Alice",
            30
        ]
    },
    {
        "keys": [
            "name",
            "age"
        ],
        "values": [
            "Bob",
            25
        ]
    },
    {
        "keys": [
            "name",
            "age"
        ],
        "values": [
            "Charlie",
            35
        ]
    }
]
