## ASSISTIVE SEARCH ON https://platform.openai.com/docs/

- The content scraped by the Apify actor is used for question answering **[please refer to Scraping_screenshots_Apify.ppt]**

- The page content is first split into smaller segments (chunks), which are then converted into embeddings with the Hugging Face embedding model **(BAAI/bge-base-en-v1.5)**.
- These embeddings are stored in the Pinecone DB together with their metadata. Pinecone's semantic search then retrieves the stored chunks most similar to a query.
- A prompt that combines the user's query with the retrieved matches is passed to the **Mistral7b model** (mistralai/Mistral-7B-Instruct-v0.1), an open-source LLM, which generates an answer to the query from that context.
- Question recommendation uses **FAISS**, a similarity search library, through langchain to fetch the questions in a separate file that most closely match the user's query.
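
The retrieval and prompt-building steps listed above can be sketched without any external service. This is a minimal illustration, not the notebook's implementation: the three-dimensional vectors, the `DOCS` list, and the helpers `cosine`, `top_k_matches` and `build_prompt` are made-up stand-ins for the real embeddings, the Pinecone query and the prompt template.

```python
import math

# Toy "embeddings": hypothetical 3-dimensional vectors standing in for the
# 768-dimensional vectors the real embedding model produces.
DOCS = [
    ("Rate limits restrict how often a client may call the API.", [0.9, 0.1, 0.0]),
    ("Whisper transcribes audio files into text.", [0.0, 0.2, 0.9]),
    ("Prompt engineering can improve GPT outputs.", [0.1, 0.9, 0.1]),
]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_matches(query_vector, docs, k):
    """Return the k document texts most similar to the query vector."""
    ranked = sorted(docs, key=lambda d: cosine(query_vector, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, contexts):
    """Combine the question and the retrieved chunks into one prompt string."""
    joined = "\n".join(contexts)
    return f"Question: {question}\nContext:\n{joined}"

query_vector = [0.8, 0.2, 0.1]  # pretend embedding of a rate-limit question
matches = top_k_matches(query_vector, DOCS, k=2)
prompt = build_prompt("What is a rate limit?", matches)
print(prompt)
```

In the notebook itself, the ranking is done by Pinecone (for answers) and FAISS (for recommended questions), and the prompt is sent to the LLM instead of being printed.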

In [74]:
from langchain.embeddings import HuggingFaceInferenceAPIEmbeddings
from langchain.llms import HuggingFaceHub
from transformers import AutoTokenizer, AutoModel
import torch
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
from langchain.vectorstores import Pinecone
import os
import dotenv
import pinecone
import pandas as pd
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SequentialChain
from langchain.output_parsers import ResponseSchema
from langchain.vectorstores import FAISS
from langchain.document_loaders.csv_loader import CSVLoader


import warnings
warnings.filterwarnings("ignore")

dotenv.load_dotenv()

True

In [None]:
#Scraped content obtained via APIFY actor
openai_docs_data = pd.read_excel("./data/openai_docs.xlsx")

# Remove duplicates from the 'text' column
openai_docs_final = openai_docs_data.drop_duplicates(subset='text')

# Display the DataFrame without duplicates
openai_docs_final.info()
openai_docs_final.to_excel('./data/openai_docs_final.xlsx')

#### Huggingface embedding model "BAAI/bge-base-en-v1.5" is used to convert text into embeddings

In [3]:
os.environ["HUGGINGFACEHUB_API_TOKEN"] = os.getenv("HF_API")


In [10]:
# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-base-en-v1.5')
model = AutoModel.from_pretrained('BAAI/bge-base-en-v1.5')
model.eval()


BertModel(
  (embeddings): BertEmbeddings(
    (word_embeddings): Embedding(30522, 768, padding_idx=0)
    (position_embeddings): Embedding(512, 768)
    (token_type_embeddings): Embedding(2, 768)
    (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (encoder): BertEncoder(
    (layer): ModuleList(
      (0-11): 12 x BertLayer(
        (attention): BertAttention(
          (self): BertSelfAttention(
            (query): Linear(in_features=768, out_features=768, bias=True)
            (key): Linear(in_features=768, out_features=768, bias=True)
            (value): Linear(in_features=768, out_features=768, bias=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (output): BertSelfOutput(
            (dense): Linear(in_features=768, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
  

In [116]:
def get_embedding(text):
    # Tokenize sentences
    encoded_input = tokenizer(text, padding=True, truncation=True, return_tensors='pt')
    
    # Compute token embeddings
    with torch.no_grad():
        model_output = model(**encoded_input)
        # Perform pooling. In this case, CLS pooling (hidden state of the first token).
        sentence_embeddings = model_output[0][:, 0]
    # L2-normalize, then return a flat list of floats for the single input text
    text_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)[0].tolist()
    return text_embeddings
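
The normalization step matters because, once every embedding has unit length, the dot product of two embeddings equals their cosine similarity. The sketch below shows this with plain Python lists; `l2_normalize` and `dot` are illustrative helpers, not part of the notebook, and the two-dimensional vectors are made up.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

a = [3.0, 4.0]
b = [4.0, 3.0]

a_unit = l2_normalize(a)
b_unit = l2_normalize(b)

# Cosine similarity computed the long way, from the raw vectors
cosine = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Both numbers agree up to floating-point rounding
print(dot(a_unit, b_unit), cosine)
```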
    



In [120]:
# All pages of the OpenAI documentation were scraped with Apify and saved in an Excel spreadsheet. Each record contains the page's text and relevant metadata.
openai_docs = pd.read_excel("./data/openai_docs_final.xlsx")
openai_docs.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32 entries, 0 to 31
Data columns (total 15 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   crawl/depth            32 non-null     int64  
 1   crawl/httpStatusCode   32 non-null     int64  
 2   crawl/loadedTime       32 non-null     object 
 3   crawl/loadedUrl        32 non-null     object 
 4   crawl/referrerUrl      32 non-null     object 
 5   markdown               32 non-null     object 
 6   metadata/author        0 non-null      float64
 7   metadata/canonicalUrl  32 non-null     object 
 8   metadata/description   32 non-null     object 
 9   metadata/keywords      0 non-null      float64
 10  metadata/languageCode  32 non-null     object 
 11  metadata/title         32 non-null     object 
 12  screenshotUrl          0 non-null      float64
 13  text                   32 non-null     object 
 14  url                    32 non-null     object 
dtypes: float

#### CHUNKING AND CONVERTING THE CHUNK INTO EMBEDDINGS

In [121]:
text_embedding = []
metadata_vectordb = []

#chunking strategy
text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1024,  # Set your desired chunk size
        chunk_overlap=50  # Set your desired overlap size
    )

for i in range(len(openai_docs)):
    # Split the text content into chunks using the text splitter
    chunks = text_splitter.split_text(openai_docs.loc[i,"text"])
    for j, chunk in enumerate(chunks):
        #creating metadata to store it in the vector database
        metadata = {"Chunk_Text": chunk, "URL":openai_docs.loc[i,"url"], "Title":openai_docs.loc[i,"metadata/title"]}
        metadata_vectordb.append(metadata)
        print(j,chunk)        
        #get the embeddings for the chunk
        text_embedding.append(get_embedding(chunk))
        print(i,"--"*40)
    

0 Introduction
The OpenAI API can be applied to virtually any task that requires understanding or generating natural language and code. The OpenAI API can also be used to generate and edit images or convert speech into text. We offer a range of models with different capabilities and price points, as well as the ability to fine-tune custom models.
Resources
Experiment in the playground
Read the API reference
Visit the help center
View the current API status
Check out the OpenAI Developer Forum
Learn about our usage policies
At OpenAI, protecting user data is fundamental to our mission. We do not train our models on inputs and outputs through our API. Learn more on our API data privacy page.
Key concepts
GPTs
0 --------------------------------------------------------------------------------
1 Key concepts
GPTs
OpenAI's GPT (generative pre-trained transformer) models have been trained to understand natural language and code. GPTs provide text outputs in response to their inputs. The input

In [122]:
len(text_embedding)

455
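
Each of these embeddings corresponds to one chunk produced by the splitter. The idea of a chunk size combined with an overlap can be sketched as a sliding character window. This is a simplified stand-in, not the algorithm of `RecursiveCharacterTextSplitter`, which also prefers to split on separators such as paragraph and line breaks; `chunk_text` is a hypothetical helper, and the sample string and sizes are chosen only to keep the output small.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
for i, chunk in enumerate(chunks):
    print(i, chunk)
```

The last three characters of each chunk reappear at the start of the next one, so a sentence cut at a chunk boundary is still partly visible in the neighbouring chunk.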

In [124]:
# Convert to a DataFrame
search_data_DB_format = pd.DataFrame.from_dict(metadata_vectordb)
search_data_DB_format["Text Embeddings"] = text_embedding
# Print the DataFrame
print(search_data_DB_format.info())

search_data_DB_format.to_csv("./data/data_stored_pinecone_OpenAI_1024.csv")

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 455 entries, 0 to 454
Data columns (total 4 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   Chunk_Text       455 non-null    object
 1   URL              455 non-null    object
 2   Title            455 non-null    object
 3   Text Embeddings  455 non-null    object
dtypes: object(4)
memory usage: 14.3+ KB
None
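
One thing to keep in mind when this CSV is read back: a list stored in a CSV cell comes back as a string and has to be parsed before it can be used as a vector. The sketch below illustrates this with the standard library only; it does not use pandas, and the single row and its two-dimensional embedding are invented for the example.

```python
import csv
import io
import json

# One made-up record with a tiny embedding
rows = [{"Chunk_Text": "Key concepts", "Text Embeddings": [0.6, 0.8]}]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Chunk_Text", "Text Embeddings"])
writer.writeheader()
for row in rows:
    writer.writerow(row)  # the list is written as its string form

buffer.seek(0)
loaded = list(csv.DictReader(buffer))
raw = loaded[0]["Text Embeddings"]   # a string, not a list
embedding = json.loads(raw)          # parse it back into a list of floats
print(type(raw).__name__, type(embedding).__name__, embedding)
```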


#### STORING THE DATA IN PINECONE DATABASE

In [130]:
# Initialize Pinecone; the API key and environment are read from environment variables
pinecone.init(api_key=os.getenv("PINECONE_API_KEY"), environment=os.getenv("PINECONE_ENV"))

In [113]:
# Create the index if it does not exist yet (replace 'searchopenaidocs' with your preferred index name)
index_name = 'searchopenaidocs'
if index_name not in pinecone.list_indexes():
    # dimension must match the embedding size of BAAI/bge-base-en-v1.5 (768)
    pinecone.create_index(index_name, dimension=768)

In [131]:
index = pinecone.Index("searchopenaidocs")

In [127]:
# Inserting the embeddings along with their metadata into the vector DB
start_idx = 0
for i in range(len(text_embedding)):    
    meta = {"text":str(metadata_vectordb[i])}
    index.upsert(vectors= [("searchdocs_OpenAI{}".format(start_idx), text_embedding[i], meta)])
    start_idx+=1
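
The loop above sends one vector per `upsert` call. Since `upsert` takes a list of vectors, the records could instead be grouped into batches to reduce the number of requests. The sketch below shows only the grouping logic under that assumption: `batched` is a hypothetical helper, the batch size of 3 is arbitrary, the vectors are fake, and no Pinecone call is made.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Stand-in data: 7 fake two-dimensional vectors with metadata,
# in the same (id, values, metadata) tuple shape the notebook uses.
vectors = [
    (f"searchdocs_OpenAI{i}", [float(i), float(i) + 0.5], {"text": f"chunk {i}"})
    for i in range(7)
]

batches = list(batched(vectors, batch_size=3))
for batch in batches:
    # In the notebook this is where index.upsert(vectors=batch) would go.
    print(len(batch), [vec_id for vec_id, _, _ in batch])
```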

In [128]:
#function to search similar matchings in pinecone db
def semantic_matching(text, topk): 
    embed = get_embedding(text)
    response = index.query(vector=embed, top_k = topk, include_metadata=True)
    return response

# function to extract the response using llm chain
def get_llm_response(llm_input):
    # {llm_input} is a template variable that LangChain fills in at run time
    template = """ Extract the answer for the question from the given context: {llm_input} 
    """
    prompt = PromptTemplate(template=template, input_variables=["llm_input"])

    repo_id = "mistralai/Mistral-7B-Instruct-v0.1" # open-source LLM
    llm = HuggingFaceHub(
        repo_id=repo_id, model_kwargs={"temperature": 0.1, "max_new_tokens":2000})
    
    llm_chain = LLMChain(prompt=prompt, llm=llm)
    overall_chain = SequentialChain(chains=[llm_chain], input_variables= ["llm_input"], verbose=True)
    # Pass the actual prompt text, not the literal string "llm_input"
    return overall_chain.run(llm_input)

#function to get the recommended questions
def get_recommended_questions(query):
    #loading the file which has the set of questions
    loader = CSVLoader(file_path="./data/Questions_list.csv")
    data = loader.load()
       
    #Defining hugging face embeddings via langchain 
    embeddings = HuggingFaceInferenceAPIEmbeddings(
        api_key=os.getenv("HF_API"), model_name="BAAI/bge-base-en-v1.5"
    )
    
    #Indexing and storing the data into FAISS
    db = FAISS.from_documents(data, embeddings)
    
    #saving the index locally so that it can be reused later
    db.save_local("faiss_index_que_suggestion")
    
    #Finding the similar match for the query
    results = db.similarity_search(query, k=5)
    print("--"*70)
    print("\nRecommended Questions:\n")
    for doc in results:
        print(f" {doc.page_content}")
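
Because the upsert step stores each metadata record as `str(dict)` under the `text` key, every match returned by `semantic_matching` carries a stringified dictionary. A minimal sketch of turning those matches into the LLM input is shown below. `build_llm_input` is a hypothetical helper, `fake_matches` is a hand-written stand-in for `response['matches']`, and the example URLs are placeholders.

```python
import ast

def build_llm_input(query, matches):
    """Join the chunk text of each match and pair it with the question."""
    chunk_texts = []
    for match in matches:
        # The notebook stores str(dict) under "text"; literal_eval turns that
        # string back into a dict without executing arbitrary code.
        record = ast.literal_eval(match["metadata"]["text"])
        chunk_texts.append(record["Chunk_Text"])
    content = "\n".join(chunk_texts)
    return f"Question: {query}\nContent: {content}"

# Hand-written stand-in for response["matches"]; real matches come from Pinecone.
fake_matches = [
    {"metadata": {"text": str({"Chunk_Text": "Rate limits cap request volume.",
                               "URL": "https://example.com/a", "Title": "Rate limits"})}},
    {"metadata": {"text": str({"Chunk_Text": "Limits are measured per minute.",
                               "URL": "https://example.com/b", "Title": "Rate limits"})}},
]

llm_input = build_llm_input("What is a rate limit?", fake_matches)
print(llm_input)
```

Passing only `Chunk_Text` keeps URLs and titles out of the prompt; whether that helps answer quality is not something the notebook measures.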

#### RETRIEVING ANSWER FOR THE GIVEN QUERY & QUESTION RECOMMENDATION

In [132]:
query = "How can I improve the performance of GPTs in my tasks?"
response = semantic_matching(query, 5)
content = 'content: '
for match in response['matches']:
    content+=match['metadata']['text']

#combining semantic matching results and query
input_query = f"Question: {query}\n Content: {content}".replace("{",'').replace("}",'')

print(get_llm_response(input_query))

get_recommended_questions(query)




> Entering new SequentialChain chain...

> Finished chain.

Answer: Fine-tuning GPT models can make them better for specific applications, but it requires a careful investment of time and effort. The key reasons for using prompt engineering, prompt chaining (breaking complex tasks into multiple prompts), and function calling are that there are many tasks at which our models may not initially appear to perform well, but results can be improved with the right prompts, iterating over prompts and other tactics has a much faster feedback loop than iterating with fine-tuning, and initial prompt engineering work is not wasted - we typically see best results when using a good prompt in the fine-tuning data (or combining prompt chaining / tool use with fine-tuning).
--------------------------------------------------------------------------------------------------------------------------------------------

Recommended Questions:

 Questions: What are the best practices for maxi

In [133]:
query = "What is a rate limit in the context of APIs?"
response = semantic_matching(query, 5)
content = 'content: '
for match in response['matches']:
    content+=match['metadata']['text']

#combining semantic matching results and query
input_query = f"Question: {query}\n Content: {content}".replace("{",'').replace("}",'')

print(get_llm_response(input_query))

get_recommended_questions(query)



> Entering new SequentialChain chain...

> Finished chain.

Answer: A rate limit is a restriction that an API imposes on the number of times a user or client can access the server within a specified period of time. It helps protect against abuse or misuse of the API, ensures that everyone has fair access to the API, and helps prevent disruptions in service.
--------------------------------------------------------------------------------------------------------------------------------------------

Recommended Questions:

 Questions: Why do APIs have rate limits, and what is their primary purpose?
 Questions: How do rate limits help protect against abuse and misuse of APIs?
 Questions: How can I effectively batch requests to make the most of API rate limits?
 Questions: What are some error mitigation strategies for handling rate limits in API requests?
 Questions: What happens if I exceed my API rate limit, and how can I avoid rate limit errors?


In [134]:
query = "How can I disambiguate complex tasks for GPTs?"
response = semantic_matching(query, 5)
content = 'content: '
for match in response['matches']:
    content+=match['metadata']['text']

#combining semantic matching results and query
input_query = f"Question: {query}\n Content: {content}".replace("{",'').replace("}",'')
print(get_llm_response(input_query))

get_recommended_questions(query)



> Entering new SequentialChain chain...

> Finished chain.

Answer: To disambiguate complex tasks for GPTs, you can specify the steps required to complete a task.
--------------------------------------------------------------------------------------------------------------------------------------------

Recommended Questions:

 Questions: What are the best practices for maximizing the effectiveness of GPTs in different applications?
 Questions: What are some practical ways to automate the evaluation of GPT model outputs?
 Questions: How can I fine-tune my usage of GPT-4 to get superior results in my projects?
 Questions: What strategies and tactics can be used to improve the results obtained from GPT models?
 Questions: How can automated evaluations with objective criteria be employed in GPT system optimization?


In [135]:
query = "What are ChatGPT plugins, and how do they enable interactions with third-party applications?"
response = semantic_matching(query, 5)
content = 'content: '
for match in response['matches']:
    content+=match['metadata']['text']

#combining semantic matching results and query
input_query = f"Question: {query}\n Content: {content}".replace("{",'').replace("}",'')
print(get_llm_response(input_query))

get_recommended_questions(query)



> Entering new SequentialChain chain...

> Finished chain.

Answer: ChatGPT plugins are third-party applications that connect ChatGPT to APIs defined by developers, enabling ChatGPT to interact with these APIs and perform a wide range of actions. These plugins can retrieve real-time information, knowledge-base information, and assist users with actions. They are in a beta and developer access may not be accessible to everyone. To connect a plugin via the ChatGPT UI, you can either run it locally in a development environment or on a remote server. If you have a local version of your API running, you can point the plugin interface to your localhost server. To register your plugin in the ChatGPT UI, you must manually activate it in the ChatGPT UI.
--------------------------------------------------------------------------------------------------------------------------------------------

Recommended Questions:

 Questions: How can developers create plugins for ChatGPT to 

In [136]:
query = "How can I transcribe audio using the transcriptions endpoint of the Whisper API in Python?"
response = semantic_matching(query, 5)
content = 'content: '
for match in response['matches']:
    content+=match['metadata']['text']

#combining semantic matching results and query
input_query = f"Question: {query}\n Content: {content}".replace("{",'').replace("}",'')

print(get_llm_response(input_query))

get_recommended_questions(query)



> Entering new SequentialChain chain...

> Finished chain.

Answer: To transcribe audio using the transcriptions endpoint of the Whisper API in Python, you can use the following code:

    import openai
    from docx import Document

    def transcribe_audio(audio_file_path):
        with open(audio_file_path, 'rb') as audio_file:
            transcription = openai.Audio.transcribe("whisper-1", audio_file)
        return transcription['text']

    audio_file_path = "/path/to/file/audio.mp3"
    transcription = transcribe_audio(audio_file_path)
    print(transcription)
--------------------------------------------------------------------------------------------------------------------------------------------

Recommended Questions:

 Questions: Are there specific Python libraries required for transcribing audio with the Whisper API?
 Questions: What audio file formats are supported for transcription with the Whisper API in Python?
 Questions: How can I improve the accur

In [137]:
query ="Can you explain the key components included in the plugin manifest file?"
response = semantic_matching(query, 5)
content = 'content: '
for match in response['matches']:
    content+=match['metadata']['text']
#combining semantic matching results and query
input_query = f"Question: {query}\n Content: {content}".replace("{",'').replace("}",'')

print(get_llm_response(input_query))

get_recommended_questions(query)



> Entering new SequentialChain chain...

> Finished chain.

Answer: [Recorded output omitted: this run repeated the previous cell's Whisper transcription answer verbatim instead of answering the plugin manifest question. Re-run the cell to regenerate it.]
--------------------------------------------------------------------------------------------------------------------------------------------

Recommended Questions:

 Questions: What is the purpose of the plugin manifest file, and where should it be hosted?
 Questions: What is the recommended approach for developers when exposing endpoints in a plugin?
 Questions: How do users activate and use plugins wi