[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/integrations/openai/beyond_search_webinar/02_pinecone-code-demo.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/integrations/openai/beyond_search_webinar/02_pinecone-code-demo.ipynb)

In [1]:
import json

with open('data/mapping.json', 'r') as fp:
    mappings = json.load(fp)

In [2]:
import openai
from openai.embeddings_utils import get_embedding

openai.api_key = 'OPENAI_API_KEY'  # platform.openai.com/login/

We can now create embeddings with OpenAI's embedding models like so:

```python
q_embeddings = get_embedding(
    'how to use gradient tape in tensorflow',
    engine='text-search-curie-query-001'
)
```

We initialize our connection to Pinecone.

In [4]:
from pinecone import Pinecone

def load_index():
    # create a client; the API key is available at app.pinecone.io
    pc = Pinecone(api_key='PINECONE_API_KEY')

    index_name = 'beyond-search-openai'

    # fail early if the index has not been created yet
    if index_name not in pc.list_indexes().names():
        raise KeyError(f"Index '{index_name}' does not exist.")

    return pc.Index(index_name)

In [6]:
index = load_index()

Define a function that uses OpenAI to create a query embedding, then uses that embedding to retrieve the most relevant contexts from Pinecone. The retrieved contexts are joined into a single string, ready to be passed to OpenAI's generation step.

In [5]:
import openai
from openai.embeddings_utils import get_embedding

def create_context(question, index, max_len=3750, size="curie"):
    """
    Find most relevant context for a question via Pinecone search
    """
    q_embed = get_embedding(question, engine=f'text-search-{size}-query-001')
    res = index.query(vector=q_embed, top_k=5, include_metadata=True)
    

    cur_len = 0
    contexts = []

    for row in res['matches']:
        text = mappings[row['id']]
        cur_len += row['metadata']['n_tokens'] + 4
        if cur_len < max_len:
            contexts.append(text)
        else:
            cur_len -= row['metadata']['n_tokens'] + 4
            if max_len - cur_len < 200:
                break
    return "\n\n###\n\n".join(contexts)
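
The token-budget logic in `create_context` can be tried without calling OpenAI or Pinecone. The following is a minimal sketch, assuming each match carries an `n_tokens` metadata field as above; `pack_contexts`, `fake_matches`, and `fake_texts` are hypothetical names standing in for the query response and `mappings`.

```python
def pack_contexts(matches, texts, max_len=3750, separator_tokens=4, min_remaining=200):
    """Greedily collect texts for ranked matches while staying under max_len tokens."""
    cur_len = 0
    contexts = []
    for row in matches:
        cost = row["metadata"]["n_tokens"] + separator_tokens
        if cur_len + cost < max_len:
            cur_len += cost
            contexts.append(texts[row["id"]])
        elif max_len - cur_len < min_remaining:
            # little budget remains, so stop checking lower-ranked matches
            break
    return "\n\n###\n\n".join(contexts)


# Hypothetical data standing in for a Pinecone response and mapping.json.
fake_matches = [
    {"id": "a", "metadata": {"n_tokens": 3000}},
    {"id": "b", "metadata": {"n_tokens": 900}},  # would overflow, skipped
    {"id": "c", "metadata": {"n_tokens": 550}},  # still fits
    {"id": "d", "metadata": {"n_tokens": 300}},  # overflow with < 200 left, loop stops
]
fake_texts = {"a": "first", "b": "second", "c": "third", "d": "fourth"}

packed = pack_contexts(fake_matches, fake_texts)
print(packed)
```

A match that does not fit is skipped rather than ending the loop, so a shorter, lower-ranked match can still be included while at least 200 tokens of budget remain.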

Let's test context retrieval...

In [7]:
create_context("how do I use gradient tapes in tensorflow", index)



Now we can move on to answering a question. This step takes a query, encodes it, retrieves relevant contexts (as done above), and passes them to OpenAI's generation model in a prompt format that can be modified via the `instruction` parameter below.

In [8]:
def answer_question(
    index,
    fine_tuned_qa_model="text-davinci-002",
    question="Am I allowed to publish model outputs to Twitter, without a human review?",
    instruction="Answer the question based on the context below, and if the question can't be answered based on the context, say \"I don't know\"\n\nContext:\n{0}\n\n---\n\nQuestion: {1}\nAnswer:",
    max_len=3550,
    size="curie",
    debug=False,
    max_tokens=400,
    stop_sequence=None,
    domains=["huggingface", "tensorflow", "streamlit", "pytorch"],
):
    """
    Answer a question based on the most similar contexts retrieved from the Pinecone index
    """
    context = create_context(
        question,
        index,
        max_len=max_len,
        size=size,
    )
    if debug:
        print("Context:\n" + context)
        print("\n\n")
    try:
        # fine-tuned models require the model parameter, whereas other models require the engine parameter
        model_param = (
            {"model": fine_tuned_qa_model}
            if ":" in fine_tuned_qa_model
            and fine_tuned_qa_model.split(":")[1].startswith("ft")
            else {"engine": fine_tuned_qa_model}
        )
        #print(instruction.format(context, question))
        response = openai.Completion.create(
            prompt=instruction.format(context, question),
            temperature=0,
            max_tokens=max_tokens,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0,
            stop=stop_sequence,
            **model_param,
        )
        return response["choices"][0]["text"].strip()
    except Exception as e:
        print(e)
        return ""
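
The `model_param` expression inside `answer_question` can be read in isolation. This is a small sketch of the same check, assuming fine-tuned model names contain a `:` followed by a segment starting with `ft`; `pick_model_param` is a hypothetical helper and the fine-tuned name below is made up for illustration.

```python
def pick_model_param(model_name):
    """Return the keyword argument used to select the completion model."""
    parts = model_name.split(":")
    # treat the name as fine-tuned when the segment after the first ":" starts with "ft"
    is_fine_tuned = len(parts) > 1 and parts[1].startswith("ft")
    return {"model": model_name} if is_fine_tuned else {"engine": model_name}


base = pick_model_param("text-davinci-002")
tuned = pick_model_param("curie:ft-example-org-2022-01-01")  # hypothetical name

print(base)
print(tuned)
```

The resulting dictionary is unpacked into the completion call with `**model_param`, so exactly one of `model` or `engine` is passed.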

Let's initialize a few query/instruction formats...

In [10]:
instructions = {
    "conservative Q&A": "Answer the question based on the context below, and if the question can't be answered based on the context, say \"I don't know\"\n\nContext:\n{0}\n\n---\n\nQuestion: {1}\nAnswer:",
    "paragraph about a question": "Write a paragraph, addressing the question, and use the text below to obtain relevant information\n\nContext:\n{0}\n\n---\n\nQuestion: {1}\nParagraph long Answer:",
    "bullet point": "Write a bullet point list of possible answers, addressing the question, and use the text below to obtain relevant information\n\nContext:\n{0}\n\n---\n\nQuestion: {1}\nBullet point Answer:",
    "summarize problems given a topic": "Write a summary of the problems addressed by the questions below\n\n{0}\n\n---\n\n",
    "extract key libraries and tools": "Write a list of libraries and tools present in the context below\n\nContext:\n{0}\n\n---\n\n",
    "just instruction": "{1} given the common questions and answers below \n\n{0}\n\n---\n\n",
    "summarize": "Write an elaborate, paragraph long summary about \"{1}\" given the questions and answers from a public forum on this topic\n\n{0}\n\n---\n\nSummary:",
}
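
Each template is filled with `str.format`, where `{0}` receives the retrieved context and `{1}` the question. The sketch below uses placeholder strings instead of real retrieval results, and `context_only` is a shortened template written in the style of the context-only entries above, to show that a template without `{1}` simply ignores the question.

```python
conservative = (
    "Answer the question based on the context below, and if the question "
    "can't be answered based on the context, say \"I don't know\"\n\n"
    "Context:\n{0}\n\n---\n\nQuestion: {1}\nAnswer:"
)
# a context-only template: it has no {1} placeholder
context_only = "Write a summary of the problems addressed by the questions below\n\n{0}\n\n---\n\n"

context = "placeholder context"
question = "placeholder question"

prompt_a = conservative.format(context, question)
prompt_b = context_only.format(context, question)  # the unused second argument is ignored

print(prompt_a)
print(prompt_b)
```

For templates such as *"summarize problems given a topic"*, the `question` therefore only influences which contexts are retrieved, not the text of the prompt itself.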

By default, we use *"conservative Q&A"*, which returns `"I don't know"` when the answer cannot be found in the retrieved context.

In [9]:
answer_question(index, question="What are GPT-2's strengths and weaknesses?")

"I don't know."

Let's ask a few more questions with different instructions...

In [11]:
print(answer_question(index, question="OpenAI CLIP", 
                            instruction = instructions["summarize"], debug=False))

The question is about how to finetune the CLIPModel further on their own dataset. The person tried using the default data_collator with the Trainer, but it didn't work.


In [12]:
print(answer_question(index, question="OpenAI CLIP", 
                            instruction = instructions["summarize problems given a topic"], debug=False))

The questions above address the problems of using OpenAI's CLIP model for image search and style transfer, as well as how to load the model onto a GPU. Additionally, the question about training a CLIP-like model for German language raises the challenge of modifying existing models to add projection layers.


In [13]:
print(answer_question(index, question="embedding models, which embed images and text", 
                            instruction = instructions["extract key libraries and tools"], debug=False))

- Huggingface Transformers
- Bert
- MarkupLM
- HTML
- Kiela et all.
- UMAP
- HDBSCAN
- T5
- BERT-base
- GPT-2


In [14]:
print(answer_question(index, question="Compare and contrast Tensorflow and Pytorch", 
                            instruction = instructions["just instruction"], debug=False,
                            domains=[ "tensorflow", "pytorch"]))

Tensorflow and Pytorch are both major machine learning libraries. Tensorflow is maintained and released by Google while Pytorch is maintained and released by Facebook.

Tensorflow is more convenient in the industry (prototyping, deployment and scalability is easier) and PyTorch more handy in research (its more pythonic and it is easier to implement complex stuff).


In [15]:
print(answer_question(index, question="What are some of the ways to summarize text?", 
                            instruction = instructions["bullet point"], debug=False))

- One way to summarize text is to use extractive summarization, where you choose the top k sentences from the text.
- Another way is to use abstractive summarization, where you generate a summary of the text that is shorter than the original text.
- You can also combine the two methods, using extractive summarization to choose the sentences you want to include in the summary, and then using abstractive summarization to generate the summary itself.
- Finally, you can also use successive abstractive summarization, where you summarize the text in chunks, and then use those chunks to generate a summary of the desired length.


In [16]:
print(answer_question(index, question="What are some of the common problems of trying to run GPT-J 6B yourself?", 
                            instruction = instructions["paragraph about a question"], debug=False))


Some of the common problems of trying to run GPT-J 6B yourself include:
- Not being able to find the model on the Hugging Face Hub
- Getting a KeyError when trying to download the checkpoint
- Not being able to get the model working with the latest transformers version


In [17]:
print(answer_question(index, question="How can I convert tensorflow code into pytorch?", 
                            instruction = instructions["paragraph about a question"], debug=False))

There are a few ways to convert TensorFlow code into PyTorch. One way is to use the Open Neural Network Exchange (ONNX) format. ONNX is a format that allows models to be transferred between different frameworks. To convert a TensorFlow model to PyTorch using ONNX, you can use the onnx-tensorflow converter.

Another way to convert TensorFlow code is to use the PyTorch converter. The PyTorch converter is a tool that converts TensorFlow models into PyTorch models. The converter is still in beta, but it should be able to convert most TensorFlow models into PyTorch models.

Finally, you can also convert TensorFlow code into PyTorch code manually. This is usually not recommended, as it can be quite difficult to get the code to work correctly. However, if you are familiar with both frameworks, it may be possible to convert the code manually.


In [18]:
answer_question(index, question="What are the open source models released by OpenAI?", 
                            instruction = instructions["paragraph about a question"], debug=False)

'OpenAI is an artificial intelligence research laboratory consisting of the for-profit corporation OpenAI LP and its parent company, the non-profit OpenAI Inc. OpenAI is headquartered in San Francisco, California. OpenAI\'s goal is to "advance digital intelligence in the way that is most likely to benefit humanity as a whole".[1] Since its founding, OpenAI has worked on a number of projects involving machine learning and artificial intelligence.\n\nIn 2015, OpenAI released an open source artificial intelligence software called Universe.[2] Universe allows any computer program to be used as a potential environment for training artificial intelligence agents.\n\nIn 2016, OpenAI released an open source machine learning library called OpenAI Gym.[3] OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms.\n\nIn 2017, OpenAI released an open source artificial intelligence software called OpenAI Baselines.[4] OpenAI Baselines is a set of high-quality implementa

In [19]:
print(answer_question(index, question="How can I use embeddings to visualize my data?"))

I don't know.


Once you're finished with the index, delete it to save resources:

In [None]:
from pinecone import Pinecone

pc = Pinecone(api_key='PINECONE_API_KEY')  # app.pinecone.io
pc.delete_index('beyond-search-openai')

---