<a href="https://colab.research.google.com/github/singh-priyanshi/py/blob/main/OpenAI_API_Investigation_(Summer_2023).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## References


API Key:

  + https://platform.openai.com/account/api-keys

Tokens / Limits:

  + https://platform.openai.com/docs/introduction/key-concepts

> As a rough rule of thumb, 1 token is approximately 4 characters or 0.75 words for English text. One limitation to keep in mind is that your text prompt and generated completion combined must be no more than the model's maximum context length (for most models this is 2048 tokens, or about 1500 words).
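
The rule of thumb quoted above can be turned into a quick estimate. The sketch below is only an approximation built from the quoted ratios (about 4 characters or 0.75 words per token). `estimate_tokens` and `fits_context` are hypothetical helper names, the 2048 default is taken from the quote rather than from any particular model, and exact counts require the model's actual tokenizer.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text, using the quoted rule of thumb."""
    by_chars = len(text) / 4             # ~4 characters per token
    by_words = len(text.split()) / 0.75  # ~0.75 words per token
    # Take the larger of the two estimates and round up, to stay on the cautious side.
    return math.ceil(max(by_chars, by_words))


def fits_context(prompt: str, max_completion_tokens: int, context_length: int = 2048) -> bool:
    """Check whether the prompt plus the requested completion stays within the context length."""
    return estimate_tokens(prompt) + max_completion_tokens <= context_length


prompt = "Today is Wednesday, so tomorrow is"
print(estimate_tokens(prompt))
print(fits_context(prompt, max_completion_tokens=100))
```

Because the estimate is deliberately rough, it is best used as an early warning rather than a hard guarantee.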


Models:

  + https://platform.openai.com/docs/models/overview


Python package docs:

  + https://github.com/openai/openai-python

## Setup

In [None]:
#!pip list

In [None]:
%%capture

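# The cells below use the pre-1.0 interface (openai.Model, openai.ChatCompletion, ...),
# which was removed in openai 1.0, so pin the package below that version.
!pip install "openai<1"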

## Authorization

In [None]:
from getpass import getpass

OPENAI_API_KEY = getpass("Please provide your OpenAI API Key:")

Please provide your OpenAI API Key:··········


In [None]:
import openai

openai.api_key = OPENAI_API_KEY
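
When the notebook is re-run often, typing the key each time gets tedious. One possible alternative, sketched below, is to look for the key in an environment variable first and fall back to the `getpass` prompt only when it is missing. The helper name `load_api_key` is hypothetical, and the sketch assumes the key is stored under the name `OPENAI_API_KEY`.

```python
import os
from getpass import getpass


def load_api_key(env=None, prompt_fn=getpass):
    """Return the API key from the environment, prompting only if it is missing."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()  # assumed variable name
    if not key:
        key = prompt_fn("Please provide your OpenAI API Key:").strip()
    return key


# Demonstration with a stand-in environment and a placeholder key (not a real one).
print(load_api_key(env={"OPENAI_API_KEY": "sk-example"}))
```

In the notebook this could be used as `openai.api_key = load_api_key()`.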

## Investigation / Exploration of Features

### List of Available Models

In [None]:
from openai import Model

models = Model.list()
print(type(models))

# print(models.api_version)
#dir(models)

<class 'openai.openai_object.OpenAIObject'>


In [None]:
models_info = []

for model in sorted(models.data, key=lambda m: m.id):
    #print(model.id, "...", model.owned_by, "...", model.parent, "...", model.object)
    model_info = model.to_dict()
    model_info.pop("permission", None) # nested list; pop() avoids a KeyError if the field is absent
    #print(model_info)
    models_info.append(model_info)


from pandas import DataFrame

models_df = DataFrame(models_info)
models_df.to_csv("openai_models.csv")
models_df.sort_values(by=["id"])

Unnamed: 0,id,object,created,owned_by,root,parent
0,ada,model,1649357491,openai,ada,
1,ada-code-search-code,model,1651172505,openai-dev,ada-code-search-code,
2,ada-code-search-text,model,1651172510,openai-dev,ada-code-search-text,
3,ada-search-document,model,1651172507,openai-dev,ada-search-document,
4,ada-search-query,model,1651172505,openai-dev,ada-search-query,
5,ada-similarity,model,1651172507,openai-dev,ada-similarity,
6,babbage,model,1649358449,openai,babbage,
7,babbage-code-search-code,model,1651172509,openai-dev,babbage-code-search-code,
8,babbage-code-search-text,model,1651172509,openai-dev,babbage-code-search-text,
9,babbage-search-document,model,1651172510,openai-dev,babbage-search-document,


In [None]:
models_df.groupby("owned_by")["id"].count()

owned_by
openai             20
openai-dev         32
openai-internal     4
Name: id, dtype: int64

In [None]:
#models_df.groupby("root")["id"].count()

In [None]:
#print(models_df["owned_by"].unique())

### Text Completions

https://platform.openai.com/docs/guides/completion

> The completions API endpoint received its final update in July 2023.

Because this endpoint is no longer being updated, consider using Chat Completions instead (see the Chat Completions section below).
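
Moving a prompt from the Completions endpoint to Chat Completions mostly means wrapping the prompt string in a list of role-tagged messages. A minimal sketch, assuming a hypothetical helper named `prompt_to_messages` (it is not part of the `openai` package):

```python
def prompt_to_messages(prompt, system=None):
    """Wrap a plain completion prompt in the chat `messages` structure."""
    messages = []
    if system:
        # Optional instruction that frames how the assistant should behave.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return messages


print(prompt_to_messages("Today is Wednesday, so tomorrow is"))
```

The returned list could then be passed as the `messages` argument of `ChatCompletion.create`.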


In [None]:
from openai import Completion

prompt = "Today is Wednesday, so tomorrow is"
#prompt = "Write a tagline for an icecream business"
completion = Completion.create(model="text-davinci-003", prompt=prompt)

print(completion.choices[0].text)

 Thursday

Thursday is the day after Wednesday.


### Chat Completions

https://platform.openai.com/docs/guides/gpt/chat-completions-vs-completions

> The chat completions API is the interface to our most capable model (gpt-4), and our most cost effective model (gpt-3.5-turbo). For reference, gpt-3.5-turbo performs at a similar capability level to text-davinci-003 but at 10% the price per token!
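
To make the "10% the price per token" claim concrete, the arithmetic can be sketched as follows. The base price used here is a made-up figure chosen for illustration, not a quoted OpenAI price, and `cost_usd` is a hypothetical helper.

```python
def cost_usd(tokens, price_per_1k):
    """Cost in dollars for a number of tokens at a given price per 1,000 tokens."""
    return tokens / 1000 * price_per_1k


BASE_PRICE_PER_1K = 0.02                      # hypothetical price, for illustration only
CHAT_PRICE_PER_1K = BASE_PRICE_PER_1K * 0.10  # "10% the price per token"

tokens = 1_000_000
print(f"base model: ${cost_usd(tokens, BASE_PRICE_PER_1K):.2f}")
print(f"chat model: ${cost_usd(tokens, CHAT_PRICE_PER_1K):.2f}")
```

Whatever the real prices are, the ratio is what matters: the same token volume costs one tenth as much.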


In [None]:
from openai import ChatCompletion

#chat_text = "Today is Wednesday, so tomorrow is"

#chat_text = "You are familiar with the Myers-Briggs personality test and its results. What kind of activities would you recommend to help an INTJ and an ENFP develop a closer relationship and build trust?"
chat_text = "Write a tagline for an icecream business"

chat_completion = ChatCompletion.create(model="gpt-3.5-turbo", messages=[
    {"role": "user", "content": chat_text}
])

# print the chat completion
print(chat_completion.choices[0].message.content)

"Indulge in frozen delights, like no other!"


### Image Generation

https://platform.openai.com/docs/guides/images/usage


In [None]:
from openai import Image as ImageGen
from IPython.display import display, Image

#prompt = "a white siamese cat"
#prompt = "a clown juggling bananas"
prompt = "cartoon panda on bicycle"
response = ImageGen.create(prompt=prompt, n=1, size="1024x1024")

image_url = response['data'][0]['url']

print(prompt)
display(Image(url=image_url, height=500))

cartoon panda on bicycle


### Embeddings

https://platform.openai.com/docs/guides/embeddings/what-are-embeddings

An embedding represents a piece of text as a vector (i.e. a list of numbers), positioned so that texts with similar meaning end up close together in vector space. These vectors can be used as input features to train machine learning models.
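
Because an embedding is just a list of numbers, two texts can also be compared numerically. Cosine similarity is one common way to do that. It is not something this notebook computes, so the sketch below is purely illustrative: the three-dimensional vectors are made up, and `cosine_similarity` is a hypothetical helper written in plain Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors invented for illustration; real embeddings are much longer.
apples = [0.9, 0.1, 0.0]
juice = [0.8, 0.3, 0.1]
bananas = [0.1, 0.9, 0.2]

print(cosine_similarity(apples, juice) > cosine_similarity(apples, bananas))
```

Values close to 1 indicate vectors pointing in nearly the same direction; values near 0 indicate unrelated directions.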

In [None]:
from openai import Embedding

model = "text-embedding-ada-002"
texts = [
    "I like apples, but bananas are gross.",
    "This is a tweet about bananas",
    "Drink apple juice!"
]
result = Embedding.create(input=texts, model=model)
print(type(result))
result

In [None]:
result.keys()

dict_keys(['object', 'data', 'model', 'usage'])

In [None]:
result["model"]

'text-embedding-ada-002-v2'

In [None]:
len(result["data"])

print(result["data"][0].keys())

print(len(result["data"][0]["embedding"]))

dict_keys(['object', 'index', 'embedding'])
1536
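
Each entry in `data` carries an `index` and an `embedding`, so the vectors can be pulled out in input order. The sketch below uses a small mock dictionary shaped like the keys printed above instead of a real API response. `embeddings_by_index` is a hypothetical helper, and the two-dimensional vectors are placeholders.

```python
def embeddings_by_index(result):
    """Return the embedding vectors ordered by their `index` field."""
    items = sorted(result["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Mock response with the same keys as the real one; values are placeholders.
mock_result = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.0, 1.0]},
        {"object": "embedding", "index": 0, "embedding": [1.0, 0.0]},
    ],
    "model": "mock-model",
    "usage": {},
}

vectors = embeddings_by_index(mock_result)
print(len(vectors), len(vectors[0]))
```

With the real `result`, the same helper would be expected to yield one vector per input text, each of the length printed above.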
