# Investigating the openai API's internal model list

The documentation of the openai Python library could be better, so let's explore which models are available through the API and what data about them we can retrieve programmatically. First we load the library and set our API key from a `.env` file in the project folder.

To use this notebook, copy the `sample.env` file to `.env` and paste your valid API key into it. Get your API key [here](https://platform.openai.com/account/api-keys). When uploading code to the Internet, make sure not to upload `.env`! It is listed in `.gitignore` to prevent accidental sharing with git.

In [21]:
# Load libraries and set API key
import openai
from dotenv import load_dotenv
import os

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")
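
`os.getenv` returns `None` when the variable is missing, so a forgotten `.env` file only shows up later as an authentication error. A minimal sketch of an early check follows; the helper name `get_api_key` and the error message are illustrative assumptions, not part of the openai library.

```python
import os


def get_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the key stored under `name` in `env`, or raise if it is missing or empty.

    `get_api_key` is a hypothetical helper written for this notebook.
    """
    key = env.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; copy sample.env to .env and add your key.")
    return key


# Demonstrated with an explicit mapping so the example runs without a real key
print(get_api_key({"OPENAI_API_KEY": "sk-example"}))
```

In the notebook itself, this would be called as `openai.api_key = get_api_key()` after `load_dotenv()`.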

# Available models

We begin by calling the `list` method of the `Model` class from the openai library. This should give us a list of models we can call through the API.

In [22]:
# Get models object from the openai API, and detect object type
models = openai.Model.list()
print(type(models))

<class 'openai.openai_object.OpenAIObject'>


Investigating the returned object, we find that it's a special OpenAI object type. The object behaves like a Python dictionary, though with a few additional methods for manipulating it.

In [23]:
print(models.keys())

dict_keys(['object', 'data'])


The object has two keys, 'object' and 'data'. The 'object' value tells us what object type is in the 'data' value.

In [24]:
# Access 'object' value and verify that it's the same as the type of the 'data' value
print(models['object'])
assert models['object'] in str(type(models['data']))

list


Each list item corresponds to a model. Checking the length of the list reveals that there are many available models.

In [25]:
# Check length of models list
print(len(models['data']))

66


To see what data is available for each model, we can examine one of the list items.

In [26]:
models['data'][0]

<Model model id=whisper-1 at 0x28262b0d010> JSON: {
  "created": 1677532384,
  "id": "whisper-1",
  "object": "model",
  "owned_by": "openai-internal",
  "parent": null,
  "permission": [
    {
      "allow_create_engine": false,
      "allow_fine_tuning": false,
      "allow_logprobs": true,
      "allow_sampling": true,
      "allow_search_indices": false,
      "allow_view": true,
      "created": 1683912666,
      "group": null,
      "id": "modelperm-KlsZlfft3Gma8pI6A8rTnyjs",
      "is_blocking": false,
      "object": "model_permission",
      "organization": "*"
    }
  ],
  "root": "whisper-1"
}
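
The `created` values are Unix timestamps in seconds. As a small sketch, they can be converted to a readable UTC date with the standard library; the helper name `created_to_utc` is an assumption made for this example.

```python
from datetime import datetime, timezone


def created_to_utc(created):
    """Format a Unix timestamp (seconds) as a UTC date-time string."""
    return datetime.fromtimestamp(created, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


# The "created" value of whisper-1 from the output above
print(created_to_utc(1677532384))  # 2023-02-27 21:13:04
```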

The "root" key gives us the model name. For this model it matches "id", though the two need not be identical for every model. For a list of model names, we can iterate through the list and print the "root" value for each item.

In [27]:
# Print the index and "root" value of each model
for n, model in enumerate(models['data']):
    print(str(n) + ". " + model['root'])

0. whisper-1
1. babbage
2. gpt-3.5-turbo
3. davinci
4. text-davinci-edit-001
5. text-davinci-003
6. babbage-code-search-code
7. text-similarity-babbage-001
8. code-davinci-edit-001
9. text-davinci-001
10. ada
11. babbage-code-search-text
12. babbage-similarity
13. code-search-babbage-text-001
14. text-curie-001
15. gpt-4
16. code-search-babbage-code-001
17. text-ada-001
18. text-embedding-ada-002
19. text-similarity-ada-001
20. curie-instruct-beta
21. gpt-4-0314
22. ada-code-search-code
23. ada-similarity
24. code-search-ada-text-001
25. text-search-ada-query-001
26. davinci-search-document
27. ada-code-search-text
28. text-search-ada-doc-001
29. davinci-instruct-beta
30. text-similarity-curie-001
31. code-search-ada-code-001
32. ada-search-query
33. text-search-davinci-query-001
34. curie-search-query
35. davinci-search-query
36. babbage-search-document
37. ada-search-document
38. text-search-curie-query-001
39. text-search-babbage-doc-001
40. curie-search-document
41. text-search-cur
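
The same kind of iteration can group models by another key, such as `owned_by`. The sketch below runs on two records hand-copied from the outputs in this notebook and reduced to the keys it needs; the helper name `group_ids_by_owner` is an assumption for illustration.

```python
from collections import defaultdict

# Two records copied from this notebook's outputs, reduced to the keys used here
sample_models = [
    {"id": "whisper-1", "owned_by": "openai-internal"},
    {"id": "gpt-4", "owned_by": "openai"},
]


def group_ids_by_owner(model_list):
    """Map each "owned_by" value to the list of model ids with that owner."""
    groups = defaultdict(list)
    for model in model_list:
        groups[model["owned_by"]].append(model["id"])
    return dict(groups)


print(group_ids_by_owner(sample_models))
```

With the live response, the call would be `group_ids_by_owner(models['data'])`, assuming every entry carries an `owned_by` key as the two examples shown here do.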

To view the available data for a particular model, we can index the list by position, or search it for the entry whose "root" matches the one we want. For instance, to view "gpt-4", we can pass a generator expression to `next`, which returns the first match, or `None` if there is none:

In [28]:
# Use a generator expression with next() to find a particular model
item = next((m for m in models['data'] if m['root'] == 'gpt-4'), None)
print(item)

{
  "created": 1678604602,
  "id": "gpt-4",
  "object": "model",
  "owned_by": "openai",
  "parent": null,
  "permission": [
    {
      "allow_create_engine": false,
      "allow_fine_tuning": false,
      "allow_logprobs": false,
      "allow_sampling": false,
      "allow_search_indices": false,
      "allow_view": false,
      "created": 1684465847,
      "group": null,
      "id": "modelperm-HnvVZ1tf2jVawVaM1B3yjZnD",
      "is_blocking": false,
      "object": "model_permission",
      "organization": "*"
    }
  ],
  "root": "gpt-4"
}
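
When several models need to be looked up, scanning the list each time becomes repetitive. One option, sketched here on hand-copied sample records, is to build a dictionary keyed by `id` once and look models up from it; the helper name `index_by_id` is an assumption for illustration.

```python
# Two records copied from this notebook's outputs, reduced to a few keys
sample_models = [
    {"id": "whisper-1", "owned_by": "openai-internal", "root": "whisper-1"},
    {"id": "gpt-4", "owned_by": "openai", "root": "gpt-4"},
]


def index_by_id(model_list):
    """Build a dict mapping each model's "id" to its full record."""
    return {model["id"]: model for model in model_list}


models_by_id = index_by_id(sample_models)
print(models_by_id["gpt-4"]["owned_by"])   # openai
print(models_by_id.get("no-such-model"))   # None
```

With the live response, `index_by_id(models['data'])` would be the equivalent call.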


Unfortunately, this model list omits much of what we'd like to know, such as the cost to use each model or its context limit. However, it does tell us whether we can fine-tune the model, which could be useful. We can use a list comprehension that checks the first permission entry of each model to get a list of all models that can be fine-tuned:

In [29]:
items = [m['root'] for m in models['data'] if m['permission'][0]['allow_fine_tuning']]
print(items)

['cushman:2020-05-03', 'if-davinci:3.0.0', 'davinci-if:3.0.0', 'davinci-instruct-beta:2.0.0']
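
The comprehension above reads only `permission[0]`. Since `permission` is a list, a slightly more defensive variant checks every entry and tolerates an empty list. This is a sketch on sample data: `example-model` and its two permission entries are hypothetical, and `fine_tunable_ids` is a helper name invented for this example.

```python
# Sample records: whisper-1 mirrors the output above; "example-model" is hypothetical
sample_models = [
    {"id": "whisper-1", "permission": [{"allow_fine_tuning": False}]},
    {
        "id": "example-model",
        "permission": [{"allow_fine_tuning": False}, {"allow_fine_tuning": True}],
    },
    {"id": "no-permissions", "permission": []},  # hypothetical edge case
]


def fine_tunable_ids(model_list):
    """Return ids of models where any permission entry allows fine-tuning."""
    return [
        m["id"]
        for m in model_list
        if any(p.get("allow_fine_tuning", False) for p in m.get("permission", []))
    ]


print(fine_tunable_ids(sample_models))  # ['example-model']
```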
