# OPENAI INTRODUCTION

### Import Libraries

In [2]:
import os
import openai

### Create Environment Variable for OpenAI API Key

In [6]:
# Read the API key from a text file
with open('openai-api-key.txt', 'r') as f:
    openai_api_key = f.read().strip()  # drop the trailing newline, if any

In [7]:
# Create Environment Variable
os.environ['OPENAI_API_KEY'] = openai_api_key

### Test

In [10]:
openai.api_key = os.getenv('OPENAI_API_KEY')

In [11]:
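# Legacy Completions interface (openai<1.0); returns a dict-like response
response = openai.Completion.create(
    model='text-davinci-003',
    prompt='Give me two reasons to learn OpenAI API with Python',
    max_tokens=300  # upper bound on the number of generated tokens
)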

In [12]:
print(response['choices'][0]['text'])



1. OpenAI API with Python offers easy access to powerful open-source tools that can be used to develop powerful AI applications. This API offers a high level of flexibility, allowing developers to manipulate and control the underlying algorithms. 

2. OpenAI API with Python allows developers to create applications that are specifically tailored to the user's need, providing them with the power of advanced AI technology. This API also simplifies the process of creating complex AI applications, as it provides simple APIs and methods that allow developers to quickly and easily prototype AI applications.


# HISTORY

**2015 - OpenAI as a Non-Profit**

- In December 2015, at the end of the NIPS (Neural Information Processing Systems) conference in Montreal, Elon Musk and Sam Altman announced the creation of a new non-profit organization called OpenAI.

**Sam Altman**

- Dropped out of Stanford to create a location-based social network called Loopt in 2005, which was later acquired in 2012 for &#36;43.4 million.
- In 2011 Altman started as a part-time partner at startup accelerator Y Combinator. In 2014, he became the president of YC.
- Y Combinator is a well-known startup accelerator in Silicon Valley. Many famous companies, such as Airbnb and Dropbox, came out of it.
- Even though OpenAI started in 2015, it wasn’t until 2019 that Altman transitioned away from being president of YC to focus on OpenAI.
- Garry Tan is currently the president of YC.

**Elon Musk**

<i>1995</i>
- Starts Zip2 with his brother Kimbal Musk and Greg Kouri.
- Sold to Compaq in 1999 for &#36;307 million (Elon’s roughly 7% stake was worth about &#36;22 million).

<i>1999</i>
- Starts online financial services company X.com, which in 2000 merged with Confinity to form PayPal.

<i>2002</i>
- PayPal is acquired by eBay for &#36;1.5 billion.
- Musk’s stake was approximately &#36;176 million.
- Musk founded the company SpaceX, a rocket manufacturer and launcher.
- Later on SpaceX would develop its own satellite internet.

<i>2004</i>
- Musk invests in Tesla, an electric car company, becoming its largest shareholder.
- Starting in 2005 Musk took a much more active role in Tesla.

<i>2022</i>
- Musk acquires Twitter for &#36;44 billion.

<i>Other companies:</i>
- SolarCity (acquired by Tesla)
- The Boring Company
- Neuralink
- Starlink (the satellite internet service operated by SpaceX)
- OpenAI

**Early Major Investors in OpenAI**

- Reid Hoffman (founder of LinkedIn)
- Jessica Livingston (one of the founders of YC)
- Peter Thiel
- Infosys
- Khosla Ventures
- YC Research

**Early Employees at OpenAI**

- Greg Brockman
- Ilya Sutskever
- Trevor Blackwell
- Andrej Karpathy
- Durk Kingma
- Wojciech Zaremba
- And many more well-regarded advisors!

**OpenAI**

- Initially formed as a non-profit to safely develop Artificial Intelligence.
- Musk and Altman were influenced by Google acquiring DeepMind in 2014, concerned that A.I. technology would only be developed and controlled by just a few of the world’s largest technology companies.
- In 2016, OpenAI released the Gym library, which provided an easy-to-use set of environments for reinforcement learning.
- In 2018, OpenAI announced the first version of GPT (Generative Pre-trained Transformer).
- In 2018, Elon Musk resigned his board seat at OpenAI, citing a “potential future conflict of interest” with Tesla’s own development of AI systems (mainly for self-driving cars at the time, though Tesla is now also working on a humanoid robot called Optimus).
- In 2019, OpenAI transitioned from a non-profit organization to a “capped” for-profit organization in order to accept an investment of &#36;1 billion, partnering with Microsoft (the lead investor) in the process.
- In 2019, OpenAI announced a new model called GPT-2.
- GPT-2 was not initially released to the public due to safety concerns about its potential to generate misinformation at a large scale.
- In 2020, GPT-3 was announced.
- In June of 2020, OpenAI announced the creation of an API to access its new AI models.
- Initial users had to apply to be accepted to have access to the API.
- In 2021, OpenAI announced the creation of DALL-E, a model capable of producing images from text.
- The original DALL-E model was not open-sourced or made available via an API.
- In 2022, DALL-E 2 was announced, producing much higher-fidelity images from text prompts.
- Also in 2022, ChatGPT was announced: a version of GPT optimized for dialogue and trained with human feedback.
- At the start of 2023, OpenAI announced a new &#36;10 billion investment from Microsoft.
- As part of this investment, Azure became the exclusive cloud provider of OpenAI model API calls. Note that the API is still available directly from OpenAI.
- Clearly a lot has happened in a short timespan, and the pace of development keeps accelerating!
- If you’re interested in learning more about this recent history of artificial intelligence and the role it has played in the dynamics of technology companies, you may want to read Genius Makers by Cade Metz.

# HOW IT WORKS?

**GPT**

- One of the more novel aspects of GPT-3 versus its predecessors GPT and GPT-2 was its size.
- GPT-3 has 175 billion parameters, which in storage terms is approximately 800 GB.
- It also cost about &#36;4.6 million in GPU compute to train.
- GPT-3 has a context window of 2048 tokens, and it’s estimated that the latest GPT-3.5 models (the colloquial name for the model underlying ChatGPT) have a context window of about 4000 tokens.
- Longer context windows allow the model to retain more information over long pieces of text, giving the impression of “memory”.
- Refer to the paper for some clever mathematical sinusoidal “tricks” used to achieve this!
- An embedding neural network converts each token into a vector; GPT-3 initially used 12,288-dimensional vectors.
- The nature of GPT-style models does not lend itself well to “on-edge” inference (i.e., running GPT-3 on your phone).
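
As a rough sanity check on the storage figure above, the raw size of the weights can be estimated by multiplying the parameter count by the bytes used per parameter. The numeric formats below are assumptions for illustration (the notes do not say which format the quoted figure refers to), and `weights_size_gb` is a hypothetical helper name. With 32-bit floats the estimate lands in the same ballpark as the number quoted above.

```python
# Back-of-the-envelope storage estimate for model weights.
PARAMS_GPT3 = 175_000_000_000  # parameter count quoted in the notes

# Assumed numeric formats (bytes per parameter); illustrative only.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weights_size_gb(n_params: int, bytes_per_param: int) -> float:
    """Return raw weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


for fmt, nbytes in BYTES_PER_PARAM.items():
    print(f"{fmt}: {weights_size_gb(PARAMS_GPT3, nbytes):,.0f} GB")
```

Numbers of this size are why such models are usually served from data centers rather than run on a phone.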

**DALL-E**

- In January of 2021, OpenAI announced work on DALL-E, and then one year later revealed DALL-E 2.
- DALL-E 2 was initially released in private beta and then later opened up to the public, with an API built soon after.
- The name comes from the combination of “WALL-E” and “Dalí” (as in Salvador Dalí).
- Fundamentally, all DALL-E does is take in an input text string and output an image.
- Note how overall this idea is actually quite similar to the idea of GPT-3, except in this case the modality of the output is different.
- An embedding is the creation of a vector representation of an object, such as a text embedding, allowing us to represent a word as a vector of N-dimensions.
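
To make the idea of an embedding concrete, here is a minimal sketch of a lookup table that maps each word to an N-dimensional vector. It is a toy under stated assumptions: the vectors are random placeholders, the text is split on whitespace, and `build_embedding_table` and `embed` are hypothetical helper names. Real models learn these vectors during training and typically use subword tokenizers.

```python
import random


def build_embedding_table(vocab, dim, seed=0):
    """Assign every token a fixed vector of length `dim` (random placeholder values)."""
    rng = random.Random(seed)
    return {token: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for token in vocab}


def embed(text, table):
    """Convert whitespace-separated, lowercased words into their vectors."""
    return [table[token] for token in text.lower().split()]


vocab = ["an", "astronaut", "riding", "a", "horse"]
table = build_embedding_table(vocab, dim=4)
vectors = embed("An astronaut riding a horse", table)

# One vector per word, each with `dim` components.
print(len(vectors), len(vectors[0]))
```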

##### There are actually two main stages:

- Prior: Takes the text embedding of the caption and generates a CLIP image embedding from it.
- Decoder (unCLIP): A diffusion model which actually generates the image from the prior embedding.

##### Contrastive Language–Image Pre-training: CLIP is trained on image-text pairs

1. Contrastive pre-training (Text Encoder-Image Encoder)
2. Create dataset classifier from label text
3. Use for zero-shot prediction
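
Steps 2 and 3 can be sketched with toy numbers: each label is turned into a caption, every caption gets a text embedding, and the image is assigned to the caption whose embedding is most similar to the image embedding. The three-dimensional vectors below are made up for illustration (real CLIP embeddings come from the trained encoders), and the helper names and the `temperature` value are assumptions rather than CLIP's actual API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, label_embeddings, temperature=0.07):
    """Score an image against each caption embedding and return label probabilities."""
    labels = list(label_embeddings)
    scores = [
        cosine_similarity(image_embedding, label_embeddings[label]) / temperature
        for label in labels
    ]
    return dict(zip(labels, softmax(scores)))


# Hypothetical embeddings standing in for text-encoder and image-encoder outputs.
label_embeddings = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]

probs = zero_shot_classify(image_embedding, label_embeddings)
best_label = max(probs, key=probs.get)
print(best_label)
```

No classifier is trained on the labels themselves, which is what makes the prediction "zero-shot".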

#####

- CLIP is only incentivized to learn the features of an image that are sufficient to match it up with the correct caption (as opposed to any of the others in the list).
- This makes CLIP not ideal for learning about certain aspects of images, like relative positions of objects.
- If you read the DALL-E papers, you will notice the authors were curious to see what happened when passing the text embedding directly to the decoder.
- The authors noted much better results when using the additional Prior model using CLIP.
- Intuitively, we can think of this as having one embedding for text meaning, and another embedding for the “gist” of an image from text.
- An infinite number of images could be consistent with a given caption, so the outputs of the two encoders will not perfectly coincide.
- Hence, a separate prior model is needed to “translate” the text embedding into an image embedding that could plausibly match it.

##### Diffusion

- A diffusion model is trained to undo the steps of a fixed corruption process.
- Each step of the corruption process adds a small amount of Gaussian noise to an image, which erases some of the information in it.
- After the final step, the image becomes indistinguishable from pure noise.
- The diffusion model is trained to reverse this process, and in doing so learns to regenerate what might have been erased in each step.
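
The fixed corruption process can be sketched in a few lines. This is a minimal illustration, assuming the common variance-preserving Gaussian step `x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * noise` and a constant noise schedule; the notes do not state which schedule DALL-E 2 uses, and the four-value "image" is a toy. Only the forward (noising) direction is shown, since the reverse direction is what the trained network learns.

```python
import math
import random


def forward_diffusion(pixels, betas, rng):
    """Apply the fixed corruption process and return every intermediate image."""
    x = list(pixels)
    trajectory = [x]
    for beta in betas:
        # Shrink the signal slightly and add a small amount of Gaussian noise.
        x = [
            math.sqrt(1.0 - beta) * value + math.sqrt(beta) * rng.gauss(0.0, 1.0)
            for value in x
        ]
        trajectory.append(x)
    return trajectory


def signal_fraction(betas):
    """Fraction of the original signal amplitude that survives the given steps."""
    return math.sqrt(math.prod(1.0 - beta for beta in betas))


rng = random.Random(0)
image = [1.0, -1.0, 0.5, -0.5]  # toy "image" with four pixel values
betas = [0.02] * 500            # assumed constant noise schedule

trajectory = forward_diffusion(image, betas, rng)
print(f"signal left after 10 steps:  {signal_fraction(betas[:10]):.3f}")
print(f"signal left after 500 steps: {signal_fraction(betas):.4f}")
```

After a handful of steps most of the original signal is still present, while after the full schedule almost none of it remains, which matches the description of the final image being indistinguishable from pure noise.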

##### Two Main Stages

##### Prior Stage:

- Generates the CLIP image embedding (intended to describe the “gist” of the image) from the given caption (which itself is actually a text embedding).

##### Decoder Stage:

- A diffusion model called unCLIP generates the image itself from this embedding.
- unCLIP receives both a corrupted version of the image it is trained to reconstruct, as well as the CLIP image embedding of the clean image.

#####

- After these two stages an upsampling is performed on the image to get higher resolution.
- DALL-E 2 was trained on 512x512 images, so any higher resolution output is actually upscaled from 512x512.
- An interesting aspect of image generation models is that they work in multiple stages.
- This means image generation models lend themselves to being run “on-edge”; in fact, there have already been releases of Stable Diffusion models running locally on an iPhone!

# AI SAFETY AND ALIGNMENT

- Recall that the creation of OpenAI by Musk, Altman, and others was motivated by trying to make sure AI was developed in an open and safe manner.
- Deeply embedded in the systems of OpenAI is attention to safety, alignment and biases.
- If you’ve read the publications OpenAI has released about GPT-3 and DALL-E 2, you will have noticed that a lot of work goes into preventing harm.
- In your usage, you may notice this sometimes limits potential outputs.
- You should also keep in mind that OpenAI is one of the most committed companies in this area of AI, especially in terms of research and development of AI safety and alignment.

**Limitations of Text and Image Generation**

- OpenAI will restrict outputs of prompts that violate their content policy.
- These include topics like hate, threats, self-harm, sexual output, and violence.
- For example, you can’t ask GPT-3 to teach you how to make an improvised explosive device; it will flag the request and filter it out.
- You should also be aware that because the training data is drawn from the internet, there are inherent biases in the data!
- The original GPT-3 paper “Language Models are Few-Shot Learners” does an excellent job of discussing fairness, bias, and representation.
- OpenAI also publishes many papers focused just on AI safety and moderation.
- DALL-E 2 is under the same moderation restrictions: you cannot ask for images that would potentially violate the moderation policies.
- This sometimes leans too far toward “safety” for some users’ preferences. For example, compare these two nearly identical prompts:
  - "over the shoulder shot of a man"
  - "over the shoulder view of a man"
- An interesting technical note is that since requests for GPT and DALLE are both text inputs, the same moderation check on input text can be done on either model.

**Practice Text Prompts Safely**

- If you are concerned about triggering the content policy limitations (which, if violated too many times, may pause your access), there is a moderation endpoint where you can check queries before actually requesting results.
- More information and details can be found here: [platform.openai.com/docs/guides/moderation/](https://platform.openai.com/docs/guides/moderation/)
- While it may feel annoying at first for the models to be very inclined to “be safe”, keep in mind that this is one of the areas in which OpenAI is actually most “open”, and you can read their publications to understand their thought processes and methodologies.
- For full information on OpenAI’s policies, rules, and limitations, check out their platform policy page: [platform.openai.com/docs/usage-policies](https://platform.openai.com/docs/usage-policies)






In [17]:
# Moderation endpoint
response_mod = openai.Moderation.create(
    input="Sample text goes here"
)
print(response_mod["results"][0])

{
  "categories": {
    "hate": false,
    "hate/threatening": false,
    "self-harm": false,
    "sexual": false,
    "sexual/minors": false,
    "violence": false,
    "violence/graphic": false
  },
  "category_scores": {
    "hate": 4.921302206639666e-06,
    "hate/threatening": 1.0990176546599173e-09,
    "self-harm": 8.864341261016762e-09,
    "sexual": 2.6443567548994906e-05,
    "sexual/minors": 2.4819328814373876e-07,
    "violence": 2.1955165721010417e-05,
    "violence/graphic": 5.248724392004078e-06
  },
  "flagged": false
}
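
A result shaped like the one above can be inspected before sending the same text to a generation model. The sketch below works on a plain dictionary with the same keys as the printed output; `is_safe_to_send`, `flagged_categories`, and the optional `threshold` are hypothetical helpers introduced here for illustration, not part of the OpenAI library, and `sample_result` simply copies the values shown above.

```python
def is_safe_to_send(result):
    """True when the moderation result did not flag the input."""
    return not result["flagged"]


def flagged_categories(result, threshold=None):
    """List category names that were flagged.

    With no threshold, use the boolean `categories` field from the result.
    With a threshold, apply a stricter custom cutoff to `category_scores`.
    """
    if threshold is None:
        return sorted(name for name, hit in result["categories"].items() if hit)
    return sorted(
        name for name, score in result["category_scores"].items() if score >= threshold
    )


# Values copied from the moderation output shown above.
sample_result = {
    "categories": {
        "hate": False,
        "hate/threatening": False,
        "self-harm": False,
        "sexual": False,
        "sexual/minors": False,
        "violence": False,
        "violence/graphic": False,
    },
    "category_scores": {
        "hate": 4.921302206639666e-06,
        "hate/threatening": 1.0990176546599173e-09,
        "self-harm": 8.864341261016762e-09,
        "sexual": 2.6443567548994906e-05,
        "sexual/minors": 2.4819328814373876e-07,
        "violence": 2.1955165721010417e-05,
        "violence/graphic": 5.248724392004078e-06,
    },
    "flagged": False,
}

if is_safe_to_send(sample_result):
    print("Input passed moderation")
else:
    print("Input flagged for:", flagged_categories(sample_result))
```

In the notebook, the same helpers could be applied to the first entry of the `results` list returned by the moderation call.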
