# Embeddings (3)

This notebook runs the OpenAI embeddings API on the **third quarter** of the original DataFrame.

- Read the DataFrame from CSV (words only; the `vec` column is still empty).

In [1]:
import pandas as pd

embeddings3 = pd.read_csv('embs3.csv')
# Drop the unnamed first column (most likely the index saved by an earlier to_csv call)
del embeddings3['Unnamed: 0']
embeddings3

Unnamed: 0,word,vec
0,make a fool of,
1,make an April fool of,
2,make an ass of,
3,trifle with,
4,cajole,
...,...,...
24995,nullify,
24996,declare null and void,
24997,cancel,
24998,retract,


- Use a personal API key to access the API, and choose the large embedding model (`text-embedding-3-large`), which returns vectors of up to **3072 elements**.
- Then read each word and request its embedding vector.
- Store each vector in the `vec` column of the DataFrame.

In [2]:
from openai import OpenAI

client = OpenAI(api_key='my-key')  # Create the API client (replace 'my-key' with your own key)

# Make sure the vec column is filled with None (object dtype) before running
embeddings3['vec'] = [None] * len(embeddings3)

for i in range(len(embeddings3)):
    response = client.embeddings.create(
        input=embeddings3.at[i, 'word'],  # Text to embed
        model="text-embedding-3-large"    # Embedding model
    )
    # Write with .at: chained indexing (embeddings3['vec'][i] = ...) may assign to a copy
    embeddings3.at[i, 'vec'] = response.data[0].embedding  # Extract output vector
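
The loop above sends one request per word, which means roughly 25,000 requests for this quarter. The embeddings endpoint also accepts a list of strings as `input`, so the same work can be done in fewer calls. The following is only a sketch of that idea, not the method used in this notebook: `chunked`, `embed_in_batches`, and the batch size of 256 are hypothetical names and values chosen for illustration, and `client` is assumed to be the `OpenAI` client created above.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_batches(client, words: Sequence[str],
                     model: str = "text-embedding-3-large",
                     batch_size: int = 256) -> List[List[float]]:
    """Return one embedding per word, sending `batch_size` words per request."""
    vectors: List[List[float]] = []
    for batch in chunked(words, batch_size):
        response = client.embeddings.create(input=list(batch), model=model)
        # Sort by the `index` field so vectors line up with the input order.
        for item in sorted(response.data, key=lambda d: d.index):
            vectors.append(item.embedding)
    return vectors
```

Assuming the helper above, the column could then be filled in one step with `embeddings3['vec'] = pd.Series(embed_in_batches(client, embeddings3['word'].tolist()), index=embeddings3.index)`.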

In [3]:
embeddings3

Unnamed: 0,word,vec
0,make a fool of,"[0.02152961678802967, -0.022376881912350655, -..."
1,make an April fool of,"[0.0005690762773156166, -0.020257286727428436,..."
2,make an ass of,"[0.011938272975385189, -0.008908421732485294, ..."
3,trifle with,"[-0.016612699255347252, -0.007306511513888836,..."
4,cajole,"[-0.024196593090891838, -0.03589264303445816, ..."
...,...,...
24995,nullify,"[-0.03417118638753891, -0.047342561185359955, ..."
24996,declare null and void,"[-0.0159055944532156, -0.04391665384173393, -0..."
24997,cancel,"[-0.013750141486525536, -0.02101719379425049, ..."
24998,retract,"[-0.007555052638053894, -0.005431007593870163,..."


- Store the vectors in a CSV file.

In [4]:
embeddings3.to_csv('emb3L.csv')
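
One thing to keep in mind when loading this file later: a CSV cell can only hold text, so each vector is written as a string such as `"[0.1, -0.2, 0.3]"` and comes back as a string from `pd.read_csv`. The sketch below shows one way to turn those strings back into lists. It uses a tiny stand-in DataFrame with made-up values and an in-memory buffer instead of `emb3L.csv`; `index=False` is an optional variation that avoids the extra `Unnamed: 0` column on reload.

```python
import io
import json

import pandas as pd

# Tiny stand-in for embeddings3 (hypothetical values, 3 elements instead of 3072).
demo = pd.DataFrame({
    "word": ["cancel", "retract"],
    "vec": [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6]],
})

buffer = io.StringIO()
demo.to_csv(buffer, index=False)  # index=False: no 'Unnamed: 0' column on reload
buffer.seek(0)

restored = pd.read_csv(buffer)
# Each vector is read back as text; json.loads parses it into a list of floats.
restored["vec"] = restored["vec"].apply(json.loads)
```

After this step, `restored["vec"][0]` is again a Python list rather than a string.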