# 00 Import Libraries
We need `pandas` to read in the data and `openai` to request the embeddings.

In [1]:
import pandas as pd
from openai import OpenAI

# 01 Functions
We need one function to open files and another to return the embeddings of our data.

The function named <code>get_embedding</code> takes two arguments: `text` and `model`:<br><br>
<ol>
    <li><strong>Text Processing:</strong> The function first processes the <code>text</code> argument by replacing newline characters (<code>\n</code>) with spaces so that the input text is on a single line.</li>
    <li><strong>Generate Embeddings:</strong> We then call <code>client.embeddings.create</code> with a list containing the processed text and the name of the model we want to use.</li>
    <li><strong>Return Embeddings:</strong> The response holds one result per input, so we take the first element and return its embedding for our input text.</li>
</ol>

In [2]:
# FUNCTION TO OPEN FILES
def open_file(filepath):
    with open(filepath, 'r', encoding='utf-8') as infile:
        return infile.read()

# FUNCTION TO GET EMBEDDINGS
def get_embedding(text, model):
    text = text.replace("\n", " ")
    return client.embeddings.create(input=[text], model=model).data[0].embedding
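
To see the three steps without spending API credits, here is a minimal dry run against a stand-in client. `FakeEmbeddings`, `demo_client` and `demo_get_embedding` are hypothetical names made up for this sketch and are not part of the `openai` package. The fake only mimics the shape of the response (`.data[0].embedding`) and returns made-up numbers.

```python
from types import SimpleNamespace


class FakeEmbeddings:
    """Hypothetical stand-in for client.embeddings; it records its input."""

    def __init__(self):
        self.last_input = None

    def create(self, input, model):
        self.last_input = input
        # One made-up 3-dimensional vector per input text (not a real embedding)
        data = [SimpleNamespace(embedding=[float(len(t)), 0.0, 1.0]) for t in input]
        return SimpleNamespace(data=data)


# Assumption: this object only imitates the attributes get_embedding touches
demo_client = SimpleNamespace(embeddings=FakeEmbeddings())


def demo_get_embedding(text, model):
    # Step 1: put the text on a single line
    text = text.replace("\n", " ")
    # Step 2: send a one-element list together with the model name
    response = demo_client.embeddings.create(input=[text], model=model)
    # Step 3: return the embedding of the first (and only) result
    return response.data[0].embedding


vector = demo_get_embedding("first line\nsecond line", model="text-embedding-3-small")
print(demo_client.embeddings.last_input)
print(vector)
```

The first `print` shows that the newline was replaced before the text was sent, the second shows the list that the function hands back.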

# 02 Credentials
Because we are going to use OpenAI's embedding model `text-embedding-3-small`, we need a valid OpenAI API key.

In [4]:
# LOAD OPENAI CREDENTIALS (strip() DROPS A TRAILING NEWLINE FROM THE KEY FILE)
client = OpenAI(api_key=open_file('KEYS/openaiapikey.txt').strip())
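
If the key file is missing on another machine, a small helper can fall back to an environment variable. The sketch below is one possible way to do this. `load_api_key` is a hypothetical helper written for this example. `OPENAI_API_KEY` is the environment variable the `openai` client reads by default when no `api_key` is passed.

```python
import os
from pathlib import Path


def load_api_key(filepath, env_var="OPENAI_API_KEY"):
    """Return the key stored in `filepath`, or else the one in the environment.

    Hypothetical helper for this notebook, not part of the openai package.
    """
    path = Path(filepath)
    if path.is_file():
        # strip() removes surrounding whitespace such as a trailing newline
        key = path.read_text(encoding="utf-8").strip()
    else:
        key = os.environ.get(env_var, "").strip()
    if not key:
        raise ValueError(f"No API key found in {filepath!r} or in ${env_var}")
    return key


# Usage (commented out so the sketch runs without the openai package):
# client = OpenAI(api_key=load_api_key('KEYS/openaiapikey.txt'))
```

Raising an error early makes a missing key visible here, rather than as an authentication failure on the first API call.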

# 03 Read in Data
Next we load our database, which we have saved in a JSON file.

In [9]:
# READ JSON INTO DATAFRAME
df = pd.read_json('json/books_info.json', encoding='utf-8')

# 04 Get Embeddings
Now we call our `get_embedding` function for every book description in our database and save the results in a new column **text_embedding**.<br><br> <strong><em>CAUTION: This sends one API request per row, so it may take a while!</em></strong>

In [10]:
# GET EMBEDDINGS FOR DATABASE
df['text_embedding'] = df.book_description.apply(lambda x: get_embedding(x, model='text-embedding-3-small'))
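
Because the `input` argument of `client.embeddings.create` is a list, several descriptions can be sent in one request, which may shorten the wait. The following is a sketch of that idea, not the method used above. `get_embeddings_batched` is a hypothetical helper and the batch size of 100 is an arbitrary choice. The `FakeEmbeddings` stand-in exists only so the example runs offline with made-up vectors.

```python
from types import SimpleNamespace


def get_embeddings_batched(texts, model, client, batch_size=100):
    """Embed `texts` in chunks of `batch_size`, keeping the input order."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        # Same newline handling as get_embedding, applied to the whole chunk
        batch = [t.replace("\n", " ") for t in texts[start:start + batch_size]]
        response = client.embeddings.create(input=batch, model=model)
        # Each result carries the index of its input; sort to be safe
        ordered = sorted(response.data, key=lambda item: item.index)
        vectors.extend(item.embedding for item in ordered)
    return vectors


class FakeEmbeddings:
    """Hypothetical offline stand-in that counts requests."""

    def __init__(self):
        self.calls = 0

    def create(self, input, model):
        self.calls += 1
        data = [SimpleNamespace(index=i, embedding=[float(len(t))])
                for i, t in enumerate(input)]
        return SimpleNamespace(data=data)


fake_client = SimpleNamespace(embeddings=FakeEmbeddings())
demo = get_embeddings_batched(["a", "bb\ncc", "ddd"], "text-embedding-3-small",
                              fake_client, batch_size=2)
print(demo, fake_client.embeddings.calls)

# With the real client, the column could be filled like this:
# df['text_embedding'] = get_embeddings_batched(
#     df.book_description.tolist(), 'text-embedding-3-small', client)
```

In the dry run, three texts with a batch size of 2 need two requests instead of three.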

# 05 Save Embeddings
Lastly, we save everything to a new local JSON file.

In [11]:
# SAVE EMBEDDINGS IN NEW JSON
with open('json/books_info_embeddings.json', 'w', encoding='utf-8') as file:
    df.to_json(file, orient='records', indent=4, force_ascii=False)