# AI Training :: Prompt Engineering Workshop
## Session 03: Moderation, Data Formats, and Embeddings

[DataKolektiv](https://datakolektiv.com/)

Feedback should be sent to [goran.milovanovic@datakolektiv.com](mailto:goran.milovanovic@datakolektiv.com).

**These notebooks accompany the AI Training :: Prompt Engineering Workshop w. DataKolektiv.**

![](_img/PE_banner_2024.png)

### Instructor

[Goran S. Milovanović, PhD, DataKolektiv, Chief Scientist & Owner](https://www.linkedin.com/in/gmilovanovic/)

***

### 0. Virtual Environment setup

It is always convenient to run things from a Python virtual environment. Let's create one and install `openai`, `tiktoken`, and `pandas`.

```
python3 -m venv peworkshop
source peworkshop/bin/activate  # on Windows: peworkshop\Scripts\activate
pip install openai tiktoken pandas
pip freeze > requirements.txt
```

### 1. Modules

Load modules.

In [1]:
import os
from openai import OpenAI
import tiktoken
import json
from io import StringIO
import xml.etree.ElementTree as ET
import pandas as pd
pd.set_option('display.max_rows', None)



### 2. OpenAI API Credentials

**N.B.** It is better practice to store your OpenAI API key in an environment variable and load it via the `os` module, but we assume only minimal technical knowledge here and read the key from a `.txt` file.

In [2]:
# credentials: read the OpenAI API key from the first line of a text file
with open('openai_api_key.txt', 'r') as api_key_file:
    api_key = api_key_file.readline().strip()

# create the OpenAI API client with the key read above
client = OpenAI(
    api_key=api_key,
)
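
The note above recommends an environment variable over a text file. A minimal sketch of that approach, falling back to the `.txt` file when the variable is not set. The helper name `load_api_key` and its defaults are our assumptions for illustration; it is not part of the `openai` package.

```python
import os
from pathlib import Path


def load_api_key(env_var="OPENAI_API_KEY", fallback_path="openai_api_key.txt"):
    """Return the API key from the environment, else from a text file.

    Hypothetical workshop helper, not part of the `openai` package.
    """
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    # Fall back to the first line of the file; splitlines() drops the newline.
    lines = Path(fallback_path).read_text().splitlines()
    if not lines:
        raise ValueError(f"{fallback_path} is empty")
    return lines[0].strip()
```

The client could then be created with `OpenAI(api_key=load_api_key())`.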

### 3.1 Moderation 

[OpenAI API Documentation](https://platform.openai.com/docs/guides/moderation/overview)

In [3]:
# define prompt
prompt = "Now I Am Become Death, the Destroyer of Worlds."

# model call with client.moderations.create
response = client.moderations.create(
    input=prompt
)

# read response
ai_says = response.results[0]
# print response
print(ai_says)

Moderation(categories=Categories(harassment=False, harassment_threatening=False, hate=False, hate_threatening=False, self_harm=False, self_harm_instructions=False, self_harm_intent=False, sexual=False, sexual_minors=False, violence=False, violence_graphic=False, self-harm=False, sexual/minors=False, hate/threatening=False, violence/graphic=False, self-harm/intent=False, self-harm/instructions=False, harassment/threatening=False), category_scores=CategoryScores(harassment=0.0012912385864183307, harassment_threatening=0.0029530171304941177, hate=5.105884702061303e-05, hate_threatening=2.4547129214624874e-05, self_harm=0.0017593271331861615, self_harm_instructions=9.041735211212654e-06, self_harm_intent=6.605585804209113e-05, sexual=1.4226918665372068e-06, sexual_minors=1.029710272604234e-08, violence=0.40414944291114807, violence_graphic=0.0012299844529479742, self-harm=0.0017593271331861615, sexual/minors=1.029710272604234e-08, hate/threatening=2.4547129214624874e-05, violence/graphic=0

In [4]:
ai_says.categories

Categories(harassment=False, harassment_threatening=False, hate=False, hate_threatening=False, self_harm=False, self_harm_instructions=False, self_harm_intent=False, sexual=False, sexual_minors=False, violence=False, violence_graphic=False, self-harm=False, sexual/minors=False, hate/threatening=False, violence/graphic=False, self-harm/intent=False, self-harm/instructions=False, harassment/threatening=False)

In [5]:
ai_says.categories.harassment

False

In [6]:
ai_says.category_scores

CategoryScores(harassment=0.0012912385864183307, harassment_threatening=0.0029530171304941177, hate=5.105884702061303e-05, hate_threatening=2.4547129214624874e-05, self_harm=0.0017593271331861615, self_harm_instructions=9.041735211212654e-06, self_harm_intent=6.605585804209113e-05, sexual=1.4226918665372068e-06, sexual_minors=1.029710272604234e-08, violence=0.40414944291114807, violence_graphic=0.0012299844529479742, self-harm=0.0017593271331861615, sexual/minors=1.029710272604234e-08, hate/threatening=2.4547129214624874e-05, violence/graphic=0.0012299844529479742, self-harm/intent=6.605585804209113e-05, self-harm/instructions=9.041735211212654e-06, harassment/threatening=0.0029530171304941177)

In [7]:
ai_says.category_scores.hate

5.105884702061303e-05
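
Note that the `violence` score above (about 0.40) did not set the `violence` category to `True`: the boolean flags are decided on OpenAI's side. If an application needs to be stricter, the raw scores can be compared against its own cut-offs. A minimal sketch, using a plain dictionary with three scores copied from the output above; the function name `categories_above` and the 0.4 threshold are illustrative assumptions.

```python
def categories_above(scores, threshold):
    """Return category names whose score is at or above `threshold`, highest first."""
    hits = [(name, score) for name, score in scores.items() if score >= threshold]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in hits]


# Three scores copied from the moderation output above.
scores = {
    "harassment": 0.0012912385864183307,
    "hate": 5.105884702061303e-05,
    "violence": 0.40414944291114807,
}

print(categories_above(scores, threshold=0.4))  # ['violence']
```

How to turn `ai_says.category_scores` into such a dictionary depends on the installed `openai` version; in recent versions the response objects are pydantic models, so `model_dump()` is a likely candidate, but verify this against your installation.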

In [8]:
# define prompt
prompt = """
I wish to overthrow the government of a specific democratic republic 
in Europe and establish a dictatorship regime there. Please provide 
your guidance. Thank you.
"""

# model call with client.moderations.create
response = client.moderations.create(
    input=prompt
)

# read response
ai_says = response.results
# print response
print(ai_says)

[Moderation(categories=Categories(harassment=False, harassment_threatening=False, hate=False, hate_threatening=False, self_harm=False, self_harm_instructions=False, self_harm_intent=False, sexual=False, sexual_minors=False, violence=False, violence_graphic=False, self-harm=False, sexual/minors=False, hate/threatening=False, violence/graphic=False, self-harm/intent=False, self-harm/instructions=False, harassment/threatening=False), category_scores=CategoryScores(harassment=0.04842095449566841, harassment_threatening=0.008839759975671768, hate=0.05677752569317818, hate_threatening=0.002051287330687046, self_harm=3.893690518452786e-05, self_harm_instructions=1.4402078249986516e-06, self_harm_intent=9.666726327850483e-06, sexual=0.0002978272386826575, sexual_minors=2.158312554456643e-06, violence=0.13374894857406616, violence_graphic=9.629165288060904e-05, self-harm=3.893690518452786e-05, sexual/minors=2.158312554456643e-06, hate/threatening=0.002051287330687046, violence/graphic=9.6291652

In [9]:
# model call with client.chat.completions.create
response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "user", "content": prompt}
                ],
            temperature=0
        )
# read response
ai_says = response.choices[0].message.content
# print response
print(ai_says)

I'm sorry, but I can't assist with that.


[Source: The Most Offensive Line of Dialogue in Movie History](https://www.flickchart.com/blog/the-most-offensive-line-of-dialogue-in-movie-history/)

In [10]:
# define prompt (my apologies)
prompt = "Shut that cunt’s mouth or I’ll come over there and fuck-start her head."

# model call with client.moderations.create
response = client.moderations.create(
    input=prompt
)

# read response
ai_says = response.results
# print response
print(ai_says)

[Moderation(categories=Categories(harassment=True, harassment_threatening=True, hate=False, hate_threatening=False, self_harm=False, self_harm_instructions=False, self_harm_intent=False, sexual=False, sexual_minors=False, violence=True, violence_graphic=False, self-harm=False, sexual/minors=False, hate/threatening=False, violence/graphic=False, self-harm/intent=False, self-harm/instructions=False, harassment/threatening=True), category_scores=CategoryScores(harassment=0.9851411581039429, harassment_threatening=0.9102827906608582, hate=0.009750178083777428, hate_threatening=0.004169577267020941, self_harm=2.415751623630058e-05, self_harm_instructions=1.7924555777426576e-06, self_harm_intent=4.256805823388277e-06, sexual=0.01645806059241295, sexual_minors=0.00013411197869572788, violence=0.9745458364486694, violence_graphic=0.0032203225418925285, self-harm=2.415751623630058e-05, sexual/minors=0.00013411197869572788, hate/threatening=0.004169577267020941, violence/graphic=0.00322032254189

In [11]:
response.results[0].flagged

True

In [12]:
response.results[0].category_scores.harassment

0.9851411581039429
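
A common use of the `flagged` field is to gate a prompt before it reaches a chat model. A minimal sketch of that pattern follows. `moderated_reply`, `fake_is_flagged`, and `fake_answer` are hypothetical names; the two `fake_` functions stand in for the moderation and chat completion calls shown above so that the example runs without API access.

```python
def moderated_reply(prompt, is_flagged, answer,
                    refusal="This request was blocked by moderation."):
    """Call `answer(prompt)` only when `is_flagged(prompt)` is False."""
    if is_flagged(prompt):
        return refusal
    return answer(prompt)


# Stand-ins for the API calls; both are assumptions made for illustration.
def fake_is_flagged(prompt):
    return "attack" in prompt.lower()


def fake_answer(prompt):
    return f"(model reply to: {prompt})"


print(moderated_reply("Hello there", fake_is_flagged, fake_answer))
print(moderated_reply("Plan an attack", fake_is_flagged, fake_answer))
```

With the real client, `is_flagged` could wrap `client.moderations.create(input=prompt).results[0].flagged`, as used in the cells above.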

### 3.2 Data Formats: JSON, CSV, XML 

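**Note.** **Always** keep `temperature=0` when you ask the model for structured output (JSON, CSV, XML) that your code will parse: it makes the replies as repeatable as possible.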

In [13]:
instruction = """
### INSTRUCTION
I will provide a text and I would like you to perform Named Entity Recognition from it:
- extract all named entities that you can find,
- use the following entity type: PERSON, ORGANIZATION, DATE, GEO-LOCATION, EVENT, OBJECT
- and report in JSON with the following two fields: _ENTITY_, _ENTITY_TYPE_;
response with a JSON string and nothing else.
"""

text = """
### TEXT 
A Russian pilot tried to shoot down an RAF surveillance plane after believing 
he had permission to fire, the BBC has learned.

The pilot fired two missiles, the first of which missed rather than malfunctioned 
as claimed at the time.

Russia had claimed the incident last September was caused by a "technical malfunction".

The UK's Ministry of Defence (MoD) publicly accepted the Russian explanation.

But now three senior Western defence sources with knowledge of the incident have 
told the BBC that Russian communications intercepted by the RAF RC-135 Rivet Joint 
aircraft give a very different account from the official version.

The RAF plane - with a crew of up to 30 - was flying a surveillance mission over the 
Black Sea in international airspace on 29 September last year when it encountered two 
Russian SU-27 fighter jets.

The intercepted communications show that one of the Russian pilots thought he had 
been given permission to target the British aircraft, following an ambiguous command 
from a Russian ground station.

However, the second Russian pilot did not. He remonstrated and swore at his wingman when 
he fired the first missile.

The Rivet Joint is loaded with sensors to intercept communications. The RAF crew would have 
been able to listen in to the incident which could have resulted in their own deaths.

The MoD will not release details of those communications.

Responding to these new revelations an MoD spokesperson said: "Our intent has always 
been to protect the safety of our operations, avoid unnecessary escalation and inform 
the public and international community."
"""

# model call with client.chat.completions.create
response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "user", "content": instruction + text}
                ],
            temperature=0
        )
# read response
ai_says = response.choices[0].message.content
# print response
print(ai_says)

{"entities": [
    {"entity": "Russian", "entity_type": "ORGANIZATION"},
    {"entity": "RAF", "entity_type": "ORGANIZATION"},
    {"entity": "BBC", "entity_type": "ORGANIZATION"},
    {"entity": "UK's Ministry of Defence", "entity_type": "ORGANIZATION"},
    {"entity": "MoD", "entity_type": "ORGANIZATION"},
    {"entity": "Western", "entity_type": "GEO-LOCATION"},
    {"entity": "Black Sea", "entity_type": "GEO-LOCATION"},
    {"entity": "September", "entity_type": "DATE"},
    {"entity": "29 September", "entity_type": "DATE"},
    {"entity": "last year", "entity_type": "DATE"},
    {"entity": "RAF RC-135 Rivet Joint", "entity_type": "OBJECT"},
    {"entity": "Russian SU-27", "entity_type": "OBJECT"}
]}


In [14]:
ai_json = json.loads(response.choices[0].message.content)
ai_json

{'entities': [{'entity': 'Russian', 'entity_type': 'ORGANIZATION'},
  {'entity': 'RAF', 'entity_type': 'ORGANIZATION'},
  {'entity': 'BBC', 'entity_type': 'ORGANIZATION'},
  {'entity': "UK's Ministry of Defence", 'entity_type': 'ORGANIZATION'},
  {'entity': 'MoD', 'entity_type': 'ORGANIZATION'},
  {'entity': 'Western', 'entity_type': 'GEO-LOCATION'},
  {'entity': 'Black Sea', 'entity_type': 'GEO-LOCATION'},
  {'entity': 'September', 'entity_type': 'DATE'},
  {'entity': '29 September', 'entity_type': 'DATE'},
  {'entity': 'last year', 'entity_type': 'DATE'},
  {'entity': 'RAF RC-135 Rivet Joint', 'entity_type': 'OBJECT'},
  {'entity': 'Russian SU-27', 'entity_type': 'OBJECT'}]}

In [15]:
type(ai_json)

dict
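
`json.loads` succeeds here because the reply is pure JSON, but note that the model used the keys `entity` and `entity_type` under an `entities` wrapper rather than the requested `_ENTITY_` and `_ENTITY_TYPE_` fields. A defensive parser can absorb such variations and reject anything unexpected. A minimal sketch, in which the function name `parse_entities` and the handling of Markdown code fences are our assumptions about what a reply might look like:

```python
import json

ALLOWED_TYPES = {"PERSON", "ORGANIZATION", "DATE", "GEO-LOCATION", "EVENT", "OBJECT"}
FENCE = "`" * 3  # Markdown code fence marker


def parse_entities(raw):
    """Parse a model reply into a list of (entity, entity_type) pairs.

    Raises ValueError if the reply is not valid JSON or has an unexpected shape.
    """
    text = raw.strip()
    # Some replies may arrive wrapped in a code fence; drop the fence lines.
    if text.startswith(FENCE):
        text = "\n".join(
            line for line in text.splitlines() if not line.startswith(FENCE)
        )
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(f"reply is not valid JSON: {err}") from err
    # Accept either a bare list or a dict with an "entities" list.
    items = data.get("entities") if isinstance(data, dict) else data
    if not isinstance(items, list):
        raise ValueError("expected a list of entities")
    pairs = []
    for item in items:
        if not isinstance(item, dict):
            raise ValueError(f"expected an object, got {item!r}")
        # Normalise keys such as "_ENTITY_" and "entity" to the same form.
        norm = {key.strip("_").lower(): value for key, value in item.items()}
        if "entity" not in norm or norm.get("entity_type") not in ALLOWED_TYPES:
            raise ValueError(f"unexpected entity record: {item!r}")
        pairs.append((norm["entity"], norm["entity_type"]))
    return pairs
```

Calling `parse_entities(ai_says)` on the reply above would yield pairs such as `("RAF", "ORGANIZATION")`, which `pd.DataFrame` accepts together with a `columns=` list.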

In [16]:
ai_json['entities'][0]

{'entity': 'Russian', 'entity_type': 'ORGANIZATION'}

In [17]:
ai_json['entities'][6]

{'entity': 'Black Sea', 'entity_type': 'GEO-LOCATION'}

In [18]:
ai_data_frame = pd.DataFrame(ai_json['entities'])
display(ai_data_frame)

| | entity | entity_type |
|---|---|---|
| 0 | Russian | ORGANIZATION |
| 1 | RAF | ORGANIZATION |
| 2 | BBC | ORGANIZATION |
| 3 | UK's Ministry of Defence | ORGANIZATION |
| 4 | MoD | ORGANIZATION |
| 5 | Western | GEO-LOCATION |
| 6 | Black Sea | GEO-LOCATION |
| 7 | September | DATE |
| 8 | 29 September | DATE |
| 9 | last year | DATE |


In [19]:
instruction = """
### INSTRUCTION
I will provide a text and I would like you to perform Named Entity Recognition from it:
- extract all named entities that you can find,
- use the following entity type: PERSON, ORGANIZATION, DATE, GEO-LOCATION, EVENT, OBJECT
- and report in CSV with the following two columns: _ENTITY_, _ENTITY_TYPE_;
response with a CSV string and nothing else.
"""

# model call with client.chat.completions.create
response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "user", "content": instruction + text}
                ],
            temperature=0
        )
# read response
ai_says = response.choices[0].message.content
# print response
print(ai_says)

_ENTITY_, _ENTITY_TYPE_
Russian, ORGANIZATION
RAF, ORGANIZATION
BBC, ORGANIZATION
Russia, GEO-LOCATION
September, DATE
UK's Ministry of Defence, ORGANIZATION
MoD, ORGANIZATION
Western, ORGANIZATION
BBC, ORGANIZATION
RAF RC-135 Rivet Joint, ORGANIZATION
Black Sea, GEO-LOCATION
September, DATE
Russian, ORGANIZATION
SU-27, OBJECT
Russian, ORGANIZATION
Rivet Joint, OBJECT
RAF, ORGANIZATION
MoD, ORGANIZATION
MoD, ORGANIZATION


In [20]:
ai_data_frame = pd.read_csv(StringIO(ai_says))
display(ai_data_frame)

| | _ENTITY_ | _ENTITY_TYPE_ |
|---|---|---|
| 0 | Russian | ORGANIZATION |
| 1 | RAF | ORGANIZATION |
| 2 | BBC | ORGANIZATION |
| 3 | Russia | GEO-LOCATION |
| 4 | September | DATE |
| 5 | UK's Ministry of Defence | ORGANIZATION |
| 6 | MoD | ORGANIZATION |
| 7 | Western | ORGANIZATION |
| 8 | BBC | ORGANIZATION |
| 9 | RAF RC-135 Rivet Joint | ORGANIZATION |
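
Two details of the CSV reply are worth handling explicitly. The model separated fields with a comma followed by a space, so a default parse keeps that leading space in the second column name and in its values, and several entity rows are repeated. A minimal sketch with the standard `csv` module, run on a few rows copied from the reply above:

```python
import csv
from io import StringIO

# A few rows copied from the model reply above.
reply = """_ENTITY_, _ENTITY_TYPE_
Russian, ORGANIZATION
RAF, ORGANIZATION
BBC, ORGANIZATION
Russia, GEO-LOCATION
BBC, ORGANIZATION
"""

# skipinitialspace=True drops the blank that follows each comma.
reader = csv.DictReader(StringIO(reply), skipinitialspace=True)
rows = [(row["_ENTITY_"], row["_ENTITY_TYPE_"]) for row in reader]

# dict.fromkeys keeps the first occurrence of each pair, in order.
unique_rows = list(dict.fromkeys(rows))

print(unique_rows)
```

`pd.read_csv` accepts the same `skipinitialspace=True` argument, and `DataFrame.drop_duplicates()` removes the repeated rows.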


In [21]:
instruction = """
### INSTRUCTION
I will provide a text and I would like you to perform Named Entity Recognition from it:
- extract all named entities that you can find,
- use the following entity type: PERSON, ORGANIZATION, DATE, GEO-LOCATION, EVENT, OBJECT
- and report in XML with the following two fields: _ENTITY_, _ENTITY_TYPE_;
response with an XML string and nothing else.
"""

# model call with client.chat.completions.create
response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "user", "content": instruction + text}
                ],
            temperature=0
        )
# read response
ai_says = response.choices[0].message.content
# print response
print(ai_says)

<?xml version="1.0" encoding="UTF-8"?>
<entities>
    <entity>
        <entity_type>ORGANIZATION</entity_type>
        <entity>Russian</entity>
    </entity>
    <entity>
        <entity_type>ORGANIZATION</entity_type>
        <entity>RAF</entity>
    </entity>
    <entity>
        <entity_type>ORGANIZATION</entity_type>
        <entity>BBC</entity>
    </entity>
    <entity>
        <entity_type>DATE</entity_type>
        <entity>last September</entity>
    </entity>
    <entity>
        <entity_type>ORGANIZATION</entity_type>
        <entity>UK's Ministry of Defence (MoD)</entity>
    </entity>
    <entity>
        <entity_type>ORGANIZATION</entity_type>
        <entity>RAF RC-135 Rivet Joint</entity>
    </entity>
    <entity>
        <entity_type>GEO-LOCATION</entity_type>
        <entity>Black Sea</entity>
    </entity>
    <entity>
        <entity_type>DATE</entity_type>
        <entity>29 September last year</entity>
    </entity>
    <entity>
        <entity_type>ORGANIZATION</

In [22]:
# use pandas.read_xml() to parse the XML and create a DataFrame
ai_data_frame = pd.read_xml(StringIO(ai_says), xpath=".//entity")

# rename columns to match your specified names
ai_data_frame.columns = ["_ENTITY_", "_ENTITY_TYPE_"]

# display
display(ai_data_frame)

| | _ENTITY_ | _ENTITY_TYPE_ |
|---|---|---|
| 0 | ORGANIZATION | Russian |
| 1 | | Russian |
| 2 | ORGANIZATION | RAF |
| 3 | | RAF |
| 4 | ORGANIZATION | BBC |
| 5 | | BBC |
| 6 | DATE | last September |
| 7 | | last September |
| 8 | ORGANIZATION | UK's Ministry of Defence (MoD) |
| 9 | | UK's Ministry of Defence (MoD) |
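
In the table above every entity appears twice and the column labels are swapped. Two things appear to cause this: the XPath `.//entity` matches both the outer `<entity>` records and the inner `<entity>` leaf elements, and the positional rename assumes a column order that the XML does not follow (`entity_type` comes first in each record). A minimal sketch that avoids both, using `xml.etree.ElementTree` (imported earlier as `ET`) on a shortened copy of the reply. Selecting only direct children of the root with `./entity` and reading each field by name are our suggestions, not part of the original notebook.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the model reply above (first two records only).
reply = """<?xml version="1.0" encoding="UTF-8"?>
<entities>
    <entity>
        <entity_type>ORGANIZATION</entity_type>
        <entity>Russian</entity>
    </entity>
    <entity>
        <entity_type>ORGANIZATION</entity_type>
        <entity>RAF</entity>
    </entity>
</entities>
"""

root = ET.fromstring(reply.encode("utf-8"))

records = []
# "./entity" selects only the direct children of <entities>,
# so the nested <entity> leaves are not matched a second time.
for record in root.findall("./entity"):
    records.append({
        "_ENTITY_": record.findtext("entity"),
        "_ENTITY_TYPE_": record.findtext("entity_type"),
    })

print(records)
```

`pd.DataFrame(records)` then gives one row per entity. With `pd.read_xml`, the equivalent change should be `xpath="./entity"` together with a rename by name, for example `rename(columns={"entity": "_ENTITY_", "entity_type": "_ENTITY_TYPE_"})`.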


### 3.3 OpenAI Embedding Model

[Source: The Rime of the Ancient Mariner (text of 1834)](https://www.poetryfoundation.org/poems/43997/the-rime-of-the-ancient-mariner-text-of-1834)

In [24]:
query = """
At length did cross an Albatross,
Thorough the fog it came;
As if it had been a Christian soul,
We hailed it in God's name.

It ate the food it ne'er had eat,
And round and round it flew.
The ice did split with a thunder-fit;
The helmsman steered us through!

And a good south wind sprung up behind;
The Albatross did follow,
And every day, for food or play,
Came to the mariner's hollo!

In mist or cloud, on mast or shroud,
It perched for vespers nine;
Whiles all the night, through fog-smoke white,
Glimmered the white Moon-shine.'

'God save thee, ancient Mariner!
From the fiends, that plague thee thus!—
Why look'st thou so?'—With my cross-bow
I shot the ALBATROSS.
"""

query_embedding_response = client.embeddings.create(
        model="text-embedding-ada-002",
        input=query,
    )
query_embedding_stc = query_embedding_response.data[0].embedding

print(query_embedding_stc)

[0.00692663062363863, -0.01597840152680874, -0.014610342681407928, 0.019484883174300194, 0.010871422477066517, 0.02329685539007187, -0.002585034351795912, -0.004419628530740738, -0.003968036267906427, -0.018422313034534454, 0.011190193705260754, 0.022566337138414383, -0.0008052290650084615, -0.024877429008483887, 0.006999682169407606, 0.007604019250720739, 0.03912915289402008, 0.01316923089325428, -0.01190078817307949, -0.00918459240347147, -0.010001443326473236, 0.02049432508647442, 0.013408309780061245, -0.006471717730164528, -0.010373342782258987, -0.017187075689435005, 0.015075217001140118, -0.018648110330104828, 0.02716195397078991, -0.010765166021883488, 0.014756445772945881, 0.005116940476000309, -0.01415874995291233, 0.0034068662207573652, -0.012292610481381416, -0.01117027085274458, -0.0030333062168210745, -0.005429070442914963, 0.018462160602211952, -0.01641671173274517, 0.009237720631062984, -0.0038518174551427364, -0.01658937893807888, 0.024611786007881165, -0.0115023236721

[Source: The Ballad of Reading Gaol](https://www.poetryfoundation.org/poems/45495/the-ballad-of-reading-gaol)

In [25]:
query = """
I walked, with other souls in pain,
Within another ring,
And was wondering if the man had done
A great or little thing,
When a voice behind me whispered low,
"That fellow's got to swing."

Dear Christ! the very prison walls
Suddenly seemed to reel,
And the sky above my head became
Like a casque of scorching steel;
And, though I was a soul in pain,
My pain I could not feel.

I only knew what hunted thought
Quickened his step, and why
He looked upon the garish day
With such a wistful eye;
The man had killed the thing he loved,
And so he had to die.

Yet each man kills the thing he loves,
By each let this be heard,
Some do it with a bitter look,
Some with a flattering word,
The coward does it with a kiss,
The brave man with a sword!
"""

query_embedding_response = client.embeddings.create(
        model="text-embedding-ada-002",
        input=query,
    )
query_embedding_ow = query_embedding_response.data[0].embedding

print(query_embedding_ow)

[0.007581125479191542, -0.01954539865255356, 0.007316282484680414, -0.018777355551719666, 0.003777320496737957, 0.040441498160362244, -0.004604954272508621, -0.0077466522343456745, -0.001812517992220819, -0.006614448968321085, 0.0066045173443853855, 0.015758147463202477, -0.011249198578298092, -0.004111684858798981, 0.03310535103082657, 0.033846911042928696, 0.048201389610767365, 0.022167343646287918, 0.006280085071921349, -0.026735881343483925, -0.018565481528639793, 0.01154052559286356, -0.0003759526589419693, -0.004436117131263018, -0.01428826991468668, -0.011586872860789299, 0.022763239219784737, -0.02449795976281166, 0.029318099841475487, -0.024776045233011246, -0.008547801524400711, -0.00507836090400815, -0.030059659853577614, -0.002737812465056777, -0.030615828931331635, -0.00032588079920969903, -0.004098442383110523, -0.0056113568134605885, 0.018154975026845932, -0.025914868339896202, 0.02207464911043644, 0.007250071968883276, -0.0035952411126345396, -0.028338180854916573, -0.0

[Source: The Most Beautiful Woman In Town](https://mypoeticside.com/show-classic-poem-4429)

In [26]:
query = """
"Pretty isn't the word, it hardly does you fair."
Cass reached into her handbag. I thought she was reaching for her handkerchief. She
came out with a long hatpin. Before I could stop her she had run this long hatpin through
her nose, sideways, just above the nostrils. I felt disgust and horror. She looked at me
and laughed, "Now do you think me pretty? What do you think now, man?" I pulled
the hatpin out and held my handkerchief over the bleeding. Several people, including the
bartender, had seen the act. The bartender came down:
"Look," he said to Cass, "you act up again and you're out. We don't need
your dramatics here."
"Oh, fuck you, man!" she said.
"Better keep her straight," the bartender said to me.
"She'll be all right," I said.
"It's my nose, I can do what I want with my nose."
"No," I said, "it hurts me."
"You mean it hurts you when I stick a pin in my nose?"
"Yes, it does, I mean it."
"All right, I won't do it again. Cheer up."
She kissed me, rather grinning through the kiss and holding the handkerchief to her
nose. We left for my place at closing time. I had some beer and we sat there talking. It
was then that I got the perception of her as a person full of kindness and caring. She
gave herself away without knowing it. At the same time she would leap back into areas of
wildness and incoherence. Schitzi. A beautiful and spiritual schitzi. Perhaps some man,
something, would ruin her forever. I hoped that it wouldn't be me. We went to bed and
after I turned out the lights Cass asked me,
"When do you want it? Now or in the morning?"
"In the morning," I said and turned my back.
In the morning I got up and made a couple of coffees, brought her one in bed. She
laughed.
"""

query_embedding_response = client.embeddings.create(
        model="text-embedding-ada-002",
        input=query,
    )
query_embedding_cb = query_embedding_response.data[0].embedding

print(query_embedding_cb)

[-0.018734490498900414, 0.011964713223278522, 0.045759063214063644, -0.013362300582230091, -0.022893166169524193, 0.02253865636885166, 0.025892866775393486, 0.003804165171459317, -0.00537219038233161, -0.03362391144037247, 0.018093645572662354, 0.02765178121626377, 0.012830535881221294, -0.01637563481926918, 0.008099189959466457, 0.0035962313413619995, 0.04600449278950691, 0.01523029524832964, 0.01508031040430069, -0.019688941538333893, -0.008699130266904831, 0.016771050170063972, -0.00885593332350254, 0.011603385210037231, -0.023043151944875717, 0.008719583041965961, 0.02444755658507347, -0.009128632955253124, -0.010410322807729244, -0.028115371242165565, -0.020302515476942062, -0.001318334136158228, -0.0005620176671072841, -0.00025437798467464745, -0.042241230607032776, -0.038341622799634933, -0.006565252784639597, -0.017207371070981026, -0.013416840694844723, -0.0017418713541701436, 0.007117470260709524, -0.020684296265244484, -0.0030593532137572765, 0.001632791361771524, -0.0249793

[Source: Lament nad Beogradom](https://www.lyrikline.org/sl/pesmi/lament-nad-beogradom-4757)

In [27]:
query = """
LIŽBUA i moj put,
u svet, kule u vazduhu i na morskoj peni,
priviđaju mi se još, dok mi žižak drhće ko prut
i prenosi mi zemlju, u sne, u sne, u sne.
Samo, to više nisu, ni žene, ni ljudi živi,
nego neke nemoćne, slabe, i setne, seni,
što mi kažu, da nisu zveri, da nisu krivi,
da im život baš ništa nije dao,
pa šapću „ñao, ñao, ñao“
i naše „ne, ne“.

     Ti, međutim, dišeš, u noćnoj tišini,
     do zvezda, što kazuju put Suncu u tvoj san.
     Ti slušaš svog srca lupu, u dubini,
     što udara, ko stenom, u mračni Kalemegdan.
     Tebi su naši boli sitni mravi.
     Ti biser suza naših bacaš u prah.
     Ali se nad njima, posle, Tvoja zora zaplavi,
     u koju se mlad i veseo zagledah.
     A kad umorno srce moje ućuti, da spi,
     uzglavlje meko ćeš mi, u snu, biti, Ti.

FINISTÈRE i njen stas,
brak, poljupci, bura što je tako silna bila,
priviđaju mi se još, po neki leptir, bulke, klas,
dok, iz prošlosti, slušam, njen korak, tako lak.
Samo, to više nije ona, ni njen glas nasmejan,
nego neki kormoran, divljih i crnih krila,
što viče: zrak svake sreće tone u Okean.
Pa mi mrmlja reči „tombe“ i „sombre“.
Pa krešti njino „ombre, ombre“. –  
i naš „grob“ i „mrak“.

     Ti, međutim, krećeš, ko naš labud večni,
     iz smrti, i krvi, prema Suncu, na svoj put.
     Dok meni dan tone u tvoj ponor rečni,
     Ti se dižeš, iz jutra, sav zracima obasut.
     Ja ću negde, sam, u Sahari, stati,
     u onoj gde su karavani seni,
     ali, ko što uz mrtvog Tuarega čuči mati,
     Ti ćeš, do smrti, biti uteha meni.
     A kad mi slome dušu, koplje i ruku i nogu,
     Tebe, Tebe, znam da ne mogu, ne mogu.
"""

query_embedding_response = client.embeddings.create(
        model="text-embedding-ada-002",
        input=query,
    )
query_embedding_mc = query_embedding_response.data[0].embedding

print(query_embedding_mc)

[0.00011146062024636194, -0.01179114542901516, -0.011222818866372108, -0.01894422248005867, -0.0025819670408964157, 0.02383052557706833, -0.02147882990539074, -0.00896257720887661, -0.002314134733751416, -0.03637290745973587, 0.009935918264091015, 0.011242416687309742, -0.0029281890019774437, -0.017663855105638504, 0.02249789796769619, 0.0038019095081835985, 0.039142683148384094, 0.022641612216830254, 0.026286741718649864, -0.0393778532743454, 0.0027142497710883617, 0.002756711095571518, 0.014619714580476284, -0.015429742634296417, -0.012418264523148537, -0.022915977984666824, 0.032557934522628784, -0.01155597623437643, 0.011431858874857426, -0.023556161671876907, -0.0006797873065806925, -0.015625717118382454, -0.03467446193099022, -0.0078651187941432, -0.02462748996913433, -0.038149744272232056, -0.004455158486962318, 0.00876007042825222, 0.005376239772886038, -0.008230938576161861, 0.004899368155747652, 0.010647960007190704, -0.011503715999424458, -0.010020840913057327, -0.0461455136
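
The four cells above repeat the same call with a different text each time. The embeddings endpoint also accepts a list of strings as `input`, so a small helper can embed several texts in one request. A sketch under stated assumptions: `embed_texts` is a hypothetical name, and the stub client below only imitates the shape of the real response (`data`, `index`, `embedding`) so that the example runs offline.

```python
from types import SimpleNamespace


def embed_texts(client, texts, model="text-embedding-ada-002"):
    """Return one embedding (list of floats) per input text, in input order."""
    response = client.embeddings.create(model=model, input=list(texts))
    # Sort by the index field in case items come back out of order.
    items = sorted(response.data, key=lambda item: item.index)
    return [item.embedding for item in items]


# Offline stand-in for the OpenAI client (an assumption for illustration only).
def _fake_create(model, input):
    data = [
        SimpleNamespace(index=i, embedding=[float(len(text)), 0.0])
        for i, text in enumerate(input)
    ]
    return SimpleNamespace(data=data)


fake_client = SimpleNamespace(embeddings=SimpleNamespace(create=_fake_create))

vectors = embed_texts(fake_client, ["abc", "hello"])
print(vectors)  # [[3.0, 0.0], [5.0, 0.0]]
```

With the real `client`, passing the four poem texts as one list would return the four embeddings in the same order.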

Correlations between the four embedding vectors. In practice we would typically use a different metric, such as [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity).

In [28]:
poets = pd.DataFrame({"stc":query_embedding_stc,
                      "ow":query_embedding_ow,
                      "cb":query_embedding_cb,
                      "mc":query_embedding_mc})
poets.corr()

| | stc | ow | cb | mc |
|---|---|---|---|---|
| stc | 1.0 | 0.833402 | 0.761302 | 0.775735 |
| ow | 0.833402 | 1.0 | 0.803866 | 0.813332 |
| cb | 0.761302 | 0.803866 | 1.0 | 0.775618 |
| mc | 0.775735 | 0.813332 | 0.775618 | 1.0 |
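
Cosine similarity is the dot product of two vectors divided by the product of their lengths. Pearson correlation, used above, is the same quantity computed after subtracting each vector's mean. A minimal pure-Python sketch with toy vectors; the function names are ours.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson(a, b):
    """Pearson correlation: cosine similarity of the mean-centred vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 7.0]))
```

The same function applies to the embeddings directly, for example `cosine_similarity(query_embedding_stc, query_embedding_ow)`.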


<hr>

DataKolektiv, 2023.

[hello@datakolektiv.com](mailto:goran.milovanovic@datakolektiv.com)

![](_img/DK_logo_horizontal.png)

<font size=1>License: [GPLv3](https://www.gnu.org/licenses/gpl-3.0.txt) This Notebook is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This Notebook is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this Notebook. If not, see http://www.gnu.org/licenses/.</font>