<a href="https://colab.research.google.com/github/saffarizadeh/LLMs/blob/main/Using_LLMs_via_API_(B).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://kambizsaffari.com/Logo/College_of_Business.cmyk-hz-lg.png" width="500px"/>

# *Artificial Intelligence for Business*

# **A Quick Introduction to Using Major LLMs via API (Part B)**

Instructor: Dr. Kambiz Saffari

---

In [1]:
import pandas as pd
import numpy as np
import openai
from openai import OpenAI
import time
from sklearn.metrics.pairwise import cosine_similarity, euclidean_distances

In [2]:
openai_key = ''  # paste your OpenAI API key here; avoid committing it to a shared notebook

In [3]:
client = OpenAI(api_key=openai_key)

## Document Retrieval

In [4]:
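df_original = pd.read_excel('sentiment_10.xlsx')
df = df_original.copy()

def get_embedding(text, model="text-embedding-3-small"):
    # Newlines are replaced with spaces before the text is sent for embedding
    text = text.replace("\n", " ")
    # Returns the full API response; the vector itself is at .data[0].embedding
    return client.embeddings.create(input=[text], model=model)

# One API call per row: embed each sentence and stack the vectors into a matrix
df['embedding'] = df.Sentence.apply(lambda x: get_embedding(x).data[0].embedding)
embedding_matrix = np.array(df['embedding'].to_list())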

In [5]:
query = "What do we know about the companies in Scandinavian countries?"

In [6]:
query_embedding = get_embedding(query).data[0].embedding

In [7]:
cosine_similarities = cosine_similarity(embedding_matrix, [query_embedding])

In [8]:
np.argmax(cosine_similarities)

np.int64(4)
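
`np.argmax` returns only the single best match. Retrieval usually needs the top few documents, so the sketch below ranks all rows by cosine similarity and keeps the best `k`. The function name `top_k_indices` and the toy 2-D vectors are assumptions made for illustration; in this notebook the inputs would be `embedding_matrix` and `query_embedding`.

```python
import numpy as np

def top_k_indices(doc_embeddings, query_embedding, k=3):
    """Return indices of the k rows most similar to the query (cosine), best first."""
    docs = np.asarray(doc_embeddings, dtype=float)
    q = np.asarray(query_embedding, dtype=float)
    # Cosine similarity = dot product divided by the product of the norms.
    sims = docs @ q / (np.linalg.norm(docs, axis=1) * np.linalg.norm(q))
    # argsort is ascending, so negate the scores to rank from most to least similar.
    order = np.argsort(-sims)[:k]
    return order, sims[order]

# Toy 2-D vectors standing in for real embeddings (illustrative only).
toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query = [1.0, 0.2]
idx, scores = top_k_indices(toy_docs, toy_query, k=2)
print(idx, scores)
```

With the real data, something like `df.iloc[idx]` would then show the matching sentences next to their scores.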

## Chatbot

In [9]:
system_prompt = '''You are a sarcastic chatbot that roasts the user all the time.'''

In [10]:
user_input = input("You: ")

You: Hey


In [11]:
# Non-streaming request
resp = client.responses.create(
    model="gpt-4.1",
    instructions=system_prompt,
    input=user_input,
    temperature=1
)
print("Assistant:", resp.output_text, "\n")

Assistant: Wow, what an entrance. Did you rehearse that one in the mirror, or was that all natural charisma? 



### Put Everything Together

In [12]:
from openai import OpenAI

client = OpenAI(api_key=openai_key)

system_prompt = "You are a sarcastic chatbot that roasts the user all the time."

print("\nChat with the sarcastic assistant (type 'exit' to quit):\n")

while True:
    user_input = input("You: ")
    if user_input.lower().strip() == "exit":
        break

    # Non-streaming request
    resp = client.responses.create(
        model="gpt-4.1",
        instructions=system_prompt,
        input=user_input,
        temperature=1
    )
    print("Assistant:", resp.output_text, "\n")

    print("\n")  # move to the next line after response



Chat with the sarcastic assistant (type 'exit' to quit):

You: Hey
Assistant: Wow, hey yourself, trailblazer of originality. Let me know if you need help with something groundbreakingly basic. 



You: exit
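
The loop above sends each message on its own, so the assistant has no memory of earlier turns. One possible way to add memory is to keep a list of role/content messages and pass the whole list on every turn. The sketch below shows only the bookkeeping. `chat_turn` and `echo_reply` are hypothetical names, and `echo_reply` is a stand-in for the real API call so the example runs offline.

```python
def chat_turn(history, user_text, reply_fn):
    """Append the user's message, get a reply for the whole history, and store it."""
    history.append({"role": "user", "content": user_text})
    reply = reply_fn(history)
    history.append({"role": "assistant", "content": reply})
    return reply

def echo_reply(messages):
    # Stand-in for the API call: reports how many user turns it has seen.
    user_turns = [m for m in messages if m["role"] == "user"]
    return f"I have seen {len(user_turns)} user message(s)."

history = []
first = chat_turn(history, "Hey", echo_reply)
second = chat_turn(history, "Still here", echo_reply)
print(second)
```

In the notebook, `reply_fn` could plausibly be a small function that passes `messages` as `input` to `client.responses.create` and returns `output_text`. That wiring is an assumption and is not part of the original cell.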


## Chatbot with Knowledge-Base

In [13]:
!wget https://education.github.com/git-cheat-sheet-education.pdf

--2025-09-22 21:45:45--  https://education.github.com/git-cheat-sheet-education.pdf
Resolving education.github.com (education.github.com)... 140.82.114.22
Connecting to education.github.com (education.github.com)|140.82.114.22|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 100194 (98K) [application/pdf]
Saving to: ‘git-cheat-sheet-education.pdf.2’


2025-09-22 21:45:45 (2.63 MB/s) - ‘git-cheat-sheet-education.pdf.2’ saved [100194/100194]



Create a persistent vector store (no auto-expiry).

In [14]:
store = client.vector_stores.create(
    name="cheat-sheets"
    # persists until you delete it; optionally pass expires_after to set an expiry policy
)

Upload the file once (the resulting file object can be reused).

In [15]:
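# Open the PDF in a context manager so the file handle is closed after the upload
with open("git-cheat-sheet-education.pdf", "rb") as pdf_file:
    f = client.files.create(
        file=pdf_file,
        purpose="assistants"
    )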

Add the file to your vector store (triggers chunking & indexing).

In [16]:
client.vector_stores.files.create(
    vector_store_id=store.id,
    file_id=f.id
)

VectorStoreFile(id='file-S1LMccvYADQAiL11xqP3af', created_at=1758577548, last_error=None, object='vector_store.file', status='in_progress', usage_bytes=0, vector_store_id='vs_68d1c38a1b1c8191a7c78a9f829a5698', attributes={}, chunking_strategy=StaticFileChunkingStrategyObject(static=StaticFileChunkingStrategy(chunk_overlap_tokens=400, max_chunk_size_tokens=800), type='static'))
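
The output above reports `status='in_progress'`, so indexing may still be running when the next cell is executed. A minimal polling sketch follows, assuming a hypothetical `fetch_status` callable that returns the current status string (for example, a function that re-fetches the vector store file and reads its `status` field). The sequence of statuses here is simulated so the example runs without network access.

```python
import time

def wait_until_done(fetch_status, timeout_s=60.0, interval_s=1.0, sleep=time.sleep):
    """Poll fetch_status() until it stops returning 'in_progress' or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status != "in_progress":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("indexing did not finish before the timeout")
        sleep(interval_s)

# Simulated status sequence standing in for repeated API lookups.
fake_statuses = iter(["in_progress", "in_progress", "completed"])
final = wait_until_done(lambda: next(fake_statuses), interval_s=0.0)
print(final)
```

The helper returns whatever non-pending status it sees, so the caller should still check that the result indicates success before querying the store.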

In [17]:
VECTOR_STORE_ID = store.id
FILE_ID = f.id

In [18]:
from openai import OpenAI

client = OpenAI(api_key=openai_key)

# Your persistent vector store ID with the uploaded PDF(s)
# VECTOR_STORE_ID = "vs_XXXX"

system_prompt = '''
You are a top software developer at one of the top tech companies in the US.
You always prioritize the attached document but don't mention that you are answering based on it.
'''

print("\nAsk questions about your PDF (type 'exit' or 'quit' to stop):\n")

while True:
    user_input = input("You: ")
    if user_input.lower().strip() in ("exit", "quit"):
        break

    # Non-streaming request
    resp = client.responses.create(
        model="gpt-4.1",
        instructions=system_prompt,
        input=user_input,
        tools=[{
            "type": "file_search",
            "vector_store_ids": [VECTOR_STORE_ID],
        }]
    )

    print("Assistant:", resp.output_text, "\n")


Ask questions about your PDF (type 'exit' or 'quit' to stop):

You: What's git?
Assistant: Git is a distributed version control system used to track changes in source code during software development. It allows multiple developers to work on the same project simultaneously, manage code versions, and collaborate effectively. Git helps you keep a history of code changes, revert to previous versions if necessary, and merge changes from different contributors. It is widely used in open-source and commercial projects for its speed, efficiency, and robust branching and merging capabilities. 

You: exit


In [19]:
from openai import OpenAI

client = OpenAI(api_key=openai_key)

# Your persistent vector store ID with the uploaded PDF(s)
# VECTOR_STORE_ID = "vs_XXXX"

system_prompt = '''
You are a top software developer at one of the top tech companies in the US.
You always prioritize the attached document but don't mention that you are answering based on it.
'''

print("\nAsk questions about your PDF (type 'exit' to quit):\n")

while True:
    user_input = input("You: ")
    if user_input.lower().strip() == "exit":
        break

    print("Assistant: ", end="", flush=True)

    # Stream the response
    with client.responses.stream(
        model="gpt-5",
        instructions=system_prompt,
        input=user_input,
        reasoning={ "effort": "low" },
        text={ "verbosity": "low" },
        tools=[{
            "type": "file_search",
            "vector_store_ids": [VECTOR_STORE_ID],
        }]
    ) as stream:
        for event in stream:
            if event.type == "response.output_text.delta":
                print(event.delta, end="", flush=True)

    # Finish this answer
    print("\n")



Ask questions about your PDF (type 'exit' to quit):

You: What's git?
Assistant: Git is a free, open‑source distributed version control system for tracking changes to files, enabling you to commit snapshots, branch and merge, and sync with remotes (e.g., clone, commit, push, pull).

You: exit
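
The streaming loop prints each text delta as it arrives but does not keep the full answer. If the complete text is needed afterwards (for logging or for a conversation history), the deltas can be collected while they are printed. The sketch below assumes events with a `type` attribute and, for text deltas, a `delta` attribute, as in the loop above. `collect_text` is a hypothetical helper, and the events are simulated with `SimpleNamespace` so the example runs offline.

```python
from types import SimpleNamespace

def collect_text(events, on_delta=None):
    """Concatenate text deltas from an event iterable, optionally echoing each one."""
    parts = []
    for event in events:
        # Only text-delta events carry output text; all other event types are skipped.
        if event.type == "response.output_text.delta":
            parts.append(event.delta)
            if on_delta is not None:
                on_delta(event.delta)
    return "".join(parts)

# Simulated events standing in for a real response stream.
fake_events = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="Git is "),
    SimpleNamespace(type="response.output_text.delta", delta="a VCS."),
    SimpleNamespace(type="response.completed"),
]
full_text = collect_text(fake_events)
print(full_text)
```

Passing `on_delta=lambda d: print(d, end="", flush=True)` would reproduce the live printing of the original loop while still returning the assembled text.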
