<a href="https://colab.research.google.com/github/kky-ai/tech-demo/blob/main/rag-llm.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### ChatGPT API
---
The cell below calls the OpenAI ChatGPT API from Python using the `openai` client library.

In [None]:
from openai import OpenAI

# Init (pass your API key)
client = OpenAI(api_key='')

# API Call
response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"},
    {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
    {"role": "user", "content": "Where was it played?"}
  ]
)

# Print the first result (model's answer)
print(response.choices[0].message.content)
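
The API key does not have to be typed into the notebook. A minimal sketch, assuming the key has been exported as an environment variable: `load_api_key` is a hypothetical helper (not part of the `openai` package), and `OPENAI_API_KEY` is the variable name the client also checks on its own when `api_key` is omitted.

```python
import os

def load_api_key(var_name="OPENAI_API_KEY"):
    """Read the API key from an environment variable (hypothetical helper)."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key

# Usage (requires the variable to be set beforehand):
# client = OpenAI(api_key=load_api_key())
```

Failing early with a clear message is easier to debug than an authentication error returned by the first API call.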

### Vector database + RAG
---

In [29]:

import nltk
from tqdm import tqdm

# Tokenizer models; newer NLTK releases may need nltk.download('punkt_tab') instead
nltk.download('punkt')

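def suck_around(context, idx, maxLen):
    """Return the sentence covering character offset `idx` with its neighbours.

    The previous/current/next sentences are always kept. Earlier sentences
    are prepended while the result stays within `maxLen` word tokens.
    Contexts that already fit into `maxLen` are returned unchanged.
    """
    words = nltk.tokenize.word_tokenize(context)
    if len(words) <= maxLen:
        return context

    sentences = nltk.sent_tokenize(context)
    cumul = 0  # character offset just past the current sentence
    ret = []
    for sent_i, sent in enumerate(sentences):
        # locate the sentence in the original text so that the whitespace
        # between sentences is counted as well
        start = context.find(sent, cumul)
        cumul = (start if start >= 0 else cumul) + len(sent)

        if idx < cumul:
            # add previous sentence (if any)
            if sent_i > 0:
                ret.append(sentences[sent_i-1])

            # add current
            ret.append(sent)

            # add following (if any)
            if sent_i < len(sentences)-1:
                ret.append(sentences[sent_i+1])

            # keep prepending earlier (full) sentences, nearest first,
            # while maxLen is not exceeded
            n_tokens = len(nltk.tokenize.word_tokenize(' '.join(ret)))
            for si in range(sent_i-2, -1, -1):
                n_sent = len(nltk.tokenize.word_tokenize(sentences[si]))
                if n_tokens + n_sent > maxLen:
                    break
                ret.insert(0, sentences[si])
                n_tokens += n_sent

            break

    return ' '.join(ret)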
    
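MAX_CONTEXT_LEN = 16  # token budget per context window (used only if QPERLINE > 1)
QPERLINE = 1          # number of context windows extracted per input line

FN_IN = 'example_data/interview-1.txt'
data = []

with open(FN_IN, 'r') as fr:
    # linei is the 1-based line number in the input file
    for linei, line in enumerate(tqdm(fr.readlines()), start=1):
        line = line.strip()
        if not line:
            continue  # skip empty lines
        if QPERLINE > 1:
            # anchor the windows at evenly spaced character offsets
            for i in range(QPERLINE):
                idx = int((len(line)/QPERLINE)*i)
                context = suck_around(line, idx, MAX_CONTEXT_LEN)
                data.append({'context': context, 'iline': linei})
        else:
            # one item per line: store the whole line as the context
            data.append({'context': line, 'iline': linei})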

[nltk_data] Downloading package punkt to /Users/kitt/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
100%|██████████| 33/33 [00:00<00:00, 294493.69it/s]
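
When `QPERLINE` is greater than 1, the loading loop anchors one context window at each of `QPERLINE` evenly spaced character offsets in the line. A small sketch of that arithmetic alone; `window_offsets` is a hypothetical name used here only for illustration.

```python
def window_offsets(line_len, qperline):
    """Character offsets at which the context windows are anchored."""
    return [int((line_len / qperline) * i) for i in range(qperline)]

print(window_offsets(100, 4))  # [0, 25, 50, 75]
print(window_offsets(10, 3))   # [0, 3, 6]
```

Each offset is then passed to `suck_around`, which expands it to whole sentences.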


In [30]:
from sentence_transformers import SentenceTransformer
import faiss

encoder = SentenceTransformer('paraphrase-mpnet-base-v2')
vectors = encoder.encode([d['context'] for d in data])

index = faiss.IndexFlatL2(vectors.shape[1])
index.add(vectors)
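
`IndexFlatL2` is an exact (brute-force) index: it stores the vectors as they are and compares a query against every one of them, reporting squared Euclidean distances. The NumPy sketch below reproduces that computation for illustration only. `flat_l2_search` and the `demo_*` variables are hypothetical names, not part of the FAISS API, and the sketch makes no attempt to match FAISS's speed.

```python
import numpy as np

def flat_l2_search(vectors, queries, k):
    """Exhaustive k-nearest-neighbour search with squared L2 distances."""
    # pairwise differences, shape (n_queries, n_vectors, dim)
    diff = queries[:, None, :] - vectors[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    # indices of the k smallest distances for every query, nearest first
    inds = np.argsort(d2, axis=1, kind="stable")[:, :k]
    dists = np.take_along_axis(d2, inds, axis=1)
    return dists, inds

demo_vectors = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]], dtype="float32")
demo_query = np.array([[0.0, 0.0]], dtype="float32")
demo_dists, demo_inds = flat_l2_search(demo_vectors, demo_query, k=2)
print(demo_inds)   # [[0 1]]
print(demo_dists)  # [[0. 1.]]
```

An exhaustive scan is fine for a few dozen interview lines; larger collections usually call for one of FAISS's approximate index types.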




In [31]:
# The search query
query = 'Which schools did you attend?'

# Encode the query into a vector
xq = encoder.encode([query])

k = 3  # number of nearest neighbours to return


In [39]:
%%time

# IndexFlatL2 returns squared L2 distances, sorted from nearest to farthest
dists, inds = index.search(xq, k)
print(f' == Query: {query}')
print(f' == Searching in a vector database of {index.ntotal} items => {k} closest:')
for d, i in zip(dists[0], inds[0]):
    print(f'line {data[i]["iline"]}, d = {round(float(d), 2)}: {data[i]["context"]}')


 == Query: Which schools did you attend?
 == Searching in a vector database of 17 items => 3 closest:
line 21, d = 4.1: And where did you go to college?
line 23, d = 9.53: I went to University of California in San Diego, and for my undergrad, and then I went to University of Santa Clara, up in Santa Clara, California. And what did you study? So for my undergrad, it was computer engineering, which was, at that time, it was part computer science and part electrical engineering. And then for my master's degree, I got a master's in electrical engineering.
line 3, d = 9.73: Oh, well, I was born in Hollywood, California. And I was born in 1961. And where did you grow up? So I grew up in Southern California. And so half my childhood was in Los Angeles. And then we moved to San Diego, which was great. And, and lived there until I graduated from college.
CPU times: user 134 µs, sys: 18 µs, total: 152 µs
Wall time: 138 µs


#### RAG
---
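Retrieval with `faiss` + generation with ChatGPT: the passages found by the vector search become the context for the model's answer.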

In [None]:
from openai import OpenAI

# Init (pass your API key)
client = OpenAI(api_key='')

# Prefix every retrieved passage with a <context> marker
context = ' '.join(f'<context>: {data[i]["context"]}' for i in inds[0])

# API Call
response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a RAG system. I will provide the context and the query. Answer briefly and correctly."},
    {"role": "user", "content": f"{context} <query>: {query}"},
  ]
)

# Print the first result (model's answer)
print(response.choices[0].message.content)
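
The prompt assembly can also be pulled into a small function so that it can be inspected without calling the API. A minimal sketch under that assumption: `build_rag_prompt` is a hypothetical helper, and the numbered `<context N>` labels are one possible convention rather than anything the API requires.

```python
def build_rag_prompt(passages, query):
    """Join retrieved passages and the query into a single user message."""
    blocks = [f"<context {n}>: {p}" for n, p in enumerate(passages, start=1)]
    return "\n".join(blocks) + f"\n<query>: {query}"

demo_prompt = build_rag_prompt(
    ["I went to college in San Diego.", "I grew up in Los Angeles."],
    "Which schools did you attend?",
)
print(demo_prompt)
```

In the RAG cell this would be used as `build_rag_prompt([data[i]['context'] for i in inds[0]], query)` for the content of the user message.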