# Parallel Requests Demo with Training, Embedding, and Search

This notebook demonstrates how to:
1. Read a text file (`cosmos.txt`) and split it into sentences.
2. Use the first 10% of sentences to call the `/train` endpoint.
3. Use the remaining 90% of sentences to call the `/embed` endpoint in parallel.
4. Define 20 questions about *Cosmos* by Carl Sagan and send them in parallel to the `/search` endpoint.

We use Python's `concurrent.futures` and `requests` to send multiple requests concurrently.
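
As a minimal sketch of that pattern, the snippet below fans a list of inputs out to a thread pool and collects each result as it finishes. The worker `fake_request` is a hypothetical stand-in for an HTTP call (it only sleeps briefly), so the example runs without a server.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_request(item):
    """Hypothetical stand-in for an HTTP call: sleep briefly, then echo the input."""
    time.sleep(0.01)
    return {"echo": item}


def run_parallel(items, worker, max_workers=5):
    """Submit worker(item) for every item and collect (item, result) pairs."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the input that produced it.
        future_to_item = {executor.submit(worker, item): item for item in items}
        # as_completed yields futures in completion order, not submission order.
        for future in as_completed(future_to_item):
            results.append((future_to_item[future], future.result()))
    return results


pairs = run_parallel(["a", "b", "c"], fake_request)
print(sorted(pairs))
```

Because results arrive in completion order, the sketch keeps each input next to its result instead of relying on list position.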

In [1]:
# Install necessary libraries if not already installed
!pip install requests nltk

# Download nltk punkt tokenizer data
import nltk
nltk.download('punkt')
# Download the specific punkt_tab resource needed for sentence tokenization
nltk.download('punkt_tab')



[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.


True

In [14]:
import requests
from concurrent.futures import ThreadPoolExecutor, as_completed
import nltk
from nltk.tokenize import sent_tokenize

# API endpoint URLs
TRAIN_URL = 'http://localhost:8000/train'
EMBED_URL = 'http://localhost:8000/embed'
SEARCH_URL = 'http://localhost:8000/search'

# Path to the text file containing the text of 'Cosmos' by Carl Sagan
TEXT_FILE = 'cosmos.txt'


In [15]:
# Read the book text from 'cosmos.txt'
with open(TEXT_FILE, 'r', encoding='utf-8') as f:
    text = f.read()

# Use nltk's sent_tokenize to split into sentences
sentences = sent_tokenize(text)
print(f"Total sentences extracted: {len(sentences)}")


Total sentences extracted: 6455


In [16]:
# Determine split index for 10% train, 90% embed
split_idx = max(1, int(len(sentences) * 0.1))  # Ensure at least one sentence in train
train_sentences = sentences[:split_idx]
embed_sentences = sentences[split_idx:]
print(f"Using {len(train_sentences)} sentences for training and {len(embed_sentences)} for embedding.")


Using 322 sentences for training and 1678 for embedding.


In [17]:
def call_train(sent_list):
    """POST a list of sentences to /train; return the JSON body or {"error": ...}."""
    payload = {"sentences": sent_list}
    try:
        response = requests.post(TRAIN_URL, json=payload)
        response.raise_for_status()
        return response.json()
    except Exception as e:
        return {"error": str(e)}

def call_embed(sent_list):
    """POST a list of sentences to /embed; return the JSON body or {"error": ...}."""
    payload = {"sentences": sent_list}
    try:
        response = requests.post(EMBED_URL, json=payload)
        response.raise_for_status()
        return response.json()
    except Exception as e:
        return {"error": str(e)}

def call_search(query, k=5):
    """POST a query to /search asking for k neighbors; return the JSON body or {"error": ...}."""
    payload = {"query": query, "k": k}
    try:
        response = requests.post(SEARCH_URL, json=payload)
        response.raise_for_status()
        return response.json()
    except Exception as e:
        return {"error": str(e)}
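
The three helpers above differ only in URL and payload, and none of them sets a timeout, so a stalled server could block a worker thread indefinitely. One possible consolidation is sketched below. The name `post_json`, the injected `post` callable, and the 30-second default are assumptions made for illustration, not part of the notebook.

```python
def post_json(url, payload, post, timeout=30.0):
    """POST a JSON payload and return the decoded body, or {"error": ...} on failure.

    `post` is any callable with the calling convention of `requests.post`;
    passing it in keeps this sketch free of a hard dependency on `requests`.
    The 30-second default timeout is an arbitrary illustrative value.
    """
    try:
        response = post(url, json=payload, timeout=timeout)
        response.raise_for_status()
        return response.json()
    except Exception as exc:
        # Mirror the notebook's helpers: report the failure instead of raising.
        return {"error": str(exc)}
```

With this in place, `call_embed(sent_list)` could be written as `post_json(EMBED_URL, {"sentences": sent_list}, post=requests.post)`, and the other two helpers follow the same shape.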


In [18]:
print("Calling /train endpoint...")
train_resp = call_train(train_sentences)
print("Train response:", train_resp)


Calling /train endpoint...
Train response: {'message': 'Training complete'}
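
The embedding step that follows sends one sentence per request. Since the `/embed` payload is a list of sentences, several sentences could in principle share a request, which would reduce the number of HTTP round trips. The sketch below shows only the batching part. It assumes the endpoint accepts more than one sentence per call, which this notebook does not confirm, and `chunked` is a hypothetical helper name.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


batches = list(chunked(["s1", "s2", "s3", "s4", "s5"], 2))
print(batches)  # [['s1', 's2'], ['s3', 's4'], ['s5']]
```

Each batch could then be submitted as `executor.submit(call_embed, batch)`. A suitable batch size depends on the server and would need to be measured.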


In [19]:
results_embed = []
print("Calling /embed endpoint in parallel for embed sentences...")
with ThreadPoolExecutor(max_workers=5) as executor:
    future_to_sent = {executor.submit(call_embed, [sent]): sent for sent in embed_sentences}
    for future in as_completed(future_to_sent):
        sent = future_to_sent[future]
        data = future.result()
        results_embed.append((sent, data))
print(f"Completed embedding {len(results_embed)} sentences.")
print(f"Sample result: {results_embed[0]}")


Calling /embed endpoint in parallel for embed sentences...
Completed embedding 1678 sentences.
Sample result: ('He used the smallest possible circumference of the Earth and the greatest\neastward extension of Asia he could find in all the books available to him, and then\nexaggerated even those.', {'embeddings': [[0.016672559082508087, 0.11609943956136703, -0.027639606967568398, -0.005315684247761965, -0.07242153584957123, -0.04227735474705696, -0.013767463155090809, 0.11548683047294617, -0.03979802131652832, 0.05069296061992645, 0.007707843091338873, -0.007357427850365639, 0.004697429481893778, 0.010539188049733639, -0.03893382102251053, -0.0630641058087349, -0.06190153583884239, 0.05519114434719086, -0.07222773134708405, -0.0764242559671402, 0.009512334130704403, 0.0229574516415596, 0.06448265165090561, 0.026010476052761078, 0.03295792266726494, -0.025764580816030502, -0.0014381457585841417, -0.0995594784617424, 0.03767015412449837, 0.019046543166041374, -0.005719668231904507, 0.0411

Below are 20 example questions about *Cosmos* by Carl Sagan. We will send these to the `/search` endpoint in parallel to retrieve nearest neighbors from the index.
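
Each `/search` response in this notebook carries two parallel nested lists, `distances` and `indices`, with one inner list per query. A small helper can pair them up for easier reading. This is a sketch: the name `top_hits` is hypothetical, the sample is a shortened copy of one response printed in the output below, and the helper does not interpret what the indices refer to, since the notebook does not say.

```python
def top_hits(response):
    """Return (index, distance) pairs for the first query row of a /search response."""
    if "error" in response:
        # The call_* helpers return {"error": ...} on failure; treat that as no hits.
        return []
    indices = response["indices"][0]
    distances = response["distances"][0]
    return list(zip(indices, distances))


sample = {
    "distances": [[0.8279215097427368, 1.0027190446853638]],
    "indices": [[28, 27]],
}
print(top_hits(sample))  # [(28, 0.8279215097427368), (27, 1.0027190446853638)]
```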

In [20]:
questions = [
    "What inspired Carl Sagan to write Cosmos?",
    "How does Cosmos explain the formation of galaxies?",
    "Describe the role of Carl Sagan in popularizing science.",
    "What is the significance of the Pale Blue Dot in Cosmos?",
    "How does Cosmos discuss the concept of light travel time?",
    "Explain Sagan's perspective on the search for extraterrestrial life.",
    "What does Cosmos say about the cosmic calendar?",
    "How are black holes described in Cosmos?",
    "What examples of cosmic evolution are provided in Cosmos?",
    "How does Sagan illustrate the scale of the universe?",
    "What role do molecules play according to Cosmos?",
    "Describe the segment on planetary exploration in Cosmos.",
    "What is Carl Sagan's view on the future of humanity?",
    "How does Cosmos address the origin of life?",
    "Explain the trilogy of Cosmos episodes related to stars.",
    "What narrative does Cosmos give about ancient civilizations?",
    "How does Sagan describe the importance of skepticism?",
    "What warnings does Cosmos offer regarding environmental issues?",
    "How are supernovas explained in Cosmos?",
    "What is the importance of the cosmic perspective in Cosmos?"
]
len(questions)


20

In [24]:
results_search = []
print("Calling /search endpoint in parallel for questions...")
with ThreadPoolExecutor(max_workers=5) as executor:
    future_to_q = {executor.submit(call_search, q, 5): q for q in questions}
    for future in as_completed(future_to_q):
        q = future_to_q[future]
        data = future.result()
        results_search.append((q, data))
print(f"Completed search for {len(results_search)} questions.")

# Print search results (in completion order, which may differ from the order of `questions`)
for idx, (q, res) in enumerate(results_search, 1):
    print(f"Question {idx}: {q}")
    print(res)
    print('-' * 60)


Calling /search endpoint in parallel for questions...
Completed search for 20 questions.
Question 1: How does Cosmos explain the formation of galaxies?
{'distances': [[0.9216623306274414, 0.9452494382858276, 1.0770094394683838, 1.1771094799041748, 1.2502926588058472]], 'indices': [[65, 69, 102, 106, 85]]}
------------------------------------------------------------
Question 2: What inspired Carl Sagan to write Cosmos?
{'distances': [[0.8279215097427368, 1.0027190446853638, 1.0031050443649292, 1.0977003574371338, 1.1019446849822998]], 'indices': [[28, 27, 64, 70, 31]]}
------------------------------------------------------------
Question 3: Describe the role of Carl Sagan in popularizing science.
{'distances': [[1.110882043838501, 1.1625306606292725, 1.1923024654388428, 1.2349016666412354, 1.3604211807250977]], 'indices': [[33, 32, 56, 36, 41]]}
------------------------------------------------------------
Question 4: How does Cosmos discuss the concept of light travel time?
{'distances'