In [1]:
import cohere
import faiss

import numpy as np
import pandas as pd
from google.colab import userdata

Initializing the Cohere client with the API key stored in Colab's user secrets.

In [2]:
co = cohere.Client(userdata.get('COHERE_KEY'))

Chunking Interstellar's Wikipedia text.

In [3]:
text = """
Interstellar is a 2014 epic science fiction film co-written,
directed, and produced by Christopher Nolan.
It stars Matthew McConaughey, Anne Hathaway, Jessica Chastain,
Bill Irwin, Ellen Burstyn, Matt Damon, and Michael Caine.
Set in a dystopian future where humanity is struggling to
survive, the film follows a group of astronauts who travel
through a wormhole near Saturn in search of a new home for
mankind.
Brothers Christopher and Jonathan Nolan wrote the screenplay,
which had its origins in a script Jonathan developed in 2007.
Caltech theoretical physicist and 2017 Nobel laureate in
Physics [4] Kip Thorne was an executive producer, acted as a
scientific consultant, and wrote a tie-in book, The Science of
Interstellar.
Cinematographer Hoyte van Hoytema shot it on 35 mm movie film in
the Panavision anamorphic format and IMAX 70 mm.
Principal photography began in late 2013 and took place in
Alberta, Iceland, and Los Angeles.
Interstellar uses extensive practical and miniature effects and
the company Double Negative created additional digital effects.
Interstellar premiered on October 26, 2014, in Los Angeles.
In the United States, it was first released on film stock,
expanding to venues using digital projectors.
The film had a worldwide gross over $677 million (and $773
million with subsequent re-releases), making it the tenth-highest
grossing film of 2014.
It received acclaim for its performances, direction, screenplay,
musical score, visual effects, ambition, themes, and emotional
weight.
It has also received praise from many astronomers for its
scientific accuracy and portrayal of theoretical astrophysics.
Since its premiere, Interstellar gained a cult following,[5] and now is regarded by many sci-fi experts as one of the best
science-fiction films of all time.
Interstellar was nominated for five awards at the 87th Academy
Awards, winning Best Visual Effects, and received numerous other
accolades"""

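# Flatten newlines, split on periods, and drop empty fragments.
# Note that each chunk loses its trailing period.
texts = [t.strip() for t in text.replace('\n', ' ').split('.') if t.strip()]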

print(texts[0])

Interstellar is a 2014 epic science fiction film co-written, directed, and produced by Christopher Nolan
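
Splitting on every '.' works for this passage, but it drops the closing punctuation and would also break on decimals. A minimal sketch of an alternative, assuming sentences end with '.', '!' or '?' followed by whitespace; `split_sentences` and the sample string are hypothetical and not part of the notebook:

```python
import re

def split_sentences(raw: str) -> list[str]:
    """Split text into sentences, keeping the closing punctuation."""
    # Collapse newlines and repeated spaces into single spaces.
    flat = " ".join(raw.split())
    # Split only where sentence-ending punctuation is followed by whitespace,
    # so decimals such as "3.5" stay intact.
    parts = re.split(r"(?<=[.!?])\s+", flat)
    return [p for p in parts if p]

sample = """Version 3.5 shipped today.
It works! Does it scale?"""
chunks = split_sentences(sample)
print(chunks)
```

Abbreviations such as "Dr." would still be split under this rule, so it is only a small step up from a plain `split('.')`.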


Embedding the chunks and indexing them with FAISS.

In [4]:
response = co.embed(
    texts=texts,
    input_type="search_document",
).embeddings

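# Stack the embeddings into a (num_chunks, dim) array.
embeds = np.array(response)

# Exact (brute-force) index over squared L2 distance; FAISS expects float32.
dim = embeds.shape[1]
index = faiss.IndexFlatL2(dim)
index.add(np.float32(embeds))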
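
`IndexFlatL2` is an exact index: it compares the query against every stored vector and reports squared Euclidean distances. The NumPy sketch below reproduces that computation on toy 2-D vectors to make it concrete; `brute_force_l2` and the toy data are illustrative assumptions, not part of the notebook or of FAISS.

```python
import numpy as np

def brute_force_l2(vectors: np.ndarray, query: np.ndarray, k: int):
    """Return (squared distances, ids) of the k stored vectors nearest to query."""
    # Squared Euclidean distance from the query to every stored vector.
    sq_dists = np.sum((vectors - query) ** 2, axis=1)
    # Ids of the k smallest distances, closest first.
    ids = np.argsort(sq_dists)[:k]
    return sq_dists[ids], ids

# Toy data standing in for the chunk embeddings.
toy_vectors = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]], dtype=np.float32)
toy_query = np.array([1.0, 1.0], dtype=np.float32)

dists, ids = brute_force_l2(toy_vectors, toy_query, k=2)
print(ids, dists)
```

For a dozen chunks this scan is instant; approximate indexes only become worth considering at much larger collection sizes.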

Building a query function.

In [5]:
def search_query(query, number_of_results=3):
    """Embed the query and return the closest chunks with their distances."""
    query_embed = co.embed(
        texts=[query],
        input_type="search_query"
    ).embeddings[0]

    distances, similar_item_ids = index.search(
        np.float32([query_embed]),
        number_of_results
    )

    results = [texts[i] for i in similar_item_ids[0]]

    return results, distances[0]

Printing the most similar chunks (a lower distance means a closer match).

In [6]:
query = "how precise was the science in Interestelar"
retrieved_docs, distances = search_query(query)

for i, (doc, dist) in enumerate(zip(retrieved_docs, distances)):
    print(f"Rank {i+1} (dist: {dist:.2f}): {doc}")

Rank 1 (dist: 9129.67): Caltech theoretical physicist and 2017 Nobel laureate in Physics [4] Kip Thorne was an executive producer, acted as a scientific consultant, and wrote a tie-in book, The Science of Interstellar
Rank 2 (dist: 9431.79): It has also received praise from many astronomers for its scientific accuracy and portrayal of theoretical astrophysics
Rank 3 (dist: 10427.22): Since its premiere, Interstellar gained a cult following,[5] and now is regarded by many sci-fi experts as one of the best science-fiction films of all time
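
The distances above are in the thousands because `IndexFlatL2` returns squared Euclidean distances and the embeddings are evidently not unit-length (two unit vectors can never be more than 4 apart in squared distance). If the vectors were scaled to unit length first, squared L2 distance and cosine similarity would carry the same ranking information, since ||a - b||² = 2 - 2·cos(a, b). A small pure-Python check of that identity on made-up vectors; all helper names here are illustrative:

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up vectors with a large norm, standing in for raw embeddings.
a = [30.0, 40.0]
b = [40.0, 30.0]

raw = squared_l2(a, b)                          # depends on vector scale
unit = squared_l2(normalize(a), normalize(b))   # bounded between 0 and 4
cos = cosine(a, b)

print(raw, unit, 2 - 2 * cos)
```

The raw value grows with the magnitude of the vectors, while the normalized one does not, which is why normalized distances are easier to compare across queries.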


Creating and testing a mini RAG function.

In [7]:
def rag_chat(query):
    query_embed = co.embed(
        texts=[query],
        input_type="search_query"
    ).embeddings[0]

    distances, indices = index.search(
        np.array([query_embed], dtype=np.float32),
        k=3
    )

    retrieved_texts = [texts[i] for i in indices[0]]
    structured_docs = [{'text': t} for t in retrieved_texts]

    response = co.chat(
        message=query,
        documents=structured_docs
    )

    print(f"Query: {query}")
    print(f"Answer: {response.text}")

    if response.citations:
        print("\n--- Citations ---")
        for cite in response.citations:
            print(f"Ref: '{response.text[cite.start:cite.end]}' -> Supporting: {cite.text}")

rag_chat("Who helped with the physics in the movie?")

Query: Who helped with the physics in the movie?
Answer: Kip Thorne, a Caltech theoretical physicist and 2017 Nobel laureate in Physics, was an executive producer, acted as a scientific consultant, and wrote a tie-in book, *The Science of Interstellar*.

--- Citations ---
Ref: 'Kip Thorne' -> Supporting: Kip Thorne
Ref: 'Caltech theoretical physicist' -> Supporting: Caltech theoretical physicist
Ref: '2017 Nobel laureate in Physics' -> Supporting: 2017 Nobel laureate in Physics
Ref: 'executive producer' -> Supporting: executive producer
Ref: 'acted as a scientific consultant' -> Supporting: acted as a scientific consultant
Ref: 'wrote a tie-in book' -> Supporting: wrote a tie-in book
Ref: 'The Science of Interstellar' -> Supporting: The Science of Interstellar
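
`rag_chat` repeats the retrieval logic already in `search_query` and prints its result instead of returning it, which makes it awkward to reuse or test. One possible restructuring is sketched below with stand-in functions so that it runs without an API key. `answer_with_context`, `fake_retrieve`, and `fake_generate` are hypothetical names; in the notebook the retriever would wrap `search_query` and the generator would wrap `co.chat`.

```python
from typing import Callable

def answer_with_context(
    query: str,
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str, list[dict]], str],
) -> dict:
    """Retrieve supporting chunks, then ask the generator to answer from them."""
    chunks = retrieve(query)
    # Same document shape the notebook passes to the chat call: [{'text': ...}].
    documents = [{"text": chunk} for chunk in chunks]
    answer = generate(query, documents)
    return {"query": query, "documents": documents, "answer": answer}

# Stand-ins for the real retriever and chat model (assumptions for illustration).
def fake_retrieve(query: str) -> list[str]:
    return ["Kip Thorne acted as a scientific consultant"]

def fake_generate(query: str, documents: list[dict]) -> str:
    return f"Based on {len(documents)} document(s): {documents[0]['text']}"

result = answer_with_context(
    "Who helped with the physics in the movie?", fake_retrieve, fake_generate
)
print(result["answer"])
```

Returning a dictionary keeps printing separate from retrieval and generation, so each piece can be swapped or checked on its own.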
