In [52]:
from typing import *
import numpy as np
from sentence_transformers import SentenceTransformer, models
import math

In [53]:
# Document which will be stored in the vector space
document = """The concept of an expanding universe revolutionized our understanding of cosmology and the nature of the cosmos. This groundbreaking idea was first proposed by Georges Lemaître in the 1920s and later substantiated by Edwin Hubble's observations. Hubble discovered that galaxies are moving away from us at velocities proportional to their distances, a phenomenon now known as Hubble's Law. This observation provided compelling evidence for the Big Bang theory, suggesting that the universe had a beginning and has been expanding ever since. The expansion is not into empty space but rather involves the stretching of space itself, causing galaxies to move apart over time.

At the core of this discovery is the redshift of light from distant galaxies. As galaxies recede, the light they emit shifts toward the red end of the spectrum, an effect analogous to the Doppler shift in sound. This redshift is a direct indication of the galaxies' velocities relative to us. The farther away a galaxy is, the faster it appears to be moving away. This relationship, quantified in Hubble's Law, has profound implications for understanding the size, age, and ultimate fate of the universe. The current rate of expansion, known as the Hubble constant, allows astronomers to estimate the age of the universe at approximately 13.8 billion years.

The expansion of the universe also has significant implications for the future. One key question is whether the expansion will continue indefinitely or eventually reverse. The fate of the universe depends on its total mass and the mysterious dark energy that constitutes about 68 percentage of the cosmos. Dark energy, a form of energy that permeates all of space, is believed to be responsible for the accelerated expansion of the universe observed in recent decades. If dark energy continues to drive the expansion, the universe could expand forever, leading to a cold, dark, and dilute state known as the "heat death." Conversely, if gravity were to overcome dark energy, the universe might eventually collapse in a "Big Crunch."

The discovery of the expanding universe has opened up new avenues of research and raised fundamental questions about the nature of space, time, and matter. It challenges our understanding of physics and compels scientists to develop new theories to explain the observed phenomena. As we continue to probe deeper into the cosmos with advanced telescopes and technologies, we gain more insights into the mechanisms driving the expansion and the ultimate fate of the universe. The expanding universe remains a central theme in cosmology, shaping our quest to unravel the mysteries of the cosmos and our place within it."""

In [54]:
# Model which we will be using for embedding the input document
model = SentenceTransformer('sentence-transformers/all-MiniLM-L12-v2')

In [55]:
# document splitting and embedding

docs = document.split('\n')
embedding = model.encode(docs)

In [56]:
print(type(embedding))

<class 'numpy.ndarray'>


In [57]:
def cosine(embed):
    a = np.linalg.norm(embedding, axis=1)
    b = np.linalg.norm(embed)
    return np.dot(embedding, embed) / a * b

def euclidean(embed):
    distance: np.ndarray = np.sqrt((embedding - embed) ** 2).sum(axis=1)
    print("Euclidean distance function output: ", distance)
    return distance

def get_topk_similar_results(query_embedding, k, similarity_metric):
    assert query_embedding.ndim == 1
    if similarity_metric == "cosine":
        array = cosine(query_embedding)
        sorted_array = np.argsort(array)
        topk = sorted_array[::-1][:k]
    elif similarity_metric == "euclidean":
        array = euclidean(query_embedding)
        sorted_array = np.argsort(array)
        print(sorted_array)
        topk = sorted_array[:k]

    print(topk)
    topk_docs = []
    for i in topk:
        topk_docs.append(docs[i])
    topk_distances = array[topk]
    print(topk_docs)

    return topk_docs, topk_distances


In [60]:
# Searching similar
query = "WHat does the author say about the implications for the future?"
query_embedding = model.encode(query)
res, b = get_topk_similar_results(query_embedding, 2, "cosine")

[4 6]
['The expansion of the universe also has significant implications for the future. One key question is whether the expansion will continue indefinitely or eventually reverse. The fate of the universe depends on its total mass and the mysterious dark energy that constitutes about 68 percentage of the cosmos. Dark energy, a form of energy that permeates all of space, is believed to be responsible for the accelerated expansion of the universe observed in recent decades. If dark energy continues to drive the expansion, the universe could expand forever, leading to a cold, dark, and dilute state known as the "heat death." Conversely, if gravity were to overcome dark energy, the universe might eventually collapse in a "Big Crunch."', 'The discovery of the expanding universe has opened up new avenues of research and raised fundamental questions about the nature of space, time, and matter. It challenges our understanding of physics and compels scientists to develop new theories to explain

In [61]:
print(res)

['The expansion of the universe also has significant implications for the future. One key question is whether the expansion will continue indefinitely or eventually reverse. The fate of the universe depends on its total mass and the mysterious dark energy that constitutes about 68 percentage of the cosmos. Dark energy, a form of energy that permeates all of space, is believed to be responsible for the accelerated expansion of the universe observed in recent decades. If dark energy continues to drive the expansion, the universe could expand forever, leading to a cold, dark, and dilute state known as the "heat death." Conversely, if gravity were to overcome dark energy, the universe might eventually collapse in a "Big Crunch."', 'The discovery of the expanding universe has opened up new avenues of research and raised fundamental questions about the nature of space, time, and matter. It challenges our understanding of physics and compels scientists to develop new theories to explain the o

In [62]:
res_2, b_2 = get_topk_similar_results(query_embedding, 2, "euclidean")

Euclidean distance function output:  [19.423164 18.868782 19.084885 18.868782 16.779388 18.868782 18.689133]
[4 6 1 3 5 2 0]
[4 6]
['The expansion of the universe also has significant implications for the future. One key question is whether the expansion will continue indefinitely or eventually reverse. The fate of the universe depends on its total mass and the mysterious dark energy that constitutes about 68 percentage of the cosmos. Dark energy, a form of energy that permeates all of space, is believed to be responsible for the accelerated expansion of the universe observed in recent decades. If dark energy continues to drive the expansion, the universe could expand forever, leading to a cold, dark, and dilute state known as the "heat death." Conversely, if gravity were to overcome dark energy, the universe might eventually collapse in a "Big Crunch."', 'The discovery of the expanding universe has opened up new avenues of research and raised fundamental questions about the nature of 