<a href="https://colab.research.google.com/github/pjvillasista/Quote-Retrieval-AI/blob/main/Quotes_VectorStore.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install openai kagglehub langchain transformers torch langchain-community chromadb sentence-transformers faiss-cpu google-generativeai --quiet

# Import Dependencies

In [None]:
import kagglehub
import google.generativeai as genai
import pandas as pd
from transformers import AutoTokenizer, AutoModel
import torch
import faiss
import numpy as np
import os


from google.colab import userdata
GEMINI_API_KEY = userdata.get('GEMINI_API_KEY')
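The `google.colab` import above only works inside Colab. A minimal sketch of a fallback for running the notebook elsewhere, assuming the key is also exported as an environment variable of the same name (the helper `load_api_key` is hypothetical and not part of the notebook):

```python
import os


def load_api_key(name="GEMINI_API_KEY"):
    """Return the secret from Colab if available, else from the environment."""
    try:
        # Only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        # Assumption: outside Colab the key is set as an environment variable.
        return os.environ.get(name)
    return userdata.get(name)
```

Outside Colab, a missing variable makes the function return `None`, so the caller can decide how to fail.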

# Download and Transform the Dataset

In [None]:
# Download the dataset
path = kagglehub.dataset_download("mattimansha/inspirational-quotes")
df = pd.read_csv(os.path.join(path,'insparation.csv'))

In [None]:
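# Keep only the category and quote columns.
# .copy() makes an independent DataFrame, so adding columns later
# does not raise pandas' SettingWithCopyWarning.
df_filtered = df[['Category','Quote']].copy()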

# Create Embeddings

In [None]:
model_name = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)


In [None]:
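# Generate a sentence embedding by mean-pooling the model's token embeddings
def get_embedding(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():  # inference only, so skip gradient tracking
        outputs = model(**inputs)
    # Average over the token dimension to get one 1-D vector per text
    embeddings = outputs.last_hidden_state.mean(dim=1).squeeze().cpu().numpy()
    return embeddings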
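
`get_embedding` averages over every token position. That is fine for one text at a time, because a single input gets no padding. If the function were called on a batch, the padded positions would be averaged in as well. The following is a dependency-free sketch of mask-aware mean pooling. The helper name `masked_mean_pool` and the toy vectors are illustrative only, not part of the notebook:

```python
def masked_mean_pool(token_vectors, attention_mask):
    """Average token vectors, skipping positions where the mask is 0."""
    dim = len(token_vectors[0])
    totals = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_vectors, attention_mask):
        if keep:
            kept += 1
            for j, value in enumerate(vector):
                totals[j] += value
    # Guard against an all-padding row to avoid dividing by zero.
    return [total / max(kept, 1) for total in totals]


# Two real tokens followed by one padding position.
tokens = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
mask = [1, 1, 0]
print(masked_mean_pool(tokens, mask))  # [2.0, 3.0]
```

A plain mean over all three positions would be pulled toward the padding vector instead.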

In [None]:
# Generate embeddings for each quote
df_filtered['Embeddings'] = df_filtered['Quote'].apply(get_embedding)

In [None]:
df_filtered.head()

Unnamed: 0,Category,Quote,Embeddings
0,LOVE,Let us see what love can do.,"[-0.36089113, 0.05247897, 0.051789574, -0.4663..."
1,LOVE,We can’t heal the world today. But we can begi...,"[-0.3255086, 0.15255906, 0.23945698, 0.0073539..."
2,LISTENING,Listen with curiosity. Speak with honesty. Act...,"[-0.16999228, 0.07081837, 0.15499736, -0.43336..."
3,LISTENING,The most basic and powerful way to connect to ...,"[0.028896725, -0.41765365, -0.13275643, -0.184..."
4,LISTENING,"Knowledge speaks, but wisdom listens.","[0.512262, 0.34960765, -0.19285986, 0.07507623..."


In [None]:
# Stack the per-row embeddings into a 2-D float32 NumPy array, the dtype FAISS expects
embedding_matrix = np.vstack(df_filtered['Embeddings'].values).astype('float32')

# Create FAISS index
d = embedding_matrix.shape[1]  # dimension of the embeddings
index = faiss.IndexFlatL2(d)   # build a flat (L2) index
index.add(embedding_matrix)    # add the embeddings to the index
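
`IndexFlatL2` performs an exhaustive search: it compares the query against every stored vector and reports squared L2 distances. A minimal pure-Python sketch of the same idea follows. The function `brute_force_l2_search` and the toy vectors are illustrative, and this is not how FAISS is implemented internally:

```python
def brute_force_l2_search(vectors, query, top_n=1):
    """Return (distances, indices) of the top_n closest vectors by squared L2."""
    scored = []
    for i, vector in enumerate(vectors):
        # Squared Euclidean distance, with no square root taken.
        dist = sum((a - b) ** 2 for a, b in zip(vector, query))
        scored.append((dist, i))
    scored.sort()  # smallest distance first
    top = scored[:top_n]
    return [d for d, _ in top], [i for _, i in top]


vectors = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
distances, indices = brute_force_l2_search(vectors, [1.0, 2.0], top_n=2)
print(distances, indices)  # [1.0, 5.0] [1, 0]
```

The flat index gives exact results at the cost of scanning every vector, which is reasonable for a dataset of this size.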

In [None]:
# Retrieve the most relevant quotes for a query
def get_relevant_quote(query, top_n=1):
    query_embedding = get_embedding(query)
    distances, indices = index.search(np.array([query_embedding]), top_n)
    results = df_filtered.iloc[indices[0]]
    return results

In [None]:
query = "I feel unmotivated"
results = get_relevant_quote(query, top_n=1)

In [None]:
# Print the retrieved quotes
print("Retrieved Quotes:")
for i, quote in enumerate(results['Quote']):
    print(f"Quote {i+1}: {quote}")

Retrieved Quotes:
Quote 1: Desire is the key to motivation, but it’s determination and commitment to an unrelenting pursuit of your goal - a commitment to excellence - that will enable you to attain the success you seek. 


# Gemini API

In [None]:
# Configure Google Gemini
genai.configure(api_key=GEMINI_API_KEY)
gemini_model = genai.GenerativeModel("gemini-1.5-flash")

In [None]:
# Ask Gemini for a coaching-style reply grounded in the retrieved quotes
def generate_gemini_response(query, relevant_quotes):
    # Build the prompt from the user query and the retrieved quotes
    prompt = f"Act like a motivational and life coach. Given the query '{query}', and the following relevant quotes: {relevant_quotes}. Provide an insightful and succinct response."
    gemini_response = gemini_model.generate_content(prompt)
    return gemini_response.text
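
When `top_n` is greater than 1, the quotes in the prompt can be easier for the model to tell apart if they are numbered and placed on separate lines. The following is a sketch of one possible layout. The helper `build_prompt` and its wording are assumptions for illustration, not the prompt used in this notebook:

```python
def build_prompt(query, quotes):
    """Lay out the query and a numbered list of quotes as a single prompt string."""
    numbered = "\n".join(
        f"{i}. {quote.strip()}" for i, quote in enumerate(quotes, start=1)
    )
    return (
        "Act like a motivational and life coach.\n"
        f"Query: {query}\n"
        f"Relevant quotes:\n{numbered}\n"
        "Provide an insightful and succinct response."
    )


print(build_prompt("I feel unmotivated", ["Let us see what love can do. "]))
```

The returned string could be passed to `gemini_model.generate_content` in place of the inline f-string.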

In [None]:
# Format the retrieved quotes for the Gemini prompt
retrieved_quotes = "\n".join(results['Quote'].tolist())

In [None]:
# Generate a response using Gemini
gemini_response = generate_gemini_response(query, retrieved_quotes)

In [None]:
print(gemini_response)

You're feeling unmotivated, and that's okay.  We all hit those moments. But remember,  **desire is the spark that ignites motivation.**  What are you truly passionate about? What lights you up?  Once you identify that, **determination and commitment** are your fuel.  Don't just dream it, chase it relentlessly.  Commit to excellence, and the success you seek will be yours.  Start small, take action, and watch the motivation grow.  You've got this! 



# Response Formatting

In [None]:
from IPython.display import Markdown, display
import textwrap


def to_markdown(text):
    text = text.replace("•", "  *")
    return Markdown(textwrap.indent(text, "> ", predicate=lambda _: True))
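
To see what `to_markdown` hands to the renderer, the same transformation can be run without IPython. This sketch returns the plain string instead of a `Markdown` object. The name `quote_block` and the sample text are illustrative:

```python
import textwrap


def quote_block(text):
    """Turn bullet characters into Markdown list items and blockquote every line."""
    text = text.replace("•", "  *")
    # predicate=lambda _: True also prefixes blank lines, so the quote stays one block.
    return textwrap.indent(text, "> ", predicate=lambda _: True)


sample = "First point\n\n• a bullet"
print(quote_block(sample))
```

Without the always-true predicate, `textwrap.indent` would skip whitespace-only lines and the blockquote would be split at each blank line.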

In [None]:
to_markdown(gemini_response)

> You're feeling unmotivated, and that's okay.  We all hit those moments. But remember,  **desire is the spark that ignites motivation.**  What are you truly passionate about? What lights you up?  Once you identify that, **determination and commitment** are your fuel.  Don't just dream it, chase it relentlessly.  Commit to excellence, and the success you seek will be yours.  Start small, take action, and watch the motivation grow.  You've got this! 


In [None]:
retrieved_quotes

'Desire is the key to motivation, but it’s determination and commitment to an unrelenting pursuit of your goal - a commitment to excellence - that will enable you to attain the success you seek. '