## Creating Vector Store

In [1]:
!pip install -q langchain-community langchain-google-genai faiss-cpu

In [2]:
# Importing libraries and setting up API keys
import os
import warnings
import pandas as pd
from dotenv import load_dotenv
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# Load the .env file (this populates os.environ, including GOOGLE_API_KEY)
load_dotenv()

# Fail early with a clear message if the API key is missing
if not os.getenv("GOOGLE_API_KEY"):
    raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file")

# Handling warnings
warnings.filterwarnings('ignore')

# Google Embedding Model
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

In [3]:
# loading & exploring data
books = pd.read_csv('books_with_emotion_scores.csv')
books.head()

Unnamed: 0,isbn13,isbn10,title,authors,categories,thumbnail,description,published_year,average_rating,num_pages,...,title_and_subtitle,tagged_description,broad_category,fear,neutral,sadness,surprise,disgust,joy,anger
0,9780002005883,2005883,Gilead,Marilynne Robinson,Fiction,http://books.google.com/books/content?id=KQZCP...,A NOVEL THAT READERS and critics have been eag...,2004.0,3.85,247.0,...,Gilead,9780002005883 A NOVEL THAT READERS and critics...,Fiction,0.654841,0.169852,0.116409,0.020701,0.019101,0.015161,0.003935
1,9780002261982,2261987,Spider's Web,Charles Osborne;Agatha Christie,Detective and mystery stories,http://books.google.com/books/content?id=gA5GP...,A new 'Christie for Christmas' -- a full-lengt...,2000.0,3.83,241.0,...,Spider's Web: A Novel,9780002261982 A new 'Christie for Christmas' -...,Fiction,0.755521,0.050591,0.08562,0.068844,0.018019,0.003123,0.018282
2,9780006178736,6178731,Rage of angels,Sidney Sheldon,Fiction,http://books.google.com/books/content?id=FKo2T...,"A memorable, mesmerizing heroine Jennifer -- b...",1993.0,3.93,512.0,...,Rage of angels,"9780006178736 A memorable, mesmerizing heroine...",Fiction,0.939291,0.007241,0.002299,0.003145,0.005369,0.018979,0.023676
3,9780006280897,6280897,The Four Loves,Clive Staples Lewis,Christian life,http://books.google.com/books/content?id=XhQ5X...,Lewis' work on the nature of love divides love...,2002.0,4.15,170.0,...,The Four Loves,9780006280897 Lewis' work on the nature of lov...,Non Fiction,0.230528,0.201328,0.027787,0.004284,0.198185,0.005105,0.332783
4,9780006280934,6280935,The Problem of Pain,Clive Staples Lewis,Christian life,http://books.google.com/books/content?id=Kk-uV...,"""In The Problem of Pain, C.S. Lewis, one of th...",2002.0,4.09,176.0,...,The Problem of Pain,"9780006280934 ""In The Problem of Pain, C.S. Le...",Non Fiction,0.00475,0.854798,0.015526,0.004517,0.068829,0.029622,0.021958


In [4]:
# tagged_description prefixes each description with the book's isbn13, so that after
# retrieving the most similar descriptions we can look up the full book record by ISBN
books['tagged_description']

Unnamed: 0,tagged_description
0,9780002005883 A NOVEL THAT READERS and critics...
1,9780002261982 A new 'Christie for Christmas' -...
2,"9780006178736 A memorable, mesmerizing heroine..."
3,9780006280897 Lewis' work on the nature of lov...
4,"9780006280934 ""In The Problem of Pain, C.S. Le..."
...,...
5053,9788172235222 On A Train Journey Home To North...
5054,9788173031014 This book tells the tale of a ma...
5055,9788179921623 Wisdom to Create a Life of Passi...
5056,9788185300535 This collection of the timeless ...
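
A minimal sketch of how a column like `tagged_description` could be built and read back, assuming it is simply the `isbn13` followed by a space and the description. The rows above suggest this, but the construction step is not shown in this notebook. The toy DataFrame and its values are made up for illustration.

```python
import pandas as pd

# Toy data standing in for the real books table (values are illustrative only)
toy_books = pd.DataFrame({
    "isbn13": [9780000000001, 9780000000002],
    "description": ["A story about a lighthouse.", "A field guide to garden birds."],
})

# Assumed construction: "<isbn13> <description>"
toy_books["tagged_description"] = (
    toy_books["isbn13"].astype(str) + " " + toy_books["description"]
)

# Reading the ISBN back: it is the first whitespace-separated token
recovered = toy_books["tagged_description"].str.split().str[0].astype("int64")
```

Keeping the ISBN as the first token is what lets the later cells map a retrieved chunk back to a full row of `books`.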


In [5]:
# Write one tagged description per line to a text file; this file is what gets embedded for retrieval
books['tagged_description'].to_csv('books_tagged_description.txt', sep='\n', index=False, header=False)
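
The later steps rely on every line of this file being exactly one book. A hedged alternative sketch, using plain file writing instead of `to_csv`: it collapses whitespace runs (including embedded newlines) inside a description so a record cannot spill onto a second line, and it adds no CSV quoting. The helper name `write_one_per_line` is hypothetical.

```python
from pathlib import Path
import tempfile

def write_one_per_line(descriptions, path):
    """Write each description on its own line, collapsing internal whitespace."""
    with open(path, "w", encoding="utf-8") as f:
        for text in descriptions:
            # " ".join(split()) turns newlines, tabs and repeated spaces into single spaces
            f.write(" ".join(str(text).split()) + "\n")

# Small demonstration with made-up records
demo = ['9780000000001 First line\nsecond line', '9780000000002 "Quoted" text']
out_path = Path(tempfile.mkdtemp()) / "demo_tagged_description.txt"
write_one_per_line(demo, out_path)
lines = out_path.read_text(encoding="utf-8").splitlines()
```

With the real data this could be called as `write_one_per_line(books['tagged_description'], 'books_tagged_description.txt')`.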

In [6]:
# loading the Book Description file
raw_document = TextLoader('books_tagged_description.txt', encoding='utf-8').load()

In [7]:
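# Split the raw document into one chunk per line (one book description each).
# chunk_size=1 behaves like chunk_size=0 here, since every line is longer than the limit;
# some langchain-text-splitters releases reject non-positive chunk sizes.
text_splitter = CharacterTextSplitter(separator='\n', chunk_size=1, chunk_overlap=0)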
documents = text_splitter.split_documents(raw_document)

In [8]:
documents[0]

Document(metadata={'source': 'books_tagged_description.txt'}, page_content='9780002005883 A NOVEL THAT READERS and critics have been eagerly anticipating for over a decade, Gilead is an astonishingly imagined story of remarkable lives. John Ames is a preacher, the son of a preacher and the grandson (both maternal and paternal) of preachers. It’s 1956 in Gilead, Iowa, towards the end of the Reverend Ames’s life, and he is absorbed in recording his family’s story, a legacy for the young son he will never see grow up. Haunted by his grandfather’s presence, John tells of the rift between his grandfather and his father: the elder, an angry visionary who fought for the abolitionist cause, and his son, an ardent pacifist. He is troubled, too, by his prodigal namesake, Jack (John Ames) Boughton, his best friend’s lost son who returns to Gilead searching for forgiveness and redemption. Told in John Ames’s joyous, rambling voice that finds beauty, humour and truth in the smallest of life’s detai
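
Because the separator is a newline and the chunk size is smaller than any description, the splitter ends up producing one chunk per line of the file, as the first document above shows. The plain-Python sketch below imitates that outcome on made-up text; it is a simplification for illustration, not the `CharacterTextSplitter` implementation.

```python
def split_one_per_line(text: str, separator: str = "\n") -> list[str]:
    """Split on the separator and drop empty pieces: one chunk per non-empty line."""
    return [piece.strip() for piece in text.split(separator) if piece.strip()]

sample_text = "9780000000001 First description.\n9780000000002 Second description.\n"
chunks = split_one_per_line(sample_text)

# Sanity check: each chunk should start with its 13-digit ISBN tag
all_tagged = all(
    chunk.split()[0].isdigit() and len(chunk.split()[0]) == 13 for chunk in chunks
)
```

A similar check on the real `documents` list (comparing `len(documents)` with `len(books)`) is a cheap way to confirm that no description was split or merged.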

In [12]:
# Create FAISS vector store from documents
faiss_db = FAISS.from_documents(
    documents=documents,
    embedding=embeddings
)

# Specify a directory to save the FAISS index and metadata
faiss_dir = "faiss_books_store"
os.makedirs(faiss_dir, exist_ok=True)

# Save FAISS vector index
faiss_db.save_local(faiss_dir)
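
Conceptually, a similarity search embeds the query and returns the stored vectors closest to it. The sketch below does this by brute force with squared Euclidean distance on tiny hand-made 2-D vectors. The vectors, texts, and the helper `nearest` are invented for illustration; a real FAISS index works on the high-dimensional embeddings returned by the model.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(query_vec, stored, k=2):
    """Return the k stored texts whose vectors are closest to query_vec."""
    ranked = sorted(stored, key=lambda item: squared_l2(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]

# (text, vector) pairs; the numbers are made up
stored = [
    ("9780000000001 nature book", [0.9, 0.1]),
    ("9780000000002 crime novel", [0.1, 0.9]),
    ("9780000000003 animal facts", [0.8, 0.3]),
]
top = nearest([1.0, 0.0], stored, k=2)
```

The result keeps the closest item first, which is the ordering a similarity search is expected to return.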

## Loading & Processing Vector Data

In [13]:
# Loading the saved vector database
import pandas as pd
from langchain_community.vectorstores import FAISS
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# Loading data
books = pd.read_csv('books_with_emotion_scores.csv')

# Load the same embedding model
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

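# Load FAISS vector store from local directory.
# allow_dangerous_deserialization=True is required because the docstore is pickled;
# only enable it for index files you created yourself.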
faiss_dir = "faiss_books_store"
db_books = FAISS.load_local(faiss_dir, embeddings, allow_dangerous_deserialization=True)
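
Before loading, it can help to check that the saved store is actually on disk. The sketch assumes the default file names `index.faiss` and `index.pkl` that `save_local` writes when no `index_name` is given; the helper `faiss_store_exists` is hypothetical.

```python
from pathlib import Path
import tempfile

def faiss_store_exists(directory, index_name="index"):
    """True if both files of a locally saved LangChain FAISS store are present."""
    folder = Path(directory)
    return (folder / f"{index_name}.faiss").is_file() and (folder / f"{index_name}.pkl").is_file()

# Demonstration on a temporary directory with empty placeholder files
tmp = Path(tempfile.mkdtemp())
before = faiss_store_exists(tmp)
(tmp / "index.faiss").touch()
(tmp / "index.pkl").touch()
after = faiss_store_exists(tmp)
```

In the notebook this could guard the load step, for example by rebuilding the index only when `faiss_store_exists(faiss_dir)` is false.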

In [14]:
query = "A book to teach children about the nature"
docs = db_books.similarity_search(query, k=10)

# The first token of each chunk is the isbn13; strip the double quote that to_csv may have added
isbn13 = [int(doc.page_content.split()[0].strip('"')) for doc in docs]
isbn13

[9780786808069,
 9780786808717,
 9780753459645,
 9780067575208,
 9780763620875,
 9780142302279,
 9780060782139,
 9780786855384,
 9780789458209,
 9780786808397]
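
The list comprehension above assumes every chunk starts with an ISBN, optionally preceded by a double quote. A slightly more defensive sketch uses a regular expression and skips chunks that do not match; the helper name `extract_isbn13` and the sample strings are illustrative.

```python
import re

# A 13-digit number at the start of the text, optionally after a double quote
ISBN13_AT_START = re.compile(r'^\s*"?(\d{13})\b')

def extract_isbn13(page_content: str):
    """Return the leading 13-digit ISBN as an int, or None if there is none."""
    match = ISBN13_AT_START.match(page_content)
    return int(match.group(1)) if match else None

samples = [
    '9780002005883 A NOVEL THAT READERS...',
    '"9780006178736 A memorable, mesmerizing heroine..."',
    'no isbn here',
]
isbns = [isbn for isbn in map(extract_isbn13, samples) if isbn is not None]
```

Skipping non-matching chunks avoids a `ValueError` from `int()` if a malformed line ever ends up in the index.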

In [15]:
books[books['isbn13'].isin(isbn13)]['title']

Unnamed: 0,title
226,Time For Kids: Butterflies!
436,The Sense of Wonder
798,Dirty Beasts
3429,I Wonder Why the Sun Rises
3486,Judy Moody Saves the World!
3644,Baby Einstein: Neighborhood Animals
3647,Baby Einstein: Dogs
3648,Baby Einstein: What Does Violet See? Raindrops...
3667,Disney's Little Einsteins: Butterfly Suits
3694,Tree


In [16]:
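# Function to retrieve book recommendations
def retrieve_semantic_recommendation(query: str, top_k: int = 10) -> pd.DataFrame:
    # Fetch a larger candidate pool (50) than the number of rows returned
    recommendation = db_books.similarity_search(query, k=50)

    # The first token of each chunk is the isbn13; strip the quote that to_csv may add
    books_list = [int(doc.page_content.split()[0].strip('"')) for doc in recommendation]

    # Note: isin() keeps the DataFrame's row order, so head(top_k) returns the first
    # top_k candidates in dataset order, not the top_k most similar ones
    return books[books['isbn13'].isin(books_list)].head(top_k)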
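
Filtering with `isin` returns rows in the DataFrame's own order, so the similarity ranking from the search is lost. If the ranking matters, one option is to reorder the matching rows by the ISBN list. The sketch below shows this on a toy table; the helper `rows_in_ranked_order` and the data are hypothetical, not part of the notebook.

```python
import pandas as pd

def rows_in_ranked_order(df: pd.DataFrame, ranked_isbns: list, top_k: int = 10) -> pd.DataFrame:
    """Return up to top_k rows of df, ordered as in ranked_isbns (best match first)."""
    # Drop duplicate ISBNs while keeping the first (highest-ranked) occurrence
    unique_ranked = list(dict.fromkeys(ranked_isbns))
    indexed = df.drop_duplicates("isbn13").set_index("isbn13")
    # Keep only ISBNs that exist in the table, in ranked order
    present = [isbn for isbn in unique_ranked if isbn in indexed.index]
    return indexed.loc[present[:top_k]].reset_index()

toy = pd.DataFrame({"isbn13": [1, 2, 3, 4], "title": ["A", "B", "C", "D"]})
ranked = rows_in_ranked_order(toy, ranked_isbns=[3, 1, 4], top_k=2)
```

Here `ranked` lists title "C" before "A", following the order of `ranked_isbns` rather than the order of rows in `toy`.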

In [18]:
retrieve_semantic_recommendation("A book to teach children about nature", top_k=10)

Unnamed: 0,isbn13,isbn10,title,authors,categories,thumbnail,description,published_year,average_rating,num_pages,...,title_and_subtitle,tagged_description,broad_category,fear,neutral,sadness,surprise,disgust,joy,anger
59,9780007151240,0007151241,The Family Way,Tony Parsons,Parenthood,http://books.google.com/books/content?id=dJEIx...,It should be the most natural thing in the wor...,2005.0,3.51,400.0,...,The Family Way,9780007151240 It should be the most natural th...,Non Fiction,0.004657,0.26829,0.029999,0.008528,0.32058,0.012199,0.355747
105,9780060256579,0060256575,The Missing Piece Meets the Big O,Shel Silverstein,Juvenile Fiction,http://books.google.com/books/content?id=-m4gw...,The missing piece sat alone waiting for someon...,1981.0,4.33,104.0,...,The Missing Piece Meets the Big O,9780060256579 The missing piece sat alone wait...,Children Fiction,0.00485,0.893323,0.023256,0.006268,0.048882,0.016201,0.007221
226,9780060782139,0060782137,Time For Kids: Butterflies!,Editors of TIME For Kids,Juvenile Nonfiction,http://books.google.com/books/content?id=OdZxn...,"Butterflies There are 20,000 different kinds o...",2006.0,4.0,32.0,...,Time For Kids: Butterflies!,"9780060782139 Butterflies There are 20,000 dif...",Children Non Fiction,0.015704,0.244437,0.00568,0.602652,0.012127,0.075343,0.044056
401,9780064403870,0064403874,"R-T, Margaret, and the Rats of NIMH",Jane Leslie Conly,Juvenile Fiction,http://books.google.com/books/content?id=WTHHH...,"When Margaret and her younger brother, Artie, ...",1991.0,3.52,272.0,...,"R-T, Margaret, and the Rats of NIMH",9780064403870 When Margaret and her younger br...,Children Fiction,0.078056,0.773385,0.094225,0.007528,0.006754,0.01418,0.025872
414,9780064408677,0064408671,The Trumpet of the Swan,E. B. White,Juvenile Fiction,http://books.google.com/books/content?id=2lybT...,"Swan Song Like the rest of his family, Louis i...",2000.0,4.07,252.0,...,The Trumpet of the Swan,9780064408677 Swan Song Like the rest of his f...,Children Fiction,0.00535,0.528684,0.079504,0.263945,0.044736,0.017332,0.06045
423,9780064434980,0064434982,The Deer in the Wood,Laura Ingalls Wilder,Juvenile Fiction,http://books.google.com/books/content?id=V7YDW...,Even the youngest child can enjoy a special ad...,1999.0,4.17,32.0,...,The Deer in the Wood,9780064434980 Even the youngest child can enjo...,Children Fiction,0.001565,0.563328,0.009096,0.007114,0.028798,0.383336,0.006765
436,9780067575208,006757520X,The Sense of Wonder,Rachel Carson,Nature,http://books.google.com/books/content?id=Zee5S...,"First published more than three decades ago, t...",1998.0,4.39,112.0,...,The Sense of Wonder,9780067575208 First published more than three ...,Non Fiction,0.015527,0.240659,0.011032,0.030308,0.014934,0.68332,0.004219
703,9780140621624,0140621628,The Railway Children,E. Nesbit,Fiction,http://books.google.com/books/content?id=fFesd...,"When their father is sent away to prison, thre...",1995.0,4.0,212.0,...,The Railway Children,9780140621624 When their father is sent away t...,Fiction,0.00547,0.059792,0.702477,0.001648,0.132486,0.03676,0.061368
752,9780141186078,0141186070,The Log from the Sea of Cortez,John Steinbeck,Biography & Autobiography,http://books.google.com/books/content?id=9CrIf...,This light-hearted journal tells of John Stein...,2001.0,3.84,288.0,...,The Log from the Sea of Cortez,9780141186078 This light-hearted journal tells...,Non Fiction,0.000561,0.017309,0.008801,0.003068,0.003476,0.965842,0.000943
798,9780142302279,0142302279,Dirty Beasts,Roald Dahl,Juvenile Nonfiction,,Poems tell the stories of a smart pig who outw...,2002.0,4.02,32.0,...,Dirty Beasts,9780142302279 Poems tell the stories of a smar...,Children Non Fiction,0.004527,0.053778,0.002497,0.001855,0.906768,0.001518,0.029058
