# Generate Answers with LLM using RAG

Two approaches are compared:

1. Without LangChain:
- Build the prompt as context + query:
    - Manually retrieve docs using a metadata filter and a retrieval method such as 'mmr'
    - If the method is 'mmr', filter the retrieved docs against a similarity / relevance threshold
    - Rerank if required
- Pass the prompt to the LLM call

2. With LangChain:
- Create a base retriever from the metadata filter and a retrieval method such as 'mmr'
- Rerank if rerank==True
- Invoke the chain with the query
In [12]:
from codes.retrieve_docs import RetrieveDocs, ReRanking
from codes.generate_w_rag import LlmWithManualRag, LlmWithRag

## Query

In [13]:
metadata_keys = ['title', 'source', 'data_type', 'topic']
for key in metadata_keys: 
    metadata_key = Docs2VectorDb.sources_from_vdb(vector_store_multi, key)
    print('\n')
    print(metadata_key)



{'title': {'marketing.txt', 'qna_table.csv', 'Luminate Report Builder.pdf', 'Luminate Report Builder.docx'}}


{'source': {'data/multi_docs/Luminate Report Builder.docx', 'data/multi_docs/qna_table.csv', 'data/multi_docs/marketing.txt', 'data/multi_docs/Luminate Report Builder.pdf'}}


{'data_type': {'dataframe', 'word document', 'pdf', 'txt'}}


{'topic': {'marketing, toys', 'RB, Luminate', 'qna on topics like RB, luminate'}}


In [14]:
query = 'How have the toy stores changed over the years?'

### Explicitly filter on metadata and generate response based on query

#### Filter and retrieve documents based on query

In [15]:
metadata_filt = {
    'filter': {
        '$and': [
            {'title': {'$eq':'marketing.txt'}},
            {'data_type': {'$eq':'txt'}},
            # {'data_type': {'$in':['txt', 'dataframe']}},
            # {'topic': {'$eq':'RB, Luminate'}},
            ]
        }
    }

# search_kwargs={
#         'k': 4,
#         'fetch_k': 20,
# }

search_kwargs={
        'k': 20,
        'fetch_k': 100,
}


search_kwargs.update(metadata_filt)
print(search_kwargs)

{'k': 20, 'fetch_k': 100, 'filter': {'$and': [{'title': {'$eq': 'marketing.txt'}}, {'data_type': {'$eq': 'txt'}}]}}
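
The `filter` entry uses Mongo-style operators (`$and`, `$eq`, `$in`), a syntax used by vector stores such as Chroma. To make the semantics concrete, here is a rough pure-Python sketch of how such a filter could be evaluated against one document's metadata. `matches` and `_compare` are hypothetical helpers for illustration only; the store behind `vector_store_multi` evaluates the filter internally and may support more operators or behave differently.

```python
def _compare(value, op, target) -> bool:
    """Evaluate a single operator against a metadata value."""
    if op == "$eq":
        return value == target
    if op == "$ne":
        return value != target
    if op == "$in":
        return value in target
    raise ValueError(f"Unsupported operator: {op}")


def matches(metadata: dict, flt: dict) -> bool:
    """Return True if `metadata` satisfies the filter (subset: $and, $or, $eq, $ne, $in)."""
    for key, cond in flt.items():
        if key == "$and":
            ok = all(matches(metadata, sub) for sub in cond)
        elif key == "$or":
            ok = any(matches(metadata, sub) for sub in cond)
        else:
            # `key` is a metadata field; `cond` maps operators to targets.
            value = metadata.get(key)
            ok = all(_compare(value, op, target) for op, target in cond.items())
        if not ok:
            return False
    return True


flt = {"$and": [{"title": {"$eq": "marketing.txt"}}, {"data_type": {"$eq": "txt"}}]}
doc_a = {"title": "marketing.txt", "data_type": "txt", "topic": "marketing, toys"}
doc_b = {"title": "qna_table.csv", "data_type": "dataframe"}

print(matches(doc_a, flt))  # True
print(matches(doc_b, flt))  # False
```

With the filter above, only chunks from `marketing.txt` remain candidates for retrieval.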


In [16]:
# Note: despite the variable name, no reranking happens at this step (rerank=False).
docs_retrvd_w_reranking = RetrieveDocs.main(query,
                                            vector_store_multi,
                                            method_search='mmr',
                                            rerank=False,
                                            **search_kwargs)
RetrieveDocs.pprint_docs(docs_retrvd_w_reranking)

------------------------------
Still, a visit to a toy store or the toys section of a grocery store in most places will make you feel that not much has changed in the past few decades.
{'data_type': 'txt', 'topic': 'marketing, toys'}


All in all, toy ads haven’t really evolved much over time in terms of the social messages they convey about gender. Most toys still have clearly gendered associations, with dolls being targeted
{'data_type': 'txt', 'topic': 'marketing, toys'}


To this end, Norgaard and Wider analysed 175 television commercials for toys listed as ‘best selling’ for children ages five through eleven years old by the top three toy retailers — Target, Walmart,
{'data_type': 'txt', 'topic': 'marketing, toys'}


What’s more, the study also found that the toy industry’s marketing techniques continue to forge gendered associations in more subtle ways, such as through the use of colour — pink for girls, blue
{'data_type': 'txt', 'topic': 'marketing, toys'}


Here’s the thing, th

#### Filter retrieved docs based on relevance
Remove docs whose similarity to the query falls below the threshold.
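
As a rough illustration of this step, the sketch below scores each document embedding against the query embedding with cosine similarity, drops anything under the threshold, and caps the result at `k`. It is written under assumptions: `filter_on_similarity` and the toy 2-D vectors are hypothetical, and the real `filter_docs_on_siml` embeds the texts itself and may differ in detail.

```python
import math


def cosine(u, v) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_on_similarity(query_vec, doc_vecs, docs, thresh=0.5, k=4):
    """Keep docs with similarity >= thresh, most similar first, at most k of them."""
    scored = [(cosine(query_vec, vec), doc) for vec, doc in zip(doc_vecs, docs)]
    kept = [(score, doc) for score, doc in scored if score >= thresh]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in kept[:k]]


query_vec = [1.0, 0.0]
doc_vecs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]  # similarities: 1.0, 0.6, 0.0
docs = ["a", "b", "c"]

print(filter_on_similarity(query_vec, doc_vecs, docs, thresh=0.5, k=4))  # ['a', 'b']
print(filter_on_similarity(query_vec, doc_vecs, docs, thresh=0.5, k=1))  # ['a']
```

Applying the cap after the threshold would be consistent with the run below, where seven scores clear 0.5 but only four documents are returned with `k=4`.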

In [17]:
docs_filtd_manual = LlmWithManualRag.filter_docs_on_siml(query, 
                                                         docs_retrvd_w_reranking, 
                                                         thresh=0.5, 
                                                         k=4)
len(docs_filtd_manual)
print(docs_filtd_manual)

Starting to Embed texts ...




Starting to Embed texts ...
[0.758124   0.52594703 0.51659364 0.5138917  0.5136213  0.512375
 0.5077028  0.49675336 0.49260297 0.4850593  0.48138386 0.4783549
 0.46552876 0.46435118 0.46306083 0.46000752 0.4561665  0.4402113
 0.42687887 0.4218801 ]
[ True  True  True  True  True  True  True False False False False False
 False False False False False False False False]


4

[Document(page_content='Still, a visit to a toy store or the toys section of a grocery store in most places will make you feel that not much has changed in the past few decades.', metadata={'data_type': 'txt', 'source': 'data/multi_docs/marketing.txt', 'title': 'marketing.txt', 'topic': 'marketing, toys'}), Document(page_content='All in all, toy ads haven’t really evolved much over time in terms of the social messages they convey about gender. Most toys still have clearly gendered associations, with dolls being targeted', metadata={'data_type': 'txt', 'source': 'data/multi_docs/marketing.txt', 'title': 'marketing.txt', 'topic': 'marketing, toys'}), Document(page_content='To this end, Norgaard and Wider analysed 175 television commercials for toys listed as ‘best selling’ for children ages five through eleven years old by the top three toy retailers — Target, Walmart,', metadata={'data_type': 'txt', 'source': 'data/multi_docs/marketing.txt', 'title': 'marketing.txt', 'topic': 'marketing

In [18]:
RetrieveDocs.pprint_docs(docs_filtd_manual)
# RetrieveDocs.pprint_docs([doc[0] for doc in docs_filtd_manual])

------------------------------
Still, a visit to a toy store or the toys section of a grocery store in most places will make you feel that not much has changed in the past few decades.
{'data_type': 'txt', 'topic': 'marketing, toys'}


All in all, toy ads haven’t really evolved much over time in terms of the social messages they convey about gender. Most toys still have clearly gendered associations, with dolls being targeted
{'data_type': 'txt', 'topic': 'marketing, toys'}


To this end, Norgaard and Wider analysed 175 television commercials for toys listed as ‘best selling’ for children ages five through eleven years old by the top three toy retailers — Target, Walmart,
{'data_type': 'txt', 'topic': 'marketing, toys'}


What’s more, the study also found that the toy industry’s marketing techniques continue to forge gendered associations in more subtle ways, such as through the use of colour — pink for girls, blue
{'data_type': 'txt', 'topic': 'marketing, toys'}




#### Create prompt inclusive of context

In [19]:
prompt_upd_wo_rr = LlmWithManualRag.add_context_to_prompt(query, 
                                                    docs_filtd_manual, 
                                                    rerank=False)

print(prompt_upd_wo_rr)

Context: Still, a visit to a toy store or the toys section of a grocery store in most places will make you feel that not much has changed in the past few decades.;All in all, toy ads haven’t really evolved much over time in terms of the social messages they convey about gender. Most toys still have clearly gendered associations, with dolls being targeted;To this end, Norgaard and Wider analysed 175 television commercials for toys listed as ‘best selling’ for children ages five through eleven years old by the top three toy retailers — Target, Walmart,;What’s more, the study also found that the toy industry’s marketing techniques continue to forge gendered associations in more subtle ways, such as through the use of colour — pink for girls, blue

        Answer the question based only on the context provided. 
        If you don't know the answer, say you do not know. 
        Decide based on the question if answer can be made concise or not. 
        If so, keep answer within three sent


In [20]:
prompt_upd_w_rr = LlmWithManualRag.add_context_to_prompt(query, 
                                                         docs_filtd_manual, 
                                                         rerank=True, 
                                                         rerank_method='simple')

print(prompt_upd_w_rr)

Context: Still, a visit to a toy store or the toys section of a grocery store in most places will make you feel that not much has changed in the past few decades.;To this end, Norgaard and Wider analysed 175 television commercials for toys listed as ‘best selling’ for children ages five through eleven years old by the top three toy retailers — Target, Walmart,;What’s more, the study also found that the toy industry’s marketing techniques continue to forge gendered associations in more subtle ways, such as through the use of colour — pink for girls, blue;All in all, toy ads haven’t really evolved much over time in terms of the social messages they convey about gender. Most toys still have clearly gendered associations, with dolls being targeted

        Answer the question based only on the context provided. 
        If you don't know the answer, say you do not know. 
        Decide based on the question if answer can be made concise or not. 
        If so, keep answer within three sent

### Use a LangChain retriever to filter on metadata and generate a response

In [21]:
search_kwargs

{'k': 20,
 'fetch_k': 100,
 'filter': {'$and': [{'title': {'$eq': 'marketing.txt'}},
   {'data_type': {'$eq': 'txt'}}]}}

In [22]:
retriever_base = vector_store_multi.as_retriever(
    search_type='mmr',  # "similarity" (default), "mmr", or "similarity_score_threshold"
    search_kwargs=search_kwargs,
)
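
For readers unfamiliar with `'mmr'`: maximal marginal relevance first fetches `fetch_k` candidates by similarity and then greedily picks `k` of them, trading relevance to the query against redundancy with what is already selected. The sketch below shows that selection loop in plain Python. `mmr_select` and the toy vectors are hypothetical and only illustrate the idea; LangChain's own implementation, which exposes a similar `lambda_mult` setting, is what actually runs here.

```python
import math


def cosine(u, v) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, cand_vecs, k=4, lambda_mult=0.5):
    """Return indices of up to k candidates chosen by maximal marginal relevance."""
    relevance = [cosine(query_vec, vec) for vec in cand_vecs]
    selected = []
    remaining = list(range(len(cand_vecs)))

    while remaining and len(selected) < k:
        def score(i):
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(
                (cosine(cand_vecs[i], cand_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query_vec = [1.0, 0.0]
# Candidates 0 and 1 are near-duplicates; candidate 2 is less relevant but different.
cand_vecs = [[0.9, 0.1], [0.9, 0.12], [0.6, -0.8]]

print(mmr_select(query_vec, cand_vecs, k=2, lambda_mult=0.5))  # [0, 2]
print(mmr_select(query_vec, cand_vecs, k=2, lambda_mult=1.0))  # [0, 1]
```

With `lambda_mult=1.0` the selection degenerates to plain similarity ranking; lower values favour diversity.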

In [23]:
# only for checking
print(query)
docs_filtd = retriever_base.invoke(query)
RetrieveDocs.pprint_docs(docs_filtd)

How have the toy stores changed over the years?
------------------------------
Still, a visit to a toy store or the toys section of a grocery store in most places will make you feel that not much has changed in the past few decades.
{'data_type': 'txt', 'topic': 'marketing, toys'}


To this end, Norgaard and Wider analysed 175 television commercials for toys listed as ‘best selling’ for children ages five through eleven years old by the top three toy retailers — Target, Walmart,
{'data_type': 'txt', 'topic': 'marketing, toys'}


What’s more, the study also found that the toy industry’s marketing techniques continue to forge gendered associations in more subtle ways, such as through the use of colour — pink for girls, blue
{'data_type': 'txt', 'topic': 'marketing, toys'}


Toys are simply learning tools that communicate to children how they should move through the world and the kinds of things they might be interested in and aspire to.
{'data_type': 'txt', 'topic': 'marketing, toys'}




#### Without Reranking

In [24]:
chain_multi_docs_wo_rr = LlmWithRag.create_chain(retriever_base, 
                                                 rerank=False)

#### With Reranking

In [25]:
chain_multi_docs_w_rr = LlmWithRag.create_chain(retriever_base, 
                                                rerank=True, 
                                                rerank_method='hf_crossencoder')
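
Conceptually, reranking re-scores each retrieved chunk against the query and reorders the list before it reaches the prompt. The sketch below shows that pattern with a pluggable scoring function. `rerank` and `token_overlap` are hypothetical names, and the word-overlap scorer is only a stand-in: with `'hf_crossencoder'`, a cross-encoder model would presumably supply the relevance score instead.

```python
from typing import Callable, Optional


def rerank(
    query: str,
    docs: list[str],
    score_fn: Callable[[str, str], float],
    top_n: Optional[int] = None,
) -> list[str]:
    """Order docs by score_fn(query, doc), highest first; optionally keep top_n."""
    ranked = sorted(docs, key=lambda doc: score_fn(query, doc), reverse=True)
    return ranked if top_n is None else ranked[:top_n]


def token_overlap(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words that appear in the doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


docs = [
    "Pink for girls, blue for boys.",
    "Toy stores have not changed much.",
    "A toy is a learning tool.",
]
ranked = rerank("toy stores changed", docs, token_overlap)
print(ranked[0])  # Toy stores have not changed much.
```

Keeping the scorer as a parameter makes it easy to compare a cheap heuristic with a model-based scorer on the same retrieved set.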



# Generate Answers with LLMs
> Switch on the VPN before running the cells below.

In [26]:
answers = []

## Without LangChain

### Without Reranking

In [27]:
response_wo_lc = LlmWithManualRag.invoke_chain(prompt_upd_wo_rr)

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
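
The warning above is harmless, but it can be silenced by setting the environment variable it names. A minimal sketch, assuming it runs in the first cell of the notebook, before any tokenizer is used:

```python
import os

# Must be set before `tokenizers` / `transformers` is first used in this process.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```

Setting it to `"false"` gives up tokenizer-level parallelism, which is usually an acceptable trade in a notebook.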


In [28]:
answer = response_wo_lc.content
print(answer)
answers.append(answer)

Based on the context provided, toy stores have not significantly changed over the years in terms of the social messages they convey about gender. Toys still have clearly gendered associations, and marketing techniques continue to use strategies like color-coding to reinforce these associations.


### With Reranking

In [29]:
response_wo_lc_pl_rr = LlmWithManualRag.invoke_chain(prompt_upd_w_rr)

In [30]:
answer = response_wo_lc_pl_rr.content
print(answer)
answers.append(answer)

Based on the context provided, toy stores have not significantly changed over the years in terms of gendered marketing. Toys still have clearly gendered associations, such as dolls for girls and different colors like pink for girls and blue for boys being used to target children. The social messages about gender in toy advertising also remain relatively unchanged.


## With LangChain

### Without Reranking

In [31]:
response_w_lc = chain_multi_docs_wo_rr.invoke({'input':query})

In [32]:
answer = response_w_lc['answer']
print(answer)
answers.append(answer)

The provided context does not contain specific information about the changes in toy stores over the years. It mentions that a visit to a toy store may give the impression that not much has changed, and it discusses the persistence of gendered marketing in the toy industry, as well as some recent shifts towards gender-neutral toys in certain places. However, detailed changes in toy stores are not described in the given text.


### With Reranking

In [33]:
response_w_lc_pl_rr = chain_multi_docs_w_rr.invoke({'input':query})

In [34]:
answer = response_w_lc_pl_rr['answer']
print(answer)
answers.append(answer)

Based on the context provided, it appears that toy stores have not significantly changed their marketing techniques over the years, as they continue to promote gendered associations with toys, particularly through the use of color coding (pink for girls, blue for boys). Despite the study's findings on subtle changes in marketing, the general observation suggests that the toy industry's approach remains much the same as it has been in past decades.


## All Answers Analysis

In [35]:
methods = ['wo_lc', 'wo_lc_pl_rr', 'w_lc', 'w_lc_pl_rr']
methods

['wo_lc', 'wo_lc_pl_rr', 'w_lc', 'w_lc_pl_rr']

In [36]:
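# Pair each method label with its answer; take exactly as many answers as there are methods.
for method, answer in zip(methods, answers[-len(methods):]):
    print(f"{method}:")
    print(answer, '\n')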

wo_lc:
Based on the context provided, toy stores have not significantly changed over the years in terms of the social messages they convey about gender. Toys still have clearly gendered associations, and marketing techniques continue to use strategies like color-coding to reinforce these associations. 

wo_lc_pl_rr:
Based on the context provided, toy stores have not significantly changed over the years in terms of gendered marketing. Toys still have clearly gendered associations, such as dolls for girls and different colors like pink for girls and blue for boys being used to target children. The social messages about gender in toy advertising also remain relatively unchanged. 

w_lc:
The provided context does not contain specific information about the changes in toy stores over the years. It mentions that a visit to a toy store may give the impression that not much has changed, and it discusses the persistence of gendered marketing in the toy industry, as well as some recent shifts tow