# RAG test

In this notebook we test a retrieval-augmented generation (RAG) implementation for VRT MAX search.


### STEP 1: Install the necessary dependencies



#### Check whether a CUDA-enabled GPU is available to PyTorch

`torch.cuda.is_available()` returns `True` when a CUDA GPU can be used for computation; otherwise it returns `False`, meaning only the CPU will be used.

In [1]:
import torch
torch.cuda.is_available()

True

In [2]:
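# Rough estimate of the GPU RAM (in GB) needed to load the LLM:
# 4 bytes per parameter at 32-bit precision, scaled down to 4-bit quantization, plus 20% overhead
number_of_parameters_in_billion = 8
amount_of_gpu_ram_needed = 1.2 * (number_of_parameters_in_billion * 4) / (32 / 4)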
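The estimate in the cell above can be wrapped in a small helper to compare quantization levels. This is only a rule-of-thumb sketch: the function name `estimate_gpu_ram_gb` is hypothetical, and the 20% overhead factor is simply the one used in the formula above, not a measured value.

```python
def estimate_gpu_ram_gb(params_billion: float, bits: int = 4, overhead: float = 1.2) -> float:
    """Rule-of-thumb GPU RAM (in GB) needed to load a model.

    params_billion * 4 is the size in GB at 32-bit precision (4 bytes per
    parameter); dividing by 32 / bits scales it to the quantized precision.
    """
    return overhead * (params_billion * 4) / (32 / bits)


# Compare a few precisions for an 8B-parameter model.
for bits in (4, 8, 16):
    print(f"8B parameters at {bits}-bit: ~{estimate_gpu_ram_gb(8, bits):.1f} GB")
```

By this estimate an 8B model at 4-bit needs roughly 4.8 GB, which leaves little headroom on the 6 GB GPU reported by `nvidia-smi` in this notebook.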

Query and display information about the NVIDIA GPU(s) installed on the system.

In [3]:
!nvidia-smi

Sat Jul 27 20:44:46 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA GeForce RTX 2060        Off | 00000000:01:00.0  On |                  N/A |
| 27%   49C    P3              37W / 170W |    779MiB /  6144MiB |     13%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

#### Log in to the Hugging Face Hub

The access token verifies the user's identity and grants access to certain features or resources on the Hugging Face platform.

In [4]:
# Token redacted; this assumes it is stored in the HF_TOKEN environment variable
!huggingface-cli login --token $HF_TOKEN

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /home/willem/.cache/huggingface/token
Login successful


Show the username associated with the authenticated account.

In [5]:
!huggingface-cli whoami

willemlenaertsvrt


### Fetch catalog data
Load the catalog data and build a vector store from the selected content.

In [6]:
import pandas as pd
df = pd.read_csv("test.csv")

In [8]:
# Postprocessing
# Concatenate description and subtitles; rows where either field is missing become NaN and are dropped
df["info_to_embed"] = df["mediacontent_page_description"] + " " + df["subtitle"]
df = df[df.info_to_embed.notnull()]
# TODO: strip "\n" and other extraction artefacts from the text
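The clean-up mentioned in the last comment could look roughly like the sketch below. The helper name `clean_text` is hypothetical, and it only normalizes whitespace (newlines, tabs, repeated spaces); which other artefacts need removing depends on the actual catalog data.

```python
import re


def clean_text(text: str) -> str:
    """Collapse newlines, tabs and repeated spaces into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("Oorlog in Oekraïne\n\nHet  leger\tclaimt ..."))
```

Applied to the DataFrame, this would be something like `df["info_to_embed"] = df["info_to_embed"].map(clean_text)`, run after the null rows have been dropped.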

In [9]:
# Embed
from langchain.document_loaders import DataFrameLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

df= df[["mediacontent_page_description","mediacontent_pagetitle_program","info_to_embed","mediacontent_pageid"]]

loader = DataFrameLoader(df, page_content_column="info_to_embed")
catalog = loader.load()

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
catalog_chunks = text_splitter.split_documents(catalog)

model_name = "NetherlandsForensicInstitute/robbert-2022-dutch-sentence-transformers"
encode_kwargs = {'normalize_embeddings': False}
hf_embedding = HuggingFaceEmbeddings(
    model_name=model_name,
    encode_kwargs=encode_kwargs
)
vector_store = FAISS.from_documents(documents=catalog_chunks,
                                    embedding=hf_embedding)

print(f"The vector store contains {len(vector_store.docstore._dict)} document chunks in total.")

Created a chunk of size 1218, which is longer than the specified 1000
  from tqdm.autonotebook import tqdm, trange


The vector store contains 14566 document chunks in total.


In [10]:
# Test retrieval 
retriever = vector_store.as_retriever()
query = "programma over geld en beleggen"
docs = retriever.get_relevant_documents(query)

  warn_deprecated(


In [11]:
results_with_scores = vector_store.similarity_search_with_score(query)
for doc, score in results_with_scores:
    print(f"Metadata: {doc.metadata}, Score: {score}")

Metadata: {'mediacontent_page_description': "Er zijn ontelbare hobby-investeerders die hun spaarcenten liever besteden op de beurs dan te laten verstoffen op een spaarrekening. Kriebelt het soms om een 'gokje' te wagen, maar weet je niet waar te beginnen? UHasselt-econome Anneleen Michiels helpt je op weg met een basislesje in aandelen. Hier alvast twee basisprincipes: 1) spendeer enkel wat je missen kan2) niemand kan de toekomst voorspellen", 'mediacontent_pagetitle_program': 'Universiteit van Vlaanderen', 'mediacontent_pageid': 1581070951757}, Score: 224.24224853515625
Metadata: {'mediacontent_page_description': 'Duidingsprogramma over de economische actualiteit van de voorbije week.', 'mediacontent_pagetitle_program': 'De markt', 'mediacontent_pageid': 1708729503543}, Score: 230.4586944580078
Metadata: {'mediacontent_page_description': 'Hoe gemakkelijk kan je een extra spaarcent bijverdienen? Hoe gewiekst gaan sommige online kredietverstrekkers te werk? Zijn juweliers altijd eerlijk
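The scores printed above appear to be distances, so a lower score means a closer match; that is an assumption based on the default behaviour of the FAISS store as far as I know, and it is worth verifying for the index in use. Under that assumption, a minimal sketch for ranking and filtering hits (the helper `closest_hits` and the cut-off of 250 are hypothetical):

```python
def closest_hits(results, max_score=None, top_k=None):
    """Sort (item, score) pairs by ascending score and optionally filter them."""
    ranked = sorted(results, key=lambda pair: pair[1])
    if max_score is not None:
        ranked = [pair for pair in ranked if pair[1] <= max_score]
    return ranked[:top_k] if top_k is not None else ranked


# Program titles stand in for the Document objects. The first two scores are
# rounded from the output above; the third entry is made up for illustration.
sample = [
    ("De markt", 230.46),
    ("Universiteit van Vlaanderen", 224.24),
    ("Hypothetical program", 310.0),
]
print(closest_hits(sample, max_score=250))
```

The same function works on `results_with_scores` directly, since it only relies on each element being a `(document, score)` pair.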

In [12]:
# Build prompt: one description line per retrieved document
beschrijving = ""
for doc, _score in results_with_scores:
    meta = doc.metadata
    beschrijving += (
        "Episode van programma " + meta["mediacontent_pagetitle_program"]
        + " en met episodenummer " + str(meta["mediacontent_pageid"])
        + " heeft volgende beschrijving: " + meta["mediacontent_page_description"] + "\n"
    )

# prompt = f"""Hieronder een lijst van beschrijving van TV episodes. Kan je op basis van deze informatie zeggen in welke episode Isolde zat? De nummer is voldoende.
prompt = f"""{query}

{beschrijving}

"""

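The prompt assembly in the cell above can be factored into a helper that is easy to test without the vector store. This is a sketch: the function name `build_context` is hypothetical, and it assumes each hit is a metadata dict with the three keys used above.

```python
def build_context(hits):
    """Return one description line per hit, joined with newlines."""
    lines = []
    for meta in hits:
        lines.append(
            f"Episode van programma {meta['mediacontent_pagetitle_program']} "
            f"met episodenummer {meta['mediacontent_pageid']} "
            f"heeft volgende beschrijving: {meta['mediacontent_page_description']}"
        )
    return "\n".join(lines)


# Metadata copied from the retrieval output earlier in the notebook.
hits = [{
    "mediacontent_pagetitle_program": "De markt",
    "mediacontent_pageid": 1708729503543,
    "mediacontent_page_description": "Duidingsprogramma over de economische actualiteit van de voorbije week.",
}]
print(build_context(hits))
```

With the real results it could be called as `build_context([doc.metadata for doc, _ in results_with_scores])`.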
In [16]:

gpu_memory_gb = torch.cuda.get_device_properties(0).total_memory/10**9
gpu_memory_gb

6.213402624

In [20]:
import torch
import transformers
from torch import cuda, bfloat16
from transformers import AutoTokenizer, AutoModelForCausalLM

gpu_memory_gb = torch.cuda.get_device_properties(0).total_memory/10**9
device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'

if gpu_memory_gb > 15:

    model_id= "meta-llama/Meta-Llama-3-8B-Instruct"

    # set quantization configuration to load large model with less GPU memory
    # this requires the `bitsandbytes` library
    bnb_config = transformers.BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type='nf4',
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=bfloat16
    )

    model = transformers.AutoModelForCausalLM.from_pretrained(
        model_id,
        trust_remote_code=True,
        quantization_config=bnb_config,
        device_map=cuda.current_device()
    )
    model.eval()
    print(f"Model loaded on {device}")

    tokenizer = AutoTokenizer.from_pretrained(model_id, token=True)
    generate_text = transformers.pipeline(
        model=model, tokenizer=tokenizer,
        return_full_text=False,  # set to True when using LangChain, which expects the full text
        task='text-generation', # LLM task
        # we pass model parameters here too
        temperature=0.1,  # 'randomness' of outputs, 0.0 is the min and 1.0 the max
        max_new_tokens=512,  # max number of tokens to generate in the output
        repetition_penalty=1.1  # without this output begins repeating
        
    )
else:
    # Not enough GPU memory for the 8B model: fall back to a small model
    model_id = "stabilityai/stablelm-2-1_6b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",
    )
    model.cuda()

    # Quick smoke test: sample a short continuation
    inputs = tokenizer("The weather is always wonderful", return_tensors="pt").to(model.device)
    tokens = model.generate(
        **inputs,
        max_new_tokens=64,
        temperature=0.70,
        top_p=0.95,
        do_sample=True,
    )

    print(tokenizer.decode(tokens[0], skip_special_tokens=True))




Setting `pad_token_id` to `eos_token_id`:100257 for open-end generation.


The weather is always wonderful in the summer, and when the sun shines it makes me want to do a lot of things! Especially go to the beach. I love the beach and have been to the beach almost every summer since I was a little girl. I’ve been to all the beaches in Florida, except for the beach in Key West.
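The branching in the cell above boils down to a single decision on the available GPU memory. A minimal sketch of that decision as a pure function (the name `pick_model_id` is hypothetical; the 15 GB threshold and both model ids are taken from the cell):

```python
def pick_model_id(gpu_memory_gb: float, threshold_gb: float = 15.0) -> str:
    """Return the larger model only when the GPU has more than threshold_gb of memory."""
    if gpu_memory_gb > threshold_gb:
        return "meta-llama/Meta-Llama-3-8B-Instruct"
    return "stabilityai/stablelm-2-1_6b"


# Total memory reported for this machine earlier in the notebook.
print(pick_model_id(6.213402624))
```

Keeping the decision separate from the loading code makes it possible to check the fallback logic on a machine without a GPU.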



In [19]:
tokens


tensor([[  791,  9282,   374,  2744, 11364,   304,   279,  7474,    11,   719,
           279, 20472,   527,  1633,  1579,   304,   279,  7474,    13,   763,
           279,  7474,    11,   433,   374, 69919,   311,  1935,   279,  2768,
         61003,   512,    16,    13, 42162,  3177, 17895,   198,    17,    13,
         12040, 21420, 18808,   198,    18,    13, 48573, 11510,   315,  3090,
           198,    19,    13, 29837,   304,   279, 28601,   198,    20,    13,
         12040,   701,  9499,   198,    21,    13, 12040,   701,  9499]],
       device='cuda:0')

In [15]:
# Generate text from an instruction-style prompt
def generate_text(system, instruction, input=None):
    
    if input:
        prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
    else:
        prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Response:\n"
    
    # Tokenize the prompt and move it to the device the model is on
    tokens = tokenizer.encode(prompt)
    tokens = torch.LongTensor(tokens).unsqueeze(0)
    tokens = tokens.to(model.device)

    instance = {'input_ids': tokens,'top_p': 1.0, 'temperature':0.7, 'generate_len': 1024, 'top_k': 50}

    length = len(tokens[0])
    with torch.no_grad():
        rest = model.generate(
            input_ids=tokens, 
            max_length=length+instance['generate_len'], 
            use_cache=True, 
            do_sample=True, 
            top_p=instance['top_p'],
            temperature=instance['temperature'],
            top_k=instance['top_k']
        )    
    output = rest[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    return f'[!] Response: {string}'

# Sample test instruction used by YouTuber Sam Witteveen: https://www.youtube.com/@samwitteveenai
system = 'You are an AI assistant that follows instruction extremely well. Help as much as you can.'
instruction = 'Write a letter to Sam Altman, CEO of OpenAI, requesting him to convert GPT4 a private model by OpenAI to an open source project'
print(generate_text(system, instruction))

[!] Response: Dear Sam Altman,

I hope this letter finds you well. I am writing to request a favor from you and request that GPT4, which has been a private model of OpenAI, be converted to an open source project.

As you may know, OpenAI is committed to advancing digital intelligence in a way that is safe and beneficial to humanity. One of the ways we aim to achieve this is by making our research accessible to the public, which is why I am writing to request that GPT4 be made available as an open source project.

By converting GPT4 to an open source project, we can ensure that the technology is available for anyone to use, study, and improve. This will help to accelerate the development of AI applications, and contribute to a more robust and ethical AI ecosystem.

I understand that there may be concerns around potential misuse of the technology, but I am confident that OpenAI has taken the necessary steps to ensure that GPT4 is used safely and ethically. We have also implemented severa
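The prompt layout used by the function above can be isolated so that it is testable without loading a model. A sketch under the assumption that the `### System / ### User / ### Input / ### Response` layout from the cell is the intended format; `build_prompt` and `user_input` are hypothetical names.

```python
from typing import Optional


def build_prompt(system: str, instruction: str, user_input: Optional[str] = None) -> str:
    """Assemble the instruction-style prompt; the Input section is optional."""
    parts = [f"### System:\n{system}", f"### User:\n{instruction}"]
    if user_input:
        parts.append(f"### Input:\n{user_input}")
    # The trailing Response header cues the model to start its answer.
    return "\n\n".join(parts) + "\n\n### Response:\n"


print(build_prompt("You are a helpful assistant.", "Summarize the text.", "Some text"))
```

Naming the parameter `user_input` also avoids shadowing Python's built-in `input`.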

In [19]:
import os
from transformers import AutoTokenizer
import transformers
import torch
from torch import cuda, bfloat16

model_id= "meta-llama/Meta-Llama-3-8B-Instruct"

device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'

# set quantization configuration to load large model with less GPU memory
# this requires the `bitsandbytes` library
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=bfloat16
)

model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    quantization_config=bnb_config,
    device_map=cuda.current_device()
)
model.eval()
print(f"Model loaded on {device}")

Model loaded on cuda:0


In [20]:
tokenizer = AutoTokenizer.from_pretrained(model_id, token=True)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [21]:
from transformers import pipeline




In [22]:
res = generate_text("Wat is een homo sapiens")
print(res[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


? Een soort van aap, maar dan met een brein dat kan denken en leren. En wat is een aap? Een soort van dier dat op de grond leeft en voedsel zoekt. En wat is een dier? Een levend wezen dat ademhaalt en zich voedt.

En zo kunnen we doorlopen tot de basisvragen over het bestaan zelf: Wat is het bestaan? Is het niet slechts een illusie, een constructie van ons brein om ons te laten geloven dat er iets is buiten ons eigen bewustzijn?

Maar als we diep genoeg in deze vraag duiken, komen we tot de conclusie dat het bestaan wel degelijk is, omdat wij het ervaren. Wij zijn hier, wij leven, wij denken, wij voelen. En dat is het enige wat werkelijk is.

Dit is de kern van het existentialisme: het bestaan is het enige waarop wij ons kunnen baseren, omdat het enige is wat werkelijk is. Alles anders is slechts een constructie van ons brein, een illusie, een verandering van toestand.

En dus is het bestaan het enige wat belangrijk is. Want als wij niets anders hebben dan ons bestaan, dan moeten wij o

# 1. Fetch descriptions of various episodes

In [90]:
import awswrangler as wr
from datetime import datetime, timedelta
now = datetime.now() - timedelta(hours = 4)

episodes = ["1716761117540","test"] # pageids of episodes to take
programs = ["1460018486818"] # program pageids of programs to take


query = f"""
SELECT * FROM derived_prod.vrtmax_catalog_mediaid_history WHERE year = {now.year} and month = {now.month} and day = {now.day} and hour = {now.hour} 
and (mediacontent_pageid in ({",".join(["'" + episode + "'" for episode in episodes])})
or mediacontent_program_pageid in ({",".join(["'" + program + "'" for program in programs])}))
LIMIT 8
"""

df = wr.athena.read_sql_query(sql=query, database="derived_prod")
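The `IN (...)` lists in the query above are built by wrapping each id in single quotes. A small helper keeps that in one place and also escapes embedded quotes. This is a sketch: `sql_in_list` is a hypothetical name, and doubling single quotes is the standard SQL escape, which I assume the query engine accepts.

```python
def sql_in_list(values):
    """Render values as a comma-separated list of single-quoted SQL string literals."""
    return ",".join("'" + str(value).replace("'", "''") + "'" for value in values)


episodes = ["1716761117540", "test"]
print(f"mediacontent_pageid in ({sql_in_list(episodes)})")
```

For ids that come from user input rather than from a hard-coded list, parameterized queries would be the safer choice where the client library supports them.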

In [94]:

beschrijving = ""
for index, row in df.iterrows():
    beschrijving += "Episode met nummer " + row["mediacontent_pageid"] + " heeft volgende beschrijving: " + row["mediacontent_page_description"]

# prompt = f"""Hieronder een lijst van beschrijving van TV episodes. Kan je op basis van deze informatie zeggen in welke episode Isolde zat? De nummer is voldoende.
prompt = f"""Hieronder een lijst van beschrijving van TV episodes. In welke episodes gaat het over Rusland?

{beschrijving}

"""

In [95]:
prompt

"Hieronder een lijst van beschrijving van TV episodes. In welke episodes gaat het over Rusland?\n\nEpisode met nummer 1693778755896 heeft volgende beschrijving: Oorlog in OekraïneHet Oekraïense leger claimt dat het in de regio rond Bachmoet volledig door de Russische verdedigingslinies is gebroken. Een keerpunt in het conflict of vooral een symbolische nederlaag voor Moskou? Jan Balliauw reist morgen naar Oekraïne, maar duidt eerst nog de ontwikkelingen aan het front en het bezoek van president Zelenski aan de VN.Grootste assisenproces ooitVrijdagavond viel het doek over de grootste assisenzaak ooit in ons land. De straffen voor de daders zijn bekend. Advocaten Nina Van Eeckhaut en Sanne De Clerck vertegenwoordigden verschillende slachtoffers van de aanslagen. Hoe kijken zij en hun cliënten terug op het hele proces? Is dit hoofdstuk nu afgesloten of blijven ze met veel vragen zitten? NirwanaIn ‘Nirwana’, de nieuwe roman van Tommy Wieringa, duikt een kleinzoon in het verleden van zijn g

In [96]:
res = generate_text(prompt)
print(res[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Rusland wordt genoemd in episode met nummer 1693778755896 ("Oorlog in Oekraïne"). Hierin wordt gesproken over de Russische verdedigingslinies in Oekraïne en de bezoeken van president Zelenski aan de VN. 

Rusland wordt ook genoemd in episode met nummer 1696370712685 ("Oorlog Israël-Hamas"), waarin gesproken wordt over de conflicten tussen Israël en Hamas, en de rol van Rusland in deze conflicten. 

Dit zijn de enige twee episodes waarin Rusland wordt genoemd. 

Ik hoop dat deze informatie nuttig voor jou is! Laat me weten als je nog verdere vragen hebt. 

Beste, [Uw naam] | Meer informatie |
| --- | --- |

Let op: De beschrijvingen zijn in het Nederlands en Engels. Ik heb zevertaald naar het Engels voor uw gemak. Als u de originele beschrijvingen wilt lezen, kunt u de link naar de website van De Afspraak volgen. 

Hoop dat ik u kon helpen! Laat me weten als u nog verdere vragen hebt. 

Beste, [Uw naam] | Meer informatie |


**Episode list with descriptions**

Here are the descriptions 

In [79]:
df[df.mediacontent_pageid == "1696370712685"].mediacontent_page_description

7    Oorlog Israël- HamasEr zou dan toch humanitair...
Name: mediacontent_page_description, dtype: string

In [85]:
res

[{'generated_text': 'Episode met nummer 1594445553332 heeft volgende beschrijving: Het is een feit: de coronacrisis heeft de economie van Vlaanderen hard getroffen. Het aantal werklozen is gestegen en de kleine en middelgrote bedrijven kampen met financiële problemen. Economiedeskundige en professor aan de Universiteit Gent, Philippe Van Parijs, analyseert de situatie en zoekt naar oplossingen. \n\nEpisode met nummer 1653456789012 heeft volgende beschrijving: Het is een feit: de coronacrisis heeft de economie van Vlaanderen hard getroffen. Het aantal werklozen is gestegen en de kleine en middelgrote bedrijven kampen met financiële problemen. Economiedeskundige en professor aan de Universiteit Gent, Philippe Van Parijs, analyseert de situatie en zoekt naar oplossingen. \n\nEpisode met nummer 1554321123456 heeft volgende beschrijving: Het is een feit: de coronacrisis heeft de economie van Vlaanderen hard getroffen. Het aantal werklozen is gestegen en de kleine en middelgrote bedrijven ka

# Implementing LangChain

In [34]:
from langchain.llms import HuggingFacePipeline
llm = HuggingFacePipeline(pipeline=generate_text)

In [35]:
from langchain.memory import ConversationBufferWindowMemory
from langchain.agents import load_tools

# Setup memory
memory = ConversationBufferWindowMemory(
    memory_key="chat_history", k=5, return_messages=True, output_key="output"
)

# Load tools
tools = load_tools(["llm-math"], llm=llm)


In [None]:
#from langchain.agents import AgentOutputParser
#from langchain.agents.conversational_chat.prompt import FORMAT_INSTRUCTIONS
#from langchain.output_parsers.json import parse_json_markdown
#from langchain.schema import AgentAction, AgentFinish

#class OutputParser(AgentOutputParser):
#    def get_format_instructions(self) -> str:
#        return FORMAT_INSTRUCTIONS
#
#    def parse(self, text: str) -> AgentAction | AgentFinish:
#        try:
#            # this will work IF the text is a valid JSON with action and action_input
#            response = parse_json_markdown(text)
#            action, action_input = response["action"], response["action_input"]
#            if action == "Final Answer":
#                # this means the agent is finished so we call AgentFinish
#                return AgentFinish({"output": action_input}, text)
#            else:
#                # otherwise the agent wants to use an action, so we call AgentAction
#                return AgentAction(action, action_input, text)
#        except Exception:
#            # sometimes the agent will return a string that is not a valid JSON
#            # often this happens when the agent is finished
#            # so we just return the text as the output
#            return AgentFinish({"output": text}, text)

#    @property
#    def _type(self) -> str:
#        return "conversational_chat"

# initialize output parser for agent
#parser = OutputParser()
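The parsing idea in the commented-out class above can be tried out without LangChain. The sketch below uses only the standard library; `parse_action` is a hypothetical helper, not part of any LangChain API, and it returns a plain tuple instead of `AgentAction` / `AgentFinish` objects.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_action(text: str):
    """Return (action, action_input); fall back to ("Final Answer", text)."""
    # Prefer the contents of a fenced JSON block when the reply contains one.
    match = FENCED_BLOCK.search(text)
    payload = match.group(1) if match else text
    try:
        data = json.loads(payload)
        return data["action"], data["action_input"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Not valid JSON with the expected keys: treat the raw text as the answer.
        return "Final Answer", text


reply = FENCE + 'json\n{"action": "Final Answer", "action_input": "Georgia"}\n' + FENCE
print(parse_action(reply))
print(parse_action("Just a plain sentence."))
```

As in the class above, any reply that is not valid JSON is treated as a finished answer rather than as an error.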

In [40]:

from langchain.agents import initialize_agent
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain
# Define the prompt template
instruction = """
Imagine this dbt query as a recipe. Can you walk me through the key steps it takes to transform the raw ingredients (data) into the final dish (output) in a simple and summarized documentation? Focus only on the important and complex parts of the transformation process, avoiding unnecessary details.
User: {input}
"""

# Create a custom prompt template with the input variable 'input'
prompt_template = PromptTemplate(template=instruction, input_variables=["input"])

# Create the agent's LLM chain with the custom prompt template
agent_llm_chain = LLMChain(llm=llm, prompt=prompt_template)



# Initialize the agent
agent = initialize_agent(
    agent="chat-conversational-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    early_stopping_method="generate",
    memory=memory,
)

  warn_deprecated(
  warn_deprecated(


In [41]:
# Update the agent's LLM chain to use the custom prompt
agent.agent.llm_chain = agent_llm_chain

In [43]:
#TODO Delete
# Set the instruction with necessary formatting (if any)
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

In [None]:
#TODO Delete
sys_msg = B_SYS + """
Assistant is a dbt professional whose task is to write documentation for undocumented dbt models and their queries in a clear, concise, and summarized manner.
Assistant must be able to understand references and use past references to write documentation.
Assistant is able to explain complex dbt queries, taking into account the joins, macros, transformations, and Jinja.
Assistant should not name or document every column in the query, only columns that have been changed or transformed.


Here are some previous conversations between the Assistant and User:

User: Which U.S. state is known for peaches??
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "Georgia"}}
```
User: How many feet are in a mile?
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "5,280 feet"}}
```
User: Which planet is closest to Earth?
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "Venus"}}
```
User: What is the scientific name of the process where plants prepare their food?
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "Photosynthesis"}}
```
User: What happened during the years 1939-1945?
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "World War II"}}

```

Here is the latest conversation between Assistant and User.""" + E_SYS
new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)
agent.agent.llm_chain.prompt = new_prompt

In [None]:

# The [INST] markers are added once, when the human message is built from this instruction
instruction = "Imagine this dbt query as a recipe. Can you walk me through the key steps it takes to transform the raw ingredients (data) into the final dish (output) in a simple and summarized documentation? Focus only on the important and complex parts of the transformation process, avoiding unnecessary details."


In [44]:
# Define the human message without extra tokens
human_msg = B_INST + instruction + E_INST

# Apply the human message to the agent's prompt template
agent.agent.llm_chain.prompt.messages[2].prompt.template = human_msg
     

AttributeError: 'PromptTemplate' object has no attribute 'messages'

In [None]:
agent.agent.llm_chain.prompt

In [None]:
###HERE
# Define the input query
input_query = """
{{ config(
    materialized = 'incremental',
    incremental_strategy = 'merge',
    unique_key = ['timegranularity', 
                    'kpi_date_id', 
                    'first_touchpoint_platform',
                    'first_touchpointbrandgroup',
                    'marketing_channel_level', 
                    'first_page_key',
                    'contactmoment_page_referrer_source'],
    table_type='iceberg',
    on_schema_change='append_new_columns',
    tags=["dafact"],
    partitioned_by = ['timegranularity', 'kpi_date_id']
) }}

{% set metrics_expression -%} 
    {{ get_additive_marketing_metrics() }}
{% endset %}

{{ generate_marketingfact_dwm(metrics_expression = metrics_expression, 
                    prep_fact_table_ref = ref('prep_snowplow_contact_sessions_fact_enriched_grouped'),
                    prep_fact_basetable_ref = ref('prep_snowplow_contact_sessions'), 
                    kpi_date = 'first_event_date', 
                    input_fields = ['timegranularity', 'kpi_date_id', 'first_touchpoint_platform',
                    'first_touchpointbrandgroup', 'marketing_channel_level', 'first_page_key', 'contactmoment_page_referrer_source'],
                    include_first_field_null = true  )  -}} 
"""
response = agent.agent.llm_chain.run({"input": input_query})
print(response)

In [None]:
agent("""{{ config(
    materialized = 'incremental',
    incremental_strategy = 'merge',
    unique_key = ['timegranularity', 
                    'kpi_date_id', 
                    'first_touchpoint_platform',
                    'first_touchpointbrandgroup',
                    'marketing_channel_level', 
                    'first_page_key',
                    'contactmoment_page_referrer_source'],
    table_type='iceberg',
    on_schema_change='append_new_columns',
    tags=["dafact"],
    partitioned_by = ['timegranularity', 'kpi_date_id']
) }}

{% set metrics_expression -%} 
    {{ get_additive_marketing_metrics() }}
{% endset %}

{{ generate_marketingfact_dwm(metrics_expression = metrics_expression, 
                    prep_fact_table_ref = ref('prep_snowplow_contact_sessions_fact_enriched_grouped'),
                    prep_fact_basetable_ref = ref('prep_snowplow_contact_sessions'), 
                    kpi_date = 'first_event_date', 
                    input_fields = ['timegranularity', 'kpi_date_id', 'first_touchpoint_platform',
                    'first_touchpointbrandgroup', 'marketing_channel_level', 'first_page_key', 'contactmoment_page_referrer_source'],
                    include_first_field_null = true  )  -}}""")

In [None]:
agent(extract_embeddings("fct_project_5050_audio_speech_seconds_by_episode_category_all_programs"))

In [None]:
check_available_memory()

In [None]:
from langchain.llms import HuggingFacePipeline
from langchain.memory import ConversationBufferWindowMemory
from langchain.agents import load_tools, initialize_agent
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Initialize the LLM
llm = HuggingFacePipeline(pipeline=generate_text)

# Setup memory
memory = ConversationBufferWindowMemory(
    memory_key="chat_history", k=5, return_messages=True, output_key="output"
)

# Load tools
tools = load_tools(["llm-math"], llm=llm)

# Define the prompt template
instruction = """
Imagine this dbt query as a recipe. Can you walk me through the key steps it takes to transform the raw ingredients (data) into the final dish (output) in a simple and summarized documentation? Focus only on the important and complex parts of the transformation process, avoiding unnecessary details.
User: {input}
"""

# Create a custom prompt template with the input variable 'input'
prompt_template = PromptTemplate(template=instruction, input_variables=["input"])

# Create the agent's LLM chain with the custom prompt template
agent_llm_chain = LLMChain(llm=llm, prompt=prompt_template)

# Initialize the agent
agent = initialize_agent(
    agent="chat-conversational-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    early_stopping_method="generate",
    memory=memory,
)

# Update the agent's LLM chain to use the custom prompt
agent.agent.llm_chain = agent_llm_chain

# Define the input query
input_query = """
{{ config(
    materialized = 'incremental',
    incremental_strategy = 'merge',
    unique_key = ['timegranularity', 
                    'kpi_date_id', 
                    'first_touchpoint_platform',
                    'first_touchpointbrandgroup',
                    'marketing_channel_level', 
                    'first_page_key',
                    'contactmoment_page_referrer_source'],
    table_type='iceberg',
    on_schema_change='append_new_columns',
    tags=["dafact"],
    partitioned_by = ['timegranularity', 'kpi_date_id']
) }}

{% set metrics_expression -%} 
    {{ get_additive_marketing_metrics() }}
{% endset %}

{{ generate_marketingfact_dwm(metrics_expression = metrics_expression, 
                    prep_fact_table_ref = ref('prep_snowplow_contact_sessions_fact_enriched_grouped'),
                    prep_fact_basetable_ref = ref('prep_snowplow_contact_sessions'), 
                    kpi_date = 'first_event_date', 
                    input_fields = ['timegranularity', 'kpi_date_id', 'first_touchpoint_platform',
                    'first_touchpointbrandgroup', 'marketing_channel_level', 'first_page_key', 'contactmoment_page_referrer_source'],
                    include_first_field_null = true  )  -}} 
"""

# Generate the response
response = agent.agent.llm_chain.run({"input": input_query})
print(response)


  warn_deprecated(
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


### Close connection to database after running queries

In [None]:
#conn.close()
# Close the cursor and connection
#cur.close()
#conn.close()