Retrieval Augmented Generation

Vector store: Redis
Model: T5-Large

* Add a Docker Compose file to set up Redis.
* The model used for storing the vectors and vectorizing the question can be different from the model doing the text generation.

In [None]:
# Only needed to move the model cache to the HDD before downloading a large model (bigscience/T0, for example).
# Uncomment to enable; the raw string keeps the backslashes in the Windows path literal.
# import os
# os.environ['TRANSFORMERS_CACHE'] = r'D:\python\.cache'
# os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'] = 'python'

In [5]:
from redis import from_url, Redis

# Connection to the redis instance
REDIS_URL = 'redis://localhost:6379'
client = from_url(REDIS_URL)
client.ping()

True

In [7]:
# Count the keys already stored, reusing the connection opened above
total_keys = client.dbsize()
print(total_keys)

20


Obtain the embeddings from the model.

In [2]:
from transformers import T5Tokenizer, T5Model
import torch

tokenizer = T5Tokenizer.from_pretrained('t5-large')
model = T5Model.from_pretrained('t5-large')

def get_vector(text):
    input_ids = tokenizer.encode(text, return_tensors="pt")

    with torch.no_grad():
        # Pass the input to the encoder
        encoder_outputs = model.get_encoder()(input_ids)
        
    # Retrieve the last hidden state of the encoder
    hidden_state = encoder_outputs.last_hidden_state

    # Compute the mean over the sequence dimension
    return hidden_state.mean(dim=1).numpy().flatten().tolist()
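
As a rough illustration of the pooling step in get_vector, the sketch below averages per-token vectors into one fixed-size vector using plain Python lists. The name mean_pool and the toy numbers are assumptions made for illustration; the notebook itself does this with hidden_state.mean(dim=1) on a tensor.

```python
def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into a single vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    n = len(token_vectors)
    # Average each coordinate across all tokens
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]


# Three hypothetical token vectors of dimension 2
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(pooled)
```

The pooled vector has the same length whatever the number of tokens, which is why every document fits the same DIM in the index.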






In [3]:
# Examples of documents
text_1 = """Japan narrowly escapes recession ..."""
text_2 = """Dibaba breaks 5,000m world record ..."""
text_3 = """Google's toolbar sparks concern ..."""
text_4 = """Web accessibility, or eAccessibility, is the inclusive practice of ensuring there are no barriers that prevent interaction with, or access to, websites on the World Wide Web by people with physical disabilities, situational disabilities, and socio-economic restrictions on bandwidth and speed."""

# Builds the json with the content in natural language and the vectors
doc_1 = {"content": text_1, "vector": get_vector(text_1)}
doc_2 = {"content": text_2, "vector": get_vector(text_2)}
doc_3 = {"content": text_3, "vector": get_vector(text_3)}
doc_4 = {"content": text_4, "vector": get_vector(text_4)}

Create the index (schema) in Redis. DIM is read from the length of a document vector, so the schema adapts automatically when another embedding model is used.

In [4]:
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.exceptions import ResponseError

schema = [
    VectorField(
        '$.vector',
        "FLAT",
        {
            "TYPE": 'FLOAT32',
            "DIM": len(doc_1['vector']),  # taken from the data, so it follows the embedding model
            "DISTANCE_METRIC": "COSINE",
        },
        as_name='vector',
    ),
    TextField('$.content', as_name='content'),
]
idx_def = IndexDefinition(index_type=IndexType.JSON, prefix=['doc:'])

# Drop any index left over from a previous run; a missing index raises ResponseError
try:
    client.ft('idx').dropindex()
    print("dropped index")
except ResponseError:
    pass
client.ft('idx').create_index(schema, definition=idx_def)
print("created Index")

dropped index
created Index
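
The index above uses the COSINE distance metric. As a hedged sketch of what that metric measures, the function below computes cosine distance as 1 minus cosine similarity. cosine_distance is a hypothetical helper written for this illustration, not a redis-py call.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # same direction
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

Smaller values mean closer vectors, so a KNN query returns the documents with the lowest distance to the question.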


In [5]:
# Load the documents into Redis.
# Be careful with the key prefix: Redis uses it to associate a document with an index.
client.json().set('doc:1', '$', doc_1)
client.json().set('doc:2', '$', doc_2)
client.json().set('doc:3', '$', doc_3)
client.json().set('doc:4', '$', doc_4)

True
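
Because the index was defined with the prefix 'doc:', only keys that start with that prefix are picked up by it. The sketch below builds such keys for a list of documents. make_key and the placeholder documents are hypothetical, and the commented-out line marks where the notebook's client.json().set call would go.

```python
INDEX_PREFIX = "doc:"  # must match the prefix given to IndexDefinition


def make_key(doc_id, prefix=INDEX_PREFIX):
    """Build a Redis key that falls under the index prefix."""
    return f"{prefix}{doc_id}"


# Placeholder documents; in the notebook these would be doc_1 .. doc_4
docs = [{"content": "first"}, {"content": "second"}]

keyed_docs = {make_key(i): doc for i, doc in enumerate(docs, start=1)}
for key, doc in keyed_docs.items():
    # client.json().set(key, '$', doc)  # assumes the `client` created earlier
    print(key, doc["content"])
```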

Retrieval of context and text generation.

The process is the following:
1. Get an input question from the user.
2. Vectorize the question with the same model used to vectorize the documents.
3. Run a semantic search (KNN on vector distance) or a hybrid search (vector distance plus full-text search) between the question and the documents. Redis provides this, so there is no need to implement the algorithm.
4. Get back the natural-language part of the best-matching document.
5. Insert it as context into the prompt so the model can generate a response.
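
The search and prompt-building steps above can be sketched without Redis as a brute-force nearest-neighbour scan, which is conceptually what a FLAT index does. Everything here (knn_search, the toy vectors, the document texts, the question) is illustrative and is not the Redis implementation.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn_search(query_vec, documents, k=1):
    """Return (distance, content) for the k documents closest to query_vec."""
    scored = [(cosine_distance(query_vec, d["vector"]), d["content"]) for d in documents]
    scored.sort(key=lambda pair: pair[0])  # nearest first
    return scored[:k]


# Toy documents with hand-written 3-dimensional vectors
documents = [
    {"content": "economy", "vector": [1.0, 0.0, 0.0]},
    {"content": "sport", "vector": [0.0, 1.0, 0.0]},
    {"content": "web", "vector": [0.0, 0.0, 1.0]},
]
question_vec = [0.1, 0.9, 0.0]  # stands in for get_vector(question)

best_distance, context = knn_search(question_vec, documents, k=1)[0]
prompt = f"question: What happened? context: {context}"
print(prompt)
```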

In [15]:
from transformers import pipeline
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from redis.commands.search.query import Query
import numpy as np
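# numpy is needed below to pack the query vector as float32 bytes for Redis.

# Load the generation model and tokenizer manually.
# Note: this rebinds the global names that get_vector reads; the seq2seq model also exposes get_encoder(), so get_vector keeps working.
tokenizer = AutoTokenizer.from_pretrained("t5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-large")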


def get_answer(client, idx, question, model, tokenizer):
    # Vectorize the question; Redis expects the vector as raw float32 bytes
    vec = np.array(get_vector(question), dtype=np.float32).tobytes()

    # KNN query: return the single nearest document by vector distance
    q = Query('*=>[KNN 1 @vector $query_vec AS vector_score]').return_fields('content').dialect(2)

    # Define query parameters
    params = {"query_vec": vec}

    # Execute the search query
    results = client.ft(idx).search(q, query_params=params)

    if len(results.docs) == 0:
        return "No relevant documents found in database. Please seek professional help."

    # Retrieve the content of the most relevant document
    document = results.docs[0]['content'].strip()
    print(document)

    # Build the prompt
    prompt = f"question: {question} context: {document}"
    print(prompt)

    # Pass the already loaded model and tokenizer to the pipeline function
    text2text_generator = pipeline("text2text-generation", model=model, tokenizer=tokenizer)
    result = text2text_generator(prompt)
    print(result)

    # Return the generated text so the caller can use it
    return result[0]['generated_text']

# You can call the function like this
question = "What criteria about images?"
get_answer(client, 'idx', question, model, tokenizer)
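
get_answer converts the question vector to raw bytes before sending it to Redis. As a sketch of what those bytes contain, the snippet below packs a list of floats as little-endian float32 with the standard library and reads them back. to_float32_bytes and from_float32_bytes are hypothetical helpers, and the little-endian layout is an assumption that matches what NumPy's tobytes() produces on common x86 and ARM machines.

```python
import struct


def to_float32_bytes(vector):
    """Pack floats as consecutive little-endian 32-bit values."""
    return struct.pack(f"<{len(vector)}f", *vector)


def from_float32_bytes(blob):
    """Unpack a blob produced by to_float32_bytes back into a list of floats."""
    count = len(blob) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", blob))


blob = to_float32_bytes([0.5, -1.0, 2.25])
print(len(blob))
print(from_float32_bytes(blob))
```

Each value takes 4 bytes, so the blob length has to equal 4 times the DIM declared in the index, with TYPE set to FLOAT32.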



Skip to Table of Contents - Skip to main content Introduction to RGAA RGAA companion guide Technical reference Criteria - current page Glossary Particular cases Technical notes Baseline References RGAA 3 2016 - Criteria - English translation The RGAA is the French government's General Accessibility Reference for Administrations. It is meant to provide a way to check conformity to WCAG 2.0. Table of Contents How to use the RGAA Images Frames Colors Multimedia Tables Links Scripts Mandatory elements Information structure Presentation of information Forms Navigation Consultation How to use the RGAA The RGAA applies to any HTML content (HTML4, XHTML1, HTML5). For some tests, a reference baseline is used. This baseline takes into account a set of assistive technologies, browsers and operating systems, on which the accessibility of JavaScript-based interface components must be tested, among others. A detailed description is provided here: Baseline. Important notice regarding HTML content pri