<a href="https://colab.research.google.com/github/afsarahannan/NLP_RAG_Project-/blob/main/Retrieval_Generation_project_with_ColBert_and_T5.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RAG Project: ColBERT, T5-base and GPT-3.5

If you're working in Google Colab, we recommend selecting "T4 GPU" as your hardware accelerator in the runtime settings.



## Setting up the environment, loading the necessary libraries, and loading the dataset

In [None]:
# Update the ColBERT repository if it already exists; otherwise clone it
!git -C ColBERT/ pull || git clone https://github.com/stanford-futuredata/ColBERT.git

# Add ColBERT to the system path so that its modules can be imported
import sys; sys.path.insert(0, 'ColBERT/')


fatal: cannot change to 'ColBERT/': No such file or directory
Cloning into 'ColBERT'...
remote: Enumerating objects: 2662, done.
remote: Counting objects: 100% (1165/1165), done.
remote: Compressing objects: 100% (364/364), done.
remote: Total 2662 (delta 908), reused 849 (delta 801), pack-reused 1497
Receiving objects: 100% (2662/2662), 2.03 MiB | 22.16 MiB/s, done.
Resolving deltas: 100% (1667/1667), done.


In [None]:
# If running in Colab, upgrade pip and install ColBERT with the FAISS GPU and PyTorch extras
try:
    import google.colab
    !pip install -U pip
    !pip install -e ColBERT/['faiss-gpu','torch']
except Exception:
  # Outside Colab, add ColBERT to the system path and check that it can be imported
  import sys; sys.path.insert(0, 'ColBERT/')
  try:
    # Import Indexer and Searcher from the colbert package
    from colbert import Indexer, Searcher
  except Exception:
    print("You're running outside Colab, please make sure you install ColBERT in conda. Conda is recommended.")
    assert False

Collecting pip
  Downloading pip-24.0-py3-none-any.whl (2.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 26.3 MB/s eta 0:00:00
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 23.1.2
    Uninstalling pip-23.1.2:
      Successfully uninstalled pip-23.1.2
Successfully installed pip-24.0
Obtaining file:///content/ColBERT
  Preparing metadata (setup.py) ... done
Collecting bitarray (from colbert-ai==0.2.19)
  Downloading bitarray-2.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (34 kB)
Collecting datasets (from colbert-ai==0.2.19)
  Downloading datasets-2.17.1-py3-none-any.whl.metadata (20 kB)
Collecting git-python (from colbert-ai==0.2.19)
  Downloading git_python-1.0.3-py2.py3-none-any.whl (1.9 kB)
Collecting python-dotenv (from colbert-ai==0.2.19)
  Downloading python_dotenv-1.0.1-py3-none-any.whl.metadata (23 kB)
Collecting ninja (from co

In [None]:
# Import colbert and all the necessary classes from the library
import colbert
from colbert import Indexer, Searcher
from colbert.infra import Run, RunConfig, ColBERTConfig
from colbert.data import Queries, Collection

No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'


In [None]:
# Load the tab-separated dataset files
# Note: upload the files to Colab's temporary /content directory first
import pandas as pd
queries = pd.read_csv("/content/queries.csv", header = None ,sep='\t')
answers = pd.read_csv("/content/answers.csv", header = None ,sep='\t')

In [None]:
# Extract the questions and build the reference answer for each one

questions = [x for x in queries[0]]
answer_index = [x for x in queries[1]]
# Answer ids are comma-separated and 1-based; convert them to 0-based list positions
answer_index_integers = [[int(num)-1 for num in sub_string.split(',')] for sub_string in answer_index]
answer_text = [x for x in answers[0]]

# Look up the passages for each question and concatenate them into one reference answer
answer = [[answer_text[x] for x in sublist] for sublist in answer_index_integers]
answer_compiled = ["".join(text) for text in answer]

print(f"There are {len(questions)} questions and {len(answer_text)} answers in this notebook.")

There are 190 questions and 203 answers in this notebook.
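
To make the index handling above concrete, here is a minimal sketch on made-up data. It assumes, as the cell above implies, that each row of `queries.csv` pairs a question with a comma-separated list of 1-based passage numbers, and that `answers.csv` holds one passage per row. The rows, passages and the helper name `parse_answer_ids` are hypothetical and only serve as an illustration.

```python
# Hypothetical rows mimicking queries.csv: (question, 1-based answer ids).
rows = [
    ("what is machine learning ?", "1"),
    ("What is deep learning?", "3,4"),
]
# Hypothetical passages mimicking answers.csv (one passage per row).
passages = ["ML is ...", "Unused passage.", "DL is a subset of ML. ", "It uses neural nets."]


def parse_answer_ids(cell):
    """Turn a string such as '3,4' into zero-based list positions [2, 3]."""
    return [int(num) - 1 for num in cell.split(",")]


# Concatenate the referenced passages into one gold answer per question.
gold = ["".join(passages[i] for i in parse_answer_ids(ids)) for _, ids in rows]
print(gold[1])
```

A question can therefore point at several passages, which is why the number of questions and the number of answers differ.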


## Indexing


Below, the `Indexer` takes a model checkpoint and writes a (compressed) index to disk. We then prepare a `Searcher` for retrieval from this index.

The indexer module builds an index of the document embeddings, which represent the documents in a vector space.  
The searcher module runs search queries over the indexed document embeddings and returns a ranked list of the most relevant documents.
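
The index is "compressed" in the sense that stored values are kept with only a few bits each (the `nbits` setting in the next cell). The sketch below illustrates the general idea of quantizing values into `2**nbits` buckets using quantile cutoffs. It is a simplified illustration under that assumption, not ColBERT's actual implementation; the function names and the sample values are hypothetical.

```python
from bisect import bisect_right


def bucket_cutoffs(values, nbits):
    """Pick 2**nbits - 1 quantile cutoffs that split `values` into equal-sized buckets."""
    ordered = sorted(values)
    n_buckets = 2 ** nbits
    return [ordered[len(ordered) * i // n_buckets] for i in range(1, n_buckets)]


def quantize(values, cutoffs):
    """Replace each value by the index of the bucket it falls into."""
    return [bisect_right(cutoffs, v) for v in values]


# Made-up values standing in for one embedding's components.
residual = [-0.9, -0.4, -0.1, 0.0, 0.2, 0.3, 0.6, 0.8]
cutoffs = bucket_cutoffs(residual, nbits=2)
codes = quantize(residual, cutoffs)
print(cutoffs, codes)
```

With `nbits=2` every value is replaced by a code between 0 and 3, so it fits in two bits; the price is that the original value can only be recovered approximately.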

In [None]:
# Encode each dimension with 2 bits: lowering the precision of the stored embeddings saves memory and compute, which are limited in this case.
# Note: ColBERT tokenizes text with BERT's WordPiece tokenizer.
nbits = 2
doc_maxlen = 300 # truncate passages at 300 tokens
# max_id = 10000

index_name = f'ML_Edge.{nbits}bits'

In [None]:
index_name

'ML_Edge.2bits'

In this project we do not fine-tune the model, because the dataset is small and the model might overfit it. The indexer therefore runs on the dataset with the weights of the pretrained model.  

Now run the `Indexer` on the collection.

In [None]:
checkpoint = 'colbert-ir/colbertv2.0'

with Run().context(RunConfig(nranks=1, experiment='notebook')):  # nranks specifies the number of GPUs to use
    config = ColBERTConfig(doc_maxlen=doc_maxlen, nbits=nbits, kmeans_niters=4) # kmeans_niters specifies the number of iterations of k-means clustering; 4 is a good and fast default.

    # the indexer will load the checkpoint of the pre-trained ColBert model
    indexer = Indexer(checkpoint=checkpoint, config=config)
    indexer.index(name=index_name, collection=answer_text, overwrite=True)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


artifact.metadata:   0%|          | 0.00/1.63k [00:00<?, ?B/s]



[Feb 20, 09:26:17] #> Creating directory /content/experiments/notebook/indexes/ML_Edge.2bits 


#> Starting...
#> Joined...


## Search



The searcher module embeds the query and then conducts a similarity search, after which it ranks the indexed documents and retrieves those with the highest similarity scores for the query.
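
As a rough illustration of this scoring and ranking step: ColBERT compares a query and a document token by token and, for every query token, keeps the best-matching document token (the "MaxSim" operation that also appears in the log output below). The sketch uses made-up 2-dimensional embeddings and hypothetical helper names; the real `Searcher` works on the compressed index and is considerably more involved.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs):
    """Sum, over query tokens, of the best dot product against any document token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def top_k(query_vecs, docs, k):
    """Return (doc_id, score) pairs for the k highest-scoring documents."""
    scored = [(doc_id, maxsim_score(query_vecs, vecs)) for doc_id, vecs in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy token embeddings (made-up values, for illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    0: [[1.0, 0.0], [0.0, 1.0]],  # matches both query tokens
    1: [[1.0, 0.0], [1.0, 0.0]],  # matches only the first query token
    2: [[0.4, 0.4]],              # weak match on both query tokens
}
ranking = top_k(query, docs, k=2)
print(ranking)
```

Document 0 ranks first because each query token finds an exact counterpart in it; `k` plays the same role as the `k=3` passed to `searcher.search` later in the notebook.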

In [None]:
# Create the searcher
with Run().context(RunConfig(experiment='notebook')):
  searcher = Searcher(index=index_name, collection=answer_text)



[Feb 20, 09:35:41] Loading segmented_maxsim_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...
[Feb 20, 09:35:41] #> Loading codec...
[Feb 20, 09:35:41] #> Loading IVF...
[Feb 20, 09:35:41] Loading segmented_lookup_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...




[Feb 20, 09:36:13] #> Loading doclens...


100%|██████████| 1/1 [00:00<00:00, 2918.79it/s]

[Feb 20, 09:36:13] #> Loading codes and residuals...



100%|██████████| 1/1 [00:00<00:00, 454.72it/s]

[Feb 20, 09:36:13] Loading filter_pids_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...





[Feb 20, 09:36:51] Loading decompress_residuals_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...


In [None]:
# Build the context list: the top-3 passages retrieved by ColBERT for every question
context = []

for query in questions:
  results = searcher.search(query, k=3)
  intermediate_data = []
  for passage_id, passage_rank, passage_score in zip(*results):
    data = searcher.collection[passage_id]
    intermediate_data.append(data)
  context.append(intermediate_data)

# Concatenate the retrieved passages into a single context string per question
context_compiled = ["".join(text) for text in context]


#> QueryTokenizer.tensorize(batch_text[0], batch_background[0], bsize) ==
#> Input: . what is machine learning ?, 		 True, 		 None
#> Output IDs: torch.Size([32]), tensor([ 101,    1, 2054, 2003, 3698, 4083, 1029,  102,  103,  103,  103,  103,
         103,  103,  103,  103,  103,  103,  103,  103,  103,  103,  103,  103,
         103,  103,  103,  103,  103,  103,  103,  103])
#> Output Mask: torch.Size([32]), tensor([1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0])





In [None]:
# Save the context retrieved by the ColBERT model to an external text file
file_path = '/content/context.txt'

# Write one list of retrieved passages per line
with open(file_path, 'w') as file:
    for item in context:
        file.write("%s\n" % item)

## Generation

### Generation with pretrained t5-base model

In [None]:
# Install the transformers, evaluate and rouge libraries
!pip install transformers
!pip install evaluate
!pip install rouge


In [None]:
!pip install --upgrade pyarrow
!pip uninstall -y evaluate
!pip install evaluate


In [None]:
import torch
import json
from tqdm import tqdm
import torch.nn as nn
from torch.optim import Adam
import nltk
import spacy
import string
# import evaluate
from torch.utils.data import Dataset, DataLoader, RandomSampler
import transformers
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from transformers import T5Tokenizer, T5Model, T5ForConditionalGeneration, T5TokenizerFast

import warnings
warnings.filterwarnings("ignore")

In [None]:
#set the values for the model parameter
MODEL_NAME ="t5-base"
TOKENIZER = T5TokenizerFast.from_pretrained(MODEL_NAME)
MODEL = T5ForConditionalGeneration.from_pretrained(MODEL_NAME, return_dict=True)
OPTIMIZER = Adam(MODEL.parameters(), lr=0.0001)
Q_LEN = 150   # Question Length
T_LEN = 200    # Target Length
BATCH_SIZE = 3
DEVICE = "cpu" #DEVICE = "cuda:0" #if you have cuda
MODEL.to(DEVICE)  # keep the model on the same device as the input batches

In [None]:
#create a class to extract the questions context and the answer from the dataframe
class QA_Dataset(Dataset):
    def __init__(self, tokenizer, dataframe, q_len, t_len):
        self.tokenizer = tokenizer
        self.q_len = q_len
        self.t_len = t_len
        self.data = dataframe
        self.questions = self.data["questions"]
        self.context = self.data["context"]
        self.answer = self.data["answer"]

    def __len__(self):
        return len(self.questions)

    def __getitem__(self, idx):
        question = self.questions[idx]
        context = self.context[idx]
        answer = self.answer[idx]

        # padding="max_length" already pads every example to the maximum length
        question_tokenized = self.tokenizer(question, context, max_length=self.q_len, padding="max_length",
                                                    truncation=True, add_special_tokens=True)
        answer_tokenized = self.tokenizer(answer, max_length=self.t_len, padding="max_length",
                                          truncation=True, add_special_tokens=True)

        labels = torch.tensor(answer_tokenized["input_ids"], dtype=torch.long)
        labels[labels == 0] = -100

        return {
            "input_ids": torch.tensor(question_tokenized["input_ids"], dtype=torch.long),
            "attention_mask": torch.tensor(question_tokenized["attention_mask"], dtype=torch.long),
            "labels": labels,
            "decoder_attention_mask": torch.tensor(answer_tokenized["attention_mask"], dtype=torch.long)
        }
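
The line `labels[labels == 0] = -100` in the class above replaces padding token ids in the labels with -100, the value that the cross-entropy loss used by Hugging Face models skips. The following plain-Python sketch shows the effect on toy numbers; `masked_cross_entropy` is a hypothetical helper written for illustration, not a library function.

```python
import math

IGNORE_INDEX = -100  # positions with this label do not contribute to the loss


def masked_cross_entropy(logits, labels):
    """Average negative log-likelihood over positions whose label is not IGNORE_INDEX."""
    total, counted = 0.0, 0
    for scores, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # padded position: skipped entirely
        log_norm = math.log(sum(math.exp(s) for s in scores))
        total += log_norm - scores[label]
        counted += 1
    return total / counted


# Three positions over a vocabulary of size 3 (made-up scores); the last one is padding.
logits = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [5.0, 0.0, 0.0]]
loss_masked = masked_cross_entropy(logits, [0, 1, IGNORE_INDEX])
loss_first_two = masked_cross_entropy(logits[:2], [0, 1])
print(loss_masked, loss_first_two)
```

Both calls give the same loss, which shows that the padded position has no influence on training.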

In [None]:
questions_df = pd.DataFrame(questions, columns=['questions'])
answer_df = pd.DataFrame(answer_compiled, columns=['answer'])
context_df = pd.DataFrame(context_compiled, columns=['context'])

data = pd.concat([questions_df, context_df,answer_df], axis=1)
data

Unnamed: 0,questions,context,answer
0,what is machine learning ?,Machine learning is a branch of computer scien...,Machine learning is a branch of computer scien...
1,Why do we need to know about machine learning ?,Machine Learning is an important field of data...,Machine Learning is an important field of data...
2,What is deep learning?,Deep learning is a subset of Machine learning ...,Deep learning is a subset of Machine learning ...
3,How is deep learning different from machine le...,Deep learning is a subset of Machine learning ...,Unlike traditional ML it does not require man...
4,When do we use deep learning or machine learni...,Deep learning is a subset of Machine learning ...,Machine learning is used when the task is simp...
...,...,...,...
185,What are the different types of quantization,Quantization involves converting the continuou...,There is Post-Training Quantization: It's simp...
186,How does transfer learning work,Transfer learning involves taking a model trai...,The process starts with a model that has been ...
187,What are the different types of transfer learning,Types of Transfer Learning include Inductive T...,Types of Transfer Learning include Inductive T...
188,What is Knowledge Distillation,Knowledge distillation involves training a sma...,Knowledge distillation involves training a sma...


In [None]:
# Split the dataset into training and validation sets, then load them with DataLoaders
from torch.utils.data import SubsetRandomSampler

train_data, val_data = train_test_split(data, test_size=0.2, random_state=42)

# Sample from the row indices of each split so that training and validation batches do not overlap
train_sampler = SubsetRandomSampler(train_data.index.tolist())
val_sampler = SubsetRandomSampler(val_data.index.tolist())

qa_dataset = QA_Dataset(TOKENIZER, data, Q_LEN, T_LEN)

train_loader = DataLoader(qa_dataset, batch_size=BATCH_SIZE, sampler=train_sampler)
val_loader = DataLoader(qa_dataset, batch_size=BATCH_SIZE, sampler=val_sampler)

In [None]:
#run the training loop along with model evaluation
train_loss = 0
val_loss = 0
train_batch_count = 0
val_batch_count = 0

for epoch in range(2):
    MODEL.train()
    for batch in tqdm(train_loader, desc="Training batches"):
        input_ids = batch["input_ids"].to(DEVICE)
        attention_mask = batch["attention_mask"].to(DEVICE)
        labels = batch["labels"].to(DEVICE)
        decoder_attention_mask = batch["decoder_attention_mask"].to(DEVICE)

        outputs = MODEL(
                          input_ids=input_ids,
                          attention_mask=attention_mask,
                          labels=labels,
                          decoder_attention_mask=decoder_attention_mask
                        )

        OPTIMIZER.zero_grad()
        outputs.loss.backward()
        OPTIMIZER.step()
        train_loss += outputs.loss.item()
        train_batch_count += 1

    # Evaluation: compute the validation loss without gradients or weight updates
    MODEL.eval()
    with torch.no_grad():
        for batch in tqdm(val_loader, desc="Validation batches"):
            input_ids = batch["input_ids"].to(DEVICE)
            attention_mask = batch["attention_mask"].to(DEVICE)
            labels = batch["labels"].to(DEVICE)
            decoder_attention_mask = batch["decoder_attention_mask"].to(DEVICE)

            outputs = MODEL(
                              input_ids=input_ids,
                              attention_mask=attention_mask,
                              labels=labels,
                              decoder_attention_mask=decoder_attention_mask
                            )

            val_loss += outputs.loss.item()
            val_batch_count += 1

    print(f"{epoch+1}/{2} -> Train loss: {train_loss / train_batch_count}\tValidation loss: {val_loss/val_batch_count}")

Training batches: 100%|██████████| 51/51 [17:20<00:00, 20.40s/it]
Validation batches: 100%|██████████| 13/13 [03:51<00:00, 17.84s/it]


1/2 -> Train loss: 2.277066555531586	Validation loss: 1.4941018223762512


Training batches: 100%|██████████| 51/51 [16:38<00:00, 19.58s/it]
Validation batches: 100%|██████████| 13/13 [04:14<00:00, 19.58s/it]

2/2 -> Train loss: 1.841683010333309	Validation loss: 1.286982897669077





In [None]:
#save the model weights for inference
MODEL.save_pretrained("QA_model")
TOKENIZER.save_pretrained("QA_tokenizer")

('QA_tokenizer/tokenizer_config.json',
 'QA_tokenizer/special_tokens_map.json',
 'QA_tokenizer/spiece.model',
 'QA_tokenizer/added_tokens.json',
 'QA_tokenizer/tokenizer.json')

In [None]:
#function to make model inference
from nltk.translate.bleu_score import sentence_bleu

def predict_answer(context, question, ref_answer=None):
    inputs = TOKENIZER(question, context, max_length=Q_LEN, padding="max_length", truncation=True, add_special_tokens=True)

    input_ids = torch.tensor(inputs["input_ids"], dtype=torch.long).to(DEVICE).unsqueeze(0)
    attention_mask = torch.tensor(inputs["attention_mask"], dtype=torch.long).to(DEVICE).unsqueeze(0)

    outputs = MODEL.generate(input_ids=input_ids, attention_mask=attention_mask)

    predicted_answer = TOKENIZER.decode(outputs.flatten(), skip_special_tokens=True)

    if ref_answer:
        # Split the prediction and the reference into whitespace tokens
        predicted_tokens = predicted_answer.split()
        ref_tokens = ref_answer.split()

        # Compute BLEU score
        bleu_score = sentence_bleu([ref_tokens], predicted_tokens)

        print("Context: \n", context)
        print("\n")
        print("Question: \n", question)
        return {
            "Reference Answer: ": ref_answer,
            "Predicted Answer: ": predicted_answer,
            "BLEU Score: ": bleu_score
        }
    else:
        return predicted_answer

In [None]:
# Run inference on a question from the dataset
import random
n = random.randint(0,71) # pick a question at random from the first 72 rows of the dataset

context = data['context'][n]
question = data['questions'][n]
answer = data['answer'][n]

predict_answer(context, question, answer)

Context: 
 Instance based learning is when the model learns from only the data that is provided without any further generalization. It updates its learning with the input of new data at every instance. Machine learning algorithms are used to either make a prediction or classify a given data input. The data may or may not be labeled. The way a machine learning algorithm learns is with the help of a mathematical function which is responsible to evaluate the prediction of a model with it’s true label Models are then adjusted to reduce discrepancy between a known example and the model estimate with the help of a loss function. Model based learning models are the type of supervised models that learn underlying patterns from the data and then generalizes on new data by extending the parameters that are learned from the training data e.g. Spam filter may learn on the fly with a deep neural network – online model-based supervised learning system 


Question: 
 What is instance based predictive

{'Reference Answer: ': 'Instance based learning is when the model learns from only the data that is provided without any further generalization. It updates its learning with the input of new data at every instance. ',
 'Predicted Answer: ': 'It learns from the input of new data at every instance. Instance based learning',
 'BLEU Score: ': 0.1857843257365279}

In [None]:
# Ask a question that is not in the dataset: retrieve context with ColBERT, then generate an answer
def ask_RAG(query):
  results = searcher.search(query, k=3)
  all_data = []
  for passage_id, passage_rank, passage_score in zip(*results):
    data = searcher.collection[passage_id]
    all_data.append(data)
  # Join the passages and generate once, after all of them have been collected
  retrieved_context = ''.join(all_data)
  prediction = predict_answer(retrieved_context, query)
  print(f"retrieved information: {retrieved_context} \ngenerated response: {prediction}")


### Try out the model

In [None]:
# These are the types of sample questions we can ask the model

import random
for i in range(10):
  n = random.randint(0,len(questions)-1)
  print(questions[n])

What is collaborative filtering ?
What are the computational challenges faced by a CNN model 
What is dropout method in model optimization 
What is Mean squared bias in regression analysis?
What is forward pass in the training process of a convolutional network 
What are the different methods of hyperparameter tuning ?
What is context aware systems ?
What is byte pair encoding 
What are the different types of recommendation systems 
How can the computational costs be reduced for a CNN model ? 


In [None]:
#training question
query = "What is byte pair encoding ?"
ask_RAG(query)

retrieved information: Byte Pair Encoding (BPE) is a middle ground between word-level and character-level tokenization. It starts with a base vocabulary of individual characters and iteratively merges the most frequent pair of tokens Efficient in representing common subword units significantly reduces the vocabulary size and handles out-of-vocabulary words well. Used in models like GPT-2 GPT-3The Seq2seq model uses encode and decoder architecture. Where the encoder Processes the input sequence and compresses the information into a context vector. The decoder then takes the context vector and generates the output sequence. Seq2seq models are typically trained end-to-end on paired
sequences (e.g.  an English sentence and its French translation) The training objective is to maximize the likelihood of the correct output sequence given the input sequence. Seq2seq models are good at translating text from one language to another  converting spoken language into text  text summarization  and i

In [None]:
#Within syllabus question but paraphrased
query = "Can you tell me something about collaborative filtering ?"
ask_RAG(query)

retrieved information: There are two types of collaborative filtering. User-Based Collaborative Filtering: This method finds users who have similar preferences or behavior patterns to the target user and recommends items that these similar users have liked or interacted with and Item-Based Collaborative Filtering: Instead of finding similar users this approach identifies items that are similar to those the user has already liked or interacted with based on other users' interactions with these items.    Hybrid recommendation systems combine collaborative and content-based filtering methods. The goal is to improve recommendation quality and overcome the limitations inherent in any single approach. For example a hybrid system might use collaborative filtering to identify a set of users with similar tastes and then use content-based filtering to find items that those users liked and that match the target user's content preferences.The different type of recommendation system includes Collab

In [None]:
#Out of syllabus question but within the domain
query = "How can Machine Learning help us in life?"
ask_RAG(query)

retrieved information: Machine learning is a branch of computer science which focuses on the use of data and algorithms to imitate the way humans learn Machine Learning is an important field of data science because there is too much data in the world for humans to process and Classical Machine Learning is dependent on human Intervention which is a sub-field of AI that uses algorithms trained on data to produce adaptable models to perform tasks  Machine learning algorithms are used to either make a prediction or classify a given data input. The data may or may not be labeled. The way a machine learning algorithm learns is with the help of a mathematical function which is responsible to evaluate the prediction of a model with it’s true label Models are then adjusted to reduce discrepancy between a known example and the model estimate with the help of a loss function.  
generated response: is a branch of computer science which focuses on the use of data and algorithms to


In [None]:
#Out of syllabus question and out of domain
query = "How to be happy in life?"
ask_RAG(query)

retrieved information: Precision counts the true positives out of all the items predicted to be positive Ideal Usage When the cost of false positive is too high Example: Email spam detectionA perfect model fit can be achieved By first constructing good evaluation metrics to give us feedback on model performance Then tuning hyper-parameters till the performance improves across the board Recall counts how many of the true positive items were correctly classified Ideal Usage When the missing a positive is too costly  
generated response: :         


### Testing with LangChain GPT-3.5

NOTE: This requires an API key, which will be available for 3 months starting from 14 February 2024.

In [None]:
!pip install langchain langchain_community tiktoken langchain-openai langchainhub chromadb
!pip install openai

In [None]:
import os
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
os.environ['LANGCHAIN_API_KEY'] = "<your-langchain-api-key>"  # placeholder: never commit a real key to the notebook

In [None]:
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

In [None]:
# Prompt
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
prompt

ChatPromptTemplate(input_variables=['context', 'question'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template='Answer the question based only on the following context:\n{context}\n\nQuestion: {question}\n'))])
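
To show what the model eventually receives, the sketch below fills the same template with plain string formatting. This is only an approximation for illustration: `ChatPromptTemplate` additionally wraps the text in a chat message object, and `build_prompt` as well as the sample passages are hypothetical.

```python
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""


def build_prompt(passages, question):
    """Join the retrieved passages and substitute them into the template."""
    return template.format(context="".join(passages), question=question)


# Made-up passages standing in for the ColBERT retrieval results.
prompt_text = build_prompt(["Passage one. ", "Passage two."], "What is X?")
print(prompt_text)
```

Because the instruction restricts the model to the supplied context, a question that the retrieved passages do not cover should lead to a refusal rather than an invented answer.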

In [None]:
# LLM
# Read the OpenAI key from the OPENAI_API_KEY environment variable instead of hard-coding it
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0, openai_api_key=os.environ["OPENAI_API_KEY"])

In [None]:
# Chain
chain = prompt | llm

### Try out the LangChain framework with GPT-3.5  
Please note that this will work only as long as access to the OpenAI API is free.

In [None]:
query = "Can you tell me something about collaborative filtering ?"

In [None]:
# Run: retrieve the top-3 passages, join them, and pass them to the chain
results = searcher.search(query, k=3)
all_data = []
for passage_id, passage_rank, passage_score in zip(*results):
  data = searcher.collection[passage_id]
  all_data.append(data)
retrieved_context = ''.join(all_data)

chain.invoke({"context":retrieved_context,"question":query})



AIMessage(content='Collaborative filtering is a recommendation system that identifies users with similar preferences or behavior patterns and recommends items based on the interactions of these similar users. It can be user-based, where similar users are found, or item-based, where similar items are identified. Hybrid recommendation systems combine collaborative and content-based filtering methods to improve recommendation quality.')



In [None]:
query = "What is the secret to a successful life?"
# Run: retrieve the top-3 passages, join them, and pass them to the chain
results = searcher.search(query, k=3)
all_data = []
for passage_id, passage_rank, passage_score in zip(*results):
  data = searcher.collection[passage_id]
  all_data.append(data)
retrieved_context = ''.join(all_data)

chain.invoke({"context":retrieved_context,"question":query})



AIMessage(content='The context provided does not address the question of what the secret to a successful life is.')

