# How to use functions with a knowledge base

This notebook builds on the concepts in the argument generation notebook by creating an agent with access to a knowledge base and two functions that it can call based on the user's requirements.

We'll create an agent that uses data from arXiv to answer questions about academic subjects. It has two functions at its disposal:

- <span style="color: rgb(0,255,0);">***get_articles***</span>: A function that gets arXiv articles on a subject and summarizes them for the user with links.
- <span style="color: rgb(0,255,0);">***read_article_and_summarize***</span>: This function takes one of the previously searched articles, reads it in its entirety, and summarizes the core argument, evidence, and conclusions.

This will get you comfortable with a multi-function workflow that can choose from multiple services, and where some of the data from the first function is persisted to be used by the second.

## Walkthrough

This guide will take you through the following workflow:

- **Search utilities**: Creating the two functions that access arXiv for answers.
- **Configure Agent**: Building the Agent behaviour that will assess the need for a function and, if one is required, call that function and present the results back to the agent.
- **arXiv conversation**: Putting all of this together in a live conversation.

In [2]:
# !pip install scipy
# !pip install tenacity
# !pip install tiktoken==0.3.3
# !pip install termcolor 
# !pip install openai
# !pip install requests
# !pip install arxiv
# !pip install pandas
# !pip install PyPDF2
# !pip install tqdm

In [3]:
import os                                 # Operating system module.

import arxiv                              # Module for interacting with the arXiv preprint archive.
import ast                                # Module for processing abstract syntax trees.
import concurrent.futures                 # Module for concurrent execution (thread pools).
from csv import writer                    # For writing to CSV files.
from IPython.display import display, Markdown, Latex    # For displaying data in Jupyter environments.
import json                               # For handling JSON data.
import openai                             # OpenAI module.
import pandas as pd                       # For data handling and analysis.
from PyPDF2 import PdfReader              # For reading PDF files.
import requests                           # For making HTTP requests.
from scipy import spatial                 # For spatial and distance calculations.
from tenacity import retry, wait_random_exponential, stop_after_attempt     # For retries with wait strategies.
import tiktoken                           # OpenAI's tokenizer library, used here to split text into tokens.
from tqdm import tqdm                     # For progress bars.
from termcolor import colored             # For coloring terminal text.
import config                             # Local module that holds OPENAI_API_KEY.

GPT_MODEL = "gpt-3.5-turbo-0613"
EMBEDDING_MODEL = "text-embedding-ada-002"

# Read the API key from the local config module; never hard-code or commit a key
openai.api_key = config.OPENAI_API_KEY

## Search utilities

We'll first set up some utilities that will underpin our two functions.

Downloaded papers will be stored in a directory (we use ./data/papers here). We create a file arxiv_library.csv to store the embeddings and details of the downloaded papers so that summarize_text can retrieve them.
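
As a rough illustration of how one row of such a library file round-trips, the sketch below writes a record (title, file path, embedding) with the standard csv module and parses the embedding back with ast.literal_eval. The file name, title, path, and three-element embedding are made up for the example; real embeddings are much longer.

```python
import ast
import csv
import os
import tempfile

# Hypothetical record: title, local PDF path, and a toy 3-dimensional embedding
record = ["A toy paper title", "./data/papers/toy.pdf", [0.1, -0.2, 0.3]]

# Write to a throwaway file so the real library is not touched
library_path = os.path.join(tempfile.mkdtemp(), "toy_library.csv")

# Append the record; the embedding list is stored as its string representation
with open(library_path, "a", newline="") as f:
    csv.writer(f).writerow(record)

# Read it back; the embedding column arrives as a string and must be parsed
with open(library_path, newline="") as f:
    title, filepath, embedding_str = next(csv.reader(f))

embedding = ast.literal_eval(embedding_str)
print(title, filepath, embedding)
```

Storing the embedding as text keeps the file human-readable, at the cost of a parsing step every time the library is loaded.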

In [4]:
directory = './data/papers'

if not os.path.exists(directory):    # Check if the directory already exists

    # If the directory doesn't exist, create it and any necessary intermediate directories
    os.makedirs(directory)
    print(f"Directory '{directory}' created successfully.")
else:

    # If the directory already exists, print a message indicating it
    print(f"Directory '{directory}' already exists.")

Directory './data/papers' created successfully.


In [5]:
# Set a directory to store downloaded papers
data_dir = os.path.join(os.curdir, "data", "papers")
paper_dir_filepath = "./data/arxiv_library.csv"

# Generate a blank dataframe where we can store downloaded files
df = pd.DataFrame(list())
df.to_csv(paper_dir_filepath)

In [45]:
# If the call fails, retry with a randomized exponential backoff (waits between 1 and 40 seconds), up to 3 attempts.
@retry(wait=wait_random_exponential(min=1, max=40), stop=stop_after_attempt(3))
def embedding_request(text):
    response = openai.embeddings.create(input=text, model=EMBEDDING_MODEL)
    return response

# def get_embedding(mi_texto, model= "text-embedding-ada-002"):
#     text = mi_texto.replace('\n', ' ')
#     respuesta = openai.embeddings.create(input= text, model= model)
#     return respuesta.data[0].embedding


def get_articles(query, library=paper_dir_filepath, top_k=5):
    """
    This function gets the top_k articles based on a user's query, sorted by relevance.
    It also downloads the files and stores their metadata in the CSV file (library),
    arxiv_library.csv by default, to be retrieved by read_article_and_summarize.
    """
    search = arxiv.Search(                                # Run an arXiv search using the given criteria.
        query=query, 
        max_results=top_k, 
        sort_by=arxiv.SortCriterion.Relevance
    )

    result_list = []
    for result in search.results():                       # Iterate over each article found by the search.
        result_dict = {}                                  # Build a dictionary with the article metadata: title, summary, and the article and PDF URLs.
        result_dict.update({"title"  : result.title})
        result_dict.update({"summary": result.summary})

        # Taking the first url provided
        result_dict.update({"article_url": [x.href for x in result.links][0]})
        result_dict.update({"pdf_url"    : [x.href for x in result.links][1]})
        result_list.append(result_dict)

        # Store references in library file
        response = embedding_request(text=result.title) # Get the embedding of the article title using embedding_request and
        file_reference = [                              # build a file reference holding the title, the downloaded PDF path, and the embedding.
            result.title,
            result.download_pdf(data_dir),
            # response["data"][0]["embedding"],
            response.data[0].embedding
        ]

        # Write to file
        with open(library, "a") as f_object:      # Open the CSV file in append mode ("a") and write the file reference to it; the with block closes the file.
            writer_object = writer(f_object)
            writer_object.writerow(file_reference)
            
    return result_list
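
The retry behaviour applied to embedding_request can be pictured with a hand-rolled loop. The sketch below is a rough standard-library stand-in, not tenacity's exact algorithm: retry_with_backoff, flaky, and the injectable sleep parameter are hypothetical names introduced only for this illustration.

```python
import random
import time


def retry_with_backoff(fn, attempts=3, min_wait=1.0, max_wait=40.0, sleep=time.sleep):
    """Call fn(); on failure wait a random, exponentially growing time and retry."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # The upper bound doubles with each attempt, capped at max_wait
            upper = min(max_wait, min_wait * 2 ** attempt)
            sleep(random.uniform(min_wait, upper))


# Usage: a hypothetical flaky call that succeeds on the third try
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


waits = []
# Record the waits instead of actually sleeping so the example runs instantly
result = retry_with_backoff(flaky, sleep=waits.append)
print(result, calls["count"], len(waits))
```

Randomizing the wait spreads out retries from concurrent callers, which helps when the failures come from rate limits.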


In [46]:
# Test that the search is working
result_output = get_articles("ppo reinforcement learning")
result_output[0]


  for result in search.results():


{'title': 'Proximal Policy Optimization and its Dynamic Version for Sequence Generation',
 'summary': 'In sequence generation task, many works use policy gradient for model\noptimization to tackle the intractable backpropagation issue when maximizing\nthe non-differentiable evaluation metrics or fooling the discriminator in\nadversarial learning. In this paper, we replace policy gradient with proximal\npolicy optimization (PPO), which is a proved more efficient reinforcement\nlearning algorithm, and propose a dynamic approach for PPO (PPO-dynamic). We\ndemonstrate the efficacy of PPO and PPO-dynamic on conditional sequence\ngeneration tasks including synthetic experiment and chit-chat chatbot. The\nresults show that PPO and PPO-dynamic can beat policy gradient by stability and\nperformance.',
 'article_url': 'http://arxiv.org/abs/1808.07982v1',
 'pdf_url': 'http://arxiv.org/pdf/1808.07982v1'}

In [48]:
for result in result_output:
    for key, value in result.items():
        print(key, ': ', value)
    print('\n')

title :  Proximal Policy Optimization and its Dynamic Version for Sequence Generation
summary :  In sequence generation task, many works use policy gradient for model
optimization to tackle the intractable backpropagation issue when maximizing
the non-differentiable evaluation metrics or fooling the discriminator in
adversarial learning. In this paper, we replace policy gradient with proximal
policy optimization (PPO), which is a proved more efficient reinforcement
learning algorithm, and propose a dynamic approach for PPO (PPO-dynamic). We
demonstrate the efficacy of PPO and PPO-dynamic on conditional sequence
generation tasks including synthetic experiment and chit-chat chatbot. The
results show that PPO and PPO-dynamic can beat policy gradient by stability and
performance.
article_url :  http://arxiv.org/abs/1808.07982v1
pdf_url :  http://arxiv.org/pdf/1808.07982v1


title :  CIM-PPO:Proximal Policy Optimization with Liu-Correntropy Induced Metric
summary :  As an algorithm based on

In [22]:
def strings_ranked_by_relatedness(
    query: str,
    df: pd.DataFrame,
    relatedness_fn=lambda x, y: 1 - spatial.distance.cosine(x, y),
    top_n: int = 100,
) -> list[str]:
    """Returns the file paths in df, sorted from most related to the query to least."""

    query_embedding_response = embedding_request(query)          # Get the embedding of the query.
    query_embedding = query_embedding_response.data[0].embedding

    strings_and_relatednesses = [
        (row["filepath"], relatedness_fn(query_embedding, row["embedding"]))
        for i, row in df.iterrows()
    ]

    # Sort the file paths by similarity to the query and return the top_n most relevant ones.
    strings_and_relatednesses.sort(key=lambda x: x[1], reverse=True)
    strings, relatednesses = zip(*strings_and_relatednesses)
    return strings[:top_n]
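
To make the ranking step concrete, here is a minimal sketch with toy two-dimensional vectors. It computes cosine similarity in pure Python (equivalent to 1 minus scipy's cosine distance) so it runs without any API call; the file names and embeddings are invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical library: file path -> toy 2-dimensional embedding
library = {
    "paper_a.pdf": [1.0, 0.0],
    "paper_b.pdf": [0.0, 1.0],
    "paper_c.pdf": [1.0, 1.0],
}
query_embedding = [0.9, 0.1]

# Sort file paths from most to least similar to the query
ranked = sorted(
    library,
    key=lambda path: cosine_similarity(query_embedding, library[path]),
    reverse=True,
)
print(ranked)  # ['paper_a.pdf', 'paper_c.pdf', 'paper_b.pdf']
```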

In [30]:
def read_pdf(filepath):
    """
    Takes a filepath to a PDF and returns a string of the PDF's contents.
    """
    # creating a pdf reader object
    reader = PdfReader(filepath)
    pdf_text = ""
    page_number = 0
    for page in reader.pages:    # Iterate over each page, extracting its text and appending the page number.
        page_number += 1
        pdf_text += page.extract_text() + f"\nPage Number: {page_number}"
    return pdf_text

# Split a text into smaller chunks of size n, preferably ending at the end of a sentence
def create_chunks(text, n, tokenizer):
    """
    Yields successive chunks of roughly n tokens from the provided text,
    using the given tokenizer.
    """
    tokens = tokenizer.encode(text)
    i = 0
    # Walk through the tokens, looking for sentence endings so the text is split at coherent points.
    while i < len(tokens):  

        # Find the nearest end of sentence within a range of 0.5 * n and 1.5 * n tokens
        j = min(i + int(1.5 * n), len(tokens))
        while j > i + int(0.5 * n):
            
            # Decode the tokens and check for full stop or newline
            chunk = tokenizer.decode(tokens[i:j])
            if chunk.endswith(".") or chunk.endswith("\n"):
                break
            j -= 1
        # If no end of sentence found, use n tokens as the chunk size
        if j == i + int(0.5 * n):
            j = min(i + n, len(tokens))
        yield tokens[i:j]
        i = j


def extract_chunk(content, template_prompt):
    """
    This function applies a prompt to some input content. In this case it returns a summarized chunk of text.
    """

    prompt = template_prompt + content
    response = openai.chat.completions.create(
        model=GPT_MODEL, 
        messages=[{"role": "user", "content": prompt}], 
        temperature=0
    )
    return response.choices[0].message.content #response["choices"][0]["message"]["content"]


def summarize_text(query):
    """
    This function summarizes the academic paper most relevant to the user's query:
        - Reads in the arxiv_library.csv file, including the embeddings
        - Finds the closest file to the user's query
        - Scrapes the text out of the file and chunks it
        - Summarizes each chunk in parallel using the OpenAI API
        - Does one final summary and returns this to the user
    """

    # A prompt to dictate how the recursive summarizations should approach the input paper
    summary_prompt = """Summarize this text from an academic paper. Extract any key points with reasoning.\n\nContent:"""

    # Read the CSV file (arxiv_library.csv) holding the details of previously downloaded papers
    # and their embeddings.
    library_df = pd.read_csv(paper_dir_filepath).reset_index()

    # If the file is empty, run an initial search to download papers.
    if len(library_df) == 0:           
        print("No papers searched yet, downloading first.")
        get_articles(query)
        print("Papers downloaded, continuing")
        library_df = pd.read_csv(paper_dir_filepath).reset_index()

    library_df.columns = ["title", "filepath", "embedding"]
    library_df["embedding"] = library_df["embedding"].apply(ast.literal_eval)
    strings = strings_ranked_by_relatedness(query, library_df, top_n=1)       # Find the paper most relevant to the query.
    print("Chunking text from paper")
    pdf_text = read_pdf(strings[0])                                           # Read the text of that paper.

    # Initialise tokenizer
    tokenizer = tiktoken.get_encoding("cl100k_base")
    results = ""

    # Chunk up the document into 1500 token chunks, a manageable size for processing
    chunks = create_chunks(pdf_text, 1500, tokenizer)
    text_chunks = [tokenizer.decode(chunk) for chunk in chunks]
    print("Summarizing each chunk of text")

    # Parallel process the summaries: a ThreadPoolExecutor runs extract_chunk
    # on every text chunk concurrently.
    with concurrent.futures.ThreadPoolExecutor(
        max_workers=len(text_chunks)
    ) as executor:
        futures = [
            executor.submit(extract_chunk, chunk, summary_prompt)
            for chunk in text_chunks
        ]
        with tqdm(total=len(text_chunks)) as pbar:
            for _ in concurrent.futures.as_completed(futures):
                pbar.update(1)
        for future in futures:
            data = future.result()
            results += data

    # Final summary: collate the chunk summaries and send them to the OpenAI API
    # to generate a final summary that answers the user's query.
    print("Summarizing into overall summary")
    response = openai.chat.completions.create(
        model=GPT_MODEL,
        messages=[
            {
                "role": "user",
                "content": f"""Write a summary collated from this collection of key points extracted from an academic paper.
                        The summary should highlight the core argument, conclusions and evidence, and answer the user's query.
                        User query: {query}
                        The summary should be structured in bulleted lists following the headings Core Argument, Evidence, and Conclusions.
                        Key points:\n{results}\nSummary:\n""",
            }
        ],
        temperature=0,
    )
    return response
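
To see how the chunking logic behaves without loading tiktoken, the sketch below runs the same algorithm against a toy tokenizer that treats every character as one token. CharTokenizer is a hypothetical stand-in introduced for this example; with a real encoding the chunk boundaries would fall at different places.

```python
class CharTokenizer:
    """Toy stand-in for a tiktoken encoding: one token per character."""

    def encode(self, text):
        return [ord(c) for c in text]

    def decode(self, tokens):
        return "".join(chr(t) for t in tokens)


def create_chunks(text, n, tokenizer):
    """Yield chunks of roughly n tokens, preferring to end on '.' or a newline."""
    tokens = tokenizer.encode(text)
    i = 0
    while i < len(tokens):
        # Search backwards from 1.5 * n tokens for a sentence boundary
        j = min(i + int(1.5 * n), len(tokens))
        while j > i + int(0.5 * n):
            chunk = tokenizer.decode(tokens[i:j])
            if chunk.endswith(".") or chunk.endswith("\n"):
                break
            j -= 1
        # No boundary found: fall back to exactly n tokens
        if j == i + int(0.5 * n):
            j = min(i + n, len(tokens))
        yield tokens[i:j]
        i = j


tokenizer = CharTokenizer()
text = "First sentence. Second one is here. Third."
chunks = [tokenizer.decode(c) for c in create_chunks(text, 16, tokenizer)]
for c in chunks:
    print(repr(c))
# 'First sentence.'
# ' Second one is here.'
# ' Third.'
```

Each chunk ends on a full stop, and joining the chunks reproduces the original text, so nothing is lost or duplicated between chunks.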

In [31]:
# Test the summarize_text function works
chat_test_response = summarize_text("PPO reinforcement learning sequence generation")

Chunking text from paper
Summarizing each chunk of text


100%|██████████| 4/4 [00:08<00:00,  2.19s/it]


Summarizing into overall summary
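
The two stages logged above (per-chunk summaries, then one overall summary) follow a map-then-combine pattern. The sketch below reproduces that pattern offline: summarize_chunk is a hypothetical stub that keeps the first sentence instead of calling a model, and the chunk texts are placeholders.

```python
import concurrent.futures


def summarize_chunk(chunk):
    """Stub standing in for a model call: keep only the first sentence."""
    return chunk.split(".")[0].strip() + "."


text_chunks = [
    "Chunk one main point. Extra detail follows.",
    "Chunk two main point. More detail.",
    "Chunk three main point. Further notes.",
]

# Map: summarize every chunk in parallel threads
with concurrent.futures.ThreadPoolExecutor(max_workers=len(text_chunks)) as executor:
    futures = [executor.submit(summarize_chunk, chunk) for chunk in text_chunks]
    # Reading from the futures list (rather than as_completed) keeps document order
    partial_summaries = [future.result() for future in futures]

# Combine: join the partial summaries into the input for one final summary call
combined = " ".join(partial_summaries)
print(combined)
```

Threads suit this workload because each real call spends most of its time waiting on the network rather than using the CPU.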


In [34]:
print(chat_test_response.choices[0].message.content) # ["choices"][0]["message"]["content"])

Core Argument:
The paper discusses the use of Proximal Policy Optimization (PPO) in sequence generation tasks, specifically in the context of chit-chat chatbots. The authors argue that PPO is a more efficient reinforcement learning algorithm compared to policy gradient, which is commonly used in these tasks. They propose a dynamic version of PPO (PPO-dynamic) and demonstrate its efficacy in synthetic experiments and chit-chat chatbot tasks.

Evidence:
- PPO-dynamic achieves high precision scores in a synthetic counting task, comparable to other algorithms such as REINFORCE and MIXER.
- PPO-dynamic shows faster progress in learning compared to PPO in the synthetic counting task.
- In the chit-chat chatbot task, PPO-dynamic achieves a slightly higher BLEU-2 score than REINFORCE and PPO.
- The learning curves of PPO and PPO-dynamic are more stable than policy gradient, and PPO-dynamic converges faster.

Conclusions:
- PPO is a better optimization method for sequence learning compared to p

## Configure Agent

We'll create our agent in this step, including a Conversation class to support multiple turns with the API, and some Python functions to enable interaction between the <span style="color: rgb(0,255,0);">***Chat Completions API***</span> and our knowledge base functions.

In [35]:
@retry(wait=wait_random_exponential(min=1, max=40), stop=stop_after_attempt(3))
def chat_completion_request(messages, functions=None, model=GPT_MODEL):
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + openai.api_key,
    }
    json_data = {"model": model, "messages": messages}
    if functions is not None:
        json_data.update({"functions": functions})
    try:
        response = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers=headers,
            json=json_data,
        )
        return response
    except Exception as e:
        print("Unable to generate ChatCompletion response")
        print(f"Exception: {e}")
        raise  # Re-raise so @retry can try again and callers never receive an exception object


In [36]:
class Conversation:
    def __init__(self):
        self.conversation_history = []

    def add_message(self, role, content):
        message = {"role": role, "content": content}
        self.conversation_history.append(message)

    def display_conversation(self, detailed=False):
        role_to_color = {
            "system": "red",
            "user": "green",
            "assistant": "blue",
            "function": "magenta",
        }
        for message in self.conversation_history:
            print(
                colored(
                    f"{message['role']}: {message['content']}\n\n",
                    role_to_color[message["role"]],
                )
            )

In [37]:
# Initiate our get_articles and read_article_and_summarize functions
arxiv_functions = [
    {
        "name": "get_articles",
        "description": """Use this function to get academic papers from arXiv to answer user questions.""",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": f"""
                            User query in JSON. Responses should be summarized and should include the article URL reference
                            """,
                }
            },
            "required": ["query"],
        },
    },
    {
        "name": "read_article_and_summarize",
        "description": """Use this function to read whole papers and provide a summary for users.
        You should NEVER call this function before get_articles has been called in the conversation.""",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": f"""
                            Description of the article in plain text based on the user's query
                            """,
                }
            },
            "required": ["query"],
        },
    }
]

In [38]:
def chat_completion_with_function_execution(messages, functions=None):
    """This function makes a ChatCompletion API call with the option of adding functions"""
    response = chat_completion_request(messages, functions)
    full_message = response.json()["choices"][0]
    if full_message["finish_reason"] == "function_call":
        print(f"Function generation requested, calling function")
        return call_arxiv_function(messages, full_message)
    else:
        print(f"Function not required, responding to user")
        return response.json()


def call_arxiv_function(messages, full_message):
    """Function calling function which executes function calls when the model believes it is necessary.
    Currently extended by adding clauses to this if statement."""

    if full_message["message"]["function_call"]["name"] == "get_articles":
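        parsed_output = None
        try:
            parsed_output = json.loads(
                full_message["message"]["function_call"]["arguments"]
            )
            print("Getting search results")
            results = get_articles(parsed_output["query"])
        except Exception as e:
            print(parsed_output)
            print("Function execution failed")
            print(f"Error message: {e}")
            # Report the failure to the model instead of leaving results undefined below
            results = f"Function execution failed: {e}"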
        messages.append(
            {
                "role": "function",
                "name": full_message["message"]["function_call"]["name"],
                "content": str(results),
            }
        )
        try:
            print("Got search results, summarizing content")
            response = chat_completion_request(messages)
            return response.json()
        except Exception as e:
            print(type(e))
            raise Exception("Function chat request failed")

    elif (
        full_message["message"]["function_call"]["name"] == "read_article_and_summarize"
    ):
        parsed_output = json.loads(
            full_message["message"]["function_call"]["arguments"]
        )
        print("Finding and reading paper")
        summary = summarize_text(parsed_output["query"])
        return summary

    else:
        raise Exception("Function does not exist and cannot be called")
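
As the docstring notes, the if/elif chain grows by one clause per function. One possible alternative, sketched here under assumptions, is a lookup table from function name to callable. FUNCTION_REGISTRY, dispatch, and the two stub functions are hypothetical names for this illustration; in the notebook the table would point at get_articles and summarize_text.

```python
import json


def get_articles_stub(query):
    """Hypothetical stand-in for get_articles."""
    return f"search results for {query!r}"


def summarize_stub(query):
    """Hypothetical stand-in for summarize_text."""
    return f"summary for {query!r}"


# Map the function names advertised to the model onto Python callables
FUNCTION_REGISTRY = {
    "get_articles": get_articles_stub,
    "read_article_and_summarize": summarize_stub,
}


def dispatch(function_call):
    """Look up the requested function and call it with the parsed JSON arguments."""
    name = function_call["name"]
    if name not in FUNCTION_REGISTRY:
        raise ValueError(f"Function {name!r} does not exist and cannot be called")
    arguments = json.loads(function_call["arguments"])
    return FUNCTION_REGISTRY[name](**arguments)


# A function_call payload shaped like the one read in call_arxiv_function
example_call = {"name": "get_articles", "arguments": '{"query": "ppo"}'}
print(dispatch(example_call))  # search results for 'ppo'
```

With a table, adding a function means adding one entry rather than a new branch, although any per-function post-processing (such as the follow-up chat request after get_articles) still needs its own handling.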

## arXiv conversation

Let's put this all together by testing our functions out in conversation.

In [39]:
# Start with a system message
paper_system_message = """You are arXivGPT, a helpful assistant that pulls academic papers to answer user questions.
You summarize the papers clearly so the customer can decide which to read to answer their question.
You always provide the article_url and title so the user can understand the name of the paper and click through to access it.
Begin!"""
paper_conversation = Conversation()
paper_conversation.add_message("system", paper_system_message)


In [40]:
# Add a user message
paper_conversation.add_message("user", "Hi, how does PPO reinforcement learning work?")
chat_response = chat_completion_with_function_execution(
    paper_conversation.conversation_history, functions=arxiv_functions
)
assistant_message = chat_response["choices"][0]["message"]["content"]
paper_conversation.add_message("assistant", assistant_message)
display(Markdown(assistant_message))


Function generation requested, calling function
Getting search results


  for result in search.results():


Got search results, summarizing content


PPO (Proximal Policy Optimization) is an efficient reinforcement learning algorithm that has gained popularity in recent years. Here are a few papers that discuss PPO and its variants:

1. Title: "Proximal Policy Optimization and its Dynamic Version for Sequence Generation"
   - Summary: This paper explores the use of PPO for sequence generation tasks, such as chatbot development. The authors propose a dynamic approach called PPO-dynamic and show that PPO outperforms policy gradient in terms of stability and performance.
   - Article URL: [arXiv:1808.07982v1](http://arxiv.org/abs/1808.07982v1)

2. Title: "CIM-PPO: Proximal Policy Optimization with Liu-Correntropy Induced Metric"
   - Summary: This article introduces a variation of PPO called CIM-PPO, which incorporates the theory of correntropy (a symmetry metric method) into PPO. The authors compare CIM-PPO with KL-PPO and Clip-PPO and conduct experiments on OpenAI gym to demonstrate the effectiveness of the new algorithm.
   - Article URL: [arXiv:2110.10522v2](http://arxiv.org/abs/2110.10522v2)

3. Title: "A2C is a special case of PPO"
   - Summary: This paper presents theoretical justifications and pseudocode analysis to show that Advantage Actor-critic (A2C) is a special case of PPO. The authors provide empirical evidence using Stable-baselines3 to demonstrate that A2C and PPO produce the exact same models under controlled settings.
   - Article URL: [arXiv:2205.09123v1](http://arxiv.org/abs/2205.09123v1)

4. Title: "Proximal Policy Optimization via Enhanced Exploration Efficiency"
   - Summary: This paper focuses on the exploration ability of PPO and proposes an exploration enhancement mechanism based on uncertainty estimation called IEM-PPO. The authors evaluate the method on MuJoCo physical simulator tasks and compare it with curiosity-driven exploration algorithms and the original PPO. The results show improved sample efficiency and cumulative reward with stability and robustness.
   - Article URL: [arXiv:2011.05525v1](http://arxiv.org/abs/2011.05525v1)

5. Title: "Reward Scale Robustness for Proximal Policy Optimization via DreamerV3 Tricks"
   - Summary: This work applies the tricks introduced in DreamerV3 (a model-based approach) to PPO. The study investigates the generality of these tricks and their applicability to PPO. The experiments conducted on the Arcade Learning Environment and the DeepMind Control Suite show that the tricks do not generally improve PPO's performance but highlight specific cases where they succeed.
   - Article URL: [arXiv:2310.17805v1](http://arxiv.org/abs/2310.17805v1)

You can access the full papers by clicking on the article URLs provided.

In [42]:
# Add another user message to induce our system to use the second tool
paper_conversation.add_message(
    "user",
    "Can you read the PPO sequence generation paper for me and give me a summary",
)
updated_response = chat_completion_with_function_execution(
    paper_conversation.conversation_history, functions=arxiv_functions
)
display(Markdown(updated_response.choices[0].message.content))  #["choices"][0]["message"]["content"]))


Function generation requested, calling function
Finding and reading paper
Chunking text from paper
Summarizing each chunk of text


100%|██████████| 4/4 [00:09<00:00,  2.27s/it]


Summarizing into overall summary


Core Argument:
- The paper discusses the use of proximal policy optimization (PPO) in sequence generation tasks, specifically in the context of chit-chat chatbots.
- The authors argue that PPO is a more efficient reinforcement learning algorithm compared to policy gradient, which is commonly used in these tasks.
- They propose a dynamic approach for PPO (PPO-dynamic) and demonstrate its efficacy in synthetic experiments and chit-chat chatbot tasks.

Evidence:
- PPO-dynamic achieves a high precision score in a synthetic counting task, comparable to other algorithms like REINFORCE and MIXER.
- In the chit-chat chatbot task, PPO-dynamic achieves a slightly higher BLEU-2 score than REINFORCE and PPO.
- The learning curves of PPO and PPO-dynamic are more stable than policy gradient, and PPO-dynamic converges faster.

Conclusions:
- PPO is a better optimization method for sequence learning compared to policy gradient.
- PPO-dynamic further improves the optimization process by dynamically adjusting the hyperparameters.
- PPO can be used as a new optimization method for GAN-based sequence learning for better performance.