# Tailored Abstractive Summary
A "summary" is a shortened restatement of the information in the target content. Ideally, it preserves that information in fewer words.

An "abstractive" summary is a shorter restatement of the information from content without any requirement to reuse the same words or phrasing. This is often implemented using a generative approach. It is usually contrasted with an "extractive" summary, which reduces the word count by selecting words and phrases from the original content and removing everything else.
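To make the contrast concrete, here is a minimal sketch of an extractive summarizer. It is not part of OpenTLDR: the function name `extractive_summary` and the word-frequency scoring are assumptions chosen for illustration. The point is that every sentence in the output is copied verbatim from the input, whereas an abstractive summary is free to rephrase.

```python
import re
from collections import Counter

def extractive_summary(text: str, max_sentences: int = 1) -> str:
    """Keep the highest-scoring sentences verbatim, in their original order.

    Hypothetical helper for illustration only; the scoring rule (average
    word frequency) is a simple stand-in, not a recommended method.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    # Rank sentence indices by score, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)

text = "The cat sat. The cat ran fast. Dogs bark."
print(extractive_summary(text, max_sentences=1))
```

Because the output only ever reuses existing sentences, it cannot merge or reword ideas. That limitation is what the generative, abstractive approach in this notebook avoids.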

A "tailored" summary is a summary that prioritizes specific information during summarization. In this case, we attempt to focus on the information requested by the user (in the User's Request statement). We also augment the abstractive summary process with curated reference knowledge that connects the request to the content and that justified the recommendation in the first place. Tailoring is often implemented like RAG, by augmenting the prompt sent to a large language model with information we expect to be useful. Unlike traditional RAG, tailoring attempts to direct the generation rather than add additional information.

The result of this step includes:
- Summary nodes, connected to Content nodes with a SUMMARIZES relationship and to Recommendation nodes with a FOCUS_ON relationship

## Setup

In [None]:
import os
import logging

## Parameters
OpenTLDR workflows use the notebook block tagged as "parameters" to inject variables (for example to change the LLM model).

> **Do Not Change Variable Names in the Parameters Block.** You are welcome to change the values of these parameter variables, but please do not change their names. They are used elsewhere in the notebook and in other workflow processes.

In [None]:
#Parameters

# When running an LLM locally, you need to download the model to your local machine
llm_config = {'type': 'GPT4ALL', 'device':'gpu', 'model':'../LLM_Models/mistral-7b-openorca.gguf2.Q4_0.gguf'}
#llm_config = {'type': 'Ollama', 'device':'local', 'model':'mistral:latest'}

llm_prompt = '''
    Given these facts: {knowledge}
    Concisely summarize this content: {content}
    While focusing on answering this: {request}
    '''

# Logging level ranges are (from least to most verbose): ERROR, WARN, INFO, DEBUG
logging_level= logging.INFO

# List of the UniqueIds of Requests to summarize (None means all Requests)
list_of_uids = None

# Whether to print additional progress and diagnostic output
verbose = True

In [None]:
logging.getLogger("OpenTLDR").setLevel(logging_level)

from opentldr import KnowledgeGraph
kg=KnowledgeGraph()

import opentldr.Domain as domain

### Load Content

In [None]:
if list_of_uids is None:
    # default to getting all Requests
    list_of_uids = kg.get_all_node_uids_by_tag("Request")

if verbose:
    print ("Found {} Request nodes to summarize.".format(len(list_of_uids)))


## Run an LLM Model
This cell sets up access to a (usually locally running) LLM based on the llm_config parameter.

Ollama: runs locally with the Ollama service
- You need to start the Ollama server (ollama serve)
- It will attempt to pull models based on config

GPT4ALL: runs locally with a .gguf formatted model.
- When you run an LLM locally using GPT4ALL, you need to download a model file to your local machine.
- Model files are large and not part of the git repository.
- You can download them from https://gpt4all.io under "Model Explorer" and put them in a "models" folder. The 'model' path in llm_config must point to the downloaded file.

All:
- Be sure to check the license for the model before using.


In [None]:
from SummarizeWithGPT4All import SummarizeWithGPT4All
from SummarizeWithLocalOllama import SummarizeWithLocalOllama

llm = None

match (llm_config['type'].lower()):
    case "gpt4all": 
        llm = SummarizeWithGPT4All(llm_config['model'],device=llm_config['device'], logging_level=logging_level)

    case "ollama":
        # TODO config for local and remote ollama services
        llm = SummarizeWithLocalOllama(model_name=llm_config['model'], logging_level=logging_level)
    case _:
        raise ValueError("No LLM type support for {}.".format(llm_config['type']))
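Note that the `match` statement above requires Python 3.10 or later. As a hedged alternative for older interpreters, the same dispatch can be sketched with a dictionary of factory functions. The name `make_llm` and the stub factories below are hypothetical stand-ins, not the `SummarizeWithGPT4All` or `SummarizeWithLocalOllama` classes themselves.

```python
def make_llm(config: dict, factories: dict):
    """Look up a factory by the lower-cased config type and call it.

    Hypothetical helper: 'factories' maps a type name to a callable that
    accepts the config dict and returns a summarizer object.
    """
    kind = config["type"].lower()
    try:
        factory = factories[kind]
    except KeyError:
        raise ValueError("No LLM type support for {}.".format(config["type"])) from None
    return factory(config)

# Stub factories for demonstration; a real notebook would construct the
# project's summarizer classes here instead of returning tuples.
demo_factories = {
    "gpt4all": lambda cfg: ("gpt4all", cfg["model"], cfg.get("device")),
    "ollama": lambda cfg: ("ollama", cfg["model"]),
}

print(make_llm({"type": "Ollama", "model": "mistral:latest"}, demo_factories))
```

Adding support for another backend then only means adding one more entry to the dictionary.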

## Compute the Shortest Path for each Recommendation between the Source Article and Query (excluding the recommendation itself)
- TODO: processing the path could be much more interesting than it is now by doing more with other nodes/edges

In [None]:
def explain(something):
    if hasattr(something, 'to_text') and callable(something.to_text):
        return something.to_text()
    
    return ""
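The following sketch shows how `explain` behaves on a mixed path. The classes `Describable` and `Opaque` are hypothetical stand-ins for graph nodes and edges, assumed here only to exercise both branches. Objects without a callable `to_text` contribute an empty string, so joining only the non-empty pieces avoids stray spaces in the resulting knowledge text.

```python
def explain(something):
    # Same duck-typing check as the notebook cell above.
    if hasattr(something, "to_text") and callable(something.to_text):
        return something.to_text()
    return ""

class Describable:
    """Hypothetical path element that can describe itself."""
    def __init__(self, label):
        self.label = label

    def to_text(self):
        return "{} is relevant.".format(self.label)

class Opaque:
    """Hypothetical path element with no to_text method."""

path = [Describable("Entity A"), Opaque(), Describable("Entity B")]

# Collect only the hops that produced text, then join with single spaces.
pieces = [explain(hop) for hop in path]
path_text = " ".join(piece for piece in pieces if piece)
print(path_text)
```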

## Build the prompts and run the LLM


In [None]:
for request_uid in list_of_uids:
    request = kg.get_request_by_uid(request_uid)

    if request is None:
        print("No Request found for uid: {}".format(request_uid))
        # Skip this uid; otherwise request.text below would fail on None
        continue

    print("Request ({request}):".format(request=request.text))
    recommendations = kg.get_recommendations_by_request(request=request)
    
    for recommendation in recommendations:
        for content in recommendation.recommends:
            print ("Recommended {title} ({score}): {url}".format(title=content.title, score=round(recommendation.score,3),url=content.url))
            path_text=""
            path=kg.shortest_path(request,content)

            if path is not None:
                for hop in path:
                    path_text+=explain(hop)+" "
            
            original= content.text
            if verbose:
                print("\tOriginal Content:\t{text}".format(text=original))
                print("\tPath Text:\t{text}".format(text=path_text))

            prompt_text = llm_prompt.format(knowledge=path_text, content=content.text, request=request.text )
            summary = llm.summarize(prompt_text)
            kg.add_summary(text=summary,content=content,recommendation=recommendation)

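            # Guard against empty content to avoid dividing by zero
            reduction = round(((len(original)-len(summary))/len(original))*100,1) if original else 0.0
            print("Summary reduced content length by {reduction}%:\t{text}\n".format(reduction=reduction,text=summary))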
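The reduction figure reported by the loop is the relative decrease in character count, `(len(original) - len(summary)) / len(original) * 100`. A small sketch of that arithmetic as a standalone function follows. The name `percent_reduction` is an assumption, and returning 0.0 for empty content is one possible convention rather than the notebook's defined behavior.

```python
def percent_reduction(original: str, summary: str) -> float:
    """Percentage by which the summary is shorter than the original.

    Measured in characters, matching the loop above. A negative value
    means the summary came out longer than the original.
    """
    if not original:
        # Assumed convention: empty content has nothing to reduce.
        return 0.0
    return round((len(original) - len(summary)) / len(original) * 100, 1)

print(percent_reduction("a" * 200, "a" * 50))
```

Since the measure counts characters rather than words or tokens, it is only a rough indicator of how much the content was condensed.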

In [None]:
kg.close()