![JohnSnowLabs](https://sparknlp.org/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/llama.cpp/GGUFRankingFinisher_for_AutoGGUFReranker.ipynb)

# GGUFRankingFinisher for AutoGGUFReranker

This notebook shows how to use the `GGUFRankingFinisher` to post-process the relevance scores produced by `AutoGGUFReranker`.

Let's keep in mind a few things before we start 😊

- `AutoGGUFReranker` was introduced in `Spark NLP 6.1.2`, enabling efficient and quantized reranking of documents with LLMs. Please make sure you have upgraded to the latest Spark NLP release.
- `GGUFRankingFinisher` was introduced in `Spark NLP 6.1.3` to post-process the document rankings.

`GGUFRankingFinisher` is a finisher for `AutoGGUFReranker` outputs that provides ranking capabilities,
including top-k selection, sorting by relevance score, and score normalization.

This finisher processes the output of `AutoGGUFReranker`, which contains documents with
relevance scores in their metadata. It provides several options for post-processing:

- Top-k selection: Select only the top k documents by relevance score
- Score thresholding: Filter documents by minimum relevance score
- Min-max scaling: Normalize relevance scores to 0-1 range
- Sorting: Sort documents by relevance score in descending order
- Ranking: Add rank information to document metadata

The finisher preserves the document annotation structure while adding ranking information
to the metadata and optionally filtering/sorting the documents.
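
For reference, min-max scaling is conventionally defined as shown below. This is the standard textbook formula, given as a sketch of what the option does rather than a statement of the finisher's exact implementation. The symbols are introduced here for illustration: $s_i$ is the raw relevance score of document $i$, and the minimum and maximum are taken over the scored documents.

$$
s_i' = \frac{s_i - \min_j s_j}{\max_j s_j - \min_j s_j}
$$

Under this definition the highest-scoring document maps to 1 and the lowest to 0, provided at least two distinct scores are present.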

## Spark NLP Setup

In [None]:
# Only execute this if you are on Google Colab
! wget -q http://setup.johnsnowlabs.com/colab.sh -O - | bash

In [None]:
import sparknlp

# Start Spark with Spark NLP and GPU support enabled. If no GPU is available, remove the `gpu=True` parameter.
spark = sparknlp.start(gpu=True)
print(sparknlp.version())

6.1.3


## Producing Document Rankings

Let's start by producing some document rankings. We first define a suitable pipeline and run it on some data. The relevance scores are then stored in the metadata of each reranked document.

In [None]:
import sparknlp
from sparknlp.base import *
from sparknlp.annotator import *
from pyspark.ml import Pipeline

document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("document")

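# Load a GGUF reranker from a local file; replace the path below with the location of your own model.
auto_gguf_model = (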
    AutoGGUFReranker.loadSavedModel(
        "/home/ducha/Workspace/scala/spark-nlp-release/tmp_autogguf_reranker/bge-reranker-v2-m3-q4_k_m.gguf",
        spark,
    )
    .setInputCols("document")
    .setOutputCol("reranked_documents")
    .setQuery("A man is eating pasta.")
    .setDisableLog(True)
)

pipeline = Pipeline().setStages([document_assembler, auto_gguf_model])

data = spark.createDataFrame(
    [
        ["A man is eating food."],
        ["A man is eating a piece of bread."],
        ["The girl is carrying a baby."],
        ["A man is riding a horse."],
        ["A young girl is playing violin."],
    ]
).toDF("text")

result = pipeline.fit(data).transform(data)


# Verify results contain relevance scores
result.selectExpr("explode(reranked_documents) as reranked_document").selectExpr(
    "reranked_document.result", "reranked_document.metadata['relevance_score']"
).show(truncate=False)

Extracted 'libjllama.so' to '/tmp/libjllama.so'


ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no                                      
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3070, compute capability 8.6, VMM: yes

+---------------------------------+-------------------------------------------+
|result                           |reranked_document.metadata[relevance_score]|
+---------------------------------+-------------------------------------------+
|A man is eating food.            |7.023443                                   |
|A man is eating a piece of bread.|2.1200795                                  |
|The girl is carrying a baby.     |-10.790537                                 |
|A man is riding a horse.         |-8.433026                                  |
|A young girl is playing violin.  |-10.778883                                 |
+---------------------------------+-------------------------------------------+



                                                                                

## Post-Processing Rankings with `GGUFRankingFinisher`

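Let's now use the `GGUFRankingFinisher` to post-process and sort our results. With the configuration used here, the finisher will

1. sort the documents by relevance score in descending order
2. keep only the top 3 results
3. scale the relevance scores to the range $[0, 1]$ (in the output shown further down, the scaled value is reported under `relevance_score` in the metadata)
4. drop documents whose rescaled score falls below the minimum relevance score of 0.3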


In [None]:
from sparknlp.base import *

finisher = (
    GGUFRankingFinisher()
    .setInputCols("reranked_documents")
    .setOutputCol("finished_reranked_documents")
    .setTopK(3)
    .setMinRelevanceScore(0.3)
    .setMinMaxScaling(True)
)
finisher_result = finisher.transform(result)

In [None]:
finisher_result.selectExpr(
    "explode(finished_reranked_documents) as finished_reranked_documents"
).show(truncate=False)

+------------------------------------------------------------------------------------------------------------------------------------------------------------+
|finished_reranked_documents                                                                                                                                 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
|{document, 0, 20, A man is eating food., {sentence -> 0, query -> A man is eating pasta., relevance_score -> 1.0000162790005285, rank -> 1}, []}            |
|{document, 0, 32, A man is eating a piece of bread., {sentence -> 0, query -> A man is eating pasta., relevance_score -> 0.7246697769085113, rank -> 2}, []}|
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
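
To see where these numbers come from, the post-processing steps can be sketched in plain Python. This is a minimal illustration, not the `GGUFRankingFinisher` implementation: the function name `post_process` and its parameters are hypothetical, the raw scores are copied from the reranker output table earlier in this notebook, and the order of operations (scale, then threshold, then sort and truncate) is an assumption.

```python
# Illustrative sketch only; this is NOT how GGUFRankingFinisher is implemented.
# Raw relevance scores copied from the reranker output table in this notebook.
raw_scores = {
    "A man is eating food.": 7.023443,
    "A man is eating a piece of bread.": 2.1200795,
    "The girl is carrying a baby.": -10.790537,
    "A man is riding a horse.": -8.433026,
    "A young girl is playing violin.": -10.778883,
}


def post_process(scores, top_k, min_score):
    """Scale to [0, 1], drop low scores, sort descending, keep top_k, add ranks."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    # Min-max scaling; if all scores are equal, map every document to 1.0
    # (an arbitrary choice made for this sketch).
    scaled = {
        text: (score - lo) / span if span > 0 else 1.0
        for text, score in scores.items()
    }
    # Threshold on the scaled score (assumed order: scale first, then filter).
    kept = [(text, score) for text, score in scaled.items() if score >= min_score]
    # Sort by scaled score, highest first, then keep at most top_k documents.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [
        {"rank": rank, "result": text, "relevance_score": score}
        for rank, (text, score) in enumerate(kept[:top_k], start=1)
    ]


ranked = post_process(raw_scores, top_k=3, min_score=0.3)
for row in ranked:
    print(row["rank"], round(row["relevance_score"], 4), row["result"])
```

On the notebook's scores, this sketch keeps the same two documents as the finisher output above: the third-best document scales to roughly 0.13 and is removed by the 0.3 threshold, which is why only two rows remain even though the top-k limit is 3. The scaled values agree with the finisher output to about three decimal places; the sketch does not attempt to reproduce the finisher's exact arithmetic.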

