# Function Score

In this notebook, we'll explore how to use function score directives to customize document scoring. Function score directives allow you to modify the relevance scores of documents based on various factors, such as field values, scripts, or random functions.

## Introduction to Function Score

Elasticsearch's function score feature recomputes the relevance score of each document matched by a base query, combining the original score with the output of one or more scoring functions. This is particularly useful for:

1. **Boosting recent content**: Giving higher scores to more recent documents
2. **Popularity boosting**: Boosting documents based on view counts, likes, or other popularity metrics
3. **Personalization**: Customizing scores based on user preferences or behavior
4. **Geographical relevance**: Boosting documents based on proximity to a location
5. **Business rules**: Implementing custom business logic in scoring

The Elasticsearch Query Toolkit provides several function score directives to help you implement these scoring strategies.

## Setup

Let's import the necessary modules:

In [None]:
import json
from elastictoolkit.queryutils.builder.functionscoreengine import FunctionScoreEngine
from elastictoolkit.queryutils.builder.scorefunctiondirective import (
    ScriptScoreDirective,
    RandomScoreDirective,
    FieldValueFactorDirective,
    DecayFunctionDirective,
    WeightDirective
)
from elastictoolkit.queryutils.builder.matchdirective import ConstMatchDirective
from elastictoolkit.queryutils.builder.directivevaluemapper import DirectiveValueMapper
from elastictoolkit.queryutils.types import FieldValue
from elastictoolkit.queryutils.consts import FieldMatchType, ScoreNullFilterAction
from elasticquerydsl.filter import MatchAllQuery

# Helper function to print queries as formatted JSON
def print_query(directive):
    query = directive.to_dsl()
    print(json.dumps(query.to_query(), indent=2))

## Basic Function Score

Let's start with a simple example of using function score to boost documents based on a field value:

In [None]:
# Define a value mapper for our product search
class ProductValueMapper(DirectiveValueMapper):
    category = FieldValue(
        fields=["category"],
        values_list=["match_params.category"]
    )
    popularity = FieldValue(
        fields=["popularity"],
        values_list=["match_params.min_popularity"]
    )

# Create a function score engine
class ProductScoreEngine(FunctionScoreEngine):
    # Boost documents based on popularity
    popularity = FieldValueFactorDirective(
        field="popularity_score",
        factor=1.2,
        modifier="log1p",
        missing=1.0,
        weight=2.0
    )
    
    # Apply a filter to only boost certain categories
    category = ScriptScoreDirective(
        script="doc['boost_factor'].value",
        filter=ConstMatchDirective(rule=FieldMatchType.ANY),
        weight=1.5
    )
    
    class Config:
        score_mode = "sum"  # Sum the scores from all functions
        boost_mode = "multiply"  # Multiply the combined function score with the query score
        value_mapper = ProductValueMapper()

# Create an engine instance with match parameters
engine = ProductScoreEngine().set_match_params(
    {
        "category": "electronics",
        "min_popularity": 4
    }
)

# Set the base query
engine.set_match_dsl(MatchAllQuery())

# Generate the query
print_query(engine)

In this example, we've created a function score engine that boosts documents based on two factors:

1. **Popularity Boost**: Uses the `FieldValueFactorDirective` to boost documents based on their popularity score
2. **Category Boost**: Uses the `ScriptScoreDirective` with a filter to boost documents in specific categories

The `score_mode` is set to "sum", which means the scores from all functions are added together. The `boost_mode` is set to "multiply", which means the combined function score is multiplied with the original query score.
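To make the two combination settings concrete, here is a minimal pure-Python sketch of the arithmetic Elasticsearch documents for `field_value_factor`, `score_mode`, and `boost_mode`. It does not call the toolkit. The helper names `field_value_factor` and `combine_scores` and the sample document values are assumptions made up for illustration.

```python
import math


# Hypothetical helpers mirroring Elasticsearch's documented scoring arithmetic.
def field_value_factor(value, factor=1.0, modifier="none", missing=None):
    """Score contribution of one field: modifier(factor * value)."""
    if value is None:
        value = missing  # factor and modifier still apply to the fallback value
    scaled = factor * value
    modifiers = {
        "none": lambda x: x,
        "log1p": lambda x: math.log10(1 + x),  # common (base-10) logarithm
        "ln1p": math.log1p,                    # natural logarithm
        "sqrt": math.sqrt,
    }
    return modifiers[modifier](scaled)


def combine_scores(query_score, weighted_scores, score_mode="sum", boost_mode="multiply"):
    """Combine the per-function scores, then merge the result with the query score."""
    if score_mode == "sum":
        function_score = sum(weighted_scores)
    elif score_mode == "multiply":
        function_score = math.prod(weighted_scores)
    else:
        raise ValueError(f"score_mode not covered by this sketch: {score_mode}")

    if boost_mode == "multiply":
        return query_score * function_score
    if boost_mode == "sum":
        return query_score + function_score
    raise ValueError(f"boost_mode not covered by this sketch: {boost_mode}")


# Made-up document: popularity_score=9, boost_factor=2.0; match_all scores 1.0.
popularity = 2.0 * field_value_factor(9, factor=1.2, modifier="log1p", missing=1.0)
category = 1.5 * 2.0  # weight * script result
final_score = combine_scores(1.0, [popularity, category])
print(round(final_score, 3))
```

Each function's output is multiplied by its `weight` before `score_mode` combines them, which is why the weights appear as plain multipliers above.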

## Boosting Recent Content

A common use case for function score is boosting recent content. Let's create an engine that boosts recent blog posts:

In [None]:
# Define a value mapper for our blog search
class BlogValueMapper(DirectiveValueMapper):
    category = FieldValue(
        fields=["category"],
        values_list=["match_params.category"]
    )
    author = FieldValue(
        fields=["author"],
        values_list=["match_params.author"]
    )

# Create a function score engine for blog posts
class BlogScoreEngine(FunctionScoreEngine):
    # Boost recent posts using a decay function
    recency = DecayFunctionDirective(
        field="published_date",
        origin="now",
        scale="30d",  # 30 days
        decay=0.5,    # Score will be halved after 30 days
        decay_type="exp",  # Exponential decay
        weight=2.0
    )
    
    # Boost popular posts
    popularity = FieldValueFactorDirective(
        field="view_count",
        factor=0.1,
        modifier="log1p",
        missing=1.0,
        weight=1.0
    )
    
    # Boost posts by featured authors.
    # For `filter` to work, the attribute name (`author`) must match the key defined in `BlogValueMapper`.
    author = WeightDirective(
        filter=ConstMatchDirective(rule=FieldMatchType.ANY),
        weight=1.5
    )
    
    class Config:
        score_mode = "multiply"  # Multiply the scores from all functions
        boost_mode = "sum"       # Add the combined function score to the query score
        max_boost = 3.0          # Cap the boost at 3.0
        value_mapper = BlogValueMapper()

# Create an engine instance with match parameters
engine = BlogScoreEngine().set_match_params(
    {
        "category": "technology",
        "author": "featured_author"
    }
)

# Set the base query
engine.set_match_dsl(MatchAllQuery())

# Generate the query
print_query(engine)

In this example, we've created a function score engine for a blog search that boosts posts based on three factors:

1. **Recency**: Uses the `DecayFunctionDirective` to boost recent posts, with the score decaying exponentially over time
2. **Popularity**: Uses the `FieldValueFactorDirective` to boost posts based on view count
3. **Featured Authors**: Uses the `WeightDirective` with a filter to boost posts by featured authors

The `score_mode` is set to "multiply", which means the scores from all functions are multiplied together. The `boost_mode` is set to "sum", which means the combined function score is added to the original query score. The `max_boost` is set to 3.0, which caps the combined function score at 3.0 before it is added to the query score.
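The exponential decay used for recency can be sketched in a few lines of plain Python. This follows the formula in the Elasticsearch documentation rather than anything in the toolkit. The function names `exp_decay` and `boosted_score` are assumptions introduced for this illustration.

```python
import math


def exp_decay(distance, scale, decay=0.5, offset=0.0):
    """Exponential decay: returns `decay` when distance == offset + scale."""
    rate = math.log(decay) / scale  # negative for 0 < decay < 1
    return math.exp(rate * max(0.0, distance - offset))


def boosted_score(query_score, function_score, max_boost):
    """boost_mode="sum", with the function score capped at max_boost first."""
    return query_score + min(function_score, max_boost)


# Post age in days against scale="30d": the score halves every 30 days.
for age_days in (0, 30, 60, 90):
    print(age_days, round(exp_decay(age_days, scale=30), 3))

# A combined function score of 4.2 is capped at max_boost=3.0 before the sum.
print(boosted_score(1.0, 4.2, max_boost=3.0))
```

Because the decay value never exceeds 1.0, the `weight` on the directive is what lets a fresh post contribute more than 1.0 to the combined score.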

## Geographical Relevance

Another common use case for function score is boosting documents based on geographical proximity. Let's create an engine for a restaurant search that boosts nearby restaurants:

In [None]:
# Define a value mapper for our restaurant search
class RestaurantValueMapper(DirectiveValueMapper):
    cuisine = FieldValue(
        fields=["cuisine"],
        values_list=["match_params.cuisine"]
    )
    price_range = FieldValue(
        fields=["price_range"],
        values_list=["match_params.price_range"]
    )

# Create a function score engine for restaurant search
class RestaurantScoreEngine(FunctionScoreEngine):
    # Boost nearby restaurants
    proximity = DecayFunctionDirective(
        field="location",
        origin="40.7128,-74.0060",  # New York City coordinates
        scale="5km",                # 5 kilometers
        offset="1km",               # No decay within 1km
        decay=0.5,                  # Score is halved at offset + scale = 6km from the origin
        decay_type="gauss",         # Gaussian decay
        weight=3.0
    )
    
    # Boost highly-rated restaurants
    rating = FieldValueFactorDirective(
        field="rating",
        factor=1.0,
        modifier="sqrt",
        missing=3.0,
        weight=2.0
    )
    
    # Add some randomness to avoid always showing the same results
    random = RandomScoreDirective(
        seed=42,  # Fixed seed keeps the ordering reproducible; vary it (e.g. per session) to change results
        weight=0.1
    )
    
    class Config:
        score_mode = "sum"      # Sum the scores from all functions
        boost_mode = "multiply" # Multiply the combined function score with the query score
        min_score = 1.0         # Only return documents with a minimum score of 1.0
        value_mapper = RestaurantValueMapper()

# Create an engine instance with match parameters
engine = RestaurantScoreEngine().set_match_params(
    {
        "cuisine": "italian",
        "price_range": "$$"
    }
)

# Set the base query
engine.set_match_dsl(MatchAllQuery())

# Generate the query
print_query(engine)

In this example, we've created a function score engine for a restaurant search that boosts restaurants based on three factors:

1. **Proximity**: Uses the `DecayFunctionDirective` with a Gaussian decay to boost nearby restaurants
2. **Rating**: Uses the `FieldValueFactorDirective` to boost highly-rated restaurants
3. **Randomness**: Uses the `RandomScoreDirective` to add a small amount of randomness to the scores

The `score_mode` is set to "sum", which means the scores from all functions are added together. The `boost_mode` is set to "multiply", which means the combined function score is multiplied with the original query score. The `min_score` is set to 1.0, which excludes any document whose final score falls below 1.0.
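How `offset`, `scale`, and `decay` interact in a Gaussian decay is easiest to see numerically. The sketch below implements the formula from the Elasticsearch documentation in plain Python, with distances in kilometres. The function name `gauss_decay` is an assumption made for this illustration and is not part of the toolkit.

```python
import math


def gauss_decay(distance, scale, decay=0.5, offset=0.0):
    """Gaussian decay: returns `decay` when distance == offset + scale."""
    sigma_sq = -(scale ** 2) / (2 * math.log(decay))
    effective = max(0.0, distance - offset)  # no decay inside the offset radius
    return math.exp(-(effective ** 2) / (2 * sigma_sq))


# scale=5km, offset=1km: full score up to 1km, half score at 6km.
for km in (0.5, 1, 3.5, 6, 11):
    print(km, round(gauss_decay(km, scale=5, offset=1), 3))
```

Compared with exponential decay, the Gaussian curve stays close to 1.0 just outside the offset radius and then drops off faster, which suits "nearby is nearly as good as here" use cases.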

## Conditional Scoring

Sometimes you want to apply scoring functions conditionally based on match parameters. Let's create an engine that applies different scoring strategies based on user preferences:

In [None]:
# Define a value mapper for our product search
class ProductValueMapper(DirectiveValueMapper):
    category = FieldValue(
        fields=["category"],
        values_list=["match_params.category"]
    )
    brand = FieldValue(
        fields=["brand"],
        values_list=["match_params.brand"]
    )

# Create a function score engine with conditional scoring
class ConditionalScoreEngine(FunctionScoreEngine):
    # Boost by popularity (no filter, so this function always applies)
    popularity = ScriptScoreDirective(
        script="doc['popularity'].value * params.factor",
        weight=2.0,
    ).set_script_params(factor=1.5)
    
    recency = DecayFunctionDirective(
        field="created_date",
        origin="now",
        scale="30d",
        decay=0.5,
        decay_type="exp",
        weight=1.5,
    )
    
    # Boost preferred brands (applied only if `brand` is provided in `match_params`)
    # Note: `nullable_value` must also be set to `True` to allow a null output when resolving the filter DSL
    brand = WeightDirective(
        filter=ConstMatchDirective(rule=FieldMatchType.ANY, nullable_value=True),
        weight=2.0,
        null_filter_action=ScoreNullFilterAction.DISABLE_FUNCTION
    )
    
    class Config:
        score_mode = "sum"
        boost_mode = "multiply"
        value_mapper = ProductValueMapper()

# Create an engine instance with match parameters
engine = ConditionalScoreEngine().set_match_params(
    {
        "category": "electronics",
        "brand": "Apple",  # Omitting this value disables the brand function; try commenting out this line
    }
)

# Set the base query
engine.set_match_dsl(MatchAllQuery())

# Generate the query
print_query(engine)

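In this example, we've created a function score engine that mixes unconditional and conditional scoring functions:

1. **Popularity Boost**: Always applied, since it has no filter
2. **Recency Boost**: Always applied, since it has no filter
3. **Brand Boost**: Applied only if `brand` is provided in `match_params`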

We use the `null_filter_action=ScoreNullFilterAction.DISABLE_FUNCTION` parameter to disable the function if the filter resolves to None. This allows us to conditionally apply scoring functions based on match parameters.
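The effect of disabling a function can be illustrated with a hand-written version of the same idea, building the raw Elasticsearch `function_score` body as plain dictionaries. This is a sketch of the concept, not the toolkit's implementation. The helper names `build_functions` and `build_query` are hypothetical, and the `terms` filter is an assumption about what a brand filter might look like; the DSL the toolkit actually emits may differ.

```python
import json


def build_functions(match_params):
    """Assemble scoring functions, skipping any whose filter has no value."""
    functions = [
        {
            "script_score": {
                "script": {
                    "source": "doc['popularity'].value * params.factor",
                    "params": {"factor": 1.5},
                }
            },
            "weight": 2.0,
        },
        {
            "exp": {"created_date": {"origin": "now", "scale": "30d", "decay": 0.5}},
            "weight": 1.5,
        },
    ]
    brand = match_params.get("brand")
    if brand is not None:  # the filter resolves to nothing when brand is absent
        functions.append({"filter": {"terms": {"brand": [brand]}}, "weight": 2.0})
    return functions


def build_query(match_params):
    """Wrap the functions in a function_score query over match_all."""
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "functions": build_functions(match_params),
                "score_mode": "sum",
                "boost_mode": "multiply",
            }
        }
    }


print(json.dumps(build_query({"category": "electronics"}), indent=2))
```

Dropping the function entirely, rather than emitting it without a filter, matters here: a weight function with no filter would boost every document, not just those from the preferred brand.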

## Summary

In this notebook, we've explored how to use function score directives to customize document scoring. We've covered:

- Basic function score with field value factor and script score
- Boosting recent content with decay functions
- Geographical relevance with decay functions
- Conditional scoring based on match parameters

Function score directives provide a powerful way to customize document scoring based on various factors. They allow you to implement sophisticated ranking strategies that go beyond simple text relevance, such as boosting recent content, popular items, or items that match user preferences.

In the next notebook, we'll explore custom score functions, which allow you to encapsulate complex scoring logic in reusable components.