# Custom Score Functions

In this notebook, we'll explore how to create custom score function directives to encapsulate complex scoring logic. Custom score functions allow you to package reusable scoring patterns that can be shared across multiple search applications.

## Introduction to Custom Score Functions

As your search applications grow in complexity, you'll often find yourself repeating the same scoring patterns. Custom score functions solve this problem by allowing you to:

1. Encapsulate complex scoring logic in a reusable component
2. Hide implementation details behind a clean interface
3. Share scoring patterns across multiple applications
4. Maintain consistent scoring behavior
5. Conditionally apply scoring based on runtime parameters

The toolkit provides the `CustomScoreFunctionDirective` class for creating custom score functions.
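
For orientation, score functions end up as entries in an Elasticsearch `function_score` query. The sketch below builds that general shape from plain dictionaries and illustrates point 5: a function that returns `None` is left out. The helpers `build_function_score` and `popularity_function` are hypothetical and are not part of the toolkit. The `term` filter and the fallback to the bare base query are illustrative assumptions, and the toolkit's actual output may differ.

```python
import json


def build_function_score(base_query, functions, score_mode="multiply", boost_mode="multiply"):
    """Assemble a function_score body, skipping score functions that are None."""
    active = [fn for fn in functions if fn is not None]
    if not active:
        # Assumption: with nothing to score, fall back to the plain base query
        return {"query": base_query}
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "functions": active,
                "score_mode": score_mode,
                "boost_mode": boost_mode,
            }
        }
    }


def popularity_function(apply_boost):
    """Hypothetical score function entry; returns None when the boost is disabled."""
    if not apply_boost:
        return None
    return {
        "filter": {"term": {"category": "electronics"}},
        "script_score": {
            "script": {
                "source": "doc['popularity'].value * params.boost_factor",
                "params": {"boost_factor": 1.5},
            }
        },
        "weight": 2.0,
    }


enabled = build_function_score({"match_all": {}}, [popularity_function(True)])
disabled = build_function_score({"match_all": {}}, [popularity_function(False)])
print(json.dumps(enabled, indent=2))
```

With the boost disabled, `disabled` contains only the base `match_all` query, which mirrors how a directive that returns `None` contributes nothing to the final scoring.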

## Setup

Let's import the necessary modules:

In [None]:
import json
from elastictoolkit.queryutils.builder.customscorefunctiondirective import CustomScoreFunctionDirective
from elastictoolkit.queryutils.builder.functionscoreengine import FunctionScoreEngine
from elastictoolkit.queryutils.builder.scorefunctiondirective import (
    ScriptScoreDirective,
    DecayFunctionDirective,
    FieldValueFactorDirective,
    WeightDirective
)
from elastictoolkit.queryutils.builder.matchdirective import ConstMatchDirective
from elastictoolkit.queryutils.builder.directivevaluemapper import DirectiveValueMapper
from elastictoolkit.queryutils.types import FieldValue
from elastictoolkit.queryutils.consts import FieldMatchType, ScoreNullFilterAction
from elasticquerydsl.filter import MatchAllQuery

# Helper function to print queries as formatted JSON
def print_query(directive):
    query = directive.to_dsl()
    print(json.dumps(query.to_query(), indent=2))

## Creating a Basic Custom Score Function

Let's start by creating a basic custom score function for a product search application. This function will boost products based on their popularity and recency:

In [None]:
# Define a value mapper for our product search
class ProductValueMapper(DirectiveValueMapper):
    category = FieldValue(
        fields=["category"],
        values_list=["match_params.category"]
    )
    brand = FieldValue(
        fields=["brand"],
        values_list=["match_params.brand"]
    )

# Create a custom score function for popularity and recency
class PopularityRecencyScore(CustomScoreFunctionDirective):
    # Define which engine this directive can be used with
    allowed_engine_cls_name = "ProductScoreEngine"
    
    def get_score_directive(self):
        # Check if we should apply popularity scoring
        if not self.match_params.get("apply_popularity_boost", True):
            return None
        
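        # Create a script score directive that combines popularity and recency.
        # Note: the script assumes `created_date` is readable as epoch milliseconds;
        # for a mapped `date` field, use doc['created_date'].value.toInstant().toEpochMilli()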
        return ScriptScoreDirective(
            script="""double popularity = doc['popularity'].value;
                    double days_old = (System.currentTimeMillis() - doc['created_date'].value) / 86400000.0;
                    double recency_factor = Math.exp(-days_old / params.recency_scale);
                    return popularity * recency_factor * params.boost_factor;""",
            weight=self._weight,
            filter=ConstMatchDirective(rule=FieldMatchType.ANY, name="category_filter")
        ).set_script_params(
            recency_scale=30.0,  # 30 days scale
            boost_factor=1.5
        )

# Create a function score engine
class ProductScoreEngine(FunctionScoreEngine):
    # Use our custom score function
    category = PopularityRecencyScore(weight=2.0)
    
    class Config:
        score_mode = "multiply"
        boost_mode = "multiply"
        value_mapper = ProductValueMapper()

# Create an engine instance with match parameters
engine = ProductScoreEngine().set_match_params(
    {
        "category": "electronics",
        "brand": "Apple",
        "apply_popularity_boost": True
    }
)

# Set the base query
engine.set_match_dsl(MatchAllQuery())

# Generate the query
print_query(engine)

In this example, we've created a custom score function that combines popularity and recency scoring. The `PopularityRecencyScore` directive encapsulates the logic for calculating a score from a product's popularity and how recently it was created: the recency factor decays exponentially with the product's age, using a 30-day scale. The directive returns `None`, so no scoring is applied, when `apply_popularity_boost` is set to a falsy value in the match parameters; if the parameter is omitted, it defaults to `True`.
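
To get a feel for the numbers the script produces, the same formula can be reproduced in plain Python. This is a sketch for sanity-checking only: the function name `popularity_recency_score` and the sample timestamp are made up for illustration, and the function-level `weight` of 2.0 is applied separately by Elasticsearch, so it is not included here.

```python
import math

MILLIS_PER_DAY = 86_400_000.0


def popularity_recency_score(popularity, created_ms, now_ms,
                             recency_scale=30.0, boost_factor=1.5):
    """Mirror of the script: popularity * exp(-age_in_days / scale) * boost."""
    days_old = (now_ms - created_ms) / MILLIS_PER_DAY
    recency_factor = math.exp(-days_old / recency_scale)
    return popularity * recency_factor * boost_factor


now_ms = 1_700_000_000_000  # arbitrary "current time" in epoch milliseconds
for days in (0, 30, 90):
    created_ms = now_ms - days * MILLIS_PER_DAY
    score = popularity_recency_score(100.0, created_ms, now_ms)
    print(f"{days:>2} days old -> {score:.2f}")
```

A product that is exactly `recency_scale` days old keeps about 37% (1/e) of the score it would have had on the day it was created, so the scale controls how quickly older products fade.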

## Creating a More Complex Custom Score Function

Now let's create a more complex custom score function for a real estate search application. This function will score properties based on multiple factors, including location, price, and amenities:

In [None]:
# Define a value mapper for our real estate search
class RealEstateValueMapper(DirectiveValueMapper):
    property_type = FieldValue(
        fields=["property_type"],
        values_list=["match_params.property_type"]
    )
    amenities = FieldValue(
        fields=["amenities"],
        values_list=["*match_params.required_amenities"]
    )

# Create a custom score function for real estate properties
class PropertyScoreFunction(CustomScoreFunctionDirective):
    allowed_engine_cls_name = "RealEstateScoreEngine"
    
    def get_score_directive(self):
        # Get user preferences from match parameters
        user_location = self.match_params.get("user_location")
        max_price = self.match_params.get("max_price")
        preferred_amenities = self.match_params.get("preferred_amenities", [])
        
        # The distance term needs a user location; without one, skip this score function entirely
        if not user_location:
            return None
        
        # Create a script that scores properties based on multiple factors
        script = """
            // Location score - closer is better
            double lat1 = params.user_lat;
            double lon1 = params.user_lon;
            double lat2 = doc['location.lat'].value;
            double lon2 = doc['location.lon'].value;
            
            // Calculate distance in kilometers using Haversine formula
            double dLat = Math.toRadians(lat2 - lat1);
            double dLon = Math.toRadians(lon2 - lon1);
            double a = Math.sin(dLat/2) * Math.sin(dLat/2) +
                      Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2)) *
                      Math.sin(dLon/2) * Math.sin(dLon/2);
            double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
            double distance = 6371 * c; // Earth radius in km
            
            // Convert distance to a score (closer = higher score)
            double location_score = Math.exp(-distance / params.distance_scale);
            
            // Price score - full score at or below max_price, scaled down above it
            double price = doc['price'].value;
            double price_score = 1.0;
            if (params.max_price > 0) {
                price_score = params.max_price / price;
                price_score = Math.min(price_score, 1.0); // Cap at 1.0
            }
            
            // Amenities score - more matching amenities is better
            double amenities_score = 1.0;
            if (params.preferred_amenities.length > 0) {
                int matches = 0;
                for (int i = 0; i < params.preferred_amenities.length; i++) {
                    if (doc['amenities'].contains(params.preferred_amenities[i])) {
                        matches++;
                    }
                }
                amenities_score = 1.0 + (matches * params.amenity_boost);
            }
            
            // Combine scores with weights
            return (location_score * params.location_weight) *
                   (price_score * params.price_weight) *
                   (amenities_score * params.amenities_weight);
        """
        
        # Extract latitude and longitude from user_location
        user_lat, user_lon = user_location
        
        # Create the script score directive
        return ScriptScoreDirective(
            script=script,
            weight=self._weight,
            filter=ConstMatchDirective(rule=FieldMatchType.ANY, name="property_type_filter")
        ).set_script_params(
            user_lat=user_lat,
            user_lon=user_lon,
            distance_scale=10.0,  # 10km scale
            max_price=max_price or 0,
            preferred_amenities=preferred_amenities,
            amenity_boost=0.2,
            location_weight=2.0,
            price_weight=1.5,
            amenities_weight=1.0
        )

# Create a function score engine
class RealEstateScoreEngine(FunctionScoreEngine):
    # Use our custom score function
    property_type = PropertyScoreFunction(weight=1.0)
    
    class Config:
        score_mode = "multiply"
        boost_mode = "multiply"
        value_mapper = RealEstateValueMapper()

# Create an engine instance with match parameters
engine = RealEstateScoreEngine().set_match_params(
    {
        "property_type": "apartment",
        "required_amenities": ["parking", "elevator"],
        "user_location": [40.7128, -74.0060],  # New York City coordinates
        "max_price": 2000,
        "preferred_amenities": ["gym", "pool", "balcony"]
    }
)

# Set the base query
engine.set_match_dsl(MatchAllQuery())

# Generate the query
print_query(engine)

In this example, we've created a complex custom score function for real estate properties. The `PropertyScoreFunction` directive calculates a score based on multiple factors:

1. **Location**: Properties closer to the user's location get higher scores
2. **Price**: Properties priced at or below the max price get the full price score; more expensive properties are penalized in proportion to how far they exceed it
3. **Amenities**: Properties with more of the user's preferred amenities get higher scores

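The function multiplies the three weighted factors to produce a final score. Because the factors are multiplied rather than added, the per-factor weights collapse into a single constant multiplier (2.0 × 1.5 × 1.0 = 3.0) and do not change the factors' relative influence. The directive also returns `None` when no user location is provided, so the entire score function, including the price and amenities terms, is skipped in that case.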
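
The three factors can be checked outside Elasticsearch with a plain-Python mirror of the script. This is a sketch under assumptions: `haversine_km` and `property_score` are illustrative names, properties are modeled as simple dictionaries, and prices are assumed to be greater than zero.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    d_lat = math.radians(lat2 - lat1)
    d_lon = math.radians(lon2 - lon1)
    a = (math.sin(d_lat / 2) ** 2
         + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
         * math.sin(d_lon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.atan2(math.sqrt(a), math.sqrt(1 - a))


def property_score(user, prop, max_price=0, preferred=(),
                   distance_scale=10.0, amenity_boost=0.2,
                   location_weight=2.0, price_weight=1.5, amenities_weight=1.0):
    """Mirror of the script: product of location, price and amenities factors."""
    distance = haversine_km(user[0], user[1], prop["lat"], prop["lon"])
    location_score = math.exp(-distance / distance_scale)
    # Full score at or below max_price, proportionally lower above it
    price_score = min(max_price / prop["price"], 1.0) if max_price > 0 else 1.0
    matches = sum(1 for amenity in preferred if amenity in prop["amenities"])
    amenities_score = 1.0 + matches * amenity_boost
    return ((location_score * location_weight)
            * (price_score * price_weight)
            * (amenities_score * amenities_weight))


user = (40.7128, -74.0060)
nearby = {"lat": 40.7128, "lon": -74.0060, "price": 1500, "amenities": ["gym", "pool"]}
pricey = dict(nearby, price=4000)  # same listing at twice the budget
for name, prop in (("nearby", nearby), ("pricey", pricey)):
    score = property_score(user, prop, max_price=2000,
                           preferred=["gym", "pool", "balcony"])
    print(f"{name}: {score:.2f}")
```

The second listing costs twice the maximum price, so its price factor is 0.5 and its overall score is half that of the first listing.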

## Dynamic Custom Score Functions

Now let's create a dynamic custom score function that adapts its behavior based on match parameters:

In [None]:
# Define a value mapper for our job search
class JobSearchValueMapper(DirectiveValueMapper):
    job_type = FieldValue(
        fields=["job_type"],
        values_list=["match_params.job_type"]
    )
    skills = FieldValue(
        fields=["skills"],
        values_list=["*match_params.required_skills"]
    )

# Create a dynamic custom score function for job search
class JobMatchScoreFunction(CustomScoreFunctionDirective):
    allowed_engine_cls_name = "JobSearchScoreEngine"
    
    def get_score_directive(self):
        # Get scoring mode from match parameters
        scoring_mode = self.match_params.get("scoring_mode", "balanced")
        user_skills = self.match_params.get("user_skills", [])
        user_experience = self.match_params.get("user_experience", 0)
        
        # Skills are required by every mode except "experience_focused";
        # without them the skill-match ratio would divide by zero
        if scoring_mode != "experience_focused" and not user_skills:
            return None
        
        # Different scoring logic based on scoring mode
        if scoring_mode == "skills_focused":
            # Focus on skills match
            script = """
                // Skills match score
                int matches = 0;
                for (int i = 0; i < params.user_skills.length; i++) {
                    if (doc['skills'].contains(params.user_skills[i])) {
                        matches++;
                    }
                }
                double skill_match_ratio = (double)matches / params.user_skills.length;
                return Math.pow(skill_match_ratio, 2) * 10.0;
            """
            return ScriptScoreDirective(
                script=script,
                weight=self._weight,
                filter=ConstMatchDirective(rule=FieldMatchType.ANY, name="job_type_filter")
            ).set_script_params(user_skills=user_skills)
            
        elif scoring_mode == "experience_focused":
            # Focus on experience match
            script = """
                // Experience match score
                double job_exp = doc['required_experience'].value;
                double user_exp = params.user_experience;
                
                // Best match if user experience is within one year of what's required
                if (Math.abs(job_exp - user_exp) < 1.0) {
                    return 10.0;
                }
                // Good match if the gap is under two years
                else if (Math.abs(job_exp - user_exp) < 2.0) {
                    return 8.0;
                }
                // Acceptable match if user has more experience than required
                else if (user_exp > job_exp) {
                    return 6.0;
                }
                // Poor match if user has less experience than required
                else {
                    return 3.0;
                }
            """
            return ScriptScoreDirective(
                script=script,
                weight=self._weight,
                filter=ConstMatchDirective(rule=FieldMatchType.ANY, name="job_type_filter")
            ).set_script_params(user_experience=user_experience)
            
        else:  # balanced mode (default)
            # Balanced scoring considering both skills and experience
            script = """
                // Skills match score
                int matches = 0;
                for (int i = 0; i < params.user_skills.length; i++) {
                    if (doc['skills'].contains(params.user_skills[i])) {
                        matches++;
                    }
                }
                double skill_match_ratio = (double)matches / params.user_skills.length;
                double skill_score = Math.pow(skill_match_ratio, 2) * 5.0;
                
                // Experience match score
                double job_exp = doc['required_experience'].value;
                double user_exp = params.user_experience;
                double exp_diff = Math.abs(job_exp - user_exp);
                double exp_score = 5.0 * Math.exp(-exp_diff / 2.0);
                
                // Combine scores
                return skill_score + exp_score;
            """
            return ScriptScoreDirective(
                script=script,
                weight=self._weight,
                filter=ConstMatchDirective(rule=FieldMatchType.ANY, name="job_type_filter")
            ).set_script_params(
                user_skills=user_skills,
                user_experience=user_experience
            )

# Create a function score engine
class JobSearchScoreEngine(FunctionScoreEngine):
    # Use our custom score function
    job_type = JobMatchScoreFunction(weight=1.0)
    
    class Config:
        score_mode = "multiply"
        boost_mode = "multiply"
        value_mapper = JobSearchValueMapper()

# Create an engine instance with match parameters
engine = JobSearchScoreEngine().set_match_params(
    {
        "job_type": "full-time",
        "required_skills": ["python", "elasticsearch"],
        "user_skills": ["python", "elasticsearch", "javascript", "aws"],
        "user_experience": 5,
        "scoring_mode": "balanced"  # Try changing to "skills_focused" or "experience_focused"
    }
)

# Set the base query
engine.set_match_dsl(MatchAllQuery())

# Generate the query
print_query(engine)

In this example, we've created a dynamic custom score function for job search. The `JobMatchScoreFunction` directive adapts its scoring logic based on the `scoring_mode` parameter:

1. **Skills Focused**: Emphasizes matching the user's skills with the job requirements
2. **Experience Focused**: Emphasizes matching the user's experience level with the job requirements
3. **Balanced**: Adds a skills score and an experience score, each worth up to 5 points

This approach allows you to create flexible scoring functions that adapt to different user preferences or search contexts.
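
To compare the three modes side by side, the scripts can be mirrored in plain Python. This is a sketch for exploration only: `skill_match_ratio`, `experience_tier` and `job_score` are illustrative names, the job is modeled as a dictionary, and `user_skills` is assumed to be non-empty in the modes that use it.

```python
import math


def skill_match_ratio(user_skills, job_skills):
    """Fraction of the user's skills that appear in the job's skills."""
    matches = sum(1 for skill in user_skills if skill in job_skills)
    return matches / len(user_skills)


def experience_tier(job_exp, user_exp):
    """Tiered experience score used by the experience-focused mode."""
    diff = abs(job_exp - user_exp)
    if diff < 1.0:
        return 10.0
    if diff < 2.0:
        return 8.0
    if user_exp > job_exp:
        return 6.0
    return 3.0


def job_score(mode, user_skills, user_exp, job):
    """Mirror of the three scripts, selected by scoring mode."""
    if mode == "skills_focused":
        return skill_match_ratio(user_skills, job["skills"]) ** 2 * 10.0
    if mode == "experience_focused":
        return experience_tier(job["required_experience"], user_exp)
    # Balanced: up to 5 points for skills plus up to 5 for experience
    skill_score = skill_match_ratio(user_skills, job["skills"]) ** 2 * 5.0
    exp_diff = abs(job["required_experience"] - user_exp)
    return skill_score + 5.0 * math.exp(-exp_diff / 2.0)


user_skills = ["python", "elasticsearch", "javascript", "aws"]
job = {"skills": ["python", "elasticsearch"], "required_experience": 3}
for mode in ("skills_focused", "experience_focused", "balanced"):
    print(f"{mode}: {job_score(mode, user_skills, 5, job):.2f}")
```

Note that the ratio is taken over the user's skills, not the job's: a user who lists many skills the job does not mention gets a lower skills score, even when every skill the job mentions is covered.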

## Summary

In this notebook, we've explored how to create custom score function directives to encapsulate complex scoring logic. We've covered:

- Creating basic custom score functions
- Building more complex custom score functions with multiple factors
- Creating dynamic custom score functions that adapt to match parameters

Custom score functions provide a powerful way to encapsulate complex scoring logic in reusable components. They allow you to:

1. Hide implementation details behind a clean interface
2. Share scoring patterns across multiple applications
3. Create modular, maintainable search applications
4. Adapt scoring behavior based on runtime parameters

By using custom score functions, you can create sophisticated search applications with advanced scoring capabilities while keeping your code clean, modular, and maintainable.