Nova Custom Evaluation SDK

A Python SDK for creating custom evaluation metrics for AWS Nova on-demand model evaluation on SageMaker training jobs, with built-in Pydantic validation. For the official integration with SageMaker training jobs, see the official AWS SageMaker documentation.

Installation

git clone https://github.com/aws/nova-custom-eval-sdk.git
cd nova-custom-eval-sdk
pip install .

Architecture

The SDK provides:

Pydantic Validation: Automatic input/output validation using Pydantic models
PreProcessor: For input data transformation with validation
PostProcessor: For output data formatting with validation
Decorators: Simplified processor creation (@preprocess, @postprocess)
Lambda Handler Builder: Easy Lambda function creation
Exception Handling: Custom error types with validation feedback
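
To make the division of labor above concrete, here is a minimal, SDK-free sketch of how a handler builder might route an event to a preprocessor or a postprocessor. It assumes routing is keyed on the process_type field that appears in the payload examples in this README. The name build_handler_sketch and the 400 error shape are hypothetical and are not the SDK's actual implementation.

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any], Any], Dict[str, Any]]


def build_handler_sketch(preprocessor: Handler, postprocessor: Handler) -> Handler:
    """Return one entry point that dispatches on event["process_type"].

    Hypothetical stand-in for illustration; the real SDK may route differently.
    """
    routes = {"preprocess": preprocessor, "postprocess": postprocessor}

    def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
        process_type = event.get("process_type")
        target = routes.get(process_type)
        if target is None:
            # Assumed error shape; the SDK defines its own exception types.
            return {"statusCode": 400, "body": f"unknown process_type: {process_type!r}"}
        return target(event, context)

    return handler


def echo_pre(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    # Toy preprocessor: pass the prompt through unchanged.
    return {"statusCode": 200, "body": {"prompt": event["data"]["prompt"]}}


def count_post(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    # Toy postprocessor: report how many records were received.
    return {"statusCode": 200, "body": [{"metric": "n_records", "value": len(event["data"])}]}


handler = build_handler_sketch(echo_pre, count_post)
pre_result = handler({"process_type": "preprocess", "data": {"prompt": "hi"}})
post_result = handler({"process_type": "postprocess", "data": [{}, {}]})
bad_result = handler({"process_type": "other", "data": {}})
```

Keeping the routing table in one place means each processor only has to deal with its own payload shape.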

Quick Start

Complete Example

See example/run_example.py for a complete working example that runs locally.

Run in AWS Lambda

To use the SDK in AWS Lambda, create a Lambda function (follow this guide) and upload nova-custom-eval-sdk as a Lambda layer.

The GitHub release should include a pre-built nova-custom-eval-layer.zip file.

Use the command below to publish it as a custom Lambda layer.

aws lambda publish-layer-version \
    --layer-name nova-custom-eval-layer \
    --zip-file fileb://nova-custom-eval-layer.zip \
    --compatible-runtimes python3.12 python3.11 python3.10 python3.9

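Add this layer to your Lambda function as a custom layer, together with the required AWS layer AWSLambdaPowertoolsPythonV3-python312-arm64, which provides the pydantic dependency. That layer name targets Python 3.12 on arm64, so your function's runtime and architecture need to match it.

Then update your Lambda function code as follows: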

from nova_custom_evaluation_sdk.processors.decorators import preprocess, postprocess
from nova_custom_evaluation_sdk.lambda_handler import build_lambda_handler

@preprocess
def preprocessor(event: dict, context) -> dict:
    data = event.get('data', {})
    return {
        "statusCode": 200,
        "body": {
            "system": data.get("system"),
            "prompt": data.get("prompt", ""),
            "gold": data.get("gold", "")
        }
    }

@postprocess
def postprocessor(event: dict, context) -> dict:
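    # data is already validated and extracted from event; the .get() calls
    # below expect a dict, so the fallback is {} rather than []
    data = event.get('data', {})
    inference_output = data.get('inference_output', '')
    gold = data.get('gold', '')
    
    metrics = []
    # 0.0 when the output matches gold (case-insensitive), 1.0 otherwise
    inverted_accuracy = 0.0 if inference_output.lower() == gold.lower() else 1.0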
    metrics.append({
        "metric": "inverted_accuracy_custom",
        "value": inverted_accuracy
    })
    
    # Add more metrics here
    
    return {
        "statusCode": 200,
        "body": metrics
    }

# Build Lambda handler
lambda_handler = build_lambda_handler(
    preprocessor=preprocessor,
    postprocessor=postprocessor
)
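
The metric logic in the postprocessor can be exercised without Lambda or the SDK. The sketch below pulls the comparison into a plain function. The name inverted_accuracy_metrics is a hypothetical helper, and the record is assumed to be a single dict with inference_output and gold keys, as read by the postprocessor above.

```python
from typing import Any, Dict, List


def inverted_accuracy_metrics(record: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return 0.0 when the output matches gold (case-insensitive), else 1.0."""
    inference_output = record.get("inference_output", "")
    gold = record.get("gold", "")
    value = 0.0 if inference_output.lower() == gold.lower() else 1.0
    return [{"metric": "inverted_accuracy_custom", "value": value}]


# A case-insensitive match scores 0.0; any mismatch scores 1.0.
match = inverted_accuracy_metrics({"inference_output": "Paris", "gold": "paris"})
miss = inverted_accuracy_metrics({"inference_output": "Lyon", "gold": "Paris"})
print(match)
print(miss)
```

Because the score is inverted, lower is better: a perfect run averages 0.0.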

Input/Output Validation

The SDK automatically validates:

Preprocessing Input

{
  "process_type": "preprocess",
  "data": {
    "prompt": "what can you do?",
    "gold": "Hello! How can I help you today?",
    "system": "You are a helpful assistant" 
  }
}

Postprocessing Input

{
  "process_type": "postprocess",
  "data": [
    {
      "prompt": "what can you do",
      "inference_output": "Hello! How can I help you today?",
      "gold": "Hello! How can I help you today?"
    }
  ]
}
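
As a rough illustration of what validating these two payload shapes involves, the sketch below checks them with plain Python. It is not the SDK's Pydantic models: validate_event_sketch is a hypothetical name, and the required keys are only inferred from the two examples above (system is treated as optional, which is an assumption).

```python
from typing import Any, Dict, Tuple

# Required keys are assumptions inferred from the example payloads.
PRE_REQUIRED = ("prompt", "gold")
POST_REQUIRED = ("prompt", "inference_output", "gold")


def _check_record(record: Any, required: Tuple[str, ...]) -> None:
    """Raise ValueError unless record is a dict containing every required key."""
    if not isinstance(record, dict):
        raise ValueError(f"expected an object, got {type(record).__name__}")
    missing = [key for key in required if key not in record]
    if missing:
        raise ValueError(f"missing keys: {missing}")


def validate_event_sketch(event: Dict[str, Any]) -> str:
    """Validate the payload shape and return its process_type."""
    process_type = event.get("process_type")
    data = event.get("data")
    if process_type == "preprocess":
        # Preprocessing input: data is a single object.
        _check_record(data, PRE_REQUIRED)
    elif process_type == "postprocess":
        # Postprocessing input: data is a list of objects.
        if not isinstance(data, list):
            raise ValueError("postprocess data must be a list of records")
        for record in data:
            _check_record(record, POST_REQUIRED)
    else:
        raise ValueError(f"unknown process_type: {process_type!r}")
    return process_type


pre_event = {
    "process_type": "preprocess",
    "data": {"prompt": "what can you do?", "gold": "Hello! How can I help you today?"},
}
post_event = {
    "process_type": "postprocess",
    "data": [
        {
            "prompt": "what can you do",
            "inference_output": "Hello! How can I help you today?",
            "gold": "Hello! How can I help you today?",
        }
    ],
}
pre_kind = validate_event_sketch(pre_event)
post_kind = validate_event_sketch(post_event)
```

The main structural difference to keep in mind is that preprocessing receives one object in data, while postprocessing receives a list.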

Testing

# Run all tests
python -m pytest -v

# Run example
python example/run_example.py

Development

# Install in development mode
pip install -e .

# Run tests with coverage
python -m pytest tests/ --cov=nova_custom_evaluation_sdk

Contributing

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
