<a href="https://colab.research.google.com/github/withpi/cookbook-withpi/blob/main/colabs/Sagemaker_Scoring.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<a href="https://withpi.ai"><img src="https://withpi.ai/logo/logoFullBlack.svg" width="240px"></a>

<a href="https://code.withpi.ai"><font size="4">Documentation</font></a>

<a href="https://withpi.ai"><font size="4">Copilot</font></a>

# Scoring

Pi has published its Pi Scoring model for deployment on Amazon SageMaker.

Once you have deployed the model to a SageMaker endpoint in your own AWS account, this notebook shows how to run inference against it.

The notebook needs secrets that grant access to your AWS account, such as `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN`. When running locally, authenticate to AWS as you normally would.

Start by installing packages and adding environment variables.

In [None]:
%pip install boto3 tqdm


import os
from google.colab import userdata

os.environ["AWS_ACCESS_KEY_ID"] = userdata.get('AWS_ACCESS_KEY_ID')
os.environ["AWS_SECRET_ACCESS_KEY"] = userdata.get("AWS_SECRET_ACCESS_KEY")
os.environ["AWS_SESSION_TOKEN"] = userdata.get("AWS_SESSION_TOKEN")
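
Outside Colab, `google.colab` cannot be imported, so the cell above does not apply. As a minimal sketch for local runs, the check below reports which credential variables are unset. The helper name `missing_aws_vars` is hypothetical, and the check looks only at environment variables: boto3 can also pick up credentials from a shared config or credentials file, so an empty environment does not necessarily mean authentication will fail.

```python
import os

# Variable names taken from the setup cell above.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
# The session token only accompanies temporary credentials, so it is
# treated as optional here (an assumption of this sketch).
OPTIONAL_VARS = ("AWS_SESSION_TOKEN",)


def missing_aws_vars(env=None):
    """Return the required AWS variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_aws_vars()
if missing:
    print(f"Not set in the environment: {', '.join(missing)}")
else:
    print("AWS credential variables are set.")
```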



## Sample inference

Run the cell below to check that everything is working.

Before running it, set `endpoint_name` to the name of your SageMaker endpoint and `region_name` to the region the endpoint is deployed in.

In [None]:
import boto3
import json
import time

# Initialize the SageMaker runtime client
# Update the region if needed
sagemaker_runtime = boto3.client('sagemaker-runtime', region_name='us-east-1')

# Your endpoint configuration
endpoint_name = 'PiScorer'

latencies = []
for _ in range(10):
  start = time.perf_counter()
  response = sagemaker_runtime.invoke_endpoint(
      EndpointName=endpoint_name,
      ContentType='application/json',
      Body=json.dumps(
          {"llm_input": "", "llm_output": "Ordinary input","scoring_spec": [{"label": "Error Messages","question": "Does the log include error messages indicating unauthorized access attempts?","python_code": "from typing import Any\n\ndef score(\n    response_text: str,\n    input_text: str,\n    kwargs: dict[str, Any],\n) -> dict:\n\n    def evaluate_response(response):\n        \"\"\"\n        Check if the log includes error messages indicating unauthorized access attempts.\n        \n        Args:\n            response (str): The LLM's response text\n        \n        Returns:\n            float: 1.0 if error messages indicating unauthorized access attempts are found, 0.0 otherwise\n        \"\"\"\n        import re\n        \n        # Define patterns for error messages related to unauthorized access attempts\n        error_patterns = [\n            r'failed login attempt',  # Common phrase for failed login\n            r'unauthorized access',  # Explicit mention of unauthorized access\n            r'access denied',        # Access denied messages\n            r'invalid credentials',  # Invalid login credentials\n            r'login blocked',        # Login blocked messages\n            r'suspicious activity detected'  # Suspicious activity\n        ]\n        \n        # Convert response to lowercase for case-insensitive matching\n        response_lower = response.lower()\n        \n        # Check if any of the error patterns are present in the response\n        for pattern in error_patterns:\n            if re.search(pattern, response_lower):\n                return 1.0\n        \n        return 0.0  # No error messages indicating unauthorized access attempts found\n\n    final_score = evaluate_response(response_text)\n    return {\"score\": final_score, \"explanation\": \"\"}\n","scoring_type": "PYTHON_CODE","weight": 0.5},{"label": "Failed Login Attempts","question": "Does the log indicate repeated failed login attempts for the same user within a short time frame?","weight": 0.5},{"label": "Unusual IP Address","question": "Does the log show login attempts from IP addresses that are geographically inconsistent with the user's typical location?","weight": 0.5}]}
      )
  )
  stop = time.perf_counter()
  latencies.append(f"{stop-start:.3f}")

print(f"Latencies: {latencies}")
results = json.loads(response['Body'].read().decode())
display("Inference complete")
display(f"{results}")

Latencies: ['0.809', '0.339', '0.333', '0.337', '0.338', '0.339', '0.333', '0.338', '0.330', '0.331']


'Inference complete'

"{'total_score': 0.2092, 'question_scores': {'Error Messages': 0.0, 'Failed Login Attempts': 0.9258, 'Unusual IP Address': 0.9883}}"
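
The response above is a dictionary with a `total_score` and a per-label `question_scores` map. As a rough sketch of how the request and response could be handled with less inline JSON, the helpers below build a request body from plain `(label, question)` pairs and list the questions that scored low. The names `build_payload` and `weakest_questions` are hypothetical, the field names are copied from the sample request and output, the default weight of 0.5 mirrors the sample, and the 0.5 threshold is an arbitrary choice for illustration.

```python
import json


def build_payload(llm_input, llm_output, questions, weight=0.5):
    """Serialize a scoring request using the field names from the sample cell.

    `questions` is a list of (label, question) pairs; each gets the same weight.
    """
    scoring_spec = [
        {"label": label, "question": question, "weight": weight}
        for label, question in questions
    ]
    return json.dumps(
        {"llm_input": llm_input, "llm_output": llm_output, "scoring_spec": scoring_spec}
    )


def weakest_questions(results, threshold=0.5):
    """Return (label, score) pairs scoring below `threshold`, lowest first."""
    scores = results["question_scores"]
    low = [(label, score) for label, score in scores.items() if score < threshold]
    return sorted(low, key=lambda pair: pair[1])


# Response copied from the sample output above.
sample_results = {
    "total_score": 0.2092,
    "question_scores": {
        "Error Messages": 0.0,
        "Failed Login Attempts": 0.9258,
        "Unusual IP Address": 0.9883,
    },
}

body = build_payload(
    llm_input="",
    llm_output="Ordinary input",
    questions=[
        (
            "Failed Login Attempts",
            "Does the log indicate repeated failed login attempts "
            "for the same user within a short time frame?",
        )
    ],
)
print(weakest_questions(sample_results))  # prints [('Error Messages', 0.0)]
```

The resulting `body` string can be passed as the `Body` argument of `invoke_endpoint` in place of the inline `json.dumps(...)` call.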