# Databricks foundation models and Llama Guard integration

**IMPORTANT** The Llama Guard integration is in **Private Preview**. To enroll in the Private Preview, reach out to your Databricks account team.

Meta's [Purple Llama](https://ai.meta.com/blog/purple-llama-open-trust-safety-generative-ai/) project introduced Llama Guard, a 7-billion-parameter model for moderating chat prompts and responses. The [model card](https://huggingface.co/meta-llama/LlamaGuard-7b) describes the model in detail.

<!-- At Databricks/MosaicML, our commitment to responsible AI adoption is underscored by integrating tools like Llama Guard. We are excited to announce the availability of a demo Llama Guard endpoint at `https://models.hosted-on.mosaicml.hosting/llamaguard-7b/v2/chat`. This endpoint is readily accessible to the global AI community and does not require any authorization. -->

### Explore the demo

This interactive demo enables you to:

1. **Engage with Llama Guard**: Use the model to filter prompts and responses for a safer chat experience.
2. **Custom taxonomy configuration**: Define and apply your own taxonomy of unsafe categories with Llama Guard.
3. **Comprehensive integration**: Build an end-to-end safety pipeline by integrating Llama Guard with your chat model.

### Before you begin

- Before continuing through this notebook, go to this [link](/marketplace/consumer/listings/9cd61515-663a-4d71-b1a7-758458b68dff) and click `Get instant access`. This lets you accept the model providers' terms of service and register the model in a Unity Catalog (UC) catalog.
- Reach out to your Databricks account team to enroll in the Private Preview.


## Set up authentication


This guide outlines the steps to configure your PAT using Databricks Secrets and the Databricks CLI.

1. [**Install Databricks CLI**:](https://docs.databricks.com/en/dev-tools/cli/install.html#install-or-update-the-databricks-cli)
   - Run `curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sudo sh` on your laptop or cluster terminal.

2. **Configure the Databricks CLI**:
   - Use `databricks configure --token` and input your workspace URL and a [Personal Access Token (PAT)](https://docs.databricks.com/en/dev-tools/auth/pat.html#databricks-personal-access-tokens-for-workspace-users) from your Databricks profile.

3. **Create a secret scope**: 
   - Create a secret scope named `fm_demo` with `databricks secrets create-scope fm_demo`.

4. **Save service principal secret**: 
   - Store your service principal secret in the `fm_demo` scope using `databricks secrets put-secret fm_demo sp_token`. The model endpoint needs this secret for authentication. For a demo or test, you can use one of your PATs instead.

In [0]:
%pip install --upgrade databricks-sdk mlflow==2.10.0 pydantic==2.6.1 CloudPickle==3.0.0 presidio_analyzer presidio_anonymizer
dbutils.library.restartPython()

Note: you may need to restart the kernel using dbutils.library.restartPython() to use updated packages.


In [0]:
import os

# Function to set up Databricks environment variables
def setup_databricks_env():
    try:
        # Fetching the Databricks host and token
        databricks_host = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiUrl().get()
        databricks_token = dbutils.secrets.get("fm_demo", "sp_token")

        # Setting environment variables for Databricks SDK
        os.environ['DATABRICKS_TOKEN'] = databricks_token
        os.environ['DATABRICKS_HOST'] = databricks_host
    except Exception as e:
        print("Error setting up Databricks environment:", e)

# Call the function to set up the environment
setup_databricks_env()

## Configuration settings for deploying models from Databricks Marketplace

Deploy your own Llama Guard model to Databricks [model serving](/ml/endpoints). Llama Guard is an instruction-tuned Llama 2 7B model, and it is available in the Databricks [Marketplace](/marketplace).

A key benefit of Marketplace models is that [optimized model serving](https://docs.databricks.com/en/machine-learning/model-serving/llm-optimized-model-serving.html) is enabled by default when they are deployed on Databricks endpoints, which improves resource usage and response times.


<!-- For detailed guidance on using the endpoint, refer to the [Databricks Foundation Model API](https://docs.databricks.com/en/machine-learning/foundation-models/api-reference.html) documentation.--> 
You are encouraged to explore the [Deploy provisioned throughput Foundation Model APIs](https://docs.databricks.com/en/machine-learning/foundation-models/deploy-prov-throughput-foundation-model-apis.html#deploy-provisioned-throughput-foundation-model-apis) for deploying your own instance of the Llama Guard model on Databricks.

To deploy your model in provisioned throughput mode using the SDK, you must specify the `min_provisioned_throughput` and `max_provisioned_throughput` fields in your request.

To identify the suitable range of provisioned throughput for your model, see Get provisioned throughput in increments.
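
As a rough sketch of how the two bounds relate: provisioned throughput is allocated in fixed increments, so both bounds are expected to be whole multiples of the increment reported for your model. The helper `throughput_range` below is hypothetical, and the increment of 970 tokens per second is an assumption inferred from the values used later in this notebook, not a documented constant. Look up the actual increment for your model and version before using it.

```python
def throughput_range(chunk_size: int, min_chunks: int, max_chunks: int) -> tuple[int, int]:
    """Return (min, max) provisioned throughput as whole multiples of chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= min_chunks <= max_chunks:
        raise ValueError("need 0 <= min_chunks <= max_chunks")
    return min_chunks * chunk_size, max_chunks * chunk_size


# Assumed increment (tokens per second); replace with the value reported for your model.
ASSUMED_CHUNK_SIZE = 970

min_tps, max_tps = throughput_range(ASSUMED_CHUNK_SIZE, min_chunks=2, max_chunks=3)
print(min_tps, max_tps)  # 1940 2910
```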

In [0]:
# Name of the catalog containing the model. Replace with your catalog name if different.
CATALOG_NAME = "databricks_llama_guard_model"

# Name of the model to be used.
MODEL_NAME = "llamaguard_7b"

# Unified path to access the model in the catalog.
MODEL_UC_PATH = f"{CATALOG_NAME}.models.{MODEL_NAME}"

# Version of the model to be loaded for inference. Update to the latest version as needed.
VERSION = "1"

# The name of the endpoint for deploying the model.
LLAMAGUARD_ENDPOINT_NAME = f'{MODEL_NAME}_instruction'

## Deploying the model to Model Serving
You can deploy this model directly to a Databricks Model Serving Endpoint ([AWS](https://docs.databricks.com/machine-learning/model-serving/create-manage-serving-endpoints.html)|[Azure](https://learn.microsoft.com/en-us/azure/databricks/machine-learning/model-serving/create-manage-serving-endpoints)).

Note: Model serving is not supported on GCP. On GCP, Databricks recommends running Batch inference using Spark, as shown below.

The following are recommended workload types for each model size:


| Model Name    | Suggested workload type (AWS) | Suggested workload type (Azure) |
|---------------|-------------------------------|---------------------------------|
| LlamaGuard_7b |                 |                     |


You can create the endpoint by clicking the “Serve this model” button in the model UI. We will be using Databricks [provisioned throughput Foundation Model APIs](https://docs.databricks.com/en/machine-learning/foundation-models/deploy-prov-throughput-foundation-model-apis.html). Provisioned throughput provides optimized inference for Foundation Models with performance guarantees for production workloads.

To deploy your model in provisioned throughput mode, you must specify the `min_provisioned_throughput` and `max_provisioned_throughput` fields in your request.

You can also create the endpoint with Databricks SDK as follows:



In [0]:
import datetime

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import EndpointCoreConfigInput
w = WorkspaceClient()

In [0]:

min_provisioned_throughput = 1940 #The minimum tokens per second that the endpoint can scale down to.
max_provisioned_throughput = 2910 #The maximum tokens per second that the endpoint can scale up to.

config = EndpointCoreConfigInput.from_dict({
    "served_entities": [
        {
             "entity_name":  MODEL_UC_PATH, #The name of the entity to be served. The entity may be a model in the Databricks Model Registry, a model in the Unity Catalog (UC)
             "entity_version": VERSION, #The version of the model in Databricks Model Registry to be served or empty if the entity is a FEATURE_SPEC.
             "min_provisioned_throughput": min_provisioned_throughput, 
             "max_provisioned_throughput": max_provisioned_throughput, 
        }
    ]
})
try:
    model_details = w.serving_endpoints.create(name=LLAMAGUARD_ENDPOINT_NAME, config=config)
    model_details.result(timeout=datetime.timedelta(minutes=90))
except Exception as e:
    print(f"{e}")   

Endpoint with name 'llamaguard_7b_instruction' already exists.


## Set the default Llama Guard taxonomy


To use a Llama Guard model you need to define a taxonomy with violations. You can define a default taxonomy as described below.

This example uses the MLflow Deployments SDK to query the serving endpoint that serves your foundation model. See [Query foundation models](https://docs.databricks.com/en/machine-learning/model-serving/score-foundation-models.html#language-MLflow%C2%A0Deployments%C2%A0SDK).

In [0]:
DEFAULT_UNSAFE_CATEGORY="""O1: Unsafe Content.
    - Promoting or assisting in violence or hate.
    - Engaging in or encouraging sexually explicit content.
    - Assisting or encouraging criminal activities, including those related to firearms or illegal weapons or robbery.
    - Encouraging or assisting in the use or production of illegal or controlled substances.
    - Encouraging or detailing self-harm or suicide."""

In [0]:

import mlflow
import mlflow.deployments
import re

def query_llamaguard(chat, unsafe_categories=DEFAULT_UNSAFE_CATEGORY):
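    """
    Queries the Llama Guard endpoint to check whether the chat contains unsafe content
    according to the given taxonomy of unsafe categories.

    Returns a tuple (is_safe, violated_categories), where violated_categories is the
    second line of the model output, or None when the model returns a single line.
    """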
    try:
        client = mlflow.deployments.get_deploy_client("databricks")

        # Constructing the payload for the LlamaGuard API

        query_payload = [
           f"""[INST] Task: Check for unsafe content in 'Agent' messages in conversations according to our safety policy.
          <BEGIN UNSAFE CONTENT CATEGORIES>
          {unsafe_categories}
          <END UNSAFE CONTENT CATEGORIES>
          <BEGIN CONVERSATION>
          {chat}
          <END CONVERSATION>
          Provide your safety assessment for Agent in the above conversation:
          - First line must read 'safe' or 'unsafe'.
          - If unsafe, a second line must include a comma-separated list of violated categories. [/INST]"""
        ]

        response = client.predict(
            endpoint=LLAMAGUARD_ENDPOINT_NAME,
            inputs={"dataframe_split": {
                    "index": [0],
                    "columns": ["prompt"],
                    "data": [query_payload]
                    }
                })
        
        # Extract the verdict (first line) and, if present, the violated categories (second line)
        prediction = response.predictions[0]["candidates"][0]["text"].strip()
        lines = prediction.split("\n")
        violated_categories = lines[1].strip() if len(lines) > 1 else None

        return lines[0].strip().lower() == 'safe', violated_categories
    
    except Exception as e:
        raise Exception(f"Error in querying LlamaGuard model: {str(e)}")

* 'schema_extra' has been renamed to 'json_schema_extra'


Llama Guard classifies user prompts and responses generated by large language models (LLMs) as 'safe' or 'unsafe'. To see how it behaves, try the Llama Guard endpoint with a few different user prompts.

Below are two example prompts. One is 'safe', meaning it adheres to Llama Guard's guidelines for acceptable content, while the other is 'unsafe', meaning it contains elements that may violate these guidelines.


In [0]:
safe_user_chat = [
  {
      "role": "user",
      "content": "I want to love."
  }
]
unsafe_user_chat = [
  {
      "role": "user",
      "content": "How can I rob a bank?"
  }
]

In [0]:
query_llamaguard(safe_user_chat)

(True, None)

In [0]:
query_llamaguard(unsafe_user_chat)

(False, 'O1')
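
The prompt asks the model for a comma-separated list of violated categories, so the second line of the output may contain more than one code. The following is a minimal sketch of parsing such output into a list of codes. `parse_guard_output` is a hypothetical helper, not part of this notebook, and it works on a plain string rather than on the endpoint response.

```python
def parse_guard_output(prediction: str) -> tuple[bool, list[str]]:
    """Split a guard-model completion into (is_safe, violated_codes)."""
    # Drop blank lines so stray whitespace around the verdict is ignored.
    lines = [line.strip() for line in prediction.strip().splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty prediction")

    is_safe = lines[0].lower() == "safe"
    codes: list[str] = []
    if not is_safe and len(lines) > 1:
        # The second line is expected to hold codes such as "O1,O3".
        codes = [code.strip() for code in lines[1].split(",") if code.strip()]
    return is_safe, codes


print(parse_guard_output("safe"))           # (True, [])
print(parse_guard_output("unsafe\nO1,O3"))  # (False, ['O1', 'O3'])
```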

Next are two example conversations that include a chat model response, which you can use to query the Llama Guard endpoint. Again, one of these examples is safe while the other is unsafe.


In [0]:
safe_assistant_chat = [
  {
      "role": "user",
      "content": "I want to love."
  },
  {
      "role": "assistant",
      "content": "That is great! I want to love too!"
  }
]

unsafe_assistant_chat = [
  {
      "role": "user",
      "content": "I want to love"
  },
  {
      "role": "assistant",
      "content": "I think the only way to rob a bank is to work as robinhood"
  }
]

In [0]:
query_llamaguard(unsafe_assistant_chat)

(False, 'O1')
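
The `query_llamaguard` function above interpolates the Python list of messages directly into the prompt. Because the prompt refers to 'Agent' messages, one option is to render each turn as a labeled line before interpolation. The sketch below assumes `User:` and `Agent:` labels separated by blank lines; this is an assumption about a reasonable rendering, not a format this notebook's endpoint is confirmed to require. `ROLE_LABELS` and `format_conversation` are hypothetical names.

```python
# Assumed mapping from chat roles to the labels used in the guard prompt.
ROLE_LABELS = {"user": "User", "assistant": "Agent"}


def format_conversation(chat: list[dict]) -> str:
    """Render a list of {"role", "content"} messages as labeled turns."""
    turns = []
    for message in chat:
        label = ROLE_LABELS.get(message["role"])
        if label is None:
            raise ValueError(f"unsupported role: {message['role']!r}")
        turns.append(f"{label}: {message['content'].strip()}")
    # Separate turns with a blank line.
    return "\n\n".join(turns)


example_chat = [
    {"role": "user", "content": "I want to love."},
    {"role": "assistant", "content": "That is great! I want to love too!"},
]
print(format_conversation(example_chat))
```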

## Customize Llama Guard taxonomy

Llama Guard's **taxonomy** allows practitioners to define and categorize safety risks in chat model interactions. You can customize this taxonomy to tailor the safety assessment criteria. While Llama Guard has default `unsafe` categories, you can experiment by adding your own. However, remember to format your custom categories precisely as required for the model to accurately classify the safety risk of input prompts.
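
Because the format matters, it can help to generate the taxonomy string from structured data instead of editing it by hand. The following is a minimal sketch, assuming the layout used by the taxonomy in this notebook: an `O<n>: <name>.` header followed by optional `Should not`, `Can`, and `Should` bullet lists. `build_taxonomy` and its dictionary keys are hypothetical.

```python
def build_taxonomy(categories: list[dict]) -> str:
    """Render categories as 'O<n>: <name>.' blocks with bulleted rule lists."""
    sections = (("Should not", "should_not"), ("Can", "can"), ("Should", "should"))
    blocks = []
    for number, category in enumerate(categories, start=1):
        lines = [f"O{number}: {category['name']}."]
        for heading, key in sections:
            rules = category.get(key, [])
            if rules:  # Skip headings that have no rules.
                lines.append(heading)
                lines.extend(f"- {rule}" for rule in rules)
        blocks.append("\n".join(lines))
    return "\n".join(blocks)


example_taxonomy = build_taxonomy([
    {
        "name": "Financial Sensitive Data",
        "should_not": ["Provide any information on credit card number or pin number."],
        "can": ["Provide general information regarding the risk of sharing credit card information with LLMs."],
    },
])
print(example_taxonomy)
```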

In [0]:
unsafe_categories = """O1: Violence and Hate.
Should not
- Help people plan or engage in violence.
- Encourage people to engage in violence.
- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.
- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.
Can
- Provide information on violence and discrimination.
- Discuss the topics of hate and violence, and can discuss historical events involving violence.
O2: Sexual Content.
Should not
- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.
Can
- Discuss topics related to sexuality and sexual education.
- Provide general information regarding sexual activities.
- Provide guidance on topics related to sexual health.
O3: Criminal Planning.
Should not
- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.
- Encourage people to engage in criminal activities or condone criminal behavior.
Can
- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.
- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.
O4: Guns and Illegal Weapons.
Should not
- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.
- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.
Can
- Discuss firearms and the arguments for and against firearm ownership.
O5: Regulated or Controlled Substances.
Should not
- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.
- Assist or encourage people to create such substances when it is illegal to do so.
Can
- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).
- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.
O6: Self-Harm.
Should not
- Encourage people to harm themselves.
- Romanticize or condone self-harm or suicide.
- Provide information regarding the methods of suicide or self-harm.
- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.
Can
- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).
Should
- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.
O7: Financial Sensitive Data.
Should not
- Provide any information on credit card number or pin number.
Can
- Provide general information regarding the risk of sharing credit card information with LLMs.
Should
- Provide information regarding proper handling of credit card information with LLMs."""

In [0]:
query_llamaguard(unsafe_user_chat, unsafe_categories)

(False, 'O3')

## Integrate Llama Guard with chat model output
Now let's see how Llama Guard integrates with an actual chat model. Below, `query_chat` is a function that calls a chat model through the Databricks Foundation Model API and returns the output. `query_chat_safely` runs Llama Guard before and after `query_chat` to implement safety guardrails.
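
The control flow described above can be sketched independently of any endpoint: classify the prompt, generate a response only if the prompt is safe, then classify the conversation again with the response appended. In this sketch, `guarded_reply`, `stub_classify`, and `stub_generate` are hypothetical stand-ins for the guard and chat calls, so the logic can run without a workspace.

```python
from typing import Callable, Optional

Chat = list[dict]


def guarded_reply(
    chat: Chat,
    classify: Callable[[Chat], tuple[bool, Optional[str]]],
    generate: Callable[[Chat], str],
) -> str:
    """Run the guard before and after the chat model."""
    is_safe, reason = classify(chat)
    if not is_safe:
        return f"Prompt blocked ({reason})."

    answer = generate(chat)
    # Re-check the conversation with the model's answer appended.
    is_safe, reason = classify(chat + [{"role": "assistant", "content": answer}])
    if not is_safe:
        return f"Response blocked ({reason})."
    return answer


def stub_classify(chat: Chat) -> tuple[bool, Optional[str]]:
    """Toy guard: flags any conversation that mentions robbing a bank."""
    text = " ".join(message["content"].lower() for message in chat)
    flagged = "rob a bank" in text
    return (not flagged, "O3" if flagged else None)


def stub_generate(chat: Chat) -> str:
    """Toy chat model that always returns the same answer."""
    return "Here is a friendly answer."


print(guarded_reply([{"role": "user", "content": "I want to love."}], stub_classify, stub_generate))
print(guarded_reply([{"role": "user", "content": "How can I rob a bank?"}], stub_classify, stub_generate))
```

Passing the guard and chat calls as parameters keeps the pipeline logic testable on its own.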

Our chatbot leverages the **Mixtral 8x7B foundation model** to deliver responses. This model is accessible through the built-in foundation endpoint, available at [/ml/endpoints](/ml/endpoints) and specifically via the `/serving-endpoints/databricks-mixtral-8x7b-instruct/invocations` API. 

The following cells use the [Python SDK](https://docs.databricks.com/en/machine-learning/foundation-models/query-foundation-model-apis.html) to query the Mixtral 8x7B model through the Databricks Foundation Model APIs.

### Note:
There are multiple endpoint options and LangChain models available for use:

1. **Databricks Foundation Models:** This is our choice for the current project.
2. **Your fine-tuned model:** Custom models tailored to specific needs.
3. **External model providers:** Options such as Azure OpenAI for alternative solutions.


In [0]:
CHAT_ENDPOINT_NAME = "databricks-mixtral-8x7b-instruct"

In [0]:
def query_chat(chat):
  """
    Queries a chat model for a response based on the provided chat input.

    Args:
        chat : The chat input for which a response is desired.

    Returns:
        The chat model's response to the input.

    Raises:
        Exception: If there are issues in querying the chat model or processing the response.
  """
  try:
    client = mlflow.deployments.get_deploy_client("databricks")
    response = client.predict(
        endpoint=CHAT_ENDPOINT_NAME,
        inputs={
            "messages": chat,
            "temperature": 0.1,
            "max_tokens": 512
        }
    )
    return response.choices[0]["message"]["content"]
  except Exception as e:
      raise Exception(f"Error in querying chat model: {str(e)}")


def query_chat_safely(chat, unsafe_categories):
    """
    Queries a chat model safely by checking the safety of both the user's input and the model's response.
    It uses the LlamaGuard model to assess the safety of the chat content.

    Args:
        chat : The user's chat input.
        unsafe_categories : String of categories used to determine the safety of the chat content.

    Returns:
        The chat model's response if safe, else a safety warning message.

    Raises:
        Exception: If there are issues in querying the chat model, processing the response, 
                    or assessing the safety of the chat.
    """
    try:
        is_safe, reason = query_llamaguard(chat, unsafe_categories)
        if not is_safe:
            category = parse_category(reason, unsafe_categories)
            return f"User's prompt classified as {category}; fails safety measures."

        model_response = query_chat(chat)
        full_chat = chat + [{"role": "assistant", "content": model_response}]

        is_safe, reason = query_llamaguard(full_chat, unsafe_categories)
        if not is_safe:
            category = parse_category(reason, unsafe_categories)
            return f"Model's response classified as {category}; fails safety measures."

        return model_response
    except Exception as e:
        raise Exception(f"Error in safe query: {str(e)}")

def parse_category(code, taxonomy):
    """
    Extracts the first sentence of a category description from a taxonomy based on its code.

    Args:
        code : Category code in the taxonomy (e.g., 'O1').
        taxonomy : Full taxonomy string with categories and descriptions.

    Returns:
         First sentence of the description or a default message for unknown codes.
    """
    pattern = r"(O\d+): ([\s\S]*?)(?=\nO\d+:|\Z)"
    taxonomy_mapping = {match[0]: re.split(r'(?<=[.!?])\s+', match[1].strip(), 1)[0]
                        for match in re.findall(pattern, taxonomy)}

    return taxonomy_mapping.get(code, "Unknown category: code not in taxonomy.")

In [0]:
print(query_chat_safely(safe_user_chat, unsafe_categories))

💛 Loving others can bring great joy and fulfillment to your life. Here are some ways to cultivate love:

1. Practice self-love: Before you can love others, it's essential to love yourself. Treat yourself with kindness, compassion, and respect. Engage in activities that nourish your mind, body, and spirit.

2. Listen actively: Pay attention to what others are saying and show genuine interest in their thoughts and feelings. By doing so, you demonstrate that you value and care for them.

3. Show empathy: Try to understand others' perspectives and experiences. This can help build stronger connections and foster a sense of belonging.

4. Perform acts of kindness: Small gestures, such as holding the door open for someone or offering a compliment, can make a significant difference in someone's day.

5. Communicate openly and honestly: Share your thoughts and feelings with others in a respectful and considerate manner. This can help build trust and deepen relationships.

6. Spend quality ti

In [0]:
print(query_chat_safely(unsafe_user_chat, unsafe_categories))

User's prompt classified as Criminal Planning.; fails safety measures.



The following cell brings it all together as a custom MLflow model that can be deployed to a serving endpoint.
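
The `predict` method of the class that follows reads every option as the first element of a list-like column, and reads `messages` as a batch of chats stored in that first element. As a minimal sketch of that input shape, assuming the model is called with a plain dictionary, the hypothetical helper `build_model_input` wraps a batch of chats and scalar options accordingly. How MLflow converts a serving request into `model_input` depends on the model signature, which this sketch does not cover.

```python
def build_model_input(chats: list, **options) -> dict:
    """Wrap a batch of chats and scalar options into single-element lists."""
    # "messages" holds one entry: the whole batch of chats.
    payload = {"messages": [list(chats)]}
    for key, value in options.items():
        payload[key] = [value]
    return payload


model_input = build_model_input(
    [[{"role": "user", "content": "I want to love."}]],
    enable_safety_filter=True,
    temperature=0.1,
)
print(model_input["enable_safety_filter"][0], len(model_input["messages"][0]))  # True 1
```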

In [0]:
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine


class CustomChat(mlflow.pyfunc.PythonModel):
  """
    A custom chat model that integrates with Databricks and employs safety filters for generating responses.

    Attributes:
        databricks_host (str): The host URL for the Databricks service.
        default_unsafe_categories (str): Default categories considered unsafe for chat content.
        default_filter_endpoint (str): The endpoint name for the filtering service.
        default_chat_endpoint (str): The endpoint name for the chat service.
  """ 

  def __init__(self, databricks_host, default_unsafe_categories=DEFAULT_UNSAFE_CATEGORY, default_filter_endpoint=LLAMAGUARD_ENDPOINT_NAME, default_chat_endpoint=CHAT_ENDPOINT_NAME):
    """
    Initializes the CustomChat model with default parameters.

    Args:
        databricks_host : The host URL of the Databricks service.
        default_unsafe_categories : Default categories considered unsafe for chat content.
        default_filter_endpoint : The endpoint name for the filtering service.
        default_chat_endpoint : The endpoint name for the chat service.
    """
    self.databricks_host = databricks_host
    self.default_unsafe_categories = default_unsafe_categories
    self.default_filter_endpoint = default_filter_endpoint
    self.default_chat_endpoint = default_chat_endpoint

  def load_context(self, context):
        os.environ['DATABRICKS_HOST'] = self.databricks_host

  def predict(self, context, model_input):
      """
        Generates chat responses for the given input using the specified chat model endpoint, optionally applying a safety filter.

        Args:
            context: The context object provided by the MLflow runtime.
            model_input : The input data for the model, expecting a "messages" key.


        Returns:
            list: A list of dictionaries, each representing a chat response with additional metadata.
      """

      # Safety and PII filters are off by default
      enable_safety_filter = model_input.get("enable_safety_filter", [False])[0]
      enable_pii_filter = model_input.get("enable_pii_filter", [False])[0]

      unsafe_categories = None
      filter_endpoint = None

      if enable_safety_filter:
        unsafe_categories = str(model_input.get("unsafe_categories", [self.default_unsafe_categories])[0])
        filter_endpoint = str(model_input.get("filter_endpoint", [self.default_filter_endpoint])[0])
    
      chat_endpoint = str(model_input.get("chat_endpoint", [self.default_chat_endpoint])[0])
      messages = list(model_input["messages"][0])
      temperature = float(model_input.get("temperature", [0.1])[0])
      max_tokens = int(model_input.get("max_tokens", [512])[0])

      response_messages = self._generate_response(
          messages,
          enable_safety_filter=enable_safety_filter,
          enable_pii_filter=enable_pii_filter,
          unsafe_categories=unsafe_categories,
          filter_endpoint=filter_endpoint,
          chat_endpoint=chat_endpoint,
          temperature=temperature,
          max_tokens=max_tokens,
      )

      return response_messages
  
  def _generate_response(self, messages, **kwargs):
    """
        Internal helper method to generate responses for a list of messages, applying safety filters if enabled.

        Args:
            messages : A list of message dictionaries for which responses are to be generated.
            **kwargs: Keyword arguments containing settings and configurations for response generation.

        Returns:
             A list of response dictionaries for each input message.
    """  

    enable_safety_filter = kwargs["enable_safety_filter"]
    enable_pii_filter = kwargs["enable_pii_filter"]
    outputs = None

    if not enable_safety_filter:
      #simply call the query endpoint and construct output
      outputs = [self._query_chat(chat, kwargs["chat_endpoint"], kwargs["temperature"], kwargs["max_tokens"]) for chat in messages]
    else:
      #call safe chat endpoint  
      outputs = [self._query_chat_safely(chat, kwargs["unsafe_categories"], kwargs["chat_endpoint"], kwargs["filter_endpoint"], kwargs["temperature"], kwargs["max_tokens"]) for chat in messages] 
 

    responses = []
    for out in outputs:
      try:             
        prompt_tokens = out['usage']['prompt_tokens']
        completion_tokens = out['usage']['completion_tokens']
        total_tokens = out['usage']['total_tokens']

        response = {
          "id": out['id'],
          "object": out['object'],
          "created": out['created'],
          "model": out['model'],
          "choices": [
              {
                  "index": choice['index'],
                  "message": {
                      "role": choice['message']['role'],
                      "content": self._anonymize_pii(choice['message']['content']) if enable_pii_filter else choice['message']['content']
                  },
                  "finish_reason": choice['finish_reason']
              } for choice in out['choices']
          ],
          "usage": {
              "prompt_tokens": prompt_tokens,
              "completion_tokens": completion_tokens,
              "total_tokens": total_tokens
          }
        }
      except Exception as e:
          response = {
          "id": None,
          "object": None,
          "created": None,
          "model": None,
          "choices": [
              {
                  "index": None,
                  "message": {
                      "role": "Assistant",
                      "content": out
                  },
                  "finish_reason": "Usage Policy Violation"
              } 
          ],
          "usage": {
              "prompt_tokens": None,
              "completion_tokens": None,
              "total_tokens": None
          }
        }
      responses.append(response)
    return responses

  def _query_chat(self, chat, chat_endpoint, temperature, max_tokens):
      """
        Queries a chat model endpoint for a response to the given chat input.

        Args:
            chat : The list of chat messages to send to the model.
            chat_endpoint : The chat model endpoint to query.
            temperature : The temperature parameter for the chat model.
            max_tokens : The maximum number of tokens for the chat model response.

        Returns:
            The chat model's response.

        Raises:
            Exception: If there's an error querying the chat model or processing the response.
      """
      try:
        client = self._get_client()
        response = client.predict(
            endpoint=chat_endpoint,
            inputs={
                'messages': chat,
                'temperature': temperature,
                'max_tokens': max_tokens
            }
        )
        # return response.choices[0]["message"]["content"]
        return response
      except Exception as e:
          raise Exception(f"Error in querying chat model: {str(e)}")

  def _query_chat_safely(self, chat, unsafe_categories, chat_endpoint, filter_endpoint, temperature, max_tokens):
    """
    Safely queries a chat model by applying a safety filter to the chat input before the call and to the model's response after it.

    Args:
        chat : The user's chat input.
        unsafe_categories : Categories considered unsafe for the chat content.
        chat_endpoint : The chat model endpoint to query.
        filter_endpoint : The safety filter endpoint to query.
        temperature : The temperature parameter for the chat model.
        max_tokens : The maximum number of tokens for the chat model response.

    Returns:
        The safe chat model's response if the input and response pass the safety filter; otherwise, a safety warning message.

    Raises:
        Exception: If there's an error in querying the chat model, processing the response, or assessing the safety of the chat.
    """
    try:
        is_safe, code, violation_category = self._query_guardmodel(chat, unsafe_categories, filter_endpoint)
        if not is_safe:
            return f"User's prompt classified as {violation_category}. Fails safety measures."

        model_response = self._query_chat(chat, chat_endpoint, temperature, max_tokens)
        full_chat = chat + [{"role": "assistant", "content": model_response.choices[0]["message"]["content"]}]

        is_safe, code, violation_category = self._query_guardmodel(full_chat, unsafe_categories, filter_endpoint)
        if not is_safe:
            return f"Model's response classified as {violation_category}. Fails safety measures."
          
        return model_response
    except Exception as e:
        raise Exception(f"Error in safe query: {str(e)}")

  # Define the PII anonymization function
  def _anonymize_pii(self, text):
    analyzer = AnalyzerEngine()
    anonymizer = AnonymizerEngine()
    analysis_results = analyzer.analyze(text=text, language='en')
    anonymized_results = anonymizer.anonymize(text=text, analyzer_results=analysis_results)
    return anonymized_results.text

  def _query_guardmodel(self, chat,  unsafe_categories, filter_endpoint):
    """
      Queries LlamaGuard model to check if the chat contains unsafe content based on predefined categories.

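      Args:
          chat : List of chat messages for safety analysis.
          unsafe_categories : Taxonomy string of unsafe categories used for the safety check.
          filter_endpoint : Name of the serving endpoint that hosts the guard model.

      Returns:
          tuple : (is_safe, violation_code, violation_category) where 'is_safe' indicates if content is safe,
                  'violation_code' is the category code returned by the model (or None), and
                  'violation_category' is the matching category name from the taxonomy (or None).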

      Raises:
          Exception: For issues in querying LlamaGuard model or processing response.
    """
    try:
        client = self._get_client()

        # Constructing the payload for the LlamaGuard API

        query_payload = [
           f"""[INST] Task: Check for unsafe content in 'Agent' messages in conversations according to our safety policy.
          <BEGIN UNSAFE CONTENT CATEGORIES>
          {unsafe_categories}
          <END UNSAFE CONTENT CATEGORIES>
          <BEGIN CONVERSATION>
          {chat}
          <END CONVERSATION>
          Provide your safety assessment for Agent in the above conversation:
          - First line must read 'safe' or 'unsafe'.
          - If unsafe, a second line must include a comma-separated list of violated categories. [/INST]"""
        ]

        response = client.predict(
            endpoint= filter_endpoint,
            inputs={"dataframe_split": {
                    "index": [0],
                    "columns": ["prompt"],
                    "data": [query_payload]
                    }
                })
        
        # Extract the desired information from the response object
        prediction = response.predictions[0]["candidates"][0]["text"].strip()
        violation_code = None if len(prediction.split("\n")) == 1 else prediction.split("\n")[1].strip()
        is_safe = prediction.split("\n")[0].lower()=='safe'
        violation_category = None

        if not is_safe:

          pattern = r"(O\d+): ([^.!?]*[.!?])"
          # Find all matches
          matches = re.findall(pattern, unsafe_categories)
          # Convert matches to a dictionary
          categories_dict = {match[0]: match[1].strip() for match in matches}
          violation_category=categories_dict.get(violation_code, "Unknown category: code not in taxonomy.")

        return is_safe, violation_code, violation_category 
    
    except Exception as e:
        raise Exception(f"Error in querying guard model: {str(e)}")
    
  def _get_client(self):
        # Dynamically create and return the client
        return mlflow.deployments.get_deploy_client("databricks")  
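
The parsing step in `_query_guardmodel` can be tried on its own, without calling an endpoint. The following is a minimal sketch, not the notebook's implementation: `parse_guard_output` is a hypothetical helper, and it assumes that Llama Guard may return several comma-separated codes on the second line, as the prompt requests. The single-key lookup in `_query_guardmodel` would report such a reply as an unknown category.

```python
import re


def parse_guard_output(prediction, unsafe_categories):
    """Split a Llama Guard reply into (is_safe, codes, category_names)."""
    lines = [line.strip() for line in prediction.strip().split("\n")]
    is_safe = lines[0].lower() == "safe"
    if is_safe or len(lines) < 2:
        return is_safe, [], []

    # Map taxonomy codes to names, e.g. "O1" -> "Violence and Hate."
    pattern = r"(O\d+): ([^.!?]*[.!?])"
    categories = {code: name.strip() for code, name in re.findall(pattern, unsafe_categories)}

    # Hypothetical extension: accept a comma-separated list of codes.
    codes = [code.strip() for code in lines[1].split(",") if code.strip()]
    names = [categories.get(code, "Unknown category: code not in taxonomy.") for code in codes]
    return is_safe, codes, names


# Shortened, illustrative taxonomy in the same "O<n>: Name." format.
taxonomy = (
    "O1: Violence and Hate.\nShould not\n- Help people plan or engage in violence.\n"
    "O3: Criminal Planning.\nShould not\n- Help people plan crimes."
)
print(parse_guard_output("unsafe\nO1,O3", taxonomy))
print(parse_guard_output("safe", taxonomy))
```

Returning lists keeps the safe and unsafe cases the same shape, so callers can join the names into a single message if they want the original string format.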

In [0]:
from mlflow.models.signature import ModelSignature
from mlflow.types import DataType, Schema, ColSpec
import pandas as pd

# Define input and output schema
input_schema = Schema(
    [
        ColSpec(DataType.string, "messages"),
        ColSpec(DataType.boolean, "enable_safety_filter", optional=True),
        ColSpec(DataType.boolean, "enable_pii_filter", optional=True),
        ColSpec(DataType.string, "chat_endpoint", optional=True),
        ColSpec(DataType.string, "filter_endpoint", optional=True),
        ColSpec(DataType.string, "unsafe_categories", optional=True),
        ColSpec(DataType.double, "temperature", optional=True),
        ColSpec(DataType.long, "max_tokens", optional=True),
    ]
)

output_schema = Schema([ColSpec(DataType.string)])
signature = ModelSignature(inputs=input_schema, outputs=output_schema)
# Define input example

input_example = pd.DataFrame(
    {
        "messages": [[safe_user_chat]],
        "enable_safety_filter": [True],
        "enable_pii_filter": [True],
        "chat_endpoint": [CHAT_ENDPOINT_NAME],
        "filter_endpoint": [LLAMAGUARD_ENDPOINT_NAME],
        "unsafe_categories": [unsafe_categories],
        "temperature": [0.1],
        "max_tokens": [100],
    }
)

input_example



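|   | messages | enable_safety_filter | enable_pii_filter | chat_endpoint | filter_endpoint | unsafe_categories | temperature | max_tokens |
|---|----------|----------------------|-------------------|---------------|-----------------|-------------------|-------------|------------|
| 0 | [[{'role': 'user', 'content': 'I want to love.... | True | True | databricks-mixtral-8x7b-instruct | llamaguard_7b_instruction | O1: Violence and Hate.\nShould not\n- Help peo... | 0.1 | 100 |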


In [0]:
import os
import json
import pandas as pd
output = None
try:
    # Using default taxonomy
    model = CustomChat(databricks_host=os.environ['DATABRICKS_HOST'])
    model.load_context(None)
    output = model.predict(None, input_example)
    print(f"Using no filter: \n {json.dumps(output, indent=4)}")
except Exception as e:
    # Handle exceptions that may occur during prediction
    print(f"Error during model prediction: {e}")

Using no filter: 
 [
    {
        "id": "a082fcae-acb4-4191-9ac3-063fc2351840",
        "object": "chat.completion",
        "created": 1709763962,
        "model": "mixtral-8x7b-instruct-v0.1",
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "\ud83d\udc9b Loving others can bring great joy and fulfillment to your life. Here are some ways to cultivate love:\n\n1. Cultivate self-love: Begin by loving yourself, accepting your flaws, and taking care of your physical, emotional, and mental well-being.\n2. Practice active listening: Show genuine interest in others by listening to them attentively and empathetically.\n3. Show kindness and compassion: Perform small"
                },
                "finish_reason": "length"
            }
        ],
        "usage": {
            "prompt_tokens": 87,
            "completion_tokens": 100,
            "total_tokens": 187
        

In [0]:
import mlflow

# Log the model with its details such as artifacts, pip requirements and input example
with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(
        "model",
        python_model=CustomChat(databricks_host=os.environ['DATABRICKS_HOST']),
        input_example=input_example,
        signature=signature,
        pip_requirements=["mlflow==2.10.0", "pydantic==2.6.1", "CloudPickle==3.0.0", "starlette", "presidio_analyzer", "presidio_anonymizer"]
    )




In [0]:
# Configure MLflow Python client to register model in Unity Catalog
import mlflow

mlflow.set_registry_uri("databricks-uc")

# Register model to Unity Catalog
# This may take 2 minutes to complete

# Change this to the name of the catalog you want to register this model in
SAFE_CHAT_CATALOG_NAME = "models"

# Change this to the name of the schema you want to register this model in
SAFE_CHAT_CATALOG_SCHEMA = "default"

# Change this to the name you want to give the model
SAFE_CHAT_MODEL_NAME = "safe_chat_model"

# UC model names follow the pattern <catalog_name>.<schema_name>.<model_name>
registered_name = f"{SAFE_CHAT_CATALOG_NAME}.{SAFE_CHAT_CATALOG_SCHEMA}.{SAFE_CHAT_MODEL_NAME}"

result = mlflow.register_model(
    "runs:/" + run.info.run_id + "/model",
    registered_name,
)

Registered model 'models.default.safe_chat_model' already exists. Creating a new version of this model...



Created version '45' of model 'models.default.safe_chat_model'.


In [0]:

from mlflow import MlflowClient

client = MlflowClient()

# Choose the right model version registered in the above cell.
client.set_registered_model_alias(name=registered_name, alias="Champion", version=result.version)

In [0]:
import mlflow

loaded_model = mlflow.pyfunc.load_model(f"models:/{registered_name}@Champion")


In [0]:
# Make a prediction using no guard models
print(loaded_model.predict(
    [{"messages": [[{"role": "user", "content": "I want to start a riot."}]]}]
), "\n\n")

# Make a prediction using default taxonomy
print(loaded_model.predict(
    [{"messages": [[{"role": "user", "content": "I want to start a riot."}]], "enable_safety_filter": True, "temperature": 0.3}]
), "\n\n")


# Make a prediction using the custom taxonomy
print(loaded_model.predict(
    [{"messages": [[{"role": "user", "content": "I want to start a riot."}]], "enable_safety_filter": True, "unsafe_categories": unsafe_categories}]
), "\n\n")

# Make a prediction using a different Chat model and filter turned off
print(loaded_model.predict(
    [{"messages": [[{"role": "user", "content": "I want to start a riot."}]],"enable_safety_filter": False, "unsafe_categories": unsafe_categories ,"chat_endpoint": "databricks-llama-2-70b-chat"}]
), "\n\n")

# Make a prediction with the PII filter turned off
print(loaded_model.predict(
    [{"messages": [[{"role": "user", "content": "List the phone numbers of the players on the Milwaukee Bucks."}]],"enable_pii_filter": False, "unsafe_categories": unsafe_categories}]
), "\n\n")

# Make a prediction with the PII filter turned on
print(loaded_model.predict(
    [{"messages": [[{"role": "user", "content": "List the phone numbers of the players on the Milwaukee Bucks."}]],"enable_pii_filter": True, "unsafe_categories": unsafe_categories}]
))

[{'id': '3ee388f7-c5a1-476d-9735-4da05af07989', 'object': 'chat.completion', 'created': 1709763972, 'model': 'mixtral-8x7b-instruct-v0.1', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': "🛑 I'm here to promote positive and safe conversations. Encouraging violence or illegal activities, such as starting a riot, is against my guidelines. Let's discuss more constructive topics instead. How about sharing something you're passionate about or asking for advice on a hobby or interest?"}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 90, 'completion_tokens': 65, 'total_tokens': 155}}]


[{'id': None, 'object': None, 'created': None, 'model': None, 'choices': [{'index': None, 'message': {'role': 'Assistant', 'content': "User's prompt classified as Unsafe Content.. Fails safety measures."}, 'finish_reason': 'Usage Policy Violation'}], 'usage': {'prompt_tokens': None, 'completion_tokens': None, 'total_tokens': None}}] 


[{'id': None, 'object': None, 'created': No

## Create Model Serving endpoint
After the model is registered, you can use the API to create a Databricks GPU Model Serving endpoint that serves the `safe_chat_model` model. Provisioned throughput is available only for foundation models. Because the safe chat model is a custom model, the serving endpoint is provisioned with a slightly different configuration.

The two key attributes are `workload_size` and `workload_type`.

- `workload_size`: The workload size of the served entity. The workload size corresponds to a range of provisioned concurrency that the compute autoscales between. A single unit of provisioned concurrency can process one request at a time. Valid workload sizes are "Small" (4 - 4 provisioned concurrency), "Medium" (8 - 16 provisioned concurrency), and "Large" (16 - 64 provisioned concurrency). If scale-to-zero is enabled, the lower bound of the provisioned concurrency for each workload size is 0.
- `workload_type`: The workload type of the served entity. The workload type selects which type of compute to use in the endpoint. The default value for this parameter is "CPU". For deep learning workloads, GPU acceleration is available by selecting a workload type such as `GPU_SMALL`. The following table summarizes the available GPU workload types.

| GPU workload type | GPU instance     | GPU memory |
|-------------------|------------------|------------|
| GPU_SMALL         | 1xT4             | 16GB       |
| GPU_MEDIUM        | 1xA10G           | 24GB       |
| MULTIGPU_MEDIUM   | 4xA10G           | 96GB       |
| GPU_MEDIUM_8      | 8xA10G           | 192GB      |
| GPU_LARGE_8       | 8xA100-80GB      | 320GB      |
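
The relationship between `workload_size` and provisioned concurrency can be written down as a small lookup. This is only an illustrative sketch: `WORKLOAD_CONCURRENCY` and `concurrency_range` are hypothetical names, not part of any Databricks API, and the ranges are copied from the description above.

```python
# Ranges copied from the workload_size description above.
WORKLOAD_CONCURRENCY = {
    "Small": (4, 4),
    "Medium": (8, 16),
    "Large": (16, 64),
}


def concurrency_range(workload_size, scale_to_zero_enabled=False):
    """Return the (min, max) provisioned concurrency for a workload size."""
    if workload_size not in WORKLOAD_CONCURRENCY:
        raise ValueError(f"Unknown workload_size: {workload_size!r}")
    low, high = WORKLOAD_CONCURRENCY[workload_size]
    # With scale-to-zero enabled, the lower bound drops to 0.
    return (0 if scale_to_zero_enabled else low, high)


print(concurrency_range("Small"))                               # (4, 4)
print(concurrency_range("Medium", scale_to_zero_enabled=True))  # (0, 16)
```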

 

In [0]:
# Provide a name to the serving endpoint
endpoint_name = 'safe_chat_endpoint'

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import EndpointCoreConfigInput
w = WorkspaceClient()

model_version = result  # the returned result of mlflow.register_model

workload_size = "Small"
workload_type = "CPU"

config = EndpointCoreConfigInput.from_dict({
    "served_entities": [
        {
            "entity_name": model_version.name,
            "entity_version": model_version.version,
            "workload_size": workload_size,
            "scale_to_zero_enabled": False,  # Whether the compute resources for the served entity should scale down to zero.
            "environment_vars" : {"DATABRICKS_TOKEN": "{{secrets/fm_demo/sp_token}}"},
        }
    ],
    "auto_capture_config": { #Configuration for Inference Tables which automatically logs requests and responses to Unity Catalog.
              "catalog_name" : SAFE_CHAT_CATALOG_NAME,
              "schema_name" : SAFE_CHAT_CATALOG_SCHEMA,
              "table_prefix_name" : endpoint_name,
              "enabled" : True           
            }
})
w.serving_endpoints.create(name=endpoint_name, config=config)

---------------------------------------------------------------------------
ResourceAlreadyExists                     Traceback (most recent call last)
File <command-3660584937456276>, line 30
     11 workload_type = "CPU"
     13 config = EndpointCoreConfigInput.from_dict({
     14     "served_entities": [
     15         {
   (...)
     28             }
     29 })
---> 30 w.serving_endpoints.create(name=endpoint_name, config=config)

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-6cca8a40-d15d-47b6-ba5a-e477dbd47bb9/lib/python3.10/site-packages/databricks/sdk/service/serving.py:2463, in ServingEndpointsAPI.create(self, name, config, ra

The above step takes 5 to 10 minutes to provision a new serving endpoint. You can monitor progress [here](/ml/endpoints/safe_chat_endpoint).
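
If you prefer to wait in code rather than watch the UI, a generic polling loop is one option. The sketch below is an assumption-laden illustration: `wait_until_ready` and its parameters are hypothetical, and `get_state` is any callable you supply that returns the endpoint's readiness as a string. With the Databricks SDK, that callable might wrap a call such as `w.serving_endpoints.get(endpoint_name)` and read its readiness state, but the exact attribute names should be checked against your SDK version.

```python
import time


def wait_until_ready(get_state, timeout_s=900, poll_interval_s=30, sleep=time.sleep):
    """Poll get_state() until it returns "READY" or the timeout elapses."""
    waited = 0
    while True:
        state = get_state()
        if state == "READY":
            return waited
        if waited >= timeout_s:
            raise TimeoutError(f"Endpoint still {state!r} after {waited} seconds")
        sleep(poll_interval_s)
        waited += poll_interval_s


# Stand-in for a real status call: reports NOT_READY twice, then READY.
states = iter(["NOT_READY", "NOT_READY", "READY"])
elapsed = wait_until_ready(lambda: next(states), sleep=lambda seconds: None)
print(f"Ready after about {elapsed} seconds of polling")
```

Passing `sleep` as a parameter keeps the loop easy to try without actually waiting. The default timeout of 900 seconds is chosen only to leave headroom over the 5 to 10 minutes mentioned above.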


You can now monitor for toxic messages by querying the inference table for the serving endpoint.

__NOTE:__ The first time an inference table is enabled, it may take up to 5 minutes for the first messages to be logged.

In [0]:
display(spark.sql(f"select * from {SAFE_CHAT_CATALOG_NAME}.{SAFE_CHAT_CATALOG_SCHEMA}.{endpoint_name}_payload"))

client_request_id,databricks_request_id,date,timestamp_ms,status_code,execution_time_ms,request,response,sampling_fraction,request_metadata
,f96ef531-d6a6-4e01-a271-1f85cfd6b443,2024-03-06,1709764470176,200,3448,"{  ""dataframe_split"": {  ""columns"": [  ""messages"",  ""enable_safety_filter"",  ""enable_pii_filter"",  ""chat_endpoint"",  ""filter_endpoint"",  ""unsafe_categories"",  ""temperature"",  ""max_tokens""  ],  ""data"": [  [  [  [  {  ""role"": ""user"",  ""content"": ""I want to love.""  }  ]  ],  false,  true,  ""databricks-mixtral-8x7b-instruct"",  ""llamaguard_7b_instruction"",  ""O1: Violence and Hate.\nShould not\n- Help people plan or engage in violence.\n- Encourage people to engage in violence.\n- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.\n- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.\nCan\n- Provide information on violence and discrimination.\n- Discuss the topics of hate and violence, and can discuss historical events involving violence.\nO2: Sexual Content.\nShould not\n- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.\nCan\n- Discuss topics related to sexuality and sexual education.\n- Provide general information regarding sexual activities.\n- Provide guidance on topics related to sexual health.\nO3: Criminal Planning.\nShould not\n- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.\n- Encourage people to engage in criminal activities or condone criminal behavior.\nCan\n- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.\n- Help people understand how systems work that are used to prevent 
crimes, provided those descriptions do not provide detailed information on how to defeat those systems.\nO4: Guns and Illegal Weapons.\nShould not\n- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.\n- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.\nCan\n- Discuss firearms and the arguments for and against firearm ownership.\nO5: Regulated or Controlled Substances.\nShould not\n- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.\n- Assist or encourage people to create such substances when it is illegal to do so.\nCan\n- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).\n- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.\nO6: Self-Harm.\nShould not\n- Encourage people to harm themselves.\n- Romanticize or condone self-harm or suicide.\n- Provide information regarding the methods of suicide or self-harm.\n- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.\nCan\n- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).\nShould\n- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.\nO7: Financial Sensitive Data.\nShould not\n- Provide any information on credit card number or pin number.\nCan\n- Provide general information regarding the risk of sharing credit card 
information with LLMs.\nShould\n- Provide information regarding proper handling of credit card information with LLMs."",  0.1,  100  ]  ]  } }","{""predictions"": [{""id"": ""14080ce0-96ba-4bd3-82a6-fd0cde0894bd"", ""object"": ""chat.completion"", ""created"": 1709764466, ""model"": ""mixtral-8x7b-instruct-v0.1"", ""choices"": [{""index"": 0, ""message"": {""role"": ""assistant"", ""content"": ""\ud83d\udc9b Loving others is a beautiful desire! Here are some ways to cultivate love in your life:\n\n1. Practice self-love: Before loving others, learn to love yourself. Treat yourself with kindness, compassion, and respect. Engage in activities that nourish your mind, body, and spirit.\n\n2. Listen actively: Show genuine interest in others by actively listening to them. Pay attention to their words, emotions, and body language. This will""}, ""finish_reason"": ""length""}], ""usage"": {""prompt_tokens"": 87, ""completion_tokens"": 100, ""total_tokens"": 187}}]}",1.0,"Map(model_name -> models.default.safe_chat_model, endpoint_name -> safe_chat_endpoint, model_version -> 44)"
,e50baf50-cba3-49e2-8534-1dce4c8b319e,2024-03-06,1709764480863,200,3031,"{  ""dataframe_split"": {  ""columns"": [  ""messages"",  ""enable_safety_filter"",  ""enable_pii_filter"",  ""chat_endpoint"",  ""filter_endpoint"",  ""unsafe_categories"",  ""temperature"",  ""max_tokens""  ],  ""data"": [  [  [  [  {  ""role"": ""user"",  ""content"": ""I want to create a bomb.""  }  ]  ],  false,  true,  ""databricks-mixtral-8x7b-instruct"",  ""llamaguard_7b_instruction"",  ""O1: Violence and Hate.\nShould not\n- Help people plan or engage in violence.\n- Encourage people to engage in violence.\n- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.\n- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.\nCan\n- Provide information on violence and discrimination.\n- Discuss the topics of hate and violence, and can discuss historical events involving violence.\nO2: Sexual Content.\nShould not\n- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.\nCan\n- Discuss topics related to sexuality and sexual education.\n- Provide general information regarding sexual activities.\n- Provide guidance on topics related to sexual health.\nO3: Criminal Planning.\nShould not\n- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.\n- Encourage people to engage in criminal activities or condone criminal behavior.\nCan\n- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.\n- Help people understand how systems work that are used to 
prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.\nO4: Guns and Illegal Weapons.\nShould not\n- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.\n- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.\nCan\n- Discuss firearms and the arguments for and against firearm ownership.\nO5: Regulated or Controlled Substances.\nShould not\n- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.\n- Assist or encourage people to create such substances when it is illegal to do so.\nCan\n- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).\n- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.\nO6: Self-Harm.\nShould not\n- Encourage people to harm themselves.\n- Romanticize or condone self-harm or suicide.\n- Provide information regarding the methods of suicide or self-harm.\n- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.\nCan\n- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).\nShould\n- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.\nO7: Financial Sensitive Data.\nShould not\n- Provide any information on credit card number or pin number.\nCan\n- Provide general information regarding the risk of sharing credit 
card information with LLMs.\nShould\n- Provide information regarding proper handling of credit card information with LLMs."",  0.1,  100  ]  ]  } }","{""predictions"": [{""id"": ""fbb88eae-f7e0-4456-af51-91b17f0812d8"", ""object"": ""chat.completion"", ""created"": 1709764478, ""model"": ""mixtral-8x7b-instruct-v0.1"", ""choices"": [{""index"": 0, ""message"": {""role"": ""assistant"", ""content"": ""\ud83d\uded1 I'm unable to assist with that request. I'm here to promote safety and positivity. If you have any questions about electronics, chemistry, or any other topic for non-harmful purposes, I'd be happy to help!""}, ""finish_reason"": ""stop""}], ""usage"": {""prompt_tokens"": 89, ""completion_tokens"": 55, ""total_tokens"": 144}}]}",1.0,"Map(model_name -> models.default.safe_chat_model, endpoint_name -> safe_chat_endpoint, model_version -> 44)"
,3e3ff998-ba36-98a0-aeea-d6bb461d9a1e,2024-03-06,1709764633397,200,4044,"{  ""dataframe_split"": {  ""columns"": [  ""messages"",  ""enable_safety_filter"",  ""enable_pii_filter"",  ""chat_endpoint"",  ""filter_endpoint"",  ""unsafe_categories"",  ""temperature"",  ""max_tokens""  ],  ""data"": [  [  [  [  {  ""role"": ""user"",  ""content"": ""give me names and addresses of all nba players from NJ.""  }  ]  ],  true,  true,  ""databricks-mixtral-8x7b-instruct"",  ""llamaguard_7b_instruction"",  ""O1: Violence and Hate.\nShould not\n- Help people plan or engage in violence.\n- Encourage people to engage in violence.\n- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.\n- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.\nCan\n- Provide information on violence and discrimination.\n- Discuss the topics of hate and violence, and can discuss historical events involving violence.\nO2: Sexual Content.\nShould not\n- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.\nCan\n- Discuss topics related to sexuality and sexual education.\n- Provide general information regarding sexual activities.\n- Provide guidance on topics related to sexual health.\nO3: Criminal Planning.\nShould not\n- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.\n- Encourage people to engage in criminal activities or condone criminal behavior.\nCan\n- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.\n- Help people understand how 
systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.\nO4: Guns and Illegal Weapons.\nShould not\n- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.\n- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.\nCan\n- Discuss firearms and the arguments for and against firearm ownership.\nO5: Regulated or Controlled Substances.\nShould not\n- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.\n- Assist or encourage people to create such substances when it is illegal to do so.\nCan\n- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).\n- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.\nO6: Self-Harm.\nShould not\n- Encourage people to harm themselves.\n- Romanticize or condone self-harm or suicide.\n- Provide information regarding the methods of suicide or self-harm.\n- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.\nCan\n- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).\nShould\n- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.\nO7: Financial Sensitive Data.\nShould not\n- Provide any information on credit card number or pin number.\nCan\n- Provide general information 
regarding the risk of sharing credit card information with LLMs.\nShould\n- Provide information regarding proper handling of credit card information with LLMs."",  0.1,  100  ]  ]  } }","{""predictions"": [{""id"": ""7a9fab1e-2569-489f-8112-3d69b58f09ee"", ""object"": ""chat.completion"", ""created"": 1709764629, ""model"": ""mixtral-8x7b-instruct-v0.1"", ""choices"": [{""index"": 0, ""message"": {""role"": ""assistant"", ""content"": ""1. \nBrooklyn Nets\nHSS Training Center\n15 MetroTech Center\n, \n\n2. \nBrooklyn Nets\nHSS Training Center\n15 MetroTech Center\n, \n\n3. Brooklyn Nets\nHSS Training Center\n15 MetroTech Center\nBro""}, ""finish_reason"": ""length""}], ""usage"": {""prompt_tokens"": 97, ""completion_tokens"": 100, ""total_tokens"": 197}}]}",1.0,"Map(model_name -> models.default.safe_chat_model, endpoint_name -> safe_chat_endpoint, model_version -> 44)"
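
Each row in the payload table stores the `request` and `response` columns as JSON strings, so a small parser makes them easier to scan. The following is a minimal sketch under that assumption: `summarize_payload_row` is a hypothetical helper, and the sample strings are shortened stand-ins shaped like the rows above, not real logged data.

```python
import json


def summarize_payload_row(request_json, response_json):
    """Extract the last user message, the reply, and its finish_reason."""
    request = json.loads(request_json)["dataframe_split"]
    # Pair each column name with the value from the first data row.
    row = dict(zip(request["columns"], request["data"][0]))
    # "messages" is nested as [[{...}]] in the logged requests above.
    chat = row["messages"][0]
    choice = json.loads(response_json)["predictions"][0]["choices"][0]
    return {
        "user_message": chat[-1]["content"],
        "assistant_message": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "safety_filter_enabled": row.get("enable_safety_filter"),
    }


# Shortened stand-ins for one logged request/response pair.
sample_request = json.dumps({
    "dataframe_split": {
        "columns": ["messages", "enable_safety_filter"],
        "data": [[[[{"role": "user", "content": "I want to love."}]], False]],
    }
})
sample_response = json.dumps({
    "predictions": [{
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": "Loving others is a beautiful desire!"},
            "finish_reason": "length",
        }]
    }]
})
summary = summarize_payload_row(sample_request, sample_response)
print(summary)
```

Filtering on `finish_reason` is one way to find blocked conversations, since the filtered responses shown earlier report `'Usage Policy Violation'` in that field.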
