# Introduction

This notebook implements three simple content moderation mechanisms intended to keep interactions with a system safe and responsible:

1. **Content Filters**: Checks user input against a predefined list of blocked topics (such as violence, self-harm, or illegal activities) using case-insensitive matching. If any blocked topic is found, the input is rejected before further processing.

2. **Denied Topics Handling**: When a user's input matches a blocked topic, returns the predefined response for that topic to steer the conversation away from harmful or restricted content.

3. **Contextual Grounding Check**: Verifies that generated content appears in an approved knowledge base. In this notebook the check is an exact string match, so it accepts only responses that reproduce an approved entry verbatim.

Together, these components help keep the system safe and informative by filtering out harmful input and steering users toward permitted topics.
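
The keyword check described in item 1 can be sketched on its own before the full classes. This is a minimal illustration, not the notebook's implementation: the helper name `build_topic_pattern` is hypothetical, and the word-boundary lookarounds are an added assumption meant to avoid matching a topic inside a longer word.

```python
import re

def build_topic_pattern(topics):
    """Compile one case-insensitive pattern that matches any topic as a whole phrase."""
    # re.escape makes each topic match as literal text rather than as a regex.
    escaped = [re.escape(topic) for topic in topics]
    # (?<!\w) and (?!\w) stop a topic from matching inside a longer word.
    return re.compile(r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)", re.IGNORECASE)

pattern = build_topic_pattern(["violence", "self-harm", "illegal activities"])

print(bool(pattern.search("Tips for avoiding Self-Harm")))   # True
print(bool(pattern.search("A story about nonviolence")))     # False
```

Plain substring matching would flag "nonviolence" as well. Whether that is desirable depends on the moderation policy, so treat the boundary rule as one possible design choice.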


In [None]:
import re

# 1. Content Filters
class ContentFilter:
    def __init__(self, blocked_topics):
        """
        Initializes the content filter with a list of prohibited topics.
        :param blocked_topics: List of keywords or phrases to block.
        """
        # Escape each topic so it is matched as literal text, not as a regex.
        self.blocked_topics = [re.compile(re.escape(topic), re.IGNORECASE) for topic in blocked_topics]

    def is_allowed(self, text):
        """
        Checks if the text contains any prohibited topics.
        :param text: Input text to check.
        :return: True if allowed, False if it contains blocked content.
        """
        return not any(pattern.search(text) for pattern in self.blocked_topics)

# 2. Denied Topics Handling
class DeniedTopicsHandler:
    def __init__(self, blocked_responses):
        """
        Initializes the denied topics handler.
        :param blocked_responses: Dictionary mapping denied topics to responses.
        """
        self.blocked_responses = blocked_responses

    def handle_denied_topic(self, text):
        """
        Provides a response when a blocked topic is detected.
        :param text: Input text to check.
        :return: Appropriate response or None if the text is allowed.
        """
        for topic, response in self.blocked_responses.items():
            # Escape the topic so it is matched as literal text, not as a regex.
            if re.search(re.escape(topic), text, re.IGNORECASE):
                return response
        return None

# 3. Contextual Grounding Check
class ContextualGroundingChecker:
    def __init__(self, approved_knowledge_base):
        """
        Initializes the contextual grounding checker.
        :param approved_knowledge_base: Set of approved knowledge entries.
        """
        self.approved_knowledge_base = set(approved_knowledge_base)

    def is_contextually_grounded(self, text):
        """
        Checks whether the text exactly matches an entry in the approved knowledge base.
        :param text: Text to check.
        :return: True if grounded, False otherwise.
        """
        return text in self.approved_knowledge_base

# Example Usage
if __name__ == "__main__":
    # 1. Content Filter Example
    # Named content_filter to avoid shadowing the built-in filter().
    content_filter = ContentFilter(blocked_topics=["violence", "self-harm", "illegal activities"])
    user_input = "How to commit illegal activities?"
    if not content_filter.is_allowed(user_input):
        print("Content blocked: Prohibited topic detected.")
    else:
        print("Content allowed.")

    # 2. Denied Topics Handling Example
    handler = DeniedTopicsHandler(
        blocked_responses={
            "violence": "We cannot provide information on violent topics.",
            "illegal activities": "Discussion of illegal activities is not permitted.",
        }
    )
    response = handler.handle_denied_topic(user_input)
    if response:
        print("Response:", response)
    else:
        print("No blocked topics detected.")

    # 3. Contextual Grounding Check Example
    grounding_checker = ContextualGroundingChecker(
        approved_knowledge_base={
            "Artificial intelligence can assist with data analysis.",
            "Climate change is a global issue requiring action.",
        }
    )
    generated_response = "Artificial intelligence can assist with data analysis."
    if grounding_checker.is_contextually_grounded(generated_response):
        print("Content is contextually grounded.")
    else:
        print("Content is not grounded in approved knowledge.")

# Conclusion

This notebook demonstrates how to integrate basic content moderation into a system. Content filters, denied-topic responses, and contextual grounding checks together reduce the risk that the system processes harmful input or returns unapproved content.

- The **Content Filter** rejects input that mentions a blocked topic.
- The **Denied Topics Handler** returns an appropriate predefined response when a user attempts to discuss a blocked topic.
- The **Contextual Grounding Checker** accepts only generated content that matches an entry in the approved knowledge base.
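
Because the grounding check above is an exact string match, any rewording fails it. One looser alternative, sketched here purely as an assumption and not part of the notebook's code, scores a response by how many of its word tokens appear in a single approved entry. The names `token_set`, `grounding_score`, and `THRESHOLD` are hypothetical.

```python
import re

def token_set(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def grounding_score(text, knowledge_base):
    """Return the best fraction of the text's tokens found in a single entry."""
    tokens = token_set(text)
    if not tokens:
        return 0.0
    return max(
        (len(tokens & token_set(entry)) / len(tokens) for entry in knowledge_base),
        default=0.0,
    )

knowledge_base = [
    "Artificial intelligence can assist with data analysis.",
    "Climate change is a global issue requiring action.",
]

# Hypothetical threshold; a real system would tune it on labeled examples.
THRESHOLD = 0.8

score = grounding_score("artificial intelligence can assist with data analysis", knowledge_base)
print(score >= THRESHOLD)  # True
```

Token overlap measures only lexical similarity. A response that negates an approved entry can still score highly, so this sketch does not establish factual correctness.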

Together, these mechanisms form a simple starting point for content moderation that can be extended to build safer and more reliable systems for users.
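
One way the three mechanisms might be chained around a text generator is sketched below. The order of the checks, the `moderate` and `fake_generate` functions, and the `FALLBACK` message are all assumptions made for illustration; the notebook itself runs each check independently.

```python
import re

# Hypothetical configuration, mirroring the examples in this notebook.
BLOCKED_RESPONSES = {
    "violence": "We cannot provide information on violent topics.",
    "illegal activities": "Discussion of illegal activities is not permitted.",
}
APPROVED_KNOWLEDGE = {
    "Artificial intelligence can assist with data analysis.",
}
FALLBACK = "I can't answer that from the approved knowledge base."

def denied_response(text):
    """Return the canned response for the first blocked topic found, else None."""
    for topic, response in BLOCKED_RESPONSES.items():
        if re.search(re.escape(topic), text, re.IGNORECASE):
            return response
    return None

def moderate(user_input, generate):
    """Screen the input, call the generator, then check the output's grounding."""
    refusal = denied_response(user_input)
    if refusal is not None:
        return refusal  # the generator is never called for blocked input
    candidate = generate(user_input)
    if candidate in APPROVED_KNOWLEDGE:
        return candidate
    return FALLBACK

def fake_generate(prompt):
    """Stand-in for a real model call; always returns an approved sentence."""
    return "Artificial intelligence can assist with data analysis."

print(moderate("How to commit illegal activities?", fake_generate))
print(moderate("What can AI help with?", fake_generate))
```

Screening the input first means blocked requests never reach the generator, and the grounding check then acts as a final gate on whatever the generator returns.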
