# Content Moderation for AI Tutoring Responses for Elementary School Students

---

#### **Step 1: Install Required Libraries**

We will need several libraries, including LlamaIndex (for indexing and querying documents), HuggingFace models (for embeddings), and general-purpose NLP libraries such as transformers and torch.

```
# Install necessary libraries
!pip install llama-index llama-index-embeddings-huggingface transformers torch sentence_transformers "huggingface_hub[inference]" llama-index-llms-huggingface-api
```

#### **Step 2: Import Required Libraries**

Now that the necessary packages are installed, we import the core modules for document handling, embeddings, and text processing.

  - `VectorStoreIndex`: Indexes the documents and lets us query them.
  - `SimpleDirectoryReader`: Reads the moderation guideline files from a directory.
  - `HuggingFaceEmbedding`: Creates embeddings from text so that passages can be compared by vector similarity.
  - `AutoModelForCausalLM` and `AutoTokenizer`: Load language models and their tokenizers. Only `AutoTokenizer` is used below, because the LLM itself runs remotely.

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from transformers import AutoModelForCausalLM, AutoTokenizer
import os
from os import path

def load_token(file_path):
    """Read the HuggingFace access token from a local text file."""
    with open(file_path) as f:
        # strip() removes the trailing newline and any stray whitespace
        key = f.read().strip()
    return key

hf_token = load_token(file_path='hf_token.txt')
```
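Keeping the token in a plain text file works for a quick experiment. As a sketch of an alternative, the helper below checks an environment variable first and falls back to the file. The function name `resolve_hf_token` and its parameters are illustrative assumptions, not part of the original notebook.

```python
import os
from pathlib import Path


def resolve_hf_token(env_var="HF_TOKEN", file_path="hf_token.txt"):
    """Return the token from the environment if set, else from a local file.

    Hypothetical helper: the name and defaults are assumptions for illustration.
    """
    token = os.environ.get(env_var, "").strip()
    if token:
        return token
    token_file = Path(file_path)
    if token_file.is_file():
        return token_file.read_text().strip()
    raise FileNotFoundError(f"No token found in ${env_var} or in {file_path!r}.")
```

With this in place, `hf_token = resolve_hf_token()` could stand in for the `load_token` call.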

#### **Step 3: Setup the Text Moderation System**

We will directly process text from the tutor's responses and run them through moderation criteria.

- The `responses` list simulates the output from an LLM. These are strings that we will analyze for tone, encouragement, and child-appropriate language.

Let’s assume we have the following sample responses generated by an LLM:

```python
# Sample AI tutor responses
responses = [
    "You should have known this already! Why don't you understand simple addition?",
    "Good job! You’re improving, but let’s practice a bit more to get better.",
    "This question is really easy for kids your age. Try harder next time!",
    "Keep going! You're doing great with fractions."
]
```

#### **Step 4: Generate Embeddings and Moderate Responses**

We will now apply a content moderation strategy using pre-trained models. Here, we’ll load an embedding model, use it to index the moderation guidelines, and later retrieve the relevant guidelines so that an LLM can flag inappropriate responses.

##### **Step 4.1: Set Up Embeddings**

We’ll use a HuggingFace embedding model to convert text into embeddings (vector representations). The same model embeds the moderation rules and, at query time, the text of each query, so the guideline passages most relevant to a tutor response can be retrieved by vector similarity and handed to the LLM.
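To make "comparing embeddings" concrete, here is a minimal sketch of cosine similarity, a common way to score how close two vectors are. The three-dimensional vectors are made-up stand-ins for illustration. Real embeddings from the model have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings", not produced by any real model.
guideline_vec = [0.9, 0.1, 0.0]
similar_vec = [0.8, 0.2, 0.1]
unrelated_vec = [0.0, 0.1, 0.9]

print(cosine_similarity(guideline_vec, similar_vec))    # close to 1
print(cosine_similarity(guideline_vec, unrelated_vec))  # close to 0
```

Texts with related meaning should land near each other in the embedding space and score higher than unrelated texts.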

```python
# Initialize the embedding model
embedding_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
```

##### **Step 4.2: Create and Index Moderation Rules**

Let’s assume we have a set of moderation rules saved in text files. These rules might include educational guidelines like "Avoid using discouraging language" or "Ensure feedback is constructive."

  - We save moderation guidelines into a folder and load them using `SimpleDirectoryReader`.
  - The documents are then indexed with the embedding model to later compare AI tutor responses to the rules.

```python
# Simulate loading moderation guidelines from text files
moderation_dir = 'data/content/'
# makedirs also creates the parent 'data/' directory and does nothing if the path already exists
os.makedirs(moderation_dir, exist_ok=True)

# Create sample moderation guidelines
with open(os.path.join(moderation_dir, "moderation_guideline.txt"), "w") as file:
    file.write("Responses should avoid discouraging phrases like 'You're wrong' and 'You should have known'. "
               "Encourage students with constructive feedback. "
               "Avoid using complex vocabulary that may be too difficult for elementary students.")

# Load moderation guidelines as documents
loader = SimpleDirectoryReader(input_dir=moderation_dir)
documents = loader.load_data()

# Index the documents using the embedding model
index = VectorStoreIndex.from_documents(
    documents,
    embed_model=embedding_model,
)
```
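With a longer rule set, it may help to store one rule per file so that each rule is loaded as its own document. The sketch below shows one way to do that. The helper name `write_guidelines` and the file naming scheme are assumptions for illustration, and whether this improves retrieval for a given rule set is untested here.

```python
from pathlib import Path


def write_guidelines(rules, directory):
    """Write each rule to its own numbered text file and return the paths.

    Hypothetical helper, not part of the original notebook.
    """
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, rule in enumerate(rules, start=1):
        file_path = target / f"guideline_{i:02d}.txt"
        file_path.write_text(rule.strip() + "\n", encoding="utf-8")
        paths.append(file_path)
    return paths


# The three rules from the sample guideline file.
rules = [
    "Responses should avoid discouraging phrases like 'You're wrong' and 'You should have known'.",
    "Encourage students with constructive feedback.",
    "Avoid using complex vocabulary that may be too difficult for elementary students.",
]
```

Calling `write_guidelines(rules, "data/content_split/")` (a hypothetical directory, kept separate so the rules are not indexed twice) and pointing `SimpleDirectoryReader` at that directory would then load the rules as separate documents.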

#### **Step 5: Query the Moderation Guidelines**

Once the guidelines are indexed, we can ask an LLM whether each AI tutor response is appropriate. For each query, the index embeds the query text, retrieves the guideline passages whose embeddings are most similar, and passes them to the LLM as context for its answer.

##### **Step 5.1: Set Up the LLM for Querying**

We’ll use the HuggingFace Inference API to call a hosted pre-trained LLM, so the model runs remotely rather than on the local machine.

  - We register the tokenizer and the LLM in the global LlamaIndex `Settings`, so the query engine uses them by default. This LLM is responsible for answering moderation queries based on the retrieved guidelines.


```python
# Load LLM for querying (e.g., Microsoft Phi-3-mini model)
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
from llama_index.core import Settings

remotely_run = HuggingFaceInferenceAPI(model_name="microsoft/Phi-3-mini-4k-instruct", token=hf_token)

# Set the tokenizer and LLM for query processing
Settings.tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct", token=hf_token)
Settings.llm = remotely_run
```


##### **Step 5.2: Moderate the Responses**

Now, we’ll query the LLM to evaluate each response against the moderation rules.

For each AI tutor response, the `query_engine` retrieves the relevant moderation guidelines, sends them to the LLM together with the question, and returns the LLM verdict on whether the response is appropriate or needs adjustment.

```python
# Create a query engine to process the AI tutor responses
query_engine = index.as_query_engine()

# Evaluate each response from the AI tutor using the moderation rules
for response in responses:
    query = f"Based on the moderation guidelines, is the following response appropriate for elementary students? {response}"
    moderation_result = query_engine.query(query)
    print(f"Response: '{response}'\nModeration Result: {moderation_result}\n")
```

Response: 'You should have known this already! Why don't you understand simple addition?'
Moderation Result: 

No, the response is not appropriate for elementary students. It uses a discouraging phrase "You should have known this already!" which can be demotivating. Instead, a constructive approach would be to guide the student with encouragement and support. For example, "Let's try to understand addition together. Can you tell me what you're finding difficult?"




Response: 'Good job! You’re improving, but let’s practice a bit more to get better.'
Moderation Result: 

No, the response is not appropriate for elementary students based on the moderation guidelines. The phrase "You’re improving" could be seen as discouraging because it implies that the student is not doing well. Instead, the response should focus on encouragement and constructive feedback without using phrases that may be perceived as negative. A more appropriate response could be: "Great effort! With a little more pract
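The verdicts above come back as free text, which is awkward to act on in code. One option, sketched below, is to ask the model for a fixed label on the first line and then parse that label. The labels `APPROPRIATE` and `INAPPROPRIATE` and both function names are assumptions for illustration. A small model may not always follow the requested format, so the parser returns `None` when it cannot tell.

```python
def build_moderation_query(response):
    """Ask for a one-word verdict on the first line, then a short reason."""
    return (
        "Based on the moderation guidelines, is the following response "
        "appropriate for elementary students? Answer with APPROPRIATE or "
        "INAPPROPRIATE on the first line, then give a one-sentence reason.\n"
        f"Response: {response}"
    )


def parse_verdict(moderation_text):
    """Return True or False for a recognized verdict, or None if unclear."""
    stripped = str(moderation_text).strip()
    if not stripped:
        return None
    first_line = stripped.splitlines()[0].strip().upper()
    if first_line.startswith("INAPPROPRIATE"):
        return False
    if first_line.startswith("APPROPRIATE"):
        return True
    return None  # the model did not follow the requested format
```

A result of `None` can be routed to a human reviewer or retried, rather than being silently treated as a pass.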

#### **Step 6: Provide Feedback for Inappropriate Responses**

In the final step, we can provide constructive feedback for responses flagged as inappropriate. This can help refine the AI tutor's future responses.

The `provide_feedback` function checks whether the moderation result contains the phrase "not appropriate" and, if so, returns a suggestion to rephrase the response.

```python
# Provide feedback on inappropriate responses
def provide_feedback(moderation_result, response):
    # query() returns a Response object, so convert it to text before matching
    if "not appropriate" in str(moderation_result).lower():
        return f"Feedback: The response '{response}' can be rephrased to avoid discouraging or condescending language."
    else:
        return "No changes needed."

# Generate feedback for each moderation result
for response in responses:
    query = f"Based on the moderation guidelines, is the following response appropriate for elementary students? {response}"
    moderation_result = query_engine.query(query)
    feedback = provide_feedback(moderation_result, response)
    print(f"Response: '{response}'\nModeration Result: {moderation_result}\n{feedback}\n")
```

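When this cell was run, the request failed with `HfHubHTTPError: 429 Client Error: Too Many Requests` for `https://api-inference.huggingface.co/models/microsoft/Phi-3-mini-4k-instruct`, along with the message "Rate limit reached. Please log in or use a HF access token". The loop sends a new query for every response instead of reusing the results from Step 5.2, so it doubles the number of API calls.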
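A 429 error like the one above is often transient, so one common workaround is to retry after a pause that grows with each attempt. The wrapper below is a generic sketch of that idea. The name `call_with_backoff` and its parameters are assumptions for illustration, and it does not raise the rate limit itself.

```python
import time


def call_with_backoff(func, *args, max_attempts=4, base_delay=2.0,
                      retry_on=(Exception,), sleep=time.sleep, **kwargs):
    """Call func, retrying with exponentially growing delays on failure.

    Hypothetical helper. retry_on defaults to every Exception for brevity;
    in practice, narrow it to the rate-limit error type.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

In the loop above, `moderation_result = call_with_backoff(query_engine.query, query)` would replace the direct `query_engine.query(query)` call.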

### **Conclusion**

This tutorial walks through a method for using **LlamaIndex** and an **LLM** to moderate AI tutor responses for elementary school students. The system checks each response against the moderation rules and flags inappropriate language or tone, which helps keep the AI tutor output age-appropriate and constructive. The workflow includes:
1. Installing necessary libraries.
2. Setting up embeddings and indexing moderation rules.
3. Using an LLM to query the rules and moderate responses.
4. Providing feedback on flagged content.
