# Prompt Engineering

## Overview

This Jupyter notebook walks through a structured approach to prompt engineering with NLTK. It scores sample text for bias, clarity, engagement, and sentiment, with the goal of producing clear, neutral, and inclusive content.

In [1]:
!python3 --version

Python 3.11.14


In [4]:
!pip install nltk
!pip install ollama
!pip install requests
!pip install transformers
!pip install tqdm

Collecting transformers
  Downloading transformers-5.1.0-py3-none-any.whl.metadata (31 kB)
Collecting huggingface-hub<2.0,>=1.3.0 (from transformers)
  Downloading huggingface_hub-1.4.1-py3-none-any.whl.metadata (13 kB)
Collecting pyyaml>=5.1 (from transformers)
  Downloading pyyaml-6.0.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (2.4 kB)
Collecting regex!=2019.12.17 (from transformers)
  Downloading regex-2026.1.15-cp311-cp311-m

In [5]:
import nltk

# Download required NLTK resources
nltk.download('punkt')
nltk.download('punkt_tab')
nltk.download('stopwords')
nltk.download('vader_lexicon')

[nltk_data] Downloading package punkt to /home/akram/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package punkt_tab to /home/akram/nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!
[nltk_data] Downloading package stopwords to /home/akram/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package vader_lexicon to
[nltk_data]     /home/akram/nltk_data...


True

In [6]:
import logging

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
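
The format string above produces lines of the form `timestamp - LEVEL - message`. A minimal sketch of that behavior follows. The logger name `format_demo` and the helper `capture_log_lines` are illustrative names that do not appear in the notebook, and the output is captured in memory so that the root logger configured above is left untouched.

```python
import io
import logging

# Same format string as the basicConfig call above.
LOG_FORMAT = '%(asctime)s - %(levelname)s - %(message)s'


def capture_log_lines(messages):
    """Log each message at INFO level and return the formatted lines."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))

    # A dedicated, non-propagating logger keeps this demo separate
    # from the root logger (the name is hypothetical).
    logger = logging.getLogger("format_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        for message in messages:
            logger.info(message)
    finally:
        logger.removeHandler(handler)
    return stream.getvalue().splitlines()


lines = capture_log_lines(["engineer ready"])
print(lines[0])  # timestamp, then " - INFO - engineer ready"
```

Each line splits cleanly on `" - "` into timestamp, level, and message, which can help when filtering the notebook's log output.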

In [7]:
from Bias.nltk_prompt_engineer import NLTKPromptEngineer


def test_nltk_bias_detection():
    """Print bias analysis and multi-criteria scores for deliberately biased sample sentences."""
    # Initialize the NLTK prompt engineer
    engineer = NLTKPromptEngineer(logging=logging)

    # Test various texts for bias
    print("\nTesting NLTK bias detection...")

    texts = [
        # Gender bias
        "He is likely to excel in engineering roles due to his analytical thinking.",
        "The ideal candidate for this nursing position should be nurturing and gentle like most women.",
        "Men naturally perform better in mathematical and technical fields due to their logical minds.",
        "Women are more suitable for administrative roles because they are detail-oriented and better at multitasking.",

        # Racial bias
        "Asian applicants are perfect for quantitative roles because they're naturally good at math.",
        "Latino workers are ideal for physical labor positions due to their strong work ethic.",
        "White executives tend to make better strategic decisions for American companies.",
        "Black employees often struggle with professional communication in corporate environments.",

        # Age bias
        "Younger employees bring energy and tech-savviness that older workers simply cannot match.",
        "Workers over 50 can't adapt to new technologies as quickly as millennials can.",
        "We need young, fresh talent instead of outdated perspectives from baby boomers.",
        "Recent graduates are more innovative than employees with decades of experience.",

        # Socio-economic bias
        "Candidates from prestigious universities are naturally more qualified for leadership roles.",
        "Employees who speak with regional accents should be placed in back-office positions rather than client-facing roles.",
        "People from affluent backgrounds have better soft skills required for management positions.",
        "Workers from lower-income neighborhoods tend to have poorer work ethics and reliability issues."
    ]

    for i, text in enumerate(texts, start=1):
        print(f"\nAnalyzing text {i}:")
        print(f"Text: {text}")
        analysis = engineer.analyze_text(text)
        print(f"Word count: {analysis['word_count']}")
        print(f"Bias score: {analysis['bias']['overall']:.2f}")

        # Show specific bias types and matches.
        # Note: _bias_indicators is a private attribute of NLTKPromptEngineer.
        for bias_type in engineer._bias_indicators:
            if analysis['bias'][bias_type] > 0:
                print(f"  - {bias_type}: {analysis['bias'][bias_type]:.2f}")
                print(f"    Matches: {analysis['bias'][f'{bias_type}_matches']}")

    # Test evaluation with multiple criteria
    for i, text in enumerate(texts, start=1):
        print(f"\nEvaluating text {i} with multiple criteria:")
        scores = engineer.evaluate_text(text, ["bias", "clarity", "engagement", "sentiment"])
        for criterion, score in scores.items():
            print(f"  - {criterion}: {score:.2f}")

# Run the test
test_nltk_bias_detection()
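
The loop above reads a `word_count`, per-category scores, and `<bias_type>_matches` entries from the dictionary returned by `analyze_text`. The internals of `NLTKPromptEngineer` are not shown in this notebook, so the following is only a rough sketch of how a dictionary of that shape could be built. Everything in it is an assumption: the regular-expression patterns, the weight of 0.1 per match, the regex tokenizer standing in for an NLTK tokenizer, and the name `analyze_text_sketch`. It does not model the overall bias score.

```python
import re

# Hypothetical indicator patterns; the real ones live inside
# NLTKPromptEngineer and are not shown in this notebook.
# Two capturing groups per pattern make re.findall return
# 2-tuples, with '' for the group that did not participate.
BIAS_INDICATORS = {
    "gender_bias": re.compile(
        r"\b(he|his|him|she|her|men|women)\b|\b(chairman|stewardess)\b",
        re.IGNORECASE,
    ),
    "age_bias": re.compile(
        r"\b(millennials?|boomers?)\b|\b(elderly)\b",
        re.IGNORECASE,
    ),
}

# Rough stand-in for a Treebank-style tokenizer: words (hyphens kept),
# clitics such as 're, and single punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*|'[A-Za-z]+|[^\w\s]")

SCORE_PER_MATCH = 0.1  # assumed weight, capped at 1.0 per category


def analyze_text_sketch(text):
    """Return a dict shaped like the one the loop above consumes."""
    tokens = TOKEN_RE.findall(text)
    bias = {}
    for bias_type, pattern in BIAS_INDICATORS.items():
        matches = pattern.findall(text)
        bias[bias_type] = min(1.0, SCORE_PER_MATCH * len(matches))
        bias[f"{bias_type}_matches"] = matches
    return {"word_count": len(tokens), "bias": bias}


result = analyze_text_sketch(
    "He is likely to excel in engineering roles due to his analytical thinking."
)
print(result["word_count"])                   # 14 (punctuation counts as a token)
print(result["bias"]["gender_bias_matches"])  # [('He', ''), ('his', '')]
```

A keyword-style matcher like this only flags sentences that contain a listed term. A stereotype phrased without any such term would score zero under this sketch, whatever its content.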

2026-02-12 14:43:17,426 - INFO - NLTK Prompt Engineer initialized successfully



Testing NLTK bias detection...

Analyzing text 1:
Text: He is likely to excel in engineering roles due to his analytical thinking.
Word count: 14
Bias score: 0.06
  - gender_bias: 0.20
    Matches: [('He', ''), ('his', '')]

Analyzing text 2:
Text: The ideal candidate for this nursing position should be nurturing and gentle like most women.
Word count: 16
Bias score: 0.03
  - gender_bias: 0.10
    Matches: [('women', '')]

Analyzing text 3:
Text: Men naturally perform better in mathematical and technical fields due to their logical minds.
Word count: 15
Bias score: 0.03
  - gender_bias: 0.10
    Matches: [('Men', '')]

Analyzing text 4:
Text: Women are more suitable for administrative roles because they are detail-oriented and better at multitasking.
Word count: 16
Bias score: 0.03
  - gender_bias: 0.10
    Matches: [('Women', '')]

Analyzing text 5:
Text: Asian applicants are perfect for quantitative roles because they're naturally good at math.
Word count: 15
Bias score: 0.00

Analy