# AWS Comprehend Tutorial

This notebook demonstrates the natural language processing capabilities of AWS Comprehend, including:
- Language detection (single and multi-language)
- Sentiment analysis
- Named entity recognition
- Key phrase extraction

## Setup Environment

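Import the required libraries and initialize the AWS Comprehend client. The client is pinned to the `eu-west-1` region and relies on AWS credentials that are already configured for boto3.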

In [None]:
# Import required libraries and set up our environment
import pprint

import boto3

print("📚 Setting up the environment...")

# Initialize pretty printer for better output formatting
pp = pprint.PrettyPrinter(indent=2)

# Create Comprehend client
comprehend = boto3.client(service_name="comprehend", region_name="eu-west-1")

print("✅ Environment setup complete!")
print(f"🌍 Using AWS region: {comprehend.meta.region_name}")
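
The setup cell hard-codes the region. A minimal sketch of making it configurable, assuming you want the `AWS_DEFAULT_REGION` environment variable to take precedence; `resolve_region` is a hypothetical helper written for this notebook, not part of boto3:

```python
import os


def resolve_region(default="eu-west-1", env=None):
    """Return the region from AWS_DEFAULT_REGION, falling back to `default`.

    `env` is injectable so the function can be exercised without touching
    the real process environment; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    # Treat an empty or whitespace-only value as unset.
    region = (env.get("AWS_DEFAULT_REGION") or "").strip()
    return region or default


print(resolve_region(env={}))
print(resolve_region(env={"AWS_DEFAULT_REGION": "us-east-1"}))
```

The result could then be passed as `region_name=resolve_region()` when creating the client.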

## Single Language Detection

Detect the dominant language in a simple English sentence and display confidence scores.

In [None]:
# Demonstrate simple language detection
print("🔍 Testing single-language detection...")

text = "This is a test sentence in English"
try:
    response = comprehend.detect_dominant_language(Text=text)

    print("\n📝 Input text:")
    print(f'"{text}"')

    print("\n🌐 Detected languages:")
    for language in response["Languages"]:
        print(f"- {language['LanguageCode']}: {language['Score']:.2%} confidence")

    print("\n📦 Raw response:")
    pp.pprint(response)
except Exception as e:
    print(f"❌ Error: {e}")
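
The cell above prints every language in the response. When downstream code needs a single answer, one option is to take the highest-scoring entry and reject it below a confidence threshold. The sketch below works on a dictionary with the same `Languages` / `LanguageCode` / `Score` keys used above; `top_language`, the `min_score` default, and the sample response are assumptions made for illustration:

```python
def top_language(response, min_score=0.5):
    """Return (language_code, score) for the best-scoring language, or None.

    `response` is expected to have the shape used in the cell above:
    {"Languages": [{"LanguageCode": ..., "Score": ...}, ...]}.
    """
    languages = response.get("Languages", [])
    if not languages:
        return None
    best = max(languages, key=lambda lang: lang["Score"])
    # Reject low-confidence detections instead of guessing.
    if best["Score"] < min_score:
        return None
    return best["LanguageCode"], best["Score"]


# Hand-written sample for illustration, not real service output.
sample_response = {"Languages": [{"LanguageCode": "en", "Score": 0.98}]}
print(top_language(sample_response))
```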

## Multi-language Detection

Test language detection on text containing multiple languages (German and French).

In [None]:
# Demonstrate multi-language detection
print("🔍 Testing multi-language detection...")

multilingual_text = "A: Hallo, wie geht es Ihnen?\nB: Ça va bien. Merci. Et toi?"
try:
    response = comprehend.detect_dominant_language(Text=multilingual_text)

    print("\n📝 Input text:")
    print(f'"{multilingual_text}"')

    print("\n🌐 Detected languages:")
    for language in response["Languages"]:
        print(f"- {language['LanguageCode']}: {language['Score']:.2%} confidence")

except Exception as e:
    print(f"❌ Error: {e}")
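
The call above scores the text as a whole, so it does not say which line is in which language. One possible approach is to call `detect_dominant_language` once per line, at the cost of one request per line. The sketch below takes the client as a parameter; `detect_language_per_line` and `FakeComprehend` are hypothetical names, and the fake client returns canned answers so the example runs without AWS access:

```python
def detect_language_per_line(client, text):
    """Call detect_dominant_language once per non-empty line of `text`.

    Returns a list of (line, language_code) pairs using the top-scoring
    language per line, or None when no language is returned for a line.
    """
    results = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        languages = client.detect_dominant_language(Text=line)["Languages"]
        if languages:
            best = max(languages, key=lambda lang: lang["Score"])
            results.append((line, best["LanguageCode"]))
        else:
            results.append((line, None))
    return results


class FakeComprehend:
    """Hypothetical stand-in for the boto3 client so the sketch runs offline."""

    def detect_dominant_language(self, Text):
        # Canned answers chosen by hand; a real client would call AWS.
        code = "de" if "Hallo" in Text else "fr"
        return {"Languages": [{"LanguageCode": code, "Score": 0.99}]}


sample = "A: Hallo, wie geht es Ihnen?\nB: Ça va bien. Merci. Et toi?"
for line, code in detect_language_per_line(FakeComprehend(), sample):
    print(f"{code}: {line}")
```

With real credentials, the `comprehend` client from the setup cell would be passed in place of `FakeComprehend()`.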

## Basic Sentiment Analysis

Analyze the sentiment of a short positive text and display sentiment scores for all categories.

In [None]:
# Demonstrate simple sentiment analysis
print("Testing sentiment analysis with short text...")

text = "Hey, I'm feeling great today!"
try:
    response = comprehend.detect_sentiment(Text=text, LanguageCode="en")

    print("\n📝 Input text:")
    print(f'"{text}"')

    print("\n💭 Sentiment analysis:")
    print(f"Overall sentiment: {response['Sentiment']}")
    print("\nSentiment scores:")
    for sentiment, score in response["SentimentScore"].items():
        print(f"- {sentiment}: {score:.2%}")

except Exception as e:
    print(f"❌ Error: {e}")
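
The overall label alone does not show how decisive the result was. A small sketch of summarizing the scores, assuming a response with the `Sentiment` and `SentimentScore` keys used above; `summarize_sentiment` and the sample values are made up for illustration:

```python
def summarize_sentiment(response):
    """Return the label, the top category, its score, and the margin over the runner-up.

    `response` is expected to look like the dictionary used in the cell above:
    {"Sentiment": "...", "SentimentScore": {"<category>": <float>, ...}}
    with at least two categories.
    """
    scores = response["SentimentScore"]
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (top_name, top_score), (_, second_score) = ranked[0], ranked[1]
    return {
        "label": response["Sentiment"],
        "top_category": top_name,
        "top_score": top_score,
        # A small margin suggests the text is ambiguous.
        "margin": top_score - second_score,
    }


# Hand-written sample for illustration, not real service output.
sample_response = {
    "Sentiment": "POSITIVE",
    "SentimentScore": {"Positive": 0.95, "Negative": 0.01, "Neutral": 0.03, "Mixed": 0.01},
}
summary = summarize_sentiment(sample_response)
print(f"{summary['label']} (margin over runner-up: {summary['margin']:.2%})")
```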

## Sentiment Analysis on Longer Text

Analyze the sentiment of a longer movie description to show how Comprehend handles multi-sentence input.

In [None]:
# Demonstrate sentiment analysis with longer text
print("😊 Testing sentiment analysis with longer text...")

long_text = (
    "Chronicles the experiences of a formerly successful banker as a prisoner in the gloomy jailhouse "
    "of Shawshank after being found guilty of a crime he did not commit. The film portrays the man's unique way "
    "of dealing with his new, torturous life; along the way he befriends a number of fellow prisoners, most notably "
    "a wise long-term inmate named Red."
)

try:
    response = comprehend.detect_sentiment(Text=long_text, LanguageCode="en")

    print("\n📝 Input text:")
    print("-" * 40)
    print(long_text)
    print("-" * 40)

    print("\n💭 Sentiment analysis:")
    print(f"Overall sentiment: {response['Sentiment']}")
    print("\nSentiment scores:")
    for sentiment, score in response["SentimentScore"].items():
        print(f"- {sentiment}: {score:.2%}")

except Exception as e:
    print(f"❌ Error: {e}")
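
The description above fits in a single request, but much longer documents may not. The sketch below splits text on sentence boundaries so that each chunk stays under a byte budget. The 5,000-byte default is an assumption to verify against the current Comprehend quotas, and `chunk_by_bytes` is a hypothetical helper, not a library function:

```python
import re


def chunk_by_bytes(text, max_bytes=5000):
    """Split `text` into chunks whose UTF-8 size is at most `max_bytes`.

    Splits after '.', '!' or '?' followed by whitespace. A single sentence
    longer than `max_bytes` is kept whole and will exceed the budget, so
    callers should handle that case separately.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate.encode("utf-8")) > max_bytes:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Tiny budget so the splitting is visible on a short string.
demo_chunks = chunk_by_bytes("One. Two! Three? Four.", max_bytes=10)
print(demo_chunks)
```

Each chunk could then be sent to `detect_sentiment` separately. How to combine the per-chunk results is a design choice this sketch leaves open.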

## Named Entity Recognition

Extract and identify named entities (people, organizations, locations) from text.

In [None]:
# Demonstrate entity recognition
print("🏷️  Testing named entity recognition...")

texts = [
    "Welcome to CEU's Data Engineering course run by Zoltan Toth.",
    "Here we learn about the internet, AWS and Data Engineering.",
]

try:
    print("\n🔍 Analyzing entities in texts:")

    for i, text in enumerate(texts, 1):
        print(f'\n📝 Text {i}: "{text}"')
        response = comprehend.detect_entities(Text=text, LanguageCode="en")

        if response["Entities"]:
            print("Found entities:")
            for entity in response["Entities"]:
                print(f"- {entity['Text']} ({entity['Type']}): {entity['Score']:.2%} confidence")
        else:
            print("No entities found.")

except Exception as e:
    print(f"❌ Error: {e}")
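
For further processing it can help to group the detected entities by type and drop low-confidence ones. A minimal sketch, assuming a response with the `Entities` / `Text` / `Type` / `Score` keys used above; `group_entities`, the threshold, and the sample entities are illustrative assumptions:

```python
from collections import defaultdict


def group_entities(response, min_score=0.0):
    """Group entity texts by entity type, skipping entries below `min_score`."""
    grouped = defaultdict(list)
    for entity in response.get("Entities", []):
        if entity["Score"] >= min_score:
            grouped[entity["Type"]].append(entity["Text"])
    return dict(grouped)


# Hand-written sample for illustration, not real service output.
sample_response = {
    "Entities": [
        {"Text": "CEU", "Type": "ORGANIZATION", "Score": 0.97},
        {"Text": "Zoltan Toth", "Type": "PERSON", "Score": 0.99},
        {"Text": "Data Engineering", "Type": "TITLE", "Score": 0.40},
    ]
}
print(group_entities(sample_response, min_score=0.5))
```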

## Key Phrase Detection

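Identify and extract key phrases from text to understand the main topics and important concepts. This cell reuses the `texts` list defined in the Named Entity Recognition cell, so run that cell first.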

In [None]:
# Demonstrate key phrase detection
print("🔑 Testing key phrase detection...")

try:
    print("\n🔍 Analyzing key phrases in texts:")

    for i, text in enumerate(texts, 1):
        print(f'\n📝 Text {i}: "{text}"')
        response = comprehend.detect_key_phrases(Text=text, LanguageCode="en")

        if response["KeyPhrases"]:
            print("Found key phrases:")
            for phrase in response["KeyPhrases"]:
                print(f"- {phrase['Text']}: {phrase['Score']:.2%} confidence")
        else:
            print("No key phrases found.")

        print("\n📦 Raw response:")
        pp.pprint(response)
except Exception as e:
    print(f"❌ Error: {e}")
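
When several texts are analyzed, the same phrase can appear more than once. One way to get a combined view is to merge the responses, keep the best score per phrase (ignoring case), and rank the result. The sketch assumes responses with the `KeyPhrases` / `Text` / `Score` keys used above; `rank_key_phrases` and the sample data are hypothetical:

```python
def rank_key_phrases(responses, top_n=5):
    """Merge key phrases from several responses and return the top `top_n`.

    Phrases are compared case-insensitively; for duplicates, the variant
    with the highest score is kept. Returns a list of (text, score) pairs.
    """
    best = {}
    for response in responses:
        for phrase in response.get("KeyPhrases", []):
            key = phrase["Text"].lower()
            if key not in best or phrase["Score"] > best[key][1]:
                best[key] = (phrase["Text"], phrase["Score"])
    ranked = sorted(best.values(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Hand-written samples for illustration, not real service output.
sample_responses = [
    {"KeyPhrases": [{"Text": "Data Engineering", "Score": 0.90},
                    {"Text": "Zoltan Toth", "Score": 0.99}]},
    {"KeyPhrases": [{"Text": "data engineering", "Score": 0.95},
                    {"Text": "the internet", "Score": 0.80}]},
]
for text, score in rank_key_phrases(sample_responses):
    print(f"- {text}: {score:.2%}")
```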