# Exercises XP Ninja: W1_D1

## What You'll Learn

- Understanding and analyzing the outcomes of different Python expressions and built-in functions.
- Enhancing your ability to predict the behavior of boolean expressions and value comparisons in Python.
- Developing skills in string manipulation and text analysis.
- Learning how to engage with user input to create dynamic, interactive Python scripts.
- Applying concepts of text processing to perform detailed textual analysis and statistics.

## What You Will Create

- A program to predict and understand complex Python outputs involving boolean and comparison operators.
- An interactive script that challenges users to create the longest sentences without using a specific character and provides real-time feedback.
- A detailed textual analysis tool that computes various metrics about a given paragraph, such as word count, sentence count, unique word count, and more, providing insights into basic natural language processing.

## Exercise 1: Outputs

Predict the output of the following code snippets:
3 <= 3 < 9
3 == 3 == 3
bool(0)
bool(5 == "5")
bool(4 == 4) == bool("4" == "4")
bool(bool(None))

x = (1 == True)
y = (1 == False)
a = True + 4
b = False + 10

print("x is", x)
print("y is", y)
print("a:", a)
print("b:", b)

## Exercise 2: Longest sentence without a specific character

- Keep asking the user to input the longest sentence they can without the character “A”.
- Each time a user successfully sets a new longest sentence, print a congratulations message.

## Exercise 3: Working on a paragraph

- Find an interesting paragraph of text online.
- Paste it into your code and store it in a variable.
- Analyze the paragraph. Print out a nicely formatted message saying:
  - How many characters it contains.
  - How many sentences it contains.
  - How many words it contains.
  - How many unique words it contains.

**Bonus**:
- How many non-whitespace characters it contains.
- The average number of words per sentence in the paragraph.
- The number of non-unique words in the paragraph.

## Exercise 1 — Outputs (prediction + verification)

In [1]:
# Title: Predictions and Checks for Boolean/Comparison Outputs
# This cell prints both the predicted result (as comments) and the actual result computed by Python.

# 1) 3 <= 3 < 9
# Prediction: True (chained comparisons: 3 <= 3 AND 3 < 9)
print("3 <= 3 < 9  =>", 3 <= 3 < 9)

# 2) 3 == 3 == 3
# Prediction: True (all equal in a chain)
print("3 == 3 == 3 =>", 3 == 3 == 3)

# 3) bool(0)
# Prediction: False (0 is falsy)
print("bool(0)     =>", bool(0))

# 4) bool(5 == '5')
# Prediction: False (int 5 != str '5', so comparison is False; bool(False) -> False)
print("bool(5 == '5') =>", bool(5 == "5"))

# 5) bool(4 == 4) == bool('4' == '4')
# Prediction: True (both sides are True -> True == True)
print("bool(4 == 4) == bool('4' == '4') =>", bool(4 == 4) == bool("4" == "4"))

# 6) bool(bool(None))
# Prediction: False (bool(None) -> False; bool(False) -> False)
print("bool(bool(None)) =>", bool(bool(None)))

# Variables and arithmetic with booleans
# In Python, True behaves like 1 and False like 0 in arithmetic.
x = (1 == True)    # Prediction: True  (1 == 1)
y = (1 == False)   # Prediction: False (1 == 0)
a = True + 4       # Prediction: 5     (1 + 4)
b = False + 10     # Prediction: 10    (0 + 10)

print("x is", x)
print("y is", y)
print("a:", a)
print("b:", b)

3 <= 3 < 9  => True
3 == 3 == 3 => True
bool(0)     => False
bool(5 == '5') => False
bool(4 == 4) == bool('4' == '4') => True
bool(bool(None)) => False
x is True
y is False
a: 5
b: 10
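
The predictions above rest on two language rules: a chained comparison behaves like its pairwise comparisons joined with `and`, and `bool` is a subclass of `int`. The following is a minimal sketch that checks both rules directly; the helper name `chained_equivalent` is hypothetical and introduced only for illustration.

```python
# Sketch: check the two rules behind the Exercise 1 predictions.
# `chained_equivalent` is a hypothetical helper, not part of the exercise.

def chained_equivalent(a, b, c):
    """Return the expanded form of the chained comparison `a <= b < c`."""
    return (a <= b) and (b < c)

# A chained comparison gives the same result as its expanded `and` form.
for a, b, c in [(3, 3, 9), (5, 3, 9), (3, 9, 9)]:
    print(f"{a} <= {b} < {c}:", a <= b < c, "| expanded:", chained_equivalent(a, b, c))

# bool is a subclass of int, so True and False act as 1 and 0 in arithmetic.
print("issubclass(bool, int):", issubclass(bool, int))
print("True + 4 =", True + 4)
print("False + 10 =", False + 10)

# Comparing an int with a str for equality does not raise; it is simply False.
print("5 == '5':", 5 == "5")
```

One practical difference remains: in a chained comparison the middle operand is evaluated only once, which matters when it is a function call with side effects.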


## Exercise 2 — Longest sentence without the letter “A”

In [2]:
# Title: Longest Sentence Without the Letter 'A'
# This script repeatedly asks the user for a sentence that contains no 'A' or 'a'.
# It tracks the longest valid sentence and congratulates the user when they set a new record.
# Press Enter on an empty line to stop.

# Initialize the best (longest) sentence length
best_length = 0  # holds the length of the longest valid sentence
best_sentence = ""  # holds the longest valid sentence itself

while True:
    # Ask for user input
    user_input = input("Enter the longest sentence you can WITHOUT the letter 'A' (Enter to stop): ").strip()

    # If the input is empty, break the loop
    if user_input == "":
        print("Stopping. Thanks for playing!")
        break

    # Check if the sentence contains 'a' or 'A'
    if 'a' in user_input.lower():
        # If it contains 'a'/'A', inform the user and continue
        print("Oops! Your sentence contains the letter 'A'. Try again.")
        continue

    # If valid (no 'A'), check the length
    current_length = len(user_input)  # compute length of the sentence
    if current_length > best_length:
        # Update the record and congratulate
        best_length = current_length
        best_sentence = user_input
        print(f"🎉 New record! Length = {best_length}")
    else:
        # Encourage the user to try again
        print(f"Good try! Current record is {best_length}. Keep going!")

Enter the longest sentence you can WITHOUT the letter 'A' (Enter to stop): This summer I will study Python every single evening
🎉 New record! Length = 52
Enter the longest sentence you can WITHOUT the letter 'A' (Enter to stop): Python is cool
Good try! Current record is 52. Keep going!
Enter the longest sentence you can WITHOUT the letter 'A' (Enter to stop): stop
Good try! Current record is 52. Keep going!
Enter the longest sentence you can WITHOUT the letter 'A' (Enter to stop): 
Stopping. Thanks for playing!
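
The loop above mixes the record-keeping rule with calls to `input()`, which makes the rule hard to check without typing sentences by hand. One possible refactoring, sketched here under the assumption that the rule stays the same, moves it into a pure function. The names `try_sentence` and `forbidden` are hypothetical and do not appear in the original script.

```python
# Sketch: the validation and record rule from Exercise 2 as a pure function.
# `try_sentence` and its `forbidden` parameter are illustrative names.

def try_sentence(sentence, best_length, forbidden="a"):
    """Return (accepted, new_best_length) for a single attempt.

    An attempt is rejected when it is empty or contains `forbidden`
    in either case. Otherwise the record grows only when it is beaten.
    """
    text = sentence.strip()
    if not text or forbidden.lower() in text.lower():
        return False, best_length
    return True, max(best_length, len(text))


attempts = [
    "Python is cool",
    "Java is cool",  # contains 'a', so it is rejected
    "This summer I will study Python every single evening",
]

best = 0
for attempt in attempts:
    accepted, new_best = try_sentence(attempt, best)
    if not accepted:
        print(f"Rejected: {attempt!r}")
    elif new_best > best:
        print(f"New record ({new_best}): {attempt!r}")
    else:
        print(f"Valid, but the record stays at {best}: {attempt!r}")
    best = new_best
```

The interactive loop could then call this function on each `input()` result and keep only the printing logic.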


## Exercise 3 — Paragraph analysis (text stats)

In [3]:
# Title: Paragraph Analysis - Characters, Sentences, Words, Unique Words
# This cell computes several statistics on a paragraph:
# - total characters
# - total non-whitespace characters
# - sentence count
# - word count
# - unique word count (case-insensitive)
# - non-unique words (repeated words count)
# - average words per sentence
#
# We use regular expressions to split sentences and extract words.
# The word extraction is Unicode-aware to include accented characters.

import re

# ---- Replace the paragraph below with your own text if you wish ----
paragraph = (
    "Learning to think clearly is a powerful skill. "
    "When we analyze text, we uncover structure and meaning. "
    "By practicing, our intuition improves!"
)

# 1) Total characters (including whitespace)
# Simply take the length of the raw string
total_chars = len(paragraph)

# 2) Total non-whitespace characters
# Remove all whitespace with regex \s and count the length
non_ws_chars = len(re.sub(r"\s+", "", paragraph))

# 3) Sentence splitting
# We split on whitespace that follows sentence-ending punctuation (. ! ?).
# The positive lookbehind keeps the punctuation attached to the preceding sentence.
# Limitation: abbreviations such as "Dr." also match, so they are counted as sentence ends.
sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
# Remove possible empty strings after split
sentences = [s for s in sentences if s]

# 4) Word extraction
# Simple word model: runs of letters, digits, underscores, or apostrophes.
# In Python 3, \w is Unicode-aware for str patterns by default, so accented letters are included.
# You can refine the regex if you need more precise tokenization.
words = re.findall(r"\b[\w']+\b", paragraph)

# 5) Normalize words for uniqueness (casefold is stronger than lower for Unicode)
normalized_words = [w.casefold() for w in words]

# 6) Unique words
unique_words = set(normalized_words)
unique_count = len(unique_words)

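# 7) Non-unique words: total words minus unique words.
# This counts the extra occurrences beyond each word's first appearance,
# not the number of distinct words that repeat.
non_unique_count = len(normalized_words) - unique_count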

# 8) Average words per sentence (guard divide-by-zero)
sentence_count = len(sentences)
word_count = len(words)
avg_words_per_sentence = (word_count / sentence_count) if sentence_count > 0 else 0.0

# 9) Pretty printing of results
print("=== Paragraph Analysis ===")
print(f"Total characters:               {total_chars}")
print(f"Non-whitespace characters:      {non_ws_chars}")
print(f"Number of sentences:            {sentence_count}")
print(f"Number of words:                {word_count}")
print(f"Number of unique words:         {unique_count}")
print(f"Number of non-unique words:     {non_unique_count}")
print(f"Average words per sentence:     {avg_words_per_sentence:.2f}")

# 10) (Optional) Show sentences and a few unique words for inspection
print("\nSentences detected:")
for i, s in enumerate(sentences, start=1):
    print(f"{i}. {s}")

print("\nSample unique words (first 20):")
print(sorted(unique_words)[:20])

=== Paragraph Analysis ===
Total characters:               141
Non-whitespace characters:      120
Number of sentences:            3
Number of words:                22
Number of unique words:         21
Number of non-unique words:     1
Average words per sentence:     7.33

Sentences detected:
1. Learning to think clearly is a powerful skill.
2. When we analyze text, we uncover structure and meaning.
3. By practicing, our intuition improves!

Sample unique words (first 20):
['a', 'analyze', 'and', 'by', 'clearly', 'improves', 'intuition', 'is', 'learning', 'meaning', 'our', 'powerful', 'practicing', 'skill', 'structure', 'text', 'think', 'to', 'uncover', 'we']
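
The "non-unique words" figure above is the total word count minus the unique word count, so it does not say which words repeat. As a hedged sketch, `collections.Counter` can list them. It reuses the sample paragraph and the word regex from the cell above; the names `repeated` and `extra_occurrences` are illustrative.

```python
# Sketch: identify which words repeat, using the same paragraph and word model.
import re
from collections import Counter

paragraph = (
    "Learning to think clearly is a powerful skill. "
    "When we analyze text, we uncover structure and meaning. "
    "By practicing, our intuition improves!"
)

# Same tokenization and case normalization as in the analysis cell.
words = [w.casefold() for w in re.findall(r"\b[\w']+\b", paragraph)]
counts = Counter(words)

# Words that occur more than once, with their frequencies.
repeated = {word: n for word, n in counts.items() if n > 1}
print("Repeated words:", repeated)

# Extra occurrences beyond the first; equals total words minus unique words.
extra_occurrences = sum(n - 1 for n in counts.values())
print("Extra occurrences:", extra_occurrences)

# Number of distinct words that appear more than once.
print("Distinct repeated words:", len(repeated))
```

For this paragraph the two measures coincide, but they diverge as soon as a word appears three or more times.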


## Conclusions

- **Exercise 1**: Learned how Python evaluates chained comparisons, boolean logic, and the way booleans behave like integers (`True` = 1, `False` = 0) in arithmetic.
- **Exercise 2**: Practiced creating an interactive loop that validates user input and tracks the "best" result, updating only when a new record is set.
- **Exercise 3**: Applied string manipulation and regular expressions to perform basic text analysis, including counts of characters, sentences, words, unique words, and averages.  
  This demonstrated how simple Python tools can be used for basic Natural Language Processing (NLP) tasks.
