# KUC, PRG210
# A more complicated case of using regular expressions

In [1]:
import re

In [64]:
r = r"[^a-z]*([y]o|[h']?ello|ok|hey|(good[ ])?(morn[gin']{0,3}|"\
    r"afternoon|even[gin']{0,3}))[\s,;:]{1,3}([a-z]{1,20})"
re_greeting = re.compile(r, flags=re.IGNORECASE)
re_greeting.match('Hello Rosa')

<re.Match object; span=(0, 10), match='Hello Rosa'>

In [50]:
re_greeting.match('Hi Rosa')

In [51]:
re_greeting.match('Hello Rosa').groups()

('Hello', None, None, 'Rosa')

In [52]:
re_greeting.match("Good morning Rosa")

<re.Match object; span=(0, 17), match='Good morning Rosa'>

In [53]:
re_greeting.match("Good Manning Rosa")

In [54]:
re_greeting.match('Good evening Rosa Parks').groups()

('Good evening', 'Good ', 'evening', 'Rosa')

In [55]:
re_greeting.match("Good Morn'n Rosa")

<re.Match object; span=(0, 16), match="Good Morn'n Rosa">
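
The pattern above is dense, so it may help to see it laid out piece by piece. The following is a sketch of the same expression rewritten with `re.VERBOSE`, which ignores whitespace outside character classes and allows inline comments. The name `re_greeting_verbose` is an assumption introduced here for illustration; it is not part of the original notebook.

```python
import re

# Same pattern as re_greeting, spread out with re.VERBOSE so each part can be annotated.
re_greeting_verbose = re.compile(
    r"""
    [^a-z]*                      # skip any leading non-letters
    (                            # group 1: the whole greeting
        [y]o | [h']?ello | ok | hey
      | (good[ ])?               # group 2: optional "good " prefix
        (morn[gin']{0,3}         # group 3: time of day
         | afternoon
         | even[gin']{0,3})
    )
    [\s,;:]{1,3}                 # one to three separator characters
    ([a-z]{1,20})                # group 4: the name
    """,
    flags=re.IGNORECASE | re.VERBOSE,
)

print(re_greeting_verbose.match("Hello Rosa").groups())
print(re_greeting_verbose.match("Good evening Rosa Parks").groups())
print(re_greeting_verbose.match("Good Manning Rosa"))
```

The four entries returned by `groups()` correspond to the four pairs of parentheses, numbered by the position of their opening parenthesis. Groups 2 and 3 are `None` whenever the greeting is not one of the "good ..." forms.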

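# Exercise 1: Pattern Modification and Testing
Starting from the original pattern, complete the tasks listed below the cells. The cells show one modified pattern and its results: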

In [82]:
# Task 1 adds hi|greetings; task 2 restricts the separator to spaces only.
r = r"[^a-z]*([y]o|[h']?ello|ok|hi|greetings|hey|(good[ ])?(morn[gin']{0,3}|afternoon|even[gin']{0,3}))[ ]{1,3}([a-z]{1,20})"
re_greeting_new = re.compile(r, flags=re.IGNORECASE)

In [83]:
re_greeting_new.match("hi Alice")

<re.Match object; span=(0, 8), match='hi Alice'>

In [84]:
re_greeting_new.match("HEY, Bob")

In [85]:
re_greeting_new.match("Greetings Clara")

<re.Match object; span=(0, 15), match='Greetings Clara'>

1. Modify the pattern to also accept the greetings "hi" and "greetings".
2. Ensure that the new pattern only accepts a space after the greeting (not a comma, semicolon, or colon).
3. Test the new pattern with the following strings and explain the outcome of each:
    * 'hi Alice'
    * 'HEY, Bob'
    * 'Greetings Clara'
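
One way to work through task 3 is to run all three strings in a loop and record what matched. The sketch below assumes a solution pattern in which the separator class is `[ ]{1,3}` (spaces only); the names `re_space_only` and `results` are hypothetical and chosen for this example.

```python
import re

# Assumed solution: "hi" and "greetings" added, separator limited to 1-3 spaces.
pattern = (
    r"[^a-z]*([y]o|[h']?ello|ok|hi|greetings|hey|(good[ ])?"
    r"(morn[gin']{0,3}|afternoon|even[gin']{0,3}))[ ]{1,3}([a-z]{1,20})"
)
re_space_only = re.compile(pattern, flags=re.IGNORECASE)

results = {}
for text in ["hi Alice", "HEY, Bob", "Greetings Clara"]:
    m = re_space_only.match(text)
    # Store the full matched text, or None when the pattern does not match.
    results[text] = m.group(0) if m else None
    print(f"{text!r}: {results[text]!r}")
```

Under this assumption, 'hi Alice' and 'Greetings Clara' match because each greeting is followed by a single space, while 'HEY, Bob' does not match because a comma follows the greeting and the separator class no longer allows it.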

# Exercise 2: Debugging a Regular Expression

The following code has an issue that prevents it from matching as expected:

In [106]:
r = r"[^a-z]*([y]o|[h']?ello|ok|hey|(good[ ])?(morn[gin']{0,3}|afternoon|even[gin']{0,3}))[ ,;:]{1,3}([a-z]{1,20})"
re_greeting = re.compile(r, flags=re.IGNORECASE)
m = re_greeting.match('Good Morning, John')  # match once and reuse the result
print(m.group(0))  # the whole match
print(m.group(1))  # the greeting
print(m.group(4))  # the name

Good Morning, John
Good Morning
John


1. Identify and explain the bug in the code.
2. Provide a corrected version that will successfully match 'Good Morning, John' and extract 'Good Morning' as the matched greeting.
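
When debugging a pattern like this, it can help to remember that `match` returns `None` on failure, so calling `.group(...)` directly on the result raises `AttributeError`. The sketch below is not necessarily the bug this exercise has in mind. It only shows a defensive way to inspect a match. The helper name `describe_match` is hypothetical.

```python
import re

def describe_match(compiled, text):
    """Hypothetical helper: return the key groups as a dict, or None if nothing matches."""
    m = compiled.match(text)
    if m is None:
        return None
    return {"whole": m.group(0), "greeting": m.group(1), "name": m.group(4)}

# Same pattern as in the cell above, split over two lines for readability.
pattern = re.compile(
    r"[^a-z]*([y]o|[h']?ello|ok|hey|(good[ ])?(morn[gin']{0,3}|"
    r"afternoon|even[gin']{0,3}))[ ,;:]{1,3}([a-z]{1,20})",
    flags=re.IGNORECASE,
)

print(describe_match(pattern, "Good Morning, John"))
print(describe_match(pattern, "Good Manning, John"))
```

The first call returns the greeting and the name. The second returns `None` instead of raising, which makes a failing pattern easier to spot.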

# Exercise 3: Extending Functionality
Write a function extract_name that takes a string as input and returns the name from a greeting matched by the original regular expression. The function should return None if the string doesn't match the pattern. Test your function with various strings.

In [78]:
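def extract_name(greeting_str):
    """Return the name captured by re_greeting, or None if there is no match."""
    match = re_greeting.match(greeting_str)
    # The name is the last capture group (group 4) of the pattern.
    return match.groups()[-1] if match else None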

# Test the function
print(extract_name("Hello, Rosa"))  # Should return 'Rosa'
print(extract_name("hey jude"))     # Should return 'jude'
print(extract_name("goodnight moon")) # Should return None


Rosa
jude
None
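
Indexing the last group works, but it depends on the name staying last if the pattern changes. As an alternative sketch, the name can be captured with a named group. The names `re_greeting_named` and `extract_name_named`, and the group names `greeting` and `name`, are assumptions introduced here and are not part of the original pattern.

```python
import re

# Original pattern with two groups given names via (?P<...>); matching behavior is unchanged.
re_greeting_named = re.compile(
    r"[^a-z]*(?P<greeting>[y]o|[h']?ello|ok|hey|(good[ ])?(morn[gin']{0,3}|"
    r"afternoon|even[gin']{0,3}))[\s,;:]{1,3}(?P<name>[a-z]{1,20})",
    flags=re.IGNORECASE,
)

def extract_name_named(greeting_str):
    """Return the text captured by the 'name' group, or None if there is no match."""
    m = re_greeting_named.match(greeting_str)
    return m.group("name") if m else None

for s in ["Hello, Rosa", "hey jude", "goodnight moon", "Good evening Rosa Parks"]:
    print(repr(s), "->", extract_name_named(s))
```

Named groups still count as numbered groups, so `m.group(4)` and `m.group("name")` refer to the same capture here.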
