# Regular Expressions in Python (Follow-up 1)
<font size="3">For this follow-up you must solve 2 exercises using the theory covered in class and 3 more exercises using Python regular expressions. For the Python section you must also present a written summary, in English, of your train of thought in solving each exercise (one paragraph). This part will be marked taking into account the regex, the explanation, and the number of test cases passed. Each of these exercises must be solved using regular expressions; any other approach, even if it works, won't be accepted.</font>


1) [15 pts] Find the regular expression for the language over $\Sigma = \{0,1,2\}$ of all strings that only have one occurrence of three consecutive ones.

$$\text{^(?:[02]|10|12|110|112)*111(?:[02]|01|21|011|211)*\$}$$
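A brute-force check can build some confidence in an expression of this kind. The sketch below assumes that "one occurrence" counts overlapping windows, so `1111` contains two occurrences of `111`. The names `CANDIDATE`, `count_111` and `mismatches` are hypothetical helpers, and the candidate pattern uses prefix blocks that end in 0 or 2 and suffix blocks that start with 0 or 2, so the run of ones in the middle has length exactly three.

```python
import re
from itertools import product

# Candidate expression (an assumption of this sketch): prefix blocks end in
# 0 or 2, suffix blocks start with 0 or 2.
CANDIDATE = re.compile(r'(?:[02]|10|12|110|112)*111(?:[02]|01|21|011|211)*')

def count_111(s):
    """Count occurrences of '111' in s, overlapping windows included."""
    return sum(s.startswith("111", i) for i in range(len(s)))

def mismatches(max_len=7):
    """Return every string over {0,1,2} up to max_len where the regex and
    the direct count disagree."""
    bad = []
    for n in range(max_len + 1):
        for chars in product("012", repeat=n):
            s = "".join(chars)
            if bool(CANDIDATE.fullmatch(s)) != (count_111(s) == 1):
                bad.append(s)
    return bad

print(mismatches())  # an empty list means no disagreement was found
```

Strings such as `11101` (valid) and `11110` (invalid under the overlapping reading) are useful edge cases to try by hand.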

2) [15 pts] Find the regular expression for the language over $\Sigma = \{0,1,2,3\}$ of all strings that have length divisible by three.

$$\text{^([0-3]{3})*\$}$$
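The end anchor matters here: a starred group can match the empty string, so without `$` a prefix match succeeds on every input. The following sketch, with the hypothetical names `LENGTH_MULTIPLE_OF_3` and `agrees_up_to`, compares an anchored pattern against the length test on all short strings over the alphabet.

```python
import re
from itertools import product

# Anchored at both ends so the whole string must consist of 3-symbol blocks.
LENGTH_MULTIPLE_OF_3 = re.compile(r'^(?:[0-3]{3})*$')

def agrees_up_to(max_len=6):
    """True if the regex accepts exactly the strings whose length is a
    multiple of three, for every string up to max_len."""
    for n in range(max_len + 1):
        for chars in product("0123", repeat=n):
            s = "".join(chars)
            if bool(LENGTH_MULTIPLE_OF_3.match(s)) != (n % 3 == 0):
                return False
    return True

print(agrees_up_to())

# Without the end anchor, the empty prefix already matches:
print(bool(re.match(r'^([0-3]{3})*', "01")))  # True, although len("01") == 2
```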

3) [25 pts] Write a Python function is_valid_username(username) that returns True if the username meets all of the following conditions (all captured with a regex):
- Length between 8 and 16 characters
- Starts with a letter (uppercase or lowercase)
- Contains at least one underscore (_)
- Contains at least one digit
- Ends with the same letter it started with (case-sensitive)
- Only includes letters, digits, and underscores (no other symbols)
<br>
<br>
Hints:

- Use lookaheads (?=...) to require certain patterns (e.g., digits, symbols).
- Use groups() to capture and compare parts of the string.
- Use backreferences \1 to match the same text as a previous group.
- Use {min,max} to enforce minimum and maximum length.


In [7]:
test_cases = [
    ("User_123U", True),
    ("User_1234", False),
    ("a_user1a", True),
    ("Start1234567_endS", False),
    ("GoodName_3e", False),
    ("Z_user_9Z", True),
    ("A123456_7A", True),
    ("a__1b", False),
    ("_User_1_", False),
    ("abc_def_1c", False),
]


In [14]:
import re

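# Lookaheads check length, underscore and digit; \1 must be the last character,
# so the pattern is anchored with $ to reject trailing symbols.
pattern = re.compile(r'^(?=.{8,16}$)(?=.*_)(?=.*\d)([a-zA-Z])[a-zA-Z0-9_]*\1$')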

def is_valid_username(username):
    return bool(pattern.match(username))


for names in test_cases:
    print(names[0]," ",is_valid_username(names[0]) , "\n")

User_123U   True 

User_1234   False 

a_user1a   True 

Start1234567_endS   False 

GoodName_3e   False 

Z_user_9Z   True 

A123456_7A   True 

a__1b   False 

_User_1_   False 

abc_def_1c   False 
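One way to read a pattern like this is piece by piece. The sketch below restates the same idea with `re.VERBOSE`, so each condition from the exercise sits next to the fragment that enforces it. The names `USERNAME` and `is_valid_username_verbose` are hypothetical, and the final `$` is included on the assumption that the closing letter must be the very last character.

```python
import re

USERNAME = re.compile(r"""
    ^(?=.{8,16}$)      # total length between 8 and 16 characters
    (?=.*_)            # at least one underscore somewhere
    (?=.*\d)           # at least one digit somewhere
    ([a-zA-Z])         # group 1: the first character, a letter
    [a-zA-Z0-9_]*      # middle: only letters, digits and underscores
    \1$                # last character repeats group 1 (case-sensitive)
""", re.VERBOSE)

def is_valid_username_verbose(username):
    """Return True if the username satisfies all six conditions."""
    return USERNAME.match(username) is not None

print(is_valid_username_verbose("User_123U"))   # True
print(is_valid_username_verbose("User_1234"))   # False: ends with a different character
print(is_valid_username_verbose("a_1aaaa!!"))   # False: contains other symbols
```

The third call is the kind of input that separates an anchored pattern from an unanchored one: the prefix `a_1aaaa` satisfies every condition, so only the end anchor rejects the trailing `!!`.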



4) [20 pts] Write a Python function to find all triplets of words (3-word sequences) that appear at least twice in a given text (not necessarily consecutively).
<br>
<br>
Hints:
- Remember about the boundaries \b
- Use a backreference to check for repetition
- Use the re.findall() or re.finditer() method.
- Convert text to lowercase to avoid case mismatches.
- Use Counter to count occurrences.


In [18]:
test_cases_2 = [
    ("the quick brown fox jumps over the quick brown fox",
     ["the quick brown", "quick brown fox"]),
    ("Hello world again. hello WORLD again.",
     ["hello world again"]),
    ("We went there together Then, we went there together again.",
     ["we went there", "went there together"]),
    ("one two three two three four one two three",
     ["one two three"]),
    ("every word is different in this sentence",
     []),
    ("go go go home then go go go home then again",
     ["go go go", "go go home", "go home then"]),
    ("the dog sleeps all day. the dog sleeps all night.",
     ["the dog sleeps", "dog sleeps all"]),
    ("A B C D E F G A B C D E F G",
     ["a b c", "b c d", "c d e", "d e f", "e f g"]),
    ("Rain falls HARD. rain   FALLS hard every day.",
     ["rain falls hard"]),
    ("up up and away up up and away",
     ["up up and", "up and away"]),
]


In [82]:
import re

# Three words, any text in between, then the same three words again (\1 \2 \3).
pattern = re.compile(r'\b(\w+)\b \b(\w+)\b \b(\w+)\b .* \b\1\b \b\2\b \b\3\b')

for names in test_cases_2:
    print(names[0]," ",pattern.findall(names[0]))


the quick brown fox jumps over the quick brown fox   [('the', 'quick', 'brown')]
Hello world again. hello WORLD again.   []
We went there together Then, we went there together again.   [('went', 'there', 'together')]
one two three two three four one two three   [('one', 'two', 'three')]
every word is different in this sentence   []
go go go home then go go go home then again   [('go', 'go', 'go')]
the dog sleeps all day. the dog sleeps all night.   [('the', 'dog', 'sleeps')]
A B C D E F G A B C D E F G   [('A', 'B', 'C')]
Rain falls HARD. rain   FALLS hard every day.   []
up up and away up up and away   [('up', 'up', 'and')]
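The output above reports at most one triplet per text and misses the mixed-case inputs, because `findall` consumes the text it matches and the comparison is case-sensitive. A possible alternative, sketched below, swaps the backreference for the `Counter` hint: a zero-width lookahead captures every overlapping triplet in the lowercased text, and `Counter` keeps those seen at least twice. The names `TRIPLET` and `repeated_triplets` are hypothetical, and treating any run of non-word characters (`\W+`) as a separator is an assumption.

```python
import re
from collections import Counter

# The lookahead consumes nothing, so the scan restarts at every word and
# overlapping triplets are all captured.
TRIPLET = re.compile(r'(?=\b(\w+)\W+(\w+)\W+(\w+)\b)')

def repeated_triplets(text):
    """Return the 3-word sequences that occur at least twice, in order of
    first appearance."""
    counts = Counter(" ".join(m.groups()) for m in TRIPLET.finditer(text.lower()))
    return [triplet for triplet, n in counts.items() if n >= 2]

print(repeated_triplets("the quick brown fox jumps over the quick brown fox"))
# ['the quick brown', 'quick brown fox']
print(repeated_triplets("Rain falls HARD. rain   FALLS hard every day."))
# ['rain falls hard']
```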


5) Validate a time string in the format HH:MM:MM:HH such that:

- The first and last HH are the same.
- The two MM values are also the same.
- HH must be between 00 and 23.
- MM must be between 00 and 59.
<br>
<br>
Hints:
- Use capturing groups.
- Use backreferences.
- Use anchors ^...$ to match the full string.

In [59]:
test_cases_3 = [
    ("12:30:30:12", True),
    ("00:00:00:00", True),
    ("23:59:59:23", True),
    ("01:01:01:01", True),
    ("19:45:45:19", True),
    ("12:30:31:12", False),
    ("12:60:60:12", False),
    ("24:00:00:24", False),
    ("12:30:30:13", False),
    ("09:59:59:08", False),
]

In [61]:
import re

# Group 1: hours 00-23. Group 2: minutes 00-59.
hh = r'([01][0-9]|2[0-3])'
mm = r'([0-5][0-9])'

# HH:MM, then the same MM (\2) and the same HH (\1).
pattern = re.compile(r'^' + hh + ':' + mm + r':\2:\1$')

for names in test_cases_3:
    print(names[0], " ", bool(pattern.match(names[0])))



12:30:30:12   True
00:00:00:00   True
23:59:59:23   True
01:01:01:01   True
19:45:45:19   True
12:30:31:12   False
12:60:60:12   False
24:00:00:24   False
12:30:30:13   False
09:59:59:08   False
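One detail of the `^...$` hint: in Python, `$` also matches just before a trailing newline, so an anchored `match` accepts `"23:59:59:23\n"`. If that matters for the input source, `fullmatch` is a stricter option. The sketch below uses the hypothetical names `TIME` and `is_mirrored_time` to show the difference.

```python
import re

# Same structure as the exercise pattern, without explicit anchors because
# fullmatch requires the entire string to match.
TIME = re.compile(r'([01][0-9]|2[0-3]):([0-5][0-9]):\2:\1')

def is_mirrored_time(s):
    """True only if the whole string has the form HH:MM:MM:HH."""
    return TIME.fullmatch(s) is not None

anchored = re.compile(r'^([01][0-9]|2[0-3]):([0-5][0-9]):\2:\1$')

print(is_mirrored_time("23:59:59:23"))          # True
print(is_mirrored_time("23:59:59:23\n"))        # False
print(bool(anchored.match("23:59:59:23\n")))    # True: $ matches before the final newline
```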
