# Regular Expressions Online Programming Activity (OPE)
### Learning Objectives
For this assignment, the main learning objectives are:
1. __Get Comfortable with Regular Expressions:__ Learn how to create and apply regular expression patterns to find, match, split, and replace text in documents.
2. __Master Text Processing Techniques:__ Practice using `re.findall()`, `re.split()`, and `re.sub()` to effectively extract, modify, and organize text.
3. __Apply Your Skills:__ Use your knowledge of regular expressions to solve real-world problems -- in this case, organizing and managing structured data like recipes!

# Cooking Activity
### Reduce Cooking Time for Time Efficiency!

In this assignment, you will apply regular expressions to a cooking activity. Daphne is planning a Sunday family cooking session with her daughter, Lucy, and her husband, Cole, and she wants to be fully prepared for it. Because Daphne is short on time to make the dishes, you will help by using regular expressions to extract, modify, and organize the recipe text!

**Notes:** 
* Regular Expressions will also be called "RE", "RegEx", or "regex" in this activity. 
* **Make sure to execute (run) every code cell. Many code cells are prerequisites to others, so you'll generally want to run them in order from top to bottom.**
* This activity includes eight tasks. As you finish one, the next one will be made available to you. *Depending on your browser, you may need to scroll down to see each new task.*
* You don't need to add any new cells, although it's okay if you do.
* In the definitions below, `recipe_text` is a variable referring to the text upon which the regex pattern will be applied.
* For all the regular expressions you define in the assignment, make sure to define them using [raw string notation](https://docs.python.org/3/library/re.html#raw-string-notation). This will make it easier for you to understand and debug your expressions, as discussed in the readings.
* **To complete this activity, the only requirement is entering regular expressions and executing notebook cells. If you find yourself doing any coding other than creating Python `assert()` statements to test your regular expressions, you're doing it wrong.**
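
As a quick illustration of why raw string notation helps, the sketch below compares a regular string with a raw string. The `sample` text and the variable names are hypothetical and are not part of the activity.

```python
import re

# A raw string keeps every backslash as a literal character.
newline_len = len("\n")    # 1: a single newline character
raw_len = len(r"\n")       # 2: a backslash followed by the letter n

# Without the r prefix, each regex backslash has to be doubled.
same_pattern = (r"\d+" == "\\d+")

sample = "Bake for 45 minutes"  # hypothetical sample text
digits = re.findall(r"\d+", sample)

print(newline_len, raw_len)  # 1 2
print(same_pattern)          # True
print(digits)                # ['45']
```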

### **Regular Expression Definitions** 

Source Reference: https://www.python-engineer.com/posts/regular-expressions/

| RegEx Functions | Meaning |
| --- | --- |
| `re.compile()` | Compiles a regular expression into a pattern object, which can then be passed to other `re` functions. <br> Example: <br>__regex_pattern = re.compile(r'cat')__ <br>__re.search(regex_pattern, "the cat in the hat")__</br>|
| `re.search()` |  Scans through a string, looking for any location where the regular expression pattern matches, and returns a match object for the first occurrence. <br> Example: __search_method = re.search(regex_pattern, recipe_text)__ </br> |
| `re.match()` |Checks for a match only at the beginning of the string, and returns a match object if the regular expression matches at the start. <br> Example: __match_method = re.match(regex_pattern, recipe_text)__ </br>|
| `re.finditer()` | Finds all substrings where the regular expression matches, and returns an iterator of match objects, so you can loop over the matches. <br> Example: __finditer_method = re.finditer(regex_pattern, recipe_text)__</br>|
| `re.split()` | Returns a list where the string has been split at each match. <br> Example: __split_method = re.split(regex_pattern, recipe_text)__</br>|
| `re.sub()` | Replaces one or more matches with a string. <br> Example: __sub_method = re.sub(regex_pattern, replacement, recipe_text)__. In this case, `replacement` is the string that will replace each match of the pattern within `recipe_text`. </br> |
| `re.findall()` | Returns a list of matches for the regular expression. If there are no groups, findall returns a list of strings matching the entire pattern.<br>If there is exactly one *capturing* group, findall returns a list of the strings matched by that group.<br>If there is more than one *capturing* group, findall returns a list of tuples, one per match, holding the strings matched by each *capturing* group -- matches for *non-capturing* groups will not appear in the output.<br>If the pattern's only groups are *non-capturing*, findall returns the text matching the entire pattern, as it does when there are no groups. <br> Example: __findall_method = regex_pattern.findall(recipe_text)__ </br>|
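
The functions in the table above can be compared side by side in a short sketch. The `sample` string below is a hypothetical stand-in rather than the activity's `recipe_text`, and the patterns are chosen only to show what each function returns.

```python
import re

# Hypothetical sample text, not taken from the activity's recipe_text.
sample = "Freeze 30 minutes, then bake 45 minutes."
regex_pattern = re.compile(r"\d+")

# re.search: match object for the first occurrence anywhere in the string.
first = re.search(regex_pattern, sample)
print(first.group())                    # 30

# re.match: only succeeds when the pattern matches at the very start.
at_start = re.match(regex_pattern, sample)
print(at_start)                         # None

# re.finditer: iterator of match objects, each knowing its own position.
spans = [m.span() for m in re.finditer(regex_pattern, sample)]
print(spans)                            # [(7, 9), (29, 31)]

# re.split: cut the string at every match of the pattern.
parts = re.split(r",\s*", "cabbage, potatoes,onion")
print(parts)                            # ['cabbage', 'potatoes', 'onion']

# re.sub: replace every match with the replacement string.
masked = re.sub(regex_pattern, "N", sample)
print(masked)                           # Freeze N minutes, then bake N minutes.

# re.findall: the result depends on the capturing groups in the pattern.
no_group = re.findall(r"\d+ minutes", sample)
one_group = re.findall(r"(\d+) minutes", sample)
two_groups = re.findall(r"(\d+) (minutes)", sample)
non_capturing = re.findall(r"\d+ (?:minutes)", sample)
print(no_group)        # ['30 minutes', '45 minutes']
print(one_group)       # ['30', '45']
print(two_groups)      # [('30', 'minutes'), ('45', 'minutes')]
print(non_capturing)   # ['30 minutes', '45 minutes']
```

Note how `re.match()` returns `None` here even though the text contains digits, because the string does not start with one.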

| RegEx Flags | Meaning |
| --- | --- |
| `re.IGNORECASE` | Performs case-insensitive matching. <br> Example: __regex_pattern = re.compile(r'(hour)+[s]', re.IGNORECASE)__ </br>|
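
A minimal sketch of what the flag changes, using a hypothetical `sample` sentence that is not part of the activity:

```python
import re

# Hypothetical sample text with the same word in three different casings.
sample = "Add BUTTER, then more butter, then Butter again."

# Without the flag, only the exact lowercase spelling matches.
case_sensitive = re.findall(r"butter", sample)

# With the flag, every casing of the word matches.
case_insensitive = re.findall(r"butter", sample, flags=re.IGNORECASE)

# The flag can also be attached when compiling the pattern.
compiled = re.compile(r"butter", re.IGNORECASE)
from_compiled = compiled.findall(sample)

print(case_sensitive)     # ['butter']
print(case_insensitive)   # ['BUTTER', 'butter', 'Butter']
print(from_compiled == case_insensitive)  # True
```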

| RegEx Character | Meaning |
| --- | --- |
| `*` | Matches zero or more of the previous expression. <br> Example: __regex_pattern = re.compile(r'[0-9]*')__--which matches a run of zero or more digits from 0 to 9. </br>|
| `?` | Matches zero or one of the previous expression. <br> Example: __regex_pattern = re.compile(r'colou?r')__--which matches the string "color" or "colour" because 'u' is optional. </br> |
| `+` | Matches one or more of the previous expression. <br> Example: __regex_pattern = re.compile(r'\d+')__--which matches one or more digits. </br> |
| `\` | Used to escape a special character (i.e., *metacharacter*) to treat the character as regular text. <br> Example: __regex_pattern = re.compile(r'\\.')__--which matches a literal period '.' </br>|
| `.` | Matches any character except a newline. <br> Example: __regex_pattern = re.compile(r'a.b')__--which matches any string where 'a' is followed by a single character and then 'b'. </br> |
| `()` | Creates capture groups to extract and refer to specific parts of a matched string. <br> Example: __regex_pattern = re.compile(r'(\d{3})')__--which matches any 3 consecutive digits and captures them as a group. </br> |
| `(?:)` | Non-capturing group that groups parts of a pattern together but does not create a capturing group that you can refer to later.  <br> Example: __regex_pattern = re.compile(r'\d+(?:-\d+)*')__--which matches one or more digits, followed by zero or more repetitions of the non-capturing group (a hyphen followed by one or more digits). </br> |
| `\|` | Serves as an "or" statement that will match the pattern either on the right or left of the bar.  <br> Example: __regex_pattern = re.compile(r'apple\|orange')__ --which matches either "apple" or "orange". </br>  |
| `[]` | Matches any single character from the set or range inside the brackets. <br> Example: __regex_pattern = re.compile(r'[0-9]')__--which matches any single digit. </br>  |
| `{}` | Matches a specific number of occurrences of the previous expression. <br> Example: __regex_pattern = re.compile(r'a{3}')__--which matches exactly three consecutive 'a' characters. |
| `^` | Matches the start of a string. <br> Example: __regex_pattern = re.compile(r'^Hello')__--which matches 'Hello' only if it appears at the start of the string. |
| `$` | Matches the end of a string. <br> Example: __regex_pattern = re.compile(r'world!$')__--which matches 'world!' only if it appears at the end of the string.|
| `\d` | Matches a digit [0-9]. <br> Example: __regex_pattern = re.compile(r'\d')__--which matches any single digit. |
| `\D` | Matches a non-digit. <br> Example: __regex_pattern = re.compile(r'\D')__--which matches any character that is not a digit [0-9]. |
| `\w` | Matches an alphanumeric character (letters [a-zA-Z] and digits [0-9]) or an underscore. <br> Example: __regex_pattern = re.compile(r'\w')__--which matches all alphanumeric characters and underscores in the text, ignoring spaces and punctuation. |
| `\W` | Matches a non-alphanumeric character. <br> Example: __regex_pattern = re.compile(r'\W')__--which matches punctuation and spaces in the text.  |
| `\s` | Matches a whitespace character. <br> Example: __regex_pattern = re.compile(r'\s')__--which matches any whitespace character in the text. |
| `\S` | Matches a non-whitespace character. <br> Example: __regex_pattern = re.compile(r'\S')__--which matches any non-whitespace character in the text (including letters, punctuation, and digits). |
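
To make the character definitions above concrete, here is a small sketch that runs several of them through `re.findall()` and `re.search()`. All input strings are hypothetical examples rather than text from the activity.

```python
import re

# Quantifiers: *, +, ? and {}
star = re.findall(r"ab*", "a ab abb")           # 'a' followed by zero or more 'b'
plus = re.findall(r"ab+", "a ab abb")           # 'a' followed by one or more 'b'
optional = re.findall(r"colou?r", "color colour")
exactly_three = re.findall(r"a{3}", "aa aaa")
print(star)            # ['a', 'ab', 'abb']
print(plus)            # ['ab', 'abb']
print(optional)        # ['color', 'colour']
print(exactly_three)   # ['aaa']

# An unescaped dot matches any character; an escaped dot matches only a period.
any_char = re.findall(r"3.5", "3.5 345")
literal_dot = re.findall(r"3\.5", "3.5 345")
print(any_char)        # ['3.5', '345']
print(literal_dot)     # ['3.5']

# Alternation and character sets
fruit = re.findall(r"apple|orange", "apple, pear, orange")
single_digits = re.findall(r"[0-9]", "9-inch pan, 20 cookies")
print(fruit)           # ['apple', 'orange']
print(single_digits)   # ['9', '2', '0']

# Anchors: ^ for the start of the string, $ for the end
starts = re.search(r"^Hello", "Hello world!") is not None
not_at_start = re.search(r"^world", "Hello world!") is None
ends = re.search(r"world!$", "Hello world!") is not None
print(starts, not_at_start, ends)   # True True True

# Character classes
text = "3 eggs, 12 oz"
words = re.findall(r"\w+", text)
non_word = re.findall(r"\W", text)
non_space = re.findall(r"\S+", text)
non_digit = re.findall(r"\D+", text)
print(words)       # ['3', 'eggs', '12', 'oz']
print(non_word)    # [' ', ',', ' ', ' ']
print(non_space)   # ['3', 'eggs,', '12', 'oz']
print(non_digit)   # [' eggs, ', ' oz']
```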

| Python String Functions | Meaning |
| --- | --- |
| `strip()` | Removes any whitespace (spaces, tabs, newlines) at the beginning and end of a string. <br>Example: __strip_method = recipe_text.strip()__  </br> |
| `str()` | Converts an object into its string representation. <br> Example: __print(str(item))__ </br> |
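
These two string functions often come in handy next to regex results. The sketch below uses hypothetical strings and variable names to show `strip()` tidying the pieces returned by `re.split()`, and `str()` converting a number for display.

```python
import re

# strip() removes leading and trailing whitespace, including newlines.
line = "  2 cups flour \n"  # hypothetical sample text
cleaned = line.strip()
print(repr(cleaned))        # '2 cups flour'

# Splitting on commas leaves stray spaces that strip() can remove.
pieces = re.split(r",", "cabbage, potatoes , onion")
tidy = [piece.strip() for piece in pieces]
print(tidy)                 # ['cabbage', 'potatoes', 'onion']

# str() converts a non-string object, such as an int, into text.
item = 3
label = str(item) + " eggs"
print(label)                # 3 eggs
```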

In [None]:
# RUN THIS CELL. IT CANNOT BE MODIFIED.

import re

In [None]:
# RUN THIS CELL. IT CANNOT BE MODIFIED.

recipe_text = \
'''Recipe 1: 194_Cabbage_Kielbasa_Supper
1) In a 5-qt. slow cooker, combine the cabbage, potatoes, onion, salt and pepper.
2) Pour broth over all for 2 HOURS.
3) Place sausage on top (slow cooker will be full, but cabbage will cook down).
4) Cover and cook on low for 4-5 hours or until vegetables are tender and sausage is heated through.
Recipe 2: 195_Chocolate_Chip_Cookie_Ice_Cream_Cake
1) Crush half the cookies (about 20 cookies) to make crumbs.
2) Combine crumbs with melted margarine and press into the bottom of a 9-inch springform pan or pie plate.
3) Stand remaining cookies around edge of pan.
4) Spread 3/4 cup fudge topping over crust.
5) Freeze two hours.
6) Meanwhile, soften 1 quart of ice cream in microwave or on countertop.
7) After crust has chilled, spread softened ice cream over fudge layer.
8) Freeze 30 minutes.
9) Scoop remaining quart of ice cream into balls and arrange over spread ice cream layer.
10) Freeze until firm, 3.5-4.5 Hours.
11) To serve, garnish with remainder of fudge topping, whipped cream and cherries.'''