#### 1. What is the name of the feature responsible for generating Regex objects?
Regex objects are created by the `re.compile()` function in Python's `re` module. It takes a pattern string and returns a compiled Regex (pattern) object.
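As a minimal sketch of creating a Regex object (the phone-number pattern and the name `phone_regex` are only illustrative assumptions):

```python
import re

# re.compile() takes a pattern string and returns a compiled Regex object.
phone_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')  # sample pattern

# On Python 3.8+ the compiled object's type is exposed as re.Pattern.
print(isinstance(phone_regex, re.Pattern))  # True
```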

#### 2. Why do raw strings often appear in Regex objects?
Raw strings are used because regular expressions contain many backslashes, and in a normal Python string the backslash is itself an escape character. In a raw string (prefixed with `r`), Python leaves backslashes untouched, so they reach the regex engine exactly as typed. Without a raw string, every regex backslash would have to be doubled, for example `'\\d'` instead of `r'\d'`. This makes regular expressions that contain backslashes easier to write and read.
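The effect of the `r` prefix can be checked directly. The sample strings below are illustrative assumptions:

```python
import re

# The same three-digit pattern written two ways.
escaped = '\\d\\d\\d'   # normal string: each backslash must be doubled
raw = r'\d\d\d'         # raw string: backslashes are typed once
print(escaped == raw)   # True

# In a normal string '\b' is one backspace character; in a raw string
# it stays as backslash + b, which the regex engine reads as a word boundary.
print(len('\b'), len(r'\b'))                      # 1 2
words = re.findall(r'\bcat\b', 'cat concat cat')
print(words)                                      # ['cat', 'cat']
```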

#### 3. What is the return value of the search() method?
The search() method of a regex object returns a match object if it finds a match, otherwise it returns None. The match object contains information about the match such as the start and end positions of the match, as well as the matching text.
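A short sketch of both outcomes, using an assumed sample sentence and phone-number pattern:

```python
import re

phone_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

# A successful search returns a Match object.
found = phone_regex.search('My number is 415-555-4242.')
print(found.group())               # 415-555-4242
print(found.start(), found.end())  # 13 25

# An unsuccessful search returns None.
missing = phone_regex.search('No number here.')
print(missing)                     # None
```

Because the result may be `None`, it is usually safest to test it (for example with `if found:`) before calling methods on it.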

#### 4. From a Match object, how do you get the actual strings that match the pattern?
We can use the group() method on a Match object to get the actual strings that match the pattern. The group() method with no arguments returns the entire matched string, and we can use the optional argument to return a specific capturing group. For example, match.group(1) would return the first capturing group.

#### 5. In the regex created from r'(\d\d\d)-(\d\d\d-\d\d\d\d)', what does group zero cover? Group 2? Group 1?
In the regex created from r'(\d\d\d)-(\d\d\d-\d\d\d\d)', group 0 covers the entire match, group 1 covers the first set of parentheses (the first three digits), and group 2 covers the second set of parentheses (three digits, a hyphen, and four more digits, such as 555-4242).

To access these groups from a Match object, you can use the group() method, where group(0) returns the entire match, group(1) returns the match for the first group, and group(2) returns the match for the second group.
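The group numbering can be seen on a sample phone number (the input sentence is an illustrative assumption):

```python
import re

phone_regex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
mo = phone_regex.search('My number is 415-555-4242.')

print(mo.group(0))  # 415-555-4242  (the entire match)
print(mo.group(1))  # 415           (first set of parentheses)
print(mo.group(2))  # 555-4242      (second set, hyphen included)
print(mo.groups())  # ('415', '555-4242')
```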

#### 6. In regular expression syntax, parentheses and periods have special meanings. How can you tell a regex that you want it to match literal parentheses and periods?
To match real parentheses and periods in a regex, you can use the backslash (\) character to indicate that the following character should be treated as a literal character instead of a special character. For example, to match a left parenthesis, you would use the regex pattern `\(`, and to match a period, you would use `\.`.
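A brief sketch of escaped versus unescaped characters, with sample inputs chosen only for illustration:

```python
import re

# \( and \) match literal parentheses; the unescaped pairs still form groups.
area_regex = re.compile(r'\((\d\d\d)\) (\d\d\d-\d\d\d\d)')
mo = area_regex.search('Call (415) 555-4242 today.')
print(mo.group())   # (415) 555-4242
print(mo.group(1))  # 415

# An unescaped period matches any character; an escaped one matches only '.'.
loose = re.findall(r'3.14', '3.14 3x14')
strict = re.findall(r'3\.14', '3.14 3x14')
print(loose)   # ['3.14', '3x14']
print(strict)  # ['3.14']
```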

#### 7. The findall() method returns a string list or a list of string tuples. What causes it to return one of the two options?
The findall() method returns a list of strings if the regular expression has no capturing groups (each string is a full match) or exactly one capturing group (each string is that group's text). If the regular expression has two or more capturing groups, findall() returns a list of tuples, where each tuple represents one match and its elements are the strings matched by each group.
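The return shapes can be compared side by side. One detail worth noting is that a pattern with exactly one capturing group yields plain strings (that group's text) rather than one-element tuples. The sample text is an assumption:

```python
import re

text = 'Cell: 415-555-9999 Work: 212-555-0000'

# No groups: a list of full-match strings.
no_groups = re.findall(r'\d\d\d-\d\d\d-\d\d\d\d', text)
print(no_groups)   # ['415-555-9999', '212-555-0000']

# One group: a list of strings holding that group's text.
one_group = re.findall(r'(\d\d\d)-\d\d\d-\d\d\d\d', text)
print(one_group)   # ['415', '212']

# Two or more groups: a list of tuples, one tuple per match.
two_groups = re.findall(r'(\d\d\d)-(\d\d\d-\d\d\d\d)', text)
print(two_groups)  # [('415', '555-9999'), ('212', '555-0000')]
```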

#### 8. In regular expressions, what does the | character mean?
In regular expressions, the `|` character indicates alternation, which means "match either the expression before or the expression after the `|`". For example, the regular expression `cat|dog` matches either the string "cat" or the string "dog".
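A small sketch of alternation, using an assumed sample sentence:

```python
import re

pet_regex = re.compile(r'cat|dog')
text = 'The dog chased the cat.'

# search() returns whichever alternative occurs first in the text.
first = pet_regex.search(text).group()
print(first)      # dog

all_pets = pet_regex.findall(text)
print(all_pets)   # ['dog', 'cat']

# Parentheses limit the alternation to part of the pattern.
bat = re.search(r'Bat(man|mobile)', 'Batmobile lost a wheel').group()
print(bat)        # Batmobile
```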

#### 9. In regular expressions, what does the . character stand for?
In regular expressions, the dot character (`.`) stands for any single character except a newline character. It is often used as a wildcard character to match any character in a string.
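For instance, with an assumed sample sentence, the dot stands in for exactly one character:

```python
import re

# '.at' matches any single character followed by 'at'.
matches = re.findall(r'.at', 'The cat in the hat sat on the flat mat.')
print(matches)  # ['cat', 'hat', 'sat', 'lat', 'mat']
```

Note that `flat` contributes only `lat`, because the dot consumes just one character.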

#### 10. In regular expressions, what is the difference between the + and * characters?
In regular expressions, `+` and `*` are quantifiers that control how many times the preceding pattern may repeat.

The `*` character matches zero or more occurrences of the preceding regular expression pattern. For example, the regular expression `a*` would match an empty string or any number of 'a' characters, such as 'a', 'aa', 'aaa', and so on.

The `+` character, on the other hand, matches one or more occurrences of the preceding regular expression pattern. For example, the regular expression `a+` would match one or more 'a' characters, such as 'a', 'aa', 'aaa', and so on, but it would not match an empty string.

In short, the main difference between `+` and `*` is that `+` requires at least one occurrence of the preceding pattern, while `*` allows for zero or more occurrences.
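The difference shows up when the optional part is absent. The patterns and inputs below are illustrative assumptions:

```python
import re

star_regex = re.compile(r'Bat(wo)*man')  # 'wo' zero or more times
plus_regex = re.compile(r'Bat(wo)+man')  # 'wo' one or more times

star_zero = star_regex.search('Batman')
plus_zero = plus_regex.search('Batman')
print(star_zero.group())  # Batman
print(plus_zero)          # None

# Both accept one or more repetitions.
plus_many = plus_regex.search('Batwowoman').group()
print(plus_many)          # Batwowoman
```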

#### 11. What is the difference between {4} and {4,5} in regular expressions?
In regular expressions, {4} specifies that the preceding pattern should be matched exactly four times, while {4,5} specifies that the preceding pattern should be matched at least four times and at most five times. In other words, {4} is a fixed quantifier and {4,5} is a range quantifier.
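A quick check with `re.fullmatch()`, which succeeds only if the whole string fits the pattern (the digit strings are sample inputs):

```python
import re

exact_4 = re.fullmatch(r'\d{4}', '1234')
exact_5 = re.fullmatch(r'\d{4}', '12345')
print(exact_4 is not None, exact_5 is not None)  # True False

range_results = [re.fullmatch(r'\d{4,5}', s) is not None
                 for s in ['123', '1234', '12345', '123456']]
print(range_results)  # [False, True, True, False]

# Quantifiers are greedy by default, so {4,5} takes five digits when it can.
greedy = re.search(r'\d{4,5}', '1234567').group()
print(greedy)  # 12345
```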

#### 12. What do the \d, \w, and \s shorthand character classes signify in regular expressions?
In regular expressions, these shorthand character classes have the following meanings:

- \d: Matches any Unicode decimal digit character. This includes digits from 0-9 and any other digit characters from other scripts.
- \w: Matches any Unicode word character. This includes alphanumeric characters and underscore (_).
- \s: Matches any Unicode whitespace character. This includes spaces, tabs, newlines, and any other space-like characters.

These shorthand character classes make regular expressions more concise and readable. For example, you can write \d instead of [0-9] and \w instead of [a-zA-Z0-9_]. For ASCII-only text these are equivalent; with Python 3 string patterns the shorthands also match non-ASCII digits and letters unless the `re.ASCII` flag is set.
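A small sketch of the three classes on an assumed sample string, including the Unicode behaviour of `\d`:

```python
import re

text = 'Room 42, floor_3\tok'

digits = re.findall(r'\d', text)
words = re.findall(r'\w+', text)
spaces = re.findall(r'\s', text)
print(digits)  # ['4', '2', '3']
print(words)   # ['Room', '42', 'floor_3', 'ok']
print(spaces)  # [' ', ' ', '\t']

# Two Arabic-Indic digits (U+0664, U+0662): matched by \d on str patterns,
# but not when the re.ASCII flag restricts \d to 0-9.
arabic = '\u0664\u0662'
unicode_digits = re.findall(r'\d', arabic)
ascii_digits = re.findall(r'\d', arabic, re.ASCII)
print(len(unicode_digits), len(ascii_digits))  # 2 0
```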

#### 13. What do the \D, \W, and \S shorthand character classes signify in regular expressions?
In regular expressions, the shorthand character classes \D, \W, and \S are the negated versions of \d, \w, and \s, respectively. They match any character that is not a digit (\D), not a word character (\W), and not a whitespace character (\S). For example, \D will match any non-digit character, such as letters or symbols, and \W will match any non-word character, such as punctuation or spaces. Similarly, \S will match any non-whitespace character, such as letters or symbols.
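The negated classes can be tried on a sample string (an illustrative assumption) to see which runs of characters each one picks out:

```python
import re

text = 'Room 42, floor_3\tok'

non_digits = re.findall(r'\D+', text)
non_words = re.findall(r'\W+', text)
non_spaces = re.findall(r'\S+', text)
print(non_digits)  # ['Room ', ', floor_', '\tok']
print(non_words)   # [' ', ', ', '\t']
print(non_spaces)  # ['Room', '42,', 'floor_3', 'ok']
```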



#### 14. What is the difference between `.*` and `.*?`?
The difference is greediness. `.*` is greedy: it matches any sequence of zero or more characters and takes as many as it can while still letting the rest of the pattern match. Adding the question mark, as in `.*?`, makes the `*` quantifier non-greedy (lazy), so it takes as few characters as possible.

For example, consider the string `'abracadabra'`. The regular expression `a.*a` matches the entire string, from the first `'a'` to the last `'a'`. The regular expression `a.*?a` matches only the substring `'abra'`, because it stops at the first `'a'` that follows the starting `'a'`.
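Greedy and non-greedy matching can also be compared on a string with two possible closing delimiters. The sample text here is an illustrative assumption:

```python
import re

text = '<To serve man> for dinner.>'

# Greedy: runs to the last '>' in the string.
greedy = re.search(r'<.*>', text).group()
print(greedy)  # <To serve man> for dinner.>

# Non-greedy: stops at the first '>' it reaches.
lazy = re.search(r'<.*?>', text).group()
print(lazy)    # <To serve man>
```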



#### 15. What is the syntax for matching both numbers and lowercase letters with a character class?
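The syntax for matching both numbers and lowercase letters with a character class is `[0-9a-z]`. In Python, `[\da-z]` also works, because shorthand classes such as `\d` can be used inside square brackets. Character classes cannot be nested, so `[\d[a-z]]` does not do this.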
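A short sketch with an assumed sample string, showing a character class that accepts digits and lowercase letters but skips uppercase ones:

```python
import re

text = 'abc123 XYZ def456'

ranges = re.findall(r'[0-9a-z]+', text)
shorthand = re.findall(r'[\da-z]+', text)  # \d used inside the brackets
print(ranges)     # ['abc123', 'def456']
print(shorthand)  # ['abc123', 'def456']
```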



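#### 16. What is the procedure for making a regular expression case insensitive?
To make a regular expression case-insensitive, pass the `re.IGNORECASE` flag (or its shorthand `re.I`) as the second argument of `re.compile()`. The module-level functions accept it through their `flags` parameter: it is the third argument of `re.search()`, `re.match()`, and `re.findall()`, but for `re.sub()` it should be passed by keyword (`flags=re.IGNORECASE`), because the fourth positional argument of `re.sub()` is `count`.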

For example, the following regular expression matches the word "apple" case-insensitively:
 


```python
import re
string = "I have an Apple and an orange."
pattern = re.compile(r"apple", re.IGNORECASE)
matches = pattern.findall(string)
print(matches)  # Output: ['Apple']
```



In this example, the `re.IGNORECASE` flag is used to compile the regular expression `r"apple"` to be case-insensitive. The `findall()` method is then used to find all matches of the regular expression in the string `string`. Since the flag is set, it matches the word "Apple" even though it has a capital "A".
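The flag can also be given to the module-level functions without compiling first. In this sketch (sample text assumed), `re.sub()` receives it by keyword, which keeps it from being read as the positional `count` argument:

```python
import re

text = 'Apple pie, apple juice'

# re.search(pattern, string, flags): the flag is the third argument.
found = re.search(r'apple', text, re.I).group()
print(found)     # Apple

# re.sub(pattern, repl, string, count=0, flags=0): pass flags by keyword.
replaced = re.sub(r'apple', 'pear', text, flags=re.I)
print(replaced)  # pear pie, pear juice
```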

#### 17. What does the . character normally match? What does it match if re.DOTALL is passed as the second argument to re.compile()?
In regular expressions, the `.` character normally matches any character except for a newline character. However, if the `re.DOTALL` flag is passed as the second argument in `re.compile()`, then the `.` character matches any character including newline characters. This flag makes the dot match everything.
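A minimal comparison on an assumed two-line string:

```python
import re

text = 'Serve the public trust.\nProtect the innocent.'

# Default: '.' stops at the newline.
no_newline = re.compile(r'.*').search(text).group()
print(no_newline)        # Serve the public trust.

# re.DOTALL: '.' also matches the newline, so '.*' spans both lines.
with_newline = re.compile(r'.*', re.DOTALL).search(text).group()
print(with_newline == text)  # True
```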



#### 18. If numRegex = re.compile(r'\d+'), what will numRegex.sub('X', '11 drummers, 10 pipers, five rings, 4 hen') return?
If `numRegex = re.compile(r'\d+')`, `numRegex.sub('X', '11 drummers, 10 pipers, five rings, 4 hen')` will return the string `'X drummers, X pipers, five rings, X hen'`.

In this regular expression, `\d+` matches one or more digits, and the `sub()` method replaces every match with the string `'X'`. The word `five` is left unchanged because it contains no digits.
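The result can be confirmed by running the substitution. The `count=1` and `subn()` lines are extra illustrations that go beyond the question:

```python
import re

num_regex = re.compile(r'\d+')
text = '11 drummers, 10 pipers, five rings, 4 hen'

replaced = num_regex.sub('X', text)
print(replaced)    # X drummers, X pipers, five rings, X hen

# count limits how many matches are replaced.
first_only = num_regex.sub('X', text, count=1)
print(first_only)  # X drummers, 10 pipers, five rings, 4 hen

# subn() also reports how many replacements were made.
_, how_many = num_regex.subn('X', text)
print(how_many)    # 3
```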



#### 19. What does passing re.VERBOSE as the second argument to re.compile() allow you to do?
Passing `re.VERBOSE` as the 2nd argument to `re.compile()` allows you to add comments and whitespace to your regular expression pattern without affecting the functionality of the pattern. This can make the pattern more readable and easier to understand. When `re.VERBOSE` is used, whitespace within the pattern is ignored unless it is escaped or included within a character class. Comments can also be included within the pattern by using the `#` character.
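As a sketch, here is an assumed phone-number pattern spread over several commented lines, compared with its compact equivalent:

```python
import re

verbose_regex = re.compile(r'''
    (\d{3})   # area code
    [-\s]     # separator: hyphen or whitespace (kept, inside a class)
    (\d{3})   # first three digits
    -         # literal hyphen
    (\d{4})   # last four digits
''', re.VERBOSE)

compact_regex = re.compile(r'(\d{3})[-\s](\d{3})-(\d{4})')

verbose_groups = verbose_regex.search('Call 415 555-4242').groups()
compact_groups = compact_regex.search('Call 415 555-4242').groups()
print(verbose_groups)                    # ('415', '555', '4242')
print(verbose_groups == compact_groups)  # True
```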



#### 20. How would you write a regex that matches a number with a comma for every three digits? It must match the following:
#### '42'
#### '1,234'
#### '6,368,745'
#### but not the following:
#### '12,34,567' (which has only two digits between the commas)
#### '1234' (which lacks commas)

To match a number with commas for every three digits, we can use the following regex:


```python
import re
regex = re.compile(r'^\d{1,3}(,\d{3})*$')
```



Here, `^` and `$` match the start and end of the string, respectively. The pattern `\d{1,3}` matches one to three digits, and `(,\d{3})*` matches zero or more occurrences of a comma followed by exactly three digits.

Using this regex, we can check if a string matches the pattern using the `match()` method:


```python
result = regex.match('6,368,745')
if result:
    print('Match found')
else:
    print('Match not found')

# Prints "Match found" for this input string.
```
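To check the pattern against every example from the question, a small loop such as the following could be used (the list and variable names are illustrative):

```python
import re

comma_regex = re.compile(r'^\d{1,3}(,\d{3})*$')

should_match = ['42', '1,234', '6,368,745']
should_not_match = ['12,34,567', '1234']

accepted = [s for s in should_match if comma_regex.match(s)]
rejected = [s for s in should_not_match if comma_regex.match(s) is None]

print(accepted == should_match)      # True
print(rejected == should_not_match)  # True
```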



#### 21. How would you write a regex that matches the full name of someone whose last name is Watanabe? You can assume that the first name that comes before it will always be one word that begins with a capital letter. The regex must match the following:
#### 'Haruto Watanabe'
#### 'Alice Watanabe'
#### 'RoboCop Watanabe'
#### but not the following:
#### 'haruto Watanabe' (where the first name is not capitalized)
#### 'Mr. Watanabe' (where the preceding word has a nonletter character)
#### 'Watanabe' (which has no first name)
#### 'Haruto watanabe' (where Watanabe is not capitalized)


To match the full name of someone whose last name is Watanabe, with the first name that comes before it always being one word that begins with a capital letter, the following regex can be used:

```
[A-Z][a-zA-Z]*\sWatanabe
```

Explanation:

- `[A-Z]` matches the first character of the first name, which must be a capital letter.
- `[a-zA-Z]*` matches the rest of the first name, which can be any combination of uppercase and lowercase letters.
- `\s` matches the space between the first and last names.
- `Watanabe` matches the last name.

This regex will match 'Haruto Watanabe', 'Alice Watanabe', and 'RoboCop Watanabe', but not 'haruto Watanabe', 'Mr. Watanabe', 'Watanabe', or 'Haruto watanabe'.
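The claim can be verified with a short test. The last two lines show a caveat that goes beyond the question: `search()` looks for the pattern anywhere in the string, so an input outside the given examples, such as `'hAruto Watanabe'`, still produces a partial match, while `fullmatch()` rejects it.

```python
import re

name_regex = re.compile(r'[A-Z][a-zA-Z]*\sWatanabe')

should_match = ['Haruto Watanabe', 'Alice Watanabe', 'RoboCop Watanabe']
should_not_match = ['haruto Watanabe', 'Mr. Watanabe', 'Watanabe',
                    'Haruto watanabe']

accepted = [s for s in should_match if name_regex.search(s)]
rejected = [s for s in should_not_match if name_regex.search(s) is None]
print(accepted == should_match, rejected == should_not_match)  # True True

# Hypothetical extra input, not part of the original question.
partial = name_regex.search('hAruto Watanabe').group()
strict = name_regex.fullmatch('hAruto Watanabe')
print(partial)  # Aruto Watanabe
print(strict)   # None
```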




#### 22. How would you write a regex that matches a sentence where the first word is either Alice, Bob, or Carol; the second word is either eats, pets, or throws; the third word is apples, cats, or baseballs; and the sentence ends with a period? This regex should be case-insensitive. It must match the following:
#### 'Alice eats apples.'
#### 'Bob pets cats.'
#### 'Carol throws baseballs.'
#### 'Alice throws Apples.'
#### 'BOB EATS CATS.'
#### but not the following:
#### 'RoboCop eats apples.'
#### 'ALICE THROWS FOOTBALLS.'
#### 'Carol eats 7 cats.'

Here's a regex that matches the given criteria:

```
^(Alice|Bob|Carol)\s+(eats|pets|throws)\s+(apples|cats|baseballs)\.$
```

Explanation:
- `^` matches the start of the sentence
- `(Alice|Bob|Carol)` matches either Alice, Bob or Carol
- `\s+` matches one or more whitespace characters
- `(eats|pets|throws)` matches either eats, pets or throws
- `\s+` matches one or more whitespace characters
- `(apples|cats|baseballs)` matches either apples, cats or baseballs
- `\.` matches a period at the end of the sentence
- `$` matches the end of the sentence

Note that the `re.IGNORECASE` flag should be passed to `re.compile()` when compiling the regex in order to make it case-insensitive.
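Putting the pattern and the flag together, the examples from the question can be checked like this (the variable names are illustrative):

```python
import re

sentence_regex = re.compile(
    r'^(Alice|Bob|Carol)\s+(eats|pets|throws)\s+(apples|cats|baseballs)\.$',
    re.IGNORECASE,
)

should_match = ['Alice eats apples.', 'Bob pets cats.',
                'Carol throws baseballs.', 'Alice throws Apples.',
                'BOB EATS CATS.']
should_not_match = ['RoboCop eats apples.', 'ALICE THROWS FOOTBALLS.',
                    'Carol eats 7 cats.']

accepted = [s for s in should_match if sentence_regex.search(s)]
rejected = [s for s in should_not_match if sentence_regex.search(s) is None]
print(accepted == should_match, rejected == should_not_match)  # True True
```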