### Q1. Explain the difference between greedy and non-greedy syntax with visual terms in as few words as possible. What is the bare minimum effort required to transform a greedy pattern into a non-greedy one? What character or characters can you introduce or change?

- Greedy quantifiers (*, +, ?, {m,n}) match as much as possible.
- Non-greedy quantifiers (*?, +?, ??, {m,n}?) match as little as possible.
- To turn a greedy quantifier into a non-greedy one, append a single ? to it. No other change to the pattern is needed.
- In both cases the rest of the pattern must still match: greedy takes the longest stretch that allows this, non-greedy the shortest.

In [1]:
import re

text = 'abcxyzabcxyz'

# Greedy matching
greedy_match = re.search(r'abc.*xyz', text)
print(greedy_match.group())  # Output: abcxyzabcxyz

# Non-greedy matching
non_greedy_match = re.search(r'abc.*?xyz', text)
print(non_greedy_match.group())  # Output: abcxyz


abcxyzabcxyz
abcxyz


### Q2. When exactly does greedy versus non-greedy make a difference? What if you're looking for a non-greedy match but the only one available is greedy?

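- Greedy and non-greedy matching differ only when the text offers more than one place where the rest of the pattern can match, for example several candidate closing delimiters after the quantified part.
- If there is only one way for the overall pattern to match, a non-greedy quantifier keeps expanding until the pattern succeeds, so it returns the same text a greedy quantifier would.
- To get a shorter match in that situation, the pattern itself has to change, for example by replacing . with a negated character class or by adding further constraints.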
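
The cases above can be checked with a small sketch. The sample strings and variable names here are made up for illustration and are not part of the original exercise.

```python
import re

# Hypothetical sample strings, chosen only for illustration.
many = "<a><b><c>"   # several possible closing '>' characters
single = "<a>"       # exactly one possible closing '>'

# Several possible end points: the two quantifiers disagree.
greedy_many = re.search(r"<.*>", many).group()
lazy_many = re.search(r"<.*?>", many).group()
print(greedy_many)  # <a><b><c>
print(lazy_many)    # <a>

# Only one possible end point: both return the same text.
greedy_single = re.search(r"<.*>", single).group()
lazy_single = re.search(r"<.*?>", single).group()
print(greedy_single == lazy_single)  # True

# Anchors leave only one way to match, so the non-greedy
# quantifier is forced to expand across the whole string.
forced = re.search(r"^<.*?>$", many).group()
print(forced)  # <a><b><c>

# Changing the pattern (a negated character class) is what
# actually guarantees the short match.
tight = re.search(r"<[^>]*>", many).group()
print(tight)  # <a>
```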

### Q3. In a simple match of a string, which looks only for one match and does not do any replacement, is the use of a nontagged group likely to make any practical difference?

In a simple match of a string, a nontagged (non-capturing) group is unlikely to make any practical difference. A non-capturing group, written (?:...), still takes part in matching; it just does not store the text it matched as a numbered group. The text matched by the whole regex is the same either way.

In [14]:
import re
phoneNumRegex = re.compile(r'\d\d\d')
num = phoneNumRegex.search('My number is 234-567-8901.')
print(f'Phone number found -> {num.group()}')   # group() with no argument returns the whole match
print(f'Phone number found -> {num.group(0)}')  # group(0) is the same whole match; this pattern has no groups

Phone number found -> 234
Phone number found -> 234
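
A minimal sketch comparing a capturing and a non-capturing version of the same pattern may make the point clearer. The full phone-number pattern below is an assumption added for illustration; the original cell only matches three digits.

```python
import re

text = "My number is 234-567-8901."

# Same pattern twice; only the first group differs (capturing vs non-capturing).
capturing = re.search(r"(\d{3})-\d{3}-\d{4}", text)
non_capturing = re.search(r"(?:\d{3})-\d{3}-\d{4}", text)

# The overall match is identical in a simple search.
print(capturing.group())       # 234-567-8901
print(non_capturing.group())   # 234-567-8901

# The only difference is what is stored as numbered groups.
print(capturing.groups())      # ('234',)
print(non_capturing.groups())  # ()
```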


### Q4. Describe a scenario in which using a nontagged category would have a significant impact on the program's outcomes.

In [15]:
import re

string = "I love cats and dogs."

# Tagged Category (Capturing Group)
pattern_tagged = r"(cats|dogs)"
match_tagged = re.search(pattern_tagged, string)
if match_tagged:
    print("Tagged Category Match:", match_tagged.group(1))  # Output: cats

# Non-Tagged Category (Non-Capturing Group)
pattern_non_tagged = r"(?:cats|dogs)"
match_non_tagged = re.search(pattern_non_tagged, string)
if match_non_tagged:
    print("Non-Tagged Category Match:", match_non_tagged.group())  # Output: cats


Tagged Category Match: cats
Non-Tagged Category Match: cats


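In the example above both patterns find the same text, so the difference shows up elsewhere:

- Return Values: Functions such as re.findall and re.split change what they return when the pattern contains capturing groups, so switching a group to (?:...) can change the program's output.

- Group Count: A non-tagged category does not increase the group count, so the numbering of the remaining groups and any backreferences to them stays the same.

- Simplified Match Result: Non-tagged categories keep unnecessary groups out of the match result, so the result reflects the overall pattern match rather than helper groups.

- Memory and Performance: Non-tagged categories avoid storing extra group text, which can save a little memory and time, although the effect is usually small.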
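
As a sketch of a case where the choice does change the outcome, the following compares re.findall and re.split with and without a capturing group. The sample text is hypothetical.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "10 cats, 20 dogs, 30 cats"

# With a capturing group, findall returns only the group's text.
with_capture = re.findall(r"\d+ (cats|dogs)", text)
print(with_capture)     # ['cats', 'dogs', 'cats']

# With a non-capturing group, findall returns the whole matches.
without_capture = re.findall(r"\d+ (?:cats|dogs)", text)
print(without_capture)  # ['10 cats', '20 dogs', '30 cats']

# re.split keeps the separators in the result when they are captured.
split_capture = re.split(r"(,\s*)", text)
print(split_capture)    # ['10 cats', ', ', '20 dogs', ', ', '30 cats']

split_plain = re.split(r"(?:,\s*)", text)
print(split_plain)      # ['10 cats', '20 dogs', '30 cats']
```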

### Q5. Unlike a normal regex pattern, a look-ahead condition does not consume the characters it examines. Describe a situation in which this could make a difference in the results of your programme.

A look-ahead condition in a regular expression allows you to specify a condition that must be met after a certain pattern without consuming the characters it examines. This can make a difference in the results of your program in situations where you want to enforce a specific condition without including it in the actual match.

In [18]:
import re

prices = ["100 USD", "200 EUR", "300 USD", "400 GBP"]
pattern = r"\d+(?= USD)"

for price in prices:
    match = re.search(pattern, price)
    if match:
        print("Price:", match.group())

Price: 100
Price: 300
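
Because the look-ahead does not consume text, the characters it examined remain available for the next match. A short sketch of that effect, using a made-up string of digits:

```python
import re

# Hypothetical sample string, chosen only for illustration.
digits = "1 2 3 4"

# Consuming pattern: each match uses up both digits, so pairs cannot overlap.
consumed = re.findall(r"\d \d", digits)
print(consumed)     # ['1 2', '3 4']

# Look-ahead: the second digit is only inspected, so it can start the next match.
followed = re.findall(r"\d(?= \d)", digits)
print(followed)     # ['1', '2', '3']

# A capturing group inside a look-ahead reports overlapping pairs.
overlapping = re.findall(r"(?=(\d \d))", digits)
print(overlapping)  # ['1 2', '2 3', '3 4']
```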


### Q6. In regular expressions, what is the difference between positive look-ahead and negative look-ahead?

- Positive Look-ahead ((?=pattern)) asserts that the given pattern must match immediately after the current position, without including that text in the match. The preceding part of the regex matches only if it is followed by the look-ahead pattern.

- Negative Look-ahead ((?!pattern)) asserts that the given pattern must not match immediately after the current position. The preceding part of the regex matches only if it is not followed by the look-ahead pattern.
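
The two assertions can be compared side by side in a small sketch. It reuses the price idea from Q5; the combined string and the \b anchors are additions for illustration.

```python
import re

prices = "100 USD, 200 EUR, 300 USD, 400 GBP"

# Positive look-ahead: numbers that ARE followed by " USD".
usd = re.findall(r"\b\d+\b(?= USD)", prices)
print(usd)      # ['100', '300']

# Negative look-ahead: numbers that are NOT followed by " USD".
# The \b anchors stop the engine from backtracking to a partial
# number such as "10", which is followed by "0" rather than " USD".
non_usd = re.findall(r"\b\d+\b(?! USD)", prices)
print(non_usd)  # ['200', '400']
```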

### Q7. What is the benefit of referring to groups by name rather than by number in a regular expression?

Referring to groups by name instead of number in a regular expression offers the following benefits:

- Readability: Group names make the regular expression more readable and easier to understand.

- Self-Documentation: Group names act as documentation within the regular expression, conveying the purpose of each captured group.

- Flexibility: Group names provide flexibility when modifying or rearranging the regular expression, as the code remains unaffected even if the group order changes.

- Named Group Access: Using group names allows direct access to the captured portions, simplifying post-processing tasks and extraction of specific information.

- Code Clarity: Named groups improve code clarity, making it easier for other developers to understand the intended structure of the match.
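
A brief sketch of these points, using a hypothetical date pattern: access by name keeps working after the groups are reordered, while access by number does not.

```python
import re

# Hypothetical date patterns, chosen only for illustration.
iso_date = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
m = iso_date.search("Released on 2023-06-15.")

print(m.group("year"))  # 2023
print(m.groupdict())    # {'year': '2023', 'month': '06', 'day': '15'}

# Same fields in a different order: group 1 is now the day,
# but code that asks for "year" by name is unaffected.
day_first = re.compile(r"(?P<day>\d{2})/(?P<month>\d{2})/(?P<year>\d{4})")
m2 = day_first.search("Released on 15/06/2023.")

print(m2.group("year"))  # 2023
print(m2.group(1))       # 15
```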

### Q8. Can you identify repeated items within a target string using named groups, as in "The cow jumped over the moon"?

In [32]:
import re

string = "The cow jumped over the Moon"
pattern = r"\b(?P<word>\w+)\b.*\b(?P=word)\b"
regex = re.compile(pattern, re.IGNORECASE)

match = regex.search(string)
if match:
    print("Repeated word:", match.group("word"))
else:
    print("No repeated words found")


Repeated word: The


### Q9. When parsing a string, what is at least one thing that the Scanner interface does for you that the re.findall feature does not?

Assuming the question refers to Python's re.Scanner (an undocumented class in the re module), the main thing it adds over re.findall is that each pattern is paired with an action: a function called on every token that pattern matches, so tokens can be classified or converted as they are scanned. re.findall only returns the matched strings. Scanner.scan also returns the unmatched remainder of the input, whereas re.findall silently skips text that matches nothing.
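
A minimal sketch of that difference, assuming CPython's undocumented re.Scanner class. The token names (INT, NAME, OP) and the input string are made up for illustration.

```python
import re

# re.Scanner is undocumented; this sketch assumes the CPython implementation.
# Each entry pairs a pattern with an action called as action(scanner, token).
scanner = re.Scanner([
    (r"\d+", lambda sc, tok: ("INT", int(tok))),       # convert to int
    (r"[A-Za-z_]\w*", lambda sc, tok: ("NAME", tok)),  # label identifiers
    (r"[+\-*/=]", lambda sc, tok: ("OP", tok)),        # label operators
    (r"\s+", None),                                    # None: skip whitespace
])

source = "total = 3 + 42 ?"

# scan() returns the action results plus whatever could not be matched.
tokens, remainder = scanner.scan(source)
print(tokens)     # [('NAME', 'total'), ('OP', '='), ('INT', 3), ('OP', '+'), ('INT', 42)]
print(remainder)  # ?

# re.findall returns plain strings and silently skips the unmatched '?'.
found = re.findall(r"\d+|[A-Za-z_]\w*|[+\-*/=]", source)
print(found)      # ['total', '=', '3', '+', '42']
```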

### Q10. Does a scanner object have to be named scanner?

No, a Scanner object does not have to be named "scanner." You can choose any valid variable name for your Scanner object. The name of the variable is up to you and can be chosen based on your preferred naming conventions or to reflect the purpose of the Scanner object in your code.