## Q1. Explain the difference between greedy and non-greedy syntax with visual terms in as few words as possible. What is the bare minimum effort required to transform a greedy pattern into a non-greedy one? What character or characters can you introduce or change?

**Answer:** <br> The primary distinction between Greedy and Non-Greedy Match Syntax is that Greedy Match aims to match as many repetitions of the quantified pattern as possible, while Non-Greedy Match attempts to match as few repetitions of the quantified pattern as possible.

Greedy: Matches as much as possible, consuming more characters.

Non-greedy: Matches as little as possible, consuming fewer characters.

To transform a greedy pattern into a non-greedy one in Python, add a single `?` immediately after the quantifier: `*` becomes `*?`, `+` becomes `+?`, `?` becomes `??`, and `{m,n}` becomes `{m,n}?`.

Example: Change `.*` (greedy) to `.*?` (non-greedy).

In [43]:
# Greedy Match:

import re

text = "ababacjcjd"
greedy_match = re.search(r'ab.*', text)

print(greedy_match.group())

ababacjcjd


In [45]:
# Non-Greedy Match: nothing follows .*? in the pattern, so it matches zero characters

import re

text = "ababababaaaaabbbbsfdgsfgfdg"
non_greedy_match = re.search(r'ab.*?', text)

print(non_greedy_match.group())


ab
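
To see the two behaviours side by side, the sketch below runs both forms against the same string. The sample text and variable names are illustrative assumptions, not part of the exercise.

```python
import re

# Illustrative sample text (an assumption for this sketch).
text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs to the last '>' in the string.
greedy = re.search(r'<.*>', text).group()

# Non-greedy: .*? stops at the first '>' that lets the pattern succeed.
non_greedy = re.search(r'<.*?>', text).group()

print(greedy)      # <b>bold</b> and <i>italic</i>
print(non_greedy)  # <b>
```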


## Q2. When exactly does greedy versus non-greedy make a difference?  What if you're looking for a non-greedy match but the only one available is greedy?


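**Answer:** Greedy and non-greedy quantifiers give different results only when the quantified part can match more than one length and the rest of the pattern still succeeds. For example, against `aXbXb` the greedy `a.*b` returns `aXbXb`, while the non-greedy `a.*?b` returns `aXb`. If only one overall match is possible, for instance because the pattern is anchored or the text contains a single candidate, a non-greedy quantifier keeps expanding until the rest of the pattern succeeds and returns the same text a greedy one would. When neither form isolates the desired text, tighten the pattern itself, for example with a negated character class such as `[^>]*` instead of `.*?`.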
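
A minimal sketch of both situations follows. The sample strings and variable names are assumptions chosen for illustration.

```python
import re

# Several match lengths are possible, so the two forms differ.
greedy_run = re.search(r'a+', 'caaat').group()
lazy_run = re.search(r'a+?', 'caaat').group()
print(greedy_run)  # aaa
print(lazy_run)    # a

# The '$' anchor leaves only one way to match, so both forms agree.
greedy_val = re.search(r'key=(.*)$', 'key=value').group(1)
lazy_val = re.search(r'key=(.*?)$', 'key=value').group(1)
print(greedy_val, lazy_val)  # value value
```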

## Q3. In a simple match of a string, which looks only for one match and does not do any replacement, is the use of a nontagged group likely to make any practical difference?

**Answer:** <br>In Python, a nontagged (non-capturing) group is written `(?:...)`. It groups part of the pattern but is not stored in the match object, so its text does not appear in `groups()` and cannot be retrieved with `group(n)`.

In a simple match that only checks whether the pattern is found, or reads the whole match with `group()`, a nontagged group is unlikely to make any practical difference. If the program needs the text matched by the group itself, a tagged (capturing) group `(...)` is required.

In [9]:
import re
phoneNumRegex = re.compile(r'\d\d\d')
num = phoneNumRegex.search('My number is 278-568-9874.')
print(f'Phone number found -> {num.group()}') # group() returns the whole match
print(f'Phone number found -> {num.group(0)}') # group(0) is the same whole match; this pattern defines no groups

Phone number found -> 278
Phone number found -> 278
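
The comparison below is a sketch of the same search written once with tagged groups and once with nontagged groups. The two patterns and the variable names are assumptions added for illustration; the whole match is identical, and only `groups()` differs.

```python
import re

text = 'My number is 278-568-9874.'

# Tagged (capturing) groups store each part of the number.
tagged = re.search(r'(\d{3})-(\d{3})-(\d{4})', text)

# Nontagged (non-capturing) groups match the same text but store nothing.
untagged = re.search(r'(?:\d{3})-(?:\d{3})-(?:\d{4})', text)

print(tagged.group() == untagged.group())  # True
print(tagged.groups())                     # ('278', '568', '9874')
print(untagged.groups())                   # ()
```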


## Q4. Describe a scenario in which using a nontagged category would have a significant impact on the program's outcomes?

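**Answer:** <br> A nontagged group matters whenever the program reads the captured groups, because it changes what `groups()`, `re.findall`, and `re.split` return. In the snippet below, the decimal point is matched by a nontagged group, so it is not stored in the match object and `groups()` returns only the two numeric parts. This is useful when the separator between values is of no interest and only the values need to be captured.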

In [47]:
# Non-tagged group:
import re
text='135.456'
pattern=r'(\d+)(?:\.)(\d+)'  # escape the dot so it matches only a literal '.'
regobj=re.compile(pattern)
matobj=regobj.search(text)
matobj.groups()
# The '.' separator is matched by a non-tagged group, so it is not captured.
# This is useful when the separator between values is of no interest and
# only the values themselves are needed.

('135', '456')
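
Two further cases where the choice visibly changes a program's output are `re.findall` and `re.split`, both of which return captured groups when the pattern contains any. The sketch below uses made-up sample strings and variable names.

```python
import re

text = 'ababab xx abab'

# With no capturing group, findall returns each whole match.
whole_runs = re.findall(r'(?:ab)+', text)
# With a capturing group, findall returns only the group's last repetition.
last_reps = re.findall(r'(ab)+', text)
print(whole_runs)  # ['ababab', 'abab']
print(last_reps)   # ['ab', 'ab']

csv_line = 'a,b,c'

# A capturing group in the separator pattern puts the separators in the result.
plain_split = re.split(r'(?:,)', csv_line)
split_with_seps = re.split(r'(,)', csv_line)
print(plain_split)      # ['a', 'b', 'c']
print(split_with_seps)  # ['a', ',', 'b', ',', 'c']
```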

## Q5. Unlike a normal regex pattern, a look-ahead condition does not consume the characters it examines. Describe a situation in which this could make a difference in the results of your programme?

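**Answer:** <br> Because a look-ahead does not consume the characters it examines, those characters remain available to the rest of the pattern and to the next match. This matters when matches overlap or share a delimiter. For example, when counting sentences or lines, the punctuation or capital letter that marks the end of one item can be checked by a look-ahead and still be matched as part of the next item. With an ordinary consuming pattern, that shared text would be used up and later matches would be missed, giving an incorrect count.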
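
Counting overlapping occurrences is one small case where this shows up. The sketch below uses an assumed sample string; wrapping the pattern in a look-ahead makes each match zero-width, so the search can find occurrences that share characters.

```python
import re

text = 'ababa'

# Consuming pattern: the first match uses up 'aba', leaving only 'ba'.
consuming = re.findall(r'aba', text)

# Look-ahead: each match is zero-width, so the next search starts one
# character later and finds the overlapping occurrence as well.
overlapping = re.findall(r'(?=(aba))', text)

print(len(consuming))    # 1
print(len(overlapping))  # 2
```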

## Q6. In standard expressions, what is the difference between positive look-ahead and negative look-ahead?

**Answer:** <br> Positive look-ahead (`(?=...)`) succeeds only if the given condition follows the main pattern; negative look-ahead (`(?!...)`) succeeds only if it does not. In both cases the text examined by the look-ahead is not included in the match. Both are zero-width lookaround assertions in Python's regular expressions.

In [48]:
# Positive lookahead
import re
pat=r'abc(?=[A-Z])'
text="abcABCEF"
regobj=re.compile(pat)
matobj=regobj.findall(text)
print("Positive lookahead:",matobj)

# Negative look ahead

pat1=r'abc(?!abc)'
text1="aeiouabcabc"
regobj1=re.compile(pat1)
matobj1=regobj1.findall(text1)  # only the second 'abc' is not followed by 'abc'
print("Negative look ahead:",matobj1)

Positive lookahead: ['abc']
Negative look ahead: ['abc']


## Q7. What is the benefit of referring to groups by name rather than by number in a standard expression?

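**Answer:** Using named groups instead of numerical indices in a regular expression improves code clarity and readability. It also makes the code easier to maintain, because a name stays valid when groups are added, removed, or reordered, whereas numeric indices shift.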
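
As a small illustration, the sketch below parses a date with named groups. The sample text and the group names `year`, `month`, and `day` are assumptions chosen for this example.

```python
import re

date_text = 'Released on 2023-08-15'
m = re.search(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})', date_text)

# Access by name documents intent at the call site.
print(m.group('year'), m.group('month'), m.group('day'))  # 2023 08 15

# Numeric access still works, but depends on group order.
print(m.group(1))  # 2023

# groupdict() returns all named groups at once.
date_parts = m.groupdict()
print(date_parts)  # {'year': '2023', 'month': '08', 'day': '15'}
```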

## Q8. Can you identify repeated items within a target string using named groups, as in "The cow jumped over the moon"?

In [49]:
import re
text = "The cow jumped over the moon"
regobj=re.compile(r'(?P<w1>The)',re.I)
regobj.findall(text)

['The', 'the']
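
The cell above finds both occurrences by hard-coding the word. A named backreference, `(?P=name)`, can instead detect a repeated word without knowing it in advance. The following is a minimal sketch; the group name `word` is an assumption chosen for illustration.

```python
import re

text = "The cow jumped over the moon"

# (?P<word>...) captures a word; (?P=word) requires the same text to appear
# again later. re.I makes the comparison case-insensitive.
pattern = re.compile(r'\b(?P<word>\w+)\b.*?\b(?P=word)\b', re.I)
match = pattern.search(text)

print(match.group('word'))  # The
print(match.group())        # The cow jumped over the
```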

## Q9. When parsing a string, what is at least one thing that the Scanner interface does for you that the re.findall feature does not?

**Answer:**<br> The Scanner interface (`re.Scanner`, an undocumented class in Python's `re` module) tokenizes a string using several patterns at once, each paired with its own action. One thing it does that `re.findall` does not is tell you which kind of token each match is.

With the Scanner interface, you supply a list of (pattern, action) pairs. As it moves through the input, the Scanner determines which pattern matches at the current position and calls the associated action, which can return a token labelled with its type. It also returns any trailing text that none of the patterns matched.

`re.findall`, on the other hand, returns every occurrence of a single pattern. It does not directly report which alternative produced a match, and it silently skips over text that does not match.

In summary, the Scanner interface manages multiple patterns and associates each with a token type, which `re.findall` alone does not provide.
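
The sketch below shows one way this can look with `re.Scanner`. Because the class is undocumented, its behaviour may change between Python versions; the token labels (`INT`, `NAME`, `OP`) and the input string are assumptions made for this example.

```python
import re

# Each entry pairs a pattern with an action. The action receives the scanner
# and the matched text; an action of None discards the match.
scanner = re.Scanner([
    (r'\d+',           lambda s, tok: ('INT', int(tok))),
    (r'[A-Za-z_]\w*',  lambda s, tok: ('NAME', tok)),
    (r'[+\-*/=]',      lambda s, tok: ('OP', tok)),
    (r'\s+',           None),  # skip whitespace
])

# scan() returns the list of tokens and any unmatched trailing text.
tokens, remainder = scanner.scan('x = 42 + y')

print(tokens)
# [('NAME', 'x'), ('OP', '='), ('INT', 42), ('OP', '+'), ('NAME', 'y')]
print(repr(remainder))  # ''
```

By comparison, `re.findall(r'\d+|[A-Za-z_]\w*|[+\-*/=]', 'x = 42 + y')` returns only the matched strings, with no indication of which alternative produced each one.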

## Q10. Does a scanner object have to be named scanner?

**Answer:** No, a scanner object in Python does not have to be named scanner. The name of the scanner object is up to the programmer. It is common practice to name the scanner object scanner, but it can be named anything.