<h1 align='center'>Assignment No 17</h1>

Q1. Explain the difference between greedy and non-greedy syntax with visual terms in as few words
as possible. What is the bare minimum effort required to transform a greedy pattern into a non-greedy
one? What character or characters can you introduce or change?


- Greedy: matches as much as possible.
- Non-greedy: matches as little as possible.
- For example, given the string "abababa", the greedy pattern `a.*a` matches the entire string, while the non-greedy pattern `a.*?a` matches only "aba".

- To transform a greedy pattern into a non-greedy one, add a question mark "?" immediately after the quantifier (such as `*` or `+`). For example, the greedy pattern `a.*a` becomes the non-greedy pattern `a.*?a` by adding a question mark after the `*` quantifier.

- The same rule applies to counted quantifiers: `{m,n}` is greedy and `{m,n}?` is non-greedy. For example, the greedy pattern `a{1,3}` becomes the non-greedy pattern `a{1,3}?`.
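The difference can be checked directly. The following is a minimal sketch that reuses the "abababa" string from the answer above; the "aaaa" input for the counted quantifier is an assumption added for illustration.

```python
import re

text = "abababa"

# Greedy: .* takes as much as it can, then backs off until the final "a" fits.
greedy = re.search(r"a.*a", text).group()
# Non-greedy: .*? grows one character at a time until the final "a" fits.
lazy = re.search(r"a.*?a", text).group()

print(greedy)  # abababa
print(lazy)    # aba

# The same "?" suffix works on counted quantifiers ("aaaa" is an illustrative input).
counted_greedy = re.search(r"a{1,3}", "aaaa").group()
counted_lazy = re.search(r"a{1,3}?", "aaaa").group()

print(counted_greedy)  # aaa
print(counted_lazy)    # a
```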

Q2. When exactly does greedy versus non-greedy make a difference? What if you're looking for a
non-greedy match but the only one available is greedy?

- Greedy versus non-greedy matching makes a difference only when the quantified part of the pattern could stop at more than one position and still let the overall pattern succeed.

- For example, consider the string "abababa" and the pattern `a.*a`. The greedy version matches the entire string, while the non-greedy version `a.*?a` stops at the first "a" that completes the pattern and matches only "aba".

- However, if only one match can satisfy the pattern, greedy and non-greedy return the same text. The two versions differ only in the order in which they try candidates: greedy starts long and backs off, non-greedy starts short and expands.

- So if you ask for a non-greedy match but the only match available is the "greedy" one, the non-greedy pattern still succeeds and returns that longer match, because it expands as far as needed for the rest of the pattern to match. If a non-greedy pattern still matches more than you want, tighten the pattern itself, for example by replacing "." with a negated character class, or by using a negative lookahead or lookbehind assertion to specify what text must not be included in the match.
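The points above can be sketched with a short tag-matching example. The `html` string and the variable names are assumptions chosen for illustration; the sketch shows a case where the two modes differ, a negated character class as an alternative, and an anchored case where only one match exists.

```python
import re

html = "<b>bold</b>"

# More than one possible end point: the two modes disagree.
greedy = re.search(r"<.*>", html).group()
lazy = re.search(r"<.*?>", html).group()
# A negated character class gives the short match without relying on laziness.
negated = re.search(r"<[^>]*>", html).group()

print(greedy)   # <b>bold</b>
print(lazy)     # <b>
print(negated)  # <b>

# Anchored at both ends, only one match is possible, so both modes agree.
anchored_greedy = re.search(r"^<.*>$", html).group()
anchored_lazy = re.search(r"^<.*?>$", html).group()

print(anchored_greedy == anchored_lazy)  # True
```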

Q3. In a simple match of a string, which looks only for one match and does not do any replacement, is
the use of a nontagged group likely to make any practical difference?

- In a simple match of a string that looks for only one match and does not do any replacement, the use of a nontagged group is unlikely to make any practical difference.

- A nontagged group, also known as a non-capturing group, is used to group together a set of characters or subpatterns without capturing the matched text. This is indicated by adding a question mark and colon after the opening parenthesis of the group, like this: "(?:pattern)".

- The use of a nontagged group is useful when you want to group together characters or subpatterns for purposes of alternation, repetition, or lookahead/lookbehind, but you do not want to capture the matched text.

- However, in a simple match where only one match is expected and no captured groups are needed for further processing, the use of a nontagged group is not necessary and is unlikely to make any practical difference in terms of performance or results.
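A small sketch of this point, using a made-up input string: the overall match is identical with or without the non-capturing syntax, and the only visible difference is whether a group is recorded.

```python
import re

text = "abababc"  # illustrative input

tagged = re.search(r"(ab)+c", text)
untagged = re.search(r"(?:ab)+c", text)

# The overall match (group 0) is the same either way.
print(tagged.group(0))    # abababc
print(untagged.group(0))  # abababc

# Only the recorded groups differ.
print(tagged.groups())    # ('ab',)
print(untagged.groups())  # ()
```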

Q4. Describe a scenario in which using a nontagged category would have a significant impact on the
program's outcomes.

- Using a nontagged group has a significant impact whenever the program depends on group numbers or group contents: it lets you group part of the pattern (for alternation or repetition) and ignore that group, while the groups you do want to capture keep their numbers.

- For example, consider the following string: "John Smith (23 years old) is a software engineer". Suppose we want to capture both the name "John Smith" and the age "23", and accept either "years" or "yrs" as the unit. Writing the alternation as "(?:years|yrs)" keeps the name in group 1 and the age in group 2 without adding a third group. The literal parentheses around the age are matched with "\(" and "\)".

In [1]:
import re

text = "John Smith (23 years old) is a software engineer"
# (?:years|yrs) groups the alternation without creating a numbered group
pattern = r"(\w+\s\w+)\s*\((\d+)\s*(?:years|yrs) old\)"

match = re.search(pattern, text)
if match:
    name = match.group(1)
    age = match.group(2)
    print(f"Name: {name}, Age: {age}")


Name: John Smith, Age: 23
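The effect on results is easiest to see with re.findall, which returns group contents instead of the whole match as soon as the pattern contains a capturing group. A minimal sketch, with an illustrative input string that is not part of the original answer:

```python
import re

text = "3.14 and 42 and 2.5"  # illustrative input

# With a capturing group, findall returns only what the group captured.
capturing = re.findall(r"\d+(\.\d+)?", text)
# With a non-capturing group, findall returns the whole match.
non_capturing = re.findall(r"\d+(?:\.\d+)?", text)

print(capturing)      # ['.14', '', '.5']
print(non_capturing)  # ['3.14', '42', '2.5']
```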


Q5. Unlike a normal regex pattern, a look-ahead condition does not consume the characters it
examines. Describe a situation in which this could make a difference in the results of your
programme.

- A look-ahead condition in regular expressions is a zero-width assertion that matches a pattern only if it is followed by another pattern, without consuming the characters that make up the second pattern. This can make a difference in the results of a program when we want to match a pattern only if it is followed by another pattern, but we do not want to include the second pattern in the match.

- For example, suppose we want to find all instances of the word "brown" that are followed by the word "fox", but we want the match to contain only the word "brown" and not the word "fox". We can use a look-ahead condition to achieve this, like so:

In [2]:
import re

text = "The quick brown fox jumps over the lazy dog."
pattern = r"brown(?=\sfox)"

matches = re.findall(pattern, text)
print(matches)


['brown']
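Because the look-ahead leaves its characters available to the next match attempt, it also allows overlapping matches, which an ordinary consuming pattern cannot find. A minimal sketch with a made-up input string:

```python
import re

text = "ababa"  # illustrative input

# A consuming pattern uses up "aba", so the second, overlapping "aba" is missed.
consuming = re.findall(r"aba", text)
# A look-ahead consumes nothing, so every starting position is tested.
overlapping = re.findall(r"(?=(aba))", text)

print(consuming)    # ['aba']
print(overlapping)  # ['aba', 'aba']
```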


Q6. In regular expressions, what is the difference between positive look-ahead and negative
look-ahead?

- Positive look-ahead and negative look-ahead are zero-width assertions that allow you to match a pattern only if it is (or is not) followed by another pattern, without consuming the characters that make up the second pattern. The difference between them is in the condition that they specify.

- Positive look-ahead is denoted by (?=pattern). It succeeds at the current position in the string if "pattern" matches the characters following that position. For example, the regular expression foo(?=bar) matches the characters "foo" only if they are followed by the characters "bar".

- Negative look-ahead is denoted by (?!pattern). It succeeds at the current position in the string if "pattern" does not match the characters following that position. For example, the regular expression foo(?!bar) matches the characters "foo" only if they are not followed by the characters "bar".

- In essence, positive look-ahead is used to match a pattern only if it is followed by another pattern, whereas negative look-ahead is used to match a pattern only if it is not followed by another pattern.

In [3]:
import re

text = "The quick brown fox jumps over the lazy dog."
pattern = r"brown(?! dog)"

matches = re.findall(pattern, text)
print(matches)


['brown']
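To contrast the two assertions side by side, the following sketch applies the foo(?=bar) and foo(?!bar) patterns from the answer to one illustrative string and records where each one matches.

```python
import re

text = "foobar foobaz"  # illustrative input: "foo" occurs at index 0 and index 7

# Positive look-ahead: keep "foo" only where "bar" follows.
positive = [m.start() for m in re.finditer(r"foo(?=bar)", text)]
# Negative look-ahead: keep "foo" only where "bar" does not follow.
negative = [m.start() for m in re.finditer(r"foo(?!bar)", text)]

print(positive)  # [0]
print(negative)  # [7]
```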


Q7. What is the benefit of referring to groups by name rather than by number in a regular
expression?

- Readability: Using group names can make your regular expression more readable and easier to understand, especially if your expression contains many groups.

- Maintainability: If you need to change the order or number of groups in your regular expression, using group names allows you to avoid having to update all the references to those groups in your code.

- Self-documentation: By naming your groups, you can make your regular expression self-documenting, which can be especially useful if others need to read or modify your code.

- Flexibility: A named group can be referenced by name everywhere a number could be used: inside the pattern with (?P=name), in a replacement string passed to re.sub() with \g<name>, and on the match object with group("name") or groupdict().

In [4]:
import re

text = "2023-03-11"
pattern = r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
replacement = r"\g<month>/\g<day>/\g<year>"

result = re.sub(pattern, replacement, text)
print(result)


03/11/2023
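The maintainability point can be sketched as follows. The `extended` pattern and the "due: " prefix are assumptions added for illustration: putting a new group in front shifts the group numbers, while lookups by name keep working unchanged.

```python
import re

pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
match = pattern.search("2023-03-11")

print(match.group("month"))  # 03
print(match.groupdict())     # {'year': '2023', 'month': '03', 'day': '11'}

# Hypothetical change: a new group is added in front of the date.
extended = re.compile(
    r"(?P<label>\w+): (?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
)
shifted = extended.search("due: 2023-03-11")

print(match.group(2))          # 03   (group 2 was the month)
print(shifted.group(2))        # 2023 (group 2 is now the year)
print(shifted.group("month"))  # 03   (the name still points at the month)
```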


Q8. Can you identify repeated items within a target string using named groups, as in "The cow
jumped over the moon"?

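- Yes, you can use a named group together with a named backreference, (?P=name), to identify repeated items within a target string. The sentence as quoted contains no adjacent repeated word, so the example below adds a doubled "the" for illustration. The pattern must also allow for the whitespace between the two copies.

In [5]:
import re

# A doubled "the" is added here for illustration; the quoted sentence has no adjacent repeat.
text = "The cow jumped over the the moon"
# \s+ matches the space between the copies; (?P=word) must equal what the group captured.
pattern = r"\b(?P<word>\w+)\s+(?P=word)\b"
matches = re.findall(pattern, text)

print(matches)


['the']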


Q9. When parsing a string, what is at least one thing that the Scanner interface does for you that the
re.findall feature does not?

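- The Scanner interface (in Python, the undocumented re.Scanner class) parses a string into tokens. It takes a list of (pattern, action) pairs, tries the patterns at the current position, and runs the action associated with whichever pattern matched.

- The re.findall function, on the other hand, only finds all non-overlapping occurrences of a single pattern in a string and returns the matched text. It silently skips anything that does not match and does not tell you which alternative produced each result.

- A pattern such as `r'(?P<value>[^,\n]+),?'` matches any sequence of characters that does not include a comma or a newline, followed by an optional comma. The named group value captures each individual value.

- When you call scanner.scan(data), the Scanner reads the input string from the start and returns two things: the list of results produced by the actions, and the unmatched remainder of the string. So, unlike re.findall, it can classify or convert each token as it is found and report where scanning stopped.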
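A minimal sketch of this behaviour, assuming the Scanner in question is Python's re.Scanner. That class is not covered in the official re documentation, so treat the details as an assumption; the token names, the lexicon, and the input string are made up for illustration.

```python
import re

# Each lexicon entry pairs a pattern with an action. The action receives the
# scanner object and the matched text, and its return value becomes the token.
scanner = re.Scanner([
    (r"\d+", lambda s, tok: ("INT", int(tok))),
    (r"[a-zA-Z_]\w*", lambda s, tok: ("NAME", tok)),
    (r"[+\-*/=]", lambda s, tok: ("OP", tok)),
    (r"\s+", None),  # None: match the whitespace but emit no token
])

# scan() returns the tokens plus whatever text could not be matched.
tokens, remainder = scanner.scan("total = 3 + 42 ?")

print(tokens)
# [('NAME', 'total'), ('OP', '='), ('INT', 3), ('OP', '+'), ('INT', 42)]
print(repr(remainder))  # '?'
```

With re.findall, the same input would yield only strings, with no token types, no conversion to int, and no indication that the trailing "?" was left unmatched.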

Q10. Does a scanner object have to be named scanner?

- No, a scanner object can be named anything as long as it follows the rules for naming variables in the programming language being used. The name is simply a variable that refers to an instance of the Scanner class and does not affect the functionality of the object itself.