Q1. Explain the difference between greedy and non-greedy syntax with visual terms in as few words
as possible. What is the bare minimum effort required to transform a greedy pattern into a non-greedy
one? What character or characters can you introduce or change?



Ans-

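Greedy: Matches the longest possible substring (`<.*>` runs from the first "<" to the last ">").
Non-greedy: Matches the shortest possible substring (`<.*?>` stops at the first ">").

To transform greedy to non-greedy, add a single "?" immediately after the quantifier: "*" becomes "*?", "+" becomes "+?", and "{m,n}" becomes "{m,n}?".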
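As a minimal sketch of the difference, assuming Python's `re` module and a made-up sample string, the two patterns below differ only by the added "?":

```python
import re

# Hypothetical sample text containing several angle-bracket tags.
text = "<b>bold</b> and <i>italic</i>"

# Greedy: '.*' grabs as much as it can, so the match runs to the LAST '>'.
greedy = re.findall(r"<.*>", text)

# Non-greedy: '.*?' grabs as little as it can, so each match stops at the FIRST '>'.
lazy = re.findall(r"<.*?>", text)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```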








Q2. When exactly does greedy versus non-greedy make a difference?  What if you're looking for a
non-greedy match but the only one available is greedy?



Ans-


Greedy versus non-greedy makes a difference only when the pattern could match substrings of more than one length
from the same starting position, for example when the character that ends the match occurs more than once in the text.
In that case greedy returns the longest candidate and non-greedy returns the shortest.

If you ask for a non-greedy match but only one match is possible, the engine returns that match, which is the same
one the greedy form finds. A non-greedy quantifier prefers the shortest match; it does not fail when a longer match
is the only way for the overall pattern to succeed. If the result is still too broad, refine the pattern, for example
with a negated character class such as `[^>]*` or with lookaheads/lookbehinds, depending on the regular
expression engine you're using.
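A small sketch of both situations, assuming Python's `re` module; the sample strings and variable names are invented for illustration:

```python
import re

# Only one '>' exists, so greedy and non-greedy have a single candidate match.
single = "value=<42>"
greedy_single = re.search(r"<.*>", single).group()
lazy_single = re.search(r"<.*?>", single).group()
print(greedy_single, lazy_single)  # <42> <42>

# Two '>' characters exist, so the two forms now disagree.
multi = "<1> and <2>"
greedy_multi = re.search(r"<.*>", multi).group()
lazy_multi = re.search(r"<.*?>", multi).group()
print(greedy_multi)  # <1> and <2>
print(lazy_multi)    # <1>

# When the whole string must match, the non-greedy form is forced to
# expand until the overall pattern succeeds, giving the "greedy" result.
forced = re.fullmatch(r"<.*?>", multi).group()
print(forced)  # <1> and <2>
```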






Q3. In a simple match of a string, which looks only for one match and does not do any replacement, is
the use of a nontagged group likely to make any practical difference?




Ans-

In a simple match of a string where you are only looking for one match and not doing any replacement,
the use of a non-tagged group (a non-capturing group, written `(?:...)`) is unlikely to make any practical
difference. The pattern matches exactly the same text whether the group is tagged or not; the only change
is that the match object does not store the group's substring. If you never read the captured substring,
the outcome of the match is identical.
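To illustrate, here is a short sketch assuming Python's `re` module and an invented sample string. The overall match is the same in both cases; only the stored groups differ:

```python
import re

# Hypothetical input with a repeated chunk.
text = "order id: ab12ab12"

capturing = re.search(r"(ab12)+", text)
non_capturing = re.search(r"(?:ab12)+", text)

# The matched text is identical either way.
print(capturing.group(0))      # ab12ab12
print(non_capturing.group(0))  # ab12ab12

# Only the bookkeeping differs: one stores a group, the other stores none.
print(capturing.groups())      # ('ab12',)
print(non_capturing.groups())  # ()
```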



Q4. Describe a scenario in which using a nontagged category would have a significant impact on the
program's outcomes.




Ans-

A non-tagged category (non-capturing group) has a significant impact whenever the program's results depend
on which groups exist: when groups are used for back-references, or when a function returns captured groups
rather than the whole match.

For example, consider a search engine application that uses regular expressions to parse search queries.
Users can input complex queries with logical operators like AND, OR, and NOT. The application needs to
parse these queries to understand the user's intent.

Suppose you have a regex pattern to match individual keywords and another pattern to match logical operators.
If you want to find occurrences of specific keywords on either side of an OR operator, you might use a
pattern like this:

```regex
(?:keyword1|keyword2) OR (?:keyword3|keyword4)
```

Here, non-capturing groups `(?:...)` group the alternatives without creating capturing groups. The pattern
matches the same text either way, but the results your program receives can differ. In Python, `re.findall`
returns the whole match when the pattern has no capturing groups and only the captured groups when it has
them, `re.split` adds captured text to its output, and every extra capturing group shifts the numbers of the
groups after it, which can break existing back-references such as `\1`. Non-capturing groups allow you to
group expressions without affecting the capture groups used for other purposes, so the program's outcomes
stay as intended.
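The sketch below shows one way this can surface, assuming Python's `re` module; the query string and the keywords `cat`, `dog`, `fish`, and `bird` are stand-ins for the placeholder keywords above:

```python
import re

# Hypothetical search query with two OR clauses.
query = "cat OR dog, fish OR bird"

# With capturing groups, findall returns tuples of the captured pieces.
with_groups = re.findall(r"(cat|fish) OR (dog|bird)", query)
print(with_groups)     # [('cat', 'dog'), ('fish', 'bird')]

# With non-capturing groups, findall returns each whole match.
without_groups = re.findall(r"(?:cat|fish) OR (?:dog|bird)", query)
print(without_groups)  # ['cat OR dog', 'fish OR bird']

# re.split shows the same effect: captured separators are kept in the output.
split_capturing = re.split(r"\s+(OR)\s+", "cat OR dog")
split_non_capturing = re.split(r"\s+(?:OR)\s+", "cat OR dog")
print(split_capturing)      # ['cat', 'OR', 'dog']
print(split_non_capturing)  # ['cat', 'dog']
```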





Q5. Unlike a normal regex pattern, a look-ahead condition does not consume the characters it
examines. Describe a situation in which this could make a difference in the results of your
programme.




Ans-

Consider a text processing program that needs to validate passwords. The requirement is that the password
must be at least 8 characters long and must contain at least one digit.

A positive lookahead assertion lets you check for the digit without consuming any characters, so the rest of
the pattern still sees the whole password. Here's how it could be done:

```regex
^(?=.*\d).{8,}$
```

In this regex:

- `^` and `$` anchor the match to the start and end of the string (or of each line, in multiline mode).
- `(?=.*\d)` is the positive lookahead assertion. It checks if there is at least one digit (`\d`) ahead in the input string.
- `.{8,}` matches any character (except for a newline) at least 8 times, ensuring the total length is 8 characters or more.

By using the lookahead assertion, you are checking for the presence of a digit without consuming any characters.
After the lookahead succeeds, matching resumes at the start of the string, so `.{8,}` counts every character of
the password. If the digit check consumed characters, as in `^.*\d.{8,}$`, then `.{8,}` would only count the
characters after the digit, and a valid password such as "abcdefg1" would be rejected.
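A minimal sketch of this check, assuming Python's `re` module. The names `PASSWORD_RE`, `CONSUMING_RE`, and `is_valid_password` are hypothetical, and `CONSUMING_RE` is included only to show what goes wrong when the digit check consumes characters:

```python
import re

# Lookahead version: the digit check does not consume anything.
PASSWORD_RE = re.compile(r"^(?=.*\d).{8,}$")

# Consuming version (for comparison): '.{8,}' only counts characters AFTER the digit.
CONSUMING_RE = re.compile(r"^.*\d.{8,}$")


def is_valid_password(candidate: str) -> bool:
    """Return True if candidate has at least 8 characters and at least one digit."""
    return PASSWORD_RE.match(candidate) is not None


print(is_valid_password("abcdefg1"))  # True: 8 characters, one digit
print(is_valid_password("abcdefgh"))  # False: no digit
print(is_valid_password("abc1"))      # False: too short

# The consuming pattern wrongly rejects a valid password.
print(CONSUMING_RE.match("abcdefg1") is not None)  # False
```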





Q6. In regular expressions, what is the difference between positive look-ahead and negative
look-ahead?



Ans-

In regular expressions, both positive lookahead and negative lookahead are types of lookahead assertions used
to validate patterns in the input string without consuming characters. The main difference lies in what they
are looking for:

1. **Positive Look-Ahead (`(?=...)`)**
   - **Syntax:** `(?=...)`
   - **Meaning:** Asserts that a certain pattern **must** match immediately ahead of the current position
    for the overall pattern to match.
   - **Example:** The positive lookahead `(?=\d)` requires the next character to be a digit.
    For instance, the pattern `\b(?=\d)\w+` matches any word that begins with a digit, while `\w+(?=\d)`
    matches word characters that are immediately followed by a digit.

2. **Negative Look-Ahead (`(?!...)`)**
   - **Syntax:** `(?!...)`
   - **Meaning:** Asserts that a certain pattern **must not** match immediately ahead of the current position
    for the overall pattern to match.
   - **Example:** The negative lookahead `(?!@)` requires that the next character is not the "@" symbol.
    For instance, the pattern `\b\w+\b(?!@)` matches any whole word that is not followed by "@". Without the
    `\b` anchors, `\w+(?!@)` can backtrack and match part of a word, such as "use" in "user@".

In summary, positive lookahead ensures the presence of a pattern, while negative lookahead ensures the absence
of a pattern for a successful match.
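The two assertions can be compared side by side in a short sketch, assuming Python's `re` module and a made-up sample string:

```python
import re

# Hypothetical sample text.
text = "file1 2fast user@host admin"

# Positive lookahead: keep only words whose first character is a digit.
starts_with_digit = re.findall(r"\b(?=\d)\w+", text)
print(starts_with_digit)  # ['2fast']

# Negative lookahead: keep only whole words NOT immediately followed by '@'.
not_before_at = re.findall(r"\b\w+\b(?!@)", text)
print(not_before_at)      # ['file1', '2fast', 'host', 'admin']

# Without word boundaries the engine backtracks and returns a partial word.
partial = re.findall(r"\w+(?!@)", "user@host")
print(partial)            # ['use', 'host']
```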





Q7. What is the benefit of referring to groups by name rather than by number in a regular
expression?



Ans-

Referring to groups by name in a regular expression provides several benefits over using group numbers:

1. **Readability:** Named groups make regular expressions more readable and self-explanatory.
    Instead of referring to capturing groups by their numeric indices, you can use descriptive names
    that indicate the purpose of each group in the pattern. This improves the clarity of the regex pattern,
    especially for complex expressions.

2. **Maintainability:** Named groups enhance the maintainability of regular expressions. If the structure
    of the regex pattern changes, using named groups allows you to modify the pattern without having to
    update all the code that references specific group numbers. This flexibility simplifies maintenance tasks.

3. **Self-Documenting:** Named groups act as documentation within the regular expression itself.
    By using meaningful names for groups, you provide context for what each part of the pattern represents.
    This can be immensely helpful for anyone reading or modifying the regex in the future, as they can
    understand the purpose of each group without diving deep into the pattern logic.

4. **Avoiding Pitfalls:** When you modify the regex pattern and change the order or number of capturing groups,
    referencing groups by number can lead to errors if the numbering gets out of sync. Named groups
    eliminate this risk, as you can refer to groups by their names regardless of their order in the pattern.

5. **Clarity in Code:** When using the regex in programming languages that support named groups,
    such as Python, Perl, or .NET languages, accessing named groups in the code is more intuitive and expressive.
    You can use meaningful names directly in your code, enhancing readability and making the code logic clearer.

In summary, using named groups in regular expressions improves readability, maintainability,
and self-documentation while also reducing the risk of errors and enhancing code clarity when
integrating the regex pattern into programming logic.
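As a brief illustration, assuming Python's `re` module, the sketch below parses a date with named groups. The pattern name `DATE_RE`, the group names, and the sample text are invented for this example:

```python
import re

# Each group is labeled with (?P<name>...), so the code never relies on positions.
DATE_RE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

match = DATE_RE.search("released on 2023-10-05")

# Access by name reads clearly and survives reordering of the groups.
print(match.group("year"))   # 2023
print(match.group("month"))  # 10

# Numbered access still works, but depends on the group's position.
print(match.group(1))        # 2023

# All named groups are available as a dictionary.
print(match.groupdict())     # {'year': '2023', 'month': '10', 'day': '05'}
```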






Q8. Can you identify repeated items within a target string using named groups, as in "The cow
jumped over the moon"?





Ans-


Yes. Repeated items are found with back-references, and named groups work for this just as numbered groups do.
In Python, `(?P<word>\w+)` names the group and `(?P=word)` refers back to whatever that group matched. The name
improves readability and maintenance; the back-reference does the actual work of detecting the repeat.
In "The cow jumped over the moon", the repeated item is "The"/"the", so the search must also ignore case.

With a numbered capturing group and back-reference, the pattern looks like this:

```regex
\b(\w+)\b(?=.*\b\1\b)
```

In this pattern:

- `\b` matches word boundaries.
- `(\w+)` captures one or more word characters.
- `(?=.*\b\1\b)` is a positive lookahead assertion that checks if the captured word appears again
(`\1` is the back-reference to the first capturing group).

The named equivalent in Python is `\b(?P<word>\w+)\b(?=.*\b(?P=word)\b)`. Using either pattern with a tool or
programming language that supports regex, you can identify repeated words within a given string. Note that the
specific syntax for named groups and back-references varies slightly depending on the programming language or
regex tool you are using.
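A minimal sketch of the named-group form, assuming Python's `re` module; the names `REPEAT_RE` and `word` are chosen for illustration:

```python
import re

sentence = "The cow jumped over the moon"

# (?P<word>...) names the captured word; (?P=word) is the back-reference to it.
# re.IGNORECASE lets "The" match the later "the".
REPEAT_RE = re.compile(
    r"\b(?P<word>\w+)\b(?=.*\b(?P=word)\b)",
    re.IGNORECASE,
)

repeated = [m.group("word") for m in REPEAT_RE.finditer(sentence)]
print(repeated)  # ['The']

# Without IGNORECASE, "The" and "the" are different, so nothing repeats.
case_sensitive = re.findall(r"\b(?P<word>\w+)\b(?=.*\b(?P=word)\b)", sentence)
print(case_sensitive)  # []
```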







Q9. When parsing a string, what is at least one thing that the Scanner interface does for you that the
re.findall feature does not?





Ans-

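In Python, the question most likely refers to the undocumented `re.Scanner` class. It takes a list of
(pattern, action) pairs and calls the matching action for each token it finds, so tokens can be labeled or
converted as they are scanned, and it also returns any unmatched remainder of the string. `re.findall` only
returns the matched text. More generally, Scanner interfaces in other languages (such as Java) and the
`re.findall` function in Python are used for different purposes.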

1. **Scanner Interface (Java, for example):**
   - Provides a way to read input from various sources, not just strings. It can read from files,
    input streams, or other sources.
   - Allows you to parse different types of tokens from the input, not just regular expressions.
    You can parse integers, floats, strings, etc., using different methods provided by the Scanner class.
   - Offers more control over the parsing process, allowing you to customize how input is tokenized and processed.

2. **`re.findall` in Python:**
   - Specifically used for finding all occurrences of a regular expression pattern in a given string.
   - Limited to working with strings and regular expressions. It doesn't provide general-purpose input
    parsing capabilities like the Scanner interface.
   - Returns a list of matched substrings, making it easy to extract specific patterns from a string based
    on a regular expression.

In summary, while `re.findall` is excellent for finding specific patterns in strings using regular expressions,
the Scanner interface provides a broader set of functionalities, including reading from various sources, 
tokenizing input into different data types, and offering more control over the parsing process.
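If the question refers to Python's own `re.Scanner`, the following sketch shows the idea. `re.Scanner` is an undocumented class in the standard `re` module, so treat its behavior as an assumption that may change between versions; the token labels and the sample expression are invented here:

```python
import re

# Each entry pairs a pattern with an action called as action(scanner, token).
# An action of None means "match and discard" (used here for whitespace).
tokenizer = re.Scanner([
    (r"\d+\.\d+", lambda s, tok: ("FLOAT", float(tok))),  # must come before INT
    (r"\d+", lambda s, tok: ("INT", int(tok))),
    (r"[A-Za-z_]\w*", lambda s, tok: ("NAME", tok)),
    (r"[+\-*/=]", lambda s, tok: ("OP", tok)),
    (r"\s+", None),
])

# scan() returns the processed tokens plus whatever text it could not match.
tokens, remainder = tokenizer.scan("rate = 3.5 * 12")
print(tokens)
# [('NAME', 'rate'), ('OP', '='), ('FLOAT', 3.5), ('OP', '*'), ('INT', 12)]
print(repr(remainder))  # ''

# Unrecognized input stops the scan and is reported instead of silently skipped.
partial_tokens, leftover = tokenizer.scan("x = $")
print(partial_tokens)  # [('NAME', 'x'), ('OP', '=')]
print(leftover)        # $
```

By contrast, `re.findall` would return only the matched strings, with no per-token conversion and no report of the text it skipped.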






Q10. Does a scanner object have to be named scanner?


Ans-


No, a Scanner object does not have to be named "scanner." In most programming languages, including Java,
where the Scanner class is commonly used for input parsing, you can choose any valid variable name for your
Scanner object. The name you give to your Scanner object should be meaningful and descriptive to enhance the
readability and understanding of your code.

For example, you could name your Scanner object `inputScanner`, `userInput`, or any other name that reflects
the purpose of the object in your program. The choice of the variable name is a matter of programming style
and clarity, as long as it follows the rules of the programming language you are using.