# Email Parser and Validator
####  **Objective:** 
   Develop a Python program that verifies if a string is a valid email address and also extracts parts of the email address.  
#### **Guideline:**
   a. Use regular expressions to determine if a string matches the standard format of an email address.  
   b. Write functions to extract and return the local-part and the domain of the email address.  
   c. Include error handling for non-matching strings.

### `pattern = r'^([a-zA-Z0-9_.+-]+)@([a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)$'`
* The **r** prefix in front of the pattern marks it as a raw string literal, so Python leaves backslashes untouched and passes them to the regular expression engine instead of interpreting them as escape characters.
* `^`								: The caret symbol matches the beginning of the string.
* `[a-zA-Z0-9_.+-]+`				: This character class matches one or more of the following characters: letters, numbers, underscores, periods, plus signs, or hyphens.
* `@`								: The at sign matches the @ character.
* `[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+`	: This sequence matches one or more letters, numbers, or hyphens, followed by a literal period, followed by one or more letters, numbers, hyphens, or periods, so a domain with several labels (such as `com.khm`) is accepted.
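* `$`								: The dollar sign matches the end of the string.
* `( ... )`							: The two pairs of parentheses in the full pattern are capturing groups: group 1 holds the local-part and group 2 holds the domain.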
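As a quick check of the breakdown above, the following sketch applies the pattern to a few sample strings and returns the two captured groups. The sample addresses are taken from the test list used later in this notebook; the helper name `split_email` is only illustrative.

```python
import re

# The pattern explained above; the two pairs of parentheses capture
# the local-part and the domain.
PATTERN = r'^([a-zA-Z0-9_.+-]+)@([a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)$'


def split_email(text):
    """Return (local_part, domain) if text matches PATTERN, else None."""
    match = re.match(PATTERN, text)
    return match.groups() if match else None


print(split_email("virea.kbot@wing.com"))      # ('virea.kbot', 'wing.com')
print(split_email("dara_k@cellcard.com.khm"))  # ('dara_k', 'cellcard.com.khm')
print(split_email("vat@databootcamp_com"))     # None (the domain has no period)
```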

In [1]:
import re

In [12]:
def parse_email_address2(email_string):
    # Define a regular expression pattern to match for a valid email address
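    # Earlier variants, kept for reference:
    #pattern = r'^([a-zA-Z0-9_.+-]+)@([a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)$'
    #pattern = r'^(\w|\.|\_|\-)+[@]\w+[.]\w{2,3}$'
    # Active pattern: the domain is one label, a period, then 2 or 3 word characters (\w)
    pattern = r'^([a-zA-Z0-9_.+-]+)@([a-zA-Z0-9-]+[.]\w{2,3})$'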

    # Use the re.match() function to compare the string to the pattern
    match = re.match(pattern, email_string)

    if match:
        # If the string matches the pattern, extract the Local-part and Domain-part
        local_part = match.group(1) #  returns the 1st (first) capturing group of a regular expression match
        domain = match.group(2)     #  returns the 2nd (second) capturing group of a regular expression match
        return local_part, domain
    else:
        # If the string does not match the pattern, raise an exception
        raise ValueError("Invalid email address!")

# Test the function with some example email addresses
test_addresses = [
    "dara_k@cellcard.com.khm",
    "tevy-smart@ezecom.comp",
    "virea.kbot@wing.com",
    "kaya_socheat_vat@databootcamp.com",
    "vat@databootcamp_com",
    "kaya@databootcamp.com",
    "kaya_socheat_vat_databootcamp@com.edu",
]

for address in test_addresses:
    try:
        local, domain = parse_email_address2(address)
        # The f prefix marks a formatted string literal (f-string): {address} is replaced by the variable's value
        print(f"Email address: {address}")
        print(f"Local-part: {local}")
        print(f"Domain-part: {domain}\n")
    except ValueError as e:
        print(f"Error while parsing {address}: {e}\n")


Error while parsing dara_k@cellcard.com.khm: Invalid email address!

Error while parsing tevy-smart@ezecom.comp: Invalid email address!

Email address: virea.kbot@wing.com
Local-part: virea.kbot
Domain-part: wing.com

Email address: kaya_socheat_vat@databootcamp.com
Local-part: kaya_socheat_vat
Domain-part: databootcamp.com

Error while parsing vat@databootcamp_com: Invalid email address!

Email address: kaya@databootcamp.com
Local-part: kaya
Domain-part: databootcamp.com

Email address: kaya_socheat_vat_databootcamp@com.edu
Local-part: kaya_socheat_vat_databootcamp
Domain-part: com.edu



In [5]:
import re
def parse_email_address2(email_string):
    # Define a regular expression pattern to match for a valid email address
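    # Earlier variants, kept for reference:
    #pattern = r'^([a-zA-Z0-9_.+-]+)@([a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)$'
    #pattern = r'^(\w|\.|\_|\-)+[@]\w+[.]\w{2,3}$'
    # Active pattern: the domain is one label, a period, then 2 or 3 word characters (\w)
    pattern = r'^([a-zA-Z0-9_.+-]+)@([a-zA-Z0-9-]+[.]\w{2,3})$'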

    # Use the re.match() function to compare the string to the pattern
    match = re.match(pattern, email_string)

    if match:
        # If the string matches the pattern, extract the Local-part and Domain-part
        local_part = match.group(1) #  returns the 1st (first) capturing group of a regular expression match
        domain = match.group(2)     #  returns the 2nd (second) capturing group of a regular expression match
        return local_part, domain
    else:
        # If the string does not match the pattern, raise an exception
        raise ValueError("Invalid email address!")

# Test the function with addresses read from a file, one address per line
# The with-statement closes the file as soon as its contents have been read
with open("EmailAddress.txt", "r") as address_file:
    test_addresses_res = address_file.read().split("\n")

for address in test_addresses_res:
    try:
        local, domain = parse_email_address2(address)
        # The f prefix marks a formatted string literal (f-string): {address} is replaced by the variable's value
        print(f"Email address: {address}")
        print(f"Local-part: {local}")
        print(f"Domain-part: {domain}\n")
    except ValueError as e:
        print(f"Error while parsing {address}: {e}\n")


Error while parsing michael3311980@yahoocom: Invalid email address!

Error while parsing michael426@earthlink_net: Invalid email address!

Email address: michael436253@yahoo.com
Local-part: michael436253
Domain-part: yahoo.com

Email address: michael4lsu@aol.com
Local-part: michael4lsu
Domain-part: aol.com

Email address: michael5018743605@yahoo.com
Local-part: michael5018743605
Domain-part: yahoo.com

Email address: michael5075_53110@yahoo.com
Local-part: michael5075_53110
Domain-part: yahoo.com

Email address: michael50_37@hotmail.com
Local-part: michael50_37
Domain-part: hotmail.com

Email address: michael5679@yahoo.com
Local-part: michael5679
Domain-part: yahoo.com

Email address: michael66713-lucre@yahoo.com
Local-part: michael66713-lucre
Domain-part: yahoo.com

Email address: michael66wn@hotmail.com
Local-part: michael66wn
Domain-part: hotmail.com

Email address: michael67_99@yahoo.com
Local-part: michael67_99
Domain-part: yahoo.com

Email address: michael81970@aol.com
Local-part

(Output truncated: the notebook server's IOPub data rate limit was exceeded.)
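For a large input file, printing three lines per address can produce more output than the notebook will display. One option, sketched here under the assumption that only the totals and the rejected entries are of interest, is to collect the results and print a short summary. The helper `summarize_addresses` is hypothetical, it skips blank lines (which the loop above does not do), and it uses the pattern active in `parse_email_address2`.

```python
import re

PATTERN = r'^([a-zA-Z0-9_.+-]+)@([a-zA-Z0-9-]+[.]\w{2,3})$'


def summarize_addresses(lines):
    """Split an iterable of lines into parsed (local, domain) pairs and rejects."""
    parsed, rejected = [], []
    for line in lines:
        address = line.strip()
        if not address:
            continue  # skip blank lines, e.g. a trailing newline at end of file
        match = re.match(PATTERN, address)
        if match:
            parsed.append(match.groups())
        else:
            rejected.append(address)
    return parsed, rejected


# In the notebook this could be fed from the file, for example:
#     with open("EmailAddress.txt") as address_file:
#         parsed, rejected = summarize_addresses(address_file)
sample = ["michael3311980@yahoocom\n", "michael436253@yahoo.com\n", "\n"]
parsed, rejected = summarize_addresses(sample)
print(f"{len(parsed)} valid, {len(rejected)} invalid")  # 1 valid, 1 invalid
print(rejected)  # ['michael3311980@yahoocom']
```

Passing the open file object directly iterates over it line by line, so the whole file does not need to be read into a single string first.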

