This notebook provides a step-by-step walkthrough of the email_verifier.py script, demonstrating its functionality for validating email addresses in a CSV file.

1. Setup and Installation
First, ensure you have the necessary libraries installed. If running this notebook in a new environment, execute the following cell to install pandas and dnspython.

In [2]:
%pip install pandas dnspython




2. Import Libraries
Now, let's import the required Python libraries.

In [3]:
import pandas as pd
import re
import dns.resolver
import os

3. Define the Email Validation Function
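This is the core logic of the email_verifier.py script. It runs three checks in order: a regex syntax check, a domain existence check (A/AAAA records), and an MX record lookup. Because the A/AAAA check runs first, a domain that publishes MX records but has no address record is reported as "Domain No A/AAAA Record" rather than "Valid".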

In [5]:
# Constants for email validation
EMAIL_REGEX = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

def validate_email_address(email_address: str) -> str:
    """
    Performs a series of checks to validate an email address.
    Checks include syntax, domain existence (A/AAAA records), and MX records.

    Args:
        email_address (str): The email address to validate.

    Returns:
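        str: A status string indicating the validation result: "Valid",
             "Invalid Type", "Empty Email", "Invalid Syntax", "Invalid Format",
             "Domain No A/AAAA Record", "No MX Records", "Domain Not Found",
             "No Nameservers", "DNS Timeout", or a message starting with
             "DNS Error:" or "Unexpected Error:".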
    """
    if not isinstance(email_address, str):
        return "Invalid Type"
    
    email_address = email_address.strip()
    if not email_address:
        return "Empty Email"

    # 1. Basic Regex Syntax Check
    if not re.match(EMAIL_REGEX, email_address):
        return "Invalid Syntax"

    # Split email into local part and domain
    try:
        _, domain = email_address.split('@')
    except ValueError:
        return "Invalid Format"

    # 2. Domain Existence Check (A or AAAA records) and MX Record Check
    try:
        # Check for A (IPv4) or AAAA (IPv6) records for the domain
        try:
            dns.resolver.resolve(domain, 'A')
        except dns.resolver.NoAnswer:
            try:
                dns.resolver.resolve(domain, 'AAAA')
            except dns.resolver.NoAnswer:
                return "Domain No A/AAAA Record"
        
        # Check for MX (Mail Exchange) records
        try:
            dns.resolver.resolve(domain, 'MX')
            return "Valid"
        except dns.resolver.NoAnswer:
            return "No MX Records"
        except dns.resolver.NXDOMAIN:
            return "Domain Not Found"
        except dns.resolver.Timeout:
            return "DNS Timeout"
        except Exception as e:
            return f"DNS Error: {e}"

    except dns.resolver.NXDOMAIN:
        return "Domain Not Found"
    except dns.resolver.NoNameservers:
        return "No Nameservers"
    except dns.resolver.Timeout:
        return "DNS Timeout"
    except Exception as e:
        return f"Unexpected Error: {e}"

print("Email validation function defined.")

Email validation function defined.
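
The syntax check can be exercised on its own, without any DNS traffic. The sketch below copies EMAIL_REGEX from the cell above and runs it against a few hand-picked strings; the `passes_syntax_check` helper and the test strings are assumptions made for illustration, not part of email_verifier.py. It suggests the pattern works as a permissive first filter rather than a full address parser: it rejects addresses without a dot-separated TLD, but accepts some questionable forms such as consecutive dots in the local part.

```python
import re

# Same pattern as the validation cell above.
EMAIL_REGEX = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

def passes_syntax_check(email_address: str) -> bool:
    """Return True if the stripped address matches EMAIL_REGEX (hypothetical helper)."""
    return re.match(EMAIL_REGEX, email_address.strip()) is not None

# Illustrative inputs chosen for this sketch.
samples = [
    "alice@abccorp.com",     # plain address
    "ivy@oldcompany.co.uk",  # multi-label domain
    "diana.prince@example",  # no dot-separated TLD
    "a@b@example.com",       # two '@' signs
    "a..b@example.com",      # consecutive dots in the local part
]

for sample in samples:
    print(f"{sample!r}: {passes_syntax_check(sample)}")
```

The first two samples and the last one pass, while the third and fourth are rejected, so stricter rules (if needed) would have to be added on top of the regex.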


4. Create Sample Data
For demonstration purposes, let's create a sample_leads.csv file with a mix of valid and invalid email addresses. In a real scenario, you would use a CSV exported from SaaSquatch Leads.

In [6]:
data = {
    'Name': ['Alice Smith', 'Bob Johnson', 'Charlie Brown', 'Diana Prince', 'Eve Adams', 'Frank White', 'Grace Lee', 'Henry Green', 'Ivy King', 'Jack Black'],
    'Company': ['ABC Corp', 'XYZ Inc', 'Fictional Co', 'Wonder Ent', 'Ghost Corp', 'Valid Tech', 'Service Co', 'Innovate Ltd', 'Old Company', 'Test Corp'],
    'Email Address': [
        # Expected statuses are noted per address; actual results depend on live DNS.
        'alice@abccorp.com',              # Valid
        'bob@xyzinc.com',                 # Valid
        'charlie@invaliddomainxyz123.com',# Non-existent domain (will likely be 'Domain Not Found')
        'diana.prince@example',           # Invalid syntax (no dot-separated TLD)
        'eve@no-mx-record-domain.com',    # Placeholder for a no-MX domain; yields 'Domain Not Found' if the name is unregistered
        'frank.white@gmail.com',          # Valid (common free email)
        'grace@serviceco.org',            # Valid
        'henry.green@innovateltd.net',    # Valid
        'ivy@oldcompany.co.uk',           # Valid
        'jack@testcorp.io'                # Valid
    ],
    'Phone': [
        '123-456-7890', '098-765-4321', '555-123-4567', '444-555-6666',
        '333-222-1111', '111-222-3333', '222-333-4444', '777-888-9999',
        '999-000-1111', '123-987-6543'
    ]
}
df_input = pd.DataFrame(data)
input_filename = 'sample_leads.csv'
df_input.to_csv(input_filename, index=False)


print(f"Sample leads created and saved to {input_filename}:")
print(df_input)

Sample leads created and saved to sample_leads.csv:
            Name       Company                    Email Address         Phone
0    Alice Smith      ABC Corp                alice@abccorp.com  123-456-7890
1    Bob Johnson       XYZ Inc                   bob@xyzinc.com  098-765-4321
2  Charlie Brown  Fictional Co  charlie@invaliddomainxyz123.com  555-123-4567
3   Diana Prince    Wonder Ent             diana.prince@example  444-555-6666
4      Eve Adams    Ghost Corp      eve@no-mx-record-domain.com  333-222-1111
5    Frank White    Valid Tech            frank.white@gmail.com  111-222-3333
6      Grace Lee    Service Co              grace@serviceco.org  222-333-4444
7    Henry Green  Innovate Ltd      henry.green@innovateltd.net  777-888-9999
8       Ivy King   Old Company             ivy@oldcompany.co.uk  999-000-1111
9     Jack Black     Test Corp                 jack@testcorp.io  123-987-6543
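
The sample domains above look realistic, so their statuses depend on whatever those names resolve to when the notebook runs. As a hedged alternative, sample addresses can be built from names reserved for documentation and testing (example.com, example.net, example.org, and TLDs such as .invalid, per RFC 2606 and RFC 6761). The sketch below checks whether an address uses such a name; `uses_reserved_name` and the sample list are assumptions made for illustration.

```python
# Names reserved for documentation and testing (RFC 2606 / RFC 6761).
RESERVED_DOMAINS = ("example.com", "example.net", "example.org")
RESERVED_TLDS = ("invalid", "test", "example")

def uses_reserved_name(email_address: str) -> bool:
    """Return True if the address's domain is a reserved name (hypothetical helper)."""
    domain = email_address.strip().rsplit("@", 1)[-1].lower()
    if domain in RESERVED_DOMAINS:
        return True
    # Subdomains of the reserved second-level names also count.
    if any(domain.endswith("." + reserved) for reserved in RESERVED_DOMAINS):
        return True
    # Otherwise compare the last label against the reserved TLDs.
    return domain.rsplit(".", 1)[-1] in RESERVED_TLDS

# Illustrative addresses, not taken from the sample CSV.
sample_emails = [
    "alice@example.com",
    "charlie@no-such-domain.invalid",
    "frank.white@gmail.com",
]

for email in sample_emails:
    print(email, uses_reserved_name(email))
```

Names under .invalid are reserved so that they should never resolve, which makes them a more predictable stand-in for the "non-existent domain" case.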


5. Apply Validation and Show Results
Now, let's load the sample_leads.csv and apply our validate_email_address function to the "Email Address" column. The results will be stored in a new column.

In [7]:
# Load the sample data written in the previous step
df_leads = pd.read_csv(input_filename)

# Apply the validation function to the specified column
# .fillna('') is used to treat NaN values as empty strings for validation
df_leads['Email_Validation_Status'] = df_leads['Email Address'].fillna('').apply(validate_email_address)

# Define output filename
output_filename = 'sample_leads_validated.csv'

# Save the DataFrame with the new validation status to a new CSV file
df_leads.to_csv(output_filename, index=False)

print(f"\nValidation complete. Results saved to: {output_filename}")
print("\nValidated Leads:")
print(df_leads)


Validation complete. Results saved to: sample_leads_validated.csv

Validated Leads:
            Name       Company                    Email Address         Phone  \
0    Alice Smith      ABC Corp                alice@abccorp.com  123-456-7890   
1    Bob Johnson       XYZ Inc                   bob@xyzinc.com  098-765-4321   
2  Charlie Brown  Fictional Co  charlie@invaliddomainxyz123.com  555-123-4567   
3   Diana Prince    Wonder Ent             diana.prince@example  444-555-6666   
4      Eve Adams    Ghost Corp      eve@no-mx-record-domain.com  333-222-1111   
5    Frank White    Valid Tech            frank.white@gmail.com  111-222-3333   
6      Grace Lee    Service Co              grace@serviceco.org  222-333-4444   
7    Henry Green  Innovate Ltd      henry.green@innovateltd.net  777-888-9999   
8       Ivy King   Old Company             ivy@oldcompany.co.uk  999-000-1111   
9     Jack Black     Test Corp                 jack@testcorp.io  123-987-6543   

  Emai
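
Each call to validate_email_address can issue up to three DNS queries, and leads from the same company usually share a domain. One possible optimization, sketched below, is to cache the DNS verdict per domain with functools.lru_cache. Here `check_domain` and `validate_with_cache` are hypothetical helpers, and `fake_dns_lookup` is an offline stub standing in for the real queries; none of them exist in email_verifier.py.

```python
from functools import lru_cache

lookup_calls = []  # records every domain that reaches the stubbed lookup

def fake_dns_lookup(domain: str) -> str:
    """Offline stub standing in for the A/AAAA and MX queries."""
    lookup_calls.append(domain)
    return "Domain Not Found" if domain.endswith(".invalid") else "Valid"

@lru_cache(maxsize=None)
def check_domain(domain: str) -> str:
    """Return the DNS verdict for a domain, computed once per domain."""
    return fake_dns_lookup(domain)

def validate_with_cache(email_address: str) -> str:
    """Look up the address's domain, reusing earlier results."""
    # Assumes the syntax check has already passed.
    domain = email_address.strip().rsplit("@", 1)[-1].lower()
    return check_domain(domain)

emails = ["alice@example.com", "bob@example.com", "eve@missing.invalid"]
statuses = [validate_with_cache(email) for email in emails]

print(statuses)
print(f"{len(emails)} addresses, {len(lookup_calls)} lookups")
```

With the stub, three addresses trigger only two lookups. In a real run it may be worth excluding transient results such as "DNS Timeout" from the cache, since lru_cache would otherwise keep returning them for the rest of the session.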

6. Conclusion
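This notebook demonstrates the core functionality of the SaaSquatch Lead Quality Enhancer: Email Verifier. A "Valid" status means the address is well formed and its domain publishes address and MX records; it does not confirm that the individual mailbox exists. Within that limit, clear validation statuses help sales and marketing teams to:

Focus on leads whose email domains can actually receive mail.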

Reduce wasted outreach efforts.

Improve sender reputation.

Enhance the overall quality and reliability of their lead database.

This "Quality First" approach delivers significant business value by making lead generation efforts more efficient and effective.
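
One way to act on these statuses is to group them by what they imply for outreach. The sketch below is an assumption about how a team might triage results, not part of email_verifier.py: the `triage_status` helper and its three bucket names are hypothetical. It treats timeouts and resolver failures as transient, so those leads are retried rather than discarded.

```python
# Statuses from validate_email_address that likely reflect a temporary
# DNS problem rather than a bad address (an assumption of this sketch).
RETRY_STATUSES = {"DNS Timeout", "No Nameservers"}
ERROR_PREFIXES = ("DNS Error", "Unexpected Error")

def triage_status(status: str) -> str:
    """Map a validation status to 'keep', 'retry', or 'review'."""
    if status == "Valid":
        return "keep"
    # Transient failures say nothing about the address itself.
    if status in RETRY_STATUSES or status.startswith(ERROR_PREFIXES):
        return "retry"
    # Syntax errors, missing domains, and missing MX records need a
    # manual look or removal from the list.
    return "review"

for status in ["Valid", "DNS Timeout", "No MX Records", "Invalid Syntax"]:
    print(f"{status}: {triage_status(status)}")
```

Applied to the notebook's DataFrame, the same function could be passed to `.apply` on the Email_Validation_Status column to add an action column.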