# Win32com

- ```win32com``` is a Python package that ships with the PyWin32 library. It lets Python communicate with Windows applications such as Outlook, Excel, and Word through COM **(Component Object Model)** interfaces. In short, it lets Python automate Outlook much as a VBA macro would.
    - ```win32com.client``` is the submodule used for controlling these applications as a client. You use it to start a COM application (Outlook, Excel, etc.), send it commands (read emails, write Excel files, etc.), and access its objects (folders, messages, workbooks, etc.).
    - **COM (Component Object Model)** is a Microsoft technology that allows different programs to talk to each other. You can think of it as a standard way for Windows programs to expose their features and objects (emails, folders, Excel cells, etc.) so that other programs, such as a Python script, can control them.
        - For instance, Outlook exposes a COM interface where folders, messages, etc. are accessible.

- **Installing PyWin32**:
    - ```bash
        pip install pywin32
        ```

## Key Concepts for Outlook Automation

- **Connect to Outlook application:**
    - ```python
        import win32com.client

        outlook = win32com.client.Dispatch("Outlook.Application")
        namespace = outlook.GetNamespace("MAPI")
        ```

        - ```outlook```: A COM object representing the Outlook application (used to access all functionality).
        - ```Dispatch("Outlook.Application")```: Launches Outlook or attaches to an already running Outlook process.
        - ```.GetNamespace("MAPI")```: Returns the MAPI (Messaging API) namespace, the top-level object through which all Outlook data (Inbox, Calendar, Contacts, etc.) is accessed.


- **Accessing Folders:**
    - ```python
        inbox = namespace.GetDefaultFolder(6)
        ```

        - **6** is Outlook's built-in ID for the Inbox folder. Other IDs include:
            - 5: Sent Mail
            - 3: Deleted Items
            - 4: Outbox
            - 16: Drafts
            - 9: Calendar
    - Once you have the Inbox, you can navigate to subfolders:
        - ```python
            personal_folder = inbox.Folders["Personal Folder"]
            ```
    - You can keep chaining:
        - ```python
            from_leah = personal_folder.Folders['Leah']
            ```
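
The folder IDs and the subfolder chaining above can be wrapped in two small helpers. This is a minimal sketch: `DefaultFolder` and `get_subfolder` are hypothetical names introduced here, not part of `win32com`, and the enum only covers the IDs listed above.

```python
from enum import IntEnum


class DefaultFolder(IntEnum):
    """Hypothetical enum naming the Outlook default-folder IDs listed above."""
    DELETED_ITEMS = 3
    OUTBOX = 4
    SENT_MAIL = 5
    INBOX = 6
    CALENDAR = 9
    DRAFTS = 16


def get_subfolder(root, *names):
    """Follow a chain of subfolder names starting from `root`."""
    folder = root
    for name in names:
        # Equivalent to root.Folders[a].Folders[b]...; an unknown name raises an error
        folder = folder.Folders[name]
    return folder


# Usage (needs a running Outlook, so shown as comments):
# inbox = namespace.GetDefaultFolder(int(DefaultFolder.INBOX))
# from_leah = get_subfolder(inbox, "Personal Folder", "Leah")
```

Named constants make a call such as `GetDefaultFolder(6)` easier to read, and the helper avoids one intermediate variable per folder level.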

- **Accessing Mail Items:**
    - Each folder contains ```.Items```, a **collection** of emails and possibly other Outlook items (meetings, etc.)
    - ```python
        for item in personal_folder.Items:
            if item.Class == 43: # item.Class returns an integer
                print(item.Subject)
        ```
        - You filter by ```item.Class``` because ```.Items``` can include:
            - 43: MailItem
            - 26: AppointmentItem (calendar)
            - 48: TaskItem
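
The `item.Class` check can be pulled into a reusable generator so that every loop does not repeat the magic number. A minimal sketch, where `OL_MAIL`, `ITEM_CLASS_NAMES`, `iter_mail_items`, and `count_item_types` are hypothetical names chosen for illustration:

```python
from collections import Counter

OL_MAIL = 43  # item.Class value for MailItem, as listed above
ITEM_CLASS_NAMES = {43: "MailItem", 26: "AppointmentItem", 48: "TaskItem"}


def iter_mail_items(items):
    """Yield only the MailItem objects from an Outlook Items collection."""
    for item in items:
        # getattr guards against objects that do not expose Class at all
        if getattr(item, "Class", None) == OL_MAIL:
            yield item


def count_item_types(items):
    """Count how many items of each known class a collection holds."""
    return Counter(
        ITEM_CLASS_NAMES.get(item.Class, f"Other ({item.Class})") for item in items
    )


# Usage (needs Outlook): for mail in iter_mail_items(personal_folder.Items): ...
```

Running `count_item_types` on a folder first is a quick way to see whether it holds anything other than emails.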


- **Useful MailItem Properties in Outlook:**

    - When working with Outlook emails using `win32com`, you interact with `MailItem` objects (Class ID `43`). Below are some of the most useful properties you can access:

| Property          | Type     | Description |
|------------------|----------|-------------|
| `Subject`         | `str`    | The subject line of the email |
| `Body`            | `str`    | The plain text body of the email |
| `HTMLBody`        | `str`    | The body of the email in HTML format |
| `SenderName`      | `str`    | The display name of the sender |
| `SenderEmailAddress` | `str` | The email address of the sender |
| `To`              | `str`    | The recipient(s) in the "To" field |
| `CC`              | `str`    | The recipients in the "CC" field |
| `ReceivedTime`    | `datetime` | The date and time the email was received |
| `SentOn`          | `datetime` | The date and time the email was sent |
| `Attachments`     | `Attachments` collection | Use `.Count`, `.Item(index)` to access attachments |
| `Categories`      | `str`    | The category names assigned to the email (e.g., "Control Request") |
| `UnRead`          | `bool`   | `True` if the message is unread |
| `EntryID`         | `str`    | Unique ID for the email (used to retrieve the item again later) |
| `Parent`          | `Folder` object | The folder where the message resides |

- Example Usage:
    - ```python
        for item in folder.Items:
            if item.Class == 43:  # MailItem
                print("Subject:", item.Subject)
                print("Received:", item.ReceivedTime)
                print("Body starts with:", item.Body[:50])
                print("Category:", item.Categories)
        ```
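
When the values are needed after the loop (for example, to write them to Excel), it can help to copy the properties into a plain dictionary straight away. The sketch below uses only properties from the table; `mail_to_dict` is a hypothetical helper, and splitting `Categories` on commas or semicolons is an assumption, since the separator depends on the regional settings of the machine.

```python
import re


def mail_to_dict(item):
    """Copy a few MailItem properties into a plain Python dict."""
    # Categories is one delimited string; the separator is assumed to be "," or ";"
    categories = [c.strip() for c in re.split(r"[;,]", item.Categories) if c.strip()]
    return {
        "subject": item.Subject,
        "sender": item.SenderEmailAddress,
        "received": item.ReceivedTime,
        "categories": categories,
        "unread": bool(item.UnRead),
        "attachment_count": item.Attachments.Count,
    }
```

A list of categories allows an exact test such as `"Control Request" in record["categories"]`, which is stricter than a substring check on the raw string.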

- **Filtering with ```.Restrict()```**
    - Looping over every email is slow for large mailboxes. Instead, you can filter (like SQL ```WHERE```) with ```.Restrict()```.
    - ```python
        filtered_items = personal_folder.Items.Restrict("[Categories] = 'From Leah'")
        ```
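    - This uses Outlook's **Jet** filter syntax, where property names go in square brackets. (**DASL** is Outlook's other filter syntax; those filters start with ```@SQL=```.) You can filter by: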
        - ```[Subject]```
        - ```[Categories]```
        - ```[ReceivedTime]```
        - ```[SenderEmailAddress]```
        - etc.
    - ```python
        filtered_items = personal_folder.Items.Restrict("[ReceivedTime] >= '01/01/2024'")  # The date must be a string in US date format ('MM/DD/YYYY HH:MM AM/PM')
        ```
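
Filter strings are easy to get wrong by hand (a missing quote is enough to break them), so it can be worth building them with small functions. This is a sketch under the assumptions stated above: dates are formatted in the US style, and the helper names `received_since`, `category_is`, and `all_of` are made up for this example.

```python
from datetime import datetime


def received_since(cutoff):
    """Build a filter for items received at or after `cutoff` (a datetime)."""
    stamp = cutoff.strftime("%m/%d/%Y %I:%M %p")  # e.g. 01/01/2024 12:00 AM
    return f"[ReceivedTime] >= '{stamp}'"


def category_is(name):
    """Build a filter that matches one category name."""
    if "'" in name or '"' in name:
        # Quote escaping is deliberately left out of this sketch
        raise ValueError("category names containing quotes are not supported here")
    return f"[Categories] = '{name}'"


def all_of(*clauses):
    """Combine several filter clauses with AND."""
    return " AND ".join(f"({clause})" for clause in clauses)


# Usage (needs Outlook):
# query = all_of(category_is("From Leah"), received_since(datetime(2024, 1, 1)))
# filtered_items = personal_folder.Items.Restrict(query)
```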

- **Sorting with ```.Sort()```**
    - In Outlook automation with ```win32com```, you can use the ```.Sort()``` method to **sort items within a folder (like emails in your Inbox) before looping through them.** This is especially useful if you are only interested in the newest or oldest emails.
    - ```python
        items.Sort("[ReceivedTime]", True) # True sorts in descending order (newest first). False sorts in ascending order.
        ```
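
One detail worth showing: `.Sort()` acts on the collection object it is called on, and each access to `folder.Items` returns a new collection. Sorting `folder.Items` and then looping over `folder.Items` again would therefore lose the ordering. A minimal sketch that keeps a single reference and returns the newest few items (`newest_items` is a hypothetical helper name):

```python
from itertools import islice


def newest_items(folder, limit=10):
    """Return up to `limit` items from `folder`, newest first."""
    items = folder.Items                # keep ONE reference to the collection
    items.Sort("[ReceivedTime]", True)  # True = descending order
    return list(islice(items, limit))   # stop early instead of reading every item


# Usage (needs Outlook): latest_five = newest_items(inbox, limit=5)
```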

- **Loop over Subfolders**
    - ```python
        for subfolder in main_folder.Folders:  # .Folders is the collection of subfolders inside main_folder
            print(subfolder.Name)  # main_folder is a single Folder object and cannot be looped over directly; loop over its .Folders (or .Items) instead
        ```
    - ```python
        for subfolder in main_folder.Folders:
            for item in subfolder.Items: # .Items represent a collection of items inside each subfolder
                print(item.Subject)
        ```
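
The loops above go one level deep. When folders are nested further, a recursive generator can visit every level. A minimal sketch, assuming each folder exposes `.Name` and `.Folders` as described above; `walk_folders` is a hypothetical helper:

```python
def walk_folders(folder, path=""):
    """Yield (path, folder) for `folder` and every folder nested inside it."""
    current = f"{path}/{folder.Name}" if path else folder.Name
    yield current, folder
    for subfolder in folder.Folders:
        # Recurse so that sub-subfolders are visited too
        yield from walk_folders(subfolder, current)


# Usage (needs Outlook):
# for path, folder in walk_folders(main_folder):
#     print(path, folder.Items.Count)
```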
    

- **Example**
    - ```python
        import win32com.client

        # Connect to Outlook and get the MAPI namespace
        outlook = win32com.client.Dispatch('Outlook.Application')
        namespace = outlook.GetNamespace('MAPI')

        # Navigate to the target folder
        inbox = namespace.GetDefaultFolder(6)  # 6 = Inbox
        personal_inbox = inbox.Folders['Personal Folder']
        from_leah = personal_inbox.Folders['From Leah']

        # Get mail items and sort by ReceivedTime (newest first)
        items = from_leah.Items
        items.Sort('[ReceivedTime]', True)

        # Loop through emails and print key details
        for item in items:
            if item.Class == 43:  # Ensure it's a MailItem
                print('Subject:', item.Subject)
                print('Body:', item.Body)
                print('ReceivedTime:', item.ReceivedTime)
                print('Sender Email:', item.SenderEmailAddress)
                print('---\n')
        ```

In [None]:
import re
from datetime import datetime
from openpyxl import load_workbook
import win32com.client

# Check the last updated date in the Excel file
workbook = load_workbook("GSFMO CR and IR Tracker for testing.xlsx")
sheet = workbook['CR IR Tracker']

updated_time = sheet['A1'].value
print(updated_time)

test_time = '06/04/2025 12:00AM'

date_pattern = r"""
    (Start\s*Date|From)              # Match Start Date label
    [\s:–—\-]*                       # Skip over spaces/dashes/colons
    (?P<start>                       # Start named group
        \d{1,2}[\s\-\/]*[A-Za-z]+[\s\-\/]*\d{2,4}     # e.g., 05-JUN-2025 or 5 JUN 2025
        |
        \d{1,2}[\-/]\d{1,2}[\-/]\d{2,4}               # e.g., 05/06/2025
    )
    .*?                              # Allow anything in between (non-greedy)
    (End\s*Date|To)                  # Match End Date label
    [\s:–—\-]*                       # Skip over spaces/dashes/colons
    (?P<end>                         # End named group
        \d{1,2}[\s\-\/]*[A-Za-z]+[\s\-\/]*\d{2,4}     # e.g., 18-JUN-2025
        |
        \d{1,2}[\-/]\d{1,2}[\-/]\d{2,4}               # e.g., 18/06/2025
    )
"""

closure_pattern = r"""
    (Date\s*of\s*Successful\s*Implementation) # Match label
    [\s:\-–—\u00A0\u200B]* # Separator: colon, dash, or invisible space
    \n* # Optional line break
    [\s\u00A0\u200B]* # Whitespace or invisible space

    (?P<closure> # Named group
        \d{1,2}(st|nd|rd|th)? # Day with optional ordinal
        [\s\-–—/]* # Separator(s)
        [A-Za-z]+ # Month
        [\s\-–—/]* # Separator(s)
        \d{2,4} # Year
    )
"""

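date_formats = [
    "%d %B %Y",     # 28 June 2025 (full month name)
    "%d-%b-%Y",     # 28-Jun-2025 (abbreviated month name)
    "%d-%B-%Y",     # 28-June-2025 (full month name)
    "%Y-%m-%d",     # 2025-05-28
    "%m/%d/%Y",     # 05/28/2025
    "%d %b %Y",     # 28 Jun 2025 (%Y needs a four-digit year; a two-digit year would need %y)
]
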
def parse_date(date_str):
    # Try each known format in turn and return the first successful parse
    for fmt in date_formats:
        try:
            return datetime.strptime(date_str.strip(), fmt)
        except ValueError:
            continue
    return None  # Could not parse


def clean_date_str(s):
    # Remove spaces around dashes or slashes to normalize dates like '05- JUN -2025' => '05-JUN-2025'
    return re.sub(r'\s*([-\/])\s*', r'\1', s.strip())

# Map the lists to Excel columns
column_mapping = {
    "fiscal_year": "A",
    "vendor_name": "B",
    "cr_ir": "C",
    "temp_perm": "D",
    "received_date": "E",
    "subject": "G",
    "sender_email": "H",
    "start_date_lst": "I",
    "end_date_lst": "J",
    "status_lst": "O"
}

# Assumes gsfmo_cr (the "Control Request" Outlook folder) has already been
# created by the setup cell further down in this notebook.

# Result lists: one entry per matching email, so they must stay the same length
fiscal_year = []
vendor_name = []
cr_ir = []
temp_perm = []
received_date = []
subject = []
sender_email = []
start_date_lst = []
end_date_lst = []
status_lst = []
closure_subjects = []

# Emails from this sender are skipped when looking for closure notices
EXCLUDED_SENDER = "/O=EXCHANGELABS/OU=EXCHANGE ADMINISTRATIVE GROUP (FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=9257D7775D104A96B4D8681CB2375415-VRMSUP, TD"

for vendor_folder in gsfmo_cr.Folders:
    filtered_items = vendor_folder.Items.Restrict(f"[ReceivedTime] >= '{test_time}'")
    filtered_items.Sort("[ReceivedTime]", True)  # newest first
    for item in filtered_items:
        if item.Class == 43 and "Control Request" in item.Categories:
            vendor_name.append(vendor_folder.Name)
            cr_ir.append("CR")

            if "Temporary" in item.Body:
                temp_perm.append('T')
            elif "Permanent" in item.Body:
                temp_perm.append('P')
            else:
                temp_perm.append("")

            # Drop the timezone so the value can be compared with a naive datetime
            fiscal = item.ReceivedTime.replace(tzinfo=None)
            if fiscal > datetime(2024, 11, 1):
                fiscal_year.append("FY 2025")
            else:
                fiscal_year.append("")  # keep this list aligned with the others

            received_date.append(item.ReceivedTime.strftime("%m%d%Y"))
            subject.append(item.Subject)
            sender_email.append(item.SenderEmailAddress)

            match = re.search(date_pattern, item.Body, re.DOTALL | re.IGNORECASE | re.VERBOSE)
            if match:
                start_dt = parse_date(clean_date_str(match.group('start')))
                end_dt = parse_date(clean_date_str(match.group('end')))
                start_date_lst.append(start_dt.strftime('%d-%m-%Y') if start_dt is not None else "Enter manually")
                end_date_lst.append(end_dt.strftime('%d-%m-%Y') if end_dt is not None else "Enter manually")
            else:
                start_date_lst.append("Enter manually")
                end_date_lst.append("Enter manually")

            status_lst.append("Open")

        elif item.Class == 43 and item.SenderEmailAddress != EXCLUDED_SENDER:
            # Closure notices are recognized by the "Date of Successful Implementation" label
            if re.search(closure_pattern, item.Body, re.IGNORECASE | re.VERBOSE):
                closure_subjects.append(item.Subject)

# Now write to Excel starting after the last row with data
start_row = sheet.max_row + 1

# Pair each column_mapping key with the list that fills that column
columns = {
    "fiscal_year": fiscal_year,
    "vendor_name": vendor_name,
    "cr_ir": cr_ir,
    "temp_perm": temp_perm,
    "received_date": received_date,
    "subject": subject,
    "sender_email": sender_email,
    "start_date_lst": start_date_lst,
    "end_date_lst": end_date_lst,
    "status_lst": status_lst,
}

for i in range(len(subject)):
    for key, values in columns.items():
        # Leave the cell empty if a list is shorter than subject
        sheet[f"{column_mapping[key]}{start_row + i}"] = values[i] if i < len(values) else ""

workbook.save("GSFMO CR and IR Tracker for testing.xlsx")

print(fiscal_year)
print(vendor_name)
print(cr_ir)
print(temp_perm)
print(received_date)
print(subject)
print(sender_email)
print(start_date_lst)
print(end_date_lst)
print(status_lst)
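
The closure pattern above accepts days with ordinal suffixes ("28th May 2025") and two-digit years, which the format list does not parse as written. One way to handle both is to normalize the string before trying the formats. This is a sketch only: `parse_loose_date` and `LOOSE_FORMATS` are hypothetical names, and the two `%y` formats are additions that are not in the cell above.

```python
import re
from datetime import datetime

# Four-digit-year formats first, then two-digit-year fallbacks (assumed additions)
LOOSE_FORMATS = [
    "%d %B %Y", "%d %b %Y", "%d-%B-%Y", "%d-%b-%Y",
    "%Y-%m-%d", "%m/%d/%Y", "%d %b %y", "%d-%b-%y",
]


def parse_loose_date(raw):
    """Parse a date copied from an email body, or return None."""
    text = raw.strip()
    # Drop ordinal suffixes that directly follow a digit: 28th -> 28
    text = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", text, flags=re.IGNORECASE)
    # Remove spaces around dashes and slashes: '05- JUN -2025' -> '05-JUN-2025'
    text = re.sub(r"\s*([-/])\s*", r"\1", text)
    # Collapse any remaining runs of whitespace
    text = re.sub(r"\s+", " ", text)
    for fmt in LOOSE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


print(parse_loose_date("28th May 2025"))
print(parse_loose_date("05- JUN -2025"))
```

Returning `None` rather than raising keeps the "Enter manually" fallback used in the loop above straightforward.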

In [None]:
# Import dependencies
import openpyxl
import pandas as pd
import win32com.client
import re
from datetime import datetime, timedelta

# Connect to Outlook
outlook = win32com.client.Dispatch('Outlook.Application')
namespace = outlook.GetNamespace('MAPI')

# Retrieve the GSFMO folder and its subfolders
gsfmo_folder = namespace.Folders('td.gsfmo@td.com')
gsfmo_inbox = gsfmo_folder.Folders('Inbox')
gsfmo_cr = gsfmo_inbox.Folders('Control Request')

# Store today's date and cutoff date for inbox filtering
now = datetime.now()
cutoff = now - timedelta(days=7)
cutoff_str = cutoff.strftime('%m/%d/%Y %I:%M%p') # Convert the datetime object into U.S. time string format (e.g. 06/06/2025 03:02PM)

# Initiate empty lists to store values
fiscal_year = []
vendor_name = []
cr_ir = []
temp_perm = []
received_date_lst = []
subject = []
sender_email = []
start_date_lst = []
end_date_lst = []
status_lst = []

# Loop over each vendor folder
for vendor_folder in gsfmo_cr.Folders:
    filtered_items = vendor_folder.Items.Restrict(f"[ReceivedTime] >= '{cutoff_str}'") # Filter for items from the last 7 days
    for item in filtered_items:
        if item.Class == 43 and "Control Request" in item.Categories: # For Mailbox items with "Control Request" tag...

            # Loop through the Excel sheet to match Subject + Raised Date

            nontz_receivedTime = item.ReceivedTime.replace(tzinfo=None)
            # Append fiscal year based on date. The latest cutoff is checked first so that
            # both branches are reachable; FY 2025 is taken to start on 1 Nov 2024,
            # matching the check in the first cell.
            if nontz_receivedTime >= datetime(2025, 11, 1):
                fiscal_year.append("FY 2026")
            elif nontz_receivedTime >= datetime(2024, 11, 1):
                fiscal_year.append("FY 2025")

            received_date = item.ReceivedTime.strftime('%m%d%Y')
            received_date_lst.append(received_date) # Append Received Date

            vendor_name.append(vendor_folder.Name) # Append vendor name
            cr_ir.append("CR") # Append CR
            



# For Control Requests, match against Excel rows. If item does not exist, append.

# For Closure requests, match against Excel rows. Update the status of the row.
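
The two planned steps above could be sketched in plain Python before wiring them to the workbook. Everything here is an assumption made for illustration: tracker rows are modeled as dictionaries, a Control Request is identified by its subject plus received date, a closure email is matched to a row by exact subject, and the status label "Closed" is invented. `sync_tracker` is a hypothetical name.

```python
def sync_tracker(rows, new_requests, closed_subjects):
    """Append unseen Control Requests and mark closed ones.

    rows: existing tracker rows, each a dict with 'subject',
          'received_date' and 'status' keys (assumed layout).
    new_requests: dicts with 'subject' and 'received_date'.
    closed_subjects: subjects taken from closure emails.
    """
    seen = {(row["subject"], row["received_date"]) for row in rows}

    # Step 1: append Control Requests that are not in the tracker yet
    for request in new_requests:
        key = (request["subject"], request["received_date"])
        if key not in seen:
            rows.append({**request, "status": "Open"})
            seen.add(key)

    # Step 2: update the status of rows that have a closure email
    for row in rows:
        if row["subject"] in closed_subjects:
            row["status"] = "Closed"

    return rows
```

In practice closure emails often carry a prefix such as "RE:", so the exact-subject match would likely need loosening before real use.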


In [18]:
from datetime import datetime, timedelta

In [19]:
now = datetime.now()
cutoff = now - timedelta(days=7)
print(cutoff)

2025-05-30 14:57:49.507032
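
The vendor loops above decide the fiscal year with hard-coded cutoff dates. A small function can derive the label from the date instead. This sketch assumes that fiscal year N starts on 1 November of year N-1, which is inferred from the `datetime(2024, 11, 1)` check that maps to "FY 2025"; `fiscal_year_label` is a hypothetical helper.

```python
from datetime import datetime


def fiscal_year_label(received, start_month=11):
    """Return a label such as 'FY 2025' for a naive datetime."""
    # Dates from the start month onward belong to the next calendar year's FY
    year = received.year + 1 if received.month >= start_month else received.year
    return f"FY {year}"


print(fiscal_year_label(datetime(2025, 6, 4)))    # FY 2025
print(fiscal_year_label(datetime(2025, 11, 1)))   # FY 2026
```

With Outlook items, pass `item.ReceivedTime.replace(tzinfo=None)` as in the loops above. Every date then gets a label, so the `fiscal_year` list stays the same length as the other lists.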
