
**Stephanie Lonsdale**
**Objective**: Email categorization, summarization, and response generation using a Generative AI workflow.

### Import libraries and setup environment

In [1]:
# Import all necessary libraries and set up environment
import os
import json
from dataclasses import dataclass, field
from typing import List, Dict, Optional, Tuple
from datetime import datetime, timedelta
%pip -q install openai tenacity
import math
import time
%pip install pandas
import pandas as pd
from tenacity import retry, wait_exponential_jitter, stop_after_attempt
from openai import OpenAI
import pytz
from IPython.display import display, HTML

pd.set_option("display.max_colwidth", 200)
pd.set_option("display.width", 120)


Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


### Helper Functions: File readers
This section defines the helper functions to safely read the CSV and JSON files from the dataset for the layout of this project.
These helpers handle common data formatting issues, these include encoding errors, delimiters and inconsistent JSON structures. 
These helpers also ensure that the system can process any real-world email export file. 

### Helper functions: Datetime Parsing 
This cell standardizes all timestamp fields to UTC and then converts them to local time (America/New_York).
It ensures the system correctly identifies "yesterday's" email for the first Task. 
For "yesterdays" emails I made sure that it was taking March 3rds emails because the most recent email date in the dataset was March 4th.




In [3]:
# CSV reader - encoding/delimiter issues 
def read_csv_robust(path: str) -> pd.DataFrame:
    encodings = ["utf-8", "utf-8-sig", "cp1252", "latin1"] #tries text encodings in order
    delims = [",", ";", "\t", "|"] #common CSV delimiters
    last_err = None #last exception for debugging
    
    #Tries every encoding/delimiter combination until one works
    for enc in encodings:
        for delim in delims:
            try:
                df = pd.read_csv(
                    path,  #file path
                    encoding=enc, #candidate text encoding
                    sep=delim, #candidate field delimiter
                    on_bad_lines="skip",  # skipped malformed rows (pandas>=1.3)
                    engine="python"       # tolerant with weird quoting, CSV parser
                )
                #Logs what worked- or if there are zero rows
                if not df.empty:
                    print(f"[read_csv_robust] Loaded {os.path.basename(path)} with encoding='{enc}', sep='{delim}', rows={len(df)}.")
                else:
                    print(f"[read_csv_robust] Loaded {os.path.basename(path)} but it's empty (encoding='{enc}', sep='{delim}').")
                return df #return dataframe if successful
            except Exception as e:
                last_err = e #remember failure, keey trying
                continue
    #If all combos failed, raise error to make necessary changes
    raise RuntimeError(f"Failed to read CSV {path} with common encodings/delimiters. Last error: {last_err}")

#JSON reader- encoding/wrapping

def read_json_robust(path: str) -> pd.DataFrame:
    with open(path, "rb") as fh: #read as binary
        raw = fh.read() #read all bytes
    
    # heuristics for encodings- try strict decodes with several encodings first
    for enc in ["utf-8", "utf-8-sig", "cp1252", "latin1"]:
        try:
            txt = raw.decode(enc, errors="strict") #this decodes the bytes
            data = json.loads(txt)  #this parses JSON text

            #Normalize to a flat table
            #Emails list under emails key, pure list at top level, and single dict at top level
            if isinstance(data, dict) and "emails" in data and isinstance(data["emails"], list):
                df = pd.json_normalize(data["emails"])
            elif isinstance(data, list):
                df = pd.json_normalize(data)
            elif isinstance(data, dict):
                df = pd.json_normalize([data])
            else:
                df = pd.DataFrame()
            print(f"[read_json_robust] Loaded {os.path.basename(path)} with encoding='{enc}', rows={len(df)}.")
            return df #if success
        except Exception:
            continue #try next encoding - failure detected

    # last resort: permissive decode
    txt = raw.decode("latin1", errors="ignore")
    try:
        data = json.loads(txt)
        if isinstance(data, dict) and "emails" in data and isinstance(data["emails"], list):
            return pd.json_normalize(data["emails"])
        elif isinstance(data, list):
            return pd.json_normalize(data)
        elif isinstance(data, dict):
            return pd.json_normalize([data])
    except Exception:
        pass
    #if ALL else fails, log and return empty data frame 

    print(f"[read_json_robust] Could not parse JSON: {path}")
    return pd.DataFrame()

# Find a likely datetime column and parse into UTC
def attach_parsed_datetime(df: pd.DataFrame) -> pd.DataFrame:
    #If df is empty at this point return a frame with NaT datetime column 
    if df is None or df.empty:
        df = pd.DataFrame()
        df["__dt_utc__"] = pd.NaT
        return df

    # Find a candidate column- find a likely datetime column with exact common names 
    cands_exact = ["date","received","timestamp","datetime","received_at","sent_at","internalDate"]
    date_col = None
    for c in df.columns:
        if c.lower() in cands_exact:
            date_col = c
            break
    #If nothing was found, look for partial matches
    if date_col is None:
        for c in df.columns:
            lc = c.lower()
            if "date" in lc or "time" in lc:
                date_col = c
                break
    #If nothing was still found, return NaT
    if date_col is None:
        df["__dt_utc__"] = pd.NaT
        return df
    
    #Robust string, UTC timestamp parser

    def parse_dt(x):
        try:
            return pd.to_datetime(x, utc=True, errors="coerce")
        except Exception:
            return pd.NaT

    #Parse the chosen ccolumn into a new standardized UTC datetime column
    df["__dt_utc__"] = df[date_col].apply(parse_dt)

    # epoch fallback if most failed
    if df["__dt_utc__"].isna().mean() > 0.5:
        def epoch_parse(x):
            try:
                v = float(x)    #attempt a numeric cast
                if v > 1e12:  # if super large, assume milliseconds 
                    v = v / 1000.0
                return pd.to_datetime(v, unit="s", utc=True, errors="coerce")
            except Exception:
                return pd.NaT
        alt = df[date_col].apply(epoch_parse) #alternate parse attempts
        mask = df["__dt_utc__"].isna() & ~alt.isna() # only fill where alt succeeded
        df.loc[mask, "__dt_utc__"] = alt[mask] #now df 

    return df #df always has a __dt_utc__ column

#timezone aware timestamp to a local calendar date
def to_local_date(ts, tz_name="America/New_York"):
    try:
        return ts.tz_convert(tz_name).date()
    except Exception:
        return pd.NaT


### EmailYesterboxAssistant
This class is the main controller for Task 1 (Email classification and summary).
This cell handles loading CSV/JSON data, identifying yesterday's emails, classifying emails into six categories, generating summary counts and saving results. This cell outputs executive_dashboard_summary.csv, top_critical.csv, and yesterday_categorized.csv. 

This cell also initializes the EmailYesterbox class, it specifies the data folde and runs the full Task 1 pipeline. This includes: discover files, classify emails, and produce and display dashboard summary. This cell also prints the top critical emails and saves output CSVs. 


In [4]:
#Turn this class into a dataclass
@dataclass
class EmailYesterboxAssistant:
    #Public config 
    data_dir: str = "." #Folder with CSV/JSON files
    out_prefix: str = "./task1A_" #Prefix for all output CSV artifacts
    tz_name: str = "America/New_York" #Timezone used for Yesterbox "YESTERDAY"
    debug: bool = False 
    include_files: Optional[List[str]] = None     # explicit file list (full paths)
    filename_keywords: Tuple[str, ...] = ("email","inbox","alex","orion","mail")  # helps filter noisy folders
    
    #Internal state 

    _df_all: pd.DataFrame = field(default_factory=pd.DataFrame, init=False) #union of loaded frames
    _df_yesterday: pd.DataFrame = field(default_factory=pd.DataFrame, init=False) #rows for data-set anchored "yesterday"
    _summary_df: pd.DataFrame = field(default_factory=pd.DataFrame, init=False) #category counts table
    _critical_df: pd.DataFrame = field(default_factory=pd.DataFrame, init=False) #top urgent/deadline rows
    _counts: Dict[str, int] = field(default_factory=dict, init=False) #category--> count mapping
    _ai_conclusion: str = field(default="", init=False) #friendly conclusion
    _anchor_date: Optional[datetime] = field(default=None, init=False)  # max local date in dataset
    _yesterday_date: Optional[datetime] = field(default=None, init=False) #anchor date - 1 day

    #Main entrypoint 
    def run(self):
        print(f"[RUN] data_dir={self.data_dir}") #where we are reading from
        if os.path.isdir(self.data_dir):
            sample = ", ".join(sorted(os.listdir(self.data_dir))[:15])
            print(f"[RUN] Example files in folder: {sample} ...")
        else:
            print("[ERROR] data_dir does not exist:", self.data_dir)
            return
        
    #Discover CSV/JSON candidates in the folder

        csv_files, json_files = self._discover_files()
        if self.include_files:
            # Use ONLY the explicit files if provided
            csv_files = [f for f in self.include_files if f.lower().endswith(".csv")]
            json_files = [f for f in self.include_files if f.lower().endswith(".json")]

        # Filter noisy folders with keywords, but fall back to all if none match
        f_csv = [p for p in csv_files if any(k in os.path.basename(p).lower() for k in self.filename_keywords)]
        f_json = [p for p in json_files if any(k in os.path.basename(p).lower() for k in self.filename_keywords)]
        if not f_csv and not f_json and not self.include_files:
            f_csv, f_json = csv_files, json_files  # fallback
         # Log candidate counts (and names if debug)
        print(f"[RUN] Candidate files ‚Üí CSV={len(f_csv)}, JSON={len(f_json)}")
        if self.debug:
            print("  CSV:", [os.path.basename(p) for p in f_csv])
            print("  JSON:", [os.path.basename(p) for p in f_json])
        # Load files and union them into one DataFrame with a parsed datetime column
        df = self._load_and_union(f_csv, f_json)
        if df.empty:
            print("[RUN] No rows loaded from selected files.")
            return

        # Anchor logic: use the MOST RECENT local date in the data, then take the DAY BEFORE, I did this because the dates aren't in real-time
        #If I structured the system to be in real-time, this would try to take emails from 10/5, which don't exist in the file.
        # Compute local calendar date for each row from its __dt_utc__ timestamp
        df["__local_date__"] = df["__dt_utc__"].apply(lambda x: to_local_date(x, self.tz_name))
        # If at least one local date exists, anchor to the dataset‚Äôs MAX local date,
        # then define ‚Äúyesterday‚Äù as that date minus one day
        if df["__local_date__"].notna().any():
            self._anchor_date = df["__local_date__"].dropna().max()
            self._yesterday_date = self._anchor_date - timedelta(days=1)
            print(f"[ANCHOR] Max date in data: {self._anchor_date} ‚Üí Using day-before: {self._yesterday_date}")
        else:
            self._anchor_date, self._yesterday_date = None, None
            print("[ANCHOR] No parseable dates found; dashboard will be zeroed.")
        # Keep only rows from the computed ‚Äúyesterday‚Äù
        self._df_yesterday = df[df["__local_date__"] == self._yesterday_date].copy()
        print(f"[FILTER] Rows for {self._yesterday_date}: {len(self._df_yesterday)}")

        self._harmonize()
        self._classify()
        self._summarize()
        paths = self._save_artifacts()
        self._print_dashboard()
        return paths

    # loaders #
    def _discover_files(self) -> Tuple[List[str], List[str]]:
        files = os.listdir(self.data_dir)
        # Separate into CSV and JSON lists (non-recursive)
        csv_files = [os.path.join(self.data_dir, f) for f in files if f.lower().endswith(".csv")]
        json_files = [os.path.join(self.data_dir, f) for f in files if f.lower().endswith(".json")]
        print(f"[DISCOVER] Found CSV={len(csv_files)} JSON={len(json_files)} in {self.data_dir}")
        return csv_files, json_files

    def _load_and_union(self, csv_files: List[str], json_files: List[str]) -> pd.DataFrame:
        frames = []
        # Load each CSV robustly; attach source filename; parse datetime column
        for f in csv_files:
            try:
                df = read_csv_robust(f)
                df["__source_file__"] = os.path.basename(f)
                df = attach_parsed_datetime(df)
                frames.append(df)
            except Exception as e:
                print(f"[WARN] Skipping CSV {os.path.basename(f)} ‚Üí {e}")
        # Load each JSON robustly; attach source filename; parse datetime column
        for f in json_files:
            try:
                df = read_json_robust(f)
                if not df.empty:
                    df["__source_file__"] = os.path.basename(f)
                    df = attach_parsed_datetime(df)
                    frames.append(df)
            except Exception as e:
                print(f"[WARN] Skipping JSON {os.path.basename(f)} ‚Üí {e}")
        # If nothing loaded, return empty frame; otherwise union (row-wise)
        if not frames:
            return pd.DataFrame()
        return pd.concat(frames, ignore_index=True, sort=False)

    # harmonize / classify / summarize
    def _harmonize(self):
        # If no data for yesterday, nothing to normalize
        if self._df_yesterday.empty:
            return
        df = self._df_yesterday
        #pick the first existing column from a list; otherwise return a default series
        def first(cols, default=""):
            for c in cols:
                if c in df.columns:
                    return df[c]
            return pd.Series([default]*len(df))
    # Standardize common fields under *_std columns (so later code is schema-agnostic)
        df["sender_std"]   = first(["from","sender","from_email","sender_email","From","email_from"])
        df["subject_std"]  = first(["subject","Subject","SUBJECT","title"])
        df["body_std"]     = first(["body","snippet","text","content","message","Body"])
        df["priority_std"] = first(["priority","Priority","importance","Importance"]).astype(str).str.lower()
        df["is_spam_std"]  = first(["is_spam","spam","label_spam","IsSpam","isSpam"], "false").astype(str).str.lower().isin(["true","1","yes","y"])

    def _contains_any(self, text: str, kws: List[str]) -> bool:
        # Basic keyword matcher used by the classifier
        t = (text or "").lower()
        return any(k in t for k in kws)

    def _classify(self):
         # If nothing to classify, bail
        if self._df_yesterday.empty:
            return
        # Keyword sets for heuristics
        urgent_kw   = ['urgent','asap','immediate','immediately','critical','sev-','sev ','p1','sev1','sev2','outage','down','escalation','blocker','production issue']
        deadline_kw = ['deadline','due','eod','cob','by eod','by cob','needs by','submit by','approval by','respond by','today','tomorrow']
        routine_kw  = ['update','weekly','status','standup','check-in','checkin','sync','minutes','notes','report','summary','recap','digest']
        personal_kw = ['birthday','party','congrats','congratulations','celebration','lunch','coffee','happy hour','dinner','wedding']
        spam_kw     = ['unsubscribe','promotion','sale','discount','limited time','buy now','free trial','offer','coupon']
        personal_domains = ['gmail.com','yahoo.com','outlook.com','icloud.com','hotmail.com','aol.com','proton.me','protonmail.com']
        # Row-level classifier returning a category string
        def label(r):
            subj, body, sender = r.get("subject_std",""), r.get("body_std",""), r.get("sender_std","")
            prio  = r.get("priority_std","")
            # 1) Spam takes precedence
            spamf = bool(r.get("is_spam_std", False))
            if spamf or self._contains_any(subj, spam_kw) or self._contains_any(body, spam_kw):
                return "Spam/Unimportant Emails Filtered Out"
            # 2) Priority or urgent terms
            if prio in ['urgent','high','p1','critical','highest'] or self._contains_any(subj, urgent_kw) or self._contains_any(body, urgent_kw):
                return "Urgent & High-Priority Emails"
            # 3) Deadline language
            if self._contains_any(subj, deadline_kw) or self._contains_any(body, deadline_kw):
                return "Deadline-Driven Emails"
             # 4) Personal/social cues (domain or keywords)
            dom = sender.split("@")[-1].lower() if isinstance(sender, str) and "@" in sender else ""
            if any(dom.endswith(d) for d in personal_domains) or self._contains_any(subj, personal_kw) or self._contains_any(body, personal_kw):
                return "Personal & Social Emails"
            # 5) Routine updates
            if self._contains_any(subj, routine_kw) or self._contains_any(body, routine_kw):
                return "Routine Updates & Check-ins"
             # 6) Everything else is informational
            return "Non-Urgent & Informational Emails"

        self._df_yesterday["__category__"] = self._df_yesterday.apply(label, axis=1)

    def _summarize(self):
        cats = [
            "Urgent & High-Priority Emails",
            "Deadline-Driven Emails",
            "Routine Updates & Check-ins",
            "Non-Urgent & Informational Emails",
            "Personal & Social Emails",
            "Spam/Unimportant Emails Filtered Out"
        ]
         # If no yesterday data or no category column, produce a zeroed summary and a default conclusion
        if self._df_yesterday.empty or "__category__" not in self._df_yesterday.columns:
            self._counts = {c: 0 for c in cats}
            self._summary_df = pd.DataFrame([{"Category": c, "Count": 0} for c in cats])
            self._critical_df = pd.DataFrame()
            self._ai_conclusion = (
                "You have 0 critical emails from yesterday that require action today. "
                "Additionally, there are 0 updates to review at your convenience."
            )
            return
         # Count rows per category
        counts = {c: int((self._df_yesterday["__category__"] == c).sum()) for c in cats}
        self._counts = counts
        self._summary_df = pd.DataFrame([{"Category": c, "Count": counts[c]} for c in cats])

        # top critical by local time
        def to_local(ts):
            try:
                return ts.tz_convert(self.tz_name)
            except Exception:
                return pd.NaT
        # Ensure a datetime column exists
        if "__dt_utc__" not in self._df_yesterday.columns:
            self._df_yesterday["__dt_utc__"] = pd.NaT
        self._df_yesterday["__dt_local__"] = self._df_yesterday["__dt_utc__"].apply(to_local)
         # Keep only urgent/deadline and take the latest 10
        mask = self._df_yesterday["__category__"].isin(["Urgent & High-Priority Emails","Deadline-Driven Emails"])
        self._critical_df = (
            self._df_yesterday[mask]
            .sort_values("__dt_local__", ascending=False)
            .loc[:, ["sender_std","subject_std","__category__","__dt_local__"]]
            .rename(columns={"sender_std":"From","subject_std":"Subject","__category__":"Category","__dt_local__":"Received (local)"})
            .head(10)
        )
        # Compose the AI conclusion sentence used in the dashboard
        critical = counts["Urgent & High-Priority Emails"] + counts["Deadline-Driven Emails"]
        updates  = counts["Routine Updates & Check-ins"]
        self._ai_conclusion = (
            f"You have {critical} critical emails from yesterday that require action today. "
            f"Additionally, there are {updates} updates to review at your convenience."
        )
        #save and print artifacts!
    def _save_artifacts(self) -> Dict[str, str]:
        os.makedirs(os.path.dirname(self.out_prefix) or ".", exist_ok=True)
        paths: Dict[str, str] = {}
        # Paths for each artifact
        p1 = f"{self.out_prefix}executive_dashboard_summary.csv"
        p2 = f"{self.out_prefix}top_critical.csv"
        p3 = f"{self.out_prefix}yesterday_categorized.csv"
        # Write summary and yesterday tables; top_critical only if we have rows
        self._summary_df.to_csv(p1, index=False)
        paths["summary_csv"] = p1
        if not self._critical_df.empty:
            self._critical_df.to_csv(p2, index=False)
            paths["critical_csv"] = p2
        self._df_yesterday.to_csv(p3, index=False)
        paths["categorized_csv"] = p3
        return paths

    def _print_dashboard(self):
        #  print the top-level dashboard with counts and conclusion
        c = self._counts
        total = int(len(self._df_yesterday))
        print(f"\nExecutive Dashboard ‚Äî Max-Date={self._anchor_date} ‚Üí Using Day-Before={self._yesterday_date}")
        print(f"Total Emails from Yesterday: {total}")
        print(f" üõë Urgent & High-Priority Emails: {c.get('Urgent & High-Priority Emails',0)} (Require Immediate Action Today)")
        print(f" ‚ö° Deadline-Driven Emails: {c.get('Deadline-Driven Emails',0)} (Must Be Addressed Today)")
        print(f" üìå Routine Updates & Check-ins: {c.get('Routine Updates & Check-ins',0)} (Review & Acknowledge)")
        print(f" üìé Non-Urgent & Informational Emails: {c.get('Non-Urgent & Informational Emails',0)} (Can Be Deferred or Delegated)")
        print(f" üéâ Personal & Social Emails: {c.get('Personal & Social Emails',0)} (Optional Review)")
        print(f" üóëÔ∏è Spam/Unimportant Emails Filtered Out: {c.get('Spam/Unimportant Emails Filtered Out',0)}")
        print("\nAI Conclusion:")
        print(f"\"{self._ai_conclusion}\"\n")

        # Show top critical table inline in notebook if present
        if not self._critical_df.empty:
            print("Top Critical Emails (up to 10):")
            display(self._critical_df)
        else:
            print("Top Critical Emails: None")

#point to folder (this is for my sake)
data_dir = "/Users/stephanielonsdale/Downloads"
include_files = None   # set to a list of full paths(this is also for my sake)

#Instantiate assistant with settings 
assistant = EmailYesterboxAssistant(
    data_dir=data_dir, #where to look
    out_prefix="/Users/stephanielonsdale/Downloads/task1A_", #prefix for output CSV
    tz_name="America/New_York", #timezone
    debug=True, #verbose logging
    include_files=include_files #explicit file list
)

#run the full task 1 pipeline
paths = assistant.run()
#show the dictionary of artifact paths returned by run- helpful! 
paths


[RUN] data_dir=/Users/stephanielonsdale/Downloads
[RUN] Example files in folder: (PDF ebook) Medical Ethics_ Accounts of Ground-Breaking Cases, 8th Edition.pdf, .DS_Store, .RData, .Rhistory, .localized, 00 Welcome.pptx, 002 PHY Report 1 Lonsdale.doc, 01 Review of Java (1).pptx, 1-Genetics JW 10 (1).pdf, 1-Genetics JW 10 (1).pptx, 1-Genetics JW 10 (2).pptx, 1-Genetics JW 10.pptx, 10 lecture micro path JW 9.pptx, 10F Anomaly Detection.pdf, 10k_Address_manual_reviewed.csv ...
[DISCOVER] Found CSV=20 JSON=0 in /Users/stephanielonsdale/Downloads
[RUN] Candidate files ‚Üí CSV=3, JSON=0
  CSV: ['Alex_emails_march_04-1 (1).csv', 'Alex_emails_march_04-1.csv', 'task1A_task2_ai_draft_responses_llm_per_email.csv']
  JSON: []
[read_csv_robust] Loaded Alex_emails_march_04-1 (1).csv with encoding='cp1252', sep=',', rows=60.
[read_csv_robust] Loaded Alex_emails_march_04-1.csv with encoding='cp1252', sep=',', rows=60.
[read_csv_robust] Loaded task1A_task2_ai_draft_responses_llm_per_email.csv with encod

Unnamed: 0,From,Subject,Category,Received (local)
0,Julia Martin,Approval Request: Budget Approval Needed by EOD,Urgent & High-Priority Emails,2025-03-02 19:00:00-05:00
71,Emily Johnson,Update ‚Äì Bug Fixes & Refactoring,Urgent & High-Priority Emails,2025-03-02 19:00:00-05:00
59,Nathan Ellis,Urgent: Performance Degradation in Production System,Urgent & High-Priority Emails,2025-03-02 19:00:00-05:00
60,Julia Martin,Approval Request: Budget Approval Needed by EOD,Urgent & High-Priority Emails,2025-03-02 19:00:00-05:00
63,James Patel,Subject: Daily Update ‚Äì Project Titan (March 3),Urgent & High-Priority Emails,2025-03-02 19:00:00-05:00
64,David Whitmore,[URGENT] Dashboard Syncing Issues ‚Äì Production Metrics Missing,Urgent & High-Priority Emails,2025-03-02 19:00:00-05:00
65,Lisa Taylor,Quick Check-In ‚Äì Frontend Updates,Urgent & High-Priority Emails,2025-03-02 19:00:00-05:00
66,Nathan Cole,Approval Request: Additional AWS Resources for Project Orion,Urgent & High-Priority Emails,2025-03-02 19:00:00-05:00
67,David Kurien,Blocking Issue Alert ‚Äì Client Data Sync Failing,Urgent & High-Priority Emails,2025-03-02 19:00:00-05:00
68,Mark Swarlos,Daily Update ‚Äì API Migration (March 3),Urgent & High-Priority Emails,2025-03-02 19:00:00-05:00


{'summary_csv': '/Users/stephanielonsdale/Downloads/task1A_executive_dashboard_summary.csv',
 'critical_csv': '/Users/stephanielonsdale/Downloads/task1A_top_critical.csv',
 'categorized_csv': '/Users/stephanielonsdale/Downloads/task1A_yesterday_categorized.csv'}

### API KEY SETUP 
This cell helped me set up my OpenAI API key securely in the environment variable `OPENAI_API_KEY`
This allows all OpenAI API calls throughout the notebook to authenticate correctly.

The short test at the end confirms that my OpenAI key has been detected successfully.
If it prints `True` the API connection is working correctly.


In [9]:
!pip install python-dotenv
from dotenv import load_dotenv
import os

load_dotenv()

api_key = os.getenv("OPENAI_API_KEY")

print("API key detected:", api_key is not None)



API key detected: True


### API Configuration and Helper Functions 
This cell defines core helper functions used for AI text generation 
the _clip() trims long input text safely 
make_messages() builds structured prompts for GPT
_call_llm() sends requests to OpenAIs chat model with retry logic 
generate_llm_reply() handles pacing and generates one AI response per email

The model I am calling here is gpt-40-mini for a faster performance than what 5 was giving me 

In [12]:
#  API config & helpers cell
#A lot of reoccuring imports just because I  get scare di
from tenacity import retry, stop_after_attempt, wait_exponential_jitter, retry_if_exception_type
from openai import OpenAI
from openai import APIError, RateLimitError, APIConnectionError, APITimeoutError
from typing import Any

# Set key in env before running this cell
assert "OPENAI_API_KEY" in os.environ and os.environ["OPENAI_API_KEY"], \
    "Set OPENAI_API_KEY in your environment before running."

client = OpenAI()

# model prompt and sizing 
MODEL_NAME = "gpt-4o-mini"     # using mini for a faster runtime/quicker pacing
MAX_SUBJECT_CHARS = 300
MAX_BODY_CHARS    = 1000  # keep modest to avoid context overages- it was originally taking 30 minutes to run

#clip overly long text fields
def _clip(text: Any, max_chars: int) -> str:
    t = "" if text is None else str(text)
    return (t[:max_chars] + "‚Ä¶") if len(t) > max_chars else t
#chat message sequence for the model
def make_messages(sender: str, subject: str, body: str, category: str) -> List[Dict[str, str]]:
    subject_c = _clip(subject, MAX_SUBJECT_CHARS)
    body_c    = _clip(body, MAX_BODY_CHARS)
#alex carter instructions and establishment
    system_msg = (
        "You are Alex Carter, a project leader at Orion Tech Solutions. "
        "Write concise, professional email replies. Goals: (1) acknowledge sender + issue, "
        "(2) give concrete next steps with timeline/owner, (3) request missing info in up to 3 bullets, "
        "(4) calm, proactive tone, (5) if urgent/deadline, reflect urgency and clarity. "
        "Max 8 sentences. Do not add placeholders like [Name] unless no name is inferable."
    )
#actual content/context for the reply 
    user_msg = (
        f"Category: {category}\n"
        f"Sender: {sender}\n"
        f"Subject: {subject_c}\n\n"
        f"Email Body:\n{body_c}\n\n"
        f"Write the reply as Alex Carter. Output only the email text."
    )
    #standard openAI chat message list - both system and the user
    return [{"role": "system", "content": system_msg},
            {"role": "user",   "content": user_msg}]

# Set the max requests per minute- this helped with the speed of the system

TARGET_RPM = 20
SECONDS_PER_CALL = max(60.0 / TARGET_RPM, 2.0)  # minimum spacing; keep a little headroom

#core LLM with retry logic
#retry wrapper
@retry(
    reraise=True,
    stop=stop_after_attempt(6),  # up to 6 tries
    wait=wait_exponential_jitter(initial=1, max=15),  
    retry=retry_if_exception_type((RateLimitError, APIError, APIConnectionError, APITimeoutError))
)
def _call_llm(messages: List[Dict[str, str]], temperature: float = 0.4) -> str:
    resp = client.chat.completions.create(
        model=MODEL_NAME,
        messages=messages,
        temperature=temperature,
    )
    #extract text from first completion choice and strip whitespace
    return resp.choices[0].message.content.strip()
#wrapper generate one reply with pacing 
def generate_llm_reply(sender: str, subject: str, body: str, category: str, temperature: float = 0.4) -> str:
    """Single polite call with pacing + robust retries."""
    start = time.time() #timestamp for pacing calc
    msgs = make_messages(sender, subject, body, category) #build the prompt
    txt = _call_llm(msgs, temperature=temperature) #call the actual API
    # soft pacing to respect RPM
    elapsed = time.time() - start
    sleep_left = SECONDS_PER_CALL - elapsed
    if sleep_left > 0:
        time.sleep(sleep_left)
    return txt


### GenAI Powered Email Replies
this cell generates AI draft replies for each urgent or deadline-driven email from yesterday
each email is processed individually and one tailored GPT-generated response is able to be created
the results are saved to a separate csv titled: task2_ai_draft_responses_llm_per_email.csv

In [13]:
#  Task 1 cells and Cell 7 first - this is a reminder for me 

import time

# Pull yesterday's critical emails (urgent or deadline-driven)
assert hasattr(assistant, "_df_yesterday"), "Run Task 1 first to build assistant._df_yesterday."

#Only urgent/deadline rows
crit_mask = assistant._df_yesterday["__category__"].isin(
    ["Urgent & High-Priority Emails", "Deadline-Driven Emails"]
)
#Copy only the columns wee need for drafting replies 
critical_df = assistant._df_yesterday.loc[
    crit_mask, ["sender_std", "subject_std", "body_std", "__category__"]
].copy()
#if empty take care of it
if critical_df.empty:
    print("No Urgent/Deadline-Driven emails from yesterday. Nothing to draft.")
    drafts_df = pd.DataFrame(columns=["Sender","Category","Subject","AI_Draft","Status"])
#iterate over rows and generate drafts
else:
    # speed/quality knobs (safe defaults)
    MAX_BODY_CHARS = 1000   # trim very long emails for faster generation
    TEMPERATURE    = 0.35   # lower = more deterministic

    total = len(critical_df)
    print(f"Generating AI drafts for {total} critical emails (1 call per email)‚Ä¶")
    t0 = time.time()
    rows = []

    # resume from previous run if the output file exists
    out_path = f"{assistant.out_prefix}task2_ai_draft_responses_llm_per_email.csv"
    try:
        existing = pd.read_csv(out_path)
        # if same columns, we can resume/append
        have_existing = set(existing.columns) == {"Sender","Category","Subject","AI_Draft","Status"}
        if have_existing:
            print(f"[resume] Found existing file with {len(existing)} rows ‚Üí appending new rows.")
            rows.extend(existing.to_dict(orient="records"))
    except Exception:
        pass

    #loop through each critical emal and call GPT once

    for i, row in critical_df.reset_index(drop=True).iterrows():
        #extract standard fields with safe fallbacks 
        sender   = row.get("sender_std", "") or ""
        subject  = row.get("subject_std", "") or "(no subject)"
        body     = row.get("body_std", "") or ""
        category = row.get("__category__", "")

        # clip body to keep prompts fast and under token budget
        if len(body) > MAX_BODY_CHARS:
            body = body[:MAX_BODY_CHARS] + "‚Ä¶"
    #core generation call
        try:
            #use the helper from previous cell 
            draft = generate_llm_reply(sender, subject, body, category, temperature=TEMPERATURE)
            status = "ok"
        except Exception as e:
            #capture any generation errors 
            draft = f"[Generation error: {type(e).__name__}: {e}]"
            status = "failed"
    # save emails data
        rows.append({
            "Sender": sender,
            "Category": category,
            "Subject": subject,
            "AI_Draft": draft,
            "Status": status
        })
    #every 5 emails, print progress and save a temp CSV
        if (i + 1) % 5 == 0 or (i + 1) == total:
            elapsed = time.time() - t0
            print(f"  ‚Ä¢ {i+1}/{total} done  |  elapsed {elapsed/60:.1f} min")
            pd.DataFrame(rows).to_csv(out_path, index=False)
    #final save and display
    drafts_df = pd.DataFrame(rows)
    drafts_df.to_csv(out_path, index=False)
    print(f"\n‚úÖ Saved AI draft replies ‚Üí {out_path}")
    display(drafts_df.head(10))


Generating AI drafts for 51 critical emails (1 call per email)‚Ä¶
[resume] Found existing file with 52 rows ‚Üí appending new rows.
  ‚Ä¢ 5/51 done  |  elapsed 0.4 min
  ‚Ä¢ 10/51 done  |  elapsed 0.7 min
  ‚Ä¢ 15/51 done  |  elapsed 1.0 min
  ‚Ä¢ 20/51 done  |  elapsed 1.3 min
  ‚Ä¢ 25/51 done  |  elapsed 1.5 min
  ‚Ä¢ 30/51 done  |  elapsed 1.8 min
  ‚Ä¢ 35/51 done  |  elapsed 2.1 min
  ‚Ä¢ 40/51 done  |  elapsed 2.4 min
  ‚Ä¢ 45/51 done  |  elapsed 2.7 min
  ‚Ä¢ 50/51 done  |  elapsed 2.9 min
  ‚Ä¢ 51/51 done  |  elapsed 3.0 min

‚úÖ Saved AI draft replies ‚Üí /Users/stephanielonsdale/Downloads/task1A_task2_ai_draft_responses_llm_per_email.csv


Unnamed: 0,Sender,Category,Subject,AI_Draft,Status
0,Julia Martin,Urgent & High-Priority Emails,Approval Request: Budget Approval Needed by EOD,"Subject: Re: Approval Request: Budget Approval Needed by EOD\n\nHi Julia,\n\nThank you for your email and for sending over the budget breakdown. I understand the urgency and will prioritize this r...",ok
1,James Patel,Urgent & High-Priority Emails,Subject: Daily Update ‚Äì Project Titan (March 3),"Subject: Re: Daily Update ‚Äì Project Titan (March 3)\n\nHi James,\n\nThank you for the detailed update on Project Titan. It‚Äôs great to hear about the progress made, especially with the API integrat...",ok
2,David Whitmore,Urgent & High-Priority Emails,[URGENT] Dashboard Syncing Issues ‚Äì Production Metrics Missing,"Subject: Re: [URGENT] Dashboard Syncing Issues ‚Äì Production Metrics Missing\n\nHi David,\n\nThank you for bringing this urgent issue to my attention. I understand the critical nature of the missin...",ok
3,Lisa Taylor,Urgent & High-Priority Emails,Quick Check-In ‚Äì Frontend Updates,"Subject: Re: Quick Check-In ‚Äì Frontend Updates\n\nHi Lisa,\n\nThanks for the update on the frontend progress; it sounds promising! Regarding your question, we should hold off on pushing the new UI...",ok
4,Nathan Cole,Urgent & High-Priority Emails,Approval Request: Additional AWS Resources for Project Orion,"Subject: Re: Approval Request: Additional AWS Resources for Project Orion\n\nHi Nathan,\n\nThank you for bringing this to my attention. I understand the urgency of provisioning additional AWS reso...",ok
5,David Kurien,Urgent & High-Priority Emails,Blocking Issue Alert ‚Äì Client Data Sync Failing,"Subject: Re: Blocking Issue Alert ‚Äì Client Data Sync Failing\n\nHi David,\n\nThank you for bringing this urgent issue to my attention. I understand the severity of the client data sync failure and...",ok
6,Mark Swarlos,Urgent & High-Priority Emails,Daily Update ‚Äì API Migration (March 3),"Subject: Re: Daily Update ‚Äì API Migration (March 3)\n\nHi Mark,\n\nThanks for the update on the API migration and the progress made so far. I appreciate your proactive approach to the load testing...",ok
7,Tanya Patel,Urgent & High-Priority Emails,URGENT: Approval Needed for 2-Week Extension on Acme Corp Deployment,"Subject: Re: URGENT: Approval Needed for 2-Week Extension on Acme Corp Deployment\n\nHi Tanya,\n\nThank you for bringing this to my attention. I understand the importance of ensuring quality in ou...",ok
8,David Whitmore,Urgent & High-Priority Emails,System Crashing During Shift Changes ‚Äì URGENT,"Subject: Re: System Crashing During Shift Changes ‚Äì URGENT\n\nHi David,\n\nThank you for bringing this critical issue to my attention. I understand the urgency, and we will prioritize resolving th...",ok
9,Emily Johnson,Urgent & High-Priority Emails,Update ‚Äì Bug Fixes & Refactoring,"Subject: Re: Update ‚Äì Bug Fixes & Refactoring\n\nHi Emily,\n\nThanks for the update on the bug fixes and the backend refactor. It‚Äôs great to hear that the UI bugs are resolved and that the API cal...",ok


### LLM-as-a-Judge: Evaluation Setup 
This cells builds the evaluation framework used in Task 3
It defines three key functions, build_judge_messages() constructs the evaluation prompts 
parse_json_messages() safely parses model output
judge_one() evaluates one reply and returns a structured score

In [14]:
# LLM-as-a-Judge 
import re, json, time

# paths to task 2 output files
TASK2_OUT_PATH = f"{assistant.out_prefix}task2_ai_draft_responses_llm_per_email.csv"
TASK2_BATCH_PATH = f"{assistant.out_prefix}task2_ai_draft_responses_llm_batched.csv"

# Deterministic judging 
JUDGE_TEMPERATURE = 0.0

def build_judge_messages(sender: str, subject: str, body: str, ai_reply: str) -> list:
    """Builds a compact instruction-following prompt for the judge."""
    sys = (
        "You are an impartial evaluator. Score the **reply** against the **original email**. "
        "Rate on a 1‚Äì5 scale (5 best) for Relevance, Clarity, and Actionability. "
        "Return a compact JSON object with fields:\n"
        "{\n"
        '  "relevance": <1-5>,\n'
        '  "clarity": <1-5>,\n'
        '  "actionability": <1-5>,\n'
        '  "strengths": ["...","..."],\n'
        '  "improvements": ["...","..."],\n'
        '  "overall_justification": "2‚Äì3 sentences summary"\n'
        "}\n"
        "Be specific and concise. No extra commentary outside JSON."
    )
    user = (
        f"Original Email (metadata):\n"
        f"- Sender: {sender}\n"
        f"- Subject: {subject}\n\n"
        f"Original Email Body:\n{body}\n\n"
        f"AI Draft Reply:\n{ai_reply}\n\n"
        f"Now produce the JSON evaluation."
    )
    return [{"role": "system", "content": sys},
            {"role": "user", "content": user}]

def parse_json_loose(text: str) -> dict:
    """
    Try to parse JSON; if it fails, extract the first {...} block or ```json``` fenced block.
    """
    if not isinstance(text, str):
        return {}
    # direct parse
    try:
        return json.loads(text) #try normal JSON first 
    except Exception:
        pass
    # fenced code block
    m = re.search(r"```json\s*(\{.*?\})\s*```", text, flags=re.S) #fallback purposes
    if m:
        try:
            return json.loads(m.group(1))
        except Exception:
            pass
    # first {...} blob
    m = re.search(r"(\{.*\})", text, flags=re.S)
    if m:
        blob = m.group(1)
        # try to balance braces 
        opens = blob.count("{")
        closes = blob.count("}")
        if opens > closes:
            blob += "}" * (opens - closes)
        try:
            return json.loads(blob)
        except Exception:
            pass
    return {}

def judge_one(sender: str, subject: str, body: str, ai_reply: str) -> dict:
    """Call the judge model and return a normalized dict with scores and notes."""
    msgs = build_judge_messages(sender, subject, body, ai_reply)
    raw = _call_llm(msgs, temperature=JUDGE_TEMPERATURE)   
    data = parse_json_loose(raw)

    # normalize/ format results for output 
    out = {
        "Relevance": data.get("relevance"),
        "Clarity": data.get("clarity"),
        "Actionability": data.get("actionability"),
        "Strengths": "; ".join(data.get("strengths", [])) if isinstance(data.get("strengths"), list) else data.get("strengths"),
        "Improvements": "; ".join(data.get("improvements", [])) if isinstance(data.get("improvements"), list) else data.get("improvements"),
        "OverallJustification": data.get("overall_justification", ""),
        "_raw_judge": raw  # keep the raw model output for auditing/debug
    }
    # ensure numeric scores 1-5
    for k in ["Relevance","Clarity","Actionability"]:
        try:
            v = int(out[k])
            out[k] = max(1, min(5, v))
        except Exception:
            out[k] = None
    return out


### Run LLM judge Evaluations 
This cell runs the full evaluation loop, for each AI-generated reply, the judge model scores it for: relevance, clarity, and actionability. 
This cell provides feedback on strengths, improvements, and justification
The results are saved to task3_llm_judge_evaluations.csv

In [None]:
# Ensure dataframe from Task 1 exists
assert hasattr(assistant, "_df_yesterday"), "Run Task 1 cells first."

# Load Task 2 outputs (prefer per-email)
if os.path.exists(TASK2_OUT_PATH):
    drafts_df = pd.read_csv(TASK2_OUT_PATH)
    per_email = True
elif os.path.exists(TASK2_BATCH_PATH):
    drafts_df = pd.read_csv(TASK2_BATCH_PATH)
    per_email = False
else:
    raise FileNotFoundError("No Task 2 outputs found. Run the Task 2 cell to generate drafts first.")

# filter yesterday‚Äôs urgent and deadline emails again
crit_mask = assistant._df_yesterday["__category__"].isin(
    ["Urgent & High-Priority Emails", "Deadline-Driven Emails"]
)
orig_df = assistant._df_yesterday.loc[
    crit_mask, ["sender_std","subject_std","body_std","__category__"]
].copy().reset_index(drop=True)


# align drafts with original emails (one row per email)
if per_email:
    # Expect one row per email with AI_Draft and Subject/Sender
    # fallback: try a merge on (Sender, Subject) if lengths differ.
    if len(drafts_df) != len(orig_df):
        merged = pd.merge(
            orig_df, drafts_df,
            left_on=["sender_std","subject_std"], right_on=["Sender","Subject"],
            how="left"
        )
        work_df = merged
    else:
        # align side-by-side by index
        drafts_df = drafts_df.loc[:, ["Sender","Subject","AI_Draft"]]
        drafts_df.columns = ["Sender","Subject","AI_Draft"]
        work_df = pd.concat([orig_df, drafts_df], axis=1)
else:
    

    raise ValueError("Task 2 results are batched. Please run the 'per email' Task 2 cell to produce one draft per email.")

# shorten long bodies for faster judge calls
MAX_BODY_FOR_JUDGE = 1200
work_df["body_for_judge"] = work_df["body_std"].astype(str).str.slice(0, MAX_BODY_FOR_JUDGE)

# evaluate each AI reply using judge_one()
rows = []
print(f"Judging {len(work_df)} AI replies‚Ä¶")

t0 = time.time()
for i, r in work_df.iterrows():
    sender   = r.get("sender_std","")
    subject  = r.get("subject_std","")
    body     = r.get("body_for_judge","")
    ai_reply = r.get("AI_Draft","")

    try:
        evald = judge_one(sender, subject, body, ai_reply)
        status = "ok"
    except Exception as e:
        evald = {
            "Relevance": None, "Clarity": None, "Actionability": None,
            "Strengths": "", "Improvements": "", "OverallJustification": "",
            "_raw_judge": f"[Judge error: {type(e).__name__}: {e}]"
        }
        status = "failed"

   # collect results
    rows.append({
        "Sender": sender,
        "Category": r.get("__category__",""),
        "Subject": subject,
        "AI_Draft": ai_reply,
        **evald,
        "JudgeStatus": status
    })

    if (i+1) % 5 == 0 or (i+1) == len(work_df):
        elapsed = (time.time() - t0)/60
        print(f"  ‚Ä¢ {i+1}/{len(work_df)} judged  |  elapsed {elapsed:.1f} min")

# save all evaluations to CSV
task3_path = f"{assistant.out_prefix}task3_llm_judge_evaluations.csv"
task3_df = pd.DataFrame(rows)
task3_df.to_csv(task3_path, index=False)
print(f"\n‚úÖ Saved LLM-as-a-Judge evaluations ‚Üí {task3_path}")

# show first few results for review
display(task3_df.head(10)[[
    "Sender","Category","Subject","Relevance","Clarity","Actionability",
    "Strengths","Improvements","OverallJustification"
]])


Judging 191 AI replies‚Ä¶
  ‚Ä¢ 5/191 judged  |  elapsed 0.2 min


## User Recommendations and Suggestions
Some of these bulleted list I wrote in a google doc and pasted over so I could read through the AI judge output and take notes at the same time. 
Based on the evaluations from the LLM-Judge model this section summarizes the systems strengths, areas for improvement, and next steps I could take to enhance the AI performance:

After analyzing the AI generated email replies evaluated by our LLM judge, the system demonstrated great average scores for each reply, including
- Relevance 5/5
- Clartiy 4.9/5
- Actionability: 4.8/5

These results indicated that my assistant effectively understands the context, writes professional, goal-oriented responses, and maintains clarity across diverse urgent and deadline-driven scenarios

Strengths of my system
- The AI model consistenyl addressed each sender's specific concerns like budget approvals, outages, and production halts 
- Replies were professional and easy to read.  
- Almost every message included explicit next steps, timelines, and requests for clarification.  
- The assistant appropriately prioritized high-impact or time-sensitive issues, confirming awareness of deadlines and escalation protocols.
- Minimal variation across senders and topics‚Äîoutputs remained uniform and coherent.

Some areas for improvement 
- Several responses could specify *exact* times for updates or approvals (e.g., ‚Äúby 2 PM EST‚Äù instead of ‚Äúby end of day‚Äù). 
- Some replies could explicitly reference documents or files mentioned by the sender.  
- While clear and direct, a few urgent replies might benefit from a brief reassurance or appreciation of the sender‚Äôs effort.  
- Occasional redundancy in phrasing could be trimmed to improve readability without losing tone.  

Recommendations for Enhancement: 
- Train or prompt the model to include explicit times and dates when giving commitments.  
- Introduce tone control to adjust warmth or empathy depending on urgency and sender context.  
- Integrate metadata (e.g., department, client type) to tailor tone and level of formality.  
- Detect mentions of ‚Äúattached,‚Äù ‚Äúincluded,‚Äù or ‚Äúsee file‚Äù to ensure acknowledgments appear naturally.  
- Use the judge model‚Äôs scores to fine-tune prompts and measure improvement over iterations.

Overall The AI Email Sorting and Response is able to return near perfect performance in relevance, clarity, and professionalism. Its ability to quickly generate structured, context-aware replies making this an effective digital communication assistant. With minor improvements opportunities present in timeline, precision, and tone adaptability the system could achieve an enterpirse-grade relaibility and serve as a tool that could be used for professional correspondences. 



### References 

OpenAI. (2025). *ChatGPT (GPT-5)* [Large language model]. OpenAI. https://chat.openai.com  
> Used for generative reasoning, model evaluation (LLM-as-a-Judge), and workflow development
- Specific uses include correctly formatting any file path or reading in JSON and CSV file calls, evaluating the judge performance to get averaged scores, and piepeline set up and debugging. 

OpenAI. (2024). *GPT-4o mini* [Large language model]. OpenAI. https://platform.openai.com/docs/models/gpt-4o-mini  
> Used for high-speed AI-generated email drafting during Task 2.  

Pandas Development Team. (2024). *pandas: Powerful data analysis and manipulation tool* (Version 2.x) [Computer software]. https://pandas.pydata.org/  
> Used for data loading, cleaning, and CSV export.  

Python Software Foundation. (2024). *Python: A programming language for data analysis and AI integration* (Version 3.9 +). https://www.python.org/  
> Core programming language used for implementing the project pipeline.  

Tenacity Developers. (2023). *Tenacity: Retry library for Python* [Computer software]. https://tenacity.readthedocs.io/  
> Used to handle API retries and rate-limit control in OpenAI API calls.  

OpenAI. (2025). *OpenAI API reference and developer documentation*. https://platform.openai.com/docs/api-reference  
> Consulted for model configuration, authentication, and best practices in managing completions.  

McCulloh, I., Rodriguez, P., & Cruickshank, I. (2024). *Lecture 8A: The Yesterbox* [Video lecture]. *Applied Generative AI*, Johns Hopkins University, Whiting School of Engineering.  
> Provided conceptual grounding for the Yesterbox email-management approach used in Task 1.  

McCulloh, I., Rodriguez, P., & Cruickshank, I. (2024). *Lecture 8B: The E-mail Project* [Video lecture]. *Applied Generative AI*, Johns Hopkins University, Whiting School of Engineering.  
> Introduced the midterm email project framework and expectations.  

McCulloh, I., Rodriguez, P., & Cruickshank, I. (2024). *Applied Generative AI: Text-to-Label Tasks, Flow, and Applications* [Lecture slides]. Johns Hopkins University, Whiting School of Engineering.  
> Referenced for background on classification, evaluation metrics, and generative workflows used in the email-sorting system.  
- I followed Dr. Cruickshanks style in prompt engineering and vibe-coding with chat to create his workflow, by consulting with ChatGPT 5, I was able to navigate correct set up for this system similar to how he did in lecture examples, I also used his lecture examples as a reference for how my code should look.

Forleo, M. (2012). *The Yesterbox Original Blog Post*. *Yesterbox by Marie Forleo*. https://yesterbox.wordpress.com/#:~:text=At%20the%20end%20of%202012%2C,inbox%20instead%20of%20today%E2%80%99s%20inbox  
> Original concept of the ‚ÄúYesterbox‚Äù productivity method, forming the foundation for the email management framework implemented in Task 1.  
