<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# Gmail - Unseen Emails from Important Senders (Identified by User's Reply Rate)
<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Gmail/Gmail_Get_emails_stats_by_sender.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/Open_in_Naas_Lab.svg"/></a><br><br><a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=&template=template-request.md&title=Tool+-+Action+of+the+notebook+">Template request</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=Gmail+-+Get+emails+stats+by+sender:+Error+short+description">Bug report</a> | <a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Naas/Naas_Start_data_product.ipynb" target="_parent">Generate Data Product</a>

**Tags:** #gmail #productivity #naas_drivers #operations #automation #analytics #plotly

**Author:** [Antonio Georgiev](https://www.linkedin.com/in/antonio-georgiev-b672a325b)

**Description:** This notebook lists the unseen emails from important senders. A sender's importance is measured by the user's reply rate: the percentage of emails received from that sender that the user has answered. The notebook ranks senders by reply rate and then outputs the unseen emails from the top-ranked senders, giving a focused view of the pending conversations most likely to need a response.

## Input

### Import libraries

In [2]:
import datetime
import os
try:
    from imapclient import IMAPClient
except ImportError:
    # Install the package only when it is missing, then retry the import
    !pip install imapclient --user
    from imapclient import IMAPClient
import naas
from collections import Counter
import quopri
import email.header

### Setup Variables
Create an application password by following [this procedure](https://support.google.com/mail/answer/185833?hl=en).
- `username`: Email address of the account to connect to
- `password`: Application password used to access the email account
- `date_start`: Number of days to look back in your inbox; it must be a negative value (for example, `-30` for the last 30 days)
- `most_common_senders`: Number of top senders, ranked by reply rate, to list in the output
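
As a minimal sketch of how a negative `date_start` becomes the date used in an IMAP `SINCE` search, the helper below converts the offset into the `DD-Mon-YYYY` string format. The function name `imap_since` and its `today` parameter are hypothetical and are not part of this notebook; the month abbreviation assumes an English locale.

```python
import datetime


def imap_since(date_start, today=None):
    """Return the IMAP SINCE date string for a negative day offset (hypothetical helper)."""
    if date_start >= 0:
        raise ValueError("date_start must be negative, e.g. -30 for the last 30 days")
    # Allow the reference date to be injected so the result is reproducible
    today = today or datetime.date.today()
    start = today + datetime.timedelta(days=date_start)
    # %b yields the abbreviated month name (English under the default locale)
    return start.strftime("%d-%b-%Y")


print(imap_since(-30, today=datetime.date(2023, 7, 14)))
```

Rejecting non-negative values early avoids a search window that starts today or in the future, which would silently match little or nothing.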

In [3]:
username = "theuniverse.bg@gmail.com"
password = naas.secret.get("GMAIL_APP_PASSWORD")
date_start = -30
most_common_senders = 10

## Model

### Connect to email box

In [4]:
server = IMAPClient('imap.gmail.com')
server.login(username, password)
server.select_folder('INBOX')
print("✅ Successfully connected to INBOX")

✅ Successfully connected to INBOX


### Get seen emails and all emails with their flags, date, and sender

In [5]:
today = datetime.date.today()
start = today + datetime.timedelta(days=date_start)
seen_messages = server.search(['SEEN', 'SINCE', start.strftime('%d-%b-%Y')])
all_messages = server.search(['SINCE', start.strftime('%d-%b-%Y')])
seen_metadata = server.fetch(seen_messages, ['RFC822.SIZE', 'FLAGS', 'INTERNALDATE', 'ENVELOPE'])
all_metadata = server.fetch(all_messages, ['RFC822.SIZE', 'FLAGS', 'INTERNALDATE', 'ENVELOPE'])
print("✅ Seen emails fetched:", len(seen_metadata))
print("✅ All emails fetched:", len(all_metadata))

✅ Seen emails fetched: 126
✅ All emails fetched: 149


### Rank senders by reply rate (answered emails as a percentage of received emails)

In [6]:
senders = []
answered_senders = []
for msg_id, data in all_metadata.items():
    envelope = data[b'ENVELOPE']
    if envelope.from_:
        sender_email = envelope.from_[0].mailbox.decode() + "@" + envelope.from_[0].host.decode()
        senders.append(sender_email)
        if b'\\Answered' in data[b'FLAGS']:
            answered_senders.append(sender_email)

sender_counts = Counter(senders)
answered_sender_counts = Counter(answered_senders)

rate_by_sender = {}
for sender, count in sender_counts.items():
    answered_count = answered_sender_counts[sender]  # Counter returns 0 for missing keys
    rate = (answered_count / count) * 100
    rate_by_sender[sender] = rate

sorted_senders = sorted(rate_by_sender.items(), key=lambda x: x[1], reverse=True)
top_senders = sorted_senders[:most_common_senders]

print("✅ Top senders by answering rate:")
for sender, rate in top_senders:
    print(f"Sender: {sender}, Answering Rate: {rate}%")

✅ Top senders by answering rate:
Sender: premium@academia-mail.com, Answering Rate: 4.166666666666666%
Sender: info@n.myprotein.com, Answering Rate: 0.0%
Sender: noreply@newsletter.dickieslife.com, Answering Rate: 0.0%
Sender: news@mailing.tommy.com, Answering Rate: 0.0%
Sender: mary@email.numerade.com, Answering Rate: 0.0%
Sender: noreply@gardaland.it, Answering Rate: 0.0%
Sender: updates@academia-mail.com, Answering Rate: 0.0%
Sender: eg@f6s.com, Answering Rate: 0.0%
Sender: newsletter@lifecycle.quizlet.com, Answering Rate: 0.0%
Sender: support@mathway.com, Answering Rate: 0.0%
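
In the output above, most of the listed senders have a reply rate of 0%, because ranking by rate alone fills the remaining slots with senders that were never answered. One possible refinement, sketched here under assumptions, is to drop senders with no replies and to break ties by message volume. The function `reply_rates`, its parameters, and the `(sender, was_answered)` input shape are hypothetical and do not come from the notebook.

```python
from collections import Counter


def reply_rates(messages, top_n=10, drop_unanswered=True):
    """Rank senders by reply rate.

    messages: iterable of (sender_email, was_answered) pairs (assumed input shape).
    Returns a list of (sender_email, rate_in_percent) tuples, highest rate first.
    """
    received = Counter()
    answered = Counter()
    for sender, was_answered in messages:
        received[sender] += 1
        if was_answered:
            answered[sender] += 1

    # Reply rate = answered emails / received emails, as a percentage
    rates = {sender: answered[sender] / count * 100 for sender, count in received.items()}
    if drop_unanswered:
        rates = {sender: rate for sender, rate in rates.items() if rate > 0}

    # Sort by rate first, then by number of received emails to order ties
    ranked = sorted(
        rates.items(),
        key=lambda item: (item[1], received[item[0]]),
        reverse=True,
    )
    return ranked[:top_n]


sample = [
    ("alice@example.com", True),
    ("alice@example.com", False),
    ("news@example.com", False),
    ("bob@example.com", True),
]
print(reply_rates(sample))
```

Whether to exclude unanswered senders is a design choice: keeping them matches the notebook's behavior, while dropping them limits the result to senders the user has actually replied to.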


### Identify the unseen emails from the top senders since `date_start`

In [7]:
unseen_emails = []
for sender, _rate in top_senders:  # top_senders holds (sender, reply rate) pairs
    unseen_messages = server.search(['UNSEEN', 'FROM', sender, 'SINCE', start.strftime('%d-%b-%Y')])
    unseen_emails.extend(unseen_messages)
len(unseen_emails)

19

### Extract the date, sender, and subject from the unseen_emails list to provide data for the output

In [10]:
email_list = []
for msg_id in unseen_emails:
    email_data = server.fetch(msg_id, ['ENVELOPE'])[msg_id][b'ENVELOPE']
    sender = email_data.from_[0].mailbox.decode() + "@" + email_data.from_[0].host.decode()
    # The envelope date or subject can be missing, so fall back to placeholders
    date = email_data.date.strftime("%Y-%m-%d") if email_data.date else "unknown"
    raw_subject = email_data.subject.decode(errors="replace") if email_data.subject else ""
    # Decode every encoded part of the subject with its own charset, not only the first part
    subject = str(email.header.make_header(email.header.decode_header(raw_subject)))
    email_list.append((sender, date, subject))
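
Subjects containing non-ASCII characters usually arrive as RFC 2047 encoded-words, and a single subject can mix encoded and plain parts. The sketch below shows one way to decode all parts with the standard library. The helper name `decode_subject` is hypothetical, and the sample subject is made up for illustration.

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Decode an RFC 2047 encoded subject into a plain string (hypothetical helper)."""
    if not raw_subject:
        # Some messages have no subject at all
        return ""
    if isinstance(raw_subject, bytes):
        # Encoded-words are ASCII on the wire; replace anything unexpected
        raw_subject = raw_subject.decode("ascii", errors="replace")
    # decode_header splits the subject into (value, charset) parts;
    # make_header joins them back, decoding each part with its own charset
    return str(make_header(decode_header(raw_subject)))


print(decode_subject(b"=?utf-8?q?Caf=C3=A9?= menu"))
```

Joining every decoded part avoids truncating subjects that consist of more than one part, which can happen when only the first element returned by `decode_header` is used.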

## Output

### Print the list of unseen emails from the top senders since `date_start`

In [11]:
for sender, date, subject in email_list:
    print(f"{sender:<25} {date} \"{subject}\"")

premium@academia-mail.com 2023-07-08 "“T Mitev” cited by “Orsolya Szakály”"
premium@academia-mail.com 2023-07-10 "“T. Mitev” cited by “Xuanji Hou”"
premium@academia-mail.com 2023-07-12 "“T. Mitev” cited by “Xuanji Hou”"
info@n.myprotein.com      2023-07-08 "2x1 | Il Prodotto che costa meno é GRATIS 👀👀"
info@n.myprotein.com      2023-07-09 "Pulley & Gran Dorsale: ecco come allenarli"
info@n.myprotein.com      2023-07-09 "Ultime Ore 🚨 2x1 su 200+ Prodotti"
info@n.myprotein.com      2023-07-10 "45% di SCONTO + Spedizione GRATIS ⚡️ 3,2,1 VIA"
info@n.myprotein.com      2023-07-10 "Hazelnut Whip é Tornato in Stock 😍"
info@n.myprotein.com      2023-07-12 "ULTIME 24H ⏰ Spedizione GRATIS + 40% di SCONTO"
info@n.myprotein.com      2023-07-13 "40% di SCONTO continua su Prodotti Selezionati ⚡️⚡️"
noreply@newsletter.dickieslife.com 2023-07-08 "Abbiamo i modelli Dickies perfetti per te!"
noreply@newsletter.dickieslife.com 2023-07-11 "Riparti in grande stile 🎒"
noreply@newsletter.dickieslife.com 2023