<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# Gmail - List most interesting emails
<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Gmail/Gmail_Get_emails_stats_by_sender.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/Open_in_Naas_Lab.svg"/></a><br><br><a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=&template=template-request.md&title=Tool+-+Action+of+the+notebook+">Template request</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=Gmail+-+Get+emails+stats+by+sender:+Error+short+description">Bug report</a> | <a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Naas/Naas_Start_data_product.ipynb" target="_parent">Generate Data Product</a>

**Tags:** #gmail #productivity #naas_drivers #operations #automation #analytics #plotly

**Author:** [Antonio Georgiev](https://www.linkedin.com/in/antonio-georgiev-b672a325b)

**Description:** This notebook analyzes the user's inbox, identifies the senders whose emails the user opened most often in the last two weeks, and lists the unseen emails received from those senders over the same period.
It helps the user keep track of their most important emails so that none of them is left unopened.

## Input

### Import libraries

In [1]:
import datetime
import os
from imapclient import IMAPClient
import naas
from collections import Counter
import quopri
import email.header

### Setup Variables
Create an application password following [this procedure](https://support.google.com/mail/answer/185833?hl=en)
- `username`: This variable stores the username or email address associated with the email account
- `password`: This variable stores the password or authentication token required to access the email account
- `server`: This variable holds the `IMAPClient` connection to Gmail's IMAP server (`imap.gmail.com`), which is used for reading emails.

In [15]:
username = "xxxxx@xxxx"
password = naas.secret.get("GMAIL_APP_PASSWORD")
server = IMAPClient('imap.gmail.com')
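
Outside a Naas environment, `naas.secret` may not be available. As a minimal sketch, assuming the app password has instead been exported as an environment variable, the value could be read with the standard library. The helper name `get_app_password` and the idea of storing the secret in the environment are assumptions for illustration, not part of the original notebook.

```python
import os


def get_app_password(env_var="GMAIL_APP_PASSWORD", environ=None):
    """Return the app password stored in an environment variable.

    Hypothetical helper: the variable name mirrors the secret key used in
    this notebook, but keeping it in the environment is an assumption.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(env_var, "").strip()
    if not value:
        # Fail early with a clear message instead of a failed login later
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return value
```

The returned value could then be passed to `server.login(username, password)` in place of the Naas secret.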

## Model

### Connect to email box

In [16]:
server.login(username, password)
server.select_folder('INBOX')

{b'PERMANENTFLAGS': (b'\\Answered',
  b'\\Flagged',
  b'\\Draft',
  b'\\Deleted',
  b'\\Seen',
  b'$NotPhishing',
  b'$Phishing',
  b'\\*'),
 b'FLAGS': (b'\\Answered',
  b'\\Flagged',
  b'\\Draft',
  b'\\Deleted',
  b'\\Seen',
  b'$NotPhishing',
  b'$Phishing'),
 b'UIDVALIDITY': 1,
 b'EXISTS': 1462,
 b'RECENT': 0,
 b'UIDNEXT': 1573,
 b'HIGHESTMODSEQ': 243011,
 b'READ-WRITE': True}

### Get all seen emails from the last two weeks with their flags, date, and sender

In [17]:
today = datetime.date.today()
two_weeks_ago = today - datetime.timedelta(days=14)
messages = server.search(['SEEN', 'SINCE', two_weeks_ago.strftime('%d-%b-%Y')])
metadata = server.fetch(messages, ['RFC822.SIZE', 'FLAGS', 'INTERNALDATE', 'ENVELOPE'])
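
The IMAP `SINCE` criterion takes a date such as `24-Jun-2023`, with an English month abbreviation, whereas `strftime('%b')` follows the current locale. A minimal sketch of a locale-independent formatter is shown here; the helper `imap_since_date` and the constant `IMAP_MONTHS` are hypothetical names introduced for illustration.

```python
import datetime

# English month abbreviations used by the IMAP date syntax (RFC 3501)
IMAP_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_since_date(today, days=14):
    """Return the date `days` before `today` formatted as DD-Mon-YYYY."""
    start = today - datetime.timedelta(days=days)
    return f"{start.day:02d}-{IMAP_MONTHS[start.month - 1]}-{start.year:04d}"


print(imap_since_date(datetime.date(2023, 7, 8)))  # 24-Jun-2023
```

The resulting string can be used wherever the cell above calls `two_weeks_ago.strftime('%d-%b-%Y')`.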

### Get the most viewed senders by counting their occurrences

In [18]:
senders = []
for data in metadata.values():
    envelope = data[b'ENVELOPE']
    # Skip messages without a From header
    if envelope.from_:
        sender_email = envelope.from_[0].mailbox.decode() + "@" + envelope.from_[0].host.decode()
        senders.append(sender_email)
# Count the opened emails per sender and keep the three most frequent senders
sender_counts = Counter(senders)
most_seen_senders = sender_counts.most_common(3)
print(most_seen_senders)

[('info@n.myprotein.com', 15), ('premium@academia-mail.com', 11), ('news@mailing.tommy.com', 8)]
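
The ranking above counts opened emails only, so a sender who writes very often can outrank one whose emails are always opened. As a rough sketch, assuming the senders of unseen emails are collected in the same way as the seen ones, an opening rate per sender could be computed as follows. The function `open_rates` and the sample addresses are hypothetical and purely illustrative.

```python
from collections import Counter


def open_rates(seen_senders, unseen_senders, min_emails=1):
    """Return (sender, rate, total) tuples sorted by opening rate, highest first."""
    seen = Counter(seen_senders)
    unseen = Counter(unseen_senders)
    rates = []
    for sender in set(seen) | set(unseen):
        total = seen[sender] + unseen[sender]
        if total >= min_emails:
            rates.append((sender, seen[sender] / total, total))
    # Sort by rate, then by volume, then by name for a deterministic order
    return sorted(rates, key=lambda r: (-r[1], -r[2], r[0]))


# Illustrative data only, not taken from a real inbox
seen_example = ["a@example.com"] * 3 + ["b@example.com"]
unseen_example = ["a@example.com"] + ["b@example.com"] * 3
print(open_rates(seen_example, unseen_example))
# [('a@example.com', 0.75, 4), ('b@example.com', 0.25, 4)]
```

The `min_emails` parameter is a design choice that lets you ignore senders with too few emails for the rate to be meaningful.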


### Identify the unseen emails for the past two weeks from the top senders

In [20]:
unseen_emails = []
for sender, count in most_seen_senders:
    unseen_messages = server.search(['UNSEEN', 'FROM', sender, 'SINCE', two_weeks_ago.strftime('%d-%b-%Y')])
    unseen_emails.extend(unseen_messages)
unseen_emails

[1558, 1565, 1572, 1564]

### Extract the date, sender, and subject from the unseen_emails list to provide data for the output

In [21]:
email_list = []
for msg_id in unseen_emails:
    email_data = server.fetch(msg_id, ['ENVELOPE'])[msg_id][b'ENVELOPE']
    sender = email_data.from_[0].mailbox.decode() + "@" + email_data.from_[0].host.decode()
    date = email_data.date.strftime("%Y-%m-%d %H:%M:%S")
    # The subject can be missing, and it can consist of several encoded parts
    raw_subject = email_data.subject.decode() if email_data.subject else ""
    # Decode every part with its declared charset and join the results
    subject = str(email.header.make_header(email.header.decode_header(raw_subject)))
    email_list.append((sender, date, subject))
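
Email subjects may be RFC 2047 encoded and split into several parts, each with its own charset. As a standalone sketch, the standard library can decode every part and join them into one string. The helper name `decode_subject` is hypothetical, and the sample subject is made up for illustration.

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Decode an RFC 2047 encoded header value into a plain string."""
    if raw_subject is None:
        return ""
    if isinstance(raw_subject, bytes):
        # Envelope fields arrive as bytes; undecodable bytes are replaced
        raw_subject = raw_subject.decode("utf-8", errors="replace")
    # make_header joins all decoded parts, each with its declared charset
    return str(make_header(decode_header(raw_subject)))


print(decode_subject(b"=?utf-8?q?2x1_su_Pi=C3=BA_di_200_Prodotti?="))
# 2x1 su Piú di 200 Prodotti
```

Handling `None` explicitly avoids an `AttributeError` for messages that have no subject.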

## Output

### Print the list with the unseen emails for the past two weeks from the top senders

In [22]:
print("The unseen emails by the top 3 senders (based on opened emails) for the last 2 weeks:")
email_list

The unseen emails by the top 3 senders (based on opened emails) for the last 2 weeks:


[('info@n.myprotein.com',
  '2023-07-05 10:01:29',
  'Hey😵 ti stai perdendo il 2x1 su 200+ Prodotti'),
 ('info@n.myprotein.com', '2023-07-06 10:01:14', '🚨 2x1 su 200+ Prodotti 🚨'),
 ('info@n.myprotein.com',
  '2023-07-07 10:01:08',
  '2x1 su Piú di 200 Prodotti? Sí, hai sentito bene 😎'),
 ('premium@academia-mail.com',
  '2023-07-06 02:32:14',
  '50% Off, 2 days only - “T. Mitev” cited by “Eva Papazova”')]
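
The raw list of tuples can be hard to scan when there are many results. A minimal sketch of a plain-text formatter, assuming rows shaped like the `(sender, date, subject)` tuples built above, is given here. The function name `format_email_rows` and the sample rows are hypothetical.

```python
def format_email_rows(rows):
    """Format (sender, date, subject) tuples as aligned text lines."""
    if not rows:
        return "No unseen emails from the top senders."
    # Pad every sender to the longest one so the subjects line up
    width = max(len(sender) for sender, _, _ in rows)
    return "\n".join(
        f"{date}  {sender.ljust(width)}  {subject}"
        for sender, date, subject in rows
    )


# Illustrative rows only, not taken from a real inbox
sample_rows = [
    ("news@example.com", "2023-07-05 10:01:29", "Weekly digest"),
    ("billing@example.org", "2023-07-06 02:32:14", "Your invoice"),
]
print(format_email_rows(sample_rows))
```

Calling `print(format_email_rows(email_list))` would display the notebook's own results in the same layout.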