<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# Gmail - List most interesting emails
<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Gmail/Gmail_Get_emails_stats_by_sender.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/Open_in_Naas_Lab.svg"/></a><br><br><a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=&template=template-request.md&title=Tool+-+Action+of+the+notebook+">Template request</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=Gmail+-+Get+emails+stats+by+sender:+Error+short+description">Bug report</a> | <a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Naas/Naas_Start_data_product.ipynb" target="_parent">Generate Data Product</a>

**Tags:** #gmail #productivity #naas_drivers #operations #automation #analytics #plotly

**Author:** [Antonio Georgiev](https://www.linkedin.com/in/antonio-georgiev-b672a325b)

**Description:** This notebook analyzes a user's inbox to find the senders the user engages with most, based on how many of each sender's emails were opened in the last two weeks, and then lists the unread emails from those senders.

## Input

### Import libraries

In [1]:
import naas
from naas_drivers import email
import pandas as pd
import numpy as np
import plotly.express as px
from datetime import datetime, timedelta, date
import pytz

### Setup Variables
Create an application password following [this procedure](https://support.google.com/mail/answer/185833?hl=en)
- `username`: This variable stores the username or email address associated with the email account
- `password`: This variable stores the password or authentication token required to access the email account
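- `smtp_server`: This variable stores the mail server address passed to the connection. Despite the name, this notebook only reads mail, so it is set to Gmail's IMAP host (`imap.gmail.com`).
- `box`: This variable stores the name of the mailbox or folder to read. It is defined here for reference; the connection call in this notebook does not pass it.
- `days`: This variable stores the number of days to look back when fetching emails.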

In [2]:
username = "theuniverse.bg@gmail.com"
password = naas.secret.get("GMAIL_APP_PASSWORD")
smtp_server = "imap.gmail.com"
box = "INBOX"
days = 14

## Model

### Connect to email box

In [3]:
emails = email.connect(username, password, username, smtp_server)

### Get all seen emails list

In [4]:
# Fetch every email received within the last `days` days
today = datetime.now().date()
start_date = today - timedelta(days=days)
sorted_emails = emails.get_emails_by_date(date=start_date, condition="after or on")
print(len(sorted_emails))
# Keep opened emails only; IMAP spells the flag "\Seen", so strip the backslash before comparing
seen_emails = sorted_emails[sorted_emails['flags'].apply(lambda flags: any(flag.upper().lstrip('\\') == 'SEEN' for flag in flags))]
print(len(seen_emails))
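
A self-contained sketch of the seen/unseen split on a hand-built DataFrame. It assumes `flags` holds a tuple of IMAP flag strings per message; the sample rows and the `is_seen` helper are illustrative and not part of `naas_drivers`. Because IMAP spells the system flag `\Seen`, the helper strips a leading backslash before comparing, so it matches either spelling.

```python
import pandas as pd


def is_seen(flags):
    """Return True if any flag names the IMAP Seen flag, with or without a leading backslash."""
    return any(flag.upper().lstrip("\\") == "SEEN" for flag in flags)


# Hypothetical rows standing in for the DataFrame returned by the driver
sample = pd.DataFrame({
    "subject": ["Invoice", "Newsletter", "Reminder"],
    "flags": [("\\Seen",), (), ("\\Flagged", "SEEN")],
})

seen_mask = sample["flags"].apply(is_seen)
seen = sample[seen_mask]      # opened emails
unseen = sample[~seen_mask]   # everything else
print(seen["subject"].tolist())
print(unseen["subject"].tolist())
```

Computing the mask once and negating it keeps the two subsets complementary by construction.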

### Get most viewed senders by counting the occurrences

In [5]:
senders = seen_emails['from'].apply(lambda x: x['email'])
df_senders = pd.DataFrame({'sender': senders})
most_viewed_senders = df_senders['sender'].value_counts().head(10)

### Identify the top 3 senders

In [6]:
top_senders = most_viewed_senders.head(3).index.tolist()  # addresses of the 3 senders with the most opened emails
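
The ranking above counts opened emails, which favors senders who simply write often. One possible refinement, sketched here on made-up data, is to rank by open rate (opened divided by received) instead. The `inbox` frame, its `sender` and `seen` columns, and the tie-break on `opened` are assumptions for illustration, not part of the notebook's pipeline.

```python
import pandas as pd

# Hypothetical inbox: one row per email, with the sender address and whether it was opened
inbox = pd.DataFrame({
    "sender": ["a@x.com", "a@x.com", "a@x.com", "b@y.com", "b@y.com", "c@z.com"],
    "seen": [True, True, False, True, False, False],
})

# Count received and opened emails per sender, then derive the open rate
stats = inbox.groupby("sender")["seen"].agg(received="count", opened="sum")
stats["open_rate"] = stats["opened"] / stats["received"]

# Highest open rate first; ties go to the sender with more opened emails
ranked = stats.sort_values(["open_rate", "opened"], ascending=False)
top_by_rate = ranked.head(3).index.tolist()
print(ranked)
```

With very few emails per sender the rate is noisy, so a minimum `received` threshold may be worth adding before ranking.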

### Find the unseen emails for the past two weeks from the top senders

In [7]:
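unseen_emails = sorted_emails[sorted_emails['flags'].apply(lambda flags: not any(flag.upper().lstrip('\\') == 'SEEN' for flag in flags))]
# 'from' holds dicts, so extract each address and match it against all top senders, not only the first
unseen_senders = unseen_emails['from'].apply(lambda x: x['email'])
top_sender_emails = unseen_emails[unseen_senders.isin(top_senders)]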
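
Matching against several senders at once can be done with `Series.isin` on the extracted addresses. A minimal sketch on made-up rows, assuming `from` holds a dict with an `email` key (as the sender extraction earlier in the notebook implies); the lowercase normalization is an added assumption so that address case does not affect the match.

```python
import pandas as pd

# Hypothetical unread emails; 'from' is a dict with an 'email' key
unseen = pd.DataFrame({
    "subject": ["Offer", "Update", "Hello"],
    "from": [
        {"name": "Ann", "email": "Ann@X.com"},
        {"name": "Bob", "email": "bob@y.com"},
        {"name": "Cy", "email": "cy@z.com"},
    ],
})
top = ["ann@x.com", "cy@z.com"]  # hypothetical top senders

# Pull the address out of each dict and normalize case before comparing
addresses = unseen["from"].apply(lambda x: x["email"]).str.lower()
matches = unseen[addresses.isin([s.lower() for s in top])]
print(matches["subject"].tolist())
```

An empty `top` list yields an empty result rather than an error, which matters when no emails were opened in the period.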

## Output

### Print the list with the unseen emails for the past two weeks from the top senders

In [8]:
print("The unseen emails by the top 3 senders (based on opened emails) for the last 2 weeks:")
print(top_sender_emails)

The unseen emails by the top 3 senders (based on opened emails) for the last 2 weeks:
Empty DataFrame
Columns: [uid, subject, from, to, cc, bcc, reply_to, date, text, html, flags, headers, size_rfc822, size, obj, attachments]
Index: []
