Extract label-folders from a Gmail inbox Takeout into mbox format, preserving read/starred status

f00b4r0/gmail-mboxsplitter

 
 

GMail MBOX splitter

How to use this Python script to extract emails from a GMail inbox Takeout when you have a lot of emails and other, simpler imports fail.

1. Download MBOX from Google Takeout

2. Split MBOX by labels

./mbox_split.py --infile google_mbox.mbox

You may need to run chmod 755 mbox_split.py before the script can be executed.

Alternatively, you can just export by label from Google Takeout directly.

This script will generate several mailboxes in mbox format in the current working directory: one for each of your Gmail labels, plus "Sent", "Archive" and "INBOX".

You can prefix the output files using the -p <prefix> parameter to the script: all output filenames will be prepended with <prefix>.

Messages are stored only once: a message that has several labels will be stored in only one target mbox, usually the one corresponding to its first valid label.
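The "first valid label" rule can be illustrated with a minimal sketch. It assumes labels are read from the X-Gmail-Labels header that Takeout adds to each message, that they are split on commas, and that messages with no valid label fall back to "Archive". The names META_LABELS and pick_target are hypothetical, and the actual script may skip a different set of labels or handle quoted and encoded labels differently.

```python
import email

# Assumption: the meta labels named in this README are the ones skipped.
META_LABELS = {"Important", "Unread", "Starred", "Newsletters"}


def pick_target(msg, default="Archive"):
    """Return the first non-meta label of a message, or a fallback name."""
    raw = msg.get("X-Gmail-Labels", "")
    # Simplification: a plain comma split; labels containing commas
    # would need real parsing.
    labels = [part.strip() for part in raw.split(",") if part.strip()]
    for label in labels:
        if label not in META_LABELS:
            return label
    return default


msg = email.message_from_string("X-Gmail-Labels: Important,Work,Receipts\n\nbody\n")
target = pick_target(msg)  # "Work": "Important" is skipped, "Receipts" is never reached
```

Because the function returns at the first match, each message maps to exactly one target mbox, which is what keeps messages from being duplicated.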

The script will generate output in the form of:

Storing <message-id> from "sender" to mbox "label"

It also prints an initial count of the messages in the source mbox and a final tally of messages stored and ignored. It is thus recommended to redirect the script output to a file.
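Once the output has been redirected to a file, the "Storing" lines can be summarized per target mbox. The sketch below assumes the lines match the form shown above, with the sender and label in double quotes. The regular expression and the name tally_by_mbox are illustrative and may need adjusting to the real output.

```python
import re
from collections import Counter

# Assumption: log lines look like
#   Storing <message-id> from "sender" to mbox "label"
LINE_RE = re.compile(
    r'^Storing (?P<msgid>.+) from "(?P<sender>.*)" to mbox "(?P<label>.*)"$'
)


def tally_by_mbox(lines):
    """Count how many messages were stored in each target mbox."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # count and tally lines simply do not match
            counts[match.group("label")] += 1
    return counts


sample = [
    'Storing <a@example.com> from "Alice" to mbox "Work"',
    'Storing <b@example.com> from "Bob" to mbox "Work"',
    'Storing <c@example.com> from "Carol" to mbox "INBOX"',
]
per_mbox = tally_by_mbox(sample)
```

With a real log, passing an open file object instead of the sample list works the same way, since the function only iterates over lines.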

Note: Meta labels such as "Important", "Unread", "Starred", or "Newsletters" are ignored. However, the script attempts to preserve "Unread" and "Starred" status by setting the corresponding Status and X-Status flags.
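As an illustration of how such flags can be set, here is a minimal sketch using the standard library's mailbox.mboxMessage, which stores the R (read) flag in the Status header and the F (flagged) flag in X-Status. The function name with_gmail_status and the way labels are passed in are assumptions. The script itself may set these headers differently.

```python
import email
import mailbox


def with_gmail_status(msg, labels):
    """Return an mboxMessage copy carrying read/starred flags."""
    out = mailbox.mboxMessage(msg)
    if "Unread" not in labels:
        out.add_flag("R")  # R = read, written to the Status header
    if "Starred" in labels:
        out.add_flag("F")  # F = flagged, written to the X-Status header
    return out


plain = email.message_from_string("Subject: hello\n\nbody\n")
starred_read = with_gmail_status(plain, ["Work", "Starred"])
unread = with_gmail_status(plain, ["Work", "Unread"])
```

Here starred_read ends up with both flags set, while unread gets neither header, so a mail client should show it as new.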

If you use dovecot, you can stop there: dsync will happily process the mailboxes generated by this script.

3. MBOX to Maildir

An attempt was made to convert on the fly. Unfortunately, it seems the Python Maildir backend does not properly set the filenames according to their delivery date. A script is nevertheless provided as mbox_split_tomaildir.py for the brave: it appears that all email will be successfully converted, but it will look as if it was all received at the time the script was run.
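One possible avenue, not used by mbox_split_tomaildir.py as far as this README states, is to record each message's own date on a mailbox.MaildirMessage before adding it to the Maildir. The sketch below takes the timestamp from the Date header, which is an assumption since the true delivery date may differ. The function name to_maildir_message is hypothetical. In CPython's mailbox module, Maildir.add() appears to apply this date as the file's modification time rather than changing the filename, and whether that is enough for a given mail server has not been verified here.

```python
import email
import email.utils
import mailbox


def to_maildir_message(msg):
    """Wrap a message as a MaildirMessage dated from its Date header."""
    md = mailbox.MaildirMessage(msg)
    date_hdr = msg.get("Date")
    if date_hdr:
        try:
            when = email.utils.parsedate_to_datetime(date_hdr)
            md.set_date(when.timestamp())  # seconds since the epoch
        except (TypeError, ValueError):
            pass  # unparsable Date: keep the default (current time)
    return md


src = email.message_from_string(
    "Date: Mon, 02 Jan 2006 15:04:05 +0000\nSubject: old mail\n\nbody\n"
)
dated = to_maildir_message(src)
```

The result can then be passed to mailbox.Maildir(...).add(dated) in place of the raw message.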
