Synchronize gmail labels #43

Closed
wants to merge 22 commits into from

@aroig
Member

aroig commented Jul 28, 2013

Hi,

This big pull-request adds functionality to store gmail labels in a special email header X-Keywords, and sync them just like IMAP flags.

I sent patches to the offlineimap mailing list almost a year ago, here is the thread:

http://comments.gmane.org/gmane.mail.imap.offlineimap.general/5943

Last January X Ryl contacted me about merging this. He asked me to make the labels stuff available for the plaintext backend as well, and, well, I didn't find the time to do it, so nothing happened. Now here it is.

This pull request contains the original labels stuff, plus several bugfixes thanks to some people who have started using it, and reported problems, plus support for the plaintext backend.

I'm hoping this could be merged, as it is a very useful feature for us label-enthusiast gmail users, and now it has had some reasonable testing.

A summary of what is contained in this pull request:

  1. When a message is downloaded from the gmail server, adds a header X-Keywords with a comma-separated list of its labels.

  2. Updates the LocalStatus (both sqlite and plaintext) to include columns for labels and local mtimes. For non-gmail repositories these columns are ignored.

    It also implements migration between backends and format upgrades for both plaintext and sqlite (the last batch of commits).

  3. When labels change on the gmail side, syncs them the same way as flags get synced (comparing with LocalStatus etc)

  4. Adds a GmailMaildir folder type, which keeps track of individual message modification times (the POSIX mtime) and uses them to spot messages which have been modified locally. Then, only for those modified messages (typically very few), it reads the labels and syncs them back to gmail, the same way as flags.

  5. Adds an option to filter out certain headers when uploading messages to gmail. One may want to remove X-Keywords before sending a message back to gmail.

  6. Adds an option to ignore certain labels, like \Draft, for which flags serve the same purpose. Gmail internally keeps the D flag in sync with the \Draft label, or the F flag with the \Starred label.
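
To make item 1 concrete, here is a minimal sketch of storing labels in a header and reading them back. The helper names `labels_to_header` and `labels_from_message` are hypothetical and are not taken from the patch; the sketch also assumes labels contain no commas, which the real code may handle differently.

```python
from email import message_from_string


def labels_to_header(labels):
    """Join labels into a comma-separated header value (sorted for stable output)."""
    return ", ".join(sorted(labels))


def labels_from_message(raw, header="X-Keywords"):
    """Return the set of labels stored in `header`, or an empty set if it is absent."""
    msg = message_from_string(raw)
    value = msg.get(header, "")
    return {part.strip() for part in value.split(",") if part.strip()}


# Build a tiny message carrying two labels, then read them back.
raw = (
    "Subject: hello\n"
    "X-Keywords: " + labels_to_header({"work", "todo"}) + "\n"
    "\n"
    "body\n"
)
labels = labels_from_message(raw)
```

Because the labels live in the message file itself, any tool that can read mail headers can see them without consulting offlineimap's status data.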

Feel free to contact me about any additional fixes required before merging.

@seanfarley

I just tried to merge this for my own experiments but ran into a few conflicts that were deep enough in the code to make me come here and ask for a rebase against 6.5.5 ;-) Thanks!

@aroig
Member Author

aroig commented Oct 5, 2013

Here is the rebase. I was about to do it anyway when I saw your message!

@seanfarley

Ah, nice! Great timing :-D

@mturquette

👍

@chmduquesne

Awesome!

@ghost

ghost commented Nov 20, 2013

Wow, this looks awesome! Any idea whether/when this will be merged in?

@cooljeanius

Having this pull request merged would be helpful for the MacPorts port of offlineimap: https://lists.macosforge.org/pipermail/macports-dev/2013-December/025514.html

@cro

cro commented Dec 20, 2013

I'm experimenting with this as well. Do we know the status of this pull request?

@cro

cro commented Dec 23, 2013

@aroig I've installed offlineimap from your gmail_labels fork and it doesn't appear to be pulling my labels into my Maildirs. The mtime and labels columns of the LocalStatus-sqlite databases are always empty. Do you have a suggestion as to where I can look to debug this?

@aroig
Member Author

aroig commented Dec 24, 2013

@cro Ok, make sure you have offlineimap configured to use type=GmailMaildir as local folder and type=Gmail as the remote one. If you have quick=-1 enabled, it will not fetch labels until you get a new message.

To debug this I'd put some prints in the labels-related code to see what happens. The files involved are folder/Gmail.py and folder/GmailMaildir.py.

@cro

cro commented Dec 26, 2013

On 12/24, Abdó Roig-Maranges wrote:

@cro [...] and type=Gmail as the remote one [...]

THAT was the piece I was missing! Thanks very much.

@jeroentbt

Any ideas on when this would be merged in to master?

@cro

cro commented Jan 13, 2014

I don't have an answer to that, but you could just check it out from @aroig's fork and install from there. That's what I did. It's been working great for me.

@jeroentbt

@cro I have it set up, and it does work great, but it would be nice knowing this will be maintained.

@aroig
Member Author

aroig commented Jan 14, 2014

@jeroentbt Well, I do plan to keep using it for my own mailing needs, so unless a rock from outer space falls to my head, or something like that, it will be kind of maintained for a while.

@chris001
Member

I like this pull request. Since it's been out there for 6 months and has been tested by several people, I agree it should be merged into the OfflineIMAP:next or master branch. Any word back recently from anyone on the team of maintainers?

@jeroentbt

@aroig good to hear :). And thank you for the great work!
It's a wonderful feature for gmail users of offlineimap and it would be great if it was readily and easily available to more users.

@attila-lendvai

pardon my ignorance, i'm new to emails in emacs, and i couldn't find info on this. i'm approaching it using offlineimap + mu4e.

i think i have set up this branch to sync labels, but what shall i see then?

if i open some emails in mu4e, then i don't see any X-Keywords headers in the messages that have labels. is there a way or need to force syncing labels, or do i need to delete and re-sync into an empty local MailDir with the new settings? (i don't have quick=-1)

@jeroentbt

Have you tried setting mu4e-action-tags-header?

Add this to your init.el somewhere:
(setq mu4e-action-tags-header "X-Keywords")

Cheers,
Jeroen

Jeroen Tiebout

@aroig
Member Author

aroig commented Jan 17, 2014

@attila-lendvai In the manpage for the labels fork of offlineimap there are some details on the setup. The relevant part of the .offlineimaprc config file is this:

      [Account Gmail-foo]
      localrepository = Gmaillocal-foo
      remoterepository = Gmailserver-foo
      # Need this to be able to sync labels
      status_backend = sqlite
      synclabels = yes
      # This header is where labels go.
      labelsheader = X-Keywords
      # May not want to propagate this header back to gmail
      filterheader = X-Keywords

      [Repository Gmailserver-foo]
      #This is the remote repository
      type = Gmail
      remotepass = XXX
      remoteuser = XXX

      [Repository Gmaillocal-foo]
      #This is the 'local' repository
      type = GmailMaildir

Make sure you back up your local maildir folder, as well as offlineimap's status folder (usually in ~/.offlineimap). Setting it up and doing a sync (which may take some time the first time) is all you need to use the labels stuff.

You can check the message files for the X-Keywords header to make sure it worked. Then in order for mu4e to recognize the new header, you need to reindex your maildir.
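
One way to do that check without opening files by hand is a short script based on Python's standard `mailbox` module. This is only a sketch: the function name `count_labelled` is hypothetical, and the path you pass depends on your own setup.

```python
import mailbox


def count_labelled(maildir_path, header="X-Keywords"):
    """Return (labelled, total) message counts for a single Maildir folder."""
    # create=False: never create directories when the path is mistyped.
    box = mailbox.Maildir(maildir_path, create=False)
    labelled = 0
    total = 0
    for msg in box:
        total += 1
        # A message counts as labelled if the header is present at all.
        if msg.get(header) is not None:
            labelled += 1
    return labelled, total
```

Calling `count_labelled` on one of your synced folders should report a non-zero first number once label sync is working; the function only reads the folder.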

@attila-lendvai

i didn't check the most obvious place... sorry... :/

after fixing my config file, deleting my text based old state, and letting an sqlite state be rebuilt... now i have a working mu4e where i have successfully run some tag based queries.

all seems to work, but i need to get rid of the local draft folder where emacs saves .#backup files that confuses syncing, and i'm not sure it won't save unencrypted drafts which shouldn't get synced to gmail.

and maybe also need to look into how to use search query for inbox and only sync All Mail. but that's another story...

syncing works, thanks for the help!

This broke code that relied on the filename being up to date in memory after
messages are copied.

It seems NotImplementedException does not exist. It must be a relic from old
python...

Added the configuration setting usecompression for the IMAP repositories.
When enabled, the data from and to the IMAP server is compressed.

Preparing for a sequence of commits implementing gmail label sync, I have split
some functions so I will be able to change some functionality in folder.Gmail,
with less code repetition.

Also added some functions to folder.base and imaputil I will later need.

Adds two columns to the LocalStatus Sqlite table:
  * labels: A comma separated list of labels, to be used by Gmail folder.
  * mtime: The POSIX modification time for the message in a local Maildir.

The interface for the class LocalStatusSQLite remains compatible with what it
was (i.e. new arguments to functions are optional with default values).
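
A rough sketch of such an upgrade with Python's `sqlite3` module follows. The table name `status`, its starting columns, and the function names are assumptions made for illustration; they are not copied from the patch.

```python
import sqlite3


def open_status(path=":memory:"):
    """Open a status database containing only the older columns (assumed layout)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS status (id INTEGER PRIMARY KEY, flags TEXT)"
    )
    return conn


def upgrade_status(conn):
    """Add the mtime and labels columns if missing; existing rows get defaults."""
    existing = {row[1] for row in conn.execute("PRAGMA table_info(status)")}
    if "mtime" not in existing:
        conn.execute("ALTER TABLE status ADD COLUMN mtime INTEGER DEFAULT 0")
    if "labels" not in existing:
        conn.execute("ALTER TABLE status ADD COLUMN labels TEXT DEFAULT ''")
    conn.commit()


def save_message(conn, uid, flags, mtime=0, labels=""):
    """New arguments are optional, so callers written for the old schema keep working."""
    conn.execute(
        "INSERT OR REPLACE INTO status (id, flags, mtime, labels) VALUES (?, ?, ?, ?)",
        (uid, flags, mtime, labels),
    )
    conn.commit()
```

Giving the new parameters default values mirrors the compatibility point above: a non-gmail folder can call `save_message(conn, uid, flags)` and never touch the extra columns.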

When synclabels config flag is set to "yes" for the gmail repo, offlineimap
fetches the message labels along with the messages, and embeds them into the
message under the header X-Keywords, as a comma separated list.

The configuration option labelsheader allows changing the header under which
labels are stored. X-Keywords is a useful choice as some mail programs may
recognize it.

It also adds an extra pass to savemessageto, that performs label synchronization
on existing messages from gmail to local, the same way it is done with flags.

The ignorelabels configuration setting is a list of comma separated
labels that will be left alone. They will be neither added to nor removed from
any message.
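
The flag-style comparison can be pictured with plain sets. This is a sketch under assumptions: `label_changes` and `apply_changes` are hypothetical names, and the real code works on folders and message lists rather than bare sets.

```python
def label_changes(current, status, ignore=frozenset()):
    """Compare labels on one side with the last-synced status.

    Returns (to_add, to_remove); labels in `ignore` are never touched.
    """
    to_add = (current - status) - ignore
    to_remove = (status - current) - ignore
    return to_add, to_remove


def apply_changes(target, to_add, to_remove):
    """Apply the computed changes to the other side's label set."""
    return (target | to_add) - to_remove


# Gmail gained "urgent" and lost "todo" since the last sync.
remote = {"work", "urgent", "\\Draft"}
status = {"work", "todo"}
local = {"work", "todo", "personal"}

to_add, to_remove = label_changes(remote, status, ignore={"\\Draft"})
new_local = apply_changes(local, to_add, to_remove)
```

Comparing against the recorded status rather than directly against the other side is what lets a label that exists only locally ("personal" here) survive the pass.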

The GmailMaildir repository adds functionality to change message labels. It also
keeps track of each message's modification time, so that it can quickly detect
when the labels may have changed.
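
The mtime check amounts to something like the sketch below. The function name and the dictionary of recorded mtimes are assumptions for illustration; the patch keeps the recorded value in LocalStatus.

```python
import os


def modified_since_sync(recorded_mtimes):
    """Yield paths whose on-disk mtime is newer than the recorded one.

    `recorded_mtimes` maps a message file path to the integer mtime stored
    at the previous sync.
    """
    for path, recorded in recorded_mtimes.items():
        current = int(os.stat(path).st_mtime)
        if current > recorded:
            yield path
```

Only the files this yields need to be opened and parsed for labels, which is why the check stays cheap even on a large Maildir.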

When filterheaders is set to a comma-separated list of headers, offlineimap
removes those headers from messages before uploading them to the server.
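
As an illustration of the idea (not the patch's own implementation), header filtering can be sketched as follows. `filter_headers` is a hypothetical helper that assumes LF line endings and treats lines starting with a space or tab as continuations of the previous header.

```python
def filter_headers(raw, names):
    """Return `raw` with every header listed in `names` removed.

    Only the header section (before the first blank line) is touched;
    matching is case-insensitive and folded continuation lines are dropped
    together with the header they belong to.
    """
    head, sep, body = raw.partition("\n\n")
    drop = {name.lower() for name in names}
    kept = []
    skipping = False
    for line in head.split("\n"):
        if line[:1] in (" ", "\t"):
            # Continuation line: shares the fate of the header it continues.
            if not skipping:
                kept.append(line)
            continue
        name = line.split(":", 1)[0].strip().lower()
        skipping = name in drop
        if not skipping:
            kept.append(line)
    return "\n".join(kept) + sep + body
```

Text in the body that happens to look like a filtered header is left alone, because only the part before the first blank line is scanned.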

aroig added 10 commits March 27, 2014 21:47

Corrects behaviour of addheader when the message has no '\n\n' separating
headers from content (it may happen with empty content).
When there is no '\n\n', the header is added at the beginning of the message,
and the newline after it was missing.
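
The corner case can be reproduced with a small stand-in. `add_header` below is a hypothetical simplification, not the addheader function from the patch, and it assumes LF line endings.

```python
def add_header(raw, name, value):
    """Insert `name: value` into the header section of `raw`."""
    line = "%s: %s" % (name, value)
    pos = raw.find("\n\n")
    if pos == -1:
        # No blank separator: the whole message is headers (empty content).
        # Prepend the new header, terminated by its own newline.
        return line + "\n" + raw
    # Normal case: append as the last header, just before the blank line.
    return raw[:pos] + "\n" + line + raw[pos:]
```

Leaving out the `"\n"` in the first branch would glue the new header onto the first existing header line, which is the kind of breakage described above.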

A query for X-GM-LABELS had slipped in commit 6ecdfc4.

This produced a TypeError because self.messagelist remained None,
instead of {}.

cachemessagelist from Gmail folder mixed up uid's with sequential
message numbers. This effectively broke Gmail folder when synclabels is
enabled and some filtering on messages is performed, like maxage.

Now we make sure to use sequential numbers in cachemessagelist, as
they compact into ranges better than uid's.
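
To see why sequence numbers compact better, consider a helper that collapses integers into IMAP-style ranges. `to_ranges` is a hypothetical name written for this illustration only.

```python
def to_ranges(numbers):
    """Collapse integers into a comma-separated list of 'start:end' ranges."""
    parts = []
    start = prev = None
    for n in sorted(set(numbers)):
        if start is None:
            start = prev = n
        elif n == prev + 1:
            prev = n  # still inside the current run
        else:
            parts.append(str(start) if start == prev else "%d:%d" % (start, prev))
            start = prev = n
    if start is not None:
        parts.append(str(start) if start == prev else "%d:%d" % (start, prev))
    return ",".join(parts)


# Sequence numbers of the five newest messages are contiguous ...
by_seq = to_ranges([996, 997, 998, 999, 1000])
# ... while their uids usually have gaps left by deleted messages.
by_uid = to_ranges([4101, 4105, 4230, 4231, 4900])
```

Contiguous sequence numbers collapse to a single range, whereas uids with gaps produce one entry per run, so the request sent to the server grows with the number of gaps.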

For instance, if we run offlineimap with maxage set up, older messages
in statusfolder get removed.

Then, if you disable maxage again, older messages will be missing in
status folder, and this leads to KeyError's.

  * Implements Status Folder format v2, with a mechanism to upgrade an
    old statusfolder.

  * Do not warn about Gmail and GmailMaildir needing sqlite backend
    anymore.

  * Clean repository.LocalStatus reusing some code from
    folder.LocalStatus.

  * Change field separator in the plaintext file from ':' to '|'. Now
    the local status stores gmail labels. If they contain the field
    separator character (formerly ':'), they get messed up. The new
    character '|' is less likely to appear in a label.
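
The separator problem is easy to demonstrate. The line layout used here (uid, flags, mtime, labels) is an assumption about the v2 plaintext format, and `parse_status_line` is a hypothetical helper.

```python
def parse_status_line(line, sep="|"):
    """Split one status line into (uid, flags, mtime, labels)."""
    uid, flags, mtime, labels = line.rstrip("\n").split(sep)
    label_set = {part.strip() for part in labels.split(",") if part.strip()}
    return int(uid), set(flags), int(mtime), label_set


# A label containing ':' is harmless when fields are separated by '|' ...
ok = parse_status_line("42|FS|1396000000|work, project:alpha")

# ... but with ':' as separator the same label yields an extra field.
try:
    parse_status_line("42:FS:1396000000:work, project:alpha", sep=":")
    broke = False
except ValueError:
    broke = True
```

A label containing '|' would still break this naive parser, which matches the wording above: the new separator is less likely to collide, not immune.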

  * Do not inherit LocalStatusSQLiteFolder class from the plaintext
    one.

  * Use some functions already in BaseFolder in both the plaintext and
    sqlite classes.

  * Add a saveall method. The idea is that saveall dumps the entire
    messagelist to disk, while save only commits the uncommitted
    changes. Right now, save is a noop for sqlite, and equivalent to
    saveall for plaintext, but it enables us to be more clever about
    when we commit to disk in the future.

  * Do not migrate from plaintext in LocalStatusSQLiteFolder. That was
    quite hackish, and broke with plaintext v2.

If the folder has to be created when we request a LocalStatus folder,
we look whether the other backend has data, and if it does we migrate it
to the new backend.

The old backend is left alone, so that if you change back, say from
sqlite to plaintext, the older data is still there. That should not lead
to data loss, only a slower sync while the status folder gets updated.

In particular, this makes changing labels for an existing message
safer.

Previously, we did not use a unique filename in tmp/ (used the existing
name) and that could lead to filename collision and data loss if
multiple offlineimap processes accessed the same file.

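
The general pattern, writing to a uniquely named file in tmp/ and then renaming it over the original, can be sketched as below. This uses `tempfile.mkstemp` to get the unique name, which is a substitute technique chosen for the example; the patch generates its own Maildir-style names, and `rewrite_message` is a hypothetical function.

```python
import os
import tempfile


def rewrite_message(maildir, filename, new_content):
    """Replace cur/<filename> via a uniquely named temporary file in tmp/."""
    fd, tmp_path = tempfile.mkstemp(prefix="relabel-", dir=os.path.join(maildir, "tmp"))
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(new_content)
        # Rename within the same filesystem: readers see either the old
        # file or the new one, never a half-written message.
        os.replace(tmp_path, os.path.join(maildir, "cur", filename))
    except Exception:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Two processes rewriting the same message each get their own temporary file, so neither can truncate the other's partially written copy.
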
@aroig
Member Author

aroig commented Apr 6, 2014

I rebased this on top of current master

@seanfarley

Thanks @aroig! I'm thinking of just using your repo for the MacPorts version because I'm tired of this pull request not being merged in.

@fommil

fommil commented Apr 27, 2014

@aroig this sounds amazing... I'd love to get this working with http://notmuchmail.org

Their philosophy is to not touch the original Maildir files (except for some flags like S (seen), R (replied) and F (flagged) which I understand change the name of the file on disk).

Imagine for a moment that I hacked notmuch so that it modified the mail files themselves to add/remove the relevant header information. Would that work seamlessly with your patches, or does adding/removing a label require creating a new message and deleting the old one? (i.e. are messages mutable or immutable?)

@aroig
Copy link
Member Author

aroig commented Apr 27, 2014

@fommil This patch makes messages into a mutable thing. When labels change on gmail, the message file gets modified.

Notmuch already has tags, the trouble with them is that they get stored into the database. Making notmuch store tags inside messages would be like converting a tool that only sniffs around into something that can change and potentially mess up valuable data. I don't think any such patch would be accepted upstream. Plus it would require a nontrivial amount of work to keep the tags inside the database and the tags inside the messages in sync.

My solution was to move away from notmuch and use mu (http://www.djcbsoftware.nl/code/mu/) for mail indexing. mu does not store extra data into the database, it only indexes what is already there, including labels, making things simpler. Then I can modify the labels from the message files with custom scripts or extensions of my email client and the mail indexer can keep just sniffing around.

@fommil

fommil commented Apr 27, 2014

@aroig that is all great news. I'm loving notmuch but I'll also look into mu.

@fommil

fommil commented Apr 27, 2014

@aroig have you got a tutorial or something for setting up your branch and mu together to do tag/label syncing? e.g. what needs to change from a standard setup

@fommil

fommil commented Apr 27, 2014

(and which versions are needed of mu... I have 0.9.9)

@aroig
Member Author

aroig commented Apr 27, 2014

@fommil I don't have any tutorial, sorry. For the offlineimap labels thing, you can look at the man page. As for mu, it essentially works fine out of the box, using X-Keywords for the labels header.

I think mu 0.9.9 should be ok for the indexing part. If you want to use mu4e as a mail client, there is a custom action for changing labels that came a few months later. It is fine for tagging single messages but doesn't do bulk tagging yet. I use a script I hacked together for that.

If you have further questions regarding this particular setup, please email me, so that we do not hijack this pull-request thread.

@chmduquesne

A thing that would be backward compatible with other tools, including
notmuch, would be that instead of adding a header with the tag in the mail,
hack offlineimap so that it makes hardlinked copies of the message in the
relevant subfolders. This would save some bandwidth and some space on the
disk, and notmuch would be happy. Then syncing the tags with notmuch would
just be a matter of adding/removing the message in/from the relevant
subfolder whenever notmuch manipulates the tags.


@fommil

fommil commented Apr 28, 2014

@chmduquesne that would be great... but I think there is still quite a bit of work to be done on the notmuch side before this is possible. I'd be happy with the wasted space initially, and then look at hard links later. Anyway, we're hijacking the thread :-)

@konvpalto
Member

I have managed to merge 21 out of 22 patches into my 'next' branch. It needs some testing (around 2-3 days), and if all goes fine, it will be pushed to the 'next' branch at GitHub.

Thanks for everyone's patience. And thanks to @aroig for his work!

@konvpalto
Member

Current drop into "next", namely

incorporate all stuff from this pull request. Please, test.

Thanks for the submission and your work.

@aroig
Member Author

aroig commented May 6, 2014

Cool! Thanks for taking the time to review it! I'm running from next now and I'll be testing it in the following days.

I just ran pylint on offlineimap's source tree, and managed to discover a bunch of bugs, mostly trivial typos. This is pull request #86

@konvpalto
Member

Thanks! Please, tell about the outcome of the testing: I think that we had gathered enough commits to roll out a new version and make people happy.

@aroig
Member Author

aroig commented May 7, 2014

Since yesterday it is working fine. I'd like to do some more testing other than daily usage, like migrating from a no-labels Maildir, etc. I'll probably do that over the weekend and report back.

@aroig
Member Author

aroig commented May 11, 2014

Hi again,

I've done some more testing, including migration from a pre-labels Maildir. Everything seems to work fine, except for some small issues I fixed in pull request #87

@konvpalto
Member

Thanks, this was incorporated into 6.5.6
