Skip to content

Conversation

@link2xt
Copy link
Collaborator

@link2xt link2xt commented Feb 13, 2021

Current plan is to place the image directly into Chat-User-Avatar header. As the header can be large, it should be placed into the root of encrypted MIME message, like Autocrypt-Gossip. If the message is not encrypted, single-part multipart/mixed message should be created and protected_headers placed into its root.

@link2xt link2xt mentioned this pull request Feb 13, 2021
4 tasks
@link2xt link2xt marked this pull request as draft February 13, 2021 20:58
@link2xt link2xt force-pushed the chat-user-avatar-base64 branch 2 times, most recently from 77b6f36 to d5349c8 Compare February 13, 2021 23:49
@link2xt link2xt mentioned this pull request Feb 14, 2021
@link2xt
Copy link
Collaborator Author

link2xt commented Feb 14, 2021

One problem is that Delta Chat currently only parses headers attached to the outermost part and to the outermost encrypted part.

If the message is unencrypted and we attach a header to the main part, which is then wrapped into multipart/mixed, this header will not be parsed.

The plan is to introduce "hidden headers" in addition to "protected" and "unprotected". They should be attached to the same place as protected headers if the message is encrypted, and to the main part of multipart/mixed if it is not encrypted.

When parsed, if message is unencrypted, "hidden headers" should have lower priority than unprotected headers, so it's impossible to put a From header inside multipart/mixed and override outermost From with it.

@link2xt link2xt force-pushed the chat-user-avatar-base64 branch 2 times, most recently from aad5690 to fce8e4c Compare February 14, 2021 21:19
@r10s r10s mentioned this pull request Feb 16, 2021
@link2xt link2xt force-pushed the chat-user-avatar-base64 branch 3 times, most recently from 9945c85 to 35e78af Compare February 16, 2021 20:51
@link2xt link2xt force-pushed the chat-user-avatar-base64 branch 2 times, most recently from e4db38d to 4da878e Compare February 28, 2021 15:32
@link2xt link2xt force-pushed the chat-user-avatar-base64 branch from 4da878e to 87578e5 Compare March 5, 2021 03:02
@link2xt link2xt marked this pull request as ready for review March 5, 2021 03:02
@link2xt link2xt force-pushed the chat-user-avatar-base64 branch from 87578e5 to e7640c6 Compare March 5, 2021 03:18
@hpk42
Copy link
Contributor

hpk42 commented Mar 5, 2021

Not sure about "hidden headers" -- why precisely do you need this category?
I would think that we try to always put our special chat headers (other than Autocrypt) into multipart/mixed and it always overrides top-level headers (if any). If encrypted, they become better protected. If we still parse top-level headers (in case we don't find a particular header in the multipart/mixed) we would be backward compatible but at some point we could switch off support for parsing chat-headers from top-level at all. One question is if current providers let the bodies with extra mime headers pass.

sidenote: the body (here multipart/mixed) is DKIM-hashed (pretty much by everyone, it's the "bh" attribute) so we could play with veryfying the body hash and get some protection against mangling from providers. But this is not the main argument here.

@link2xt
Copy link
Collaborator Author

link2xt commented Mar 5, 2021

I would think that we try to always put our special chat headers (other than Autocrypt) into multipart/mixed

The problem is that previous versions of Delta Chat don't read the headers there. So it's better to keep them where they are, at least for now.

Copy link
Contributor

@r10s r10s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apart from the issued mentioned below, lgtm, also tested sending a bit, very nice to get clean mails :)

EDIT: see comment below, there is a maybe blocking issue with outlook.

///
/// These are large headers which may hit the header section size limit on the server, such as
/// Chat-User-Avatar with a base64-encoded image inside.
pub hidden: Vec<Header>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if hidden is a good wording here. other suggestions that are maybe a bit closer to the purpose might be large or noimf.

let message = headers
.protected
.into_iter()
.fold(message, |message, header| message.header(header));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as pointed out by @link2xt at #2232 (comment) , these protected headers are needed in the outermost imf-header for unencrypted messages.

however, the logic below (last else branch) will move them to the first body-part-header, where they are not readable by current delta chats - and also not by other mua, which results in messages without a subject (the subject is an protected header, however, must be moved there only when encryption takes place).

probably only a tiny change is needed to correct that, however, with a large impact then :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a fixup after your test.

@r10s
Copy link
Contributor

r10s commented Mar 9, 2021

UNFORTUNATELY THIS DOES NOT WORK WITH OUTLOOK :(

we're getting the error Permanent SMTP error: permanent: 5.6.0 Invalid message content when handling MIME exception. IMC53

hm, seems as if they do not like these headers, we have suspected/feared that a bit, indeed.

it is a pity, what are they fingering around in the message ...

hm, not sure if it makes sense to follow that way longer, maybe vcard is the way to go, although i still think, it is not the best approach ux-wise as these attachments are unexpected. we should think over that, however.

EDIT: or is the linewrapping just wrong? Chat-User-Avatar: is followed by a linebreak instead of the start of the value, https://gist.github.com/r10s/acf5bd6521b772fff7575c93b234c304 , at least i have not seen that much before

@r10s
Copy link
Contributor

r10s commented Mar 9, 2021

EDIT: or is the linewrapping just wrong? Chat-User-Avatar: is followed by a linebreak instead of the start of the value, https://gist.github.com/r10s/acf5bd6521b772fff7575c93b234c304 , at least i have not seen that much before

no that seems not to be the case, i tested with wrapping at 42 characters which resulted in ...

Chat-User-Avatar: base64:iVBORw0KGgoAAAANSUhEUgAAAI8AAAEAEAYAAADaYv
	lpAAFEEUlEQVR4nOz9dZwVx5fHjZ/qOz7DCO4M7u4Q ...

... still the same issue. so it is maybe either length or header-name.

@r10s
Copy link
Contributor

r10s commented Mar 9, 2021

... still the same issue. so it is maybe either length or header-name.

it is the size. i could send an avatar of 2k through outlook, a 83k avatar fails.

EDIT: after binary testing, 22.5k works while 25k does not, with base64 encoding, this looks like a 32k header limit, a bit less.

@r10s
Copy link
Contributor

r10s commented Mar 9, 2021

EDIT: after binary testing, 22.5k works while 25k does not, with base64 encoding, this looks like a 32k header limit, a bit less.

the limit is per-header-key. i could add ten different header-keys of 22.5k each. the resulting header, incl. base64 encoding is >300k. not sure, however, if that is a way to go or if that gets too hackish ;)

@link2xt
Copy link
Collaborator Author

link2xt commented Mar 10, 2021

Splitting similar to filename*0, filename*1 etc. sounds ok, but in our case we'll split into several headers, like Chat-User-Avatar*0, Chat-User-Avatar*1 etc.

@r10s
Copy link
Contributor

r10s commented Mar 10, 2021

Splitting similar to filename0, filename1 etc. sounds ok, but in our case we'll split into several headers, like Chat-User-Avatar*0, Chat-User-Avatar*1 etc.

i tested that with outlook and gmail, using 10 headers of these headers of each 22.5k - no problems, so, yes, maybe this is the way go go if it does not get too complicated. if we run into problems, we can go to the old method again (at least the receiving code is there). in practise, maybe use a bit less, maybe 16k (incl. base64 overhead) per header.

wrt DKIM: that is added by smtp? so, we can be probably pretty sure the body is not modified on receiving, once it is sent out (otherwise far more problems may arise) (i did not test the receiving part at all yet)

@hpk42
Copy link
Contributor

hpk42 commented Mar 11, 2021

Typically DKIM does not sign our Autocrypt or Chat-* headers. They could be modified by MTAs without detection. Then again, we do not directly deal with DKIM which is defined on the MTA-to-MTA level, not MUAs. Note that we can suggest/recommend to sign certain headers but a "Chat-Avatar-*" dynamic headerlist would be tricky, likely, for providers. So what about only using the full avatar in encrypted mkessages and otherwise down-compute it to 22.5K for outlook via a "max_header_size" provider-db setting?

@r10s
Copy link
Contributor

r10s commented Mar 12, 2021

So what about only using the full avatar in encrypted mkessages and otherwise down-compute it to 22.5K for outlook via a "max_header_size" provider-db setting?

during one-to-one talk with @hpk42 , we came to the conclusion, that provider-db is not really needed (and also may be complicated wrt different domains, groups). we can do the 22.5K limit just always for unencrypted messages. most avatars are below that anyway.

@link2xt
Copy link
Collaborator Author

link2xt commented Mar 12, 2021

I also don't like using provider-db for such workarounds, because there are always servers which we don't know about.

@r10s
Copy link
Contributor

r10s commented Mar 13, 2021

i just had a look on my real-life account: there are 355 avatars (files starting with avatar* in blobdir), 50% smaller than 8k, only 10 avatars (less than 3%) are larger than 22.5k - the largest one is 178k, the second largest one 80k. so, having a guarantee that avatars get not too big seems to be a worthful thing anyway.

  1. we could either do this additional check at setting selfavatar, that would require an update of existing avatars.

  2. or do the additional check directly before sending, which, however, may result in double recoding. maybe not the best option - but if that is a simple way to go, probably still fine as we can refine later in this case (no persisted data are changed)

  3. we could also always keep the original on setting selfavatar and recode as needed. not sure, if that would require additional caching performance-wise, in general, i would avoid that :)

@link2xt link2xt force-pushed the chat-user-avatar-base64 branch from e7640c6 to 814362c Compare March 15, 2021 00:24
@link2xt
Copy link
Collaborator Author

link2xt commented Mar 15, 2021

I have also looked into putting the avatar in multipart/related, but Thunderbird shows unreferenced attachments.

We are not the first to try to put large images into headers, in this case author found that headers are limited to 998 characters on some servers: https://quimby.gnus.org/circus/face/

Compressing avatars further would be nice, but there is no easy way to request particular size from the encoder. image create does not support WebP encoding, but supports AVIF encoding. Maybe we could switch to it later, but sure not in this PR. This will need either support on all platforms, which is probably true only for WebP but not AVIF, and images may still be too large occasionally.

For now the easiest solution seems to be splitting into multiple headers.

@link2xt
Copy link
Collaborator Author

link2xt commented Mar 15, 2021

For reference, * is allowed in header field names: https://tools.ietf.org/html/rfc5322#section-3.6.8

@r10s
Copy link
Contributor

r10s commented Mar 15, 2021

We are not the first to try to put large images into headers, in this case author found that headers are limited to 998 characters on some servers: https://quimby.gnus.org/circus/face/

hm, but that refers to the outer imf header, not to the inner body part headers, or? and i think even for the outer imf header, the 998 characters limit is a bit outdated the real-world. Autocrypt headers are larger most times, also many other headers.

Compressing avatars further would be nice, but there is no easy way to request particular size from the encoder.

maybe just check the size in a loop: if it is too large, divide width/height by N, recode to JPEG, and check the size again. compared to other things, that does not sound that hard; we could put that in a function.

For reference, * is allowed in header field names: https://tools.ietf.org/html/rfc5322#section-3.6.8

but there may still be real-world-problems of specific implementations (RFC also does not limit header size afaik :) - and, even more real, the DKIM issue mentioned by @hpk42

also, there may be other limits in the header, as an overall header size of 200K or so, we would bypass that by using an upper limit for avatars as well.

@link2xt
Copy link
Collaborator Author

link2xt commented Mar 15, 2021

and, even more real, the DKIM issue mentioned by @hpk42

What is the issue? I think @hpk42 only suggested to move more headers into body to protect them with DKIM. Which is not possible for Autocrypt though as it should be in the IMF header according to the spec and other clients expect it only there.

maybe just check the size in a loop: if it is too large, divide width/height by N, recode to JPEG, and check the size again. compared to other things, that does not sound that hard; we could put that in a function.

Probably makes sense to open a separate PR for this and merge it in first then. recode_to_image_size currently uses hardcoded pixel width/height, can be changed to reduce the image size until it fits some byte size instead. By the way I think it should return error if the image is not JPEG rather than leaving it as is. And this needs a migration to recode existing avatars or remove them if they can't be recoded.

@r10s
Copy link
Contributor

r10s commented Mar 15, 2021

wrt DKIM: i was refering to the comment #2232 (comment) , however, if DKIM only uses headers in the outer imf-header and our Chat-User-Avatar is always inside body this is not really an issue, indeed.

Probably makes sense to open a separate PR for this and merge it in first then. recode_to_image_size currently uses hardcoded pixel width/height, can be changed to reduce the image size until it fits some byte size instead.

yip, we can do a separate pr, we can even merge this one in first (remaining issue), that would only make outlook unusable for some days, that is no depreciation :)

for recoding, i would keep current BALANCED_AVATAR_SIZE/WORSE_AVATAR_SIZE (which are width+height) - only when the result in bytes is larger that 22k, divide width+height by N and try over.

but, yes, it makes sense to discuss this in a separate PR.

By the way I think it should return error if the image is not JPEG rather than leaving it as is. And this needs a migration to recode existing avatars or remove them if they can't be recoded.

hm, PNG comes from images selected from gallery, i would not cut that option. but we can always recode to JPEG.

@r10s r10s mentioned this pull request Apr 14, 2021
@link2xt link2xt force-pushed the chat-user-avatar-base64 branch 2 times, most recently from 14b0a20 to 438172b Compare April 18, 2021 00:03
Copy link
Contributor

@r10s r10s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i re-checked the issue from https://github.com/deltachat/deltachat-core-rust/pull/2232/files#r590718361 - it is still there.

to reproduce:

  • create a new account, set avatar
  • set unencrypted mail

-> the Subject: header is not in the outer IMF header section but in the inner one.

EDIT: i have written a failing test for that issue at #2370

Copy link
Contributor

@r10s r10s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, great thing! much nicer emails without questions wrt weird and unexpected attachments now :)

successor of this pr will probably be #2298

@Hocuri Hocuri force-pushed the chat-user-avatar-base64 branch from 116b77e to c94fc9f Compare April 29, 2021 14:14
@Hocuri Hocuri force-pushed the chat-user-avatar-base64 branch from c94fc9f to 357eb8d Compare April 29, 2021 16:28
@link2xt link2xt merged commit f713933 into master May 1, 2021
@link2xt link2xt deleted the chat-user-avatar-base64 branch May 1, 2021 04:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants