Measuring Gmail clipping limit #41

Open
hteumeuleu opened this issue May 17, 2018 · 34 comments

Owner

@hteumeuleu hteumeuleu commented May 17, 2018

Yesterday, an interesting conversation was started on the #emailgeeks Slack by @hellocosmin regarding Gmail clipping limit.

Do we have a source for the magic 102KB message size Gmail clipping limit? I'm curious if someone actually tested this or if it was communicated from an official source, as I've been blindly following it (like most of us probably have), and I've recently seen emails that were 99.something KB getting clipped (before you ask, it was after ESP added stuff to it).

A few people, including myself, weighed in to share their experience and a few test results, but without a definitive answer yet. I loved the little collaborative investigation that went on for a few hours, but it seems Slack is not really appropriate for this (given the temporary nature of conversations there). So I thought here would be a good place to continue this together.

The problem

"Message clipped" screenshot in Gmail

When HTML emails are too large, Gmail clips them with a [Message clipped] notice and a View entire message link. It's been widely shared that this clipping occurs after 102Kb (example: Gmail is clipping my email on Mailchimp). But no one seems to know where this number comes from, and different people have experienced different results around the 100Kb mark.

So what is going on exactly? Can we figure out Gmail's clipping algorithm?

Owner Author

@hteumeuleu hteumeuleu commented May 17, 2018

Here are a few tests I ran yesterday.

First tests

My first question was how any of this was calculated. Does Gmail consider ~100Kb after doing all its prefixing and filtering (removing styles and HTML tags it doesn't support, converting class names and such)? So first I ran this test with 400 tables and 400 style tags.

Test email clipped at the 181st table

In this test, the email is clipped at the 181st table. If we try to reproduce this locally by keeping the 400 style tags but only 181 tables, we obtain an HTML file that macOS reports as 100 Kb (99 507 bytes, or 102 Kb on drive according to the macOS info dialog). Here's the result file of this first test.

To confirm this, I ran a second test without the style tags this time, only the 400 tables. The result shows the email is clipped at the 254th table.

Test email clipped at the 254th table

By reproducing this locally (and only keeping 254 tables), I can measure the weight of the file to be 100 Kb (99 906 bytes, or 102 Kb on drive according to the macOS info dialog). Here's the result file of this second test.

Finally, I ran a third test with 400 tables, each with an HTML data attribute on each <td> (<td data-a-very-long-attribute-that-gmail-will-filter="true">). This should confirm whether Gmail measures the weight before or after any filtering.

Test email clipped at the 223rd table

The result shows the email is clipped at the 223rd table. But interestingly, the dummy text is clipped right after the first word. Here's the code of this table as seen in Gmail in Chrome by inspecting the code.

<table border="0" cellpadding="0" cellspacing="0" width="100%">
	<tbody>
		<tr>
			<td class="m_320685387637992867style223">
				<h1>223</h1>
				<p>
					Lorem </p>
			</td>
		</tr>
	</tbody>
</table>

If we reproduce that exact same email locally (with only 223 tables and the text clipped after the first word at the end), we obtain a file that is once again 100 Kb (100 217 bytes, or 102 Kb on drive according to the macOS info dialog). Here's the result file of the third test.

First observations

Here are the first observations that I draw from these three tests:

  • The weight considered by Gmail for clipping is before any filtering or prefixing happens. So we can use the weight measured in our OS as an indication.
  • My three tests return the email clipped at slightly different weights, but always at the exact same weight "on drive" (102 Kb on drive according to the macOS info dialog).
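
One possible reading of the identical "on drive" figure: if the volume allocates space in 4,096-byte blocks (an assumption about the test machine, not something stated above), all three byte counts round up to the same 102,400 bytes on disk. A minimal Python sketch of that arithmetic, where `on_disk_size` and the block size are illustrative:

```python
import math


def on_disk_size(byte_count: int, block_size: int = 4096) -> int:
    """Round a byte count up to whole allocation blocks.

    block_size=4096 is an assumed value; the real figure depends on the volume.
    """
    return math.ceil(byte_count / block_size) * block_size


# Byte counts reported for the three result files above.
for size in (99_507, 99_906, 100_217):
    print(size, "bytes ->", on_disk_size(size), "bytes on disk")
```

Under that assumption, the shared "102 Kb on drive" value would say more about the file system than about Gmail, so the raw byte count is probably the more useful number to compare between tests.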

@revelt revelt commented May 17, 2018

Thank you for sharing!

It's definitely not exactly 100KB; I've seen cropping happen at smaller sizes. My "rule of thumb" has always been to aim for file sizes under 80KB. 1 character = 1 byte (for ASCII text), so that's around 80,000 characters, which is easy to check in a code editor: select all and read the total character count in the status bar.

What's also interesting is that Gmail receives not your HTML but what the ESP sent, basically what you see in the message's raw source. Various factors will bloat the served HTML code in there: ESP link URL scrambling, quoted-printable encoding, and sometimes ESPs serving the message Base64-encoded...

So 100 could be the threshold, but it is definitely the upper bound, and quite possibly it takes the shape of 100x1024 characters in the raw source of the HTML part of the message, as received by the email server. I'd still aim for less than 80,000 characters in the source HTML.
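
To get a feel for how much the transfer encoding alone can add, here is a rough Python sketch using only the standard library. The sample markup is made up for illustration, and a real ESP may wrap lines or rewrite links differently, so treat the numbers as indicative only:

```python
import base64
import quopri

# Made-up sample: 200 short links containing "=" signs and non-ASCII characters.
html = ('<a href="https://example.com/?a=1&b=2">Caf\u00e9 \u00a9</a>\n' * 200).encode("utf-8")

qp = quopri.encodestring(html)   # quoted-printable: "=" becomes "=3D", non-ASCII bytes become "=XX"
b64 = base64.encodebytes(html)   # Base64 with line breaks, roughly 4/3 of the input size

for label, data in (("original", html), ("quoted-printable", qp), ("base64", b64)):
    print(f"{label:17} {len(data):7} bytes")
```

Markup with many attributes tends to grow noticeably under quoted-printable because every `=` is escaped, which is one reason the size of the source file and the size Gmail receives can differ.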


@cossssmin cossssmin commented May 17, 2018

Here's the third test file in Windows:

image

Differences are to be expected, and I think neither are true to what Gmail actually counts server side, as @revelt pointed out.

On this note, I'd encourage taking tools that test your HTML file size against 'Gmail's limit' with a grain of salt. The point is 'we don't know exactly yet', so apparently useful tools might be a little misleading. It can happen they're just a tiny bit off, but that can be the difference between your tracking pixel being removed in all Gmail clients, or not.

Here's the third test file that was clipped, in the tool I linked to above:

image


@revelt revelt commented May 17, 2018

Very very cheeky

Owner Author

@hteumeuleu hteumeuleu commented May 17, 2018

@hellocosmin You tested the file after clipping. So isn't the tool actually accurate to show that it will pass Gmail clipping? If I test the original file, it indeed says the email is too big (at 175.869140625 Kb).

@revelt Good point on being careful with manipulations done on the ESP side. I used Putsmail for all my tests, which is pretty safe as far as I know.


@cossssmin cossssmin commented May 17, 2018

Indeed Rémi, I've tested the same third result file. You mentioned macOS reported to be 102KB on disk.

So an HTML that you'd think would very likely be clipped (according to macOS' report) was actually reported as 'safe'. Even if we don't consider 'on disk', and we take it to be 100KB, the difference is still large enough to be misleading: there's a 2.1318359375KB difference at the very least, so logically I can imagine a ~103.13KB file as also being reported 'safe'.

My point was we should take such file size weighing tools as estimates, and definitely not fully trust some JavaScript that calculates text file size as being the same thing Gmail does when deciding to clip :)
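
Part of the gap may simply be units. The figures quoted in this thread are consistent with the tool dividing the byte count by 1,024 while macOS divides by 1,000. That is an inference from the numbers, not something confirmed about the tool. A small Python sketch of the two conventions (`sizes` is an illustrative helper):

```python
def sizes(byte_count: int) -> tuple:
    """Return (decimal kilobytes, binary kibibytes) for a byte count."""
    return byte_count / 1000, byte_count / 1024


kb, kib = sizes(100_217)  # byte count of the third result file
print(kb, kib)
```

For the 100 217-byte file this gives 100.217 and 97.8681640625, and 100 minus the latter is the 2.1318359375 quoted above. Either way, the byte count is the unambiguous number to compare.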


@pbiolsi pbiolsi commented Jun 6, 2018

From your testing/understanding, what do we know about how/if images impact the calculable email weight by Gmail?

Since we're all serving images through some external server or CDN, I've always assumed that their size only had an implication on bandwidth/load time... but perhaps that is incorrect and the Gmail image cache is in some way to blame for clipping?

Our embedded styles run pretty lean (so not triggering the CSS characters limit) and we use built-in minification provided by our ESP (Listrak) at the time of send to remove whitespace. Yet, we still see emails as small as 32kb getting clipped by Gmail.

Any possibility they impose a pixel height-based limitation that could be triggering that? Like Outlook has been known to?


@cossssmin cossssmin commented Jun 6, 2018

Never saw emails that small being clipped in Gmail, and I sent quite a few that were larger than 32KB. You sure your ESP isn't adding stuff to what Gmail receives?


@revelt revelt commented Jun 6, 2018

@pbiolsi check the raw source. I'm pretty sure Gmail is measuring the raw source. Now, if the message is multipart and there's Base64 with some heavy link scrambling, sizes will bloat significantly.

1 character in your email's raw source = 1 byte. That includes escape characters (used in quoted-printable encoding, for example). Once you have ruled out the raw source, move on to images.

Sizes as low as 32KB should not get clipped; something's wrong here.
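
One way to check the raw source is to save the message as received (for example via Gmail's "Show original") and measure the HTML part both as transmitted and after decoding. A minimal Python sketch using the standard `email` package; `html_part_sizes` is an illustrative helper, and whether Gmail counts the encoded or the decoded size is exactly what is still unknown here:

```python
import email


def html_part_sizes(raw_message: bytes):
    """Yield (transfer_encoding, size_as_sent, size_decoded) for each text/html part."""
    msg = email.message_from_bytes(raw_message)
    for part in msg.walk():
        if part.get_content_type() != "text/html":
            continue
        encoding = str(part.get("Content-Transfer-Encoding", "7bit")).lower()
        as_sent = part.get_payload()             # body as transmitted (still encoded)
        decoded = part.get_payload(decode=True)  # bytes after undoing QP/Base64
        # For quoted-printable and Base64 the transmitted body is plain ASCII,
        # so one character is one byte; 8bit bodies would need the charset instead.
        yield encoding, len(as_sent), len(decoded)
```

Comparing the two sizes for a message that clips and one that does not would show whether the encoding overhead is pushing the part over the limit.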


@pbiolsi pbiolsi commented Jun 8, 2018

@hellocosmin @revelt great points about the ESP bloat, however I've actually seen this in Litmus previews using the Chrome extension (so serving local source pre-ESP... also un-minified in this case). Maybe something else is causing this from the Litmus side, but I believe those previews use PutsMail so should be pretty true to source.

I'll investigate more and post back.

Owner Author

@hteumeuleu hteumeuleu commented Oct 16, 2018

Last week on the #emailgeeks Slack, @M-J-Robbins shared an interesting example that gets clipped because of a special character.

Just got an email through that Gmail said was clipped but the code is tiny and no obvious errors.

This is the character in question: `[invisible character]` (not sure if slack will auto convert that) but I believe it’s this one https://unicode-table.com/en/0092/

Here’s an html example https://litmus.com/scope/ajulzityfxh6

<html>
  <p>�you�re</p>

A screenshot on Gmail showing an email getting clipped because of a special character

This only happens on the new Gmail redesign, not on the old one. This is very interesting because I also noticed a lot of emails getting clipped for no apparent reason since the redesign. I'll try to see if there are more characters triggering this.


@ericlepetit ericlepetit commented Oct 16, 2018

Funnily enough, the GitHub email notification for this message was clipped on Gmail :)
I will bring this up to the Gmail team.


@revelt revelt commented Dec 10, 2018

Hi all, just cross-posting here from the Email Geeks Slack for posterity, because this surfaces once in a while. If you want to check whether your email contains any non-ASCII characters (like the culprit above, Unicode's "Private Use Two"), I created a CLI app (terminal app) for that: https://www.npmjs.com/package/email-all-chars-within-ascii-cli
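
For anyone without Node at hand, the same kind of check can be sketched in a few lines of Python. This is not how the linked CLI is implemented; `find_non_ascii` is just an illustrative helper that lists every character above code point 127:

```python
import unicodedata


def find_non_ascii(text: str):
    """Return (index, code point, name) for every character outside ASCII."""
    hits = []
    for index, char in enumerate(text):
        if ord(char) > 127:
            name = unicodedata.name(char, "<no name>")  # control characters have no name
            hits.append((index, f"U+{ord(char):04X}", name))
    return hits


print(find_non_ascii("<p>you\u0092re \u00a9</p>"))
```

An empty list means the markup is pure ASCII, which rules out the special-character trigger discussed above.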


@hthompson82 hthompson82 commented Dec 18, 2018

Not only have I had issues with emails smaller than 102kb clipping, I've also sent several emails WELL over the 102kb (180-220kb) which don't clip. Anyone have any ideas? Getting a brick wall from Google whenever I try to talk to anyone about it.


@revelt revelt commented Dec 18, 2018

@hthompson82 hi! Do you remember what the encoding of your raw messages was when they arrived at the Gmail server (for example, quoted-printable, Base64, etc.)? Did you measure the character count in the raw HTML as received, in the decoded HTML, or in the HTML that was put into the ESP (the first is the most interesting)? Also, was the message multipart and, if so, how big was the text part (the hypothesis being that the text version's content might affect Gmail's clipping limit)?


@hthompson82 hthompson82 commented Dec 20, 2018

@revelt The encoding is "text/html; charset=utf-8". I count the kb size according to how it comes in to my Outlook (since I can't see size in Gmail) so that I can be sure it includes encoding, dynamic data etc. Our test sends are generally multipart but the text version varies between one word ("test") and a couple of paragraphs. Testing without the multipart text version still results in clipping, on the ones which clip.


@jkupczak jkupczak commented Dec 20, 2018

@hthompson82 When you say Outlook, do you mean the Outlook web client or do you mean the desktop application that you have to install?

If you mean the desktop application, does Outlook let you see the original message? I don't have Outlook installed at the moment so I can't check for myself. But I thought that Outlook would only show you the message source which is the version that Outlook parsed using the Word engine.


@cossssmin cossssmin commented Dec 20, 2018

@jkupczak the desktop Outlook app can show you the original HTML source, as it was received. You need to double click the message to open in a new window, then click "Message" in the top left, and then:

View source in Outlook


@hthompson82 hthompson82 commented Jan 3, 2019

@hthompson82 When you say Outlook, do you mean the Outlook web client or do you mean the desktop application that you have to install?

If you mean the desktop application, does Outlook let you see the original message? I don't have Outlook installed at the moment so I can't check for myself. But I thought that Outlook would only show you the message source which is the version that Outlook parsed using the Word engine.

Hi @jkupczak, I use the Outlook desktop app, which includes the size as a column in the inbox. (I also test emails via mail.yahoo.com and hotmail.com in the browser, plus the various mobile apps on iOS and Android.)

As an aside, I've also tested compressing the HTML (stripping out all white space) but to no avail.

Owner Author

@hteumeuleu hteumeuleu commented Feb 11, 2019

Last week on the #emailgeeks Slack, @M-J-Robbins shared an interesting example that gets clipped because of a special character.

Just got an email through that Gmail said was clipped but the code is tiny and no obvious errors.
This is the character in question: `[invisible character]` (not sure if slack will auto convert that) but I believe it’s this one https://unicode-table.com/en/0092/
Here’s an html example https://litmus.com/scope/ajulzityfxh6

<html>
  <p>�you�re</p>

A screenshot on Gmail showing an email getting clipped because of a special character

This only happens on the new Gmail redesign, not on the old one. This is very interesting because I also noticed a lot of emails getting clipped for no apparent reason since the redesign. I'll try to see if there are more characters triggering this.

The bug mentioned in this thread seems fixed. Can anyone else confirm?


@revelt revelt commented Feb 12, 2019

Not fixed. Same thing — both original <html><p>�you�re</p> and also same thing wrapped with normal HTML head/body in a table. This is still happening.

screen shot 2019-02-12 at 00 40 26

Owner Author

@hteumeuleu hteumeuleu commented Aug 19, 2019

Got an even shorter example triggering the clipped message being shown because of a special character:

<html>
  <p>©</p>

Encoding the copyright character into &copy; fixes the problem here.
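
To apply the same workaround to a whole template, one option is to turn every non-ASCII character into a numeric character reference before sending. A minimal Python sketch (`encode_non_ascii` is an illustrative name; it emits `&#169;` rather than `&copy;`, which browsers treat the same way):

```python
def encode_non_ascii(html: str) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    return html.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(encode_non_ascii("<p>\u00a9</p>"))  # prints <p>&#169;</p>
```

This only sidesteps the special-character trigger; it does nothing about the size limit itself, and each reference is longer than the character it replaces.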


@jclusso jclusso commented Sep 17, 2019

@hteumeuleu I have this issue and I'm using &copy; No idea why though.


@jkupczak jkupczak commented Sep 17, 2019

@jclusso

Got any information you can share?


@revelt revelt commented Sep 17, 2019

yeah, let's create a minimal case to be able to reproduce.. as they say "He who asserts must prove"


@jclusso jclusso commented Sep 17, 2019

I think the issue is that the provider I'm using (customer.io) to send the emails is setting the Content-Type header to text/html; charset=iso-8859-1. I'm still confused why &copy; wouldn't work, since as far as I'm aware it should. Most other emails I get seem to use text/html; charset=utf-8, which makes me believe switching to that would solve this.


@hthompson82 hthompson82 commented Sep 17, 2019

Just sent two test emails from Adobe Campaign, one with &#x00A9; and one with ©. The unencoded version clipped, and the encoded one did not. Looks to me like the theory is sound. I use hexadecimal instead of a named entity, though.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width" />
<title>Untitled Document</title>
</head>

<body>
	&#x00A9;
</body>
</html>

@revelt revelt commented Sep 17, 2019

@hthompson82 I checked the Adobe Creative Cloud newsletter which is allegedly sent from their platform. The encoding in the raw source is charset="windows-1252", encoded in quoted printable, with a corresponding charset=Windows-1252 meta tag in HTML. It seems all nice until one tries to decode the following raw piece:

See the tips<span class=3D"we=
b"> =9B</span></a></td>=20

It is not Windows-1252 but Windows-1251/Windows-1257 — a chevron. Test yourselves, https://dencode.com/en/string/quoted-printable

So, to sum up, it seems that the Adobe Campaign platform is using the wrong charset to encode its quoted-printable, and as a consequence all non-ASCII characters are mangled. HTML-encoded characters are unaffected because they stay within ASCII, so they're fine.


@cossssmin cossssmin commented May 16, 2020

Here's a weird one:

https://m3.news.ubisoft.com/nl/jsp/m.jsp?c=%405nrrN%2BHHqfIXH1q%2FHaFADEvE%2Bv1jGn1Gp%2FMdstPVNaI%3D

19KB email, shown in full, but still showing the 'clipped' message. I checked the source in Gmail web, and even the tracking pixel is there, so nothing was actually clipped...

This is what it looks like in Gmail web:

image

Owner Author

@hteumeuleu hteumeuleu commented May 16, 2020

@cossssmin There's a © character in the footer, so I guess that's why.


@Nivicious Nivicious commented Aug 17, 2020

Just re-coded a template for a new client due to clipping.
Reduced the raw HTML file by 100kb to now sit at 51kb.
Tested via Putsmail and it's still clipping.

Read through this thread and simply replaced the © character with &copy; and voila, fixed.

Contributor

@avigoldman avigoldman commented Aug 21, 2020

On a related note - AMP for Email recently added a fixed 100kb size limit for emails. That is probably a good indication of what Gmail actually wants the 102kb limit to be.
ampproject/amphtml#29698 (comment)

Owner Author

@hteumeuleu hteumeuleu commented Nov 12, 2020

I just spent half an hour wondering why I got the "Message clipped" notice on an email. Turns out, I had an HTML comment in French with an accented character (like <!-- Mentions légales -->). I removed the accent and Gmail's message disappeared. Here’s a simple test case to try:

Hello world ! <!-- é -->
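
Since comments are easy to overlook, a quick scan for non-ASCII characters inside them may save that half hour. A rough Python sketch; the regular expression is a simplification rather than a full HTML parser, and `comments_with_non_ascii` is an illustrative name:

```python
import re

# Non-greedy match of anything between "<!--" and "-->", across line breaks.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def comments_with_non_ascii(html: str):
    """Return the HTML comments that contain at least one non-ASCII character."""
    return [c for c in COMMENT.findall(html) if not c.isascii()]


print(comments_with_non_ascii("Hello world ! <!-- é -->"))
```

Outlook conditional comments are matched like any other comment here, so accented text inside them would be reported too.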

@vladh vladh commented Jan 19, 2021

I encountered this problem when sending email to Gmail from Outlook. Sending a message with any German characters, such as "Grüsse", would cause the message to clip in Gmail.

I fixed this problem by making sure my email was encoded as UTF-8, which can be set in the Outlook settings.
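
One plausible mechanism, offered as a guess rather than anything confirmed about Gmail: if the bytes are written in one charset and read as another, a valid character can become undecodable. The Python sketch below only illustrates that mismatch with the word from this example:

```python
text = "Grüsse"

latin1_bytes = text.encode("iso-8859-1")  # ü is the single byte 0xFC
utf8_bytes = text.encode("utf-8")         # ü is the two bytes 0xC3 0xBC

# Reading ISO-8859-1 bytes as UTF-8 fails on the lone 0xFC byte,
# which decodes to the replacement character U+FFFD.
misread = latin1_bytes.decode("utf-8", errors="replace")

print(len(latin1_bytes), len(utf8_bytes), repr(misread))
```

Making the declared charset match the actual encoding, as done here with UTF-8, removes that ambiguity.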
