Skip to content

Improve Maildir file uniqueness in qmail-local.c#90

Draft
xenotrope wants to merge 1 commit into
mainfrom
notqmail-maildir-uniqueness
Draft

Improve Maildir file uniqueness in qmail-local.c#90
xenotrope wants to merge 1 commit into
mainfrom
notqmail-maildir-uniqueness

Conversation

@xenotrope

Copy link
Copy Markdown

Some operating systems quickly recycle PIDs, which can lead to
collisions between Maildir-style filenames, which must be unique and
non-repeatable within one second. This patch is just a means of updating
qmail-local to use the format of the revised Maildir protocol.

It uses four unique identifiers:

  • inode number of the file written to Maildir/tmp
  • device number of the file written to Maildir/tmp
  • time in microseconds
  • the PID of the writing process

A Maildir-style filename would look like the following:

In Maildir/tmp:
time.MmicrosecondsPpid.host
In Maildir/new:
time.IinodeVdeviceMmicrosecondsPpid.host

Additionally, this patch further comforms to the revised Maildir
protocol by looking through the hostname for instances of '/' and ':',
replacing them with "\057" and "\072", respectively, when writing it to
disk. Special thanks go to Matthias Andree for design & sanity-checking.

@xenotrope xenotrope requested a review from schmonz August 26, 2019 22:47
@mbhangui

Copy link
Copy Markdown
Contributor

fmt_xlong() should be moved to fmt_xlong.c.
we should add the fmt_xlong() proto to fmt.h

Comment thread qmail-local.c
pid = getpid();
host[0] = 0;
gethostname(host,sizeof(host));

@DerDakon DerDakon Aug 27, 2019

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one could do an initial allocation for hostname here, so it has enough room to hold "host" first, which should be enough for all sane cases anyway. Or maybe just do a scan through host if it has any of the special chars and just copy over the whole string otherwise (i.e. always).

The most simple version, which will give us enough room for the unchanged host, which is enough for the usual case, is:

stralloc_readyplus(&hostname, str_len(host));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

POSIX gethostname does not null-terminate the buffer if it was too long, so I recommend:

        host[sizeof host - 1] = 0;

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The best solution here is for users to adopt OpenBSD's gethostname(3): "The returned name is always NUL terminated." I've updated this to use a stralloc and then store the hostname in the sa.s array and write '\0' in the last byte. For hostnames with 256 or more characters, this will truncate it to the first 255.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still misses to preallocate hostname as shown above.

One could also think of not preallocating and simply not touching hostname at all until we actually found a position where it needs to be recoded, and only then allocate it and copy the whole prefix at once. This would however make the code slightly more complicated.

Comment thread qmail-local.c Outdated
Comment thread fmt_xlong.c
Comment thread qmail-local.c Outdated
Comment thread qmail-local.c Outdated
gethostname(host,sizeof(host));

s = host;
for (loop = 0; loop < str_len(host); ++loop)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The style used everywhere else is to put the { on the line of the control statement.

@schmonz schmonz added this to the 1.09 milestone May 19, 2020
@schmonz

schmonz commented May 19, 2020

Copy link
Copy Markdown
Member

I'd like to see this rebased and merged sometime soon after 1.08 is out, so I've assigned it the 1.09 milestone. @xenotrope, might you have attention for this sometime in the next week-ish?

Comment thread fmt_xlong.c Outdated
@DerDakon

Copy link
Copy Markdown
Member

You now add fmt_xlong() back to fmt.h after we removed it in #126. Please add it with complete function signature.

@xenotrope xenotrope force-pushed the notqmail-maildir-uniqueness branch from d1e635b to 07f0cef Compare June 10, 2020 02:12

@DerDakon DerDakon left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be awesome if that could be accompanied by a unittest for at least maildir_child() and fmt_xlong().

Comment thread qmail-local.c
s += fmt_ulong(s,time); *s++ = '.';
s += fmt_ulong(s,pid); *s++ = '.';
s += fmt_strn(s,myhost,sizeof(myhost)); *s++ = 0;
s += fmt_ulong(s,time.tv_sec); *s++ = '.';

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds both '.' and 'M', I don't thinkg that is intentional.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old Maildir format is "time.pid.hostname:flags". By replacing "pid" with more stuff, including a sub-second time measurement, this patch changes it to "time.MμsecPpid.host:flags". So time.tv_sec and time.tv_usec are both needed, but one comes before the first dot and the other comes after it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put all of them in different lines to improve readability. If you want to group the sections than insert newlines.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well-behaving mail software would ignore all the pseudo-random part of the name, so I suppose adding letters in addition to . and 0-9 is fine.
Also, I think I remember there are already software around using letters in there.

Comment thread qmail-local.c Outdated
Comment thread qmail-local.c Outdated
/* writes ulong u in hex to char *s, does not NULL-terminate */
unsigned int fmt_xlong(s,u) char *s; unsigned long u;
{
unsigned int len; unsigned long q; unsigned long c;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please unwrap them also, which also gives the opportunity to initialize them immediately and get rid of the next line entirely.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fmt_xlong() was removed from qmail-local.c a while back and this regression will be fixed in a new push.

Comment thread qmail-local.c
pid = getpid();
host[0] = 0;
gethostname(host,sizeof(host));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still misses to preallocate hostname as shown above.

One could also think of not preallocating and simply not touching hostname at all until we actually found a position where it needs to be recoded, and only then allocate it and copy the whole prefix at once. This would however make the code slightly more complicated.

Comment thread qmail-local.c Outdated
unsigned long time;
char myhost[64];
struct timeval time;
char host[64];

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 1.08 we renamed this local host to myhost to avoid shadowing the same-named global. Please catch up to the rename to reduce this diff.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S. I recognize we’ve made this PR more labor-intensive for you by changing a bunch of stuff out from under it. Sorry about this! And please keep going — I think we’re pretty close to making maintenance of this code everyone’s job.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully maintaining this code doesn't become necessary for anyone, at least not in a long-term capacity. While I deeply appreciate your apology, (really, thank you!) it's not necessary. A number of these new changes seem[1] to be because the "git rebase master" undid some commits I've made in an earlier version of this branch to address previous coding concerns. I blame myself. And also Junio Hamano. Any chance somebody has the temerity to make "darcshub.com" or "Codevillehub.com"?

[1] I'm not entirely sure, because everything seemed like it was OK before I pushed, but... my mother said that once and she was gravely mistaken.

Comment thread qmail-local.c Outdated
void temp_qmail(fn) char *fn;
{ strerr_die5x(111,"Unable to open ",fn,": ",error_str(errno),". (#4.3.0)"); }

/* writes ulong u in hex to char *s, does not NULL-terminate */

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function now lives in it's own file, so drop it here.

Some operating systems quickly recycle PIDs, which can lead to
collisions between Maildir-style filenames, which must be unique and
non-repeatable within one second. This patch is just a means of updating
qmail-local to use the format of the revised Maildir protocol.

It uses four unique identifiers:
  * inode number of the file written to Maildir/tmp
  * device number of the file written to Maildir/tmp
  * time in microseconds
  * the PID of the writing process

A Maildir-style filename would look like the following:

In Maildir/tmp:
  time.MmicrosecondsPpid.host
In Maildir/new:
  time.IinodeVdeviceMmicrosecondsPpid.host

Additionally, this patch further comforms to the revised Maildir
protocol by looking through the hostname for instances of '/' and ':',
replacing them with "\057" and "\072", respectively, when writing it to
disk. Special thanks go to Matthias Andree for design & sanity-checking.
@xenotrope xenotrope force-pushed the notqmail-maildir-uniqueness branch from 93c936e to 5c66122 Compare October 19, 2020 14:40
Comment thread fmt_xlong.c
unsigned long c;
len = 1; q = u;
while (q > 15) { ++len; q /= 16; }
if (s)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please reformat this block: put the { in the line of the if() and unwrap the do-while loop.

Comment thread qmail-local.c
hostsa.s[hostnamemaxlen-1] = 0;

s = hostsa.s;
for (loop = 0; loop < str_len(hostsa.s); ++loop)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use strlen() instead of str_len() in new code.

Comment thread qmail-local.c
s += fmt_ulong(s,time); *s++ = '.';
s += fmt_ulong(s,pid); *s++ = '.';
s += fmt_strn(s,myhost,sizeof(myhost)); *s++ = 0;
s += fmt_ulong(s,time.tv_sec); *s++ = '.';

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put all of them in different lines to improve readability. If you want to group the sections than insert newlines.

@schmonz schmonz removed the request for review from alanpost November 3, 2020 14:55

@josuah josuah left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing to add to what @DerDakon and @leahneukirchen said.

Comment thread qmail-local.c
Comment on lines +107 to +117
if (hostsa.s[loop] == '/')
{
if (!stralloc_cats(&hostname,"\\057")) temp_nomem();
continue;
}
if (hostsa.s[loop] == ':')
{
if (!stralloc_cats(&hostname,"\\072")) temp_nomem();
continue;
}
if (!stralloc_append(&hostname,s+loop)) temp_nomem();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just making sure: there is no program in qmail that tries to decode these escape codes?
It is just in case any of those are encountered?
Why not hard fail instead? It is not expected to have these characters in a host name in the first place, so it might not bother anyone if it stops in these cases.

  <name>  ::= <let>[*[<let-or-digit-or-hyphen>]<let-or-digit>]

https://datatracker.ietf.org/doc/html/rfc952

Although, it may happen: hostname / works just fine here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The encoding of / and : is specified in https://cr.yp.to/proto/maildir.html, there's no need to decode it.

Comment thread qmail-local.c
s += fmt_ulong(s,time); *s++ = '.';
s += fmt_ulong(s,pid); *s++ = '.';
s += fmt_strn(s,myhost,sizeof(myhost)); *s++ = 0;
s += fmt_ulong(s,time.tv_sec); *s++ = '.';

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well-behaving mail software would ignore all the pseudo-random part of the name, so I suppose adding letters in addition to . and 0-9 is fine.
Also, I think I remember there are already software around using letters in there.

@schmonz

schmonz commented Sep 22, 2023

Copy link
Copy Markdown
Member

@xenotrope, would you be comfortable with me handling the rebasing, etc. and getting this PR merged? I can make sure to keep the change attributed to you (as the patch itself is). We want this one in 1.09 and I'd be happy to take it from here, ideally with your permission.

@schmonz schmonz modified the milestones: 1.09, 1.10 Oct 13, 2023
@aaeegg

aaeegg commented Sep 1, 2024

Copy link
Copy Markdown

The wall-clock time from gettimeofday can sometimes be non-monotonic. On systems that have a monotonic uptime count readable by clock_gettime(CLOCK_MONOTONIC), how about putting that in the delivery identifier in addition to the gettimeofday time?

I propose expressing this as Onm, where n is (in hexadecimal) the tv_sec from clock_gettime(CLOCK_MONOTONIC) and m is (in hexadecimal, zero-padded on the left to 8 digits) the tv_nsec.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants