-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JAMES-3074 UidValidity strong typing #3132
JAMES-3074 UidValidity strong typing #3132
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of traps with the usage of uidvalidity.
As it's often the case, putting a strong type reveal long-standing real bugs
thanks for this work
import com.google.common.base.MoreObjects; | ||
import com.google.common.base.Objects; | ||
|
||
public class UidValidity { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are missing the actual definition of UIDVALIDITY from IMAP RFC :
nz-number = digit-nz *DIGIT
; Non-zero unsigned 32-bit integer
; (0 < n < 4,294,967,296)
Sadly we don't have unsigned 32-bit integer in Java, it's why we decided to use a long (that can contain the range [0-2^32]). However, you should ensure we can't create a UidValidity object that is outside that range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that easy:
protected int randomUidValidity() {
return Math.abs(RANDOM.nextInt());
}
We created many unvalid UID VALIDITY, we need to be still lenient and accept already created fields. However newer UID VALIDITY should match condition matthieu highlight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You assume returning a negative integer worked. I would argue it did not and we never noticed.
1 minute search : GNOME/evolution@e81e64f#diff-e4a32bc822c57a9a503008fe6358736fR100
Let's fix it. It happens that we can change the existing values if we detect they were invalid and it will trigger a client re-sync. Not that bad, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that bad, but this needs low level support (update if invalid) on every backend.
This is expensive to implement.
So I would:
- Create a dedicated JIRA ticket for this "invalid UID_VALIDITY are generated"
- Enact the tactical move: merge the strong typing + ensure we don't generate new invalid values. This will at list help with new deployments.
- Document the strategical move: fix all and every invalid UID_VALIDITY (at the cost of resynchronicsation)
Would you agree @mbaechler ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course. A small step now is better than nothing at all.
mailbox/api/src/test/java/org/apache/james/mailbox/model/MailboxAssertingToolTest.java
Outdated
Show resolved
Hide resolved
mailbox/maildir/src/main/java/org/apache/james/mailbox/maildir/MaildirFolder.java
Outdated
Show resolved
Hide resolved
// using the timestamp as uidValidity | ||
long timestamp = System.currentTimeMillis(); | ||
UidValidity timestamp = UidValidity.of(System.currentTimeMillis()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jshell> System.currentTimeMillis()/ Integer.MAX_VALUE
$2 ==> 736
That means this piece of code doesn't respect UIDVALIDITY definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW here I believe this code path SHOULD be using UidValidity.random()
, I don't see reasons to be relying on system clock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it's what RFC 3501 advise.
However, not with millis because it doesn't fit in unsigned 32 bit integer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can you ensure 2 mailboxes don't get the same value if they are created at the exact same time in two different threads?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't matter to have 2 differents mailboxes having the same uidvalidity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imagine... I create mailboxA and mailboxB at the same time. They have same UID_VALIDITY, let say 36.
Now I fetch my mails, then shut thunderbird down.
Then I rename mailboxA in mailboxC
Rename mailboxB in mailboxA
Rename mailboxC in mailboxB
Now I turn thunderbird on again. Fetch the mailboxes again and sees mailboxA with UIDVALIDITY 36 + mailboxB with UID validity 36 and don't know it SHOULD drop it's cache, and resynchronise...
Yet another example about IMAP RFCs mess IMO. That's why I think keeping this in our code, even if it is an IMAP RFC advice is not wise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, good scenario, it demonstrates UIDVALIDITY is part of mailbox identity and as mailbox path is a weak identifier, you suggest we use UIDVALIDITY as a unique id.
I agree with that solution.
BTW, I already pointed out that using millisecond in UIDVALIDITY is not working neither because it's 32 bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we keep a record of this discussion somewhere @mbaechler ? ADR? Or link in the ticket?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would vote for a comment in the code
mailbox/store/src/main/java/org/apache/james/mailbox/store/StoreMailboxManager.java
Outdated
Show resolved
Hide resolved
...otocols/protocols-pop3/src/main/java/org/apache/james/pop3server/mailbox/MailboxAdapter.java
Show resolved
Hide resolved
Hmmm I must agree I just had in mind to strong type what we already have, while completely missing the real purpose of a |
.../apache/james/mailbox/elasticsearch/events/ElasticSearchListeningMessageSearchIndexTest.java
Outdated
Show resolved
Hide resolved
// using the timestamp as uidValidity | ||
long timestamp = System.currentTimeMillis(); | ||
UidValidity timestamp = UidValidity.of(System.currentTimeMillis()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW here I believe this code path SHOULD be using UidValidity.random()
, I don't see reasons to be relying on system clock.
We just discovered plenty of bugs just by introducing a type. It means we are so right when we want strong typing: it helps not break invariants and detect when we break them. I don't expect you to fix everything overnight but we should not let this PR rot because there are real issues to fix. |
That's why I believe we need to enact a tactical decision while letting the strategical one for later (I don't think we currently have resources to do fix existing values). |
@@ -217,8 +218,8 @@ public PreDeletionHooks getPreDeletionHooks() { | |||
/** | |||
* Generate and return the next uid validity | |||
*/ | |||
protected int randomUidValidity() { | |||
return Math.abs(RANDOM.nextInt()); | |||
protected UidValidity randomUidValidity() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should have a single way of generating UidValidity objects and it should be UidValidity.create()
We should document that UidValidity.of
should be used only for deserialization
734a6a2
to
e04a9df
Compare
Only did a rebase with master for resolving conflicts |
Read it |
|
Read it |
Read it. Good initiative ! |
3d1f9bd
to
e643e08
Compare
(forced pushed to modify ticket number in initial commits) |
mailbox/api/src/test/java/org/apache/james/mailbox/model/UidValidityTest.java
Show resolved
Hide resolved
I approve the changes, thanks @chibenwa for the help ! |
.doesNotThrowAnyException(); | ||
} | ||
|
||
@Disabled("On average upon generating 1.000.000 UidValidity we notice 125 duplicates") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the implications ? Can you state it in the comment ?
And if it is acceptable maybe the test could be one asserting that we have less than 0.0x % of collisions ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the implications ? Can you state it in the comment ?
IMAP clients might be missing some mailbox switches.
And if it is acceptable maybe the test could be one asserting that we have less than 0.0x % of collisions ?
What's the point of testing that IMAP RFC is broken?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure you have to blame the RFC? (from 3501)
3) If the mailbox is deleted and a new mailbox with the
same name is created at a later date, the server must
either keep track of unique identifiers from the
previous instance of the mailbox, or it must assign a
new UIDVALIDITY value to the new instance of the
mailbox. A good UIDVALIDITY value to use in this case
is a 32-bit representation of the creation date/time of
the mailbox. It is alright to use a constant such as
1, but only if it guaranteed that unique identifiers
will never be reused, even in the case of a mailbox
being deleted (or renamed) and a new mailbox by the
same name created at some future time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMAP clients might be missing some mailbox switches.
The collision has to happen on a mailbox named like a previous one. I don't see that as a problem given the constraint of 32 bit integer, I can't think of a better solution than random ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMAP clients might be missing some mailbox switches.
The collision has to happen on a mailbox named like a previous one. I don't see that as a problem given the constraint of 32 bit integer, I can't think of a better solution than random ...
mailbox/api/src/main/java/org/apache/james/mailbox/model/UidValidity.java
Show resolved
Hide resolved
/** | ||
* Despite RFC-3501 recommendations, we chose random as a UidVality generation mechanism. | ||
* Timestamp base approach can lead to UidValidity being the same for two mailboxes simultaneously generated, leading | ||
* to some IMPA clients not being able to detect changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/IMPA/IMAP
test this please |
ca715a0
to
f59df28
Compare
Using Optional is a better pattern for representing a not-yet-known value.
- store generation trivialy lead to invalid value on `0` (edge case) - maildir being based on a timestamp was generating too large values Currently allocated values CAN be invalid, but at the very least we should stop generating invalid values. Regarding the choice to be random based: Despite RFC-3501 recommendations, we chose random as a UidVality generation mechanism. Timestamp base approach can lead to UidValidity being the same for two mailboxes simultaneously generated, leading to some IMPA clients not being able to detect changes. See https://issues.apache.org/jira/browse/JAMES-3074 for details
Force pushed to solve a conflict |
f59df28
to
9a38b2c
Compare
mailbox/api/src/main/java/org/apache/james/mailbox/model/UidValidity.java
Outdated
Show resolved
Hide resolved
mailbox/api/src/main/java/org/apache/james/mailbox/model/UidValidity.java
Outdated
Show resolved
Hide resolved
.doesNotThrowAnyException(); | ||
} | ||
|
||
@Disabled("On average upon generating 1.000.000 UidValidity we notice 125 duplicates") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMAP clients might be missing some mailbox switches.
The collision has to happen on a mailbox named like a previous one. I don't see that as a problem given the constraint of 32 bit integer, I can't think of a better solution than random ...
.doesNotThrowAnyException(); | ||
} | ||
|
||
@Disabled("On average upon generating 1.000.000 UidValidity we notice 125 duplicates") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMAP clients might be missing some mailbox switches.
The collision has to happen on a mailbox named like a previous one. I don't see that as a problem given the constraint of 32 bit integer, I can't think of a better solution than random ...
Merged |
No description provided.