Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JAMES-3074 UidValidity strong typing #3132

Closed

Conversation

Arsnael
Copy link
Member

@Arsnael Arsnael commented Feb 20, 2020

No description provided.

Copy link

@mbaechler mbaechler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of traps with the usage of uidvalidity.
As it's often the case, putting a strong type reveal long-standing real bugs

thanks for this work

import com.google.common.base.MoreObjects;
import com.google.common.base.Objects;

public class UidValidity {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are missing the actual definition of UIDVALIDITY from IMAP RFC :

nz-number       = digit-nz *DIGIT
                    ; Non-zero unsigned 32-bit integer
                    ; (0 < n < 4,294,967,296)

Sadly we don't have unsigned 32-bit integer in Java, it's why we decided to use a long (that can contain the range [0-2^32]). However, you should ensure we can't create a UidValidity object that is outside that range.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that easy:


    protected int randomUidValidity() {
        return Math.abs(RANDOM.nextInt());
    }

We created many unvalid UID VALIDITY, we need to be still lenient and accept already created fields. However newer UID VALIDITY should match condition matthieu highlight.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You assume returning a negative integer worked. I would argue it did not and we never noticed.

1 minute search : GNOME/evolution@e81e64f#diff-e4a32bc822c57a9a503008fe6358736fR100

Let's fix it. It happens that we can change the existing values if we detect they were invalid and it will trigger a client re-sync. Not that bad, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that bad, but this needs low level support (update if invalid) on every backend.

This is expensive to implement.

So I would:

  • Create a dedicated JIRA ticket for this "invalid UID_VALIDITY are generated"
  • Enact the tactical move: merge the strong typing + ensure we don't generate new invalid values. This will at list help with new deployments.
  • Document the strategical move: fix all and every invalid UID_VALIDITY (at the cost of resynchronicsation)

Would you agree @mbaechler ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course. A small step now is better than nothing at all.

// using the timestamp as uidValidity
long timestamp = System.currentTimeMillis();
UidValidity timestamp = UidValidity.of(System.currentTimeMillis());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jshell> System.currentTimeMillis()/ Integer.MAX_VALUE
$2 ==> 736

That means this piece of code doesn't respect UIDVALIDITY definition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW here I believe this code path SHOULD be using UidValidity.random(), I don't see reasons to be relying on system clock.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it's what RFC 3501 advise.
However, not with millis because it doesn't fit in unsigned 32 bit integer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can you ensure 2 mailboxes don't get the same value if they are created at the exact same time in two different threads?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't matter to have 2 differents mailboxes having the same uidvalidity

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine... I create mailboxA and mailboxB at the same time. They have same UID_VALIDITY, let say 36.

Now I fetch my mails, then shut thunderbird down.

Then I rename mailboxA in mailboxC
Rename mailboxB in mailboxA
Rename mailboxC in mailboxB

Now I turn thunderbird on again. Fetch the mailboxes again and sees mailboxA with UIDVALIDITY 36 + mailboxB with UID validity 36 and don't know it SHOULD drop it's cache, and resynchronise...

Yet another example about IMAP RFCs mess IMO. That's why I think keeping this in our code, even if it is an IMAP RFC advice is not wise.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good scenario, it demonstrates UIDVALIDITY is part of mailbox identity and as mailbox path is a weak identifier, you suggest we use UIDVALIDITY as a unique id.

I agree with that solution.

BTW, I already pointed out that using millisecond in UIDVALIDITY is not working neither because it's 32 bits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we keep a record of this discussion somewhere @mbaechler ? ADR? Or link in the ticket?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would vote for a comment in the code

@Arsnael
Copy link
Member Author

Arsnael commented Feb 21, 2020

Hmmm I must agree I just had in mind to strong type what we already have, while completely missing the real purpose of a UidValidity. This PR definitely needs more work, thanks for the feedbacks :)

// using the timestamp as uidValidity
long timestamp = System.currentTimeMillis();
UidValidity timestamp = UidValidity.of(System.currentTimeMillis());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW here I believe this code path SHOULD be using UidValidity.random(), I don't see reasons to be relying on system clock.

@mbaechler
Copy link

Hmmm I must agree I just had in mind to strong type what we already have, while completely missing the real purpose of a UidValidity. This PR definitely needs more work, thanks for the feedbacks :)

We just discovered plenty of bugs just by introducing a type. It means we are so right when we want strong typing: it helps not break invariants and detect when we break them.

I don't expect you to fix everything overnight but we should not let this PR rot because there are real issues to fix.

@chibenwa
Copy link
Member

I don't expect you to fix everything overnight but we should not let this PR rot because there are real issues to fix.

That's why I believe we need to enact a tactical decision while letting the strategical one for later (I don't think we currently have resources to do fix existing values).

@@ -217,8 +218,8 @@ public PreDeletionHooks getPreDeletionHooks() {
/**
* Generate and return the next uid validity
*/
protected int randomUidValidity() {
return Math.abs(RANDOM.nextInt());
protected UidValidity randomUidValidity() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should have a single way of generating UidValidity objects and it should be UidValidity.create()
We should document that UidValidity.of should be used only for deserialization

@Arsnael Arsnael force-pushed the j3057-uid-validity-strong-typing branch from 734a6a2 to e04a9df Compare February 25, 2020 04:13
@Arsnael
Copy link
Member Author

Arsnael commented Feb 25, 2020

Only did a rebase with master for resolving conflicts

@nvduc91
Copy link

nvduc91 commented Feb 25, 2020

Read it

@Arsnael
Copy link
Member Author

Arsnael commented Feb 25, 2020

[e04a9dfe2830dd3a9d635d7a0060cc7aec1a676b] [ERROR] Failures: 
[e04a9dfe2830dd3a9d635d7a0060cc7aec1a676b] [ERROR]   CassandraGenericMailboxMapperTest>MailboxMapperTest.createShouldPersistTheMailbox:93 Expected UID Validity to be <UidValidity{uidValidity=42}> but was <UidValidity{uidValidity=42}>
[e04a9dfe2830dd3a9d635d7a0060cc7aec1a676b] [ERROR]   CassandraGenericMailboxMapperTest>MailboxMapperTest.findMailboxByIdShouldReturnExistingMailbox:251 Expected UID Validity to be <UidValidity{uidValidity=42}> but was <UidValidity{uidValidity=42}>
[e04a9dfe2830dd3a9d635d7a0060cc7aec1a676b] [ERROR]   CassandraGenericMailboxMapperTest>MailboxMapperTest.renameShouldRenameTheMailbox:126 Expected UID Validity to be <UidValidity{uidValidity=42}> but was <UidValidity{uidValidity=42}>

@Arsnael Arsnael added the refactoring Code enhencements that don't alter software behaviour label Feb 25, 2020
@Arsnael Arsnael changed the title [Refactoring] UidValidity strong typing JAMES-3074 [Refactoring] UidValidity strong typing Feb 25, 2020
@Arsnael Arsnael removed the refactoring Code enhencements that don't alter software behaviour label Feb 25, 2020
@Arsnael Arsnael changed the title JAMES-3074 [Refactoring] UidValidity strong typing JAMES-3074 UidValidity strong typing Feb 25, 2020
@trantienduchn
Copy link

Read it

@remk
Copy link
Member

remk commented Feb 25, 2020

Read it. Good initiative !

@chibenwa chibenwa force-pushed the j3057-uid-validity-strong-typing branch 2 times, most recently from 3d1f9bd to e643e08 Compare February 26, 2020 04:34
@chibenwa
Copy link
Member

(forced pushed to modify ticket number in initial commits)

chibenwa pushed a commit to chibenwa/james-project that referenced this pull request Feb 26, 2020
@Arsnael
Copy link
Member Author

Arsnael commented Feb 26, 2020

I approve the changes, thanks @chibenwa for the help !

.doesNotThrowAnyException();
}

@Disabled("On average upon generating 1.000.000 UidValidity we notice 125 duplicates")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the implications ? Can you state it in the comment ?
And if it is acceptable maybe the test could be one asserting that we have less than 0.0x % of collisions ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the implications ? Can you state it in the comment ?

IMAP clients might be missing some mailbox switches.

And if it is acceptable maybe the test could be one asserting that we have less than 0.0x % of collisions ?

What's the point of testing that IMAP RFC is broken?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure you have to blame the RFC? (from 3501)

         3) If the mailbox is deleted and a new mailbox with the
            same name is created at a later date, the server must
            either keep track of unique identifiers from the
            previous instance of the mailbox, or it must assign a
            new UIDVALIDITY value to the new instance of the
            mailbox.  A good UIDVALIDITY value to use in this case
            is a 32-bit representation of the creation date/time of
            the mailbox.  It is alright to use a constant such as
            1, but only if it guaranteed that unique identifiers
            will never be reused, even in the case of a mailbox
            being deleted (or renamed) and a new mailbox by the
            same name created at some future time.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMAP clients might be missing some mailbox switches.

The collision has to happen on a mailbox named like a previous one. I don't see that as a problem given the constraint of 32 bit integer, I can't think of a better solution than random ...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMAP clients might be missing some mailbox switches.

The collision has to happen on a mailbox named like a previous one. I don't see that as a problem given the constraint of 32 bit integer, I can't think of a better solution than random ...

/**
* Despite RFC-3501 recommendations, we chose random as a UidVality generation mechanism.
* Timestamp base approach can lead to UidValidity being the same for two mailboxes simultaneously generated, leading
* to some IMPA clients not being able to detect changes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/IMPA/IMAP

@chibenwa
Copy link
Member

test this please

@chibenwa chibenwa force-pushed the j3057-uid-validity-strong-typing branch from ca715a0 to f59df28 Compare February 27, 2020 04:57
Arsnael and others added 9 commits February 27, 2020 15:38
Using Optional is a better pattern for representing a not-yet-known value.
 - store generation trivialy lead to invalid value on `0` (edge case)
 - maildir being based on a timestamp was generating too large values

 Currently allocated values CAN be invalid, but at the very least we
 should stop generating invalid values.

 Regarding the choice to be random based:

 Despite RFC-3501 recommendations, we chose random as a UidVality
 generation mechanism. Timestamp base approach can lead to UidValidity
 being the same for two mailboxes simultaneously generated, leading
to some IMPA clients not being able to detect changes.

See https://issues.apache.org/jira/browse/JAMES-3074 for details
@chibenwa
Copy link
Member

Force pushed to solve a conflict

chibenwa pushed a commit to chibenwa/james-project that referenced this pull request Feb 27, 2020
@chibenwa chibenwa force-pushed the j3057-uid-validity-strong-typing branch from f59df28 to 9a38b2c Compare February 27, 2020 08:49
.doesNotThrowAnyException();
}

@Disabled("On average upon generating 1.000.000 UidValidity we notice 125 duplicates")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMAP clients might be missing some mailbox switches.

The collision has to happen on a mailbox named like a previous one. I don't see that as a problem given the constraint of 32 bit integer, I can't think of a better solution than random ...

.doesNotThrowAnyException();
}

@Disabled("On average upon generating 1.000.000 UidValidity we notice 125 duplicates")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMAP clients might be missing some mailbox switches.

The collision has to happen on a mailbox named like a previous one. I don't see that as a problem given the constraint of 32 bit integer, I can't think of a better solution than random ...

chibenwa pushed a commit to chibenwa/james-project that referenced this pull request Feb 28, 2020
@chibenwa chibenwa added the waiting_merge We are about to merge this! label Mar 2, 2020
@chibenwa
Copy link
Member

chibenwa commented Mar 2, 2020

Merged

@chibenwa chibenwa closed this Mar 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting_merge We are about to merge this!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants