Skip to content

Fix UAID and registration_id confusion in the push component.#4605

Merged
mhammond merged 1 commit intomozilla:mainfrom
mhammond:push-fix-db
Oct 29, 2021
Merged

Fix UAID and registration_id confusion in the push component.#4605
mhammond merged 1 commit intomozilla:mainfrom
mhammond:push-fix-db

Conversation

@mhammond
Copy link
Copy Markdown
Member

There's one UAID and one registration_id per device, not per-subscription.

Fixes #4575

  • uaid, auth-key and registration_id are no longer stored in the database
    against each subscription, but instead stored in the metadata table, and
    thus assumed to be the same for all subscriptions.

  • If a subscribe() call returns a different UAID than we had previously, we
    consider all existing subscriptions dead - we delete all old subscriptions
    and the old UAID, then store the new subscription against the new UAID.

  • To help with managing the UAID correctly, the ConnectHttp object, which
    holds on to the UAID etc, is now short-lived - it only lives as long as one
    API call (whereas it previously lived for as long as the push manager).
    This means that the UAID persisted in the meta-data table is the canonical
    source of truth.

  • The "registration id" (think FCM token) is now ignored in the
    constructor (it should be removed ASAP, but it's there and ignored
    to avoid a breaking change. The only way to supply this is to call
    the update() method - but the component will persist this value.

    The end result is that, assuming we've ever called update() in the
    past, is that after startup we can still call subscribe before we've
    called update() as we will use the previous value. This means we can
    avoid some complexity in android-components where we have a complicated
    error prone dance to avoid constructing the component until FCM has
    initialized.

Pull Request checklist

  • Quality: This PR builds and tests run cleanly
    • Note:
      • For changes that need extra cross-platform testing, consider adding [ci full] to the PR title.
      • If this pull request includes a breaking change, consider cutting a new release after merging.
  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Changelog: This PR includes a changelog entry in CHANGES_UNRELEASED.md or an explanation of why it does not need one
    • Any breaking changes to Swift or Kotlin binding APIs are noted explicitly
  • Dependencies: This PR follows our dependency management guidelines
    • Any new dependencies are accompanied by a summary of the due dilligence applied in selecting them.

@mhammond
Copy link
Copy Markdown
Member Author

This is a very large patch - sorry about that. I did need to stop myself from going even further though :) This isn't even a breaking change for Fenix! It seems to work well and I think will make push even more reliable for most consumers. There are still further changes we should make to make it even better for webpush, but this is a step in that direction.

I'm flagging JR too - feel free to have as deep or light a review as you like, but I'm primary making sure I'm not misunderstanding our conversations about "should never be multiple UAIDs, a change in UAID means old subs are dead", etc.

CC @grigoryk

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

Merging #4605 (fb3875c) into main (73b922e) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #4605   +/-   ##
=======================================
  Coverage   80.63%   80.63%           
=======================================
  Files          51       51           
  Lines        5221     5221           
=======================================
  Hits         4210     4210           
  Misses       1011     1011           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 73b922e...fb3875c. Read the comment docs.

@mhammond mhammond force-pushed the push-fix-db branch 2 times, most recently from 9069af8 to b5ae692 Compare October 22, 2021 00:46
Comment thread components/push/src/internal/communications.rs Outdated
Comment thread components/push/src/internal/storage/schema.rs
Comment thread components/push/src/internal/subscriber.rs Outdated
fn update_endpoint(&self, channel_id: &str, endpoint: &str) -> Result<bool>;

fn update_native_id(&self, uaid: &str, native_id: &str) -> Result<bool>;
// Some of our "meta" keys are more important than others, so they get special helpers.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Yeah. I see the one of the potential problems now.

FWIW, much like tokenserver, there was a thread that devices might have multiple UAIDs when this was first built. (The second UAID would be handled by a marketing thing, apparently, and would be it's own system.) That added a LOT of complexity that was never actually needed.

👍🏻👍🏻 for pulling all that code and simplifying things.

uaid TEXT NOT NULL,
channel_id TEXT NOT NULL UNIQUE,
channel_id TEXT NOT NULL PRIMARY KEY,
-- `endpoint` must be unique; if 2 scopes ended up with the same endpoint, we'd possibly
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: On the server, we use fernet to construct the endpoint, it will always be unique, including if you've already previously registered this UAID+ChannelID. Prior endpoints will continue to work so long as both the UAID and ChannelID remain valid.

if self.auth.is_none() {
self.auth = response["secret"].as_str().map(ToString::to_string);
// asserting here seems bad! :) But what does this mean? We supplied ours, how could
// the server disagree? (The server seems to only supply this if the UAID changed?)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SenderID really, really shouldn't change. It's how the server authenticates to the bridge system for FCM. IIRC, there was a bug, however, where some devices had been registered to a different SenderID, and then a different copy of the UA was reporting a different SenderID value.

Honestly, asserting here is a really good idea if that ever happens, since it indicates something went very, very wrong.

};
if is_new_uaid {
// apparently the uaid changing but not getting a new secret guarantees we will be
// unable to decrypt payloads. Not clear how we will recover from this though.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't.
Honestly, if you have a new UAID, you shouldn't be getting messages from the old UAID, since that indicates a possible routing error on the server. You might want to send an unregister with the old uaid and secret, just to make sure that the server kills off the old one, just to make sure it's dead, but that's probably not needed.

Comment thread components/push/src/push.udl Outdated
// > may result in new endpoint registration
// which is touching on something we need to address - how to better communicate:
// * we need a new endpoint registration
// * all subscriptions need updating because we have a new endpoint.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that's what the onpushsubscriptionchange message was for? Or am I misunderstanding?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I worded that badly. I changed that part of the comment to read:

// Long story short, there's no way to know that this `update()` call thinks all our
// subscriptions need renewal, but we instead rely on `verify_connection()` being
// called regularly for this purpose.

Copy link
Copy Markdown
Contributor

@tarikeshaq tarikeshaq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is a huge improvement!! A couple of nits and questions but otherwise changelog and 🚢 🚢


fn get_meta(&self, key: &str) -> Result<Option<String>>;
fn get_auth(&self) -> Result<Option<String>>;
fn set_auth(&self, uaid: &str) -> Result<()>;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn set_auth(&self, uaid: &str) -> Result<()>;
fn set_auth(&self, auth: &str) -> Result<()>;

pub fn open_in_memory() -> Result<Self> {
// A nod to our tests which use this.
#[cfg(test)]
env_logger::try_init().ok();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

)
.into());
}
// Don't fetch the connection from the server if we've already got one.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Don't fetch the connection from the server if we've already got one.
// Don't fetch the subscription from the server if we've already got one.

match e.kind() {
ErrorKind::UAIDNotRecognizedError(_) => {
// Our subscriptions are dead, but for now, just let the existing mechanisms
// deal with that (eg, next `subscribe()` or `verify_subscriptions()`)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// deal with that (eg, next `subscribe()` or `verify_subscriptions()`)
// deal with that (eg, next `subscribe()` or `verify_connection()`)


// Dispatch Information returned from [`PushManager::dispatch_info_for_chid`]
dictionary DispatchInfo {
string uaid;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this make this a breaking change? Not 100% sure how the dispach_info_for_chid is used on AC though

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thanks, that raises a good question - it's technically a breaking change, but it's not in practice - which raises a kind of philosophical question about whether it should be declared breaking or not!

pub fn unsubscribe(&mut self, channel_id: &str) -> Result<bool> {
// TODO(teshaq): This should throw an error instead of return false
// keeping this as false in the meantime while uniffing to not change behavior
// markh: both branches below are broken in our v3 schema - someone may have subscribed,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm ya, I wonder what happens if we change the behaviour so that unsubscribe doesn't error out even if they try to unsubscribe to a channel they never subscribed to... maybe use the false I wanted to get rid of as an indicator that it was never subscribed to? Wouldn't be a great API experience thou.. thinking out loud

Copy link
Copy Markdown
Collaborator

@jonalmeida jonalmeida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to me as a consumer. Quite keen for this to land. 🙂

@mhammond mhammond force-pushed the push-fix-db branch 2 times, most recently from 6cf5a27 to dbd7cae Compare October 29, 2021 01:25
There's one UAID and one registration_id per device, not per-subscription.

Fixes mozilla#4575

* `uaid`, auth-key and `registration_id` are no longer stored in the database
  against each subscription, but instead stored in the `metadata` table, and
  thus assumed to be the same for all subscriptions.

* If a `subscribe()` call returns a different UAID than we had previously, we
  consider all existing subscriptions dead - we delete all old subscriptions
  and the old UAID, then store the new subscription against the new UAID.

* To help with managing the UAID correctly, the `ConnectHttp` object, which
  holds on to the UAID etc, is now short-lived - it only lives as long as one
  API call (whereas it previously lived for as long as the push manager).
  This means that the UAID persisted in the meta-data table is the canonical
  source of truth.

* The "registration id" (think FCM token) is now ignored in the
  constructor (it should be removed ASAP, but it's there and ignored
  to avoid a breaking change. The only way to supply this is to call
  the `update()` method - but the component will persist this value.

  The end result is that, assuming we've ever called `update()` in the
  past, is that after startup we can still call `subscribe` before we've
  called `update()` as we will use the previous value. This means we can
  avoid some complexity in android-components where we have a complicated
  error prone dance to avoid constructing the component until FCM has
  initialized.
@mhammond mhammond merged commit 6d33f26 into mozilla:main Oct 29, 2021
@mhammond mhammond deleted the push-fix-db branch October 29, 2021 02:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

push errors: Error executing SQL: UNIQUE constraint failed: push_record.channel_id

5 participants