Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is cleanup invalid device tokens safe to use? #6052

Open
mtrezza opened this issue Sep 15, 2019 · 25 comments
Open

Is cleanup invalid device tokens safe to use? #6052

mtrezza opened this issue Sep 15, 2019 · 25 comments
Labels
type:feature New feature or improvement of existing feature

Comments

@mtrezza
Copy link
Member

mtrezza commented Sep 15, 2019

Issue Description

Parse Server v2.6.2 introduced a feature to clean up invalid device tokens. There have been issues reported where valid device tokens have been removed, some of which have been closed without conclusion, notably:

The root cause for these issues could be parse-community/parse-server-push-adapter#87 (comment), but because the issues have not been further investigated it's hard to tell.

The feature is still flagged and not in the docs, which makes me think there are still doubts about its functionality.

Is PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS safe to use?
Is anyone using the feature in a production environment and can give feedback?

@davimacedo davimacedo added the type:question Support or code-level question label Sep 16, 2019
@mtrezza
Copy link
Member Author

mtrezza commented Sep 21, 2019

@mrmarcsmith It seems you have been digging into the push topic with other issues you've opened. Do you have any feedback on PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS in a production environment by any chance?

@mrmarcsmith
Copy link
Contributor

@mtrezza I haven’t looked into this issue yet but I’ll see if I have time to dive in while I still have the push system fresh in my mind. Off the top of my head I would imagine the first step would be to write a failing test which would involve mocking the APNS, which would involve bouncing some different types of requests off of the APNS Servers to see how they respond. From the issue you posted I’m thinking one solution could be to keep track of the types of error responses for the different types of certificates and only remove deviceTokens where error responses from ALL certificates is “Bad Device Token” that way if one returns a temporary error we don’t remove it. What do you think?

@mtrezza
Copy link
Member Author

mtrezza commented Sep 22, 2019

I think the basic issue is mocking the APNS, it is a black box, except for what is documented anyway. Therefore I'm looking to gather some feedback on production usage in this thread.

I think rather than testing against all certificates I would put forth the motion to deprecate specifying multiple Apple certificates. Apple offers a single push certificate for sandbox and production just like Android, so it would make sense to reflect that in the adapter.

https://github.com/parse-community/parse-server-push-adapter/blob/65a7f15651dd1dfa2ea9ea13b19a68f8e3e199f6/src/APNS.js#L31-L39

image

@mrmarcsmith
Copy link
Contributor

In a couple of our apps we use two different app identifiers, one for our store version and one for our enterprise version (com.bla.app1-Store & com.bla.app1-Enterprise) therefore we need two production push certificates for one app, but the appIdentifer is stored in the installation so we would could just ensure that parse doesn’t retry on an installation that doesn’t match its appIdentifer. What about this solution?
We Put a warning at start up if they have the PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS set AND they have more than one certificate with the same identifier that says “Warning: having multiple push certificates with the same identifier is incompatible with the PARSE_SERVER_CLEANUP_INVALID_INSTALLATIONS flag. Your pushes will still send but old installations will not be removed. Use Apple’s new combined sandbox and production certificate to re-enable removing of invalid installations.“

@mrmarcsmith
Copy link
Contributor

And of course in addition to the warning, we also actually disable the cleanup if they have multiple certs with the same app identifiers

@mtrezza
Copy link
Member Author

mtrezza commented Sep 22, 2019

@mrmarcsmith Interesting, does parse server officially support multiple apps in 1 deployment?

How to you make sure the proper push certificate is used for the proper app ID?

@mrmarcsmith
Copy link
Contributor

@mtrezza to clarify, these are not two separate “apps”. Both have identical “Parse Server ApplicationIds”, identical iOS Client code and use the same cloud code. The only difference is how they are distributed and how apple treats them.

  1. The first type of deployment is the most common, deployment through the apple App Store. This is for our customers
  2. The second type of deployment uses an “enterprise” certificate which is for enterprise organizations to distribute in house apps that by pass App Store approval. This version is for the contractors that work with our customers through the app.
  • Apple requires store and enterprise versions to have different “bundleIdentifiers”
  • Apple only allows one bundleId per Push Certificate
  • Therefore, Apple requires different push certificates for our two distributions of the same app

The configuration for Parse APNS here requires a “bundleId” to be provided with the certificate. And the “bundleId” field will match the corresponding “parse-default” field “appIdentifier” in the _Installation class.

In theory, Parse should NEVER attempt sending a enterprise installation a push using the Store certificate (and visa-versa) because the bundleId’s don’t match. Which is why it sould be safe to allow multiple certificate to co-exist as long as the have different bundleId’s

@mtrezza
Copy link
Member Author

mtrezza commented Sep 22, 2019

Correct, as long as you specify a bundleId (deprecated) or topic for the adapter it will match the adapter to the appIdentifier. So removing the ability to pass multiple certificates is not ideal.

https://github.com/parse-community/parse-server-push-adapter/blob/65a7f15651dd1dfa2ea9ea13b19a68f8e3e199f6/src/APNS.js#L237

So how about adding your waning and deprecate the production/development flag in the adapter and issue an additional deprecation warning?

Production and development environments are usually separated anyway, so I don’t see a good reason to keep this entanglement.

@mrmarcsmith
Copy link
Contributor

In the case that they want to connect to the development APNS server isn’t it a different apple URL? That’s just my initial hunch on why we asked for the production flag in the config, I’m running out the door so I don’t have time to dig into the adapter code to confirm

@mtrezza
Copy link
Member Author

mtrezza commented Sep 22, 2019

It is a different URL, but the production flag only defines which certificate to try first, not the URL.

If production is true the push is first sent with the production certificate and if that fails, sent again with the development certificate, and vice-versa.

The actual URL is set in node-apn depending on the node environment:

With the "universal" Apple push certificate the production flag seems to become irrelevant and can be deprecated.

I am still unsure of how invalid device tokens are determined. When adding multiple certificates without topic, the push adapter will try all certs until a successful push. Let's say we have 2 certs, the first invalid, the second valid, which results in a failed and then a successful push. Will the device token be added to devicesToRemove?

devicesToRemove.push(token);

@mrmarcsmith
Copy link
Contributor

DeviceRemoval happens if APNS replies with “invalid deviceToken” error. here
I believe this is the bug:
In the case that the production certificate request fails for any reason even though the deviceToken is valid, when it gets sent to the development APNS server OR to the prod server with the wrong and same topic (or no topic) it would returns “invalid deviceToken” error and therefore removes it.

@mtrezza
Copy link
Member Author

mtrezza commented Sep 23, 2019

That is how it works in principle, but I couldn’t figure out yet if each result that is evaluated is

  1. a APNS response (n results for a push that required n APNS requests) or
  2. a summarized push adapter response (e.g. 1 result for a push that required n APNS requests)

In case of 1) a device token would be incorrectly removed if it fails with any tried certificate.

In case of 2) a device token would correctly be removed if it fails with all tried certificates.

@dplewis
Copy link
Member

dplewis commented Oct 15, 2019

@mtrezza Should we encourage users to use a single key instead of certificates?

This is an added benefit that they don’t expire so you don’t have to renew them.

ios: [
      {
        token: {
          key: './push/AuthKey_XYZ.p8',
          keyId: "ABC",
          teamId: "EFG"
        },
        topic: 'com.example',
      },
]

Screen Shot 2019-10-14 at 7 07 53 PM

@mtrezza
Copy link
Member Author

mtrezza commented Oct 15, 2019

@dplewis That's a fair suggestion. Apple says:

One key is used for all of your apps.

@mrmarcsmith Would that also cover your use case of multiple app IDs?

@mrmarcsmith
Copy link
Contributor

As long as we can still do multiple keys like so I’m fine.

ios: [
      {
        token: {
          key: './push/AuthKey_XYZ.p8',
          keyId: "ABC",
          teamId: "EFG"
        },
        topic: 'com.store-example',
      },
      {
        token: {
          key: './push/AuthKey_HIJ.p8',
          keyId: "KLM",
          teamId: "NOP"
        },
        topic: 'com.enterprise-example',
      }
]

The “problem” i face is apple requires a completely different Apple ID for enterprise and store meaning i have to use 2 “keys” because they are completely isolated account by design. “All of your apps” implies “All your app under this Apple ID” of which I have 2.

@mtrezza
Copy link
Member Author

mtrezza commented Oct 15, 2019

@mrmarcsmith Seems to be an edge case but makes sense for parse server to support this. So migrating from multiple certs to multiple key won't solve the issue.

I see two PRs (in order of priority):

  1. ensure invalid device token clean-up works
  2. migrate to using keys instead of certificates

ad 1)
We should continue to allow multiple certs (or keys) and validate the clean-up logic, which I tried to but got stuck, see last paragraph in #6052 (comment). Maybe you or @dplewis can help me understand the code? I can tackle the PR then.

ad 2)
Migration to keys is not required, and won't solve 1). Benefit of keys over certs:

  • lower one-time setup effort, as only 1 keys needs to be added per Apple dev account instead of multiple certs
  • no maintenance needed as keys don't expire unlike certs

@mrmarcsmith
Copy link
Contributor

@mtrezza here’s a idea, we could disallow multiple certificate/keys with duplicate topics. Then just confirm on the adapter side we aren’t attempting to send pushes for topics that don’t match the appIdentifier (which those pushes would never succeed anyways).

@mtrezza
Copy link
Member Author

mtrezza commented Oct 15, 2019

@mrmarcsmith

we could disallow multiple certificate/keys with duplicate topics

That would be a breaking change, possibly for many users, because historically it was common / required to have a separate development and production certificate for the same topic:

ios: [
    {
        pfx: 'aps-prod.p12',
        topic: 'com.example',
        production: true
    },
    {
        pfx: 'aps-dev.p12',
        topic: 'com.example',
        production: false
    }
]

Then just confirm on the adapter side we aren’t attempting to send pushes for topics that don’t match the appIdentifier

That already happens:
https://github.com/parse-community/parse-server-push-adapter/blob/65a7f15651dd1dfa2ea9ea13b19a68f8e3e199f6/src/APNS.js#L92

Again, to ensure the token clean-up works, we only need to clean up the token if it failed for all certs rather than if it failed for any cert when the adapter uses multiple certs.

@mtrezza
Copy link
Member Author

mtrezza commented Mar 20, 2020

I'm back at this issue, preparing a PR to ensure the feature is fit for production.


Current process

APNS:

  1. Recipient installations are grouped by appIdentifier, e.g. com.example.myapp
    https://github.com/parse-community/parse-server-push-adapter/blob/c9a1fc65bb61c71adaf4c79df18420c2c6546849/src/APNS.js#L83-L87

  2. For each appIdentifier group the eligible providers are matched by topic
    https://github.com/parse-community/parse-server-push-adapter/blob/c9a1fc65bb61c71adaf4c79df18420c2c6546849/src/APNS.js#L92

  3. Push notification is sent to all recipients through the first eligible provider
    https://github.com/parse-community/parse-server-push-adapter/blob/c9a1fc65bb61c71adaf4c79df18420c2c6546849/src/APNS.js#L115-L116

  4. For failed push notifications, clear the error response and re-send through the next eligible provider.
    https://github.com/parse-community/parse-server-push-adapter/blob/c9a1fc65bb61c71adaf4c79df18420c2c6546849/src/APNS.js#L121-L124

  5. The result is an array of succeeded and failed push results. A push that failed in step 3 but succeeded at step 4 will be contained only once as succeeded result. A push that failed at step 3 and 4 will be contained only once as failed result. So results are installation unique.
    https://github.com/parse-community/parse-server-push-adapter/blob/c9a1fc65bb61c71adaf4c79df18420c2c6546849/src/APNS.js#L110


FCM:

  1. Recipient installations grouped into batches of 1,000
    https://github.com/parse-community/parse-server-push-adapter/blob/02348635ab72c5e57f12a220decb96f3d6df6887/src/GCM.js#L35

  2. Batches are sent; for FCM there is only 1 provider, so no retry logic like in APNS.


Result handling in parse server:

  1. Results are tracked

    return pushStatus.trackSent(results, UTCOffset).then(() => results);

  2. Device tokens of failed results are added to devicesToRemove. For non-relevant errors the device token will not be added for removal, e.g. "temporarily unavailable" or connection error.

    if (error === 'NotRegistered' || error === 'InvalidRegistration') {
    devicesToRemove.push(token);
    }
    // APNS errors
    if (error === 'Unregistered' || error === 'BadDeviceToken') {
    devicesToRemove.push(token);
    }

  3. Tokens are deleted from installation

    if (devicesToRemove.length > 0 && cleanupInstallations) {
    logger.info(
    `Removing device tokens on ${devicesToRemove.length} _Installations`
    );
    database.update(
    '_Installation',
    { deviceToken: { $in: devicesToRemove } },
    { deviceToken: { __op: 'Delete' } },
    {
    acl: undefined,
    many: true,
    }
    );
    }


Issues

Bottom line, the potential issue of incorrectly cleaning up valid device tokens only exists for APNS. The following scenarios are possible:

a) Push to correct provider fails with some "temporarily unavailable", and retry push to incorrect provider fails with "token invalid".

b) Push to correct provider fails because certificate is expired, and retry push to incorrect provider fails with "token invalid".

c) Push to incorrect provider fails because provider topic doesn't match installation appIdentifier. This is currently impossible, but there is legacy code that should be removed:
https://github.com/parse-community/parse-server-push-adapter/blob/02348635ab72c5e57f12a220decb96f3d6df6887/src/APNS.js#L238-L241
Also, a code comment should be added, why the match is required for future reference.

d) It is currently allowed to add APNS providers without topic. These wildcard providers are used if no provider was found where the topic matches the installation appIdentifier. This scenario does not require any specific consideration, because it is covered by the above scenarios.
https://github.com/parse-community/parse-server-push-adapter/blob/02348635ab72c5e57f12a220decb96f3d6df6887/src/APNS.js#L250-L252

Just to be sure, I re-checked, whether the error codes relevant to removal of device token are correct, and yes, they are:


Bottom bottom line, issues are only possible if:
a) There is more than 1 APNS provider with the same topic, e.g. for prod / dev environments:

push: {
    ios: [
        {
            pfx: 'aps-pro.p12',
            topic: 'com.example.myapp',
            production: true
        },
        {
            pfx: 'aps-dev.p12',
            topic: 'com.example.myapp',
            production: false
        }
    ]
}

b) There is more than 1 APNS provider and at least one has no topic, e.g.:

push: {
    ios: [
        {
            pfx: 'aps-pro1.p12',
            topic: 'com.example.myapp',
            production: true
        },
        {
            pfx: 'aps-pro2.p12',
            production: true
        }
    ]
}

Solution

ad a) Disallow multiple APNS adapters with same topic.

  • Apple offers a universal APNS certificate for sandbox and production environment, so multiple certificates are not required anymore and the production flag can be deprecated.

ad b) Disallow APNS adapter without topic.

  • I don't see any purpose for adapters without topic.

Both changes are breaking changes, but the effort required by the user is just to remove one adapter from config file if there are multiple and download the universal certificate.

@mtrezza
Copy link
Member Author

mtrezza commented Mar 20, 2020

@mrmarcsmith Does the proposed solution above still allow your scenario of using an enterprise cert and regular cert for the same parse app ID?

@dplewis Should I start a PR?

@dplewis
Copy link
Member

dplewis commented Mar 20, 2020

Go for it! You have done the research on this topic. I’ll help as much as I can.

@mtrezza
Copy link
Member Author

mtrezza commented Mar 24, 2020

A recent observation: Cleaning up device tokens for FCM works as expected, but it doesn't seem to work well for APNS.

Device tokens of uninstalled iOS apps are not removed because the APNS push response is successful, as by the push-adapter verbose log. I confirmed this with 2 year old APNS device tokens and recent ones. I have only seen APNS device tokens removed for incorrect environment, e.g. sending a sandbox token to the production APNS endpoint.

The issue seems to be the APNS.
Worth mentioning in the docs when the feature becomes official.

See also

https://forums.developer.apple.com/thread/88332

@mtrezza
Copy link
Member Author

mtrezza commented Nov 2, 2020

Applied label "feature" because this thread came to be about bringing the device token clean-up from its experimental status to a standard feature. The analysis in this thread should provide the the required basis to do this.

@cr0ybot
Copy link

cr0ybot commented Mar 22, 2023

@mtrezza I realize all of the info is probably in this thread, but was this ever resolved in any way? I'd love to use this feature on a 6-year-old app with tons of "BadDeviceToken" error messages whenever there is a push sent out.

@mtrezza
Copy link
Member Author

mtrezza commented Mar 22, 2023

@cr0ybot As far as I'm aware that issue has not been touched since then. The behavior should still be the same. So you could use it, considering the caveats.

I've quickly revisited my previous comments here and it seems that if you use a universal APNS certificate for sandbox + production and thus have only one certificate entry in the push adapter configuration for APNS, there shouldn't be any issue. I don't see how we could solve the remaining issue, that can probably only be solved on the APNS side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:feature New feature or improvement of existing feature
Projects
None yet
Development

No branches or pull requests

5 participants