Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keys problem with latest packages #546

Closed
gkc opened this issue Oct 5, 2022 · 24 comments · Fixed by atsign-foundation/at_server#965
Closed

Keys problem with latest packages #546

gkc opened this issue Oct 5, 2022 · 24 comments · Fixed by atsign-foundation/at_server#965
Assignees
Labels
8 SP 8 Story Points - 5 Days Large blocker Flagging for resolution bug Something isn't working P1 Priority 1 unplanned Unplanned work mid-sprint plan

Comments

@gkc
Copy link
Contributor

gkc commented Oct 5, 2022

Problem
Keys of newly-created atsigns become unusable upon app restart when using latest libraries

To reproduce

  • build iot_receiver app in mwc_demo repo
  • run it, create a new atSign. Remember to back up the atKeys file
    • should work correctly
    • restart app, choose the atSign you just created
    • you will be presented with a screen asking that you type in your PIN

Workaround

  • onboard the same atSign into the store version of AtmospherePro, by importing the backed-up keys file
  • should work fine
  • restart the iot_receiver app, choose the same atSign again
  • should work fine

Observations

  • problem is not the keys file
@gkc
Copy link
Contributor Author

gkc commented Oct 5, 2022

Encountered by @nickelskevin and @cconstab

Reproduced by @tinashe404 and workaround identified. The fact that this workaround works points to where the problem is

@sonle-geekyants please can you investigate urgently as we have a broken collection of published packages right now

@gkc gkc added bug Something isn't working P1 Priority 1 blocker Flagging for resolution unplanned Unplanned work mid-sprint plan labels Oct 5, 2022
@sonle-geekyants
Copy link
Contributor

sonle-geekyants commented Oct 6, 2022

I've checked on Onboarding example and I see that serverEncryptionPublicKey from BE returns null => result returns ACTIVATE status => app navigate to OTP Screen. Please ask to BE checks Staging environment

Screen Shot 2022-10-06 at 13 50 05

Screen.Recording.2022-10-06.at.13.35.09.mov

@gkc
Copy link
Contributor Author

gkc commented Oct 6, 2022

@sonle-geekyants this is happening in production

What are you asking to be checked? Why do you need it be checked? Sorry, your message isn't clear to me.

Have you followed the instructions I gave in the bug description as to how to reproduce the issue?

@gkc
Copy link
Contributor Author

gkc commented Oct 6, 2022

More context: Issue was encountered by Scott, then by @cconstab and @nickelskevin. Reproduced by @tinashe404 yesterday using instructions given above.

@sonle-geekyants
Copy link
Contributor

Because I wasn't added to the iot_receiver app in the mwc_demo repo, I couldn't follow your instructions.
I see that when I call API transmission publickey to server, BE returns the error AT0015, could you check this error?

Screen Shot 2022-10-06 at 15 34 33

@gkc
Copy link
Contributor Author

gkc commented Oct 6, 2022

@sonle-geekyants The repo is public. The app code is here: https://github.com/atsign-foundation/mwc_demo/tree/trunk/flutter/iot_receiver

You will fail to find public:publickey@atsign if onboarding has not been completed for that atSign. Did you reset the atSign at some point?

@nitesh2599
Copy link
Contributor

I tried reverting back a few packages to use the previous versions and it started working for me, here's the branch https://github.com/atsign-foundation/mwc_demo/tree/fix/new_onboarding_atsigns, and here's the commit atsign-foundation/mwc_demo@1472997.

Will try to figure out if it is happening because of any one of the packages.

@gkc
Copy link
Contributor Author

gkc commented Oct 6, 2022

Thank you @nitesh2599 !

@sonle-geekyants
Copy link
Contributor

sonle-geekyants commented Oct 7, 2022

I see when we use at_client: 3.0.38 it is working but if upgrade to version 3.0.39 (or 3.0.40) it will work failure
In at_client -> at_persistence_secondary_server library, we use version 3.0.34 it is working and if we upgrade it to version 3.0.36 it works error

@gkc
Copy link
Contributor Author

gkc commented Oct 7, 2022

Thank you @sonle-geekyants!

@VJag can you/team investigate what changed in at_persistence_secondary_server between 3.0.34 and 3.0.36 please, and see if you can diagnose the problem?

@gkc gkc assigned VJag and unassigned sarika01 and sonle-geekyants Oct 7, 2022
@sonle-geekyants
Copy link
Contributor

I use version 3.0.35 and it's working so I'm comparing version 3.0.35 and 3.0.36.
I think may be wrong in Hive and I'm still checking it.

@gkc
Copy link
Contributor Author

gkc commented Oct 7, 2022

Ok thank you @sonle-geekyants

@VJag let's wait until @sonle-geekyants has finished their investigation

@VJag
Copy link
Member

VJag commented Oct 7, 2022

Ok @gkc

@sonle-geekyants
Copy link
Contributor

Sorry, I've investigated but haven't found the cause yet. Could you ask someone who knows this library to investigate with me? @gkc

@sitaram-kalluri
Copy link
Member

sitaram-kalluri commented Oct 7, 2022

Because I wasn't added to the iot_receiver app in the mwc_demo repo, I couldn't follow your instructions. I see that when I call API transmission publickey to server, BE returns the error AT0015, could you check this error?

Screen Shot 2022-10-06 at 15 34 33

Investigated the issue and suspect the following changes might be causing the issue: atsign-foundation/at_server#893

Reasoning:

The signing_privatekey and signing_publickey are being skipped from sync and the changes are available on the client side, but the corresponding changes on the server are NOT released to production yet. Hence server tries to send the signing keys to the client (to sync); client receives the keys but does not update the local commit id (since on client side, skip signing keys are in-place) which leads to sync run into a loop. As a result, the encryption public key does not sync to the remote secondary and hence throws public:publicekey@ not found the key store.

Possible fixes (If above mentioned results to be root cause of the issue):

  1. Revert the skip commit log changes on at_persistence_secondary_server
  2. Promote the v3.0.24 version to production to make the "skip signing_key on sync" available on the server side.
  3. Restrict the app to use at_persistence_secondary_server with version 3.0.35 untill we release the at_secondary_server - v3.0.24 to production.

Tested # 1 on @are83asparagus and works good. Pushed the changes to revert_skip_signing_keys

Following is the test performed on mwc demo receiver app:

  1. Onboard an atsign and store the back-up keys
  2. Uninstall the app
  3. Re-install the app and provide the .keys file and app is onboarded successfully.

@gkc @VJag @murali-shris : Please suggest on which of the above approaches to take in order fix the issue.

@sitaram-kalluri
Copy link
Member

Basing on discussion between Jagan, Murali and myself, the approach-1 seems feasible.

Also the initial thought to skip signing keys is to resolve the decryption error in the sync conflict resolution ( because the data in the server is not encrypted which we are trying to decrypt in client side)

Later we added a check 'isEncrypted ' before decrypting the data in sync conflict resolution. Hence skip signing keys is no longer needed.

Please share your thoughts on this?

@gkc
Copy link
Contributor Author

gkc commented Oct 9, 2022

  1. Have you tested to prove that "skip signing keys" is no longer necessary?
  2. In any event, I would prefer that we explicitly prevent syncing of data that shouldn't be synced so my preference is to promote current canary ti prod. Note that to do that we need to do another canary release first so that the new promote workflow can work

@sitaram-kalluri
Copy link
Member

@gkc :

Tested the below scenario's on the canary servers and staging server

  1. On canary server, on-boarded atmosphere pro app and initial sync was successful.
  2. On canary server, able to send and receiver files successfully.

On Staging atsign's:

  1. Created two new atsign's and able to onboard the app with back-up keys file successfully.
  2. With atmosphere pro app, able to send and receive files successfully.

@sitaram-kalluri
Copy link
Member

sitaram-kalluri commented Oct 10, 2022

Tested the issue on a staging atsign:

onboard_new_atsign_app.mp4

@sitaram-kalluri
Copy link
Member

image

  1. New client sends sync:-1 to the server
  2. Server sends the sync response which includes signing private key and signing public key
  3. On the client side, the entries gets updated into the keystore but not into the commit log
    (Due to changes in at_persistence_secondary_server v3.0.36).
  4. Now, the localCommitId will always be less than the serverCommitId leading the sync process to get into infinite loop
  5. Hence the public encryption key is not synced to the cloud secondary which leads to onboarding process failure

As an immediate solution, we are reverting the changes related to the skip sync of signing keys and statsNotificationId and publishing the at_persistence_secondary_server and at_client packages.

We will investigate possible approaches to accommodate skipping of certain keys on client/server. Also we have to revisit the condition that result in sync loop.

@gkc
Copy link
Contributor Author

gkc commented Oct 11, 2022

Got it. And to be specific, the condition that never becomes false is in _syncFromServer, while (serverCommitId > localCommitId) which means we never get to the part of the sync process which syncs from the client

@gkc
Copy link
Contributor Author

gkc commented Oct 11, 2022

Reopening as we have not yet published the new package nor pushed the new canary server release

@gkc gkc reopened this Oct 11, 2022
@sitaram-kalluri
Copy link
Member

Fixes the issues in at_persistence_secondary_server with 3.0.39 and at_client 3.0.41

@gkc gkc added the 8 SP 8 Story Points - 5 Days Large label Oct 14, 2022
@gkc
Copy link
Contributor Author

gkc commented Oct 14, 2022

Complete. Turned out to be a lot of work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
8 SP 8 Story Points - 5 Days Large blocker Flagging for resolution bug Something isn't working P1 Priority 1 unplanned Unplanned work mid-sprint plan
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants