Account migration #2179

Merged
merged 69 commits into from
Feb 21, 2024

Conversation

Collaborator

@dholms dholms commented Feb 14, 2024

Adds account migration flow based on schemas in #2170
Entryway impl: #2185

Supporting features for account migration:

  • ability to "import" a repo and index the diff (configurable to be not allowed)
  • ability to request a signed short-lived service auth token for account
  • ability to create an account on a PDS with an existing DID, based on proof of ownership of that DID (a service auth token sent as the authorization header)
  • ability to request an arbitrary signed PLC operation from the PDS. This route requires confirmation through an email token
  • a route for a PDS to submit a PLC op on the user's behalf. The PDS will check the PLC to ensure that it correctly configures the DID. If it would break the account (bad pds endpoint/signing key/rotation keys/handle), it does not allow submission
  • a route to get a recommendation of DID credentials for a user's account
  • a route to introspect the status of an account - repo state, indexed record count, whether it is active or not, whether the DID is active or not, how many blobs are missing, etc
  • a route for an account to list missing blobs in its repository, along with the record URI that each blob is associated with
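
The PLC submission check in the list above could be sketched roughly as follows. This is a minimal illustration, not the PDS implementation: the `PlcOpSketch` and `ExpectedCredentials` shapes and the `findPlcOpProblems` helper are hypothetical, and the field names (`atproto_pds`, `atproto`, `at://` handles) are assumed from the did:plc operation format.

```typescript
// Hypothetical, simplified shape of a PLC operation (assumed field names).
interface PlcOpSketch {
  rotationKeys: string[]
  alsoKnownAs: string[]
  verificationMethods: Record<string, string>
  services: Record<string, { type: string; endpoint: string }>
}

// Hypothetical description of what this PDS expects for the account.
interface ExpectedCredentials {
  pdsEndpoint: string
  signingKey: string
  rotationKey: string
  handle: string
}

// Returns a list of reasons the op would break the account; empty means OK.
function findPlcOpProblems(
  op: PlcOpSketch,
  expected: ExpectedCredentials,
): string[] {
  const problems: string[] = []
  if (op.services['atproto_pds']?.endpoint !== expected.pdsEndpoint) {
    problems.push('bad pds endpoint')
  }
  if (op.verificationMethods['atproto'] !== expected.signingKey) {
    problems.push('bad signing key')
  }
  if (!op.rotationKeys.includes(expected.rotationKey)) {
    problems.push('missing rotation key')
  }
  if (op.alsoKnownAs[0] !== `at://${expected.handle}`) {
    problems.push('bad handle')
  }
  return problems
}
```

A route built on this idea would refuse submission whenever the returned list is non-empty.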

Smuggled into this is the ability to activate/deactivate an atproto account. When deactivated, the account cannot make any writes to its repository, and reads for the repository are not served to any viewers (other than the account in question).
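
The deactivation rule described here amounts to a small access check. A minimal sketch, assuming a hypothetical `AccountState` shape and helper names that are not taken from the PR:

```typescript
// Hypothetical account shape; only the fields needed for the check.
interface AccountState {
  did: string
  active: boolean
}

// A deactivated account cannot write to its own repository.
function canWriteRepo(account: AccountState): boolean {
  return account.active
}

// Reads are served to everyone while active, and only to the
// account itself while deactivated.
function canReadRepo(account: AccountState, viewerDid: string | null): boolean {
  return account.active || viewerDid === account.did
}
```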

@@ -136,6 +137,14 @@ export class SqlRepoReader extends ReadableBlockstore {
    return builder.execute()
  }

  async countBlocks(): Promise<number> {
Collaborator

Just a note that this could cause some blocking.

Collaborator Author

@dholms dholms Feb 20, 2024

Since it's only used for reads (and we're in WAL mode), I don't think it would block, but it might fail with a SQLITE_BUSY and have to retry

Collaborator

Ah sorry didn't explain that well! I was picturing blocking the main app thread. E.g. if there's a second of work in sqlite to do here, the app thread doing the work will block for a second.

-const blob = await ctx.actorStore.transact(requester, (actorTxn) => {
-  return actorTxn.repo.blob.addUntetheredBlob(input.encoding, input.body)
-})
+const blob = await ctx.actorStore.transact(
Collaborator

The changes in here underline for me how it might be beneficial to split up the db work from the upload work. As I understand it, as long as this transaction remains open, other writes to the same actor store will block or time out. Adding a read into the transaction makes it a little higher stakes now, as I think that kicks sqlite into doing some additional locking.

Collaborator Author

That's a good point, I'll come back to this so that it doesn't hold the txn open while uploading 👍
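
One way to read the suggestion above is: do the slow upload first, then open a short transaction that only records the result. The sketch below illustrates that ordering with in-memory stand-ins. `FakeBlobstore`, `FakeActorStore`, `BlobRef`, and `uploadBlobSketch` are all hypothetical and do not reflect the actual PDS APIs or the follow-up change.

```typescript
// Hypothetical result of tracking an uploaded blob.
interface BlobRef {
  tempKey: string
  size: number
}

// Stand-in for a blobstore; records events so the ordering is visible.
class FakeBlobstore {
  constructor(private events: string[]) {}

  // Simulates a slow upload and resolves with a temporary storage key.
  async putTemp(bytes: Uint8Array): Promise<string> {
    this.events.push('upload:start')
    await new Promise((resolve) => setTimeout(resolve, 10))
    this.events.push('upload:done')
    return `temp/${bytes.length}`
  }
}

// Stand-in for an actor store whose transaction blocks other writers.
class FakeActorStore {
  constructor(private events: string[]) {}

  async transact<T>(fn: () => Promise<T>): Promise<T> {
    this.events.push('txn:open')
    try {
      return await fn()
    } finally {
      this.events.push('txn:close')
    }
  }
}

async function uploadBlobSketch(
  blobstore: FakeBlobstore,
  actorStore: FakeActorStore,
  bytes: Uint8Array,
): Promise<BlobRef> {
  // 1. Slow I/O happens with no transaction open.
  const tempKey = await blobstore.putTemp(bytes)
  // 2. The transaction only records the already-uploaded blob.
  return actorStore.transact(async () => ({ tempKey, size: bytes.length }))
}
```

With this ordering the transaction stays open only for the bookkeeping step, so other writers to the same store wait for a much shorter time.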

Comment on lines +26 to +28
await ctx.actorStore.transact(did, (store) =>
  importRepo(store, input.body),
)
Collaborator

I do think that in general—i.e. even if not given our implementation/policy—we can expect this to last long enough that it might time out the connection. And some implementations might want to plop it on a job queue and deal with it totally asynchronously.

Even we might consider just tossing the repo car on disk or into s3 then working with it streaming or in parts to make it robust to repos that don't happily fit into memory.

Collaborator

(To be clear I think this is fine for our current purposes, mostly want to leave the door open to these possibilities.)

Collaborator Author

yeah hmmm. is there a way that we can do both? like process it synchronously in most cases, but drop into async if it's large enough? Not sure how that would surface in schemas 🤔

Comment on lines +42 to +43
if (roots.length !== 1) {
  throw new InvalidRequestError('expected one root')
Collaborator

In the future we might look into failing earlier for this case which could feasibly be detected up front, just a slight vector for abuse.

Collaborator Author

yeah that's a good point 👍

we need a better CAR parsing lib first 😅

Comment on lines 74 to 76
throw new InvalidRequestError(
  `Could not parse record at '${write.collection}/${write.rkey}'`,
)
Collaborator

@devinivy devinivy Feb 20, 2024

I think any errors in here will end up unhandled and cause a crash. What we probably want to do is set up an AbortController and use p-queue's support for signals to thread it through and bail on work once there's a failure. Once the diff.writes loop ends and the queue is determined idle, we can call signal.throwIfAborted() to raise the original error if one occurred.

Collaborator Author

oof whatdya know 🙃

i'll fix this up

Collaborator Author

probably worth a quick peek: 153c90b

Collaborator

@devinivy devinivy Feb 20, 2024

Probably worth simulating to be sure, but looks good 👍

Collaborator Author

yup i gave it a test & it worked how you'd expect 👍
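
For readers following this thread, the abort pattern discussed here can be sketched as below. To stay self-contained, this uses a small hand-rolled worker pool in place of p-queue, so it is an approximation of the idea rather than the code in 153c90b. `runAllSketch` and its parameters are hypothetical names.

```typescript
// Runs tasks with bounded concurrency. After the first failure, no new
// tasks are started, and the original error is re-raised at the end.
async function runAllSketch(
  tasks: Array<(signal: AbortSignal) => Promise<void>>,
  concurrency: number,
): Promise<void> {
  const controller = new AbortController()
  const { signal } = controller
  let next = 0

  const worker = async (): Promise<void> => {
    while (next < tasks.length && !signal.aborted) {
      const task = tasks[next++]
      try {
        // Tasks receive the signal so long-running work can bail early.
        await task(signal)
      } catch (err) {
        // The first error becomes the abort reason; later ones are dropped.
        if (!signal.aborted) controller.abort(err)
      }
    }
  }

  const workerCount = Math.min(concurrency, tasks.length)
  await Promise.all(Array.from({ length: workerCount }, () => worker()))

  // Everything is idle now; raise the original error if one occurred.
  signal.throwIfAborted()
}
```

Because every task error is caught inside the worker and surfaced through the signal, nothing is left as an unhandled rejection, and the caller sees a single rejection carrying the first error.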

-const user = await ctx.accountManager.getAccount(did)
+const user = await ctx.accountManager.getAccount(did, {
+  includeDeactivated: true,
+  includeTakenDown: true,
Collaborator

I don't think we let taken-down accounts get to this point, thanks to accessCheckTakedown.

Collaborator Author

ah right 👍

-const account = await ctx.accountManager.getAccountByEmail(email)
+const account = await ctx.accountManager.getAccountByEmail(email, {
+  includeDeactivated: true,
+  includeTakenDown: true,
Collaborator

Adding support for taken-down accounts to reset their password, update their email. Just double-checking we like that 👍

Collaborator Author

Removed it from update email, but I do think it's important for resetting your password 👍

You may want to reset your password in order to export some of your private data or get a signed PLC operation for instance

)
}

if (!pdsEndpoint || pdsEndpoint !== ctx.cfg.service.publicUrl) {
Collaborator

If trailing slashes aren't significant, we might consider permitting the pds endpoint both with and without the trailing slash.

Collaborator

@devinivy devinivy Feb 20, 2024

(If we pursue this, similar logic also appears in identity.submitPlcOperation.)

Collaborator Author

I'm inclined to just leave it more conservative for now, and we can make it more lax over time if we decide to. Easier to go that way than the other 🤔
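
For reference, the laxer comparison floated in this thread might look something like the sketch below. The PR keeps the strict equality check, so this only illustrates the alternative. `stripTrailingSlash` and `isSameEndpoint` are hypothetical helpers.

```typescript
// Removes a single trailing slash, if present.
function stripTrailingSlash(url: string): string {
  return url.endsWith('/') ? url.slice(0, -1) : url
}

// Treats endpoints as equal when they differ only by one trailing slash.
// A missing endpoint never matches, mirroring the `!pdsEndpoint` guard.
function isSameEndpoint(
  pdsEndpoint: string | undefined,
  publicUrl: string,
): boolean {
  if (!pdsEndpoint) return false
  return stripTrailingSlash(pdsEndpoint) === stripTrailingSlash(publicUrl)
}
```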

Comment on lines 250 to 254
try {
  return await this.userDidAuth(reqCtx)
} catch {
  return { credentials: null }
}
Collaborator

I'd recommend we require the auth be valid if it's present. It's conservative on the server side, but also usually friendly to the caller too, since if they included auth they probably intended to take action in the context of those credentials. They would also be aware that they were holding onto some bad credentials.

Collaborator Author

Yup that's fair 👍
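
The behavior agreed on here (auth is optional, but must be valid when present) can be sketched as follows. This is a synchronous simplification with hypothetical names (`optionalUserAuthSketch`, `Credentials`, `AuthOutput`); the real verifier is async and takes a request context.

```typescript
// Hypothetical credential and output shapes.
type Credentials = { did: string }
type AuthOutput = { credentials: Credentials | null }

function optionalUserAuthSketch(
  authHeader: string | undefined,
  verify: (header: string) => Credentials,
): AuthOutput {
  // No auth supplied: proceed as an unauthenticated request.
  if (authHeader === undefined) {
    return { credentials: null }
  }
  // Auth supplied: it must verify. Errors propagate to the caller
  // instead of being swallowed into `credentials: null`.
  return { credentials: verify(authHeader) }
}
```

A caller holding bad credentials then gets an explicit error rather than a silently unauthenticated response.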

Comment on lines +260 to +263
adminService = async (reqCtx: ReqCtx): Promise<AdminServiceOutput> => {
  const payload = await this.verifyServiceJwt(reqCtx, {
    aud: this.dids.entryway ?? this.dids.pds,
    iss: [this.dids.admin],
Collaborator

In a world where pdses have dids, I wonder if we'll eventually want service auth to include the pds as the issuer and the user as the subject. Not relevant to this PR, just wanted to preserve the thought before it slips away!

Collaborator Author

yup I could totally see that 👍

Collaborator

@devinivy devinivy left a comment

What a dream to see this working, it came out real nice 🤤

Base automatically changed from account-migration-lexicons to main February 21, 2024 01:23
@dholms dholms merged commit 30b05a7 into main Feb 21, 2024
10 checks passed
@dholms dholms deleted the account-migration branch February 21, 2024 01:29
2 participants