Repo rebases #730

Merged
merged 304 commits into from
Apr 4, 2023

@dholms dholms commented Mar 28, 2023

This implements rebases in both the repository library & on the PDS.

Note: this does not yet add a rebase XRPC method to the PDS as it can be a costly operation. However, we will likely want to support that at some point.

We also currently support only the simplest case of rebase: no operations, just removing history.

For the PDS, the rebase process is:

  • wipe the existing bookkeeping (repo_commit_block & repo_commit_history)
  • add the new repo block
  • walk the repo & get a list of all cids that are still in it (including the new root block)
  • reindex all of those in repo_commit_block & repo_commit_history
  • left join ipld_block with repo_commit_block & delete all blocks from ipld_block that do not have a related repo_commit_block
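
The steps above can be sketched with in-memory stand-ins for the three tables. This is only an illustration, not the PDS implementation: the `Tables` and `Block` shapes, and the `walkRepo` and `rebase` helpers, are hypothetical names invented for this example, and a real block would be decoded from bytes rather than carrying a ready-made list of links.

```typescript
type Cid = string
// Hypothetical: a block that only records the CIDs it points to.
type Block = { links: Cid[] }

// Hypothetical in-memory versions of ipld_block, repo_commit_block, repo_commit_history.
interface Tables {
  ipldBlock: Map<Cid, Block>
  repoCommitBlock: Array<{ commit: Cid; block: Cid }>
  repoCommitHistory: Array<{ commit: Cid; prev: Cid | null }>
}

// Collect every CID reachable from `root`, including `root` itself.
function walkRepo(blocks: Map<Cid, Block>, root: Cid): Set<Cid> {
  const seen = new Set<Cid>()
  const stack: Cid[] = [root]
  while (stack.length > 0) {
    const cid = stack.pop() as Cid
    if (seen.has(cid)) continue
    seen.add(cid)
    const block = blocks.get(cid)
    if (!block) throw new Error(`missing block: ${cid}`)
    stack.push(...block.links)
  }
  return seen
}

function rebase(tables: Tables, newRoot: Cid, rootBlock: Block): void {
  // 1. wipe the existing bookkeeping
  tables.repoCommitBlock = []
  tables.repoCommitHistory = []
  // 2. add the new repo (root) block
  tables.ipldBlock.set(newRoot, rootBlock)
  // 3. walk the repo to find every cid that is still in it
  const live = walkRepo(tables.ipldBlock, newRoot)
  // 4. reindex those cids under the new root, which has no previous commit
  live.forEach((block) => tables.repoCommitBlock.push({ commit: newRoot, block }))
  tables.repoCommitHistory.push({ commit: newRoot, prev: null })
  // 5. delete every block that no longer has a related repo_commit_block entry
  Array.from(tables.ipldBlock.keys()).forEach((cid) => {
    if (!live.has(cid)) tables.ipldBlock.delete(cid)
  })
}
```

In this sketch the old commit block and anything only it referenced are dropped, while data still reachable from the new root survives.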

After that we:

  • assign all existing blobs to be associated with the new root
  • emit a seq event for the rebase, containing only the new root block
  • invalidate all previous seqs for that repo
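
A minimal sketch of the sequencing part of that list, assuming a simple in-memory event log. The `SeqRow` shape, the `invalidatedBy` field, and the `sequenceRebase` helper are hypothetical and are not taken from the PDS schema; they only illustrate emitting a rebase event that carries the new root block and marking earlier events for the same repo as invalid.

```typescript
type Cid = string

// Hypothetical row shape; the real sequencer table is not shown in this PR.
interface SeqRow {
  seq: number
  did: string
  eventType: 'append' | 'rebase'
  blocks: Cid[]
  invalidatedBy: number | null
}

function sequenceRebase(rows: SeqRow[], did: string, newRoot: Cid): SeqRow {
  const nextSeq = rows.reduce((max, row) => Math.max(max, row.seq), 0) + 1
  // invalidate all previous seqs for that repo
  rows.forEach((row) => {
    if (row.did === did && row.invalidatedBy === null) row.invalidatedBy = nextSeq
  })
  // emit a seq event for the rebase, containing only the new root block
  const evt: SeqRow = {
    seq: nextSeq,
    did,
    eventType: 'rebase',
    blocks: [newRoot],
    invalidatedBy: null,
  }
  rows.push(evt)
  return evt
}
```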

dholms and others added 30 commits March 15, 2023 17:31
…-app-migration

Lexicon: declaration, assertion/confirmation, follow app migration
…gration

Lexicon: votes to likes app migration
Grapheme counting for lex + increase post size
@dholms dholms changed the base branch from main to feature/subscription-revamp March 28, 2023 18:56
Base automatically changed from feature/subscription-revamp to lex-refactor March 28, 2023 20:37
Base automatically changed from lex-refactor to main March 31, 2023 17:34

@devinivy devinivy left a comment

💎 looks solid!

packages/pds/src/sequencer/events.ts (outdated)
@@ -69,7 +75,7 @@ export class RepoService {
     const commitData = await this.formatCommit(storage, did, writes, swapCommit)
     await Promise.all([
       // persist the commit to repo storage
-      await storage.applyCommit(commitData),
+      storage.applyCommit(commitData),
Collaborator

👀 nice snag!
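
For readers wondering why the stray `await` matters: an `await` inside the array literal runs that call to completion before the next array element is even evaluated, so the work passed to `Promise.all` no longer overlaps. The sketch below shows the difference using made-up tasks; `task`, `sleep`, and the `indexWrites` label are hypothetical and only stand in for whatever else the real `Promise.all` runs alongside `storage.applyCommit`.

```typescript
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

// Records when a task starts and ends so the ordering can be inspected.
async function task(name: string, log: string[]): Promise<void> {
  log.push(`start ${name}`)
  await sleep(10)
  log.push(`end ${name}`)
}

// Before the fix: the inner await serializes the two tasks.
async function withInnerAwait(log: string[]): Promise<void> {
  await Promise.all([
    await task('applyCommit', log),
    task('indexWrites', log),
  ])
}

// After the fix: both tasks start before either one finishes.
async function withoutInnerAwait(log: string[]): Promise<void> {
  await Promise.all([
    task('applyCommit', log),
    task('indexWrites', log),
  ])
}
```

With the inner `await`, the log reads start/end/start/end; without it, both `start` entries appear before either `end`.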

Comment on lines -121 to -140

// DataStores
// ---------------

export type DataValue = {
key: string
value: CID
}

export interface DataStore {
add(key: string, value: CID): Promise<DataStore>
update(key: string, value: CID): Promise<DataStore>
delete(key: string): Promise<DataStore>
get(key: string): Promise<CID | null>
list(count?: number, after?: string, before?: string): Promise<DataValue[]>
listWithPrefix(prefix: string, count?: number): Promise<DataValue[]>
getUnstoredBlocks(): Promise<{ root: CID; blocks: BlockMap }>
writeToCarStream(car: BlockWriter): Promise<void>
cidsForPath(key: string): Promise<CID[]>
}
Collaborator

👍 I support, good call.

Collaborator Author

Yeah that was added in a certain frame of mind. But it's only been adding friction recently & it's not clear when/if we'll be supporting other datastores 👌

@@ -124,6 +124,14 @@ export class RepoBlobs {
.execute()
}

async processRebaseBlobs(did: string, newRoot: CID) {
Collaborator

Maybe a question for another day, but I think as-is blobs on deleted records will stick around, including after a rebase. So there could be blobs associated with the rebase commit, but no record present in that commit referencing them. Hmmm!

Collaborator Author

ah that's a good point 🤔

alright I guess I'll probably have to list repo-blobs then
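
One way to picture the concern raised in this thread: if every existing blob row is simply re-pointed at the new root, blobs whose records were deleted survive the rebase. A rough sketch of the alternative, keeping only blobs whose record is still present, is below. The `RepoBlobRow` shape and the `rebaseBlobRows` helper are hypothetical and are not the PDS schema or the fix that was eventually committed.

```typescript
type Cid = string

// Hypothetical row: which blob is referenced by which record, as of which commit.
interface RepoBlobRow {
  did: string
  cid: Cid
  recordUri: string
  commit: Cid
}

// Re-associate blobs with the new root, dropping rows for this repo whose
// record is no longer present. Rows belonging to other repos are left alone.
function rebaseBlobRows(
  rows: RepoBlobRow[],
  did: string,
  liveRecordUris: Set<string>,
  newRoot: Cid,
): RepoBlobRow[] {
  return rows
    .filter((row) => row.did !== did || liveRecordUris.has(row.recordUri))
    .map((row) => (row.did === did ? { ...row, commit: newRoot } : row))
}
```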

Comment on lines 168 to 172
.deleteFrom('ipld_block')
.where(
'cid',
'in',
this.db.db
Collaborator

Do we want this to also be constrained by did?

Suggested change
-      .deleteFrom('ipld_block')
-      .where(
-        'cid',
-        'in',
-        this.db.db
+      .deleteFrom('ipld_block')
+      .where('creator', '=', this.did)
+      .where(
+        'cid',
+        'in',
+        this.db.db

Collaborator

Ohh I see, this is just sort of general cleanup that runs 👍

Collaborator Author

altho actually you're right because we don't have an index on just cid

packages/pds/src/sql-repo-storage.ts (outdated)
@dholms dholms merged commit ac905f3 into main Apr 4, 2023
@dholms dholms deleted the repo-rebase branch April 4, 2023 14:58
mloar pushed a commit to mloar/atproto that referenced this pull request Nov 15, 2023
* pr feedback

* count utf8 & grapheme length

* add maxUtf8

* switch max semantics

* plural

* update post schema

* added bytes & cid refs

* add ipld<>json

* fixin up a could tings

* Add app.bsky.richtext.facet, replace post entities with facets

* plural actors

* wip

* Setup backlinks table on pds

* wip

* send & receive cids/bytes with xrpc

* Track backlinks when indexing records on pds

* handle ipld vals in xrpc server

* added cids & bytes to codegen

* In createRecord, add deletions to avoid duplicate likes/follows/reposts

* Tests and fixes for prevention of dupe follows, likes, reposts

* Backlink migration tidy

* cleanup dag json parser

* Fix dupe backlink inserts

* Tidy

* blob refs + codegen

* Make profile displayName optional

* Test view and updateProfile for empty display name

* working into pds

* Make aggregate counts optional on post and profile views

* Make viewer state optional on post view for consistency

* Remove deprecated myState field on profile view

* Tidy repo method descriptions

* tests & types & fixes

* Implementation and tests for putRecord

* Remove updateProfile method

* Update repo service so that head can be taken for update externally

* Lex updates for compare-and-swap records/commits

* Add error to lex for bad repo compare-and-swaps

* Improve update-at-head thru repo service

* common package

* Implement and test compare-and-swaps on repo write methods

* Use lex discriminator for applyWrites

* Remove post entity/facet index

* Update lex descriptions to clarify repo write semantics

* Make deleteRecord idempotent w/ tests

* cleanup

* fix things up

* adding more formats

* tests

* updating schema

* Only generate tid rkeys on pds, support literal rkeys on client

* Add backlink indexes

* Update format of post embed views, fix external uri validation

* fixing up tests

* Include embeds on record embeds

* cleanup

* Notify users when they are quoted

* Remove determineRkey indirection

* fix api tests

* support concatenated cbor

* integrating to server

* re-enable tests

* fix up tests

* Thread compare-and-swaps down into repo service rather than use pinned storage

* Tidy

* Update packages/common/tests/ipld-multi.test.ts

Co-authored-by: devin ivy <devinivy@gmail.com>

* Update packages/lexicon/src/validators/formats.ts

Co-authored-by: devin ivy <devinivy@gmail.com>

* pr feedback

* pr feedback

* Add postgres-specific migration path for missing profile display names

* Tidy/clarify deep embeds

* Tidy

* rm unused escape

* decrease crud race count

* update subscribeRepos lexicon

* Fix applyWrite lexicon re: collection fields

* sign post event type

* update cids & bytes json encoding

* update lex blob & cid-link types

* updated codegen & pds

* number -> float

* missed a couple

* remove old image constraints

* pr feedback + descripts

* no hardcoded port numbers

* remove separate tooLarge evt

* fix dumb build error

* fixing up lex + xrpc server

* better parsing of message types

* dont mutate body in subscription

* bugfix in subscription

* rm commented out code

* init feature branch

* undo

* Remove old lexicons

* Remove creator from profile view

* wip

* rework seqs

* fixed up tests

* bug fixing

* sequence handles & notify in dbTxn

* tidy

* update lex to include times

* test syncing handle changes

* one more fix

* handle too big evts

* dont thread sequencer through everything

* Split common into server vs web-friendly versions

* Make lexicon, identifier web-safe using common-web

* Switch api package to be a browser build, fix identifier package for browser bundling

* Fix pds and repo for lexicon package changes, tidy

* Make common-web a browser build, tidy

* fixing up deps

* fix up test

* turn off caching in actions

* Standardize repo write interfaces around repo input

* Update repo write endpoints for repo input field

* Remove scene follows during app migration

* API package updates (bluesky-social#712)

* Add bsky agent and various sugars to the api package

* Add richtext library to api package

* Update richtext to use facets and deprecate entities

* Update richtext to use utf8 indices

* Richtext converts deprecated entity indices from utf16 locations to utf8 locations

* Add note about encodings in the lexicon

* Add RichText facet detection

* Remove dead code

* Add deprecation notices to lexicons

* Usability improvements to RichText

* Update the api package readme

* Add RichText#detectFacetsWithoutResolution

* Add upsertProfile to bsky-agent

* Update packages/pds/src/api/com/atproto/repo/applyWrites.ts

Co-authored-by: devin ivy <devinivy@gmail.com>

* pr feedback

* fix flaky timing streaming tests

* simplify emptyPromise

* fixed up open handles

* fix missed repo syntax

* fix error in test from fkey constraint

* fix another api agent bug

* Embed consistency, add complex record embed

* Tidy embed lex descriptions

* rename pg schemas

* use swc for jest

* fix up deps

* cleanup

* Update pds indexing, views, tests for complex record embeds

* fixing up profile view semantics

* wip

* update snaps

* Rename embed.complexRecord to embed.recordWithMedia

* Tidy around record w/ media embeds

* Add grapheme utilities to api RichText (bluesky-social#720)

Co-authored-by: dholms <dtholmgren@gmail.com>

* Fix: app.bsky.feed.getPostThread#... to app.bsky.feed.defs#... (bluesky-social#726)

* Update bskyagent to use repo param

* Minor typing fix

* setting up rebase in repo & storage

* repo tests

* Add exports to api package: blobref & lex/json converters (bluesky-social#727)

* Add exports to api package: BlobRef & lex/json converters

* Add an example react-native fetch handler

* Switch all lingering references of recordRef to strongRef

* Update lexicon for richtext facets to have multiple features, byte slice rather than text slice

* Implement multi-feature richtext facets on pds

* integrate into services & sequencer

* more tests

* Update api package to use updated richtext facets

* one more test

* Minor fixes to admin repo/record views

* Fix app migration exports, remove old app migration

* Fix: sort richtext facets so they can render correctly

* Disable app migration dummy checks that don't work on live deploy

* Optimize lex de/serialization using simple checks

* Tidy comment typos

* App migration to cleanup notifications for likes, follows, old scene notifs

* Fix notification reason for change from vote to like

* Update packages/pds/src/sequencer/events.ts

Co-authored-by: devin ivy <devinivy@gmail.com>

* pr feedback

* handle rebased blobs properly

---------

Co-authored-by: devin ivy <devinivy@gmail.com>
Co-authored-by: Paul Frazee <pfrazee@gmail.com>