Support Post Migration #12423

Open
SilverWolf32 opened this issue Nov 18, 2019 · 261 comments
Labels
suggestion Feature suggestion

Comments

@SilverWolf32

#177 – Support Account Migration – was closed after implementing follower migration, but this is only one small part of a true migration. To really be able to change instances, you need to be able to take your posts with you. There was some good discussion on this over there; I'm opening a new issue to make it clear that this is a separate concern from that issue, which seemed to evolve into only being about followers.

Personally, I couldn't care less about migrating my followers/following list. I can refollow people and have them refollow me. It'll all work out. My posts, however, are currently impossible to restore.

@MirceaKitsune

Once more I fully support this and have been eagerly awaiting the much-requested feature. I would love to see the ability to import the data archives (favorites, boosts, and toots) exported from the https://instance.social/settings/export section, which is currently a one-way system only. Unfortunately there doesn't seem to be a plan for this yet, but I'm staying hopeful that will change eventually. I believe Eugen mentioned performance concerns in the previous discussion.

@trwnh
Member

trwnh commented Nov 19, 2019

Reiterating points from previous discussion:

@SilverWolf32
Author

Thanks @trwnh!

status import cannot be backfilled for performance reasons

What's backfilling?

@trwnh
Member

trwnh commented Nov 20, 2019

backfilling is #34 -- fetching old statuses from a profile that were never delivered to the instance. you can't expect to deliver 100,000 activities every time you migrate, because that's wildly inefficient.

@MirceaKitsune

MirceaKitsune commented Nov 20, 2019

Regarding performance and backfilling: This is why I believe that post migration is possible, but those who choose to use it will need to accept a long waiting period. There will probably have to be three types of limitations to make it viable:

  1. An account queue. For example, the instance may only allow up to 5 accounts to be migrated simultaneously. The archive import menu should show a message saying something like "you have to wait until 10 more accounts finish migrating before we start the import process".
  2. Every post and action will be migrated with a given delay. This could be one post per 10 seconds. The import menu should then show the message "10 out of 1000 posts migrated" to let you know about the status of the procedure.
  3. Only a certain number of the latest activities will be imported. If an archive contains 1.000.000 actions, the interface informs the user that only the latest 100.000 will be considered.

Obviously, if you're importing something like 10.000 boosts + faves + toots, the process may take days if not weeks. It's a price the user will have to pay. But in this format it sounds like it should be performance viable at least.
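
To get a feel for the numbers, the three limits above can be put into a small estimate. This is only a sketch: `plan_import` and all of its parameters (including the one-hour average per account) are hypothetical knobs made up for illustration, not existing Mastodon settings.

```python
from dataclasses import dataclass


@dataclass
class ImportPlan:
    start_delay_s: float   # time spent waiting in the account queue
    posts_imported: int    # posts kept after applying the cap
    posts_skipped: int     # oldest posts dropped by the cap
    import_time_s: float   # time to import the kept posts at the given rate


def plan_import(total_posts, accounts_ahead, max_parallel=5,
                seconds_per_post=10.0, max_posts=100_000,
                avg_account_time_s=3600.0):
    """Estimate a throttled import. Every parameter is a hypothetical knob."""
    # Limit 3: only the latest `max_posts` activities are considered.
    kept = min(total_posts, max_posts)
    skipped = total_posts - kept
    # Limit 1: accounts run in batches of `max_parallel`; wait for full batches ahead.
    batches_ahead = accounts_ahead // max_parallel
    start_delay = batches_ahead * avg_account_time_s
    # Limit 2: one post every `seconds_per_post` seconds.
    return ImportPlan(start_delay, kept, skipped, kept * seconds_per_post)


plan = plan_import(total_posts=10_000, accounts_ahead=10)
print(plan.import_time_s / 86_400, "days of importing")
```

With one post per 10 seconds, 10,000 items come to 100,000 seconds, a bit over a day of continuous importing before any queue wait is added.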

@SilverWolf32
Author

That makes sense! Personally I'm perfectly fine with them taking a long while to move over, so long as they move over eventually.

I'm not quite sure what the limit should be, though – one post per 10 seconds seems awfully slow to me, unless you have like 100 accounts being migrated at once. Maybe set it relatively fast by default (like 1/s or maybe even a little higher), since most instances are small and won't need to deal with a lot of migrations, then larger instances can set it slower?

@SilverWolf32
Author

SilverWolf32 commented Nov 20, 2019

Although I would really, really prefer if everything gets migrated, even if it takes weeks or even months. I don't want to lose my old posts – that's basically an archive of what I'm like, so new people can get to know me a little better before they decide whether to follow me. It's also history, and while some may not want to keep that sort of thing, others (like me) would.

(Even if you technically have everything in the backup, that's incredibly cumbersome to look at, given I'm not aware of any software that can actually import it.)

@trwnh
Member

trwnh commented Nov 20, 2019

stuff can get imported, but it is absolutely a bad idea to redeliver it. i am firmly of the opinion that a Move with post import should maybe do a rewrite of existing statuses in the database, but nothing more. it is unwise to even consider backfilling until the actual decoupling and importing gets implemented.

more to the point in general, though: it is fundamentally a bad idea to assume that every instance should have a copy of every single post at all times. we don't need to duplicate everything 6000x or more. what is fundamentally happening with a "migration" is that the authority is being shifted; the location is being shifted, that's all. it's a Bad Idea to treat them as new statuses. hopefully i've explained why

@SilverWolf32
Author

Hm...well, would they be new statuses? I figured they would simply be reassigned, but then maybe redelivered as the originals, not duplicated and "reposted".

Or maybe I'm completely misunderstanding how federation works. I know very little about the backend.

every instance should have a copy of every single post at all times. we don't need to duplicate everything 6000x or more

Is this actually being discussed? I'd say for a migration the situation is a bit different; only the destination server has to receive a copy of every single post, not everybody.

@trwnh
Member

trwnh commented Nov 20, 2019

only the destination server has to receive a copy of every single post

yeah, and it should be limited to this.

looking at prior art from zot, you can do an online migration (fetch old messages from your outbox and inbox to the new server) or an offline migration (using an exported account archive that contains at minimum your keys, your posts, and your address book)

@SilverWolf32
Author

That sounds lovely! I've been a little worried in the back of my mind about how to handle a server that just suddenly dies.

@lmachucab

Is there at least a first-party tool that users (not servers) can use to work with their downloaded backup and generate something usable? Something like an HTML file (plus media?) that can be uploaded to a remote server. That would at least temporarily solve the problem of making the lost content accessible again.

@MirceaKitsune

@lmachucab I made a NodeJS script that I used to port my faves and boosts to other accounts. Unfortunately it no longer worked last time I tried it, and it was a crude solution that put quite a bit of stress on the target server and even triggered an instance's flood protections once. If you think you might still have some use for it, it's available on my GitHub:

https://github.com/MirceaKitsune/mastodon_migrate

@masstransithonchkrow

Hi there! Mastodon.cloud literally has a month to live. We're being shut down because the domain is being forced to bow to insane regulations, and the admin isn't having any of it.
https://mastodon.cloud/@TheAdmin/104227496654670309

We had been the target of immense DDoS attacks for the past six months.
If you need help or simply wish to deliberate, please let me know.

@msikma

msikma commented May 27, 2020

I'd like to second that Mastodon.cloud disappearing shows this is a crucial feature. Mastodon is a great piece of software, and designing it to be decentralized was a great idea. Making it possible for anyone to set up an instance, yet having instances be able to communicate with one another, makes the network resilient and eliminates single points of failure.

But without the ability to jump from one instance to another at will, this decentralization has a large drawback. If an instance decides to call it quits, you're just going to lose your data. This also means that as a user, you can't be sure if the instance you picked is going to live a long or a short life. Given how many instances there are, it's totally reasonable to expect that some of them are going to disappear.

Aside from users choosing to migrate somewhere else, there should also be a migration plan for instance admins. If an instance shuts down, it shouldn't just be that active users who care about their account get to keep their data, because that means even if user migration is possible there's still going to be a big loss of online culture and data. I think this should probably be a separate issue, and it comes with a host of problems of its own, but without it I think we'll see Mastodon content regularly disappearing forever as instances come and go.

@MirceaKitsune

Yeah even I didn't expect this sad news. mastodon.cloud banned me randomly and without explanation after a few months of using it... at the time I was upset, now I'm glad I escaped this event. It's sad to hear even Japan is becoming a dangerous place for internet freedom and has a government failing to understand technology and the flow of information.

I know mastodon.cloud was one of the largest instances out there. If it can go down, the fediverse as a whole is at serious threat of significant data loss. That's why I would further implore that we please support FULL account migration now... even if as an option disabled by default which the instance admin can enable if they so wish, in case performance is such a big concern.

An equally good or even better idea would be a node-based approach to storage, similar in concept to IPFS / DAT: instead of an instance being stored on and served by one physical server, it could be stored in a decentralized cloud where anyone may run a node to serve a copy of the data. Sadly this would require such a rewrite that we'd be talking about a whole new project rather than the existing Mastodon.

@masstransithonchkrow

A reminder that we're supposed to "Own Your Own Data With Mastodon!" If we don't find a way to do that with toots, we are bound to be sued for false advertising. I smell a frivolous lawsuit that's bound to cripple the Fediverse and we need to get ahead of it.

While I may not ideologically agree with Gab, which accounts for 25% of the Fediverse's mass, they have every right to be here as we do, and we need to prepare for pointed media smear campaigns that could potentially harm our reputation.

We need to emphasize our content controls, and remind users that they can ban who they want from their own console, and that moderators do not have to do it for them. We have that advantage over Twitter and we need to advertise the heck out of it.

@Valenoern

I smell a frivolous lawsuit

Mastodon's own code is under the AGPL, right?
as I remember, it and the GPL have some fairly thorough sections saying you (gab) can't turn around and sue a project you built on (mastodon) because it didn't fit your use case, and that if they can't cancel liability/warranty, courts are asked to "arrange something as close to that as possible".

That's the situation you're talking about, right?

I'll agree the AGPL would have no effect on media outlets saying whatever charged things about mastodon they want to, though.

@masstransithonchkrow

That is an interesting point. I think they forked Mastodon, though (as of 2.8.5) and created Gab out of it. Ironically, it seems they were able to preserve their "toots" during the migration.

My comment revolved around the idea that you can "Own Your Own Data With Mastodon". If this is the case, we must ensure that people have the capability to export their toots; otherwise that slogan could be construed as false advertising. We must have a provision that allows users to migrate to another server if the one they're on is being shut down through no fault of its own (regulatory crackdowns in the case of Mastodon.cloud).

What happened in Minneapolis is a distraction. The attacks against the Fediverse and all other entities like it will intensify. We must fortify our infrastructure and address any outstanding issues so that we're ready for any exodus.

@masstransithonchkrow

Even a static representation of said toots is fine with me, similar to Twitter's archive, which you can navigate offline. They don't need to be integrated into the post chronology. Perhaps we could create a designation for such toots, and also prohibit their editing or removal to ensure a smooth migration.

@ccoenen

ccoenen commented May 30, 2020

there's an issue for that #9461 (export including some bare html-version of your toots)

With regard to the European GDPR, there's the rule that you must have access to a machine-readable version of your data. This is satisfied already. While this satisfies the letter of this particular law, I do wish it was easier for everyone to use that data (hence the ticket above). I also think re-importing would be nice.

Yet, I do not believe there's any grounds to your "false advertising" point. Sorry.

(edit: I linked to the wrong issue before, this has been corrected.)

@masstransithonchkrow

Thank you for clarifying that. I'll follow that issue too. ^_^

@CoWinkKeyDinkInc

I strongly support this. I have my own independent instance that I don't want to maintain anymore so I moved over to another instance. DigitalOcean doesn't let you download complete images as a backup so I can't wait to upgrade to a later version that supports it and transfer the old content over.

@lmachucab

This is a feature that it's impressive that Mastodon does not have in 2020, honestly. Without the ability to export and import post history not only can accounts be lost but also identities - as even if an account is migrated to another server, at the moment this is a partial "in name only" process. Any content that the old account had posted is still bound to the lifetime of the old service (and its FQDN).

After having tested some scripts for extracting the information from the exported data dumps, I think there are enough of those scripts and projects wandering about that it's perfectly feasible to grab one, integrate it into Mastodon, and offer it as a utility that, e.g., generates full sessions / scripts for clients like toot to execute, resulting in a toot post or toot upload of parts of or the whole archive in HTML or a similar readable format.

@mattbk

mattbk commented Nov 23, 2020

@lmachucab do the existing scripts deal with not treating the imported posts as new posts that would be picked up for federation? From what I'm reading above, that's what should be avoided to reduce load on servers.

I would be happy if my posts were only viewable from my account page/feed rather than immediately (or ever) federated (i.e., they would only be federated if someone was looking at my account). All I want to be able to do is change my username or move to another instance if needed.

@masstransithonchkrow

masstransithonchkrow commented Nov 29, 2020

I thought I'd bring this up:

One of the SPC admins revived bofa.lol, a previously discontinued instance, only for it to regurgitate its old content over TWKN, some of which was excessively vile. bofa.lol aside, I'd like to focus on the mechanism that caused those toots to refederate and see if we can harness that mechanism to archive them for other instances.

As of this writing, bofa.lol was taken down less than a day after it was briefly revived.

*Edited for context and some spelling mistakes at 21:50-0500

@Beyarz

Beyarz commented Jul 3, 2021

This is a feature that it's impressive that Mastodon does not have in 2020, honestly.

2021*

@TomatDividedBy0

TomatDividedBy0 commented Jul 25, 2021

This is going to be incredibly important to ensuring Fediverse platforms don't centralize around the most popular instances. If I have the ability to completely move my account from one instance to another with minor hassle, chances are I'm going to be more willing to take risks on smaller instances. Some of my first accounts on Masto/Peertube were on small instances, but they ended up closing down on me unexpectedly.

@z3dx95g7

@Gargron, do you consider this to be a feature worth pursuing? if so, what is needed to make it happen? it's been open for nearly 2 years without any comments from mastodon's lead, so i thought i would ask.

@dj-sf

dj-sf commented Aug 28, 2023

hi @juliennnnn , I've been looking into the approach that I suggested in my above messages. I'm making progress on the research side but there is a lot to learn and I have not started implementing anything yet. My goal is to have a POC complete by the end of 2023 and by then we should better understand whether this approach is feasible.

I'm also looking for others who are interested in working on this, so if you know anyone, please feel free to put them in touch with me. It might help speed things up.

@MastodonContentMover

MastodonContentMover commented Sep 3, 2023

Apologies for the delay in sharing this here. A development/test version of the solution outlined here is now available as an external command-line tool, at https://mastodoncontentmover.github.io/

It's not as sophisticated a solution as many in this thread are looking for, but it provides basic functionality to move post content, including media attachments, for people who need to migrate between instances. It does not preserve threaded interactions, but it does preserve self-reply threads (that is, threads where all posts are by the account owner themselves).

It respects the standard Mastodon API rate limits with an additional margin, and users can optionally slow the tool's activity even more. Media posts in particular are significantly throttled because media processing is resource intensive, and by default it suppresses public posts so that timelines are not flooded. It's possible to selectively save and/or repost only bookmarked posts. It runs on any platform that has a Java runtime environment (version 1.7 or above), and the source code is available on GitHub.

It is only intended as a workaround; I built it because I needed to move my own content, but was mindful throughout that this functionality was also needed by others — it makes sense to share it, but because of how it came to be and because of time constraints it is a little rough around the edges.

Nonetheless, I hope it helps some of those facing the loss of their content due to the absence of this functionality from Mastodon itself, and ultimately I hope it might prompt Mastodon to consider building at least this basic level of functionality into core (which would offer some advantages such as allowing admins more control over post importing and how that is prioritised on their instance, and possibly allowing posts to be correctly backdated as well).

@haley-exe

Really fantastic to see some progress toward a content migration solution! Thank you for sharing your work.

@erlend-sh

I favor a very simple ‘forwarding address’ approach that leans on existing federation features as a form of archiving:

Regarding the upcoming W3C meeting on data portability:

https://cosocial.ca/@evan/111036813575691039

..I’m in favor of (mostly already supported) redirects as opposed to rewriting the ownership of legacy posts.

I think of my AP posts as physical letters being sent out en masse. Once sent, the address on that letter can’t be changed, but response letters sent to my old address can be forwarded to my new one.

When I move to a new server, here’s my plan:

  1. Make account on new server

  2. While the new account has 0 followers, I re-announce (boost) every single original post on my legacy account via a scripted action.

  3. Then I migrate the actual account, moving my follow lists across.

The boosts act as a form of federated archiving in case the legacy server shuts down. I think all I’m missing is a kind of 301 redirect status that points my old posts to my new ones.

https://writing.exchange/@erlend/111045396211041648
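
The scripted boost in step 2 could look roughly like the sketch below. It only builds the list of requests to run against the new server; it does not send them. The endpoint paths are the Mastodon client API's search (with `resolve`) and reblog routes as I understand them; `boost_requests` and the example URL are made up for illustration, and a real script would also need an access token and would have to respect rate limits.

```python
from urllib.parse import urlencode


def boost_requests(legacy_post_urls):
    """Yield (method, path) pairs to run against the NEW server, in order.

    For each legacy post: first ask the new server to resolve the remote URL
    into a local status, then boost that status. "{id}" is left as a
    placeholder because the local id is only known from the search response.
    """
    for url in legacy_post_urls:
        query = urlencode({"q": url, "resolve": "true", "type": "statuses"})
        yield ("GET", f"/api/v2/search?{query}")
        yield ("POST", "/api/v1/statuses/{id}/reblog")


for method, path in boost_requests(["https://old.example/@me/1"]):
    print(method, path)
```

Running this while the new account has no followers, as the plan suggests, keeps the burst of boosts out of anyone's home timeline.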

@dj-sf

dj-sf commented Oct 15, 2023

@MastodonContentMover this looks great! Will definitely help a lot of people solve this problem for themselves without requiring any major changes to mastodon's architecture.

I'm also continuing to work on the solution I described in my earlier posts. I'm hoping it will provide a seamless solution to the Fediverse's issues with account and post migration in general.

I've gotten started on it and have figured out a lot of the implementation details, but have been struggling to implement them due to my poor understanding of OAuth 2.0 and OIDC. I'm brushing up on that at the moment.

For more regular updates or if you want to get involved, follow the discussion thread I've started here.

@vmstan added the "suggestion" (Feature suggestion) label Nov 17, 2023
@golfinq

golfinq commented Feb 13, 2024

I think doing a form of auto-boosting into the new instance might be the simplest way forward. Internally this could be a special boost that would then be "assigned" to the new account so that it shows up in archives.

@omentic

omentic commented Feb 21, 2024

I've been reading through this and past issues to get a gist of what the technical problems here are. There's a lot of fluff above and in previous threads, so to save others the time of catching up here's a summary of what needs to be done.

  • Past toots need to be imported: Import toots from CSV #981. Imported toots must not be redelivered: this is just a changing of toot ownership. There is a potential performance issue. This is probably fine: imports can happen slowly, or could require admin approval. Concerns about trust are IMO not relevant.

If one is migrating from an instance about to bite the dust: that's all you need, really. This is pretty much implemented with MastodonContentMover above. It fucks up posting dates but that's not fixable without being an admin.


If one is migrating from a live instance, there's more to think about. Ideally, if the original server is up & cooperating, it would be nice for posts & conversations to function just as they did on the original post: i.e. there needs to be a way to say "this toot is now owned by a different instance, send your requests there". So if someone comes across a pre-migration post, they can see all replies to it and can like/boost/reply to it. To the user, this would probably look like: user likes post -> instance sends like request -> receives 301 Moved Permanently with a url -> instance sends new like request -> post is liked (the details may be wrong here). Unfortunately the implementation of this appears to be moderately complex and requires some things Mastodon does not currently have.

  • How does the original instance know where on the new instance the new post is? I believe Akkoma et al use GUIDs for posts, but Mastodon does not: a post id is only guaranteed to be locally unique. So you can't just say "the original post was at instance.social/@username/123456789 and the user migrated to migrated.social/@newsexyusername, therefore the new post is at migrated.social/@newsexyusername/123456789" like you could in an ideal world.
  • Mastodon currently internally uses a mixture of username-based APIs (ex. @username@instance.social) and identifier-based APIs (ex. local and remote users are uniquely identified by a number, say 1025612). This should change to use the identifier universally: then that 301 Moved Permanently in the above example only needs to happen once. There is an open issue for this: [v5.0] Use local id consistently -- not just API routes, but for URIs and internal references as well #10745 (this also would make Feature request: Make it possible to change a username #15320 essentially free)
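
A toy model of the second point, assuming accounts are referenced everywhere by a local numeric id: a single "moved" answer then only has to update one mapping, and later interactions go to the new home without another redirect. `RemoteAccounts` and its methods are hypothetical names for illustration, not Mastodon internals.

```python
class RemoteAccounts:
    """Toy model of one instance's view of remote accounts."""

    def __init__(self):
        self._next_id = 1
        self.id_by_handle = {}   # "user@host" -> local numeric id
        self.handle_by_id = {}   # local numeric id -> current "user@host"

    def add(self, handle):
        local_id = self._next_id
        self._next_id += 1
        self.id_by_handle[handle] = local_id
        self.handle_by_id[local_id] = handle
        return local_id

    def apply_move(self, old_handle, new_handle):
        """Handle a one-time 'moved permanently' answer for an account."""
        local_id = self.id_by_handle[old_handle]
        self.id_by_handle[new_handle] = local_id   # both handles share one id
        self.handle_by_id[local_id] = new_handle   # the id now resolves to the new home
        return local_id

    def deliver_target(self, handle):
        """Where should an interaction aimed at `handle` be sent now?"""
        return self.handle_by_id[self.id_by_handle[handle]]


accounts = RemoteAccounts()
accounts.add("username@instance.social")
accounts.apply_move("username@instance.social", "newname@migrated.social")
print(accounts.deliver_target("username@instance.social"))
```

This covers only the account half of the problem; mapping individual post ids across servers is the part the first bullet says cannot be done by simple substitution.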

Paging @trwnh as they seem to deeply understand the issues here: Is this an accurate summary? Did I miss anything? I'm unable to find a tracking issue for "use GUIDs", so I suppose that's just this one.

@ShadowJonathan
Contributor

Datapoint: Mastodon's federation inboxes aren't "smart", and shared inboxes are a thing. Likes are sent as ActivityPub objects, and those are just sent to inboxes. Mastodon's inboxes just store the incoming data in a queue to be processed later. This way, one request doesn't hold up the line, and the federation worker is immediately ready to receive the next one.

This means any "this post isn't ours anymore" response would have to happen asynchronously, or the "data owner" of a post would need to be changed and the URIs rewritten when that happens, though that has other problems (such as ID types not being consistent between servers, so every ID would need to be re-fetched to re-map them, possibly resulting in a multi-MB blob of remapping).
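
A minimal sketch of why the redirect cannot be part of the HTTP response, assuming a queue-backed inbox as described above. The `Inbox` class, the `moved` table, and the shape of the notice are all hypothetical; 202 stands in for "accepted, not yet processed".

```python
from collections import deque


class Inbox:
    def __init__(self, moved):
        self.queue = deque()
        self.moved = moved      # old object URI -> new object URI (hypothetical table)
        self.outgoing = []      # notices to send back to senders later

    def receive(self, activity):
        """HTTP handler stand-in: enqueue and acknowledge immediately."""
        self.queue.append(activity)
        return 202  # nothing has been inspected yet, so no redirect is possible here

    def process_one(self):
        """Worker stand-in: runs later, after the sender already got its 202."""
        activity = self.queue.popleft()
        target = activity["object"]
        if target in self.moved:
            # The only way to say "not ours anymore" is a separate, later message.
            self.outgoing.append(
                {"to": activity["actor"], "moved": target, "now_at": self.moved[target]}
            )
            return "redirected"
        return "applied"


inbox = Inbox({"https://a.example/p/1": "https://b.example/p/9"})
like = {"type": "Like", "actor": "https://c.example/u/x", "object": "https://a.example/p/1"}
print(inbox.receive(like), inbox.process_one())
```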

@omentic

omentic commented Feb 21, 2024

@ShadowJonathan I think the "this post isn't ours anymore" response happening asynchronously should be fine? (if it isn't, can you elaborate?) With #10745 this only ever happens once per instance after all as the instance will then update its user id (my example pipeline is a bit wrong in this respect, I wrote it before understanding #10745).

Also as I understand it ids as discussed in #10745 are already not consistent between servers.

@aigarius

Organizationally I would expect that if a user has migrated away from one (working) server to another, then the source server might not want to host or publish the content of the user's posts. With that in mind, the "this content isn't ours anymore" case can become a very simple problem to solve by having a single, static redirect on the user level. "That user is over there now, you might be able to find this post there or you might not" is a perfectly valid answer IMHO.

@trwnh
Member

trwnh commented Feb 21, 2024

Is this an accurate summary? Did I miss anything? I'm unable to find a tracking issue for "use GUIDs", so I suppose that's just this one.

@omentic yeah mostly, the only thing i'd point out is that using GUIDs isn't enough because anyone can claim any id. so you need to bind it to some authority -- either an HTTP origin (DNS hostname) or some public key cryptography alternative (which is its own big breaking change, but would ultimately grant more portability). "identity" in general has to be rooted in one of the two. you'd basically need an indirection layer along the lines of domain.example/@user or domain.example/users/147154782934 being aliases for some other independent identifier. of course the challenge is in getting everyone to agree on how to assign and resolve those identifiers. FWIW, bluesky and the AT protocol decided their answer was going to be "just use a single (placeholder/PLC) nameserver"... they could do that because they're the only stakeholder.

something like #10745 would make it easier to get there but would not solve the issue entirely on its own. it just allows the username to change. you'd still need a way to allow the authority to change (or otherwise defer to some other authority).

@omentic

omentic commented Feb 21, 2024

using GUIDs isn't enough because anyone can claim any id

@trwnh could you elaborate on this? What's wrong with having a Really Big GUID namespace and assigning ids randomly?

If I understand things, then right now definitionally that authority is username@example.social. Would there be any issues with just... using that to its full extent, and so having moved accounts create GUIDs relative to their original username/domain combo? I think we can rely on old username/domain combos not being reused, because IIRC that breaks federation in other ways.

@trwnh
Member

trwnh commented Feb 21, 2024

the problem is bad actors that intentionally use a conflicting id. it doesn't matter how big the namespace is. you need some authority, like a domain or otherwise signing the guid. right now the authority is https://example.com and under a key-based system it would be the keypair.

the best we can do right now is you could mint identifiers against some stable authority, like a PURL service. basically, if changing something is a problem, you have two possible solutions:

@silverpill

so you need to bind it to some authority -- either an HTTP origin (DNS hostname) or some public key cryptography alternative (which is its own big breaking change, but would ultimately grant more portability)

@trwnh In FEP-ef61 the authority is a DID (so it's a breaking change), but I think interoperability with existing software can be preserved if implementations generate IDs as HTTPS URLs containing a DID URL instead of just DID URLs (this idea is discussed in the "Compatibility" section of the proposal). This is similar to how IPFS objects can be referenced either by ipfs: URIs or by HTTPS URLs via a gateway.

@omentic

omentic commented Feb 21, 2024

@trwnh I still don't see the problem. Where do bad actors come into play? You have to trust the migrated-from instance to correctly point at your new profile, and subsequently point other instances to update their internal id mapping. Another instance's GUID is only relied upon by the migrated-to instance when importing posts, so that they have the same GUID (actually, going forth they wouldn't even need to use the old authority originalusername@example.social, if the point of the authority is for uniqueness: though i still don't understand why you need an authority to begin with). Where is the possibility for malicious behavior, given you necessarily trust the migrated-from instance and presumably trust the migrated-to instance?

In general I think that fully decoupling identity from instances requires a significant amount of additional complexity and could be done later anyway, so if it isn't strictly necessary for post migration I'd like to avoid thinking about it.

@omentic

omentic commented Feb 21, 2024

(you might have to kind of hand-hold me through an explanation of why bad actors are a problem here - i only started reading through these issues last night, and have no experience running an instance or dealing with federation)

@SteveDinn

Once a post's UID becomes known, any bad-acting instance admin can claim that post for their own just by adding a post with that ID to their server. At best, there is a duplicate ID now floating out there in the fediverse. At worst, the forged post is now considered to be the real post.

I believe the commenters above are correct; there has to be some tether to an authority to validate a post identifier.

@trwnh
Member

trwnh commented Feb 21, 2024

OK, so here's an example:

  • I make a post f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b
  • Someone else makes a post claiming to be post f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b
  • Given the identifier f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b, a resolver has to make some decision on which one to trust, or which one is more authoritative.
    • Option: anchor that authority to a domain, for example example.com/post/f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b
    • Option: anchor that authority to a keypair, in other words "this is post f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b as signed for by pubkey BCC9972F818152BEDAC75760250B07ED4BCC9F7D99FE8FE981194940FC39761E"
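
The first option (anchoring to a domain) can be sketched as an origin check: a copy of a post is trusted only if it was fetched from the same origin its id names. `accept_claim` is a hypothetical helper for illustration; the keypair option would need a signature check instead and is not shown.

```python
from urllib.parse import urlsplit


def origin(url):
    """Return (scheme, host) for a URL; both are empty for a bare GUID."""
    parts = urlsplit(url)
    return (parts.scheme, parts.netloc.lower())


def accept_claim(object_id, served_from):
    """Trust a copy of `object_id` only if it came from the id's own origin."""
    claimed = origin(object_id)
    if claimed[0] != "https" or not claimed[1]:
        return False  # a bare GUID names no authority, so nothing can vouch for it
    return claimed == origin(served_from)


guid = "f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b"
print(accept_claim(f"https://example.com/post/{guid}", f"https://example.com/post/{guid}"))
print(accept_claim(f"https://example.com/post/{guid}", f"https://evil.example/post/{guid}"))
print(accept_claim(guid, f"https://evil.example/post/{guid}"))
```

The bare GUID is rejected in every case, which is the point of the example: without a domain or a key attached, a resolver has no basis for preferring one claimant over another.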

The problem as it relates to migration of posts is that example.com will change, and domain2.example/post/f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b is not guaranteed to be the same. In fact, even the /post/id URI structure is not guaranteed. If it were guaranteed, then we could have some level of consistency in being able to resolve posts at a new host by "simply" substituting the hostnames with a find-and-replace.
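
For illustration, this is what the "simple" find-and-replace would look like if the /post/id structure were guaranteed on both hosts, which, as stated above, it is not. `rehost` is a hypothetical helper.

```python
from urllib.parse import urlsplit, urlunsplit


def rehost(post_url, new_host):
    """Swap only the hostname, keeping scheme, path, query and fragment."""
    parts = urlsplit(post_url)
    return urlunsplit(parts._replace(netloc=new_host))


old = "https://example.com/post/f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b"
print(rehost(old, "domain2.example"))
```

The result is only a guess at the new location: nothing obliges domain2.example to serve the same post, or any post, at that path.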

@omentic

omentic commented Feb 21, 2024

Ah, I see now.

I don't think this needs to be a problem. Posts currently have an internal UUID: there are duplicate IDs floating about between instances, and it doesn't matter because remote posts are not resolved by relying entirely on the UUID. What I thought was being proposed with the GUIDs (and what I think should be proposed) is that the "global" aspect of the GUID only becomes relevant upon post import. Things can otherwise function as they do now: although the identifier would be a true GUID if all actors were trusted, it is treated exactly how internal post identifiers are currently treated (as a UUID), with the sole exception being during post import, where you trust both parties.

So this would mean that upon a move to a new username/domain the old /post/id URI structure is guaranteed and we have some level of consistency in being able to resolve posts at a new host, through a one-time 301 Moved Permanently redirect that updates what Mastodon's internal user IDs resolve to.
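One way to picture the one-time 301 idea, as a sketch only: each server keeps a table of old post URIs and where they moved to, fills it in when the old host answers with a 301, and consults it on later lookups. `RedirectTable` and its methods are hypothetical names, and the HTTP fetch that observes the 301 is left out.

```python
class RedirectTable:
    """Maps old post URIs to their new location after a move."""

    def __init__(self) -> None:
        self._moved: dict[str, str] = {}

    def record_301(self, old_uri: str, new_uri: str) -> None:
        # Called once, when the old host answers 301 Moved Permanently.
        if old_uri != new_uri:
            self._moved[old_uri] = new_uri

    def resolve(self, uri: str, max_hops: int = 10) -> str:
        # Follow recorded moves; the hop limit guards against loops.
        for _ in range(max_hops):
            if uri not in self._moved:
                return uri
            uri = self._moved[uri]
        raise ValueError("too many redirects")


table = RedirectTable()
table.record_301("https://example.com/post/1", "https://domain2.example/post/1")
table.record_301("https://domain2.example/post/1", "https://domain3.example/post/1")
current = table.resolve("https://example.com/post/1")    # follows both moves
untouched = table.resolve("https://example.com/post/2")  # never moved
```

Note that this only works while the old host is still online and willing to serve the redirect the first time each remote server asks.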

@omentic

omentic commented Feb 25, 2024

(@trwnh if you've got the time, i'm curious if my thinking here is accurate? if so i might move ahead with a GUID proposal)

@kartonrad

@omentic
Hm.
The task you are trying to solve is the following.

How can Instance A, the moved-from instance, communicate the migration of a user to all other relevant instances in the network?

At the end of the day, all relevant instances need to change the information about the user's origin in their databases.

You can't just assign a GUID, because in fedi, there is no single Authority. Nothing ensures a GUID's uniqueness, nor can you resolve it uniquely to an origin server or a public key.

So, sure, using GUIDs would mean that all TRUSTED servers wouldn't have to worry about id collisions between posts.

However, any malicious server can claim: "I generated this GUID for this post. It has the content: 'i am a moron, signed, Donald Duck'"

Even though Mr. Duck has never posted something like that.

The mechanism we use to keep IDs globally unique right now is the Domain Name System.
By tying a post to an ID like "253737@mastodon.social" (which isn't how it works right now; IDs are currently full URLs), the AUTHORITY that can delete the post, receive replies to it, edit it, etc. is baked right in.

A "full move" of a post is essentially a transfer of authority, and given that ids are tied to the hostname, IDS MUST CHANGE in order for the move to take place.

The only way around this would be cryptography.
Instead of the hostname in the ID matching the sender being the proof of authority, a cryptographic signature (i.e. ownership of a private key) would be used as the proof of authority.

This is a massive breaking change with its own problems regarding key management and key ownership, as trwnh said.

But with cryptography, you could theoretically spin up a new instance, dig out your private key, and sign a thousand messages saying 'i'm here now'

And then all existing fedi servers would receive the proof, and add a redirection entry into their database, or change the IDs of existing objects, though that would be a greater risk to data integrity.
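A very rough sketch of what a receiving server might do with such an "I'm here now" notice, assuming it already stored a fingerprint of the user's public key before the move. Everything here is hypothetical (`MigrationNotice`, `accept_notice`, the message layout). Signature checking is passed in as a function because Python's standard library has no public-key signature scheme; a real server would plug in something like an Ed25519 verifier.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict

# verify(public_key, message, signature) -> bool, supplied by the caller.
Verifier = Callable[[bytes, bytes, bytes], bool]


@dataclass(frozen=True)
class MigrationNotice:
    public_key: bytes
    old_actor: str
    new_actor: str
    signature: bytes

    def signed_bytes(self) -> bytes:
        # The bytes the user's private key is assumed to have signed.
        return f"{self.old_actor}\n{self.new_actor}".encode("utf-8")


def fingerprint(public_key: bytes) -> str:
    return hashlib.sha256(public_key).hexdigest()


def accept_notice(
    notice: MigrationNotice,
    known_keys: Dict[str, str],
    redirects: Dict[str, str],
    verify: Verifier,
) -> bool:
    """Add a redirect only if the notice is signed by the key this server
    already has on record for the old actor."""
    if known_keys.get(notice.old_actor) != fingerprint(notice.public_key):
        return False  # unknown actor, or not the key seen before the move
    if not verify(notice.public_key, notice.signed_bytes(), notice.signature):
        return False  # signature does not check out
    redirects[notice.old_actor] = notice.new_actor
    return True
```

The first check is the weak spot: a server that never saw the actor's key before the move has nothing to compare against, so it has to reject the notice or trust it blindly.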

@kartonrad

kartonrad commented Feb 25, 2024

My guess would be:

Solution 1:

  • our solution will end up relying on trust in the moved-from server, it needs to be running and uncompromised
  • we use the existing migration
  • with a boost of all old posts upon move
  • we notify all followers of the redirection
  • we redirect the old to the new inbox using 301, and servers that know of the move send to the new inbox automatically
  • we display the old, boosted posts on the profile of the new server (preferably in a clear way that shows they were migrated)
  • we make sure the new server takes responsibility for the boosted data, securely stores attachments, lets users export the old posts alongside the new ones, etc.
  • we make sure the new server handles notifications for the redirected activities

Solution 2:

  • we do the same as above, however
  • the initial move can also be done through a second mechanism, a cryptographic proof.
  • the key is managed in a way that empowers users. users can choose to generate a keypair locally and sign a 'lease of authority' posted to the actor's JSON-LD
  • this lease allows the server's key to sign all the same actions as the user key can
  • when the old server goes rogue, the affected user can locally sign and distribute 'revocation of authority' and 'notice of migration' messages to their followers' servers.
  • this approach would probably need a skilled working group to work out a specification that is fit for this purpose
  • having this discussion with randoms in github issues is probably unproductive xD
  • this would be the unholy mother of breaking changes

@kartonrad

Solution 2 also doesn't work as a proof to remote instances that are newer than your move -

They have never seen your uncompromised actor JSON-LD
So it would be a pure coping mechanism
True nomadic identity seems impossible in fedi?
Unless we want to have public key handles or central key repositories (both cringe)

@golfinq

golfinq commented Feb 25, 2024

Your solution 1 seems like the least complicated system of migration, if I am understanding correctly - the 301s create authority over the post, with the notification of the change federating in a similar way to edit notifications. That way any further actions taken on the post originating from the new instance will be trusted.

@omentic

omentic commented Feb 25, 2024

@kartonrad I think you're missing what I'm saying. I am saying that

You can't just assign a GUID, because in fedi, there is no single Authority. Nothing ensures a GUID's uniqueness, nor can you resolve it uniquely to an origin server or a public key.

this is not true, and I am also saying that

So, sure, using GUIDs would mean that all TRUSTED servers wouldn't have to worry about id collisions between posts. However, any malicious server can claim: "I generated this GUID for this post. It has the content: 'i am a moron, signed, Donald Duck'"

this is not relevant, and that

A "full move" of a post is essentially a transfer of authority, and given that ids are tied to the hostname, IDS MUST CHANGE in order for the move to take place. The only way around this would be cryptography.

this is false. (the transfer of authority part is true. the ids needing to change and the only way around this being cryptography is false.)

@alper

alper commented Feb 25, 2024

I keep getting notifications here. Stop yapping and write the code that will make this happen.

Update: This is not to the core team (who I hope will some day get to it), this is to everybody who's commenting here. This is open source and it thrives off contributions. The amount of energy wasted here is incredible.

@silverpill

True nomadic identity seems impossible in fedi?

@kartonrad Nomadic identity already works in Hubzilla and Streams, both are Fediverse projects. They use different protocols alongside ActivityPub, but a similar solution can be implemented in pure ActivityPub too.

That solution is described in proposal FEP-ef61: https://codeberg.org/fediverse/fep/src/branch/main/fep/ef61/fep-ef61.md

It introduces a new kind of object ID where authority is indicated by a cryptographic key instead of a domain name.

@kartonrad

@silverpill
Yup. They moved away from HTTP URL IDs.
Which...
Is kind of not spec compliant.

But... is there really any project that is fully ActivityPub compliant?

Mastodon could, of course, move towards this system, theoretically.

But, as stated in the document you linked:

Nomadic accounts are currently not supported by ActivityPub but are available via the Nomad protocol.

I guess i did say "fedi" not "AP"

@ldexterldesign

ldexterldesign commented Feb 27, 2024

Unsubscribed

IMO this should be a discussion

Good luck!
