Support Post Migration #12423

Open
SilverWolf32 opened this issue Nov 18, 2019 · 269 comments

Comments

@SilverWolf32

#177 – Support Account Migration – was closed after implementing follower migration, but this is only one small part of a true migration. To really be able to change instances, you need to be able to take your posts with you. There was some good discussion on this over there; I'm opening a new issue to make it clear that this is a separate concern from that issue, which seemed to evolve into only being about followers.

Personally, I couldn't care less about migrating my followers/following list. I can refollow people and have them refollow me. It'll all work out. My posts, however, are currently impossible to restore.

@MirceaKitsune

Once more, I fully support this and have been eagerly awaiting this much-requested feature. I would love to see the ability to import the data archives (favorites, boosts, and toots) exported from the https://instance.social/settings/export section, which is currently a one-way system. Unfortunately, there doesn't seem to be a plan for this yet, but I'm staying hopeful that will change eventually. I believe Eugen mentioned performance concerns in the previous discussion.

@trwnh
Member

trwnh commented Nov 19, 2019

Reiterating points from previous discussion:

@SilverWolf32
Author

Thanks @trwnh!

status import cannot be backfilled for performance reasons

What's backfilling?

@trwnh
Member

trwnh commented Nov 20, 2019

backfilling is #34 -- fetching old statuses from a profile that were never delivered to the instance. you can't expect to deliver 100,000 activities every time you migrate, because that's wildly inefficient.

@MirceaKitsune

MirceaKitsune commented Nov 20, 2019

Regarding performance and backfilling: This is why I believe that post migration is possible, but those who choose to use it will need to accept a long waiting period. There will probably have to be three types of limitations to make it viable:

  1. An account queue. For example, the instance may only allow up to 5 accounts to be migrated simultaneously. The archive import menu should show a message saying something like "you have to wait until 10 more accounts finish migrating before we start the import process".
  2. Every post and action will be migrated with a given delay. This could be one post per 10 seconds. The import menu should then show the message "10 out of 1000 posts migrated" to let you know about the status of the procedure.
  3. Only a certain number of the latest activities will be imported. If an archive contains 1.000.000 actions, the interface informs the user that only the latest 100.000 will be considered.

Obviously, if you're importing something like 10.000 boosts + faves + toots, the process may take days if not weeks. It's a price the user will have to pay. But in this format it sounds like it should be performance viable at least.
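
A minimal sketch of the three limits above, in case it helps make the idea concrete. Everything here is hypothetical (`ImportScheduler`, `ImportJob`, and the default numbers are assumptions for illustration, not anything Mastodon ships). The per-post delay is modelled as a `tick()` that a real worker would call once every `seconds_per_post`:

```python
from collections import deque
from dataclasses import dataclass


@dataclass(eq=False)  # compare jobs by identity, not by field values
class ImportJob:
    account: str
    posts: list  # oldest first
    imported: int = 0

    def progress(self) -> str:
        return f"{self.imported} out of {len(self.posts)} posts migrated"


class ImportScheduler:
    """Hypothetical throttled importer; not part of Mastodon."""

    def __init__(self, max_active=5, max_posts=100_000, seconds_per_post=10.0):
        self.max_active = max_active
        self.max_posts = max_posts
        self.seconds_per_post = seconds_per_post
        self.active = []
        self.waiting = deque()

    def submit(self, account, posts):
        # Limit 3: only the latest max_posts activities are considered.
        job = ImportJob(account, posts[-self.max_posts:])
        # Limit 1: only max_active accounts migrate at the same time.
        if len(self.active) < self.max_active:
            self.active.append(job)
        else:
            self.waiting.append(job)
        return job

    def accounts_ahead(self, job):
        """How many accounts must finish before this job starts (0 if running)."""
        if job in self.active:
            return 0
        return list(self.waiting).index(job) + 1

    def estimated_seconds(self, job):
        # Limit 2: one post per seconds_per_post for each active job.
        return (len(job.posts) - job.imported) * self.seconds_per_post

    def tick(self):
        """Import one post per active job; promote waiting jobs as slots free up."""
        for job in list(self.active):
            if job.imported < len(job.posts):
                job.imported += 1
            if job.imported == len(job.posts):
                self.active.remove(job)
                if self.waiting:
                    self.active.append(self.waiting.popleft())
```

The import menu could then show `job.progress()` for a running import and `accounts_ahead(job)` for a queued one. An admin of a small instance could lower `seconds_per_post`, and a large instance could raise it.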

@SilverWolf32
Author

That makes sense! Personally I'm perfectly fine with them taking a long while to move over, so long as they move over eventually.

I'm not quite sure what the limit should be, though – one post per 10 seconds seems awfully slow to me, unless you have like 100 accounts being migrated at once. Maybe set it relatively fast by default (like 1/s or maybe even a little higher), since most instances are small and won't need to deal with a lot of migrations, then larger instances can set it slower?

@SilverWolf32
Author

SilverWolf32 commented Nov 20, 2019

Although I would really, really prefer if everything gets migrated, even if it takes weeks or even months. I don't want to lose my old posts – that's basically an archive of what I'm like, so new people can get to know me a little better before they decide whether to follow me. It's also history, and while some may not want to keep that sort of thing, others (like me) would.

(Even if you technically have everything in the backup, that's incredibly cumbersome to look at, given I'm not aware of any software that can actually import it.)

@trwnh
Member

trwnh commented Nov 20, 2019

stuff can get imported, but it is absolutely a bad idea to redeliver it. i am firmly of the opinion that a Move with post import should maybe do a rewrite of existing statuses in the database, but nothing more. it is unwise to even consider backfilling until the actual decoupling and importing gets implemented.

more to the point in general, though: it is fundamentally a bad idea to assume that every instance should have a copy of every single post at all times. we don't need to duplicate everything 6000x or more. what is fundamentally happening with a "migration" is that the authority is being shifted; the location is being shifted, that's all. it's a Bad Idea to treat them as new statuses. hopefully i've explained why

@SilverWolf32
Author

Hm...well, would they be new statuses? I figured they would simply be reassigned, but then maybe redelivered as the originals, not duplicated and "reposted".

Or maybe I'm completely misunderstanding how federation works. I know very little about the backend.

every instance should have a copy of every single post at all times. we don't need to duplicate everything 6000x or more

Is this actually being discussed? I'd say for a migration the situation is a bit different; only the destination server has to receive a copy of every single post, not everybody.

@trwnh
Member

trwnh commented Nov 20, 2019

only the destination server has to receive a copy of every single post

yeah, and it should be limited to this.

looking at prior art from zot, you can do an online migration (fetch old messages from your outbox and inbox to the new server) or an offline migration (using an exported account archive that contains at minimum your keys, your posts, and your address book)

@SilverWolf32
Author

That sounds lovely! I've been a little worried in the back of my mind about how to handle a server that just suddenly dies.

@lmachucab

Is there at least a first-party tool that users (not servers) can use to process their downloaded backup and generate something usable? Something like an HTML file (plus media?) that can be uploaded to a remote server. That would at least temporarily solve the problem of making the lost content accessible again.

@MirceaKitsune

@lmachucab I made a NodeJS script that I used to port my faves and boosts to other accounts. Unfortunately it no longer worked the last time I tried it, and it was a crude solution that put quite a bit of stress on the target server and even triggered an instance's flood protections once. If you think you might still have some use for it, it's available on my GitHub:

https://github.com/MirceaKitsune/mastodon_migrate

@masstransithonchkrow

Hi there! Mastodon.cloud literally has a month to live. We're being shut down because the domain is being forced to bow to insane regulations, and the admin isn't having any of it.
https://mastodon.cloud/@TheAdmin/104227496654670309

We had been the target of immense DDoS attacks for the past six months.
If you need help or simply wish to deliberate, please let me know.

@msikma

msikma commented May 27, 2020

I'd like to second that Mastodon.cloud disappearing shows this is a crucial feature. Mastodon is a great piece of software, and designing it to be decentralized was a great idea. Making it possible for anyone to set up an instance, yet having instances be able to communicate with one another, makes the network resilient and eliminates single points of failure.

But without the ability to jump from one instance to another at will, this decentralization has a large drawback. If an instance decides to call it quits, you're just going to lose your data. This also means that as a user, you can't be sure if the instance you picked is going to live a long or a short life. Given how many instances there are, it's totally reasonable to expect that some of them are going to disappear.

Aside from users choosing to migrate somewhere else, there should also be a migration plan for instance admins. If an instance shuts down, it shouldn't just be that active users who care about their account get to keep their data, because that means even if user migration is possible there's still going to be a big loss of online culture and data. I think this should probably be a separate issue and come with a host of problems by itself, but without it I think we'll see Mastodon content regularly disappearing forever as instances come and go.

@MirceaKitsune

Yeah even I didn't expect this sad news. mastodon.cloud banned me randomly and without explanation after a few months of using it... at the time I was upset, now I'm glad I escaped this event. It's sad to hear even Japan is becoming a dangerous place for internet freedom and has a government failing to understand technology and the flow of information.

I know mastodon.cloud was one of the largest instances out there. If it can go down, the fediverse as a whole is at serious threat of significant data loss. That's why I would further implore that we please support FULL account migration now... even if as an option disabled by default which the instance admin can enable if they so wish, in case performance is such a big concern.

An equally good, or even better, idea would be a node-based approach to storage, similar in concept to IPFS / DAT: instead of an instance being stored on and served by one physical server, it could be stored in a decentralized cloud where anyone may run a node to serve a copy of the data. Sadly this would require such a rewrite that we'd be talking about a whole new project rather than the existing Mastodon.

@masstransithonchkrow

A reminder that we're supposed to "Own Your Own Data With Mastodon!" If we don't find a way to do that with toots, we are bound to be sued for false advertising. I smell a frivolous lawsuit that's bound to cripple the Fediverse and we need to get ahead of it.

While I may not ideologically agree with Gab, which accounts for 25% of the Fediverse's mass, they have every right to be here as we do, and we need to prepare for pointed media smear campaigns that could potentially harm our reputation.

We need to emphasize our content controls, and remind users that they can ban who they want from their own console, and that moderators do not have to do it for them. We have that advantage over Twitter and we need to advertise the heck out of it.

@Valenoern

I smell a frivolous lawsuit

Mastodon's own code is under the AGPL, right?
as I remember, it and the GPL have some fairly thorough sections saying you (gab) can't turn around and sue a project you built on (mastodon) because it didn't fit your use case, and that if they can't cancel liability/warranty, courts are asked to "arrange something as close to that as possible".

That's the situation you're talking about, right?

I'll agree the AGPL would have no effect on media outlets saying whatever charged things about mastodon they want to, though.

@masstransithonchkrow

That is an interesting point. I think they forked Mastodon, though (as of 2.8.5) and created Gab out of it. Ironically, it seems they were able to preserve their "toots" during the migration.

My comment revolved around the idea that you can "Own Your Own Data With Mastodon". If this is the case, we must ensure that people have the capability to export their toots otherwise that slogan could be misconstrued as false advertising. We must have a provision that allows users to migrate to another server if the one they're on is being shut down through no fault of its own (regulatory crackdowns in the case of Mastodon.cloud).

What happened in Minneapolis is a distraction. The attacks against the Fediverse and all other entities like it will intensify. We must fortify our infrastructure and address any outstanding issues so that we're ready for any exodus.

@masstransithonchkrow

Even a static representation of said toots is fine with me, similar to Twitter's archive, which you can navigate offline. They don't need to be integrated into the post chronology. Perhaps we could create a designation for such toots, and also prohibit their editing or removal to ensure a smooth migration.

@ccoenen

ccoenen commented May 30, 2020

there's an issue for that #9461 (export including some bare html-version of your toots)

With regard to the European GDPR, there's the rule that you must have access to a machine-readable version of your data. This is satisfied already. While this satisfies the letter of this particular law, I do wish it was easier for everyone to use that data (hence the ticket above). I also think re-importing would be nice.

Yet, I do not believe there's any grounds to your "false advertising" point. Sorry.

(edit: I linked to the wrong issue before, this has been corrected.)

@masstransithonchkrow

Thank you for clarifying that. I'll follow that issue too. ^_^

@CoWinkKeyDinkInc

I strongly support this. I have my own independent instance that I don't want to maintain anymore so I moved over to another instance. DigitalOcean doesn't let you download complete images as a backup so I can't wait to upgrade to a later version that supports it and transfer the old content over.

@lmachucab

This is a feature that it's impressive that Mastodon does not have in 2020, honestly. Without the ability to export and import post history not only can accounts be lost but also identities - as even if an account is migrated to another server, at the moment this is a partial "in name only" process. Any content that the old account had posted is still bound to the lifetime of the old service (and its FQDN).

After having tested some scripts for extracting the information from the exported data dumps, I think there are enough of those scripts and projects around that it's perfectly feasible to grab one (1), integrate it into Mastodon, and offer it as a utility that, e.g., generates full sessions / scripts for clients like toot to execute, resulting in a toot post or toot upload of part or all of the archive in HTML or a similar readable format.

@mattbk

mattbk commented Nov 23, 2020

@lmachucab do the existing scripts deal with not treating the imported posts as new posts that would be picked up for federation? From what I'm reading above, that's what should be avoided to reduce load on servers.

I would be happy if my posts were only viewable from my account page/feed rather than immediately (or ever) federated (i.e., they would only be federated if someone was looking at my account). All I want to be able to do is change my username or move to another instance if needed.

@masstransithonchkrow

masstransithonchkrow commented Nov 29, 2020

I thought I'd bring this up:

One of the SPC admins revived bofa.lol, a previously discontinued instance, only for it to regurgitate its old content over TWKN, some of which was excessively vile. bofa.lol aside, I'd like to focus on the mechanism that caused those toots to refederate and see if we can harness that mechanism to archive them for other instances.

As of this writing, bofa.lol was taken down less than a day after it was briefly revived.

*Edited for context and some spelling mistakes at 21:50-0500

@Beyarz

Beyarz commented Jul 3, 2021

This is a feature that it's impressive that Mastodon does not have in 2020, honestly.

2021*

@formerlytomato

formerlytomato commented Jul 25, 2021

This is going to be incredibly important to ensuring Fediverse platforms don't centralize around the most popular instances. If I have the ability to completely move my account from one instance to another with minor hassle, chances are I'm going to be more willing to take risks on smaller instances. Some of my first accounts on Masto/Peertube were on small instances, but they ended up closing down on me unexpectedly.

@z3dx95g7

@Gargron, do you consider this to be a feature worth pursuing? if so, what is needed to make it happen? it's been open for nearly 2 years without any comments from mastodon's lead, so i thought i would ask.

@trwnh
Member

trwnh commented Feb 21, 2024

Is this an accurate summary? Did I miss anything? I'm unable to find a tracking issue for "use GUIDs", so I suppose that's just this one.

@omentic yeah mostly, the only thing i'd point out is that using GUIDs isn't enough because anyone can claim any id. so you need to bind it to some authority -- either an HTTP origin (DNS hostname) or some public key cryptography alternative (which is its own big breaking change, but would ultimately grant more portability). "identity" in general has to be rooted in one of the two. you'd basically need an indirection layer along the lines of domain.example/@user or domain.example/users/147154782934 being aliases for some other independent identifier. of course the challenge is in getting everyone to agree on how to assign and resolve those identifiers. FWIW, bluesky and the AT protocol decided their answer was going to be "just use a single (placeholder/PLC) nameserver"... they could do that because they're the only stakeholder.

something like #10745 would make it easier to get there but would not solve the issue entirely on its own. it just allows the username to change. you'd still need a way to allow the authority to change (or otherwise defer to some other authority).

@omentic

omentic commented Feb 21, 2024

using GUIDs isn't enough because anyone can claim any id

@trwnh could you elaborate on this? What's wrong with having a Really Big GUID namespace and assigning ids randomly?

If I understand things, then right now definitionally that authority is username@example.social. Would there be any issues with just... using that to its full extent, and so having moved accounts create GUIDs relative to their original username/domain combo? I think we can rely on old username/domain combos not being reused, because IIRC that breaks federation in other ways.

@trwnh
Member

trwnh commented Feb 21, 2024

the problem is bad actors that intentionally use a conflicting id. it doesn't matter how big the namespace is. you need some authority, like a domain or otherwise signing the guid. right now the authority is https://example.com and under a key-based system it would be the keypair.

the best we can do right now is you could mint identifiers against some stable authority, like a PURL service. basically, if changing something is a problem, you have two possible solutions:

@silverpill

so you need to bind it to some authority -- either an HTTP origin (DNS hostname) or some public key cryptography alternative (which is its own big breaking change, but would ultimately grant more portability)

@trwnh In FEP-ef61 the authority is a DID (so it's a breaking change), but I think the interoperability with existing software can be preserved if implementations will generate IDs as HTTPS URLs containing a DID URL instead of just DID URLs (this idea is discussed in the "Compatibility" section of the proposal). Similar to how IPFS objects can be referenced either by ipfs: URIs or by HTTPS URLs via a gateway.

@omentic

omentic commented Feb 21, 2024

@trwnh I still don't see the problem. Where do bad actors come into play? You have to trust the migrated-from instance to correctly point at your new profile, and subsequently point other instances to update their internal id mapping. Another instance's GUID is only relied upon by the migrated-to instance when importing posts, so that they have the same GUID (actually, going forth they wouldn't even need to use the old authority originalusername@example.social, if the point of the authority is for uniqueness: though i still don't understand why you need an authority to begin with). Where is the possibility for malicious behavior, given you necessarily trust the migrated-from instance and presumably trust the migrated-to instance?

In general I think that fully decoupling identity from instances requires a significant amount of additional complexity and could be done later anyway, so if it isn't strictly necessary for post migration I'd like to avoid thinking about it.

@omentic

omentic commented Feb 21, 2024

(you might have to kind of hand-hold me through an explanation of why bad actors are a problem here - i only started reading through these issues last night, and have no experience running an instance or dealing with federation)

@SteveDinn

Once a post's UID becomes known, any bad-acting instance admin can claim that post for their own just by adding a post with that ID to their server. At best, there is a duplicate ID now floating out there in the fediverse. At worst, the forged post is now considered to be the real post.

I believe the commenters above are correct; there has to be some tether to an authority to validate a post identifier.

@trwnh
Member

trwnh commented Feb 21, 2024

OK, so here's an example:

  • I make a post f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b
  • Someone else makes a post claiming to be post f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b
  • Given the identifier f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b, a resolver has to make some decision on which one to trust, or which one is more authoritative.
    • Option: anchor that authority to a domain, for example example.com/post/f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b
    • Option: anchor that authority to a keypair, in other words "this is post f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b as signed for by pubkey BCC9972F818152BEDAC75760250B07ED4BCC9F7D99FE8FE981194940FC39761E"

The problem as it relates to migration of posts is that example.com will change, and domain2.example/post/f2c1d6e4-31a2-4a3c-834c-1c09a8f1673b is not guaranteed to be the same. In fact, even the /post/id URI structure is not guaranteed. If it were guaranteed, then we could have some level of consistency in being able to resolve posts at a new host by "simply" substituting the hostnames with a find-and-replace.
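
To make the domain-anchored option concrete, here is a minimal sketch using only the Python standard library. The function names (`authority_of`, `accept_claim`, `rehost`) are made up for illustration, and the `/post/<id>` layout is the hypothetical one from the example above, not a structure any server guarantees:

```python
from urllib.parse import urlsplit, urlunsplit


def authority_of(post_id):
    """Return the hostname a post ID is anchored to, or None for a bare GUID."""
    parts = urlsplit(post_id)
    return parts.hostname if parts.scheme in ("http", "https") else None


def accept_claim(post_id, serving_host):
    """Trust a copy of a post only if it is served by the host named in its ID."""
    authority = authority_of(post_id)
    return authority is not None and authority == serving_host


def rehost(post_id, new_host):
    """Naive hostname find-and-replace.

    Only meaningful if the old and new servers share the same path layout,
    which is exactly the guarantee that does not exist today.
    """
    parts = urlsplit(post_id)
    return urlunsplit((parts.scheme, new_host, parts.path, parts.query, parts.fragment))
```

A bare GUID has no authority at all, so `accept_claim` rejects it no matter who serves it. A URL-shaped ID is accepted only from the host it names, which is why moving the post to another host forces either a new ID or some agreed way to redirect the old one.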

@omentic

omentic commented Feb 21, 2024

Ah, I see now.

I don't think this needs to be a problem. Posts currently have an internal UUID: there are duplicate ids between instances floating about, and it doesn't matter because remote posts are not resolved by relying entirely on the UUID. What I thought was being proposed with the GUIDs (and what I think should be proposed) is that the "global" aspect of the GUID only becomes relevant upon post import. Things can function as they currently function otherwise: despite ostensibly being a GUID were all actors trusted, the GUID is treated exactly how internal post identifiers are currently treated - as a UUID - with the sole exception being during post import, where you trust both parties.

So this would mean that upon a move to a new username/domain the old /post/id URI structure is guaranteed and we have some level of consistency in being able to resolve posts at a new host, through a one-time 301 Moved Permanently update that'll update what Mastodon's internal user ids resolve to.
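
A rough sketch of that idea, under the assumption that a server keys posts by the pair (host, id) rather than by id alone. `PostIndex` and its methods are hypothetical names for illustration, not Mastodon's actual schema:

```python
class PostIndex:
    """Hypothetical local index keyed by (host, guid).

    Equal GUIDs coming from different hosts never collide, because the
    GUID is never trusted on its own.
    """

    def __init__(self):
        self._posts = {}

    def store(self, host, guid, content):
        self._posts[(host, guid)] = content

    def get(self, host, guid):
        return self._posts.get((host, guid))

    def apply_move(self, old_host, new_host, guids):
        """Re-key posts after a move that both hosts have confirmed.

        This is the only moment the GUID is treated as "global": the same
        id now resolves at the new host instead of the old one.
        """
        moved = 0
        for guid in guids:
            content = self._posts.pop((old_host, guid), None)
            if content is not None:
                self._posts[(new_host, guid)] = content
                moved += 1
        return moved
```

A post with the same GUID on an unrelated host is left untouched by `apply_move`, so a forged duplicate elsewhere does not gain anything from the move.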

@omentic

omentic commented Feb 25, 2024

(@trwnh if you've got the time, i'm curious if my thinking here is accurate? if so i might move ahead with a GUID proposal)

@kartonrad

@omentic
Hm.
The task you are trying to solve is the following.

How can Instance A, the moved-from instance, communicate the migration of a user to all other relevant instances in the network?

At the end of the day, all relevant instances need to change the information of the user's origin in their databases.

You can't just assign a GUID, because in fedi, there is no single Authority. Nothing ensures a GUID's uniqueness, nor can you resolve it uniquely to an origin server or a public key.

So, sure, using GUIDs would mean that all TRUSTED servers wouldn't have to worry about id collisions between posts.

However, any malicious server can claim: "I generated this GUID for this post. It has the content: 'i am a moron, signed, Donald Duck'"

Even though Mr. Duck has never posted something like that.

The mechanism we use to keep ids globally unique right now is the Domain Name System.
By tying a post to an id, like so: "253737@mastodon.social" (which isn't how it works right now, ids currently are full urls)

The AUTHORITY who can delete a post, receives replies for the post, can edit the post, etc, is baked right in.

A "full move" of a post is essentially a transfer of authority, and given that ids are tied to the hostname, IDS MUST CHANGE in order for the move to take place.

The only way around this would be cryptography.
Instead of the hostname of the id matching the sender being the proof of authority, a cryptographic signature (e.g. ownership of a private key) would be used as a proof of authority.

This is a massive breaking change with its own problems - regarding key management and key ownership- as trwnh said.

But with cryptography, you could theoretically spin up a new instance, dig out your private key, and sign a thousand messages saying 'i'm here now'

And then all existing fedi servers would receive the proof, and add a redirection entry into their database, or change the ids of existing objects, though that would be a greater risk to data integrity.

@kartonrad

kartonrad commented Feb 25, 2024

My guess would be:

Solution 1:

  • our solution will end up relying on trust in the moved-from server, it needs to be running and uncompromised
  • we use the existing migration
  • with a boost of all old posts upon move
  • we notify all followers of the redirection
  • we redirect the old to the new inbox using 301, and servers that know of the move send to the new inbox automatically
  • we display the old, boosted posts on the profile of the new server (preferably in a clear way that shows they were migrated)
  • we make sure the new server takes responsibility for the boosted data, securely stores attachments, lets users export the old posts alongside the new ones, etc.
  • we make sure the new server handles notifications for the redirected activities
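
The redirection part of Solution 1 could look roughly like this on a remote server. This is only a sketch: `RedirectTable` is a hypothetical name, the URLs are placeholders, and the rule "only the host named in the old id may announce the move" is my reading of "relying on trust in the moved-from server":

```python
from urllib.parse import urlsplit


class RedirectTable:
    """Hypothetical store of old-id -> new-id redirects kept by a remote server."""

    def __init__(self):
        self._moved = {}

    def record_move(self, old_id, new_id, announced_by):
        """Accept a redirect only when the moved-from host itself announces it."""
        if urlsplit(old_id).hostname != announced_by:
            return False
        self._moved[old_id] = new_id
        return True

    def resolve(self, object_id, max_hops=10):
        """Follow recorded redirects, stopping on loops or after max_hops."""
        seen = set()
        while (
            object_id in self._moved
            and object_id not in seen
            and len(seen) < max_hops
        ):
            seen.add(object_id)
            object_id = self._moved[object_id]
        return object_id
```

A server that knows of the move would call `resolve()` before delivering a reply or a notification, so activities aimed at the old post reach the new inbox without another round trip to the old host.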

Solution 2:

  • we do the same as above, however
  • the initial move can also be done through a second mechanism, a cryptographic proof.
  • the key is managed in a way that empowers users. users can choose to generate a keypair locally and sign a 'lease of authority' posted to the actor's jsonld
  • this lease allows the server's key to sign all the same actions as the user key can
  • when the old server goes rogue, the affected user can locally sign and distribute 'revocation of authority' and 'notice of migration' messages to their followers' servers.
  • this approach would probably need a skilled working group to work out a specification that is fit for this purpose
  • having this discussion with randoms in github issues is probably unproductive xD
  • this would be the unholy mother of breaking changes

@kartonrad

Solution 2 also doesn't work as a proof to remote instances that are newer than your move -

They have never seen your uncompromised actor jsonld
So it would be a pure coping mechanism
True nomadic identity seems impossible in fedi?
Unless we want to have public key handles or central key repositories (both cringe)

@golfinq

golfinq commented Feb 25, 2024

Your solution 1 seems like the least complicated system of migration, if I am understanding correctly - the 301s create authority over the post, with the notification of the change federating in a similar way to edit notifications. That way any further actions taken on the post originating from the new instance will be trusted.

@omentic

omentic commented Feb 25, 2024

@kartonrad I think you're missing what I'm saying. I am saying that

You can't just assign a GUID, because in fedi, there is no single Authority. Nothing ensures a GUID's uniqueness, nor can you resolve it uniquely to an origin server or a public key.

this is not true, and I am also saying that

So, sure, using GUIDs would mean that all TRUSTED servers wouldn't have to worry about id collisions between posts. However, any malicious server can claim: "I generated this GUID for this post. It has the content: 'i am a moron, signed, Donald Duck'"

this is not relevant, and that

A "full move" of a post is essentially a transfer of authority, and given that ids are tied to the hostname, IDS MUST CHANGE in order for the move to take place. The only way around this would be cryptography.

this is false. (the transfer of authority part is true. the ids needing to change and the only way around this being cryptography is false.)

@alper

alper commented Feb 25, 2024

I keep getting notifications here. Stop yapping and write the code that will make this happen.

Update: This is not to the core team (who I hope will some day get to it), this is to everybody who's commenting here. This is open source and it thrives off contributions. The amount of energy wasted here is incredible.

@silverpill

True nomadic identity seems impossible in fedi?

@kartonrad Nomadic identity already works in Hubzilla and Streams, both are Fediverse projects. They use different protocols alongside ActivityPub, but a similar solution can be implemented in pure ActivityPub too.

That solution is described in proposal FEP-ef61: https://codeberg.org/fediverse/fep/src/branch/main/fep/ef61/fep-ef61.md

It introduces a new kind of object ID where authority is indicated by a cryptographic key instead of a domain name.

@kartonrad

@silverpill
Jup. They moved away from http url ids.
Which...
Is kind of not spec compliant.

But... is there really any project that is fully ActivityPub compliant?

Mastodon could, of course, move towards this system, theoretically.

But, as stated in the document you linked:

Nomadic accounts are currently not supported by ActivityPub but are available via the Nomad protocol.

I guess i did say "fedi" not "AP"

@ldexterldesign

ldexterldesign commented Feb 27, 2024

Unsubscribed

IMO this should be a discussion

Good luck!

@MastodonContentMover

MastodonContentMover commented Jan 11, 2025

Unsubscribed

IMO this should be a discussion

Good luck!

The suggestion / feature request is valid.

The problem is that it is repeatedly derailed by posts making in-depth and obscure technical arguments implying that as long as there is no perfect solution, no solution should be implemented. Not only does this blind everyday users visiting this thread with technical language, it has also left them without a way to move their content for five years and counting, on a platform that claims to offer mobility between instances.

Meanwhile, other Fediverse platforms offer the functionality, and third-party tools have existed for long enough that even they are beginning to age.

The technical discussion of how a perfect solution could be implemented could well be better as a discussion, but the request for this functionality to exist should remain here, and, in the absence of consensus on a perfect solution, a practical workaround so users are not held to ransom with their content should be put in place at the earliest opportunity.

Without this, the claim that Mastodon offers users genuine mobility between instances is significantly flawed.

How many years is it going to take? The perfect is the enemy of the good.

@shodanx2

The ability of users to migrate without friction, loss of connections and so on, would undermine the power of the Mastodon owner class. The technical hurdles are excuses for not granting users this level of autonomy to simply escape the influence of their instance owner and their delegates (moderators).

Control of the means of communication is the name of the game and users being able to simply pack up and leave with their relationships and history intact really puts a dent in that.

I don't think we're going to see progress on this front until the way moderators shape public discourse with no accountability is recognized and their power is checked.

I have very little confidence of that happening anytime soon; this issue has had no discussion in a year. There is no willingness to acknowledge the problem by those who have made it their mission to build Mastodon, let alone resolve it.

Every avenue of this problem has already been discussed and I don't think there is much left to say; it's not a "how" question. I think Twitter users fleeing came here and saw they would become prisoners here, and if they were going to be prisoners they chose to go back to Rasputin instead, the devil you know.

At this point it's been so long that this discussion is more for the design of whatever will replace Mastodon than for Mastodon itself.

Hopefully the next one will make frictionless user migration a day one priority.

@MastodonContentMover

MastodonContentMover commented Jan 12, 2025

Hearing Mastodon repeatedly touted as a platform that offers users mobility, and that this is one of its major advantages over legacy centralized social media and a sign of great progress, while this glaring omission in what's needed to deliver actual mobility remains unaddressed is curious, for sure, but I refuse to be defeatist about it.

Here is what would work:

  • On migration, as part of the current process which already has a mechanism to confirm the user does in fact own the accounts on both the source (old) and target (new) instance, users are given a choice to migrate their old content (they can choose not to)
  • They can perhaps make some choices about which content is migrated — all of it, or just bookmarked posts, or just posts with more than X retweets or favourites, or just posts with media.... these options are really nice-to-haves, so could be added later
  • Once the migration is kicked off, the target instance pulls a shallow list of all the post ids for the account on the source instance — shallow so this doesn't try to pull a huge chunk of data all at the same time, with accompanying challenges of where to store it, how to cache it during downtime etc.
  • The target instance now runs, as a very low priority, a batch job that periodically pulls one of the posts from the source instance, checks to see if it matches the criteria selected as options (is it bookmarked, does it have media etc.).
  • This job runs in chronological order, so more recent posts can be linked to earlier posts if they were originally threaded.
  • If the post on the source instance matches the criteria, a new post is created on the target instance (all post content and media are pulled from the original, source instance and included in the new post on the target instance)
  • There are a couple of different ways dates could be handled, but simplest (to reduce necessary interface changes for various Mastodon clients) would be for the date of the new post on the target instance to be set to that of the post on the source instance. The date of the import could be put in a separate (new) property on the post object, which could then act as a flag so the post could be identified as imported in the UI, and/or it could be used to create a new record in the edit history for the post, so that clients that show edit history but haven't been updated to visually identify imported posts would at least have some indication that something had changed since the post's original creation. I think the details around exactly how this is handled are flexible, but however it works it's important to have 1) the post dated somehow to its original date so it shows up correctly in the timeline on the profile, 2) some indication visible at an interface level that it is a migrated/imported post.
  • If the source post was a reply to an earlier post by the same account on the source instance, the new post is linked to the newly imported copy on the target instance of that earlier post (so that for same-account posts, at least, threads are retained)
  • If the length of the source post exceeds the length limit on the target instance, the post is split into multiple, threaded new posts that fit the target limit (note here that the code to preserve same-account threads must link subsequent posts to the last of any of these split-up long posts — this is possible).
  • If the number of media attachments in the source post exceeds the limit on the target instance, again add threaded new posts to hold the extra media attachments (this is possible). If the size of an attachment (a video, for example) exceeds the size limit on the target instance, either re-encode the media to fit in the limit on the target instance or flag the failure to the user so they can decide how to handle this and add the media manually to a new post created on the target instance without the problem media (the exact detail of how to handle this can be worked out)
  • Post properties on the new post are set to match those of the original source post
  • User tags in the new posts do not prompt a notification for the tagged user
  • The new post is not sent to timelines, irrespective of visibility level, although hashtags can and probably should be visible in searches of past posts
  • A nice to have would be storing the original post url from the source instance in the new post on the target instance, and linking this somewhere in the interface, so that (for as long as the source instance still exists) users can visit the original post on the original source instance, and see interactions and edit history etc.
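The core of the steps above can be sketched in a few lines. This is only an illustration in Python, not Mastodon code: `SourcePost`, `ImportedPost`, `import_posts`, the `fetch` callback and every field name are hypothetical, and a real job would pull one post per run at low priority rather than loop over everything at once.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


@dataclass
class SourcePost:
    """A post as fetched from the source (old) instance. Hypothetical shape."""
    id: str
    created_at: datetime
    text: str
    in_reply_to_id: Optional[str] = None
    has_media: bool = False
    bookmarked: bool = False


@dataclass
class ImportedPost:
    """The copy created on the target (new) instance. Hypothetical shape."""
    id: str
    created_at: datetime           # original date, so the profile timeline sorts correctly
    imported_at: datetime          # separate property that flags the post as imported
    text: str
    original_id: str               # lets the UI link back to the source post
    in_reply_to_id: Optional[str] = None
    notify_mentions: bool = False  # user tags must not trigger notifications
    push_to_timelines: bool = False  # imported posts must not flood timelines


def import_posts(
    shallow_list: list[tuple[str, datetime]],
    fetch: Callable[[str], SourcePost],
    matches: Callable[[SourcePost], bool],
    now: datetime,
) -> list[ImportedPost]:
    """Import posts oldest-first so replies can be linked to already-imported parents."""
    id_map: dict[str, str] = {}  # source post id -> new post id on the target
    imported: list[ImportedPost] = []
    for post_id, _created in sorted(shallow_list, key=lambda item: item[1]):
        post = fetch(post_id)  # pull one full post at a time from the source
        if not matches(post):  # user-selected criteria (media only, bookmarked, ...)
            continue
        new_post = ImportedPost(
            id=f"new-{len(imported) + 1}",
            created_at=post.created_at,
            imported_at=now,
            text=post.text,
            original_id=post.id,
            # Same-account threads survive only if the parent was imported too;
            # otherwise the copy becomes a top-level post.
            in_reply_to_id=id_map.get(post.in_reply_to_id),
        )
        id_map[post.id] = new_post.id
        imported.append(new_post)
    return imported
```

Processing in chronological order is what makes the `id_map` lookup work: by the time a reply is imported, its same-account parent (if it matched the criteria) already has a new id on the target instance.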

This gives users the ability to preserve their own content when migrating between instances, while doing the processing at a rate that can be controlled by the administrator of the new, target instance (and at a priority that will not interfere with other instance processing). Migrated posts don't flood timelines. Original post dates are retained. While interactions with other users are, sadly, lost, content posted as threads by the migrating user account is retained. User tags and hashtags can be retained in a way that functions normally (without having to mangle them with special characters so that they look like tags but don't trigger notifications etc.), without flooding users with notifications from old posts.
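The splitting rule from the list above (a source post longer than the target instance's limit becomes several threaded posts) is the least obvious step, so here is a rough sketch of it. The function names `split_post` and `thread_chunks` are made up for illustration, the limit is counted in plain characters (an assumption; a real instance may count URLs and mentions differently), and whitespace is collapsed for simplicity.

```python
from typing import Callable, Optional


def split_post(text: str, limit: int) -> list[str]:
    """Split text on word boundaries into chunks of at most `limit` characters."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit has to be cut mid-word.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def thread_chunks(
    chunks: list[str],
    parent_id: Optional[str],
    new_id: Callable[[], str],
) -> tuple[list[dict], Optional[str]]:
    """Chain chunks as a reply thread and return the posts plus the id of the last one."""
    posts = []
    for chunk in chunks:
        post_id = new_id()
        posts.append({"id": post_id, "in_reply_to_id": parent_id, "text": chunk})
        parent_id = post_id  # the next chunk replies to this one
    return posts, parent_id
```

The second return value of `thread_chunks` is the id that later same-account replies should attach to, which is the "link subsequent posts to the last of the split-up posts" requirement.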

This is better than what can be achieved with a third-party client approach, gives instance administrators more control and preserves data that can't reliably be preserved when exporting and re-importing. It would be possible to build a comparable mechanism that re-imports data from an earlier export from a source instance, but it's possible for a malevolent user to adjust the data in an exported file to fake things like post dates so personally I think dates should only be preserved when the migration takes place directly between instances (even though this does mean both instances need to be online at the same time, for a period until the post migration is complete).

It is also better than expecting users to migrate when that means abandoning potentially years of content — often creative work — that they want to keep. It's a substantial piece of programming work, but not huge. And it's absolutely possible — it could be implemented as soon as someone has the skills and availability to code it.

@Schoeneh

I second this proposed approach by @MastodonContentMover!

@shodanx2

shodanx2 commented Jan 12, 2025

Don't forget about relationships
If migrating your server means everyone subscribed to you no longer sees you as usual on their homepage, the migration is not frictionless. It means losing your audience.

And that means server owners and their delegates can leverage that against you: you will have to accept their conditions or be un-personned if you try to leave.

It's not just about backing up a couple jpegs, it means the comment thread, the whole metadata package.

Other than copying the bulk data, the relationships could be preserved by leaving behind a redirect, or having a distributed signed table of known redirects, so that even if the server caught on fire with no backups, you can still broadcast to the fediverse where you are leaving to.
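To make the redirect-table idea concrete, here is a very rough sketch of how a server might resolve an account through such a table. Everything in it is an assumption for illustration: the table is a plain dict, the account handles are invented, and signature checking is left out entirely (the entries are assumed to have been verified before they get into the table).

```python
def resolve(redirects: dict[str, str], account: str) -> str:
    """Follow redirects from an old account handle to its current location."""
    seen = {account}
    while account in redirects:
        account = redirects[account]
        if account in seen:  # guard against redirect loops (a -> b -> a)
            raise ValueError(f"redirect loop at {account}")
        seen.add(account)
    return account


# Hypothetical table: the account moved twice.
table = {
    "@alice@old.example": "@alice@mid.example",
    "@alice@mid.example": "@alice@new.example",
}
print(resolve(table, "@alice@old.example"))
```

Because the table does not live on the old server, the lookup still works after that server has disappeared.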

The freedom of users should be the same as if every Mastodon user ran their own single-user server.
And if there's a disagreement between user and the actual server, it should always be possible for that user to move to their own instance at no penalty whatsoever for any reason and without needing anyone else's permission.

@Schoeneh

Schoeneh commented Jan 12, 2025

@shodanx2
And if there's a disagreement between user and the actual server, it should always be possible for that user to move to their own instance at no penalty whatsoever for any reason and without needing anyone else's permission.

Yes, that's it, that's the sentence!

May I quote you in a public post to get some more eyes and attention on this issue?

@MastodonContentMover

Don't forget about relationships

Relationships already migrate.

It's not just about backing up a couple jpegs, it means the comment thread, the whole metadata package.

It would be nice to have that. It demands a solution that isn't straightforward, and we've spent the last five years not giving anyone a solution even for what is straightforward because we can't agree on how to give them what isn't.

This request, as per the original post, is for users to be able to migrate their own post content to new accounts. The solution described here achieves that.

It would be helpful for requests for more sophisticated functionality (such as "true" portability of posts that retains interactions seamlessly) to be spun off into a separate discussion, rather than spend another five years not giving users what they've needed for the last five years because we can't yet decide on how to give them more than they're asking for.

@Schoeneh

rather than spend another five years not giving users what they've needed for the last five years because we can't yet decide on how to give them more than they're asking for.

100% this!
