Support Post Migration #12423
Comments
Once more I fully support this and have been eagerly awaiting the much requested feature. I would love to see the ability to import the data archives (favorites, boosts, and toots) exported from the https://instance.social/settings/export section, which is currently a one-way system only. Unfortunately there doesn't seem to be a plan for this yet, but I'm staying hopeful that will change eventually. I believe Eugen mentioned performance concerns in the previous discussion. |
Reiterating points from previous discussion:
|
Thanks @trwnh!
What's backfilling? |
backfilling is #34 -- fetching old statuses from a profile that were never delivered to the instance. you can't expect to deliver 100,000 activities every time you migrate, because that's wildly inefficient. |
Regarding performance and backfilling: This is why I believe that post migration is possible, but those who choose to use it will need to accept a long waiting period. There will probably have to be three types of limitations to make it viable:
Obviously, if you're importing something like 10,000 boosts + faves + toots, the process may take days if not weeks. It's a price the user will have to pay. But in this format it sounds like it should at least be viable performance-wise. |
That makes sense! Personally I'm perfectly fine with them taking a long while to move over, so long as they move over eventually. I'm not quite sure what the limit should be, though – one post per 10 seconds seems awfully slow to me, unless you have like 100 accounts being migrated at once. Maybe set it relatively fast by default (like 1/s or maybe even a little higher), since most instances are small and won't need to deal with a lot of migrations, then larger instances can set it slower? |
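To make the rate-limit idea above concrete, here is a minimal sketch of an admin-configurable import throttle. It is not Mastodon code: `ThrottledImporter`, `posts_per_second`, and the `import_one` callback are hypothetical names, and a real implementation would presumably run as a low-priority background job rather than a blocking loop.

```python
import time
from typing import Callable, Iterable


class ThrottledImporter:
    """Imports archived items no faster than a configured rate.

    Hypothetical sketch: the rate would be set by the instance admin,
    e.g. 1.0 on a small instance or 0.1 (one post per 10 s) on a busy one.
    """

    def __init__(self, posts_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        if posts_per_second <= 0:
            raise ValueError("posts_per_second must be positive")
        self.interval = 1.0 / posts_per_second
        self._clock = clock
        self._sleep = sleep

    def run(self, posts: Iterable[object],
            import_one: Callable[[object], None]) -> int:
        """Import each post in order, pausing between items; return the count."""
        imported = 0
        next_allowed = self._clock()
        for post in posts:
            wait = next_allowed - self._clock()
            if wait > 0:
                self._sleep(wait)  # hold back until the next slot opens
            import_one(post)
            next_allowed = self._clock() + self.interval
            imported += 1
        return imported
```

With a throttle like this, one post per 10 seconds works out to roughly 28 hours for 10,000 items, while 1 post per second finishes the same archive in under 3 hours, so the default mostly matters for how many migrations run at once.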
Although I would really, really prefer if everything gets migrated, even if it takes weeks or even months. I don't want to lose my old posts – that's basically an archive of what I'm like, so new people can get to know me a little better before they decide whether to follow me. It's also history, and while some may not want to keep that sort of thing, others (like me) would. (Even if you technically have everything in the backup, that's incredibly cumbersome to look at, given I'm not aware of any software that can actually import it.) |
stuff can get imported, but it is absolutely a bad idea to redeliver it. i am firmly of the opinion that a Move with post import should maybe do a rewrite of existing statuses in the database, but nothing more. it is unwise to even consider backfilling until the actual decoupling and importing gets implemented. more to the point in general, though: it is a fundamentally bad idea to assume that every instance should have a copy of every single post at all times. we don't need to duplicate everything 6000x or more. what is fundamentally happening with a "migration" is that the authority is being shifted; the location is being shifted, that's all. it's a Bad Idea to treat them as new statuses. hopefully i've explained why |
Hm...well, would they be new statuses? I figured they would simply be reassigned, but then maybe redelivered as the originals, not duplicated and "reposted". Or maybe I'm completely misunderstanding how federation works. I know very little about the backend.
Is this actually being discussed? I'd say for a migration the situation is a bit different; only the destination server has to receive a copy of every single post, not everybody. |
yeah, and it should be limited to this. looking at prior art from zot, you can do an online migration (fetch old messages from your outbox and inbox to the new server) or an offline migration (using an exported account archive that contains at minimum your keys, your posts, and your address book) |
That sounds lovely! I've been a little worried in the back of my mind about how to handle a server that just suddenly dies. |
Is there at least a first-party tool that users (not servers) can use to work their downloaded backup and generate something usable? Something like an HTML file (plus media?) that can be uploaded into a remote server. Would at least temporarily solve the problem of making the lost content accessible again. |
@lmachucab I made a NodeJS script that I used to port my faves and boosts to other accounts. Unfortunately it no longer worked last time I tried it, and it was a crude solution that put quite a bit of stress on the target server and even triggered an instance's flood protections once. If you think you might have some use for it still, it's still available on my Github: |
Hi there! Mastodon.cloud literally has a month to live. We're being shut down because the domain is being forced to bow to insane regulations, and the admin isn't having any of it. We had been the target of immense DDoS attacks for the past six months. |
I'd like to second that Mastodon.cloud disappearing shows this is a crucial feature. Mastodon is a great piece of software, and designing it to be decentralized was a great idea. Making it possible for anyone to set up an instance, yet having instances be able to communicate with one another, makes the network resilient and eliminates single points of failure. But without the ability to jump from one instance to another at will, this decentralization has a large drawback. If an instance decides to call it quits, you're just going to lose your data. This also means that as a user, you can't be sure if the instance you picked is going to live a long or a short life. Given how many instances there are, it's totally reasonable to expect that some of them are going to disappear. Aside from users choosing to migrate somewhere else, there should also be a migration plan for instance admins. If an instance shuts down, it shouldn't just be that active users who care about their account get to keep their data, because that means even if user migration is possible there's still going to be a big loss of online culture and data. I think this should probably be a separate issue and come with a host of problems by itself, but without it I think we'll see Mastodon content regularly disappearing forever as instances come and go. |
Yeah even I didn't expect this sad news. mastodon.cloud banned me randomly and without explanation after a few months of using it... at the time I was upset, now I'm glad I escaped this event. It's sad to hear even Japan is becoming a dangerous place for internet freedom and has a government failing to understand technology and the flow of information. I know mastodon.cloud was one of the largest instances out there. If it can go down, the fediverse as a whole is at serious threat of significant data loss. That's why I would further implore that we please support FULL account migration now... even if as an option disabled by default which the instance admin can enable if they so wish, in case performance is such a big concern. An equally good idea would be a node-based approach to storage, similar in concept to how IPFS / DAT work: instead of an instance being stored on and served by one physical server, it can be stored in a decentralized cloud where anyone may run a node to serve a copy of the data. Sadly this would require such a rewrite that we'd be talking about a whole new project over the existing Mastodon. |
A reminder that we're supposed to "Own Your Own Data With Mastodon!" If we don't find a way to do that with toots, we are bound to be sued for false advertising. I smell a frivolous lawsuit that's bound to cripple the Fediverse and we need to get ahead of it. While I may not ideologically agree with Gab, which accounts for 25% of the Fediverse's mass, they have every right to be here as we do, and we need to prepare for pointed media smear campaigns that could potentially harm our reputation. We need to emphasize our content controls, and remind users that they can ban who they want from their own console, and that moderators do not have to do it for them. We have that advantage over Twitter and we need to advertise th out of it. |
Mastodon's own code is under the AGPL, right? That's the situation you're talking about, right? I'll agree the AGPL would have no effect on media outlets saying whatever charged things about mastodon they want to, though. |
That is an interesting point. I think they forked Mastodon, though (as of 2.8.5) and created Gab out of it. Ironically, it seems they were able to preserve their "toots" during the migration. My comment revolved around the idea that you can "Own Your Own Data With Mastodon". If this is the case, we must ensure that people have the capability to export their toots otherwise that slogan could be misconstrued as false advertising. We must have a provision that allows users to migrate to another server if the one they're on is being shut down through no fault of its own (regulatory crackdowns in the case of Mastodon.cloud). What happened in Minneapolis is a distraction. The attacks against the Fediverse and all other entities like it will intensify. We must fortify our infrastructure and address any outstanding issues so that we're ready for any exodus. |
Even a static representation of said toots is fine with me, similar to Twitter's archive, which you can navigate offline. They don't need to be integrated into the post chronology. Perhaps we could create a designation for such toots, and also prohibit their editing or removal to ensure a smooth migration. |
there's an issue for that #9461 (export including some bare html-version of your toots). With regard to European GDPR, there's the rule that you must have access to a machine readable version of your data. This is satisfied already. While this satisfies the letter of this particular law, I do wish it was easier for everyone to use that data (hence the ticket above). I also think re-importing would be nice. Yet, I do not believe there's any grounds to your "false advertising" point. Sorry. (edit: I linked to the wrong issue before, this has been corrected.) |
Thank you for clarifying that. I'll follow that issue too. ^_^ |
I strongly support this. I have my own independent instance that I don't want to maintain anymore so I moved over to another instance. DigitalOcean doesn't let you download complete images as a backup so I can't wait to upgrade to a later version that supports it and transfer the old content over. |
This is a feature that it's impressive that Mastodon does not have in 2020, honestly. Without the ability to export and import post history not only can accounts be lost but also identities - as even if an account is migrated to another server, at the moment this is a partial "in name only" process. Any content that the old account had posted is still bound to the lifetime of the old service (and its FQDN). After having tested some scripts for extracting the information from the exported data dumps, I think there's enough of those scripts and projects wandering about that it's perfectly feasible to grab one (1), integrate it into Mastodon, and offer it as a utility that e.g. generates full sessions / scripts for clients like |
@lmachucab do the existing scripts deal with not treating the imported posts as new posts that would be picked up for federation? From what I'm reading above, that's what should be avoided to reduce load on servers. I would be happy if my posts were only viewable from my account page/feed rather than immediately (or ever) federated (i.e., they would only be federated if someone was looking at my account). All I want to be able to do is change my username or move to another instance if needed. |
I thought I'd bring this up: One of the SPC admins revived bofa.lol, a previously discontinued instance, only for it to regurgitate its old content over TWKN, some of which was excessively vile. bofa.lol aside, I'd like to focus on the mechanism that caused those toots to refederate and see if we can harness that mechanism to archive them for other instances. As of this writing, bofa.lol was taken down less than a day after it was briefly revived. *Edited for context and some spelling mistakes at 21:50-0500 |
2021* |
This is going to be incredibly important to ensuring Fediverse platforms don't centralize around the most popular instances. If I have the ability to completely move my account from one instance to another with minor hassle, chances are I'm going to be more willing to take risks on smaller instances. Some of my first accounts on Masto/Peertube were on small instances, but they ended up closing down on me unexpectedly. |
@Gargron, do you consider this to be a feature worth pursuing? if so, what is needed to make it happen? it's been open for nearly 2 years without any comments from mastodon's lead, so i thought i would ask. |
@omentic yeah mostly, the only thing i'd point out is that using GUIDs isn't enough because anyone can claim any id. so you need to bind it to some authority -- either an HTTP origin (DNS hostname) or some public key cryptography alternative (which is its own big breaking change, but would ultimately grant more portability). "identity" in general has to be rooted in one of the two. you'd basically need an indirection layer along the lines of something like #10745 would make it easier to get there but would not solve the issue entirely on its own. it just allows the username to change. you'd still need a way to allow the authority to change (or otherwise defer to some other authority). |
@trwnh could you elaborate on this? What's wrong with having a Really Big GUID namespace and assigning ids randomly? If I understand things, then right now definitionally that authority is |
the problem is bad actors that intentionally use a conflicting id. it doesn't matter how big the namespace is. you need some authority, like a domain or otherwise signing the guid. right now the authority is the best we can do right now is you could mint identifiers against some stable authority, like a PURL service. basically, if changing something is a problem, you have two possible solutions:
|
@trwnh In FEP-ef61 the authority is a DID (so it's a breaking change), but I think the interoperability with existing software can be preserved if implementations will generate IDs as HTTPS URLs containing a DID URL instead of just DID URLs (this idea is discussed in the "Compatibility" section of the proposal). Similar to how IPFS objects can be referenced either by |
@trwnh I still don't see the problem. Where do bad actors come into play? You have to trust the migrated-from instance to correctly point at your new profile, and subsequently point other instances to update their internal id mapping. Another instance's GUID is only relied upon by the migrated-to instance when importing posts, so that they have the same GUID (actually, going forth they wouldn't even need to use the old authority In general I think that fully decoupling identity from instances requires a significant amount of additional complexity and could be done later anyway, so if it isn't strictly necessary for post migration I'd like to avoid thinking about it. |
(you might have to kind of hand-hold me through an explanation of why bad actors are a problem here - i only started reading through these issues last night, and have no experience running an instance or dealing with federation) |
Once a post's UID becomes known, any bad-acting instance admin can claim that post for their own just by adding a post with that ID to their server. At best, there is a duplicate ID now floating out there in the fediverse. At worst, the forged post is now considered to be the real post. I believe the commenters above are correct; there has to be some tether to an authority to validate a post identifier. |
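The "tether to an authority" point can be sketched as an origin check: a receiving server accepts an object only if its id lives on the same HTTPS origin as the actor that delivered it. This is an illustration of the idea, not Mastodon's actual validation code; `same_authority` and the example URLs are made up for the sketch.

```python
from urllib.parse import urlsplit


def same_authority(object_id: str, delivering_actor_id: str) -> bool:
    """Return True only if both ids are HTTPS URLs on the same host and port.

    Hypothetical helper: a bare GUID (e.g. a urn:uuid: id) has no host,
    so it can never pass, which is exactly why an unanchored GUID is
    not enough to prove who owns a post.
    """
    obj = urlsplit(object_id)
    actor = urlsplit(delivering_actor_id)
    if obj.scheme != "https" or actor.scheme != "https":
        return False
    if obj.hostname is None or actor.hostname is None:
        return False
    # hostname is already lower-cased by urlsplit; compare port as well
    return (obj.hostname, obj.port) == (actor.hostname, actor.port)


# A forged copy delivered by another server fails the check.
print(same_authority("https://a.example/users/alice/statuses/1",
                     "https://a.example/users/alice"))      # True
print(same_authority("https://a.example/users/alice/statuses/1",
                     "https://evil.example/users/mallory"))  # False
```

The check is deliberately strict, and it also shows the cost discussed in this thread: once ids are bound to a hostname, moving a post to a new host means either changing its id or introducing some other verifiable authority.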
OK, so here's an example:
The problem as it relates to migration of posts is that |
Ah, I see now. I don't think this needs to be a problem. Posts currently have an internal UUID: there are duplicate ids between instances floating about, and it doesn't matter because remote posts are not resolved by relying entirely on the UUID. What I thought was being proposed with the GUIDs (and what I think should be proposed) is that the "global" aspect of the GUID only becomes relevant upon post import. Things can function as they currently function otherwise: despite ostensibly being a GUID were all actors trusted, the GUID is treated exactly how internal post identifiers are currently treated - as a UUID - with the sole exception being during post import, where you trust both parties. So this would mean that upon a move to a new username/domain the old |
(@trwnh if you've got the time, i'm curious if my thinking here is accurate? if so i might move ahead with a GUID proposal) |
@omentic How can Instance A, the moved-from instance, communicate the migration of a user to all other relevant instances in the network? At the end of the day, all relevant instances need to change the information of the user's origin in their databases. You can't just assign a GUID, because in fedi, there is no single Authority. Nothing ensures a GUID's uniqueness, nor can you resolve it uniquely to an origin server or a public key. So, sure, using GUIDs would mean that all TRUSTED servers wouldn't have to worry about id collisions between posts. However, any malicious server can claim: "I generated this GUID for this post. It has the content: 'i am a moron, signed, Donald Duck'" Even though Mr. Duck has never posted something like that. The mechanism we use to keep ids globally unique right now is the Domain Name System. The AUTHORITY who can delete a post, receives replies for the post, can edit the post, etc, is baked right in. A "full move" of a post is essentially a transfer of authority, and given that ids are tied to the hostname, IDS MUST CHANGE in order for the move to take place. The only way around this would be cryptography. This is a massive breaking change with its own problems - regarding key management and key ownership - as trwnh said. But with cryptography, you could theoretically spin up a new instance, dig out your private key, and sign a thousand messages saying 'i'm here now'. And then all existing fedi servers would receive the proof, and add a redirection entry into their database, or change the ids of existing objects, though that would be a greater risk to data integrity. |
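The "redirection entry" option mentioned above could look roughly like the following on a receiving server. This is only a sketch under stated assumptions: `resolve_id` and the in-memory `redirects` mapping are hypothetical, and the entries are assumed to have already been verified (by signature or by the old host) before being stored, which is the hard part and is not shown here.

```python
from typing import Dict


def resolve_id(object_id: str, redirects: Dict[str, str],
               max_hops: int = 10) -> str:
    """Follow stored old-id -> new-id entries until a current id is reached.

    Assumption: every entry in `redirects` was verified before insertion.
    Raises ValueError on a redirect loop or an overly long chain.
    """
    seen = set()
    current = object_id
    while current in redirects:
        if current in seen or len(seen) >= max_hops:
            raise ValueError(f"redirect loop or chain too long at {current}")
        seen.add(current)
        current = redirects[current]
    return current


# Example: a post that moved twice still resolves to its latest location.
table = {
    "https://old.example/posts/1": "https://mid.example/posts/1",
    "https://mid.example/posts/1": "https://new.example/posts/1",
}
print(resolve_id("https://old.example/posts/1", table))
```

Keeping the original rows untouched and resolving through a table like this avoids rewriting ids of existing objects in place, which is the data-integrity risk the comment above raises.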
My guess would be: Solution 1:
Solution 2:
|
Solution 2 also doesn't work as a proof to remote instances that are newer than your move - they have never seen your uncompromised actor jsonld |
Your solution 1 seems like the least complicated system of migration, if I am understanding correctly - the 301s create authority over the post with the notification of the change federating in a similar way to edit notifications. That way any further actions taken on the post originating from the new instance will be trusted. |
@kartonrad I think you're missing what I'm saying. I am saying that
this is not true, and I am also saying that
this is not relevant, and that
this is false. (the transfer of authority part is true. the ids needing to change and the only way around this being cryptography is false.) |
I keep getting notifications here. Stop yapping and write the code that will make this happen. Update: This is not to the core team (who I hope will some day get to it), this is to everybody who's commenting here. This is open source and it thrives off contributions. The amount of energy wasted here is incredible. |
@kartonrad Nomadic identity already works in Hubzilla and Streams, both are Fediverse projects. They use different protocols alongside ActivityPub, but a similar solution can be implemented in pure ActivityPub too. That solution is described in proposal FEP-ef61: https://codeberg.org/fediverse/fep/src/branch/main/fep/ef61/fep-ef61.md It introduces a new kind of object ID where authority is indicated by a cryptographic key instead of a domain name. |
@silverpill But... is there really any project that is fully activity pub compliant? Mastodon could, of course, move towards this system, theoretically. But, as stated in the document you linked:
I guess i did say "fedi" not "AP" |
Unsubscribed. IMO this should be a discussion. Good luck! |
The suggestion / feature request is valid. The problem is that it is repeatedly derailed by posts making in-depth and obscure technical arguments implying that as long as there is no perfect solution, no solution should be implemented. Not only does this blind everyday users visiting this thread with technical language, it has also left them without a way to move their content for five years and counting, on a platform that claims to offer mobility between instances. Meanwhile, other Fediverse platforms offer the functionality, and third-party tools have existed for long enough that even they are beginning to age. The technical discussion of how a perfect solution could be implemented could well be better as a discussion, but the request for this functionality to exist should remain here, and, in the absence of consensus on a perfect solution, a practical workaround so users are not held to ransom with their content should be put in place at the earliest opportunity. Without this, the claim that Mastodon offers users genuine mobility between instances is significantly flawed. How many years is it going to take? The perfect is the enemy of the good. |
The ability of users to migrate without friction, loss of connection and so on, would undermine the power of the mastodon owner class. The technical hurdles are excuses for not granting users this level of autonomy to simply escape the influence of their instance owner and their delegates (moderators). Control of the means of communication is the name of the game and users being able to simply pack up and leave with their relationships and history intact really puts a dent in that. I don't think we're going to see progress on this front until the way moderators shape public discourse with no accountability is recognized and their power is checked. I have very little confidence of that happening anytime soon, this issue has had no discussion in a year. There is no willingness to acknowledge the problem by those who have made it their mission to build mastodon, let alone resolve it. Every avenue of this problem has already been discussed and I don't think there is much left to say, it's not a "how" question. I think twitter users fleeing came here and saw they would become prisoners here and if they are going to be prisoners they chose to go back to Rasputin instead, the devil you know. At this point it's been so long that this discussion is more for the design of whatever will replace mastodon than for mastodon itself. Hopefully the next one will make frictionless user migration a day one priority. |
Hearing Mastodon repeatedly touted as a platform that offers users mobility, and that this is one of its major advantages over legacy centralized social media and a sign of great progress, while this glaring omission in what's needed to deliver actual mobility remains unaddressed is curious, for sure, but I refuse to be defeatist about it. Here is what would work:
This gives users the ability to preserve their own content when migrating between instances, while doing the processing at a rate that can be controlled by the administrator of the new, target instance (and at a priority that will not interfere with other instance processing). Migrated posts don't flood timelines. Original post dates are retained. While interactions with other users are, sadly, lost, content posted as threads by the migrating user account is retained. User tags and hashtags can be retained in a way that functions normally (without having to mangle them with special characters so that they look like tags but don't trigger notifications etc.), without flooding users with notifications from old posts. This is better than what can be achieved with a third-party client approach, gives instance administrators more control and preserves data that can't reliably be preserved when exporting and re-importing. It would be possible to build a comparable mechanism that re-imports data from an earlier export from a source instance, but it's possible for a malevolent user to adjust the data in an exported file to fake things like post dates, so personally I think dates should only be preserved when the migration takes place directly between instances (even though this does mean both instances need to be online at the same time, for a period until the post migration is complete). It is also better than expecting users to migrate when that means abandoning potentially years of content (often creative work) that they want to keep. It's a substantial piece of programming work, but not huge. And it's absolutely possible: it could be implemented as soon as someone has the skills and availability to code it. |
I second this proposed approach by @MastodonContentMover! |
Don't forget about relationships. And that means server owners and their delegates can leverage that against you, you will have to accept their conditions or be un-personned if you try to leave. It's not just about backing up a couple jpegs, it means the comment thread, the whole metadata package. Other than copying the bulk data, the relationships could be preserved by leaving behind a redirect, or having a distributed signed table of known redirects, so that even if the server caught on fire with no backups, you can still broadcast to the fediverse where you are leaving to. The freedom of users should be equivalent to what they would have if every Mastodon user ran their own single-user server. |
Yes, that's it, that's the sentence! May I quote you in a public post to get some more eyes and attention on this issue? |
Relationships already migrate.
It would be nice to have that. It demands a solution that isn't straightforward, and we've spent the last five years not giving anyone a solution even for what is straightforward because we can't agree on how to give them what isn't. This request, as per the original post, is for users to be able to migrate their own post content to new accounts. The solution described here achieves that. It would be helpful for requests for more sophisticated functionality (such as "true" portability of posts that retains interactions seamlessly) to be spun off into a separate discussion, rather than spend another five years not giving users what they've needed for the last five years because we can't yet decide on how to give them more than they're asking for. |
100% this! |
#177 – Support Account Migration – was closed after implementing follower migration, but this is only one small part of a true migration. To really be able to change instances, you need to be able to take your posts with you. There was some good discussion on this over there; I'm opening a new issue to make it clear that this is a separate concern from that issue, which seemed to evolve into only being about followers.
Personally, I couldn't care less about migrating my followers/following list. I can refollow people and have them refollow me. It'll all work out. My posts, however, are currently impossible to restore.