Something went wrong during account migration - lost followers #23594

Open
vitobotta opened this issue Feb 14, 2023 · 22 comments
Labels
bug Something isn't working

Comments

@vitobotta

Steps to reproduce the problem

  1. Created an alias for vitobotta@ruby.social - where ruby.social is the old instance - on vito@botta.social (where botta.social is the new instance)
  2. Migrated account from vitobotta@ruby.social to vito@botta.social

...

Expected behaviour

Migration completed with no loss of followers

Actual behaviour

Something went wrong during the migration and I lost most of the followers

Detailed description

When I did the migration almost 3 days ago, 41 followers were migrated right away, and then the migration of the remaining followers (around 350 in total, I can't remember exactly) stopped for some reason. I knew you need to wait for the migration to complete, but 3 days seems like a long time.

In the meantime, in my Nginx logs (for botta.social, the new instance) I see requests returning a 404 status code for the path /.well-known/webfinger?resource=acct:botta.social@botta.social. The account name botta.social@botta.social is wrong here, and I don't know why the domain was used as the account name, because when I did the migration I definitely entered vito@botta.social as the new account. Besides, botta.social is not even a valid username since it contains a dot, and that account didn't exist on the new instance, so it had no alias for the old account vitobotta@ruby.social; the migration would not have been allowed if I had mistakenly entered botta.social@botta.social as the new account name.

I still see those requests to the wrong webfinger URL, so in an attempt to "fix" this I configured a redirect in Nginx from that URL to the one for the correct account. The requests now get redirected correctly, but I don't see any change in my followers yet.

What else can I try? The old account vitobotta@ruby.social shows just 12 followers now as if most of them had already been migrated. I even tried cancelling the redirect so that I could attempt a new migration, but it seems I'd have to wait for 28 days so I created just a redirect for now.

So, to recap:

  • did a migration from vitobotta@ruby.social to vito@botta.social
  • migration seemed completed from the PoV of the old account, but only 41 out of ~350 followers were migrated
  • in the Nginx logs for the new instance I see requests from several Mastodon domains for the path /.well-known/webfinger?resource=acct:botta.social@botta.social where botta.social@botta.social is not the correct account, which is vito@botta.social
  • I tried redirecting (in Nginx) the wrong webfinger URL to the one for the correct account but it didn't help.
  • I am still seeing those webfinger requests for the invalid account although not as frequently now

Specifications

Mastodon 4.1.0 on Docker, latest image

vitobotta added the bug label on Feb 14, 2023
vitobotta changed the title from "Something went wrong during account migration - wrong account name used" to "Something went wrong during account migration - wrong account name used and lost followers" on Feb 14, 2023
vitobotta changed the title from "Something went wrong during account migration - wrong account name used and lost followers" to "Something went wrong during account migration - lost followers" on Feb 14, 2023
@Cassolotl

Yes, there is a frustrating lack of feedback when migrating followers. I want a list of the followers who couldn't be migrated and why, and what the system is going to do about it (if anything)!

@ClearlyClaire
Contributor

> In the meantime, in my Nginx logs (for botta.social, the new instance) I see requests returning a 404 status code for the path /.well-known/webfinger?resource=acct:botta.social@botta.social. The account name botta.social@botta.social is wrong here, and I don't know why the domain was used as the account name, because when I did the migration I definitely entered vito@botta.social as the new account. Besides, botta.social is not even a valid username since it contains a dot, and that account didn't exist on the new instance, so it had no alias for the old account vitobotta@ruby.social; the migration would not have been allowed if I had mistakenly entered botta.social@botta.social as the new account name.

botta.social@botta.social would have been the internal actor's name if you had installed Mastodon prior to 4.1.0. Now it's mastodon.internal@botta.social. In both cases, it's represented by https://botta.social/actor

Its purpose is to sign “anonymous” queries.

I'm really surprised botta.social@botta.social would be used anywhere unless you re-installed your instance from scratch after initially setting it up.
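
To make the two names concrete, here is a small Python sketch (illustrative only, not Mastodon's code) that builds the instance actor's acct under each naming scheme described above, along with the WebFinger URL a remote server would request for it. The function names are hypothetical; the path and the `resource=acct:` form are taken from the Nginx log entries quoted in this issue.

```python
def instance_actor_acct(domain: str, legacy: bool) -> str:
    """Return the instance actor's acct for a domain.

    legacy=True models installs made before 4.1.0 (domain used as username);
    legacy=False models the mastodon.internal naming described for 4.1.0.
    """
    username = domain if legacy else "mastodon.internal"
    return f"{username}@{domain}"


def webfinger_url(acct: str) -> str:
    """Build the WebFinger lookup URL in the form seen in the access logs."""
    domain = acct.rsplit("@", 1)[1]
    return f"https://{domain}/.well-known/webfinger?resource=acct:{acct}"


old = instance_actor_acct("botta.social", legacy=True)
new = instance_actor_acct("botta.social", legacy=False)
print(webfinger_url(old))  # the lookup that returns 404 on the reinstalled server
print(webfinger_url(new))  # the lookup a fresh 4.1.0 install is expected to answer
```

Both accts are meant to stand for the same ActivityPub id, https://botta.social/actor; only the username part differs.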

@vitobotta
Author

Hi @ClearlyClaire and thanks for the explanation :) So is it normal to see 404s for those requests? It sounds like that would mean those instances are running Mastodon < 4.1.0, but most of the instances still generating those requests show 4.1.0 in the user agent. Is there anything I can do to "instruct" those instances which is the correct user, so as not to lose those followers?

@vitobotta
Author

vitobotta commented Feb 14, 2023

@ClearlyClaire btw this is a totally new instance

Edit: but I had an instance with the same domain 1-2 months ago. is that why?

@ClearlyClaire
Contributor

> What else can I try? The old account vitobotta@ruby.social shows just 12 followers now as if most of them had already been migrated. I even tried cancelling the redirect so that I could attempt a new migration, but it seems I'd have to wait for 28 days so I created just a redirect for now.

I'm very sorry, everyone but those 12 followers has stopped following your old account, and by now should have given up following your new account. #21957 would fix that behavior

> Hi @ClearlyClaire and thanks for the explanation :) So is it normal to see 404s for those requests? It sounds like that would mean those instances are running Mastodon < 4.1.0, but most of the instances still generating those requests show 4.1.0 in the user agent. Is there anything I can do to "instruct" those instances which is the correct user, so as not to lose those followers?

No, that wouldn't be it, it's the actor your instance is advertising. So the other servers' version has no bearing there.

> Edit: but I had an instance with the same domain 1-2 months ago. is that why?

ohh… yeah, that would be why, I think!

I think as a band-aid you can redirect the webfinger replies for botta.social@botta.social to mastodon.internal@botta.social, but I need to think about how to avoid such issues in the future.

@vitobotta
Author

@ClearlyClaire Is it enough to do the redirection you suggest directly in nginx?

@ClearlyClaire
Contributor

I think so.

@vitobotta
Author

OK I added this redirection in Nginx. Will this help the instances who are trying to follow me at the new account?

@ClearlyClaire
Contributor

> OK I added this redirection in Nginx. Will this help the instances who are trying to follow me at the new account?

It should, but all instances that tried to follow you as a result of the migration have probably given up by now, unfortunately.

@vitobotta
Author

> > OK I added this redirection in Nginx. Will this help the instances who are trying to follow me at the new account?
>
> It should, but all instances that tried to follow you as a result of the migration have probably given up by now, unfortunately.

I did the migration 3 days ago. How long do instances keep trying before giving up? If it's the standard Sidekiq job retry thing I think it would be 21 days max

@ClearlyClaire
Contributor

> How long do instances keep trying before giving up? If it's the standard Sidekiq job retry thing I think it would be 21 days max

It's about 2 days IIRC. It's using default delay and jitter but “only” 16 tries. See #21956
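
As a rough check on these figures, the sketch below sums Sidekiq's documented default backoff, `(retry_count ** 4) + 15` seconds plus a random jitter term, over a given number of attempts. It is written in Python purely for illustration and leaves the jitter out, so the totals are approximate lower bounds; the function name is hypothetical.

```python
def total_retry_window_seconds(max_retries: int) -> int:
    """Sum the base delays for retry counts 0..max_retries-1 (jitter excluded)."""
    return sum(count ** 4 + 15 for count in range(max_retries))


for retries in (16, 25):
    days = total_retry_window_seconds(retries) / 86_400  # seconds per day
    print(f"{retries} retries: about {days:.1f} days")
```

Under these assumptions, 16 tries add up to a little over 2 days and Sidekiq's default of 25 to roughly 20 days, which is consistent with the numbers mentioned in this thread.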

@vitobotta
Author

> > How long do instances keep trying before giving up? If it's the standard Sidekiq job retry thing I think it would be 21 days max
>
> It's about 2 days IIRC. It's using default delay and jitter but “only” 16 tries. See #21956

Ah :(

@vitobotta
Author

https://raw.githubusercontent.com/mastodon/mastodon/66f715550e575129e5d8b093a15aa67527136bd2/app/workers/activitypub/move_distribution_worker.rb

Is MoveDistributionWorker the relevant worker? It doesn't specify the number of retries etc so perhaps it uses the default values from Sidekiq? I think the default is 25 retries

@vitobotta
Author

vitobotta commented Feb 14, 2023

I hope that's the case https://github.com/sidekiq/sidekiq/wiki/Error-Handling#automatic-job-retry

So they should keep retrying with exponential backoff for around 20 days. Is my assumption correct that no explicit retry number means the default applies?

Edit: just realized that that worker uses the Delivery worker which does have the 16 limit set. I guess I'm screwed :D

@ClearlyClaire
Contributor

For the record, I believe the issue regarding the botta.social@botta.social VS mastodon.internal@botta.social confusion is the following:

  1. these servers knew your instance actor as botta.social@botta.social (with ActivityPub id https://botta.social/actor)
  2. you basically re-created your server from scratch, so it's using the new naming scheme mastodon.internal
  3. when receiving a signed request from your server, these servers will fail to verify the signature with the known account (since it's the old one) and try to refresh the actor's key in SignatureVerification#actor_refresh_key!, which goes through Account#refresh! which unfortunately queries the actor by its known acct (so the old one, botta.social@botta.social)
  4. your new server does not know that botta.social@botta.social is supposed to be https://botta.social/actor

Mastodon has some code to handle username changes, although that is not considered normal behavior. However, I think because of Mastodon's behavior in step 3, this code is very difficult to actually trigger, and the behavior in step 3 needs to be changed to make it possible.

And I think adding a redirect as you did would allow Mastodon to go through the account renaming code that didn't run in step 3.
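
The sequence above can be sketched as a toy lookup, in Python and for illustration only. Nothing here is Mastodon's actual code: `NEW_SERVER_ACCOUNTS`, `lookup_actor`, and the `redirects` mapping are hypothetical stand-ins for the new server's WebFinger endpoint and for the Nginx redirect discussed in this thread.

```python
from typing import Dict, Optional

# accts the freshly installed server can resolve, mapped to ActivityPub ids
NEW_SERVER_ACCOUNTS: Dict[str, str] = {
    "mastodon.internal@botta.social": "https://botta.social/actor",
}


def lookup_actor(acct: str, redirects: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Resolve an acct the way a WebFinger request to the new server would.

    Returns the actor id, or None to model the 404 seen in the logs.
    """
    target = (redirects or {}).get(acct, acct)
    return NEW_SERVER_ACCOUNTS.get(target)


stale_acct = "botta.social@botta.social"  # the name remote servers still remember

# Steps 3 and 4: the refresh queries the stale acct and the new server has no match.
print(lookup_actor(stale_acct))

# With the band-aid redirect, the stale acct resolves to the same actor id.
band_aid = {stale_acct: "mastodon.internal@botta.social"}
print(lookup_actor(stale_acct, band_aid))
```

The point of the sketch is that the actor id never changed; only the name used to look it up did, which is why redirecting the old name is enough for the lookup to succeed.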

@vitobotta
Author

Gotcha. I wish I had opened this issue 2 days ago. I would have probably recovered the lost followers...

@ClearlyClaire
Contributor

> https://raw.githubusercontent.com/mastodon/mastodon/66f715550e575129e5d8b093a15aa67527136bd2/app/workers/activitypub/move_distribution_worker.rb
>
> Is MoveDistributionWorker the relevant worker? It doesn't specify the number of retries etc so perhaps it uses the default values from Sidekiq? I think the default is 25 retries

MoveDistributionWorker is for distributing the Move from the instance you're moving from. This has succeeded. To be honest I'm not too sure which step has failed here 🤔

@kingprawn22

I know this is old, but I seem to be having the same or a similar issue. I created a new Mastodon instance earlier today and went through the migration process, but only 1/3 of my followers moved. I had previously set up an instance using the same domain, so it seems like some other instances are trying to connect to my new instance using old information. In the Nginx access logs I'm seeing the following:

10.10.10.1 - - [19/Jul/2023:20:28:31 +0100] "GET /.well-known/webfinger?resource=acct:mastodon.internal@server.com HTTP/2.0" 404 20 "-" "http.rb/5.1.1 (Mastodon/4.1.4+glitch; +https://other.server/)"

I replaced my server with 'server.com' and the other instance as 'other.server' above. I'm unable to follow anybody on other.server, and they're unable to follow me. I have this error showing up every few seconds from a variety of instances.
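
For anyone checking their own logs for the same pattern, here is a minimal Python sketch that pulls the requested acct, the status code, and the requesting server out of an access-log line shaped like the one above. The regular expression and the function name are assumptions tailored to this one sample line, not a general Nginx log parser.

```python
import re
from typing import Optional

# Matches the request, status, size, referer and user agent of a webfinger hit.
LINE_RE = re.compile(
    r'"GET /\.well-known/webfinger\?resource=acct:(?P<acct>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) \d+ "[^"]*" "(?P<agent>[^"]*)"'
)


def parse_webfinger_hit(line: str) -> Optional[dict]:
    """Return acct, status and origin for a webfinger request, else None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    # Mastodon user agents carry the origin server as "+https://host/".
    origin = re.search(r"\+(https?://[^/)\s]+)", match.group("agent"))
    return {
        "acct": match.group("acct"),
        "status": int(match.group("status")),
        "origin": origin.group(1) if origin else None,
    }


sample = (
    '10.10.10.1 - - [19/Jul/2023:20:28:31 +0100] '
    '"GET /.well-known/webfinger?resource=acct:mastodon.internal@server.com HTTP/2.0" '
    '404 20 "-" "http.rb/5.1.1 (Mastodon/4.1.4+glitch; +https://other.server/)"'
)
print(parse_webfinger_hit(sample))
```

Grouping the results by `acct` and `status` would show at a glance which account names remote servers are asking for and which of them return 404.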

It looks like my new server has the location acct:server.com@server.com, so I'm trying to find a way of redirecting people from acct:mastodon.internal@server.com to that, but I'm not sure how. I've tried various redirects in the Nginx configuration file but have got nowhere.

@ClearlyClaire
Contributor

Hi! You seem to have a related issue indeed, but I'm pretty confused: a recently-reinstalled server would have mastodon.internal@server.com and allow redirections from server.com@server.com, while an old server would have server.com@server.com and not announce mastodon.internal@server.com to anybody… did you perhaps install a new server then restore an old database?

@kingprawn22

kingprawn22 commented Jul 20, 2023

I installed a brand new server without restoring an old database. I've now installed a new instance at 'new.server.com', a domain that hasn't been used before, so it works perfectly well. To be able to return to using the original domain, I'm now running the 'tootctl self-destruct' command to try to remove any connections to other instances (and vice versa) before reinstalling the server.

I used this guide on setting up the Mastodon server, if it helps: https://cyberhost.uk/mastodon-docker-compose/

@ClearlyClaire
Contributor

> I installed a brand new server without restoring an old database

That is weird, a new server should be using mastodon.internal@server.com and thus replying to those queries, not yielding a 404. Unless you manually changed something?

> I used this guide on setting up the Mastodon server, if it helps: https://cyberhost.uk/mastodon-docker-compose/

This seems to use a non-official docker image, I don't know if there's been any change to it.

@kingprawn22

I didn't change anything. Any reference to mastodon.internal@server.com is receiving a 404, whereas anything referring to server.com@server.com or username@server.com is receiving a 200 and working. Do you know whether tootctl self-destruct should remove any reference to the current and previous instances?

4 participants