Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rails 7.0 update #25668

Merged
merged 11 commits into from
Jul 13, 2023
Merged

Rails 7.0 update #25668

merged 11 commits into from
Jul 13, 2023

Conversation

mjankowski
Copy link
Contributor

Re-do of #24241

Background on merge/revert here: #24241

This contains the same changes as the previous PR, rebased against main after the merge/revert commits and a few other recent ones.

@github-actions
Copy link
Contributor

github-actions bot commented Jul 2, 2023

This pull request has merge conflicts that must be resolved before it can be merged.

@github-actions
Copy link
Contributor

github-actions bot commented Jul 2, 2023

This pull request has resolved merge conflicts and is ready for review.

@renchap renchap added dependencies Pull requests that update a dependency file ruby Pull requests that update Ruby code labels Jul 3, 2023
@renchap renchap requested a review from a team July 3, 2023 09:01
@mjankowski mjankowski force-pushed the rails-7-0-update branch 3 times, most recently from fe9a2aa to b95097c Compare July 9, 2023 21:35
@github-actions
Copy link
Contributor

This pull request has merge conflicts that must be resolved before it can be merged.

After running, pick through changes and keep sensible defaults while
preserving previous intentional deviations.
The method now takes `records` and `association` keyword args.
Changes the monkey-patched AR Persistence code to match Rails 7 changes.
@github-actions
Copy link
Contributor

This pull request has resolved merge conflicts and is ready for review.

Copy link
Contributor

@ClearlyClaire ClearlyClaire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not actually tested the changes yet, but I made a new review pass.

Also, I'm a bit worried about https://guides.rubyonrails.org/upgrading_ruby_on_rails.html#key-generator-digest-class-changing-to-use-sha256 as I don't see it being addressed in this PR.

In either case, I will very shortly proceed to test this PR in a real environment!

config/environments/production.rb Outdated Show resolved Hide resolved
config/environments/test.rb Outdated Show resolved Hide resolved
@mjankowski
Copy link
Contributor Author

Also, I'm a bit worried about https://guides.rubyonrails.org/upgrading_ruby_on_rails.html#key-generator-digest-class-changing-to-use-sha256 as I don't see it being addressed in this PR.

Will review and report back.

In either case, I will very shortly proceed to test this PR in a real environment!

Cool.

@ClearlyClaire
Copy link
Contributor

ClearlyClaire commented Jul 13, 2023

A behavior change I just noticed is IIRC, assets:precompile used to install yarn dependencies before running webpack, but that does not seem to happen anymore.

This is because railties dropped the yarn:install dependency from the assets:precompile task. This is not necessarily a bad thing, but it is a noteworthy change.

@vmstan
Copy link
Sponsor Contributor

vmstan commented Jul 13, 2023

Deployed to vmst.io, my own session was briefly logged out (like 5 seconds) as it rolled between k8s pods, but it's working fine now.

@mjankowski
Copy link
Contributor Author

A behavior change I just noticed is IIRC, assets:precompile used to install yarn dependencies before running webpack, but that does not seem to happen anymore.

This is because railties dropped the yarn:install dependency from the assets:precompile task. This is not necessarily a bad thing, but it is a noteworthy change.

Probably same issue - rails/rails#46148

@mjankowski
Copy link
Contributor Author

Deployed to vmst.io, my own session was briefly logged out (like 5 seconds) as it rolled between k8s pods, but it's working fine now.

Just to confirm -- the session was preserved and you stayed signed in without needing to re-auth ... but you just had a normal connectivity drop during the k8s rollover?

@vmstan
Copy link
Sponsor Contributor

vmstan commented Jul 14, 2023

The session was preserved. I was watching Sidekiq queues show up and then the whole page went blank, which doesn't normally happen. I went to the main page and I was logged out but then when I refreshed it was fine.

However I'm seeing issues with 2fa codes no longer working. Hardware token is fine but code generators are not working.

@WhiskeyOmega
Copy link

Also confirming 2FA is not working.
I had/have to go into the DB for all users with 2FA and remove OTP manually

@mjankowski
Copy link
Contributor Author

The session was preserved. I was watching Sidekiq queues show up and then the whole page went blank, which doesn't normally happen. I went to the main page and I was logged out but then when I refreshed it was fine.

Thanks for confirming.

However I'm seeing issues with 2fa codes no longer working. Hardware token is fine but code generators are not working.

These are cases where the OTP was setup and working prior to update, and now post update even if initial password auth works fine, the otp verification step fails?

If so, is the request blowing up with error, or just normal failure as though they had incorrect code?

@WhiskeyOmega
Copy link

If so, is the request blowing up with error, or just normal failure as though they had incorrect code?

Initial password works
Incorrect code is then thrown up.

@ClearlyClaire
Copy link
Contributor

ClearlyClaire commented Jul 14, 2023

That is weird, I have no issue logging in using TOTP. I'm unsure what could be different between our setups. The gem used for generating/checking TOTP codes also does not have any Rails-related dependency. So my only guess is that something is amiss with attr_encrypted?

@ClearlyClaire
Copy link
Contributor

@vmstan @WhiskeyOmega if you haven't cleared TOTP parameters for your account yet, can you try a few things?

From Rails 7, in a Rails console (RAILS_ENV=production bundle exec rails c) check the output of Account.find_local('your_username').user.otp_secret. Is there any output? If there is any output, DO NOT SHARE THIS VALUE

If there is some output, can you run the same thing from a pre-Rails 7 copy of the code (do not start mastodon-web or mastodon-sidekiq with it though) and compare they are the same value?

If the first value was empty or the two did not match, can you share the output of Rails.configuration.x.otp_secret&.size in a Rails console?

@vmstan
Copy link
Sponsor Contributor

vmstan commented Jul 14, 2023

Ok so my bad. I was using the TOTP for a different app when I tested.

Sorry 🫢

@ClearlyClaire
Copy link
Contributor

One thing I missed with the cookie situation is that when doing zero-downtime migrations we may mix Rails 7 and Rails 6 workers, in which case it will lead to temporary hiccups with unreadable cookies… is there something we can do to make the transition smoother?

@mjankowski
Copy link
Contributor Author

One thing I missed with the cookie situation is that when doing zero-downtime migrations we may mix Rails 7 and Rails 6 workers, in which case it will lead to temporary hiccups with unreadable cookies… is there something we can do to make the transition smoother?

Hmm, right ... I assume this is around the scenario where you are doing a rolling deploy across multiple app servers and someone:

  • Makes a request with their old-format cookies
  • Req gets routed to an already restarted app server running 7.0 code, and cookie format gets rotated
  • Next req in gets routed to an app server NOT yet restarted in the rolling restart, and thus the upgraded format can't be read and they are signed out

There's probably a solution there at the IP/http/routing layer in a datacenter (ie, change config to track which connections have been routed to an updated app server, and keep them in the pool of updated servers for the duration of the rollout), but that's going to be specific to each setup and might reduce some of the benefit of the rolling deploy.

@ClearlyClaire
Copy link
Contributor

ClearlyClaire commented Jul 14, 2023

I'm wondering if we couldn't flip the thing around: write Rails 6.1 compatible cookies and support Rails 7 cookies, then in a later version switch that around to support both cookies but emit Rails 7 compatible cookies, before finally dropping Rails 6 cookies in a later release?

@renchap
Copy link
Sponsor Member

renchap commented Jul 15, 2023

I'm wondering if we couldn't flip the thing around: write Rails 6.1 compatible cookies and support Rails 7 cookies, then in a later version switch that around to support both cookies but emit Rails 7 compatible cookies, before finally dropping Rails 7 cookies in a later release?

This seems to be the sensible solution

@mjankowski
Copy link
Contributor Author

I'm wondering if we couldn't flip the thing around: write Rails 6.1 compatible cookies and support Rails 7 cookies, then in a later version switch that around to support both cookies but emit Rails 7 compatible cookies, before finally dropping Rails 7 cookies in a later release?

I think there might be a typo in the last part here? ("dropping rails 7 cookies" should be the SHA1 Rails 6 cookies?). That aside, I think what you are suggesting is something like:

  • Drop the cookie rotation code, and don't plan to rotate the existing cookies
  • Change the main/rails-7 branch to issue session cookies using a new name, so they dont collide
  • Change the session auth logic to something like "first look for cookie with new name/SHA256, if present attempt to auth with it; if not present, look for the old-name/SHA1 cookie, if present, attempt to auth with it". I'm assuming that "attempt to auth with it" is going to be possible/testable but havent confirmed this. Presumably its a matter of passing in an a digest strategy instead of relying on the default.
  • Let this code run during the initial upgrade rolling deploy

This would allow the workers still running the pre-7 code to keep working during the rolling deploy, because even if a browser had hit an upgraded worker in a prior request, the SHA1/old-name cookie would still be there unmodified. The updated workers would try to read the new name first and use it if present, but fall back on the old ones (but not change them).

In a future release, after some value of "enough time", we could drop the support for reading the old format on the assumption that frequent enough users would have gotten the new format in the new name by then.

I have not researched how straightforward either of:

  • The fallback/backup code would be? I'm assuming Rails makes it possible to drop in a different digester, but haven't checked
  • Whatever configuration is needed between devise and other auth code to configure the new cookie name, or whether having the two at once approach would be complex or not there.

@ClearlyClaire
Copy link
Contributor

ClearlyClaire commented Jul 17, 2023

I think there might be a typo in the last part here? ("dropping rails 7 cookies" should be the SHA1 Rails 6 cookies?).

Definitely a typo, I meant “dropping Rails 6 cookies”, sorry!

That aside, I think what you are suggesting is something like: […]

No, sorry, that's not what I was suggesting! I meant, if possible, continue to use SHA1 in the next release, but also support reading from SHA256 (which wouldn't be used in the next release). Then, in a later release, switch to SHA256, with a rotator supporting SHA1 cookies (switch to the current code). There does not need to be any significant delay between those two releases, as they could safely run concurrently. Then a later release would drop SHA1 cookies support altogether.

EDIT: See #26023

@mjankowski
Copy link
Contributor Author

I meant, if possible, continue to use SHA1 in the next release, but also support reading from SHA256 (which wouldn't be used in the next release).

I'll comment on #26023 to try to work through this more, but I didn't think that "continue to use SHA1 in next release" and "also read from SHA256" ... are actually possible at the same time?

My mental model was that there are sort of two discrete components:

  • What cookie digester is being used? This applies both to reading inbound cookies on each request, and also on writing/setting outbound on responses. In my head these were both keyed off of that key_digest_hash_generator_class setting, and could not be separate things from each other. Maybe that's not the right model and they can be different?
  • Is there a rotator in place? If so, that's going to apply to how inbound cookies are handled and whether they are actively modified on each request when they need to be, but it's a somewhat isolated issue from the digest generator class setting.

Will comment more on the PR.

@WhiskeyOmega
Copy link

Still having users mention their 2FA is broken so its not like it fixes it over time in changing from Rails 6 to 7 workers.
Was anything else found that could possibly be the issue for 2FA not working ?

@ClearlyClaire
Copy link
Contributor

ClearlyClaire commented Jul 20, 2023

Still having users mention their 2FA is broken so its not like it fixes it over time in changing from Rails 6 to 7 workers. Was anything else found that could possibly be the issue for 2FA not working ?

We haven't seen any other report. If you get the consent of one of your affected users, can you try the steps listed in #25668 (comment)?

Also, have you possibly made unrelated changes to your .env.production?

@WhiskeyOmega
Copy link

Still having users mention their 2FA is broken so its not like it fixes it over time in changing from Rails 6 to 7 workers. Was anything else found that could possibly be the issue for 2FA not working ?

We haven't seen any other report. If you get the consent of one of your affected users, can you try the steps listed in #25668 (comment)?

Also, have you possibly made unrelated changes to your .env.production?

I mean all my setups were brought upto date the night i brought the main one up to rails 7 so dont run a 6 code anymore but maybe there wont be any output. Ill try if i get another user but not many actually run it.
On .env.production I have the full text search scope and deepl translation apart from that nothing else out the norm.

@ClearlyClaire
Copy link
Contributor

On .env.production I have the full text search scope and deepl translation apart from that nothing else out the norm.

Is it possible that when recently editing it you changed OTP_SECRET?

@WhiskeyOmega
Copy link

On .env.production I have the full text search scope and deepl translation apart from that nothing else out the norm.

Is it possible that when recently editing it you changed OTP_SECRET?

Bah! This was it ! i accidently deleted .env.production last week and copied it over from an already bad paste so the code was shorter than it should have been. That and Secret_Key_Base. Should I change it back and just remove peoples 2fa so they can add them back ?

@ClearlyClaire
Copy link
Contributor

Bah! This was it ! i accidently deleted .env.production last week and copied it over from an already bad paste so the code was shorter than it should have been. That and Secret_Key_Base. Should I change it back and just remove peoples 2fa so they can add them back ?

That's a tricky question to answer. I'd say you should change it back, but people who enabled or changed it in the meantime will have to reset it again

@WhiskeyOmega
Copy link

I dont think there were many to be honest. So ive disabled it on the ones i took off last week and alot havent actually added it back anyway.

jsgoldstein pushed a commit to jsgoldstein/mastodon that referenced this pull request Jul 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file ruby Pull requests that update Ruby code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants