
Prefer a simple counter-based lock. #47257

Closed
wants to merge 2 commits into from

Conversation

@ioquatix
Contributor

ioquatix commented Feb 5, 2023

This allows acquire_read_lock and release_read_lock to be called from different fibers/threads (which can be the case for falcon).

See 9a4c1e2 which introduced the current implementation.

Motivation / Background

This is causing problems as outlined in detail here: socketry/falcon#166
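
For illustration only (this is not the code in this PR), a counter-based read/write lock might look roughly like the following: the state is plain counters guarded by one monitor, so acquiring and releasing are not tied to any particular thread or fiber.

```ruby
require "monitor"

# Illustrative counter-based lock (not the PR's implementation): because the
# state is just counters behind a Monitor, acquire_read_lock and
# release_read_lock may be called from different threads or fibers.
class CounterLock
  def initialize
    @monitor = Monitor.new
    @cond    = @monitor.new_cond
    @readers = 0
    @writing = false
  end

  def acquire_read_lock
    @monitor.synchronize do
      @cond.wait_while { @writing }
      @readers += 1
    end
  end

  def release_read_lock
    @monitor.synchronize do
      @readers -= 1
      @cond.broadcast if @readers.zero?
    end
  end

  def with_write_lock
    @monitor.synchronize do
      @cond.wait_while { @writing || @readers > 0 }
      @writing = true
    end
    begin
      yield
    ensure
      @monitor.synchronize do
        @writing = false
        @cond.broadcast
      end
    end
  end
end
```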

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one change. Changes that are unrelated should be opened in separate PRs.
  • Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • Tests are added or updated if you fix a bug or add a feature.

@@ -2,9 +2,61 @@

module ActionView
  class CacheExpiry
    class ExecutionLock
Contributor

Wouldn't it be possible to contribute this kind of lock to concurrent-ruby itself? 🤔

Contributor Author

I don't have a strong opinion about it, but I don't think it's general enough to be worth upstreaming to concurrent-ruby. That being said, we could ask? It's quite specific to the use case of the cache layer.

This allows `acquire_read_lock` and `release_read_lock` to be called from
different fibers/threads (which can be the case for falcon).
@ioquatix
Contributor Author

ioquatix commented Feb 6, 2023

This PR specifically includes tests for the faulty behaviour in the linked issue, so it definitely fixes that specific issue and can prevent regressions. Additionally, I wrote a few more tests to confirm the correct behaviour of the execution lock.

@ioquatix requested a review from simi February 6, 2023 03:02

execution_lock.with_write_lock do
  assert_equal 1, execution_lock.write_count
  assert_equal 1, execution_lock.read_count
Contributor

Shouldn't the read_count always be 0 inside the with_write_lock block?

Contributor Author

No, because you can upgrade a read lock to a write lock (the reentrant part of the original design).

Contributor

Right, holding the write lock is considered holding both, and here that's tracked explicitly in both counts.
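
For reference, concurrent-ruby's Concurrent::ReentrantReadWriteLock allows exactly this kind of upgrade (shown purely as an illustration of the pattern, not as the lock used in this PR):

```ruby
require "concurrent"

lock = Concurrent::ReentrantReadWriteLock.new

lock.with_read_lock do
  # Upgrade: a thread already holding a read lock may acquire the write lock.
  lock.with_write_lock do
    # exclusive section; the read lock is still held around it
  end
end
```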

end

# Wait for both the threads to be waiting for the lock:
Thread.pass until reader.status == "sleep" && writer.status == "sleep"
Contributor

Why would the reader ever wait/sleep? It can acquire the read lock straight away and run the rest of its body immediately, no?

Contributor Author

Ah, you are right, I should ensure the write lock has been acquired first.
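
One way to enforce that ordering (a sketch only, not necessarily what the follow-up commit does; `execution_lock` is the lock instance from the test above, and the queue is just for signalling):

```ruby
signal = Queue.new

writer = Thread.new do
  execution_lock.with_write_lock { signal.pop }   # hold the write lock until signalled
end

# Start the reader only once the writer actually owns the lock:
Thread.pass until execution_lock.write_count == 1

reader = Thread.new do
  execution_lock.acquire_read_lock                # guaranteed to block behind the writer
  execution_lock.release_read_lock
end

Thread.pass until reader.status == "sleep"        # now the reader is the one waiting
signal << :done
[writer, reader].each(&:join)
```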

@eregon
Contributor

eregon commented Feb 6, 2023

Which test would fail with the previous lock and falcon? test_invoke_cache_expiry_on_different_fibers?

@matthewd
Member

matthewd commented Feb 6, 2023

The lack of safety checks here makes me nervous (e.g. release_read_lock when no such lock is held). While a basic "make sure it can't go negative" would be a slight improvement, it seems generally bad to not ensure we are releasing the same locks we're acquiring.

This also feels more like a symptom of a more general possible concern. Is there something we should be doing to ensure both calls occur in the same execution context? Beyond any locks like this, it seems like it would also break thread/fiber variables, and thus IsolatedExecutionState and all of its dependents.
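
As a concrete example of the thread/fiber variable problem (Ruby's Thread#[] storage is fiber-local, and anything keyed on Fiber.current behaves the same way):

```ruby
# A value set while running on one fiber is invisible once execution moves to
# another fiber, which is why per-request state must stay on a single fiber.
Fiber.new do
  Thread.current[:request_context] = "set during app.call"
end.resume

Thread.current[:request_context]  # => nil back on the original fiber
```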

@eregon
Contributor

eregon commented Feb 6, 2023

Indeed, I also wonder about the need to change this.
Maybe ActionView::CacheExpiry::Executor's run and complete should be a single method taking a block, to ensure the complete part is done on the same Fiber?
What's the need to call them on different Fibers/Threads?
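
Roughly, the suggested shape would be something like this (hypothetical names, not an existing Rails API):

```ruby
class CacheExpiryExecutor
  def run
    # stand-in for the real before-request work (e.g. acquiring the read lock)
  end

  def complete
    # stand-in for the real after-request work (e.g. releasing the read lock)
  end

  # A single block-taking entry point guarantees run and complete happen on
  # the same Fiber, whatever the server does with the response afterwards.
  def wrap
    run
    begin
      yield
    ensure
      complete
    end
  end
end

CacheExpiryExecutor.new.wrap { "handle the request here" }
```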

@ioquatix
Contributor Author

ioquatix commented Feb 6, 2023

> The lack of safety checks here makes me nervous (e.g. release_read_lock when no such lock is held). While a basic "make sure it can't go negative" would be a slight improvement, it seems generally bad to not ensure we are releasing the same locks we're acquiring.

I agree, but this is the design of Rails' Executor mechanism (which I assume we would prefer not to change). In addition, this code path is only used in development environments, so the risk is low.

> This also feels more like a symptom of a more general possible concern. Is there something we should be doing to ensure both calls occur in the same execution context? Beyond any locks like this, it seems like it would also break thread/fiber variables, and thus IsolatedExecutionState and all of its dependents.

Rack makes no guarantee that app.call(env) and response.close will occur on the same thread or fiber. Falcon may choose to stream the response body in a separate fiber if it makes sense. I don't know if this breaks other assumptions in Rails, but all the tests are passing, and multiple things break (in the Rails test suite) if you don't get this behaviour right. We've also had positive reports from users, and my own testing confirms this fix.

> Maybe ActionView::CacheExpiry::Executor's run and complete should be a single method taking a block, to ensure the complete part is done on the same Fiber?

It's impossible to change this due to the way it's used and designed.

https://github.com/rails/rails/blob/4208cf2234988969a485a21233e932a8cb120ccc/actionpack/lib/action_dispatch/middleware/executor.rb
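
Paraphrasing the pattern in that middleware (a simplified sketch, not the exact Rails code; `run!`/`complete!` stand in for the executor callbacks): the executor is run when the request starts, but completion is deferred until the Rack body is closed, which a server may do from another thread or fiber.

```ruby
require "rack/body_proxy"

class ExecutorMiddleware
  def initialize(app, executor)
    @app = app
    @executor = executor
  end

  def call(env)
    state = @executor.run!                       # acquired at the start of the request
    status, headers, body = @app.call(env)
    # Released only when the body is closed, possibly on another thread/fiber:
    [status, headers, Rack::BodyProxy.new(body) { state.complete! }]
  end
end
```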

Personally, I don't think the executor model should be taking long-lived locks (it's a bad design IMHO), but I also understand what it's trying to achieve.

The long-lived lock can cause problems in many ways... designing the caching layer to depend on it seems like a mistake to me. Rather than reloading before each request, why not simply skip caching when it isn't requested, for example? There is no other use case like this in Rails, so if we could change this behaviour it would be vastly simpler.

> What's the need to call them on different Fibers/Threads?

Sometimes async-http needs to move a response body into a separate fiber, e.g. to turn a response body which only responds to #each into one that is streaming, a little bit similar to how Enumerator sometimes creates an internal Fiber.
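
For comparison, the Enumerator trick looks like this (illustrative snippet, not async-http code): an #each-only producer becomes pull-based because Enumerator#next suspends it on an internal Fiber.

```ruby
each_only_body = ["<p>chunk one</p>", "<p>chunk two</p>"]

enum = Enumerator.new do |yielder|
  each_only_body.each { |chunk| yielder << chunk }
end

enum.next  # => "<p>chunk one</p>"  (producer is suspended on an internal Fiber)
enum.next  # => "<p>chunk two</p>"
```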

- Ensure test writer thread is waiting before reader.
@jhawthorn
Member

jhawthorn commented Feb 8, 2023

I think we should not do this. This reloader really should not have its own lock in the first place (it's not different enough from the other things we need to reload). I'll open a PR shortly to move us to reusing the same reloader infrastructure as elsewhere.

It's possible that will work around the immediate error, but I agree with the concerns about other issues. In general, I think everything expects the request to remain on the same thread/fiber (DB connection, CurrentAttributes, much more); it may just be that this is the most obvious way in which it fails.

@ioquatix
Contributor Author

ioquatix commented Feb 9, 2023

Thanks for the review. I'll close this PR. Can you please link me to your proposed fix? Thanks!

@ioquatix closed this Feb 9, 2023
@eregon
Contributor

eregon commented Feb 9, 2023

> Rack makes no guarantee that app.call(env) and response.close will occur on the same thread or fiber. Falcon may choose to stream the response body in a separate fiber if it makes sense.

I think it may be more acceptable to write/stream the response in a separate fiber, but in terms of structured concurrency and safe handling of resources, it seems important/valuable that whatever "opens" the resource also closes it, on the same Fiber.
So you could have: F1 opens, F2 streams, F3 streams, F1 closes.
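
A rough sketch of that ownership rule, using plain Fibers as a stand-in (names purely illustrative):

```ruby
require "stringio"

io = StringIO.new("chunk one\nchunk two\n")   # F1 "opens" the resource

streamer = Fiber.new do                       # F2 only streams; it never owns the resource
  io.each_line { |line| $stdout.write(line) }
end

begin
  streamer.resume                             # F2 does the streaming
ensure
  io.close                                    # F1, the opener, closes what it opened
end
```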

@ioquatix
Contributor Author

> I think it may be more acceptable to write/stream the response in a separate fiber, but in terms of structured concurrency and safe handling of resources, it seems important/valuable that whatever "opens" the resource also closes it, on the same Fiber.
> So you could have: F1 opens, F2 streams, F3 streams, F1 closes.

Can you elaborate on why that's important?

In my case, passing the stream between different clients and servers (in the case of a proxy etc) is fairly useful. The server that's writing the response may not be the same as the one that generated it, etc.

@matthewd
Member

> Can you elaborate on why that's important?

That's this bit:

> it would also break thread/fiber variables, and thus IsolatedExecutionState and all of its dependents

@ioquatix deleted the execution-lock branch February 10, 2023 08:15
@eregon
Contributor

eregon commented Feb 10, 2023

Yeah, and I also meant it breaks structured concurrency; i.e., it's unstructured concurrency if the "parent" Fiber which "opens" the resource doesn't close it, and if the Fibers inside don't finish before that parent Fiber.
