Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Permit disabling fail safe #94

Closed
wants to merge 5 commits into from
Closed

Conversation

lewispb
Copy link
Contributor

@lewispb lewispb commented Jul 11, 2022

Makes the fail safe mechanism optional. When fail safe is disabled, errors will be raised as per usual.

This covers scenarios where the global fail safe is undesirable.

Usage

config.kredis.fail_safe = false

Makes the fail safe mechanism optional. When fail safe is disabled,
errors will be raised as per usual.

Add line to Readme about fail safe config

Remove unuused test case

Simplify failsafe proxying and supression

Refactor fail safe to express intentions more clearly
Copy link
Collaborator

@jorgemanrubia jorgemanrubia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice @lewispb!

test/proxy_test.rb Outdated Show resolved Hide resolved
lib/kredis/railtie.rb Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
lib/kredis/types/proxy/failsafe.rb Outdated Show resolved Hide resolved
Copy link
Member

@jeremy jeremy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuing reflection on the role of a proxy/connection-level fail-safe…

At the center of this is the blanket fail-safe in Kredis::Types::Proxy#method_missing. That applies to all proxied methods, which means all Redis commands that each type invokes.

Do all our data structures appropriately anticipate that the Redis commands they invoke could silently do nothing? On an initial perusal, they do not, and the consequences range from incorrect behavior to data corruption.

Furthermore, hiding errors means that other error handlers (like circuit breakers and exception reporters) never have the chance to see that something's going wrong.

I think it's worth upping the scrutiny here and probably removing the current failure-safety implementation.

Fail-safe behavior should generally be a characteristic of the data types, not the Redis connection, and it needs to be consistently applied. Slots#reserve and #available? do this, for example, yet Slots#reset does not.

Furthermore, in the context of data types (rather than the proxy/Redis connection) this looks more like an "offline mode" than low-level failure-safety, which suggests a simple design path: when fail-open/closed-when-offline behavior is warranted, implement it by composing or subclassing the basic Kredis type and wrapping it with appropriate exception handling.

So, for the purposes of this PR, I think we should park it 🥺 and not introduce new global config since there really isn't coherent, toggleable failure-safety behavior to expose. Doing so would ultimately complicate efforts to rectify this situation since we'd need to ditch the config afterward.

@@ -14,6 +14,10 @@ class Kredis::Railtie < ::Rails::Railtie

initializer "kredis.configuration" do
Kredis::Connections.connector = config.kredis.connector || ->(config) { Redis.new(config) }

unless config.kredis.fail_safe.nil? do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing do


You can configure how the redis client is created by setting `config.kredis.connector` in your `application.rb`:

```ruby
config.kredis.connector = ->(config) { SomeRedisProxy.new(config) }
```

The fail-safe mechanism supports silently rescuing or returning a default value in the event that the Redis client returns an error (e.g. Redis is down). You can disable the default fail-safe mechanism:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering what kind of guidance to share with users here and realizing this is our first mention of fail-safe behavior—and of error handling at all! 😆

Which leads to some broader reflection that a global, config-level flag that changes internal behavior is hard to reason about or explain.

Continuing in review comment…

@lewispb
Copy link
Contributor Author

lewispb commented Aug 16, 2022

Great assessment, thank you @jeremy. You're right. Let's close this for now and instead move to rethink the fail-safe behaviour.

@lewispb lewispb closed this Aug 16, 2022
@maxim
Copy link

maxim commented Feb 9, 2024

@jeremy I agree as well that most error handling of this kind is ideally decided by the caller, but on the other side, most people just don't want their apps to crash because, say, a limiter is not working.

That said, I think most people would want to find out quickly that Redis is not working, for which normally they'd use some logger or notifier. We probably don't want to encourage subclassing and overriding the failsafe every time someone wants to add a notifier. It would make future changes/refactorings of kredis likely to break codebases.

Which leads me to think that maybe the right approach is for some types to accept an optional callback that passes in the error. In the callback we can send a notice. This gives people a stable api, allowing internals to change freely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants