Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: resolve Dalli::Server deprecation in 3.0+ #1015

Merged
merged 2 commits into from
Nov 16, 2021
Merged

fix: resolve Dalli::Server deprecation in 3.0+ #1015

merged 2 commits into from
Nov 16, 2021

Conversation

tierra
Copy link
Contributor

@tierra tierra commented Nov 10, 2021

Dalli 3.0 has deprecated Dalli::Server in petergoldstein/dalli#760

I believe the fix is simply to use Dalli::Protocol::Binary instead, when Dalli >=3.0 is detected (while still using Dalli::Server in <3.0).

I've created a new Appraisal config to test this with Dalli 3.0, while still also testing 2.7.

Related: Dalli 3.0 Upgrade Guide

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Nov 10, 2021

CLA Signed

The committers are authorized under a signed CLA.

@tierra
Copy link
Contributor Author

tierra commented Nov 10, 2021

I am seeing these failures locally, but was hoping it was just bad local setup/configuration:

>> BUNDLE_GEMFILE=/home/runner/work/opentelemetry-ruby/opentelemetry-ruby/instrumentation/dalli/gemfiles/dalli_3.0.gemfile bundle exec rake test
Run options: --seed 64944

# Running:

...W, [2021-11-10T19:54:06.506858 #7395]  WARN -- : localhost:11211 failed (count: 0) RuntimeError: External timeout
W, [2021-11-10T19:54:06.609232 #7395]  WARN -- : localhost:11211 failed (count: 1) RuntimeError: External timeout
W, [2021-11-10T19:54:06.609384 #7395]  WARN -- : localhost:11211 is down
F...

Finished in 0.115292s, 60.7153 runs/s, 216.8402 assertions/s.

  1) Failure:
OpenTelemetry::Instrumentation::Dalli::Instrumentation::tracing#test_0006_after error [/home/runner/work/opentelemetry-ruby/opentelemetry-ruby/instrumentation/dalli/test/opentelemetry/instrumentation/dalli/instrumentation_test.rb:87]:
Expected: 1
  Actual: 2

7 runs, 25 assertions, 1 failures, 0 errors, 0 skips

I'm not really sure what's broken here. I've done this same change on a different instrumentation tool (not OTEL) that used the same approach to prepend Dalli::Server, and that worked flawlessly.

Is there something I'm missing related to configuring Memcached with the second appraisal bundle gemfile maybe?

@arielvalentin
Copy link
Contributor

@tierra Has something changed between versions? Does the Binary Protocol version have built in retries where the older versions do not?

@tierra
Copy link
Contributor Author

tierra commented Nov 11, 2021

petergoldstein/dalli#783 made a reconnect change that I think affects get_multi / get_multi_cas calls. Might be related.

It's possible the difference here is just in how the unit tests mock the connection failure behaviors.

@tierra
Copy link
Contributor Author

tierra commented Nov 11, 2021

As far as I understand it, Dalli::Server was always the "binary" protocol version, and it's been renamed (to explicitly state so) in 3.0 since Dalli 4.0 plans to add a new "meta/text" protocol version, and eventually switch to it.

I'm not entirely clear about any of this though.

@arielvalentin
Copy link
Contributor

@tierra
Copy link
Contributor Author

tierra commented Nov 11, 2021

It probably helps if I mention that I have an application in production that's already upgraded to Dalli 3.0, while also configured to use this instrumentation, and I've already verified that it's still properly recording spans for Dalli calls from this (it's just the addition of the deprecation warnings in logs that has brought me here).

So that's another reason I suspect that the test failures here are likely due to the way the unit tests here mock internal implementation behaviors (and are failing due to Dalli internal implementation changes in 3.0).

@arielvalentin
Copy link
Contributor

It probably helps if I mention that I have an application in production that's already upgraded to Dalli 3.0, while also configured to use this instrumentation, and I've already verified that it's still properly recording spans for Dalli calls from this (it's just the addition of the deprecation warnings in logs that has brought me here).

So that's another reason I suspect that the test failures here are likely due to the way the unit tests here mock internal implementation behaviors (and are failing due to Dalli internal implementation changes in 3.0).

It looks like I was not clear that I agreed with you. Since this tests is mocking the library itself and generating an error and retries where introduced, then what I believe is happening here is that the system under test generating multiple spans, one for each retry.

Based on the warning output from the test log it does indeed seem to be retrying:

...W, [2021-11-10T19:54:06.506858 #7395]  WARN -- : localhost:11211 failed (count: 0) RuntimeError: External timeout
W, [2021-11-10T19:54:06.609232 #7395]  WARN -- : localhost:11211 failed (count: 1) RuntimeError: External timeout
W, [2021-11-10T19:54:06.609384 #7395]  WARN -- : localhost:11211 is down

I think that means you will have to update the test to satisfy this scenario. That may mean changing the test to mock it in a different way or triggering the error without using a mock and updating the assertions.

Does that help clarify what I was alluding to in my previous comments?

@ericmustin
Copy link
Contributor

ericmustin commented Nov 15, 2021

Hello friends,

Took a quick look here and then took a longer look because I too was confused by this. Anyway I think what happens is that actually the underlying behavior of get_multi and Dalli::NetworkError has changed in a recent 3.0.x release, specifically this PR does legitimately update the behavior here to retry on network errors, petergoldstein/dalli#754. So, the test is failing because our test is too rigid. Rather than try to write multiple test cases depending on the Dalli Gem major/minor version, it's probably quicker to just mock using a different raised error, such as Dalli::DalliError which i've done in the diff below.

Anyway @tierra if you wouldn't mind adding that patch to your branch that should get all the tests working and we can get this merged. ty for the pr here as well!

diff --git a/instrumentation/dalli/test/opentelemetry/instrumentation/dalli/instrumentation_test.rb b/instrumentation/dalli/test/opentelemetry/instrumentation/dalli/instrumentation_test.rb
index 866a47d9..05f431de 100644
--- a/instrumentation/dalli/test/opentelemetry/instrumentation/dalli/instrumentation_test.rb
+++ b/instrumentation/dalli/test/opentelemetry/instrumentation/dalli/instrumentation_test.rb
@@ -80,7 +80,7 @@ describe OpenTelemetry::Instrumentation::Dalli::Instrumentation do
       dalli.set('foo', 'bar')
       exporter.reset
 
-      dalli.instance_variable_get(:@ring).servers.first.stub(:write, ->(_bytes) { raise Dalli::NetworkError }) do
+      dalli.instance_variable_get(:@ring).servers.first.stub(:write, ->(_bytes) { raise Dalli::DalliError }) do
         dalli.get_multi('foo', 'bar')
       end
 
@@ -93,8 +93,8 @@ describe OpenTelemetry::Instrumentation::Dalli::Instrumentation do
 
       span_event = span.events.first
       _(span_event.name).must_equal 'exception'
-      _(span_event.attributes['exception.type']).must_equal 'Dalli::NetworkError'
-      _(span_event.attributes['exception.message']).must_equal 'Dalli::NetworkError'
+      _(span_event.attributes['exception.type']).must_equal 'Dalli::DalliError'
+      _(span_event.attributes['exception.message']).must_equal 'Dalli::DalliError'
     end
 
     it 'omits db.statement' do
  • updated the diff, had it backward originally

@tierra
Copy link
Contributor Author

tierra commented Nov 15, 2021

Thank you @ericmustin! Your change does fix the tests in local when using Dalli 3.0. It also still works with 2.7 as well. I've updated the PR. I think it requires review to run PR builds, but should be good to go now.

@fbogsany fbogsany merged commit 9428b4e into open-telemetry:main Nov 16, 2021
@tierra tierra deleted the dalli-3.0-support branch November 25, 2021 03:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants