Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddtrace client error during HTTP request #999

Open
Supy opened this issue Apr 9, 2020 · 6 comments
Open

ddtrace client error during HTTP request #999

Supy opened this issue Apr 9, 2020 · 6 comments
Assignees
Labels
bug Involves a bug community Was opened by a community member core Involves Datadog core libraries
Projects

Comments

@Supy
Copy link

Supy commented Apr 9, 2020

Apologies for the vague title. Unfortunately the error does not yield much more information. After upgrading from ddtrace (0.33) to ddtrace (0.34.1), the following errors have started popping up in our logs:

Internal error during HTTP transport request. Cause: 784: unexpected token at 'msgp: attempted to decode type "map" with method for "str"
' Location: /usr/local/bundle/gems/json-1.8.6/lib/json/common.rb:155:in `parse'

The location of the log line appears to be ddtrace/transport/http/client.rb#46. We log the trace IDs with each log line and in these logs, the trace_id is 0, so I gather from that that there is no active correlation at the point where this is logged.

@delner
Copy link
Contributor

delner commented Apr 9, 2020

This looks like a MsgPack encoding error to me; if I had to guess, it looks like it was attempting to encode something that could not be converted to string (probably a Hash key?)

This shouldn't happen under normal operation, and we didn't make any changes to transport/encoding or add new tags in the last release that I can think of right now. It'd be helpful if you can share your ddtrace configuration file, Gemfile.lock, and any custom modifications/instrumentation you may have implemented. Even better would be a code sample that replicates the problem.

If we can reproduce this, then we should be able to debug and address it.

@delner delner self-assigned this Apr 9, 2020
@delner delner added bug Involves a bug core Involves Datadog core libraries labels Apr 9, 2020
@Supy
Copy link
Author

Supy commented Apr 14, 2020

Unfortunately, as you can see from the error, it's not clear which line is generating the error. We have many locations of instrumentation in our code. We do implement a custom log formatter that is mostly just backwards compatibility with the ActiveSupport::TaggedLogging::Formatter, which looks as follows:

require 'logger'
require 'json'
require_relative './payload'

module Logging
  class DatadogFormatter < ::Logger::Formatter
    # This method is invoked when a log event occurs.
    def call(severity, timestamp, _progname, msg)
      correlation = Datadog.tracer.active_correlation
      details = case msg
                when Payload
                  message_hash_from_payload(msg, correlation)
                when Exception
                  message_hash_from_exception(msg, correlation)
                else
                  message_hash_from_object(msg, correlation)
                end

      details.tap do |hash|
        # If "level" is present in the message body, Datadog will override this attribute
        # with the value from the message.
        hash[:level] = severity
        hash[:timestamp] ||= timestamp.utc.iso8601(3)

        hash['dd.trace_id'] = correlation.trace_id if correlation.present?
      end

      JSON.dump(details) + "\n"
    rescue StandardError => e
      JSON.dump(
        level: 'ERROR',
        message: "[DATADOG_FORMATTER_ERROR] message='#{msg}'",
        DD_ERROR_KIND_ATTRIBUTE => e.class.name,
        DD_ERROR_MESSAGE_ATTRIBUTE => e.message,
        DD_ERROR_STACK_ATTRIBUTE => e.backtrace.take(10).join("\n")
      ) + "\n"
    end

    def tagged(*tags, **attributes)
      new_tags = push_tags(*tags)
      new_attributes = add_attributes(**attributes)
      yield self
    ensure
      pop_tags(new_tags.size)
      remove_attributes(*new_attributes.keys)
    end

    def push_tags(*tags)
      tags.flatten.reject(&:blank?).tap do |new_tags|
        current_tags.concat new_tags
      end
    end

    def pop_tags(size = 1)
      current_tags.pop size
    end

    def clear_tags!
      current_tags.clear
    end

    def current_tags
      Thread.current[:datadogformatter_tagged_logging_tags] ||= []
    end

    def add_attributes(**attributes)
      attributes.compact.tap do |new_attributes|
        current_attributes.merge!(new_attributes)
      end
    end

    def remove_attributes(*attribute_keys)
      attribute_keys.each { |key| current_attributes.delete(key) }
    end

    def clear_attributes!
      current_tags.clear
    end

    def current_attributes
      Thread.current[:datadogformatter_tagged_logging_attributes] ||= {}
    end

    private

    MAX_BACKTRACE_LINES = 10

    def message_hash_from_payload(payload, correlation)
      {}.tap do |hash|
        exception = payload.exception
        tags = current_tags + (payload.tags || [])
        attributes = current_attributes.merge(payload.attributes)
        if exception.present?
          if exception.is_a?(Exception)
            attributes.merge!(
              error_kind: exception.class.name,
              error_message: exception.message
            )
          else
            attributes.merge!(
              error_message: exception.to_s
            )
          end
        end

        message = "trace_id=#{correlation&.trace_id} " + build_message(tags, attributes)
        hash[:message] = message

        if exception.present?
          if exception.is_a?(Exception)
            hash[DD_ERROR_KIND_ATTRIBUTE] = exception.class.name
            hash[DD_ERROR_MESSAGE_ATTRIBUTE] = exception.message
            hash[DD_ERROR_STACK_ATTRIBUTE] = format_backtrace(exception.backtrace)
          else
            hash[DD_ERROR_MESSAGE_ATTRIBUTE] = exception.to_s
          end
        end
      end
    end

    def message_hash_from_exception(exception, correlation)
      message = "trace_id=#{correlation&.trace_id} "
      attributes = current_attributes.merge(
        error_kind: exception.class.name,
        error_message: exception.message
      )
      message += build_message(current_tags, attributes)

      {
        message: message,
        DD_ERROR_KIND_ATTRIBUTE => exception.class.name,
        DD_ERROR_MESSAGE_ATTRIBUTE => exception.message,
        DD_ERROR_STACK_ATTRIBUTE => format_backtrace(exception.backtrace)
      }
    end

    def message_hash_from_object(msg, correlation)
      message = "trace_id=#{correlation&.trace_id} "
      tags_and_params = build_message(current_tags, current_attributes)
      message += tags_and_params + ' ' unless tags_and_params.blank?
      message += msg.to_s

      { message: message }
    end

    def build_message(tags, params)
      formatted_tags = tags.map { |tag| format_tag(tag) }
      joined_params = params.map { |args| format_param(args[0], args[1]) }.join(' ')

      if formatted_tags.empty?
        joined_params
      else
        ("tags=[#{formatted_tags.join(' ')}] " + joined_params).strip
      end
    end

    def format_tag(tag)
      tag.to_s.gsub(/[^\w]/i, '').upcase
    end

    def format_param(key, value)
      key = key.to_s.gsub(/[^\w]/i, '').downcase
      value = value.is_a?(Numeric) ? value : value.to_s.gsub(/['"]/, '')

      "#{key}='#{value}'"
    end

    def format_backtrace(backtrace)
      backtrace.take(MAX_BACKTRACE_LINES).join("\n")
    end
  end
end

Ultimately, though, it always returns a JSON string, so I doubt this would be the source of the issue.

@marcotc
Copy link
Member

marcotc commented Oct 7, 2020

👋 @Supy, I just wanted to touch base and ask if you still see the same issue in your environment. I understand changes might have been deployed to your environment since then, and many released of ddtrace have been released, so I want to make sure this issue still persists.

@Supy
Copy link
Author

Supy commented Oct 7, 2020

@marcotc - appreciate the ping. I will need to check the state of things to answer your question. I think in the end we resorted to excluding those errors from our log index as an interim solution after not being able to reach a root cause. Perhaps now is a good time to bump our ddtrace gem version and remove the exclusion to see if the issue has been passively resolved. Will post an update in the next week or two.

@Supy
Copy link
Author

Supy commented Oct 14, 2020

@marcotc some feedback: bumped our gem version from 0.34.1 to 0.41.0 and live-tailed the logs to bypass the index exclusion policy. Within a few seconds I received the error logs, so it looks like the issue still persists.

@delner delner added this to Needs triage in Planning via automation Apr 15, 2021
@delner delner moved this from Needs triage to Medium Term Priority in Planning Apr 19, 2021
@delner delner added the community Was opened by a community member label Apr 21, 2021
@pkopac
Copy link

pkopac commented May 4, 2021

We've been experiencing a very similar problem lately, with 180k occurrences a day. We use the latest gem version and the latest agent (v7.27.0).

[ddtrace] (***/bundle/ruby/2.7.0/gems/ddtrace-0.48.0/lib/ddtrace/transport/http/client.rb:35:in `rescue in send_request') Internal error during HTTP transport request. Cause: Empty input at line 1, column 1 [parse.c:926] in 'msgp: attempted to decode type "array" with method for "str" at 0/4/Service

Any way we could intercept what's being sent to the tracer, so we could see what's actually failing to parse?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Involves a bug community Was opened by a community member core Involves Datadog core libraries
Projects
Planning
Medium Term Priority
Development

No branches or pull requests

4 participants