Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP 400: data validation error: Problem validating JSON document against schema #295

Closed
jaap3 opened this issue Oct 2, 2018 · 11 comments
Closed

Comments

@jaap3
Copy link

jaap3 commented Oct 2, 2018

Sometimes when sending an event to the APM instance a HTTP 400 response is returned:

HTTP 400: data validation error: Problem validating JSON document against schema: 
I[#] S[#] doesn't validate with "error#" I[#/errors/0/log/message] 
S[#/properties/errors/items/properties/log/properties/message/type] 
expected string, but got array

I've also seen a similar, but subtly different message:

HTTP 400: data validation error: Problem validating JSON document against schema: 
I[#] S[#] doesn't validate with "error#" I[#/errors/0/log/param_message] 
S[#/properties/errors/items/properties/log/properties/param_message/type] 
expected string or null, but got array

This is then followed by a log message:

Failed to submit message: '<no message value>'

I'm using the latest elastic-apm client available from PyPI (3.0.1) with Python 3.6.5 and the APM server is using version 6.4.1.

(This seems similar to #135, but not quite the same).

@beniwohli
Copy link
Contributor

Hi @jaap3, thanks for opening the issue! Do you use capture_message manually somewhere, or have you set up our logging handlers?

@jaap3
Copy link
Author

jaap3 commented Oct 3, 2018

Good question, I might have omitted some relevant information.

I've configured the client in a Django project (version 2.1.2) and pretty much use "all the things" (no manual calls to the client however), in short I've:

  • appended 'elasticapm.contrib.django' to INSTALLED_APPS:
    INSTALLED_APPS.append('elasticapm.contrib.django')
  • inserted the TracingMiddleware (at the top):
    MIDDLEWARE = ['elasticapm.contrib.django.middleware.TracingMiddleware'] + MIDDLEWARE
  • Configured the client:
    ELASTIC_APM = {'SERVICE_NAME': APM_SERVICE_NAME, 'SECRET_TOKEN': APM_TOKEN, 'SERVER_URL': 'https://apm.example.com'}
  • Setup logging (as a root handler):
    LOGGING['handlers']['elasticapm'] = {'level': 'WARNING', 'class': 'elasticapm.contrib.django.handlers.LoggingHandler'}
    LOGGING['loggers']['']['handlers'].append('elasticapm')

Note that the error happens rarely (in 2 days ~10 times). If I look at the APM data it all looks normal (as far as I can tell, this is my first time trying APM).

The reason I know this error is happening is because I'm also using Sentry to monitor this project (using the Raven client version 6.9.0).

For Sentry I got basically the same setup as APM, meaning 'raven.contrib.django.raven_compat' is added to INSTALLED_APPS and a logging handler is set up ('raven.handlers.logging.SentryHandler'). Not sure if any of this is relevant.

@jaap3
Copy link
Author

jaap3 commented Oct 31, 2018

Any update on this? I've now seen this error ~130 times and it's a reason for me to hold back a rollout of APM on other projects. It's unclear to me if this error is fatal, caused by a misconfiguration etc. I would really appreciate some insight.

@axw
Copy link
Member

axw commented Nov 1, 2018

I think there may be a bug in elasticapm.handlers.logging.LoggingHandler and/or elasticapm.Message.capture.

In the Message.capture, we set the message field like:

message = param_message["message"] % params if params else param_message["message"]

So if you logged a message with no parameters, then it would take the message as it is, without any interpolation. This code assumes that params_message["message"] is a string, which is not necessarily true. If the message event originates from LoggingHandler, then that value comes from the log record's msg attribute, which is not necessarily a string: https://docs.python.org/3/howto/logging.html#using-arbitrary-objects-as-messages

I'm not sure what the best fix is, but I think it might just a matter of changing LoggingHandler's call to client_capture like this:

diff --git a/elasticapm/handlers/logging.py b/elasticapm/handlers/logging.py
index b2dbd4d6..15b04436 100644
--- a/elasticapm/handlers/logging.py
+++ b/elasticapm/handlers/logging.py
@@ -130,7 +130,7 @@ class LoggingHandler(logging.Handler):
 
         return self.client.capture(
             "Message",
-            param_message={"message": record.msg, "params": record.args},
+            param_message={"message": str(record.msg), "params": record.args},
             stack=stack,
             custom=custom,
             exception=exception,

@jaap3
Copy link
Author

jaap3 commented Nov 1, 2018

A quick grep through the project didn't find any logging calls where the message param isn't a string. But who knows, maybe a library we're using is doing this.

The proposed fix looks fairly harmless (strdoes behave differently on Python 2, so that might be an issue), I could try it for a few days and see if the error still pops up. Is there a branch I can install?

@axw
Copy link
Member

axw commented Nov 1, 2018

@jaap3 I've just pushed a branch and proposed it as a PR. Branch: https://github.com/axw/apm-agent-python/tree/logging-msg-arbitrary-object. If you could test it out, that would be great.

@jaap3
Copy link
Author

jaap3 commented Nov 5, 2018

@axw After updating to your branch I haven't seen the error again. However, a new, never before seen error appeared (only once for now):

HTTP 400: data validation error: Problem validating JSON document against schema: 
I[#] S[#] doesn't validate with "error#" I[#/errors/0/context/custom] 
S[#/properties/errors/items/properties/context/properties/custom/additionalProperties]
additionalProperties "sentry.interfaces.Message" not allowed

This is obviously related to me having both Sentry and APM clients enabled and Sentry's client (raven) maybe doing something funky?

@axw
Copy link
Member

axw commented Nov 5, 2018

After updating to your branch I haven't seen the error again.

Great!

However, a new, never before seen error appeared (only once for now):

The error is due to how additional fields of log records are being recorded as "custom" context, and probably is caused by having both Sentry and APM as you suggest. The error arises because a log record contains a key ("sentry.interfaces.Message") with dots, which the schema disallows. So we'll need another fix for that.

@jaap3
Copy link
Author

jaap3 commented Nov 6, 2018

OK, should I open a new ticket for that? And should this be fixed on the client or server side?

@beniwohli
Copy link
Contributor

@jaap3 it's an agent issue. We probably need to apply some kind of "de-dotting" (aka replace the dots with something that doesn't have semantic meaning for Elasticsearch field names, e.g. a dash) for keys in the context dictionary.

I will merge @axw's branch with the fix for this issue (#295) some time today and release it to PyPI soon thereafter

@beniwohli
Copy link
Contributor

beniwohli commented Nov 7, 2018

@jaap3 version 3.0.3 of the agent is out with a fix for the logging issue.

I opened #317 to track the issue you reported in #295 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants