
Unable to change prediction log format #1464

Closed
gregd33 opened this issue Feb 19, 2021 · 10 comments
Labels
bug Something isn't working

Comments

@gregd33
Contributor

gregd33 commented Feb 19, 2021

Describe the bug

I've tried various methods to set prediction_log_json_format:

  • via an environment variable
  • via a custom cfg file
  • via bentoml config set
  • in the service itself with bentoml.config()

Some of them seem to work, at least judging by bentoml config view-effective, but none actually produce the expected log format.

To Reproduce

  • bentoml config set logging.prediction_log_json_format="%%(service_name)s" OR in the service itself via bentoml.config().set('logging', 'prediction_log_json_format', "%%(service_name)s")
  • bentoml config view-effective should show that it is there
  • run bentoml serve-gunicorn --workers 1 MyService:latest --debug
  • send something through it (I used Swagger for my testing) and you'll see much more verbose logs (the full result, etc.)

It's worth mentioning that when you run in debug mode and the config is printed, it doesn't show the one I set (default or otherwise).

Expected behavior

Logs should be in the format I specify.

Screenshots/Logs

Environment:

  • OS: CentOS7
  • Python Version: 3.7.9
  • BentoML Version: 0.11.0

Additional context

The service itself is using ImageInput and JsonOutput, if that matters.

I've confirmed that in https://github.com/bentoml/BentoML/blob/4c825fd359267ee333925514e144ef1b39e5f47e/bentoml/utils/log.py#L32 it sees the right format.

gregd33 added the bug label Feb 19, 2021
@parano
Member

parano commented Feb 23, 2021

Hi @gregd33 - have you had a chance to try out the new logging config as documented here? https://docs.bentoml.org/en/latest/guides/logging.html

@gregd33
Contributor Author

gregd33 commented Feb 24, 2021

I just gave that a try. I ran into an issue with {BENTOML_HOME}/logging.yml - $BENTOML_HOME doesn't seem to be defined in the container... but I tried with /bento/logging.yml.

With that, same behaviour: it didn't work.

@gregd33
Contributor Author

gregd33 commented Feb 24, 2021

Just to be sure it wasn't me, I built the IrisClassifier in https://github.com/bentoml/BentoML/blob/master/guides/quick-start/ and ran bentoml containerize on it.

Then ran
docker run -p 5000:5000 -v /home/xxxx/logging.yml:/bento/logging.yml iris-classifier:20210224234335_3AF8FD --workers 1

where the logging file was the default with one line changed:

[...]
formatters:
    [...]
    prediction:
        (): pythonjsonlogger.jsonlogger.JsonFormatter
        #fmt: '%(service_name)s %(service_version)s %(api)s %(request_id)s %(task)s %(result)s %(asctime)s'
        fmt: '%(service_name)s'
[...]

I got this in the logs:

[2021-02-24 23:47:14,101] INFO - {'service_name': 'IrisClassifier', 'service_version': '20210224234335_3AF8FD', 'api': 'predict', 'task': {'data': '0,1,2,3\n5.1,3.5,1.4,0.2\n4.9,3.0,1.4,0.2\n4.7,3.2,1.3,0.2\n4.6,3.1,1.5,0.2\n5.0,3.6,1.4,0.2\n5.4,3.9,1.7,0.4\n4.6,3.4,1.4,0.3\n5.0,3.4,1.5,0.2\n4.4,2.9,1.4,0.2\n4.9,3.1,1.5,0.1\n5.4,3.7,1.5,0.2\n4.8,3.4,1.6,0.2\n4.8,3.0,1.4,0.1\n4.3,3.0,1.1,0.1\n5.8,4.0,1.2,0.2\n5.7,4.4,1.5,0.4\n5.4,3.9,1.3,0.4\n5.1,3.5,1.4,0.3\n5.7,3.8,1.7,0.3\n5.1,3.8,1.5,0.3\n5.4,3.4,1.7,0.2\n5.1,3.7,1.5,0.4\n4.6,3.6,1.0,0.2', 'task_id': '97fddc43-b7a5-4196-b580-83e3930055cc', 'batch': 23, 'http_headers': (('Host', 'localhost:5000'), ('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0'), ('Accept', '*/*'), ('Accept-Language', 'en-CA,en-US;q=0.7,en;q=0.3'), ('Accept-Encoding', 'gzip, deflate'), ('Referer', 'http://localhost:5000/'), ('Content-Type', 'text/csv'), ('Origin', 'http://localhost:5000'), ('Content-Length', '375'), ('Connection', 'keep-alive'))}, 'result': {'data': '[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]', 'http_status': 200, 'http_headers': (('Content-Type', 'application/json'),)}, 'request_id': '97fddc43-b7a5-4196-b580-83e3930055cc'}

But I would expect only service_name.

So this should be fully reproducible by anyone.
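
For reference, a roughly equivalent request from Python, reconstructed from the http_headers in the log above (I sent mine through the Swagger UI; payload shortened to two rows of the iris CSV):

import requests

# Same endpoint and content type as the logged request above.
resp = requests.post(
    'http://localhost:5000/predict',
    headers={'Content-Type': 'text/csv'},
    data='0,1,2,3\n5.1,3.5,1.4,0.2\n4.9,3.0,1.4,0.2',
)
print(resp.status_code, resp.text)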

@yubozhao
Contributor

@ssheng Any ideas?

@ssheng
Collaborator

ssheng commented Feb 26, 2021

Thanks for reporting, @gregd33. I was able to reproduce the case and confirm that the format string got configured correctly. However, pythonjsonlogger.jsonlogger.JsonFormatter doesn't seem to be respecting the format string; instead, it just outputs the entire dict passed to log.info(). I am going to troubleshoot more tomorrow.
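
Here is a minimal standalone repro of that behavior outside of BentoML (logger name and fields are just for illustration):

import logging
from pythonjsonlogger import jsonlogger

# The format string only names service_name...
handler = logging.StreamHandler()
handler.setFormatter(jsonlogger.JsonFormatter('%(service_name)s'))
logger = logging.getLogger('demo')
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# ...but any key passed via `extra` is still emitted, because JsonFormatter
# merges every non-reserved record attribute into the JSON output:
logger.info('prediction', extra={'service_name': 'IrisClassifier',
                                 'result': {'data': '[0, 0, 0]'}})
# prints: {"service_name": "IrisClassifier", "result": {"data": "[0, 0, 0]"}}
# i.e. 'result' shows up even though the format string never mentions it.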

@ssheng
Collaborator

ssheng commented Mar 1, 2021

Thanks for your patience, @gregd33. It turned out that the API of pythonjsonlogger.jsonlogger is rather unintuitive: simply removing the attributes we wish to exclude from the format string is insufficient. The attributes also need to be added to reserved_attrs as a tuple. After the fix in #1478, you can try one of the following approaches, as you mentioned in your earlier messages.

Python:

bentoml.config().set('logging', 'prediction_log_json_format', "%%(service_name)s")
bentoml.config().set('logging', 'prediction_log_json_reserved', "service_version,api,request_id,task,result")
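
(Note the doubled %% in the format string: BentoML's cfg-based config appears to go through Python's ConfigParser, where a literal % has to be escaped as %%.)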

Through https://docs.bentoml.org/en/latest/guides/logging.html:

[...]
formatters:
    [...]
    prediction:
        (): pythonjsonlogger.jsonlogger.JsonFormatter
        #fmt: '%(service_name)s %(service_version)s %(api)s %(request_id)s %(task)s %(result)s %(asctime)s'
        fmt: '%(service_name)s'
        reserved_attrs: ('service_version', 'api', 'request_id', 'task', 'result', 'args', 'asctime', 'created', 'exc_info', 'exc_text', 'filename', 'funcName', 'levelname', 'levelno', 'lineno', 'module', 'msecs', 'message', 'msg', 'name', 'pathname', 'process', 'processName', 'relativeCreated', 'stack_info', 'thread', 'threadName')
[...]
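
If you'd rather not enumerate the standard LogRecord attributes by hand, pythonjsonlogger exposes them as a constant. A sketch of building the same formatter in Python:

from pythonjsonlogger import jsonlogger

# RESERVED_ATTRS already contains the standard LogRecord attributes, so only
# the BentoML-specific fields to hide need to be appended:
formatter = jsonlogger.JsonFormatter(
    '%(service_name)s',
    reserved_attrs=jsonlogger.RESERVED_ATTRS
    + ('service_version', 'api', 'request_id', 'task', 'result'),
)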

@gregd33
Contributor Author

gregd33 commented Mar 2, 2021

Awesome, thanks so much! Would never have figured that out.

@ssheng
Collaborator

ssheng commented Mar 2, 2021

@gregd33, could you please share your use case for customizing the prediction log? I had a chance to discuss #1478 with the other contributors of BentoML. We think the solution would introduce some unwanted complexities. So we'd like to understand your use case more to make the right tradeoffs.

@gregd33
Contributor Author

gregd33 commented Mar 2, 2021

The primary reason for raising this was that it wasn't working as expected, so I wanted to ensure I wasn't doing something wrong and flag it as a bug. My use case was more about seeing whether it was possible than a firm need for it. That said, the two reasons I might have:

  1. Volume. The outputs of the service can be verbose, so omitting them could greatly reduce log volume (from what I've seen in my services, the result can be 90% of the logs). The verbosity also makes it harder to actually read the logs when debugging.

  2. Sensitivity. If sensitive data is being processed, the actual predictions may themselves be sensitive (since they are the output on that data) while the rest of the logging is not. That could affect where and how the logs can be stored.

It is not necessarily a crucial issue, especially if it introduces a lot of complexity to the code. I'm fine if it is not fixed (other than updating the documentation to reflect that :) ).

Alternatively, a simpler, narrower approach might be to make outputting the result configurable:

result=result.to_json(),

Omitting that line when some config argument is present would make the change much easier (see the sketch below). But then it's a config change for one specific thing as opposed to a generic one, so that can be a bad thing too.
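
To be concrete, a rough sketch of what I mean. The log_prediction_results config key is invented here, and I'm assuming BentoML's config(section) accessor returns a ConfigParser section; this is just to illustrate the idea, not an existing option:

from bentoml import config  # assumption: the 0.x config(section) accessor

def log_prediction(prediction_logger, service_name, api_name,
                   request_id, task, result):
    entry = dict(
        service_name=service_name,
        api=api_name,
        request_id=request_id,
        task=task,
    )
    # 'log_prediction_results' is a hypothetical key, not an existing option.
    if config('logging').getboolean('log_prediction_results', fallback=True):
        entry['result'] = result
    prediction_logger.info(entry)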

In short: not a big deal if it isn't fixed.

@ssheng
Collaborator

ssheng commented Mar 3, 2021

Thanks for the feedback. I agree with the two reasons you shared. Given the urgency, shipping the solution in #1478 as a workaround is not ideal. We will take a step back and think about the logging requirements as a whole.
