-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable logging for all components, remove print statements #80
Enable logging for all components, remove print statements #80
Conversation
I have next notes in mind
I have slightly adjusted version with mentioned stuff. Let me know what you think. |
Thank you for the excellent feedback! What I delivered was an evolution of odd behaviors I encountered - probably due to do the unnecessary. I agree with your suggestions and your branch looks good. Using a logger per module makes sense and I had recently discovered the hierarchical nature - similar to log4j. Since the logger(s) can be accessed by "hosted applications", they can adjust the formatting also as you suggest, so I would agree we can drop the env-based approach. Thanks. I'll revisit this first thing tomorrow and will update the title to WIP now. |
Hmm - looks like because there are handlers set for every logger access, we now get 5 entries for each log statement (this is produced using the submit code attached to #75):
I think we need to check if there are handlers defined and, if not, only create one then. I'll give this a shot. |
87b59d4
to
d1a89ab
Compare
@dimon222 - I've experimented with enabling formatting from the calling application and have greatly simplified the logger acquisition logic. The previous attempt had a couple issues.
If I add a method similar to the following to the hosting applications that use the api, I get the desired formatting (relative to the application)... def configure_yarn_api_client_logger():
logger = logging.getLogger("yarn_api_client")
logger.setLevel(logging.INFO)
if not logger.handlers:
my_app_log_format = "[%(levelname)1.1s %(asctime)s.%(msecs).03d %(name)s] %(message)s"
my_app_date_format = "%Y-%m-%d %H:%M:%S"
_log_handler = logging.StreamHandler()
_log_handler.setFormatter(logging.Formatter(fmt=my_app_log_format, datefmt=my_app_date_format))
logger.addHandler(_log_handler)
logger.propagate = False which then produces the formatting the developer wants for their application. P.S., I've added you as a co-author on this PR. If you're uncomfortable with that, please let me know and I'll amend the PR. |
I've briefly checked. Looks good. Only some things, but I'm not sure of their importance
|
Understood. Without this we’re going to get essentially debug statements for every request. That said, I’m happy to handle this from my calling application. I’ll remove this first thing tomorrow. I’m away from my keyboard until tomorrow.
Good to know. Let’s punt on that for now. |
We could also consider this kind of workaround to make sure we initialize logger with proper hierarchy: def getLogger(logger_name):
logger = logging.getLogger(logger_name)
logger.setLevel(logging.INFO)
if not logger.handlers:
my_app_log_format = "[%(levelname)1.1s %(asctime)s.%(msecs).03d %(name)s] %(message)s"
my_app_date_format = "%Y-%m-%d %H:%M:%S"
_log_handler = logging.StreamHandler()
_log_handler.setFormatter(logging.Formatter(fmt=my_app_log_format, datefmt=my_app_date_format))
logger.addHandler(_log_handler)
logger.propagate = False
return logger In each of modules we would call |
I think deferring the formatting to the calling application is the right thing to do. In my two applications, I configure "yarn_api_client" and the hierarchy of the sub loggers use that format. I do see merit in having a general |
This change enables a logger for each module of yarn-api-client. By default, nothing will be logged unless the calling application has configured logging. Once configured, entries will be logged relative to the configured log level. The calling application is also responsible for specifying the appropriate log formats, etc. An accessor method, get_logger(logger_name), has been created in case we need to add configuration for the loggers. In those cases, we'd probably configure the logger "yarn_api_client", so that each of the submodule loggers would inherit from that logger. Likewise, applications can configure "yarn_api_client" themselves and those changes will be picked up by the individual loggers. The previous INFO log statement has been changed to DEBUG and I've added timing results for the request. In addition, the previous print statements have been replaced with calls to apiLogger().warning(). Fixes: gateway-experiments#73 Co-authored-by: Dmitry Romanenko <Dmitry@Romanenko.in>
d1a89ab
to
ceade37
Compare
Ok - I've gone ahead with what I hope is sufficient for the time being. I've removed the URLLIB setting (leaving that to the calling applications) and created an accessor in case we ever want to add configuration ourselves. This has been much more challenging to get right than I had originally anticipated. @dimon222 - I really appreciate all your help and hope you're happy with the solution as well. |
Looks good for me |
Overall, it looks good, and I only have one suggestion. Looks like we have removed the logger property from the base class, which caused the side effect of every module, even the ones that inherit from base class, to have to instantiate a global log. I would like to suggest that we leave the logger property on the base class, and that would call the new get_logger function to initialize the log and that property would be used in the child classes, and the get_logger would be used only in the modules that have helper functions that don't inherit from base class. Thoughts ? |
I think if we were to adopt an approach of using a fixed logger name (e.g., "yarn_api_client"), then using a class instance property to access the logger for instances and a "global" accessor for "helper" functions would be okay, although I prefer to have every log call "look" the same. When (re)introducing the class instance property approach, you're going to have some methods use If we want to retain the notion of the hierarchical nature of loggers, where "yarn_api_client" represents our "root" (and configuration occurs against that level) and the "leaf" portion (e.g., "yarn_api_client.hadoop_conf") represents the location from which the log statement was issued, then defining the property on the base class (as before), will lead to all class instances using a logger name of "yarn_api_client.base", irrespective of where the class instance performed the log statement. Of course, you could address this by passing Based on this, I'm not sure what using a class instance property buys you other than not using a global accessor in files which consist solely of class methods. And it seems to introduce inconsistencies at both the calling and logging levels - unless we were to forgo the hierarchical nature of loggers and use a fixed logger name, at the expense of some finer grained control capabilities. One thing is for sure about this logging exercise, it surfaces philosophies that are important to developers and it's great that we can have these discussions. I've just presented what I believe are counter-arguments to using an instance property along with a global accessor, but am willing to revisit this if others agree that's how we should proceed. If so, please propose how we'd address some of the inconsistencies or reasons why they shouldn't be a concern. Thank you. |
FWIW, I just happened to look at urllib3 since "urllib3.connectionpool" is something we (in Enterprise Gateway) adjust the logging level for, and found they use a global |
I think global accessor is no big harm and somewhat adopted approach across python community (what obviously doesn't always mean correct). However, it will help in using loggers for writing integration & unit tests since we don't have to reimplement the wheel for test classes. Also, same logger won't be created 20 times if u want to communicate to 20 different clusters. |
Ok, based on the comments above it LGTM. |
This change enables a logger for each module of yarn-api-client.
By default, nothing will be logged unless the calling application has
configured logging. Once configured, entries will be logged relative
to the configured log level. The calling application is also responsible
or specifying the appropriate log formats, etc.
An accessor method, get_logger(logger_name), has been created in case we
need to add configuration for the loggers. In those cases, we'd probably
configure the logger "yarn_api_client", so that each of the submodule
loggers would inherit from that logger. Likewise, applications can
configure "yarn_api_client" themselves and those changes will be picked
up by the individual loggers.
The previous INFO log statement has been changed to DEBUG and I've
added timing results for the request. In addition, the previous print
statements have been replaced with calls to apiLogger().warning().
Fixes: #73