CT-1405: Refactor event logging code #6291
@peterallenwebb Could you merge main and resolve the existing errors? I'll start looking at this meanwhile. Also -- I'm not sure why the tests aren't running, and it's a shorter list than usual. Is something changed or broken in CI?
The EventManager code looks nice and neat and much better than the existing code. I'm a little unclear about how somebody using dbt (such as dbt-server or shipments) would specify a custom logger. How would that work?
There's no protobuf logger config or code. Do you have plans for doing that?
Also I think we'll have to get the use_colors config from the existing flags.
@gshank I am not sure why CI is having issues, but I'll merge in main and see if that helps. The
When merge conflicts exist it won't run because it is running off the workflows in
And yes, I'm planning to address the protobuf logging in my next issue, #6268. You're right that this change did not include a mechanism for client code to include its own logger. I'm definitely interested in discussing how we should do that.
@gshank I added a commit to this PR last night because some adapter test checks were failing. The issue was that pytest redefines the sys.stdout stream so it can capture test output, but it closes the stream between tests. So if dbt functions are called during a test run, but the logging setup from the previous run is still in place, events may fire and write to a closed stream. I added code to close the streams at the end of a run, and put in an additional fixture to reset logging at the beginning of each test. If you think there is a better way to deal with the issue, let me know.
I've always wanted to be able to get text logging to stdout and json logging to a file. That would require being able to configure the loggers differently. @jtcohen6 Is that something we'd have appetite for or not? Particularly once we have protobuf logging set up, I do not think anybody will want to have protobuf logs sent to stdout. I took a quick look at what you're doing for capturing test output and it does have a bit of a hacky feeling to it. It's possible that there might not be a better way to do it, but I do wonder if we couldn't configure the test logger to do things differently once we have a clear idea of how the custom loggers will work.
I started discussing this with @ChenyuLInx and @MichelleArk yesterday. The idea taking shape, as I understand it, is that each dbt command would be executed with a new function replacing handle()/handle_with_args(). That function would take a more general context object as a parameter, and that context object would likely be a Click context, which also has the capacity to include things like a callback function, or a configured Event Manager. The hack (it definitely is a hack) to deal with stdout capture could be avoided if the calling code had the ability to control the execution context in this way, but the Click code is still coming into focus, and I don't want to delay merging this. That said, everyone seems to be thinking in this direction, so I'll add an additional commit to this PR with a proposed method of configuring the EventManager more elegantly.
I buy this. Would the challenge here be around finding the right UX for managing that more complex / granular configuration? I'm open to jamming on some flag/yaml specs.
Yeah, we'd have to have flags to set them separately, I guess, and it might get a bit muddled with trying to preserve compatibility with the old versions... New flags, maybe?
@gshank Maybe something like:
$ dbt --log-format json run # sets default for all, backwards-compatible
$ dbt --file-log-format json --stdout-log-format text run # json file, text stdout
$ dbt --log-format json --stdout-log-format text run # same: json file (default), text stdout
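One way to read the precedence implied by these three invocations is that the per-destination flags override `--log-format`, which in turn overrides a built-in default. The sketch below illustrates that reading only; `resolve_log_formats` is a hypothetical helper, and the `"text"` fallback is an assumption rather than something stated in this thread.

```python
from typing import Optional, Tuple


def resolve_log_formats(
    log_format: Optional[str] = None,
    file_log_format: Optional[str] = None,
    stdout_log_format: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (file_format, stdout_format) for the proposed flags.

    Hypothetical helper: the specific flags win, then --log-format,
    then an assumed built-in default of "text".
    """
    default = log_format or "text"
    return (file_log_format or default, stdout_log_format or default)


# Mirrors the three example invocations above.
all_json = resolve_log_formats(log_format="json")
split_explicit = resolve_log_formats(file_log_format="json", stdout_log_format="text")
split_with_default = resolve_log_formats(log_format="json", stdout_log_format="text")
```

Under this reading the second and third invocations resolve to the same pair, which matches the "same" annotation in the proposal.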
That seems reasonable. Though I did start to wonder if we'd want to enable custom log handlers via config
Hm. At that point, it's probably more appropriate for someone to start invoking dbt-core via its Python API instead (i.e. the callback mechanism added in this PR). Good news is, we're hard at work (re)building a sane, defensible, documented Python API for dbt-core :)
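For readers unfamiliar with the pattern, a callback mechanism of this kind can be sketched roughly as follows. This is not the code from this PR: `SketchEventManager`, the dictionary-shaped event, and the event code `"A001"` are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical event shape; real events carry far more structure.
Event = Dict[str, str]


@dataclass
class SketchEventManager:
    """Hypothetical manager that forwards every fired event to registered callbacks."""

    callbacks: List[Callable[[Event], None]] = field(default_factory=list)

    def fire_event(self, event: Event) -> None:
        for callback in self.callbacks:
            callback(event)


# Client code registers a plain Python callable to receive events.
received: List[Event] = []
manager = SketchEventManager()
manager.callbacks.append(received.append)
manager.fire_event({"code": "A001", "msg": "run started"})
```

The appeal of this design is that a caller using dbt-core as a library can route events anywhere it likes without dbt needing a configuration format for custom handlers.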
After today's discussions, I think we have concluded that further work on logging integration points will be deferred to future issues, and some of that may be done in the Click feature branch. @gshank or @emmyoop Do you have any remaining concerns about merging this work? I will open an issue for adding official API support for callback registration.
I'm a little unclear about why you chose to copy attributes from the LoggerConfig objects into the _Logger subclasses individually. Wouldn't it be simpler to just set LoggerConfig as an attribute or a superclass or something? It seems like some of the code in the _create_logger function could be handled by the individual logger classes. I'm guessing that it's in the function in an attempt to make creating additional loggers simpler, but since we're not going to be doing that for a while, this feels harder to read. Perhaps a "from_config" constructor in the _JsonLogger and _TextLogger classes?
In addition, you're not actually setting "use_colors" in the json logger, and for compatibility's sake with the current cloud use of the logger, I think we need to have that set until they've changed that.
In addition, I've made some changes in my current branch to the "event_to_dict" function, so I think we'd be better off leaving that as a function call for now, otherwise merging is going to get weird. It's also used in the tests...
@gshank You are correct. I meant for this to be a factory function which simplified log creation, but I'll try the more direct approach you suggest.
This part is intentional. The current code never uses this setting when JSON logging is being used, unless there is something I'm missing, and I think I would have caught a change of behavior in my testing.
Yeah, I saw your change this morning. My intention was to avoid circular imports, but I'll find another way around that.
Yeah, you're right about not using colors in the new json logging. It was in the legacy logger with json logging... I do think that it's possible that cloud might want colors in the new json logging, but I'm okay with not doing it now. If it becomes an issue we can put it in then.
@gshank I believe my last commit resolves the two issues you wanted to see fixed. It gets rid of the factory function for logger creation, moving the logic to constructors. And it uses the version of event_to_dict in the functions file. Also, for the record, I re-ran the log comparison tests to make sure I wasn't changing behavior from the current main branch to this branch.
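The "from_config" constructor idea raised in review might look roughly like the sketch below. It is an assumption-laden illustration, not the code that was merged: the `Sketch*` class names and the config fields are hypothetical, and the JSON logger deliberately ignores `use_colors`, in line with the discussion above.

```python
import io
import json
from dataclasses import dataclass
from typing import TextIO


@dataclass
class SketchLoggerConfig:
    """Hypothetical config object; field names are assumptions."""

    name: str
    output_stream: TextIO
    use_colors: bool = False


class SketchTextLogger:
    def __init__(self, name: str, output_stream: TextIO, use_colors: bool) -> None:
        self.name = name
        self.output_stream = output_stream
        self.use_colors = use_colors

    @classmethod
    def from_config(cls, config: SketchLoggerConfig) -> "SketchTextLogger":
        return cls(config.name, config.output_stream, config.use_colors)

    def write(self, msg: str) -> None:
        # Wrap the message in ANSI green only when colors are enabled.
        if self.use_colors:
            msg = f"\033[32m{msg}\033[0m"
        self.output_stream.write(msg + "\n")


class SketchJsonLogger:
    def __init__(self, name: str, output_stream: TextIO) -> None:
        self.name = name
        self.output_stream = output_stream

    @classmethod
    def from_config(cls, config: SketchLoggerConfig) -> "SketchJsonLogger":
        # use_colors is intentionally not read here.
        return cls(config.name, config.output_stream)

    def write(self, msg: str) -> None:
        self.output_stream.write(json.dumps({"logger": self.name, "msg": msg}) + "\n")


# Usage: each logger class builds itself from the shared config type.
json_buffer = io.StringIO()
json_logger = SketchJsonLogger.from_config(
    SketchLoggerConfig(name="file", output_stream=json_buffer, use_colors=True)
)
json_logger.write("hello")
```

Keeping construction inside each class means a new logger type only needs its own `from_config`, with no shared factory function to extend.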
You need to fix the EventBufferFull thing, but otherwise it looks good!
Approving so we don't have to go around again.
core/dbt/events/proto_types.py
@@ -2159,7 +2159,6 @@ class TrackingInitializeFailure(betterproto.Message):
    exc_info: str = betterproto.string_field(2)


@dataclass
class EventBufferFull(betterproto.Message):
    """Z045"""
This is a generated file. You need to remove the EventBufferFull message from types.proto and run the protoc compiler (as described in the events README.md).
Thanks! Will take care of this!
resolves #6139
Description
This PR is intended to make the logging code simpler to reason about, and easier to use in future multi-threaded scenarios. It does the following:
Suggested Future Work
Testing
Since the code was substantially rearranged and rewritten, I used the following test strategy:
Checklist
changie new to create a changelog entry