Avoid log spam & have more meaningful log when pull image in DockerOperator#12763
Conversation
…erator Fixing issue reported in apache#12576 This change actually also makes the log more meaningful
|
The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest master or amend the last commit of the PR, and push it with --force-with-lease. |
Mainly for the final two lines
{'status': 'Digest: sha256:589cc12df79de86631d447e09bf131791c661814ee3e235eaa81389f0778d6a0'}
{'status': 'Status: Downloaded newer image for python:latest'}
|
Hi @turbaszek , I added b543036 to address cases in the screenshot below Mind taking another look? |
| if 'id' in output: | ||
| if latest_status.get(output['id']) != output['status']: | ||
| self.log.info("%s: %s", output['id'], output['status']) | ||
| latest_status[output['id']] = output['status'] |
There was a problem hiding this comment.
| if 'id' in output: | |
| if latest_status.get(output['id']) != output['status']: | |
| self.log.info("%s: %s", output['id'], output['status']) | |
| latest_status[output['id']] = output['status'] | |
| if 'id' in output and latest_status.get(output['id']) != output['status']: | |
| self.log.info("%s: %s", output['id'], output['status']) | |
| latest_status[output['id']] = output['status'] |
There was a problem hiding this comment.
There's a chance that latest_status.get(output.get("id")) != output["status"] will also work
There was a problem hiding this comment.
This change you proposed may not work, because if this if is not matched, it goes to else and print the status. Then it's no difference from the earlier "spamming" status.
This is why I have the nested ifs here. may you let me know if this clarification makes sense to you?
There was a problem hiding this comment.
The maybe we should revert the ifs:
if 'id' not in output:
self.log.info("%s", output['status'])
continue
output_id, output_status = output["id"], output["status"]
if latest_status.get(output_id) != output_status:
self.log.info("%s: %s", output_id, output_status)
latest_status[output_id] = output_statusI'm not a fan of too many nested ifs as pylint likes to complain about them 😉
There was a problem hiding this comment.
Actually we already cannot avoid having to apply disable=too-many-nested-blocks, but your suggestion above is definitely fair.
Have addressed it in 23e81de (with extremely minor change)
Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com>

Closes #12576
statusis not very meaningful. It's much better to print thestatusbyid.Sample output when we pull image using
docker.APIClient.pull(stream=True)^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.