
Apply flake8-logging-format changes to providers #24933

Closed

Conversation

@josh-fell josh-fell commented Jul 8, 2022

Related: #23597 #24910

This is pre-work for hopefully including the flake8-logging-format extension in CI. There are roughly 88 file changes, so I'm splitting them into smaller chunks.

After the initial sweeps are completed we can do a final pass when formally implementing the extension in CI.

I'm not entirely certain whether useful information will be lost from what log handlers ingest, but I would love feedback on the general patterns. A core PR will follow based on that feedback.

  • Fix failing tests


@@ -107,9 +106,8 @@ def test_hook_raises(self):

mock_error.assert_called_once_with(
'Could not create an S3Hook with connection id "%s". Please make '
'sure that apache-airflow[aws] is installed and the S3 connection exists. Exception : "%s"',
@josh-fell josh-fell commented Aug 1, 2022

This is an example of where I'm unsure whether accounting for the G200 error ("Logging statements should not include the exception in logged string [use exception or exc_info=True]") will negatively affect what's logged and, perhaps more importantly, what the log handlers will ingest.

Here "Exception:" is part of the logging line directly but disallowed by the flake8 plugin. Does this matter when exc_info=True is used, or when a raise follows? Is any information really lost from the logs? Maybe there is a slick workaround that I'm missing?
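To make the trade-off concrete, here is a minimal, self-contained sketch (the `connect`/`capture` helpers and logger names are hypothetical, not from the PR) comparing the pattern G200 flags against the exc_info=True form it recommends:

```python
import io
import logging


def capture(log_call):
    """Run log_call against a logger wired to a StringIO and return what was logged."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger = logging.getLogger("g200_demo")
    logger.addHandler(handler)
    logger.setLevel(logging.ERROR)
    try:
        log_call(logger)
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


def connect():
    raise ConnectionError("Failed to connect")


def old_style(log):
    # Pattern flagged by G200: the exception is interpolated into the message string.
    try:
        connect()
    except ConnectionError as e:
        log.error('Could not create hook. Exception: "%s"', e)


def g200_style(log):
    # G200-compliant alternative: exc_info=True attaches the exception type,
    # message, and full traceback to the log record.
    try:
        connect()
    except ConnectionError:
        log.error("Could not create hook", exc_info=True)
```

Both variants surface "Failed to connect" in the log output; the exc_info form additionally carries the traceback, so dropping the inline "Exception:" text loses nothing as long as handlers render exc_info (the default formatter does).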

josh-fell (Contributor, Author)

An option could be ignoring the G200 error as well as part of flake8-logging-format should we want to introduce it. @kaxil Do you have an opinion here since you'd taken a crack at using this flake8 plugin elsewhere?

@@ -172,7 +172,7 @@ def test_failed_write_to_remote_on_close(self, mock_blob, mock_client, mock_cred
(
"airflow.providers.google.cloud.log.gcs_task_handler.GCSTaskHandler",
logging.ERROR,
"Could not write logs to gs://bucket/remote/log/location/1.log: Failed to connect",
josh-fell (Contributor, Author)

Another example where log information could potentially go missing; I'm unsure whether it's actually an issue.

@josh-fell josh-fell marked this pull request as ready for review August 1, 2022 19:36
@josh-fell josh-fell force-pushed the flake8-logging-format-providers branch from 4c80aab to b141bb3 Compare August 1, 2022 19:36
-        self.log.error(e)
-        raise AirflowException(f"Errors when deleting: {key}")
+        self.log.error("Errors when deleting %s", key)
+        raise AirflowException(e)
Member

Since this is a provider, I wonder if we should just switch to re-raising the original exception. Coercing the error to AirflowException offers no benefits at all.

Member

Agree :). We used that pattern in the past, but unless you use a dedicated AirflowSkipException or AirflowFailException it's better to raise the original exception.

josh-fell (Contributor, Author)

Ah yes, this totally makes sense. Agreed. I'll do a clean sweep of all the provider files I'm touching for this as well.
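A minimal sketch of the re-raise pattern being suggested. To keep it runnable without Airflow installed, `AirflowException` is a local stand-in class, and the `delete_*` functions and fake client are hypothetical illustrations, not code from the PR:

```python
class AirflowException(Exception):
    """Local stand-in for airflow.exceptions.AirflowException."""


def delete_wrapping(client, key):
    # Old pattern: the caller can no longer catch the original exception
    # type, since it has been coerced into AirflowException.
    try:
        client.delete(key)
    except Exception as e:
        raise AirflowException(f"Errors when deleting: {key}") from e


def delete_reraise(client, key, log):
    # Preferred pattern: log the useful context, then re-raise the original
    # exception unchanged with a bare `raise`, preserving type and traceback.
    try:
        client.delete(key)
    except Exception:
        log.error("Errors when deleting %s", key)
        raise
```

With the bare `raise`, callers can still catch the specific underlying exception (e.g. a permissions error), which is the benefit the reviewers point out.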

-        except Exception as general_error:
-            self.log.error("Failed to create aws glue job, error: %s", general_error)
+        except Exception:
+            self.log.error("Failed to create aws glue job.")
@uranusjr uranusjr (Member) commented Aug 2, 2022

Do we want to use exception here? (same for many below)

@potiuk potiuk (Member) commented Aug 2, 2022

Yep, sounds strange. We should catch the specific exceptions we know about and let everything else bubble up directly. There is no point in extra logging here unless we can provide "specific" information that helps the user react to a known exception. Logging a meaningless line adds nothing: the user already knows the Glue job failed to be created, and repeating that without any actual help or instructions on what to do is borderline harassing the user ("Hello, you already know we failed, so let us repeat it here").

josh-fell (Contributor, Author)

> Do we want to use exception here? (same for many below)

Generally, if there was a raise, I stuck to using error so the traceback wasn't logged twice.

Oof yeah @potiuk there are a lot of generic exceptions in these files. I'll clean them up.
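Combining the two points above (catch only the specific exception; use error rather than exception before a re-raise), the agreed pattern might look like the sketch below. `ClientError` is a local stand-in for a boto-style error class, and `create_glue_job` with its fake client are hypothetical, not the provider's actual code:

```python
import logging

log = logging.getLogger(__name__)


class ClientError(Exception):
    """Local stand-in for a known, specific AWS client error class."""


def create_glue_job(client, config):
    # Catch only the specific, known exception and add actionable context.
    # Any other exception bubbles up directly with its own traceback,
    # unlogged, per the review feedback.
    try:
        return client.create_job(**config)
    except ClientError:
        # log.error rather than log.exception: the bare `raise` below means
        # the traceback would otherwise be emitted twice.
        log.error(
            "Failed to create AWS Glue job %r; check the job parameters.",
            config.get("Name"),
        )
        raise
```

The known error gets a contextual log line and is re-raised intact, while unexpected errors (e.g. a TypeError from bad arguments) propagate untouched.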

@josh-fell josh-fell (Contributor, Author)

Not stale. Just need to find some time to tackle #24933 first. There are similar patterns, and I want to make sure there is consensus and consistency.

github-actions (bot)

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Oct 14, 2022
@github-actions github-actions bot closed this Oct 20, 2022
Labels
area:logging, area:providers, provider:amazon-aws (AWS/Amazon related issues), provider:cncf-kubernetes (Kubernetes provider related issues), provider:google (Google, including GCP, related issues), stale (Stale PRs per the .github/workflows/stale.yml policy file)
3 participants