Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-6530] Allow Custom Statsd Client #7227

Merged
merged 9 commits into from Mar 15, 2020

Conversation

envious
Copy link
Contributor

@envious envious commented Jan 21, 2020

This change allows Airflow users to utilise their own custom Statsd client as discussed here.

Many companies have their own custom Statsd clients and this change should allow for easier adoption of Airflow by large corporations.

Example usage based off of this change:

  1. User adds a module path to their airflow.cfg file next to the 'statsd_custom_client_path' key in the scheduler section of the config.

As such:
statsd_custom_client_path = company.statsdclient.customclient

  1. Ensure the newly added client exists on the PYTHONPATH.

Issue link: AIRFLOW-6530

Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Commit message/PR title starts with [AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
  • Unit tests coverage for changes (not needed for documentation changes)
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

* For document-only changes commit message can start with [AIRFLOW-XXXX].


In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@envious
Copy link
Contributor Author

envious commented Jan 22, 2020

@kaxil thanks for your suggested changes, i've applied them. I'll figure out how to best test these changes today. Any clues?

@codecov-io
Copy link

codecov-io commented Jan 22, 2020

Codecov Report

Merging #7227 into master will decrease coverage by 0.66%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7227      +/-   ##
==========================================
- Coverage   86.83%   86.16%   -0.67%     
==========================================
  Files         897      915      +18     
  Lines       42751    44164    +1413     
==========================================
+ Hits        37121    38055     +934     
- Misses       5630     6109     +479
Impacted Files Coverage Δ
airflow/stats.py 86.07% <100%> (+2.51%) ⬆️
...flow/providers/apache/cassandra/hooks/cassandra.py 21.51% <0%> (-72.16%) ⬇️
...w/providers/apache/hive/operators/mysql_to_hive.py 35.84% <0%> (-64.16%) ⬇️
airflow/kubernetes/volume_mount.py 44.44% <0%> (-55.56%) ⬇️
airflow/providers/postgres/operators/postgres.py 50% <0%> (-50%) ⬇️
airflow/providers/redis/operators/redis_publish.py 50% <0%> (-50%) ⬇️
airflow/kubernetes/volume.py 52.94% <0%> (-47.06%) ⬇️
airflow/providers/mongo/sensors/mongo.py 53.33% <0%> (-46.67%) ⬇️
airflow/kubernetes/pod_launcher.py 47.18% <0%> (-45.08%) ⬇️
airflow/providers/mysql/operators/mysql.py 55% <0%> (-45%) ⬇️
... and 86 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2ea9278...0bc17c3. Read the comment docs.

@envious envious requested a review from kaxil January 30, 2020 11:58
@kaxil
Copy link
Member

kaxil commented Feb 2, 2020

@envious - You can add another test at https://github.com/apache/airflow/blob/master/tests/test_stats.py by creating a new Statsd client and passing it using conf_vars like

@mock.patch('airflow.utils.email.send_email_smtp')
def test_custom_backend(self, mock_send_email):
with conf_vars({('email', 'email_backend'): 'tests.utils.test_email.send_email_test'}):
utils.email.send_email('to', 'subject', 'content')
send_email_test.assert_called_once_with(
'to', 'subject', 'content', files=None, dryrun=False,
cc=None, bcc=None, mime_charset='utf-8', mime_subtype='mixed')
self.assertFalse(mock_send_email.called)

and asserting that the client is the same that you were expecting.

@mik-laj
Copy link
Member

mik-laj commented Feb 2, 2020

Many companies have their own custom Statsd clients and this change should allow for easier adoption of Airflow by large corporations.

Can you say something more about that? Do you mean any public software or services?

@kaxil kaxil changed the title [AIRFLOW-6539] Allow Custom Statsd Client [AIRFLOW-6530] Allow Custom Statsd Client Feb 3, 2020
Usman Arshad and others added 5 commits March 9, 2020 13:35
…sd client - Updated metrics.rst to reflect new custom metric capabilities - Updated the default airfow configuration file to reflect new key which can be added in order to utilise custom statsd client configuration
Co-Authored-By: Kaxil Naik <kaxilnaik@gmail.com>
Co-Authored-By: Kaxil Naik <kaxilnaik@gmail.com>
@craigrosie
Copy link
Contributor

While tests are being worked on for this, do you have any advice on how/if we can get this "backported" to the next 1.10 release?

@kaxil
Copy link
Member

kaxil commented Mar 9, 2020

While tests are being worked on for this, do you have any advice on how/if we can get this "backported" to the next 1.10 release?

marked it for 1.10.10. We will cherry-pick and backport your changes after merging it to master

@craigrosie craigrosie force-pushed the AIRFLOW-6530-CUSTOM-STATSD-CLIENT-MASTER branch 2 times, most recently from 1c1f5a1 to 18db742 Compare March 10, 2020 15:01
@craigrosie craigrosie force-pushed the AIRFLOW-6530-CUSTOM-STATSD-CLIENT-MASTER branch from 18db742 to b909492 Compare March 10, 2020 18:03
@craigrosie
Copy link
Contributor

Hey @kaxil / @mik-laj, are you able to give this another review now that we've added tests? 😄 Thank you in advance!

Comment on lines 1408 to 1409
Note: The module path must exist on your PYTHONPATH for Airflow to pick it up
for Airflow to pick it up
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate wording

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed! 😄

@@ -200,10 +200,35 @@ def __init__(self, *args, **kwargs):
self.__class__.instance = DummyStatsLogger()
except (socket.gaierror, ImportError) as e:
log.warning("Could not configure StatsClient: %s, using DummyStatsLogger instead.", e)
self.__class__.instance = DummyStatsLogger()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I found a bug when adding tests for the new functionality. The log message above the line I added states that the class used for the statsd client should fall back to DummyStatsLogger if the defined statsd client class cannot be imported, or cannot connect to a statsd instance, but this never happens, and self.__class__.instance remains set to the default value of None

For example, if I try to load an invalid custom statsd class using the functionality that @envious added, triggering an ImportError, when I then try and call Stats.incr() (which should still work because the underlying class has fallen back to DummyStatsLogger), I get errors like:

cls = <class 'airflow.stats.Stats'>, name = 'incr'

    def __getattr__(cls, name):
>       return getattr(cls.instance, name)
E       AttributeError: 'NoneType' object has no attribute 'incr'

airflow/stats.py:188: AttributeError

Co-Authored-By: Kaxil Naik <kaxilnaik@gmail.com>
airflow/stats.py Outdated Show resolved Hide resolved
airflow/stats.py Outdated Show resolved Hide resolved
airflow/stats.py Outdated Show resolved Hide resolved
@kaxil kaxil merged commit 8bf6a90 into apache:master Mar 15, 2020
@boring-cyborg
Copy link

boring-cyborg bot commented Mar 15, 2020

Awesome work, congrats on your first merged pull request!

kaxil pushed a commit that referenced this pull request Mar 25, 2020
Co-authored-by: Usman Arshad <usman.arshad@skyscanner.net>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Co-authored-by: Craig Rosie <craigrosie7@gmail.com>
(cherry picked from commit 8bf6a90)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants