-
Notifications
You must be signed in to change notification settings - Fork 467
[core] add a Configuration system for integrations #441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ddtrace/config.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the main and only API, since it's a plain dictionary that makes everything easy.
ddtrace/config.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I should log.debug() something here? the reason why we return a new dictionary is to avoid crashing users application. On the other hand, we fail silently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes worth a log.debug
ddtrace/pin.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't set anything, made exception of _target that is an internal attribute and not used by users.
ddtrace/pin.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a caveat. The current implementation allows to:
- set a pin in a global variable (i.e.
requestslibrary to set global settings) - when an instance is created from a class that has a
Pinattached, the pin reference is copied to the instance, meaning they both point to the same object - because the
configobject is a mutable value, it means that changing the configuration for an "instance" pin, changes also the global configuration for all other pins - this results in an error
All of this is done to keep the API simple, while reusing the Pin class. Another approach is to continue to expose and suggest the Pin API, that I would rather to hide / remove from the public facing API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we review our integrations to be sure this mechanism wasn't used in any of them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's part of the task and it seems we don't do that. Mostly because the caveat is present only for the config object and in the current implementation only the tags attribute is affected (so it's a bugfix in that sense). Indeed, if we change a service name for an instance of the main class, we must use the override() method that clones the Pin.
This behavior protects only the tags and config object from that change, so a bugfix and a new feature. We should be good here.
ddtrace/pin.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everytime a Pin is attached onto an object, we set the pin's owner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any change it can badly affect the garbage collector?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I'm investigating if it's better to just use the memory address instead of the object pointer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we use id(obj) we avoid any kind of reference counting for the GC. This introduces a little overhead:
In [53]: timeit.timeit("id({'a':5})", number=10000)
Out[53]: 0.0017991065979003906
In [34]: timeit.timeit("{'a':5}", number=10000)
Out[34]: 0.0015370845794677734
that I think it's not relevant. The only caveat is that we have these results for CPython, and it could be different for other interpreters, but at the moment we officially support only CPython.
ddtrace/pin.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very fast operation, considering our settings have a limited cardinality (<10)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need to be a deepcopy? Don't we risk to have one day some nested config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something we have to consider, but here some benchmarks:
# from Django config, just to simulate a small set of configurations with an empty dict
DEFAULTS = {'AGENT_HOSTNAME': 'localhost',
'AGENT_PORT': 8126,
'AUTO_INSTRUMENT': True,
'DEFAULT_DATABASE_PREFIX': '',
'DEFAULT_SERVICE': 'django',
'DISTRIBUTED_TRACING': False,
'ENABLED': True,
'INSTRUMENT_CACHE': True,
'INSTRUMENT_DATABASE': True,
'INSTRUMENT_TEMPLATE': True,
'TAGS': {},
'TRACER': 'ddtrace.tracer'
}
# test 1
DEFAULTS.copy()
# test 2
copy.deepcopy(DEFAULTS)And here the results:
test 1: 0.00654911994934082
test 2: 0.2787208557128906
The huge overhead is because of the string copy, so if nested dictionaries are used and we go with a deepcopy, we should take in consideration such overhead.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's a big difference. If for now copy is enough for all configuration options we want to provide, no reason to do more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I will add a comment to explain why we use a copy for our use case
ddtrace/pin.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never expose config to users from the Pin class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is putting this underscore _config hiding it from the public API?
Technically it is still part of the __init__ which is publicly exposed ; I'm afraid it mislead people / allow bad things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes it's a convention to hide attributes and elements that are prefixed with an underscore. Of course anyone can use it, but if it has an underscore it's considered extremely dangerous. All frameworks do that, though because of the nature of Python, anyone can access to any attribute that is prefixed.
tests/test_config.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the patch for the caveats described above, this test fails.
ddtrace/pin.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LotharSee probably I may proxy this value to the self._config['service_name'] so that we don't lose the only setting available in the Pin.
70eb0ae to
7a667c9
Compare
| from .span import Span | ||
| from .tracer import Tracer | ||
| from .configuration import Config | ||
| from .settings import Config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed the name to avoid long name and obfuscation of config
| self._target = None | ||
| # keep the configuration attribute internal because the | ||
| # public API to access it is not the Pin class | ||
| self._config = _config or {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to copy it here, since any operation to use the pin, calls the clone() method once. Indeed, it's possible that we:
- initialize the global config with our defaults
- first set the
Pinin one of our integrations - users update the configuration
- when the
Pinis used in the integration for the first time (viaget_from) it gets cloned
If we copy the dictionary, we'll have always our defaults.
| cfg['service_name'] = 'metrics' | ||
| eq_(global_config['service_name'], 'service') | ||
|
|
||
| def test_configuration_copy_upside_down(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests/test_pin.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test before was ok_(got is pin). We now clone the pin once, when get_from is used to safe-protect mutable structure like config and tags. Not doing that means introducing a bug where an instance can update global configurations (and we don't want that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed using another approach.
17a2ea8 to
cce600a
Compare
9fd89a2 to
f34941b
Compare
cce600a to
51aa8c3
Compare
51aa8c3 to
885f4f9
Compare
Overview
Currently, the Tracer doesn't have a configuration system for each single integration. This means that each integration has a different way to set a specific flag/option: sometimes we use application settings (i.e. Django settings) and some others we use attributes (i.e.
client.distributed_tracingfor requests).This patch aims to add to the
Pinclass a configuration object that can be used and retrieved with a simple API. The goal of the patch, is:Session)Pinclass, so that a nicer API can be used; thePinwill stays as is because it will be used as underlying mechanism