Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DDOG Integration #644

Merged
merged 9 commits into from
Jan 17, 2024
Merged

DDOG Integration #644

merged 9 commits into from
Jan 17, 2024

Conversation

elijahbenizzy
Copy link
Collaborator

@elijahbenizzy elijahbenizzy commented Jan 16, 2024

Iterating -- plan of attack/thoughts:

  • Opted to store a cache myself and prune rather than using context as I want to leave the door open for more clever uses of this. I just link parents and store a dict
  • Should be easy to get tasks working, just need to expose the API
  • Should also be possible to add causal links between spans as well. TBD on how to use the DDOG API exactly.
  • Add all node tags to DDOG cause why not 🤷
  • do OpenTel (out of scope for this, that's next)

What this looks like:

image

Changes

  • DDOG
  • An extra (easter-egg) hook
  • Improvements to running hooks in the case of exceptions

How I tested this

  • ran ddog multiple times locally under different contexts
  • tests passed for existing ones (what I was fixing for was not worth testing for TBH, parity with current behavior is good)

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

We were missing this -- this adds it in.
This is a bit of a joke -- it just injects latency into the DAG. Purely
for testing, but perhaps someone can do something fun with it.
This is functionally the same as before, it just hits a weird edge case
in which a failure to deliver the hook would be called twice, as it
would fail when running it and fail when reacting to the failure (in the
catch). We need to do this for graph/node.
for adapter in self.adapters:
for cls in inspect.getmro(adapter.__class__):
sync_hook = getattr(cls, SYNC_HOOK, None)
if sync_hook is not None:
sync_hooks[sync_hook].add(adapter)
if adapter not in sync_hooks[sync_hook]:
sync_hooks[sync_hook].append(adapter)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this is not ironed into the contract, and we'll likely want to switch around order, but I think for the future we need to think about this a bit -- my instinct says to do pre_ in order and post_ in reverse order, but that's complicated and ambiguous to implement.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we should do it that way ideally

from hamilton import lifecycle


class DDOGTracer(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add testing but I don't think its worth it, I'll just fix forward and if its a pain add testing

time.sleep(sleep_time)


class SlowDownYouMoveTooFast(NodeExecutionHook):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Easter egg

hamilton/plugins/h_ddog.py Outdated Show resolved Hide resolved
) -> Tuple[Optional[str], ...]:
return run_id, task_id, node_id
def _serialize_span_dict(span_dict: Dict[str, Span]):
"""Serializes to a readable format. We're not propogating links, but that's fine (for now).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean link?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Span links -- see notes elsewhere

@elijahbenizzy elijahbenizzy force-pushed the ddog-tracing branch 4 times, most recently from 68343d0 to 319234f Compare January 17, 2024 18:09
This adds basic tracing support for the datadog plugin. We elect to
manage the state of spans/traces ourselves, and there is a decent amount
of complexity around serialization. It works on Ray, Dask,
Multithreading, and Synchronous mode, as well as vanilla Hamilton
(non-task-based). It can optionally store links, although datadogs
functionality is pretty suboptimal and has large UI query times,
so that is turned off by default.
This ensures that it is only run once, and cleans up the code.
This ensures we don't accidentally run it twice if it fails, and cleans
up the code.
This makes it slightly easier to filter by node name, across, say,
different tasks. We namespace it under hamilton to ensure it doesn't
clash.
We were doing the wrong ones for required_dependencies. This fixes that,
and also uses input_types (rather than dependencies) for the
required/optional dependencies, which is guarenteed to exist when the
node is created (as opposed to dependencies which is set when we create
the DAG.
@elijahbenizzy elijahbenizzy merged commit c642615 into main Jan 17, 2024
22 checks passed
@elijahbenizzy elijahbenizzy deleted the ddog-tracing branch January 17, 2024 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants