Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(profiling): new memory profiler #1673

Merged
merged 5 commits into from
Sep 29, 2020
Merged

Conversation

jd
Copy link
Contributor

@jd jd commented Sep 22, 2020

feat(profiling): introduce memalloc collector

This is a new memory allocation profiler that directly leverages the CPython
memory API. It tracks memory allocation events.

feat(profiling): export MemoryAllocSampleEvent to pprof

feat(profiling): allow to use memalloc collector

@jd jd requested a review from a team as a code owner September 22, 2020 12:12
@jd jd force-pushed the profiling/memalloc branch 8 times, most recently from 5b1dde8 to bb119be Compare September 24, 2020 09:05
@jd jd added this to the 0.43.0 milestone Sep 24, 2020
This is a new memory allocation profiler that directly leverages the CPython
memory API. It tracks memory allocation events.
This adds a environment variable to opt-int to the new memory profiler.
Copy link
Collaborator

@majorgreys majorgreys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor questions

# _memalloc buffer to our Recorder. This is fine for now, but we might want to store the nanoseconds
# timestamp in C and then return it via iter_events.
return (
tuple(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason here to use tuple?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're smaller and we don't need to modify the result, therefore tuple is good enough :)

)

nevents = len(events)
sampling_ratio_avg = sum(event.capture_pct for event in events) / nevents / 100.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could shave off some effort by collapsing these three iterations over events into one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd think so…

l = list(range(10_000_000))

def v1():
    return sum(l), sum(l), sum(l)

def v2():
    x = y = z = 0
    for v in l:
         x += v
         y += v
         z += v
    return x, y, z


assert v1() == v2()


import timeit

print(timeit.timeit("v1()", number=10, globals=globals()))
print(timeit.timeit("v2()", number=10, globals=globals()))

v2 is 6 times slower:

$ python3 t.py
2.568303665
14.879553143

So I'm not sure it'd be a win even here 🤔

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have actually tested out my suggestion before making it! 🔬

Funny but my initial thought was to use functools.reduce but that's even slower:

def v3():
    return functools.reduce(lambda x,y: (x[0]+y, x[1]+y, x[2]+y), l, (0,0,0))

Which takes ~21s.

@majorgreys majorgreys merged commit 4a5fe1d into DataDog:master Sep 29, 2020
@jd jd deleted the profiling/memalloc branch September 29, 2020 15:27
Kyle-Verhoog pushed a commit that referenced this pull request Oct 1, 2020
* feat(profiling): introduce memalloc collector

This is a new memory allocation profiler that directly leverages the CPython
memory API. It tracks memory allocation events.

* feat(profiling): export MemoryAllocSampleEvent to pprof

* feat(profiling): allow to use memalloc collector

This adds a environment variable to opt-int to the new memory profiler.

Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>
Kyle-Verhoog pushed a commit that referenced this pull request Oct 1, 2020
* feat(profiling): introduce memalloc collector

This is a new memory allocation profiler that directly leverages the CPython
memory API. It tracks memory allocation events.

* feat(profiling): export MemoryAllocSampleEvent to pprof

* feat(profiling): allow to use memalloc collector

This adds a environment variable to opt-int to the new memory profiler.

Co-authored-by: Tahir H. Butt <tahir.butt@datadoghq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants