AzureFunctionsDataset2019 trace discrepancies #23

Closed
sk1tter opened this issue Aug 5, 2022 · 1 comment

sk1tter commented Aug 5, 2022

Hello,
we (@HongyuHe) at eth-easl found some discrepancies in the AzureFunctionsDataset2019 trace.
Looking at each of the 14 daily traces, we found many duplicate apps and functions, as well as some apps and functions that are missing duration or memory stats.

| day | app_memory_percentiles.anon | function_durations_percentiles.anon | invocations_per_function_md.anon |
|---|---|---|---|
| d01 | 10 dups (20 rows) | 16 dups (32 rows); 377 apps missing memory stats | 622 functions missing duration stats; 422 apps missing memory stats |
| d02 | 13 dups (20 rows) | 18 dups (31 rows); 380 apps missing memory stats | 603 functions missing duration stats; 425 apps missing memory stats |
| d03 | 8 dups (16 rows) | 11 dups (22 rows); 386 apps missing memory stats | 633 functions missing duration stats; 429 apps missing memory stats |
| d04 | | 415 apps missing memory stats | 623 functions missing duration stats; 465 apps missing memory stats |
| d05 | 2 dups (4 rows) | 4 dups (8 rows); 397 apps missing memory stats | 615 functions missing duration stats; 440 apps missing memory stats |
| d06 | | 1 dup (2 rows); 705 apps missing memory stats | 563 functions missing duration stats; 750 apps missing memory stats |
| d07 | | 332 apps missing memory stats | 532 functions missing duration stats; 379 apps missing memory stats |
| d08 | | 412 apps missing memory stats | 630 functions missing duration stats; 453 apps missing memory stats |
| d09 | 1 dup (2 rows) | 7 dups (14 rows); 398 apps missing memory stats | 640 functions missing duration stats; 439 apps missing memory stats |
| d10 | 3 dups (6 rows) | 4 dups (8 rows); 394 apps missing memory stats | 633 functions missing duration stats; 444 apps missing memory stats |
| d11 | 2 dups (4 rows) | 2 dups (4 rows); 388 apps missing memory stats | 652 functions missing duration stats; 436 apps missing memory stats |
| d12 | | 388 apps missing memory stats | 631 functions missing duration stats; 440 apps missing memory stats |
| d13 | Trace file missing | 1 dup (2 rows) | 576 functions missing duration stats |
| d14 | Trace file missing | | 524 functions missing duration stats |

  • dup: rows that share the same Hash Owner, Hash App, and Hash Function but have different invocations, durations, or memory
  • missing stats: a function or app hash that is present in one trace file but missing from another trace file
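
For anyone who wants to reproduce counts like these, the two checks defined above can be sketched roughly as follows. This is only an illustration, not the script we actually ran: the column names `HashOwner`, `HashApp`, and `HashFunction` are assumed, the rows are made up, and `find_dups` counts every repeated key without checking whether the other columns differ.

```python
from collections import Counter

# Assumed key columns; adjust these to the actual CSV headers.
FUNCTION_KEY = ("HashOwner", "HashApp", "HashFunction")
APP_KEY = ("HashOwner", "HashApp")


def key_of(row, columns):
    """Project a row (a dict, e.g. from csv.DictReader) onto its key columns."""
    return tuple(row[c] for c in columns)


def find_dups(rows, columns):
    """Return {key: row_count} for every key that appears on more than one row."""
    counts = Counter(key_of(r, columns) for r in rows)
    return {k: n for k, n in counts.items() if n > 1}


def find_missing(rows, other_rows, columns):
    """Return the keys present in `rows` that never appear in `other_rows`."""
    present = {key_of(r, columns) for r in other_rows}
    return {key_of(r, columns) for r in rows} - present


# Tiny made-up sample standing in for one day's three files.
invocations = [
    {"HashOwner": "o1", "HashApp": "a1", "HashFunction": "f1"},
    {"HashOwner": "o1", "HashApp": "a1", "HashFunction": "f1"},
    {"HashOwner": "o1", "HashApp": "a2", "HashFunction": "f2"},
]
durations = [{"HashOwner": "o1", "HashApp": "a1", "HashFunction": "f1"}]
memory = [{"HashOwner": "o1", "HashApp": "a1"}]

dups = find_dups(invocations, FUNCTION_KEY)
no_duration = find_missing(invocations, durations, FUNCTION_KEY)
no_memory = find_missing(invocations, memory, APP_KEY)
```

With real data, the lists would come from reading each day's CSV with `csv.DictReader`; `len(dups)` and `sum(dups.values())` then correspond to the "dups" and "rows" figures in the table.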

These discrepancies make it hard for us to accurately analyze the trace.
Is it reasonable to treat the duplicates as separate entities, or should we merge them?
Would discarding traces with missing data be the only way to clean up the traces?
We would appreciate it if you could provide a way to clean up these issues. Thanks.

Collaborator

rfonseca commented May 7, 2024

Sorry for the inconsistencies. We did our best to fetch this data from the existing records, and we can't clean it up after the fact.
Unfortunately, I would suggest you ignore the functions that are missing duration and/or memory stats, if you need those stats. For the duplicates, I would suggest you take the latest record instead of merging.
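
A minimal sketch of that cleanup, under some assumptions that are not confirmed in this thread: "latest record" is read as the last row in file order, the key columns are named `HashOwner`, `HashApp`, and `HashFunction`, and `Count` is a hypothetical value column used only for the sample rows.

```python
# Assumed key columns; adjust these to the actual CSV headers.
FUNCTION_KEY = ("HashOwner", "HashApp", "HashFunction")
APP_KEY = ("HashOwner", "HashApp")


def key_of(row, columns):
    """Project a row (a dict) onto its key columns."""
    return tuple(row[c] for c in columns)


def keep_latest(rows, columns):
    """Keep one row per key; a later row overwrites an earlier one."""
    latest = {}
    for row in rows:
        latest[key_of(row, columns)] = row
    return list(latest.values())


def clean_day(invocations, durations, memory=None):
    """Deduplicate invocations, then drop functions that lack stats.

    Pass memory=None to skip the memory check, e.g. for a day whose
    memory trace file is missing.
    """
    has_duration = {key_of(r, FUNCTION_KEY) for r in durations}
    has_memory = None if memory is None else {key_of(r, APP_KEY) for r in memory}
    cleaned = []
    for row in keep_latest(invocations, FUNCTION_KEY):
        if key_of(row, FUNCTION_KEY) not in has_duration:
            continue  # no duration stats for this function
        if has_memory is not None and key_of(row, APP_KEY) not in has_memory:
            continue  # no memory stats for this function's app
        cleaned.append(row)
    return cleaned


# Made-up rows: f1 is duplicated, and app a2 has no memory stats.
invocations = [
    {"HashOwner": "o1", "HashApp": "a1", "HashFunction": "f1", "Count": 1},
    {"HashOwner": "o1", "HashApp": "a1", "HashFunction": "f1", "Count": 5},
    {"HashOwner": "o1", "HashApp": "a2", "HashFunction": "f2", "Count": 3},
]
durations = [
    {"HashOwner": "o1", "HashApp": "a1", "HashFunction": "f1"},
    {"HashOwner": "o1", "HashApp": "a2", "HashFunction": "f2"},
]
memory = [{"HashOwner": "o1", "HashApp": "a1"}]

cleaned = clean_day(invocations, durations, memory)
```

In this sample only the second `f1` row survives: the first is superseded as a duplicate, and `f2` is dropped because its app has no memory stats. If the files carry a more reliable ordering than row position, `keep_latest` would need to sort on it first.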

rfonseca closed this as completed May 7, 2024