Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread id bug #5178

Merged
merged 33 commits into from
May 17, 2023
Merged

Thread id bug #5178

merged 33 commits into from
May 17, 2023

Conversation

Leo10Gama
Copy link
Member

Which issue(s) does this change fix?

N/A

Why is this change necessary?

Previously, Python's threading.thread_id() method did not provide an accurate representation of distinct Events running on the same thread. This change allows for a more accurate look at which methods are running on the same thread, and helps group long Events more accurately.

How does it address the issue?

The previous Event.thread_id was changed from an int to a UUID to act as a better identifier for Events run on the same thread.

What side effects does this change have?

The thread_id will now be a UUID, and its representation in JSON and __repr__ formats will be its hex value, as a string.

Mandatory Checklist

PRs will only be reviewed after checklist is complete

  • Add input/output type hints to new functions/methods
  • Write design document if needed (Do I need to write a design document?)
  • Write/update unit tests
  • Write/update integration tests
  • Write/update functional tests if needed
  • make pr passes
  • make update-reproducible-reqs if dependencies were changed
  • Write documentation

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Leo10Gama and others added 27 commits June 7, 2022 11:05
* Implement project metadata metrics

* Implement metric metadata integration tests

* Tweak methods so tests don't fail on Linux machines

* Repair failing integration test

* Implement proper exception handling for subprocesses

* Format according to standard

* Refactor function names and encryption logic

* Remove unnecessary file

* Add type hints to new functions
* Implement event data structure and relevant unit tests

* Implement requested changes

* Remove unused (useless) file

* Remove unnecessary ABC class
* Implement tracker methods

* Improve readability of Event's __repr__ method

* Add method to convert Event to dictionary for JSON-compatibility
* Send Event metrics via Telemetry

* Add unit tests for event telemetry

* Simplify event telemetry unit tests

* Clean up unit tests
* Add BuildRuntime Events

* Remove unnecessary file

* Prevent tracked Event 'spillage' from previous tests

* Fix builder logic

* Redefine BuildRuntime to focus on function runtimes

* Remove unnecessary file (again)
* Add UsedFeature command and trackers

* Remove LocalTest events
Events regarding sam local will be implemented at a later time.

* Prevent methods from failing if track_event fails

* Reformat code to match current standard

* Use correct string formatting

* Use proper LOG message
* Add thread lock to EventTracker

* Place lock in appropriate methods
* Add new attributes to Event datastructure

* Fix non-JSON-serializable field in Event

* Fix repr to ensure SQL casting will work

* Add comment for timestamp method clarity
* Add framework for Event-sending method

* Perform some general refactors and cleanup

* Implement EventTracker Telemetry unit tests

* Cleanup tests and unused imports

* Update Metric name for consistency

* Repair failing unit test

* (Hopefully) repair broken integration test

* *Actually* repair broken integration test
I had the >= logic backwards lol

* Fix integration test
For sure this time

* Really fix the integration test
Windows development is not fun

* Properly fix integration test

* Reformat files

* Implement requested changes

* Add Lock logic to send_events method

* Implement requested changes

* Repair failing tests
* Upgrade encryption algorithm from SHA1 to SHA256

* Make return value of hash JSON-serializable
* Switch send_events to thread when at capacity

* Repair failing unit test
* Add sam sync Events

* Refactor methods for tracking sync events

* Improve SyncFlow tracking

* Add additional SyncFlow value
Merge `develop` into `customer-journey`
* Add workflow_config events

* Refactor workflow Events

* Add tests for verifying workflow events

* Remove redundant (UsedFeature, ESBuild) event

* Repair failing unit test

* Rename workflow event to prevent ambiguity
* Apply minor changes (some major)

* Add session_id for multithreaded send_events

* Adjust tests for new multithreading

* Fix failing integration test

* *Actually* fix failing integration test

* Properly fix integration test

* This time for sure

* Really really repair that test

* OK this one SHOULD work I swear

* This time for sure

* FIX INTEGRATION TEST

* IT'S A LIST

* Remove unused imports

* Add debug log for EventTracker session ID

* Place debug log in better position
@github-actions github-actions bot added pr/external stage/needs-triage Automatically applied to new issues and PRs, indicating they haven't been looked at. labels May 16, 2023
@Leo10Gama Leo10Gama marked this pull request as ready for review May 16, 2023 18:35
@Leo10Gama Leo10Gama requested a review from a team as a code owner May 16, 2023 18:35
@Leo10Gama Leo10Gama requested review from jfuss and moelasmar May 16, 2023 18:35
Copy link
Contributor

@hawflau hawflau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM. Just one question - can you also check if some change is needed for the track_event calls in samcli/lib/sync/infra_sync_executor.py?

Copy link
Contributor

@mndeveci mndeveci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general, left some comments there!

samcli/lib/sync/infra_sync_executor.py Show resolved Hide resolved
Comment on lines +97 to +98
if not thread_id:
thread_id = uuid4()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we still use threading.get_ident() for default value or other fields will same for that purpose?
@hawflau what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think uuid4() is good. From what I've seen empirically, session_id and threading.get_ident() are producing the same grouping effects.

@hawflau hawflau added this pull request to the merge queue May 17, 2023
Merged via the queue into aws:develop with commit 1119525 May 17, 2023
65 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/external stage/needs-triage Automatically applied to new issues and PRs, indicating they haven't been looked at.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants