Improve the trace garbage collector #2759

Merged
pbiggar merged 27 commits into main from paul/fix-gc on Jul 28, 2020

@pbiggar pbiggar commented Jul 28, 2020

This adds a number of initial fixes to the trace garbage collector. I'm not sure the old collector really worked at all: it certainly didn't support wildcards, 404s, or paths with a huge number of stored events (e.g. 2 million in one case), and it may also have mixed up path and modifier (not sure). This PR:

  • combines the two garbage collectors, which were the same code
  • removes some old garbage collection code
  • stops loading the canvas data when loading 404s in the client
  • fixes bugs that were due to mixing up path and modifier
  • adds the ability to delete 404s
  • adds support for GCing HTTP paths with wildcards (plus an index to make it fast)
  • reduces the delete limit to 1000 per batch, but repeats up to 100 times so long as 1000 are deleted each time
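
The batching rule in the last bullet can be sketched roughly as follows. This is a minimal illustration in Python, not the code in this PR: `gc_in_batches`, `make_fake_store`, and the `delete_batch` callback are hypothetical names, and the fake store stands in for whatever bounded delete query the collector actually runs.

```python
from typing import Callable


def gc_in_batches(
    delete_batch: Callable[[int], int],
    batch_size: int = 1000,
    max_rounds: int = 100,
) -> int:
    """Call delete_batch(batch_size) repeatedly and return the total deleted.

    Stops after max_rounds, or as soon as a round deletes fewer rows than
    batch_size, which suggests there is nothing left to collect.
    """
    total = 0
    for _ in range(max_rounds):
        deleted = delete_batch(batch_size)
        total += deleted
        if deleted < batch_size:
            break
    return total


def make_fake_store(row_count: int) -> Callable[[int], int]:
    """Hypothetical stand-in for a bounded delete over stored events."""
    remaining = [row_count]

    def delete_batch(limit: int) -> int:
        # Delete at most `limit` rows and report how many were removed.
        deleted = min(limit, remaining[0])
        remaining[0] -= deleted
        return deleted

    return delete_batch
```

Under these assumptions a single run removes at most 1000 × 100 = 100,000 rows, so a path with 2 million stored events would need several runs to clear.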


pbiggar commented Jul 28, 2020

Before merging, I should:

  • add tests for 404s
  • add a test for wildcards
  • check that the SELECT DISTINCT ON query has an index

Note that I've already added the index in production.


pbiggar commented Jul 28, 2020

The SELECT DISTINCT ON query is from Stored_events.list_events, which is used to generate the 404 list. For our biggest user, generating that list takes 300 seconds. If we drop the trace_id and the timestamp, the query is fully indexed and takes 15 seconds. The trace_id is used when converting a 404 to a handler, which is annoying.

I'm experimenting with adding an index. Since the trace_id is used in 404s, we can't just get rid of it.
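
For context, Postgres's SELECT DISTINCT ON keeps one row per distinct key: the first row of each group under the query's ORDER BY. A rough Python sketch of that behaviour is below. It assumes the key is (path, modifier) and that the newest event wins; the `Event` type and `latest_per_route` are hypothetical, and the real query in Stored_events.list_events may differ.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Event(NamedTuple):
    # Field names are assumptions based on the columns mentioned above.
    path: str
    modifier: str
    timestamp: int
    trace_id: str


def latest_per_route(events: Iterable[Event]) -> Dict[Tuple[str, str], Event]:
    """Keep the newest event for each (path, modifier) pair."""
    latest: Dict[Tuple[str, str], Event] = {}
    for event in events:
        key = (event.path, event.modifier)
        current = latest.get(key)
        if current is None or event.timestamp > current.timestamp:
            latest[key] = event
    return latest
```

The sketch shows why trace_id and timestamp are awkward to drop: the key alone says which routes have 404s, but the extra columns say which trace to use when a 404 is converted to a handler.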

@pbiggar pbiggar merged commit efac0b4 into main Jul 28, 2020
@pbiggar pbiggar deleted the paul/fix-gc branch July 28, 2020 19:34