A concurrent version of trie index #334

bom-d-van · 2020-02-11T13:22:06Z

This is a follow-up to #332.

With this change, go-carbon no longer needs to keep two versions of the trie in memory, so in theory the memory overhead could be cut down even further. More measurements are needed to put numbers on this.

This could also make new metrics available in "realtime" without waiting for re-scanning-and-indexing to be finished.
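As a rough illustration of the idea (not the code in this PR), the sketch below keeps a single trie and updates it in place under a read/write lock, so a lookup sees a newly inserted metric immediately and no second copy of the tree is built. The `node` and `index` types and the use of `sync.RWMutex` are assumptions made for the example; the actual implementation may synchronize differently.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// node is one path component of a metric name.
// Hypothetical layout, not go-carbon's actual trieNode.
type node struct {
	children map[string]*node
	leaf     bool
}

// index is a single trie guarded by a read/write lock (an assumption
// for this sketch), so readers and writers share one tree.
type index struct {
	mu   sync.RWMutex
	root *node
}

func newIndex() *index {
	return &index{root: &node{children: map[string]*node{}}}
}

// Insert adds a dot-separated metric name to the live trie.
func (ix *index) Insert(metric string) {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	cur := ix.root
	for _, part := range strings.Split(metric, ".") {
		next, ok := cur.children[part]
		if !ok {
			next = &node{children: map[string]*node{}}
			cur.children[part] = next
		}
		cur = next
	}
	cur.leaf = true
}

// Has reports whether the exact metric name is indexed.
func (ix *index) Has(metric string) bool {
	ix.mu.RLock()
	defer ix.mu.RUnlock()
	cur := ix.root
	for _, part := range strings.Split(metric, ".") {
		next, ok := cur.children[part]
		if !ok {
			return false
		}
		cur = next
	}
	return cur.leaf
}

func main() {
	ix := newIndex()
	ix.Insert("servers.web01.cpu.user")
	fmt.Println(ix.Has("servers.web01.cpu.user")) // the new metric is visible right away
	fmt.Println(ix.Has("servers.web01.cpu"))      // an inner node is not a metric
}
```

The trade-off in this sketch is that writers briefly block readers; it avoids the memory cost of holding an old and a new tree at the same time.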

deniszh · 2020-07-08T10:36:43Z

Is this PR still relevant, @bom-d-van @azhiltsov ?

azhiltsov · 2020-07-08T17:39:26Z

We are currently running two types of indexes in production:

* Trie + trigram (the majority of cases; gives quite a significant performance boost on hosts storing 5M+ metrics)
* Pure trigram, for clusters badly affected by improper metric namespace design plus a lot of queries like foo.{sigma,delta}bar.*.bla.*bla.car

Since this feature promises to cut the memory footprint and also remove the need for a metric scan every 5 minutes or so, I see a lot of value in it. However, I do not know how much time and effort @bom-d-van will need to finish it.

bom-d-van · 2020-07-08T20:02:32Z

@deniszh @azhiltsov Thanks for the input. If there is a chance of adoption, I am happy to pull it off.

Last time I tested it, there seemed to be some memory leaks going on. Again, it is going to take some time before I have updates. I think I was attacking too many issues/features in the Go Graphite stack at the same time. Sorry if it bothers you. (I was constantly looking for interesting stuff to work on, but new ideas keep showing up.)

And big thanks to @deniszh for doing all the follow-up and the tough work.

deniszh · 2020-07-09T07:37:13Z

Again, many thanks to @bom-d-van for pushing this forward. No rush, let's return to this later. Thanks!

bom-d-van · 2020-10-20T07:00:50Z

This PR introduces three new features: two for the trie index (realtime-index and concurrent-index) and one for both index types (file-list-cache).

edit: comment repost: #374 (comment)


deniszh · 2020-10-20T15:45:46Z

Thanks, merging this, @bom-d-van !

azhiltsov reviewed Feb 11, 2020

carbon/config.go

deniszh added the WIP label Jul 9, 2020

bom-d-van mentioned this pull request Jul 9, 2020

Next release date #333

Open

bom-d-van force-pushed the ctrie branch 2 times, most recently from 5414564 to 3de57e9 August 20, 2020 08:01

bom-d-van force-pushed the ctrie branch from 3de57e9 to 2a3fad2 September 17, 2020 11:28

bom-d-van mentioned this pull request Sep 26, 2020

[BUG] newly created metrics not fetchable even with cache-scan turned on #372

Open

bom-d-van added 8 commits October 18, 2020 16:34

trie: support concurrent trie updates

6a5a079

index: add file list cache support

d18e872

trie: fix incorrect prune logics

765f8c8

* file nodes are no longer a global variable, as pruning requires a mark value (extra memory cost: O(n))
* prune children on live trie nodes rather than on the walking copy (this was causing a memory leak, as many old nodes were not properly pruned)

index: fix file-list-cache typo

b26d85f

ctrie: refactoring, bug fixes and tests

9ef9b5c

* rename trieNode.mark to gen
* prune: fix merging logic
* insert: fix incorrect gen syncing for splits
* improve prune and real test (check both dump and node count after prune)

ctrie: rebase master and more slice index checks in walking funcs

2016948

ctrie: support realtime discovery of new metrics (realtime index)

39e432f

ctrie: ci and bug fixes

6b5ff11

* fix ci errors
* handle potential index out of bound in all the trie walking funcs
* refactor test logs

bom-d-van force-pushed the ctrie branch from de7337e to 6b5ff11 October 18, 2020 14:35

ctrie: fix linter issues and doc about file-list-cache

412b9f6

deniszh mentioned this pull request Oct 19, 2020

[BUG] Go-carbon open-files keeps increasing over time #374

Closed

bom-d-van marked this pull request as ready for review October 20, 2020 06:59

bom-d-van marked this pull request as draft October 20, 2020 08:05

bom-d-van self-assigned this Oct 20, 2020

bom-d-van marked this pull request as ready for review October 20, 2020 08:28

deniszh reviewed Oct 20, 2020

README.md

deniszh reviewed Oct 20, 2020

deploy/go-carbon.conf

deniszh removed the WIP label Oct 20, 2020

ctrie: fix a few typos

ba63fa9

deniszh merged commit 65e955e into go-graphite:master Oct 20, 2020