Skip to content

batch tag upserts, remove randint in tag sampling#699

Merged
DanCech merged 3 commits intographite-project:masterfrom
DanCech:multiTag
Dec 4, 2017
Merged

batch tag upserts, remove randint in tag sampling#699
DanCech merged 3 commits intographite-project:masterfrom
DanCech:multiTag

Conversation

@DanCech
Copy link
Copy Markdown
Member

@DanCech DanCech commented Nov 23, 2017

This PR complements graphite-project/graphite-web#2125 and switches writer to batch tag updates.

The implementation is pretty basic, series created and updated are added to 2 queues which are then processed in batches in a separate thread. 2 queues are used so that new series are prioritized when writing to the TagDB, and so that they don't compete with updates for space in the queue. Each queue has a maximum length of TAG_QUEUE_SIZE.

The writer will write at most TAG_BATCH_SIZE paths in each request, and will process the queue as quickly as possible until the queue is empty. I'm not sure if we want to add a delay in between batches to avoid overloading graphite-web.

The other change here is to remove the use of random.randint in selecting which updates to send to the tagdb, instead just using a global counter and sending every TAG_UPDATE_INTERVALth path. As long as the total number of series being processed isn't an exact multiple or factor of TAG_UPDATE_INTERVAL this will still result in every series getting upserted periodically.

Finally I updated run_twistd_plugin to enable debug logging when passing --debug on the command line.

As always, any and all suggestions for improvement are appreciated.

H/T to @Civil for the suggestion of batching the tagdb updates.

@DanCech DanCech added this to the 1.1.0 milestone Nov 23, 2017
@DanCech DanCech self-assigned this Nov 23, 2017
@DanCech DanCech requested review from deniszh and iksaif November 23, 2017 17:58
def writeTagsForever():
if reactor.running:
try:
written = writeTags()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe stupid question - why not just call writeTags()? Why assign to written and never check result? And writeTags() should not return anything, right?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that was a by-product of some refactoring that I need to clean up

@DanCech
Copy link
Copy Markdown
Member Author

DanCech commented Nov 24, 2017

I realized that state.database.tag() was previously set up not to wait for the http call to complete, the latest commit changes that so that state.database.tag() returns a Deferred and writeTags() waits for it to complete before firing off the next request.

@DanCech DanCech merged commit 190270d into graphite-project:master Dec 4, 2017
@DanCech DanCech deleted the multiTag branch December 4, 2017 21:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants