AIA-186 Add a delay for Historical Data #26

Merged
afiune merged 6 commits into master from afiune/historical-data-delay
Mar 23, 2018

@afiune afiune commented Mar 20, 2018

This PR lets chef-load pause when we run the chef-load generate --days_back X command. Previously we blasted the system with a continuous stream of requests; now chef-load detects failures and sleeps for the configured amount of time so that A2 can relieve pressure in the ingestion pipeline.

Example of the execution:

$ chef-load generate --config chef-load.toml -n 10000 --threads 4000 --days_back 30 --sleep_time_on_failure 15

This command ingests 10,000 nodes with 30 days of historical data, sending them in batches of 4,000 threads and checking each batch for failures. If a failure is found, chef-load waits 15 seconds before resuming ingestion.

Next steps

One thing we could do next is make chef-load smarter so that it increases the sleep time depending on how often there are failures per batch. Think of it as finding the right amount of time that the system needs to ingest all the messages it has buffered.

Signed-off-by: Salim Afiune afiune@chef.io

@afiune afiune force-pushed the afiune/historical-data-delay branch 2 times, most recently from 51d819e to 1ec566a Compare March 20, 2018 20:22
code := 999
res, err := client.Update(nodeName, body)
if res != nil {
code = res.StatusCode
Contributor

Why get the status code only when there is an error?

Author

Oh it is not only when there is an error, it is only when there is a response 😄

Contributor

Sorry, I did not see this. Oh, I see now. The 'res' variable will be 'nil' if there is not an error.

lib/generator.go Outdated
ccrsPerDay = 1
ccrsTotal = 1
chefClient chef.Client
chanSize = 2500 // @afiune: we could play with this number
Contributor

Good idea 🏅

lib/generator.go Outdated
if config.NumNodes > chanSize {
// If the number of nodes is bigger than the channel
// size, lets calculate how many loops we need to run
loops = config.NumNodes / chanSize
Contributor

This might not be a whole number, in which case the result would be truncated.

Contributor

For now we can just pick numbers for NumNodes and chanSize that are divisible.

Contributor

I see on line 122 you are dealing with the truncation of the division here. I think you would need to do a math.Ceil for that to be useful.
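To make the truncation concrete, here is a small sketch comparing plain integer division with the math.Ceil approach. The function names are hypothetical and the values are taken from the example command in the PR description (10,000 nodes, batches of 4,000 or 2,500).

```go
package main

import (
	"fmt"
	"math"
)

// batchesTruncated uses integer division, which drops the remainder and
// therefore misses the final partial batch.
func batchesTruncated(numNodes, size int) int {
	return numNodes / size
}

// batchesCeil converts to float64 first so that math.Ceil can round the
// quotient up to the next whole batch.
func batchesCeil(numNodes, size int) int {
	return int(math.Ceil(float64(numNodes) / float64(size)))
}

func main() {
	fmt.Println(batchesTruncated(10000, 4000)) // 2 full batches, 2,000 nodes left over
	fmt.Println(batchesCeil(10000, 4000))      // 3 batches, the last one partial
	fmt.Println(batchesCeil(10000, 2500))      // 4 batches, divides evenly
}
```

Note that calling math.Ceil on the result of an integer division has no effect, because the remainder is already gone by then; the conversion to float64 has to happen before dividing.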

Author

Let's chat, because I would like to hear your ideas.

lib/generator.go Outdated
"nodes": config.NumNodes,
"days_back": config.DaysBack,
}).Info("Sleeping")
time.Sleep(time.Second * 3)
Contributor

You might want to make the sleep time configurable also.

lib/generator.go Outdated
for c = 0; c < ccrsTotal; c++ {

// Loops * chanSize = NumNodes (ish)
for j := 0; j < loops; j++ {
Contributor

You could rename loops to batches or numberOfBatches.

Salim Afiune added 2 commits March 22, 2018 14:38
Signed-off-by: Salim Afiune <afiune@chef.io>
Signed-off-by: Salim Afiune <afiune@chef.io>
@afiune afiune force-pushed the afiune/historical-data-delay branch from 1ec566a to 16bc4c0 Compare March 22, 2018 18:51
Signed-off-by: Salim Afiune <afiune@chef.io>
lib/generator.go Outdated
nodeNum := i + (j * config.Threads)
// The trick here is to stop the last loop when we reach
// the total number of nodes that we want to load
if nodeNum > config.NumNodes {
Author

I think this should be >= instead. 🤔 cc @lancewf
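The off-by-one can be checked with a small counting sketch. It assumes that `i` and `j` are zero-based, which the visible part of the diff does not show, and `countGenerated` is a hypothetical helper that only counts how many node numbers pass the stop check.

```go
package main

import "fmt"

// countGenerated walks the same nodeNum := i + j*threads scheme as the diff
// and counts how many node numbers are accepted before stop returns true.
// Assumption: i and j both start at 0, so valid node numbers are
// 0 .. numNodes-1.
func countGenerated(numNodes, threads int, stop func(nodeNum, numNodes int) bool) int {
	batches := (numNodes + threads - 1) / threads // round up
	count := 0
	for j := 0; j < batches; j++ {
		for i := 0; i < threads; i++ {
			nodeNum := i + j*threads
			if stop(nodeNum, numNodes) {
				break
			}
			count++
		}
	}
	return count
}

func main() {
	greater := func(n, total int) bool { return n > total }
	greaterOrEqual := func(n, total int) bool { return n >= total }

	fmt.Println(countGenerated(10, 4, greater))        // one node too many
	fmt.Println(countGenerated(10, 4, greaterOrEqual)) // exactly 10
	fmt.Println(countGenerated(8, 4, greater))         // divisible case hides the bug
}
```

Under that assumption the `>` check only misbehaves when NumNodes is not a multiple of Threads, which may explain why it is easy to miss in testing.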

Salim Afiune added 2 commits March 22, 2018 17:19
Signed-off-by: Salim Afiune <afiune@chef.io>
This commit is fixing a bunch of things:
* Use Ceil to round the batches
* Create the channels with the right size
* Sum the right number of ingested messages

Signed-off-by: Salim Afiune <afiune@chef.io>
@kmacgugan kmacgugan left a comment

(reaction GIF)

Contributor

@lancewf lancewf left a comment

Great job handling all the threads.

rand.Seed(time.Now().UTC().UnixNano())

// Lets try to use a smaller number of goroutines
if config.NumNodes > config.Threads {
Contributor

This is great to see! We don't need 10,000 goroutines for 10,000 threads. 🥇

lib/generator.go Outdated
if config.NumNodes > config.Threads {
// If the number of nodes is bigger than the channel
// size, lets calculate how many batches we need to run
batches = int(math.Ceil(float64(config.NumNodes) / float64(config.Threads)))
Contributor

This avoids the float precision problem with large numbers:

batches = int(config.NumNodes / config.Threads)
if config.NumNodes % config.Threads != 0 { // if there is a remainder, add one to do the same as Ceil
  batches++
}
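The precision concern can be demonstrated with a value just above 2^53, the point where float64 can no longer represent every integer. This is an illustrative sketch with hypothetical function names; it uses int64 so the result does not depend on the platform's int size. Node counts in practice are nowhere near this range, so the difference is about robustness rather than an observed bug.

```go
package main

import (
	"fmt"
	"math"
)

// ceilDivFloat rounds up via float64, as in the math.Ceil version.
func ceilDivFloat(n, size int64) int64 {
	return int64(math.Ceil(float64(n) / float64(size)))
}

// ceilDivInt rounds up using only integer arithmetic, as suggested above.
// It assumes n >= 0 and size > 0.
func ceilDivInt(n, size int64) int64 {
	q := n / size
	if n%size != 0 {
		q++
	}
	return q
}

func main() {
	// For everyday values both versions agree.
	fmt.Println(ceilDivFloat(10000, 4000), ceilDivInt(10000, 4000))

	// 2^53 + 1 cannot be represented exactly as a float64, so the
	// conversion silently changes the value before the division happens.
	const big int64 = 1<<53 + 1
	fmt.Println(ceilDivFloat(big, 1) == big) // false
	fmt.Println(ceilDivInt(big, 1) == big)   // true
}
```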

Signed-off-by: Salim Afiune <afiune@chef.io>
@afiune afiune merged commit 99b017c into master Mar 23, 2018
@afiune afiune deleted the afiune/historical-data-delay branch March 23, 2018 00:43
afiune commented Mar 23, 2018

Habitat package built and uploaded to the depot!

★ Upload of chef/chef-load/4.0.0/20180323004235 complete.

@afiune afiune changed the title Add a delay for Historical Data AIA-186 Add a delay for Historical Data Mar 23, 2018