51d819e to 1ec566a
code := 999
res, err := client.Update(nodeName, body)
if res != nil {
	code = res.StatusCode
Why get the status code only when there is an error?
Oh it is not only when there is an error, it is only when there is a response 😄
Sorry, I did not see this. Oh, I see now. The 'res' variable will be 'nil' if there is not an error.
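To make the guard in this thread concrete, here is a minimal sketch of the same pattern pulled into a helper. The `response` type is a hypothetical stand-in for whatever `client.Update` actually returns, and `statusCodeOf` and `noResponseCode` are names invented for illustration; only the 999 default and the nil check come from the diff.

```go
package main

// response is a hypothetical stand-in for the value returned by
// client.Update; only the StatusCode field matters for this sketch.
type response struct {
	StatusCode int
}

// noResponseCode is the placeholder used when there is no response at all.
// The value 999 mirrors the default in the diff under review.
const noResponseCode = 999

// statusCodeOf returns the status code of res, or noResponseCode when res
// is nil, so the caller never dereferences a nil pointer.
func statusCodeOf(res *response) int {
	if res == nil {
		return noResponseCode
	}
	return res.StatusCode
}
```

With a helper like this the call site only needs `code := statusCodeOf(res)`, and the 999 placeholder has a single, named home.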
lib/generator.go
Outdated
ccrsPerDay = 1
ccrsTotal = 1
chefClient chef.Client
chanSize = 2500 // @afiune: we could play with this number
lib/generator.go
Outdated
if config.NumNodes > chanSize {
	// If the number of nodes is bigger than the channel
	// size, lets calculate how many loops we need to run
	loops = config.NumNodes / chanSize
This might not be a whole number, in which case the result would be truncated.
For now we can just pick numbers for NumNodes and chanSize that are evenly divisible.
I see on line 122 you are dealing with the truncation of the division here. I think you would need to do a math.Ceil for that to be useful.
Let's chat because I would like to hear your ideas.
lib/generator.go
Outdated
"nodes": config.NumNodes,
"days_back": config.DaysBack,
}).Info("Sleeping")
time.Sleep(time.Second * 3)
You might want to make the sleep time configurable also.
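One way to act on this suggestion is sketched below. It assumes a new integer option on the config; the `Config` struct shown here, the `SleepSeconds` field, and the `sleepDuration` helper are all hypothetical names, not part of chef-load as reviewed. The 3 second default is taken from the hard-coded value in the diff.

```go
package main

import "time"

// Config is a hypothetical subset of the generator configuration.
type Config struct {
	// SleepSeconds is an assumed option name; zero or negative means
	// "fall back to the default".
	SleepSeconds int
}

// defaultSleep matches the value hard-coded in the diff under review.
const defaultSleep = 3 * time.Second

// sleepDuration converts the configured number of seconds into a
// time.Duration, falling back to defaultSleep when nothing usable is set.
func sleepDuration(c Config) time.Duration {
	if c.SleepSeconds <= 0 {
		return defaultSleep
	}
	return time.Duration(c.SleepSeconds) * time.Second
}
```

The call site would then become `time.Sleep(sleepDuration(config))` in place of the fixed `time.Sleep(time.Second * 3)`.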
lib/generator.go
Outdated
for c = 0; c < ccrsTotal; c++ {

	// Loops * chanSize = NumNodes (ish)
	for j := 0; j < loops; j++ {
You could rename loops to batches or numberOfBatches.
Signed-off-by: Salim Afiune <afiune@chef.io>
Signed-off-by: Salim Afiune <afiune@chef.io>
1ec566a to 16bc4c0
Signed-off-by: Salim Afiune <afiune@chef.io>
lib/generator.go
Outdated
nodeNum := i + (j * config.Threads)
// The trick here is to stop the last loop when we reach
// the total number of nodes that we want to load
if nodeNum > config.NumNodes {
Signed-off-by: Salim Afiune <afiune@chef.io>
This commit is fixing a bunch of things:
* Use Ceil to round the batches
* Create the channels with the right size
* Sum the right number of ingested messages

Signed-off-by: Salim Afiune <afiune@chef.io>
rand.Seed(time.Now().UTC().UnixNano())

// Lets try to use a smaller number of goroutines
if config.NumNodes > config.Threads {
This is great to see! We don't need 10,000 goroutines for 10,000 threads. 🥇
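For readers following along, the batching idea can be sketched roughly as below. This is not the code in lib/generator.go; `runInBatches` and its `work` callback are hypothetical names, and the sketch numbers nodes from 0, which is why the stop condition is `>=`. If the real code numbers nodes differently, the comparison would need to change to match.

```go
package main

import "sync"

// runInBatches calls work(nodeNum) once for every nodeNum in [0, numNodes),
// starting at most threads goroutines per batch and waiting for each batch
// to finish before starting the next one. It returns the number of batches.
func runInBatches(numNodes, threads int, work func(nodeNum int)) int {
	if numNodes <= 0 || threads <= 0 {
		return 0
	}
	// Integer ceiling division: the last batch may be partially filled.
	batches := (numNodes + threads - 1) / threads
	for j := 0; j < batches; j++ {
		var wg sync.WaitGroup
		for i := 0; i < threads; i++ {
			nodeNum := i + j*threads
			// Stop the last batch once every node has been scheduled.
			if nodeNum >= numNodes {
				break
			}
			wg.Add(1)
			go func(n int) {
				defer wg.Done()
				work(n)
			}(nodeNum)
		}
		wg.Wait()
	}
	return batches
}
```

With 10,000 nodes and 4,000 threads this runs three batches, and no more than 4,000 goroutines exist at any one time.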
lib/generator.go
Outdated
if config.NumNodes > config.Threads {
	// If the number of nodes is bigger than the channel
	// size, lets calculate how many batches we need to run
	batches = int(math.Ceil(float64(config.NumNodes) / float64(config.Threads)))
This removes the float precision problem with large numbers:
batches = int(config.NumNodes / config.Threads)
if config.NumNodes % config.Threads != 0 { // if there is a remainder, add one to do the same as Ceil
	batches++
}
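The suggestion above can also be wrapped in a small helper so it can be checked in isolation. This is only a sketch: `numBatches` is a hypothetical name, and the zero-value guard is an added assumption to avoid dividing by zero.

```go
package main

// numBatches returns how many batches of at most batchSize items are needed
// to cover total items, that is ceil(total / batchSize), using integer
// arithmetic only so no float conversion is involved.
func numBatches(total, batchSize int) int {
	// Guard added for this sketch: nothing to do, or no valid batch size.
	if total <= 0 || batchSize <= 0 {
		return 0
	}
	batches := total / batchSize
	if total%batchSize != 0 {
		// A remainder means one extra, partially filled batch.
		batches++
	}
	return batches
}
```

For example, 10,000 nodes with 4,000 threads gives 2 full batches plus a remainder of 2,000, so 3 batches in total.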
Signed-off-by: Salim Afiune <afiune@chef.io>



This PR enables chef-load to pause when we execute the chef-load generate --days_back X command. Before, we were blasting the system with an uninterrupted loop of requests; now chef-load detects failures and sleeps for the provided amount of time so that A2 can release pressure in the ingestion pipeline.
Example of the execution:
This command will ingest 10k nodes with 30 days of historical data, using 4,000 threads and checking each batch for failures; if a failure is found, chef-load waits 15 seconds before resuming the ingestion.
Next steps
One thing we could do next is to make chef-load smarter so that it increases the sleep time depending on how often there are failures per batch. Think of it as finding the right amount of time that the system needs to ingest all the messages that it has buffered.
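As a starting point for that discussion, here is one possible policy, sketched under assumptions: double the pause after a batch with failures, cap it at a maximum, and fall back to the base pause after a clean batch. This is plain exponential backoff, not something the PR implements, and `nextSleep` and its parameters are hypothetical names.

```go
package main

import "time"

// nextSleep returns how long to pause before the next batch.
// After a failed batch the pause doubles, capped at max.
// After a clean batch the pause resets to base.
func nextSleep(current, base, max time.Duration, batchFailed bool) time.Duration {
	if !batchFailed {
		return base
	}
	// Never grow from something smaller than the base pause.
	if current < base {
		current = base
	}
	next := current * 2
	if next > max {
		return max
	}
	return next
}
```

Starting from a 15 second base with a 2 minute cap, consecutive failing batches would pause for 30s, 60s, 120s, and then stay at 120s until a batch succeeds. A gentler policy (for example, halving instead of resetting after a clean batch) may suit the ingestion pipeline better; that would need measuring.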
Signed-off-by: Salim Afiune <afiune@chef.io>