AIA-186 Add a delay for Historical Data by afiune · Pull Request #26 · chef/chef-load

afiune · 2018-03-20T20:07:05Z

This PR lets chef-load pause when we run the chef-load generate --days_back X command. Previously we were blasting the system with a nonstop loop of requests; now chef-load detects failures and sleeps for the provided amount of time so that A2 can relieve pressure in the ingestion pipeline.

Example of the execution:

$ chef-load generate --config chef-load.toml -n 10000 --threads 4000 --days_back 30 --sleep_time_on_failure 15

This command will ingest 10k nodes with 30 days of historical data, using 4,000 threads per batch and checking each batch for failures. If a failure is found, chef-load will wait 15 seconds before resuming the ingestion.
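To make the batch-and-sleep behaviour described above concrete, here is a minimal sketch. It is not the code from this PR: `ingestInBatches`, `sendFunc`, and `failEvery` are hypothetical names, and the real chef-load loop also iterates over days and chef-client runs.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// sendFunc is a hypothetical stand-in for the call that ingests one node.
// It returns an error when the ingestion fails.
type sendFunc func(nodeNum int) error

// failEvery returns a sendFunc that fails for every k-th node number
// (illustration only).
func failEvery(k int) sendFunc {
	return func(n int) error {
		if n%k == 0 {
			return fmt.Errorf("node %d failed", n)
		}
		return nil
	}
}

// ingestInBatches sends numNodes nodes in batches of at most `threads`
// concurrent goroutines. After each batch it counts the failures and, if
// there were any, sleeps for sleepOnFailure before the next batch starts.
// It returns the total number of failed sends.
func ingestInBatches(numNodes, threads int, sleepOnFailure time.Duration, send sendFunc) int {
	totalFailures := 0
	for start := 0; start < numNodes; start += threads {
		end := start + threads
		if end > numNodes {
			end = numNodes // the last batch may be smaller
		}

		var wg sync.WaitGroup
		errs := make(chan error, end-start) // buffered so senders never block
		for n := start; n < end; n++ {
			wg.Add(1)
			go func(n int) {
				defer wg.Done()
				errs <- send(n)
			}(n)
		}
		wg.Wait()
		close(errs)

		failures := 0
		for err := range errs {
			if err != nil {
				failures++
			}
		}
		totalFailures += failures
		if failures > 0 {
			// Give the ingestion pipeline time to drain its buffer.
			time.Sleep(sleepOnFailure)
		}
	}
	return totalFailures
}

func main() {
	failed := ingestInBatches(10, 4, 10*time.Millisecond, failEvery(4))
	fmt.Println("failed sends:", failed)
}
```

Sizing the error channel to the batch keeps every goroutine from blocking on send, so the batch can be fully awaited before failures are counted.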

Next steps

One thing we could do next is make chef-load smarter so that it increases the sleep time depending on how often there are failures per batch. Think of it as finding the right amount of time that the system needs to ingest all the messages that it has buffered.
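One possible shape for that next step, sketched under assumptions: the sleep time doubles after every failing batch (an exponential-backoff style rule, which is only one of several ways to react to failure frequency) and resets after a clean batch. `nextSleep` and its parameters are hypothetical and do not exist in chef-load.

```go
package main

import (
	"fmt"
	"time"
)

// nextSleep returns the sleep time to use after a batch.
// A clean batch resets the sleep time to base; a failing batch doubles the
// current sleep time, capped at limit.
func nextSleep(current, base, limit time.Duration, batchFailed bool) time.Duration {
	if !batchFailed {
		return base
	}
	next := current * 2
	if next > limit {
		return limit
	}
	return next
}

func main() {
	base, limit := 15*time.Second, 2*time.Minute
	sleep := base
	// Simulated batch outcomes: four failing batches, then a clean one.
	for _, failed := range []bool{true, true, true, true, false} {
		sleep = nextSleep(sleep, base, limit, failed)
		fmt.Println("batch failed:", failed, "next sleep:", sleep)
	}
}
```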

Signed-off-by: Salim Afiune <afiune@chef.io>

lancewf · 2018-03-20T21:17:07Z

lib/data_collector.go

+	code := 999
+	res, err := client.Update(nodeName, body)
+	if res != nil {
+		code = res.StatusCode


Why get the status code only when there is an error?

Oh it is not only when there is an error, it is only when there is a response 😄

Sorry, I did not see this. Oh, I see now. The 'res' variable will be 'nil' if there is not an error.
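The pattern discussed in this thread, sketched in isolation: start from a sentinel status code and overwrite it only when a response object exists. The `response` type here is a hypothetical stand-in for whatever `client.Update` returns, and reading 999 as "no response at all" is an assumption based on the snippet.

```go
package main

import "fmt"

// response is a hypothetical stand-in for the client's response type.
type response struct {
	StatusCode int
}

// noResponseCode is the sentinel used when there is no response to inspect.
const noResponseCode = 999

// statusCodeOf returns the response's status code, or the sentinel when the
// response is nil.
func statusCodeOf(res *response) int {
	code := noResponseCode
	if res != nil {
		code = res.StatusCode
	}
	return code
}

func main() {
	fmt.Println(statusCodeOf(nil))
	fmt.Println(statusCodeOf(&response{StatusCode: 201}))
}
```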

lancewf · 2018-03-20T21:18:25Z

lib/generator.go

-		ccrsPerDay = 1
-		ccrsTotal  = 1
+		chefClient   chef.Client
+		chanSize           = 2500 // @afiune: we could play with this number


Good idea 🏅

lancewf · 2018-03-20T21:23:13Z

lib/generator.go

+	if config.NumNodes > chanSize {
+		// If the number of nodes is bigger than the channel
+		// size, lets calculate how many loops we need to run
+		loops = config.NumNodes / chanSize


This might not be a whole number, in which case the result would be truncated.

For now we can just pick numbers for NumNodes and chanSize that are divisible.

I see on line 122 you are dealing with the truncation of the division here. I think you would need to do a math.Ceil for that to be useful.

Let's chat because I would like to hear your ideas.

lancewf · 2018-03-20T21:45:22Z

lib/generator.go

+					"nodes":                           config.NumNodes,
+					"days_back":                       config.DaysBack,
+				}).Info("Sleeping")
+				time.Sleep(time.Second * 3)


You might want to make the sleep time configurable also.

lancewf · 2018-03-20T21:48:21Z

lib/generator.go

+	for c = 0; c < ccrsTotal; c++ {
+
+		// Loops * chanSize = NumNodes (ish)
+		for j := 0; j < loops; j++ {


You could rename loops to batches or numberOfBatches.


afiune · 2018-03-22T19:21:05Z

lib/generator.go

+				nodeNum := i + (j * config.Threads)
+				// The trick here is to stop the last loop when we reach
+				// the total number of nodes that we want to load
+				if nodeNum > config.NumNodes {


I think this should be >= instead. 🤔 cc @lancewf
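A small sketch of why the comparison matters, assuming the inner index `i` starts at 0 so that node numbers run from 0 to NumNodes-1 (the snippet does not show where `i` starts). `countNodes` is a hypothetical helper that only counts how many node numbers get past the guard.

```go
package main

import "fmt"

// countNodes walks every batch slot and counts the node numbers that pass
// the guard. With inclusive=true the guard stops at nodeNum >= numNodes;
// otherwise it stops at nodeNum > numNodes.
func countNodes(numNodes, threads int, inclusive bool) int {
	batches := (numNodes + threads - 1) / threads // round up
	count := 0
	for j := 0; j < batches; j++ {
		for i := 0; i < threads; i++ {
			nodeNum := i + (j * threads)
			if inclusive && nodeNum >= numNodes {
				break
			}
			if !inclusive && nodeNum > numNodes {
				break
			}
			count++
		}
	}
	return count
}

func main() {
	fmt.Println("guard >= :", countNodes(10, 4, true))
	fmt.Println("guard >  :", countNodes(10, 4, false))
}
```

Under that zero-based assumption, the `>` guard lets node number 10 through when only nodes 0 to 9 were requested, which is the extra node that `>=` prevents.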


kmacgugan

lancewf

Great job handling all the threads.

lancewf · 2018-03-22T21:39:05Z

lib/data_collector.go

+	code := 999
+	res, err := client.Update(nodeName, body)
+	if res != nil {
+		code = res.StatusCode


Sorry, I did not see this. Oh, I see now. The 'res' variable will be 'nil' if there is not an error.

lancewf · 2018-03-22T21:40:33Z

lib/generator.go

 	rand.Seed(time.Now().UTC().UnixNano())

+	// Lets try to use a smaller number of goroutines
+	if config.NumNodes > config.Threads {


This is great to see! We don't need 10,000 goroutines for 10,000 threads. 🥇

lancewf · 2018-03-22T22:11:39Z

lib/generator.go

+	if config.NumNodes > config.Threads {
+		// If the number of nodes is bigger than the channel
+		// size, lets calculate how many batches we need to run
+		batches = int(math.Ceil(float64(config.NumNodes) / float64(config.Threads)))


This removes the float precision problem with large numbers.

	batches = int(config.NumNodes / config.Threads)
	if config.NumNodes % config.Threads != 0 {
		// if there is a remainder, add one to do the same as Ceil
		batches++
	}
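To illustrate the precision concern, here is a self-contained comparison of the two rounding approaches. The function names are hypothetical, `int64` is used so the example behaves the same on every platform, and both functions assume positive operands. A float64 has a 53-bit significand, so integers above 2^53 are not all representable.

```go
package main

import (
	"fmt"
	"math"
)

// ceilDivFloat rounds a/b up by going through float64, as math.Ceil requires.
func ceilDivFloat(a, b int64) int64 {
	return int64(math.Ceil(float64(a) / float64(b)))
}

// ceilDivInt rounds a/b up using only integer arithmetic.
func ceilDivInt(a, b int64) int64 {
	q := a / b
	if a%b != 0 {
		q++ // there is a remainder, so round up
	}
	return q
}

func main() {
	fmt.Println(ceilDivFloat(10, 4), ceilDivInt(10, 4))

	// 2^53 + 1 cannot be represented exactly as a float64.
	big := int64(1)<<53 + 1
	fmt.Println("float version exact:", ceilDivFloat(big, 1) == big)
	fmt.Println("int version exact:  ", ceilDivInt(big, 1) == big)
}
```

For realistic node counts both versions agree; the integer version simply avoids the conversion altogether.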


afiune · 2018-03-23T00:45:13Z

Habitat package built and uploaded to the depot!

★ Upload of chef/chef-load/4.0.0/20180323004235 complete.

afiune force-pushed the afiune/historical-data-delay branch 2 times, most recently from 51d819e to 1ec566a Compare March 20, 2018 20:22

lancewf reviewed Mar 20, 2018


Salim Afiune added 2 commits March 22, 2018 14:38

Add a delay for Historical Data

a96b178

Signed-off-by: Salim Afiune <afiune@chef.io>

Add Threads and SleepTime to historical data feature

16bc4c0

Signed-off-by: Salim Afiune <afiune@chef.io>

afiune force-pushed the afiune/historical-data-delay branch from 1ec566a to 16bc4c0 Compare March 22, 2018 18:51

Rename loops for batches

ae81b1d

Signed-off-by: Salim Afiune <afiune@chef.io>

afiune commented Mar 22, 2018


Salim Afiune added 2 commits March 22, 2018 17:19

Fix generate flags for historical data

1a801b5

Signed-off-by: Salim Afiune <afiune@chef.io>

Fix the generate loop for historical data

d67ded6

This commit is fixing a bunch of things:
* Use Ceil to round the batches
* Create the channels with the right size
* Sum the right number of ingested messages

Signed-off-by: Salim Afiune <afiune@chef.io>

kmacgugan approved these changes Mar 22, 2018


lancewf approved these changes Mar 22, 2018


lancewf reviewed Mar 22, 2018


Fix the float precision problem with large numbers

e83e95f

Signed-off-by: Salim Afiune <afiune@chef.io>

afiune merged commit 99b017c into master Mar 23, 2018

afiune deleted the afiune/historical-data-delay branch March 23, 2018 00:43

afiune changed the title ~~Add a delay for Historical Data~~ AIA-186 Add a delay for Historical Data Mar 23, 2018

3 participants

afiune commented Mar 20, 2018 •

edited

Loading

kmacgugan left a comment •

edited

Loading