This repository has been archived by the owner on Nov 5, 2021. It is now read-only.

WIP: track and reuse empty slots in HTTP probe #578

Closed

Conversation

@benner (Contributor) commented Mar 16, 2021

A rough suggestion for: #575

Not intended for merge; comments are welcome.

Open questions:

  • What to do when there are no free slots?
  • Should this behavior be optional?
  • Memory considerations with very large intervals and small gaps between targets

I'll improve the PR (tests, speed, etc.) if this approach seems reasonable.
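
For readers following along, here is a minimal sketch of the slot-array idea described above; the names (slotTracker, Acquire, Release) are illustrative assumptions, not the actual PR code:

```go
// Illustrative sketch of the slot-array idea from this WIP; the names
// (slotTracker, Acquire, Release) are assumptions, not the actual PR code.
package main

import "fmt"

// slotTracker keeps one entry per probing slot; an empty string means the
// slot is free and can be reused by a newly discovered target.
type slotTracker struct {
	slots []string
}

// Acquire assigns the first free slot to a target and returns its index,
// or -1 if all slots are taken (the "no free slots" open question above).
func (st *slotTracker) Acquire(target string) int {
	for i, t := range st.slots {
		if t == "" {
			st.slots[i] = target
			return i
		}
	}
	return -1
}

// Release frees the slot used by a target that disappeared from discovery.
func (st *slotTracker) Release(target string) {
	for i, t := range st.slots {
		if t == target {
			st.slots[i] = ""
			return
		}
	}
}

func main() {
	st := &slotTracker{slots: make([]string, 4)}
	fmt.Println(st.Acquire("a"), st.Acquire("b")) // 0 1
	st.Release("a")                               // "a" goes away
	fmt.Println(st.Acquire("c"))                  // 0 -- slot 0 is reused
}
```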

@benner benner force-pushed the feat/http_autodiscovery_slots branch 4 times, most recently from 1fddae7 to 6dc5ba0 on March 18, 2021 12:53
@manugarg (Contributor) commented

@benner Sorry for the delay in reviewing. It's been a bit busy for the past week. I'll try to review it in a couple of days.

@benner (Contributor, Author) commented Apr 13, 2021

@manugarg, did you find time to look into it?

@manugarg (Contributor) commented

@benner Sorry for the silence on this. I think I now understand the problem better. When we go through the targets list, we skip existing targets but don't advance "startWaitTime", which means that the first new target will start in the first slot and, overall, there will be overlapping goroutines.

// This target is already initialized.

I think this problem will be much less pronounced if we just update startWaitTime even for skipped targets:

startWaitTime += gapBetweenTargets

It will not be a perfect distribution, because gapBetweenTargets will keep changing with the number of targets, and there may be some crowding over time (a new target's n*gapBetweenTargets may still fall near an old target's start time), but it will be quite random, which is not so bad.
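
For illustration, a simplified sketch of the suggested fix; only startWaitTime and gapBetweenTargets come from the discussion, while the surrounding function, parameters, and map are assumptions, not the actual HTTP probe code:

```go
// Simplified sketch of the suggestion above, not the actual HTTP probe code.
package main

import (
	"fmt"
	"time"
)

// scheduleTargets spreads targets across the probe interval. The key point
// from the comment above: advance startWaitTime even for targets that are
// already being probed, so newly discovered targets do not all crowd into
// the first slot.
func scheduleTargets(targets []string, running map[string]bool, interval time.Duration) map[string]time.Duration {
	offsets := make(map[string]time.Duration)
	if len(targets) == 0 {
		return offsets
	}
	gapBetweenTargets := interval / time.Duration(len(targets))
	startWaitTime := time.Duration(0)

	for _, t := range targets {
		if running[t] {
			// This target is already initialized: skip starting it, but
			// still advance the offset (the suggested fix).
			startWaitTime += gapBetweenTargets
			continue
		}
		offsets[t] = startWaitTime // a new target starts at this offset
		startWaitTime += gapBetweenTargets
	}
	return offsets
}

func main() {
	running := map[string]bool{"t1": true, "t2": true}
	fmt.Println(scheduleTargets([]string{"t1", "t2", "t3"}, running, 30*time.Second))
	// Output: map[t3:20s] -- t3 starts at 20s instead of 0s, avoiding overlap with t1.
}
```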

Slots is an interesting idea. I think if we go that way, we should probably think about a more formal way to create and distribute slots. For example, if we get more targets than last time, we should probably double the number of slots, keeping half of the slots assigned to the old goroutines, and pick the first free slot from the beginning.

What do you think about just advancing startWaitTime for skipped targets for now and working on the slots method later on?

@benner (Contributor, Author) commented Apr 25, 2021

It may work. I need to look at the exact stats to make a better estimate of whether this helps.

Also, I was initially thinking about something like memory allocators do: linked lists (for free or used slots, singly or doubly linked), but as a demonstration I just WIP'ed it with an array :-)

@manugarg (Contributor) commented Apr 26, 2021

> Also, I was initially thinking about something like memory allocators do: linked lists (for free or used slots, singly or doubly linked), but as a demonstration I just WIP'ed it with an array :-)

When I started thinking more deeply about it, that's where I was heading too. :-)

@benner (Contributor, Author) commented Apr 27, 2021

Then I will come back with a free list soon. I'm still not sure of the best way to handle overfill (more targets than slots). Maybe it can be a custom metric for Cloudprober itself, or per probe. Any suggestions?
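
A minimal sketch of what a free list of slot indices could look like, along the lines of the memory-allocator idea mentioned earlier; all names here are illustrative assumptions, not from the PR:

```go
// Illustrative sketch of a free list of slot indices, along the lines of what
// memory allocators do; names here are assumptions, not from the PR.
package main

import (
	"container/list"
	"fmt"
)

// freeSlots hands out slot indices, preferring slots that were freed when
// their targets disappeared over allocating brand-new ones.
type freeSlots struct {
	free *list.List // indices of released slots
	next int        // next never-used slot index
}

func newFreeSlots() *freeSlots {
	return &freeSlots{free: list.New()}
}

// Get returns a reusable slot index if one is available, otherwise a new one.
func (f *freeSlots) Get() int {
	if e := f.free.Front(); e != nil {
		f.free.Remove(e)
		return e.Value.(int)
	}
	idx := f.next
	f.next++
	return idx
}

// Put returns a slot index to the free list when its target goes away.
func (f *freeSlots) Put(idx int) {
	f.free.PushBack(idx)
}

func main() {
	f := newFreeSlots()
	a, b := f.Get(), f.Get() // 0, 1
	f.Put(a)                 // target in slot 0 disappears
	fmt.Println(a, b, f.Get()) // 0 1 0 -- slot 0 is reused
}
```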

@manugarg (Contributor) commented

> Then I will come back with a free list soon. I'm still not sure of the best way to handle overfill (more targets than slots). Maybe it can be a custom metric for Cloudprober itself, or per probe. Any suggestions?

I think let's try the simple thing first. Distributing probing slots equally may not be a very important requirement. My main goal behind trying to distribute equally was to minimize the overlap, which impacts smaller-CPU instances trying to run many high-frequency probes.

I am worried that if we try to do something clever, it may introduce bugs or corner cases that may be hard to discover.

@benner (Contributor, Author) commented Apr 28, 2021

> I think let's try the simple thing first.

Do you mean just keep increasing startWaitTime?

> Distributing probing slots equally may not be a very important requirement. My main goal behind trying to distribute equally was to minimize the overlap, which impacts smaller-CPU instances trying to run many high-frequency probes.

My main goal is the opposite 😄: to protect the targets (actually, the physical server that hosts these targets). The initial description and proposed solution was about limiting concurrency: #510

To repeat the problem in short: I have highly overbooked physical servers (business specifics) which host thousands of services, and probing some of them triggers other dependencies (e.g. MySQL, Redis, memcached, etc.). I want to ensure that Cloudprober will not probe too many targets on the same physical server at once.

> I am worried that if we try to do something clever, it may introduce bugs or corner cases that may be hard to discover.

I agree this is a possibility. Another suggestion is to keep the same slot array with an index to the last used slot and, on reaching the end of the slots, do the same walk-through from the beginning (see the sketch below).
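
A minimal sketch of that walk-through-with-wraparound idea; the names are illustrative assumptions, not the actual PR code:

```go
// Illustrative sketch of the walk-through-with-wraparound idea; the names
// are assumptions, not the actual PR code.
package main

import "fmt"

// ringSlots keeps the slot array plus the index of the last used slot; when
// the scan reaches the end, it wraps around and walks from the beginning.
type ringSlots struct {
	slots []string // "" means the slot is free
	last  int      // index of the last slot handed out (-1 before first use)
}

// Acquire scans from just after the last used slot, wrapping around once.
// It returns -1 if every slot is occupied (the overfill case).
func (r *ringSlots) Acquire(target string) int {
	n := len(r.slots)
	for i := 1; i <= n; i++ {
		idx := (r.last + i) % n
		if r.slots[idx] == "" {
			r.slots[idx] = target
			r.last = idx
			return idx
		}
	}
	return -1
}

func main() {
	r := &ringSlots{slots: make([]string, 3), last: -1}
	fmt.Println(r.Acquire("a"), r.Acquire("b"), r.Acquire("c")) // 0 1 2
	r.slots[1] = ""             // target "b" goes away, freeing slot 1
	fmt.Println(r.Acquire("d")) // 1 -- wraps around and reuses slot 1
}
```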

@benner benner force-pushed the feat/http_autodiscovery_slots branch from 6dc5ba0 to 1f05536 on April 28, 2021 07:59
@benner benner force-pushed the feat/http_autodiscovery_slots branch from 1f05536 to 3903a9f on October 13, 2021 19:16
@manugarg (Contributor) commented

@benner I'm leaving Google. To continue working on Cloudprober, I am forking it to github.com/cloudprober/cloudprober. As I was pretty much the only person maintaining this project from Google, I'll be archiving this repository (For some internal reasons, Google doesn't want to migrate the repository). I hope you'll be able to resubmit this PR. Sorry for not getting to it until now.

@benner (Contributor, Author) commented Oct 28, 2021

Sure, I'll resubmit the PR. Thank you for letting me know.

@benner benner closed this Oct 28, 2021
@benner benner deleted the feat/http_autodiscovery_slots branch October 28, 2021 18:28
@manugarg (Contributor) commented

@benner Just a heads up: I know you starred the new location github.com/cloudprober/cloudprober, but I had to delete that repository to remove its "fork" relationship. I've recreated it, and it should be pretty stable now.
