Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reafactor schedules and engine #53

Merged
merged 9 commits into from
Aug 21, 2017
Merged

Conversation

skipor
Copy link
Contributor

@skipor skipor commented Jul 11, 2017

Schedules are ready to use, but engine refactor is WIP, and need tests and debug.
Fix #11, fix #12
Should fix #46 before merge.
Start of #28 for engine. Fix it for components in another PR.
Initial work for #35

@direvius direvius added this to the Release 0.2 milestone Jul 11, 2017
@coveralls
Copy link

coveralls commented Jul 29, 2017

Coverage Status

Coverage increased (+10.2%) to 76.973% when pulling 3ee048d on skipor:feature/12 into 040b5c3 on yandex:develop.

@skipor
Copy link
Contributor Author

skipor commented Jul 29, 2017

@direvius Review please :)

@coveralls
Copy link

coveralls commented Jul 29, 2017

Coverage Status

Coverage increased (+10.4%) to 77.135% when pulling 2e2064b on skipor:feature/12 into 040b5c3 on yandex:develop.

@coveralls
Copy link

coveralls commented Jul 30, 2017

Coverage Status

Coverage increased (+10.3%) to 77.112% when pulling 5509db6 on skipor:feature/12 into 040b5c3 on yandex:develop.

// TODO(skipor): there is no guarantee, that we will run exactly after 1 second.
// So, when we get 1 sec +-10ms, we getting 990-1010 calculate intervals and +-2% RPS in reports.
// Consider using rcrowley/go-metrics.Meter.
for range time.NewTicker(1 * time.Second).C {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we just divide number_of_requests/milliseconds_passed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will make RPS floating point number.
Current solution is good enough for now. Let's postpone expvar part till metrics refactor.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Floating point is ok+) we can even specify RPS in this way, for examle, 0.5 RPS means 1 request per 2 seconds.

core/core.go Outdated
type Ammo interface{}

//go:generate mockery -name=Provider -case=underscore -outpkg=coremock

// Provider is routine that generates ammo for instance shoots.
// A Provider must goroutine safe.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"must be"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

core/core.go Outdated
// That means, that sample encode and reporting IO done in aggregator provider routine.
// Reported sample can be reused for efficiency.
// Report may be called before start, but may block until start is called.
Report(Sample)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we require non-blocking Report?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even channels with large buffer can't guarantee that - channel buffer can be filled up, if Aggreggator handle reported samples too slow. Sample blocking queue with dynamically growing buffer can in case of infinity memory size. But we have not infinity memory usually:)
We can throw samples away, in case buffer fill up, but I have no arguments for why this is better.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Top priority is supplying enough load. Then reporting. My suggestion is to throw samples away but then show BIG warning about that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well. Then let's print WARN about throw away samples, and if some Samples were thrown away, finish pandora process with non zero return code by returning non-nil error from Aggregator.

core/core.go Outdated
// Aggregator is routine that aggregates samples from all instances.
// Usually aggregator is shooting result reporter, that writes released samples
// to file in machine readable format for future analysis.
// An Aggregator must goroutine safe.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

must be

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

core/core.go Outdated
type Schedule interface {
// Run starts schedule at passed time.
// Run may be called once, before any Next call. (Before, means not concurrently too.)
// If start is not called, schedule started at first Next call.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is started

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"If start was not called, schedule is started"

core/core.go Outdated
// A Gun is owned by only instance, that use it for shooting in cycle: acquire ammo from provider ->
// wait for next shoot schedule event -> shoot with gun.
// Guns that also implements io.Closer will be closed after instance finish.
// Actually, Guns that creates resources that should be closed after instance finish,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guns that create resources which should be closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

core/core.go Outdated
// Bind passes dependencies required for shooting. Called once before shooting start.
Bind(Aggregator)
// Shoot makes one shoot. Shoot means some abstract load operation: web service or database request, for example.
// During shoot Gun acquires one or more samples and report them to bind Aggregator.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

binded

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bound :)

}
}()
var (
toWait = 4
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this magic constant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const subroutines = 4 // Provider, Aggregator, instance start, instance run.
var (
  toWait   = subroutines
)

for {
// Try get ammo before schedule wait, to be ready shoot just in time.
// Acquire should unblock in case of context cancel.
// TODO: we just throw away acquired ammo, if our schedule finished. Fix it.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this: acquire schedule token -> if schedule is not finished, then acquire ammo and wait; else just exit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That variant a bit more simple, but it makes some strange race more probable.
Example: 3K instances started for shooting at 100 RPS, ammo could be provided at 150 ops.
They concurrently take schedule tokens for 30 sec and begin to wait for ammo concurrently.
In a case of acquiring schedule token before ammo, due high concurrency it's possible when instances with "later" tokens will get ammo earlier, but instances with early tokens will be waiting for ammo.

So, I consider to make the behaviour a bit more complex but do not worry about such races.

}

type compositeSchedule struct {
// Under read lock, goroutine can read shcheds and call Next,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scheds

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it becomes "schedules"

@coveralls
Copy link

coveralls commented Aug 21, 2017

Coverage Status

Coverage increased (+10.5%) to 77.237% when pulling e96f6c4 on skipor:feature/12 into 040b5c3 on yandex:develop.

@direvius direvius merged commit e365af4 into yandex:develop Aug 21, 2017
@skipor skipor deleted the feature/12 branch January 2, 2018 05:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error in console in the end of a test Refactor limiters Test limiters correctness
3 participants