
Hi/Lo event ratio on scheduler #362

Open
przygienda opened this Issue May 12, 2018 · 2 comments

Comments

@przygienda

przygienda commented May 12, 2018

Based on @carllerche's ask, here's a first input on adding knobs to the scheduler for high-load scenarios:

  • HI/LO event classification (at registration?)
  • a fixed HI/LO ratio on the events returned, where LO fills up the space if HI doesn't produce enough
  • even when HI produces enough to fill the batch, LO is guaranteed a minimum ratio, i.e. it cannot be starved
  • when LOs pile up too high in the scheduler, selective front- or end-of-queue dropping
  • an alarm when HIs pile up too high in the scheduler
  • ideally, time bounds at which a hanging LO event is considered "irrelevant" and can be chucked
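The front-of-queue dropping and the "irrelevant after a time bound" bullets could be sketched roughly like this (a toy sketch; `LoQueue`, its fields, and all numbers are made up for illustration, not an existing mio API):

```rust
use std::collections::VecDeque;

/// A bounded LO queue: when full, drop from the front (oldest first), and
/// expire entries whose time bound has passed ("irrelevant" events).
struct LoQueue {
    cap: usize,
    q: VecDeque<(u64, u32)>, // (enqueue_time, event)
    max_age: u64,            // an event older than this is chucked
}

impl LoQueue {
    fn new(cap: usize, max_age: u64) -> Self {
        LoQueue { cap, q: VecDeque::new(), max_age }
    }

    /// Front-of-queue dropping: the oldest LO event is sacrificed when full.
    fn push(&mut self, now: u64, ev: u32) {
        if self.q.len() == self.cap {
            self.q.pop_front();
        }
        self.q.push_back((now, ev));
    }

    /// Pop the next LO event that is still young enough to matter.
    fn pop(&mut self, now: u64) -> Option<u32> {
        while let Some(&(t, ev)) = self.q.front() {
            self.q.pop_front();
            if now.saturating_sub(t) <= self.max_age {
                return Some(ev);
            } // else: past its time bound, silently chucked
        }
        None
    }
}

fn main() {
    let mut q = LoQueue::new(2, 50);
    q.push(0, 1);
    q.push(0, 2);
    q.push(0, 3); // queue full: event 1 is dropped from the front
    println!("{:?}", q.pop(10)); // event 2 is still fresh
    println!("{:?}", q.pop(100)); // event 3 aged out (100 - 0 > 50)
}
```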

This allows an application to play with the events it gets in a couple of ways:

  • a shorter event set returned -> more responsive, more CPU churn
  • a higher HI ratio -> better response to HI-priority load under heavy load, at the cost of longer delays on LO
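A minimal sketch of the fixed-ratio batch with a guaranteed LO minimum might look as follows (all names, `PrioConfig` and `drain`, and the numbers are hypothetical, not an existing mio/tokio API):

```rust
use std::collections::VecDeque;

/// Hypothetical knobs (illustrative names only).
struct PrioConfig {
    batch: usize,    // max events returned per poll
    hi_quota: usize, // HI slots per batch (the fixed ratio)
    lo_min: usize,   // LO slots HI can never take, so LO is not starved
}

/// Drain one batch: HI fills its quota first, LO fills the rest (including
/// its guaranteed minimum), and HI may reclaim slack LO leaves unused.
fn drain(cfg: &PrioConfig, hi: &mut VecDeque<u32>, lo: &mut VecDeque<u32>) -> Vec<u32> {
    let hi_cap = cfg.hi_quota.min(cfg.batch.saturating_sub(cfg.lo_min));
    let mut out = Vec::with_capacity(cfg.batch);
    while out.len() < hi_cap {
        match hi.pop_front() {
            Some(e) => out.push(e),
            None => break,
        }
    }
    // LO fills the remaining space: its guaranteed minimum plus any HI slack.
    while out.len() < cfg.batch {
        match lo.pop_front() {
            Some(e) => out.push(e),
            None => break,
        }
    }
    // If LO ran dry, HI takes back the slack (work-conserving within a batch).
    while out.len() < cfg.batch {
        match hi.pop_front() {
            Some(e) => out.push(e),
            None => break,
        }
    }
    out
}

fn main() {
    let mut hi: VecDeque<u32> = (0..10).collect();
    let mut lo: VecDeque<u32> = (100..110).collect();
    let cfg = PrioConfig { batch: 8, hi_quota: 6, lo_min: 2 };
    // HI gets its quota of 6; LO keeps its guaranteed 2 even though HI had more.
    println!("{:?}", drain(&cfg, &mut hi, &mut lo));
}
```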
@carllerche


Member

carllerche commented May 13, 2018

Thanks. Do you have any reading / background material to give more context?

Also, what do you think about how Erlang does prioritization: https://github.com/erlang/otp/blob/master/erts/emulator/beam/erl_process.c

@przygienda


przygienda commented May 13, 2018

Some terminology:

https://en.wikipedia.org/wiki/Work-conserving_scheduler

We normally use greedy, i.e. work-conserving, schedulers.

Load shedding (as opposed to work-conserving, i.e. never shedding load) is actually used mostly in power grids:

https://ieeexplore.ieee.org/document/8281976/

I can't recall any accessible computer-science research papers that dealt with this, apart from some very deep real-time operating-system work and, if you look at more modern streaming systems:

https://www.researchgate.net/publication/276832010_Towards_load_shedding_and_scheduling_schemes_for_data_streams_that_maintain_quality_and_timing_requirements_of_query_results

Erlang is really a full operating-system scheduler with threads/signals/I/O, which AFAIS makes it a different (and much more comprehensive) problem than mio. Rust's architecture relies on thread scheduling for thread fairness (well, I don't know how well high-prio/low-prio thread scheduling works on Linux these days) and on kqueue for I/O.

In my practical experience, what I describe to you is the simplest and most practical way to shed load (if you're building really high-load applications). Hi/lo thread scheduling is a very difficult problem IME, and kernel scheduling plus libthread scheduling will dwarf any attempt to influence it. I/O scheduling is difficult to prioritize unless drained, since one channel often carries many priorities (e.g. M4 S- and I-frames).

Timer events have to be taken into scheduling as well, because at large volumes you experience something called "timer slip", which is another kettle of fish. Let's say it's enough if the events I described are not only I/O events but also hi/lo timer events.
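The slip can be illustrated with a toy simulation (all numbers and the `slip_after` helper are illustrative, assuming a constant dispatch delay): a periodic timer whose next deadline is computed from the late actual fire time accumulates error, while one computed from the previous deadline does not.

```rust
/// Simulate `ticks` firings of a periodic timer where the scheduler
/// dispatches each firing `delay` units late. Returns how far the
/// "drifting" deadline (recomputed from the late fire time) has slipped
/// behind the "stable" deadline (recomputed from the previous deadline).
fn slip_after(ticks: u64, period: u64, delay: u64) -> u64 {
    let mut drifting = 0u64; // next deadline = actual fire time + period
    let mut stable = 0u64;   // next deadline = previous deadline + period
    for _ in 0..ticks {
        let fired_at = drifting + delay; // timer runs late under load
        drifting = fired_at + period;    // slip: the delay compounds
        stable += period;                // no slip
    }
    drifting - stable
}

fn main() {
    // Each tick adds the 3-unit delay, so after 10 ticks the timer
    // has slipped by 30 units.
    println!("slip after 10 ticks: {}", slip_after(10, 100, 3));
}
```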

Think deeply about whether you want to go in here. This stuff is normally done only on very high-availability commercial systems, and it is quite difficult to get right, or rather very difficult to get right at performance ...
