Action plan for slooo #43

Essoz · 2022-03-05T16:47:24Z

Based on #39 and #40, we propose the following steps for improving slooo.

Start with the easier improvements that give the code base more structure, make full use of the OOP model, and remove leftovers inherited from the DepFast project.

Log useful information, such as node membership, for each experiment to provide a stronger basis for reasoning about results.
Error detection in code: check the execution status of commands in utility functions so that meaningless output is not dumped into the terminal.
Support easy switching between multiple levels of fail-slow faults, and allow users to specify multiple slowness configs.
Redesign the slooo code using the OOP model. We want easier feature integration and better code readability, since it is external users who adapt the tool to their quorum system. The redesign will be considered in parallel with the three points above.
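As a rough illustration of the error-detection item above, a utility wrapper could check the exit status and raise instead of printing raw output. This is only a sketch: `run_command` and `CommandError` are hypothetical names, not existing slooo functions.

```python
import subprocess


class CommandError(RuntimeError):
    """Raised when a command exits with a non-zero status (hypothetical name)."""


def run_command(args, timeout=30):
    """Run a command and return its stdout, raising CommandError on failure.

    Hypothetical helper for illustration; not part of slooo.
    """
    result = subprocess.run(
        args,
        capture_output=True,  # keep output out of the terminal
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        # Surface the exit status and stderr in one place for the caller.
        raise CommandError(
            f"{args!r} exited with {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout
```

Callers can then decide whether a failed command should abort the experiment or only be logged.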

Then we work on

System data collection: use extra threads to record system usage. We may let the user specify the sampling rate, but the details need further discussion.
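One possible shape for the sampling thread, as a minimal sketch. The class name `UsageSampler`, the `read_usage` callable, and the `interval_s` parameter are all assumptions made for illustration; what to measure and how often are still open.

```python
import threading
import time


class UsageSampler:
    """Sample a usage-reading callable on a background thread (hypothetical)."""

    def __init__(self, read_usage, interval_s=1.0):
        self._read_usage = read_usage  # any zero-argument callable
        self._interval_s = interval_s  # user-specified sampling interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.samples = []  # list of (timestamp, value) tuples

    def _run(self):
        while not self._stop.is_set():
            self.samples.append((time.time(), self._read_usage()))
            # Event.wait acts as a sleep that stop() can interrupt.
            self._stop.wait(self._interval_s)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

For example, `UsageSampler(os.getloadavg, interval_s=0.5)` would record the load average twice a second on Unix systems; the experiment calls `start()` before the workload and `stop()` afterwards.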

and,

The point break (auto-tuning) feature. Our current thinking is that the tool should first run experiments using random levels of fail-slow faults. Then, based on the previous results (a statistical approach), the tool narrows the range of fault levels, runs a new round of experiments, and repeats the process until it narrows down to the "breaking" point. My concerns with this approach are: (1) running multiple rounds of experiments can be costly, and (2) the statistical approach may not work, in which case the tool may find nothing.
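To make the narrowing loop concrete, here is a minimal sketch under a strong assumption: the system behaves monotonically, so once a fault level breaks it, every higher level does too. The function `find_breaking_point` and the `is_broken` callable (which stands in for running one experiment) are hypothetical.

```python
import random


def find_breaking_point(is_broken, low, high, rounds=5, samples_per_round=8, rng=None):
    """Narrow [low, high] around the lowest fault level where is_broken is true.

    Assumes is_broken is monotonic in the fault level (an assumption, not
    something the plan guarantees). Costs rounds * samples_per_round runs.
    """
    if rng is None:
        rng = random.Random()
    for _ in range(rounds):
        # Run one round of experiments at random levels in the current range.
        levels = [rng.uniform(low, high) for _ in range(samples_per_round)]
        results = [(level, is_broken(level)) for level in levels]
        healthy = [level for level, broken in results if not broken]
        failing = [level for level, broken in results if broken]
        # The highest healthy level raises the lower bound; the lowest
        # failing level lowers the upper bound.
        if healthy:
            low = max(low, max(healthy))
        if failing:
            high = min(high, min(failing))
    return low, high
```

Both concerns show up directly in this sketch: the total cost is `rounds * samples_per_round` experiments, and if results are noisy rather than monotonic the bounds can cross, leaving no meaningful range.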

finally,

Rewrite the slooo documentation contributed by @tianyin.

varshith15 · 2022-03-15T17:56:50Z

@Essoz We can take inspiration from big projects like https://github.com/xonsh/xonsh#projects-that-use-xonsh to write cleaner code.

varshith15 · 2022-03-15T18:01:55Z

For process info collection, we can use https://github.com/astrofrog/psrecord and add our own plugins to it.

Essoz · 2022-03-15T18:18:15Z

These are great suggestions, I will look into them later!

varshith15 · 2022-04-01T15:18:18Z

@Essoz We should also add test cases for the common code like fault inject, general_utils, and so on.

Essoz · 2022-04-01T17:51:53Z

This is long overdue, but I have not figured out how we can verify the amount of slowness injected. Do you have any ideas?

varshith15 · 2022-04-01T18:18:57Z

We just check whether the PID has been added to the cgroup; that's all.
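A minimal sketch of such a check, assuming the test knows the cgroup's directory (for example somewhere under `/sys/fs/cgroup`). The helper name `pid_in_cgroup` is hypothetical. Note that it only verifies membership, not the amount of slowness injected.

```python
from pathlib import Path


def pid_in_cgroup(pid, cgroup_dir):
    """Return True if pid is listed in the cgroup's cgroup.procs file.

    Hypothetical helper; cgroup_dir is the cgroup's directory path.
    """
    procs_file = Path(cgroup_dir) / "cgroup.procs"
    # cgroup.procs lists one PID per line.
    listed = {int(line) for line in procs_file.read_text().split()}
    return pid in listed
```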

tianyin closed this as completed Jun 23, 2022
