Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizer's skeleton: use advisor to optimize config options #4169

Closed
wants to merge 20 commits into from

Conversation

poojam23
Copy link
Contributor

@poojam23 poojam23 commented Jul 23, 2018

In #3934 we introduced advisor scripts that make suggestions in the config options based on the log file and stats from a run of rocksdb. The optimizer runs the advisor on a benchmark application in a loop and automatically applies the suggested changes until the config options are optimized. This is a work in progress and the patch is the initial skeleton for the optimizer. The sample application that is run in the loop is currently dbbench.

PERF_CON = "db perf context"

# Map from Rocksdb option to its corresponding db_bench command-line arg
OPTION_CMD_LINE_FLAG = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these few options are excepted here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the case when options need to be given through the command-line instead of the options file, some options have a different name / value when given as command-line args. Such options are added to this map.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add this explanation to the inline comment?

'''


class BenchmarkRunner(ABC):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should not BenchmarkRunner be in a separate file? Is not db_benchmark_client.py the file that is specific to each new benchmark that we have?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shifted BenchmarkRunner to new file bench_runner.py
renamed file db_benchmark_client.py --> db_bench_runner.py to make it more specific to db_bench


@staticmethod
def get_info_log_file_name(db_path):
file_name = db_path[1:]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the first letter is removed? I would put an example input line as an inline comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

with open(self.OUTPUT_FILE, 'r') as fp:
for line in fp:
if line.startswith(self.benchmark):
print(line) # print output of db_bench run
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would inline a sample line here so we understand what the rest of parsing is doing,

perf_context = {
tk.split('=')[0].strip(): tk.split('=')[1].strip()
for tk in token_list
if tk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A sample input line would clarify what this lines are doing.


return (logs_file_prefix, stats_freq_sec)

def _get_options_command_line_args_str(self, curr_options):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add inline comment to describe what this function is doing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

data_sources = {
DataSource.Type.DB_OPTIONS: [db_options],
DataSource.Type.LOG: [db_logs],
DataSource.Type.TIME_SERIES: [db_log_stats, db_perf_context]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is perf context is time series?

))
return data_sources, parsed_output[self.THROUGHPUT]

def get_available_workloads(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also seems to be only used for testing. If yes can we mark it as such.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return self.supported_benchmarks


# TODO: remove this method, used only for testing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you planning to remove this? Do we have other kind of testing?

if action is Suggestion.Action.increase:
if old_value:
old_value = float(old_value)
if (not old_value or old_value <= 0) and chosen_sugg_val:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not we have any legit config parameter with negative value? How about 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about this.
but in case the chosen_sugg_val is None, then it might be handled by the (old_value<10) condition.

elif not old_value:
new_value = None
elif old_value < 10:
new_value = old_value + 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the rational behind numbers 10 and 2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this condition to handle a negative or 0 or 'small' (between 0 and 10) old_value.
In case of 'small' old_value, I needed this condition to see a (substantial) difference in the option in a single iteration.
For example, if old_value is 1 / 2 / 3, then: int(1.3*old_value) will again give the same value.
(need the int() because most of Rocksdb options are integers afaik)

elif old_value < 10:
new_value = old_value + 2
else:
new_value = 1.3 * old_value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the rationale behind number 1.3?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no strong preference, but to see a substantial difference in an option's value in a single iteration, I increase/decrease it's value by 30%.

new_value = int(new_value)
elif action is Suggestion.Action.set:
# don't care about old value of option
new_value = chosen_sugg_val
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that chosen_sugg_val be None here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No
There is a check in the class Suggestion that throws an error if the action is 'SET' but there is no suggested value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you assert it then?

new_value = ConfigOptimizer.apply_action_on_value(
old_value, action, suggested_values
)
if new_value:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a bit of debugging printf to clarify why new_value is None? If it is due to a bug, we silently skip a suggestion.
Also does "if new_value" return false if new_value is 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, if new_value == 0, if new_value returns false, thank you for pointing this out!
I am using AssertionExceptions now for handling this.

final_guidelines = copy.deepcopy(guidelines)
options_to_remove = []
acl = [action for action in Suggestion.Action]
# for any option, if there is any intersection of scopes between the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the rationale behind this? What would happen if we skip this entirely?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to explain (through an example) what will happen if we skip calling this method:

Say Rule_ABC was triggered for col_fam_A and it had a suggestion to INCREASE Option_1 and
Rule_XYZ was also triggered for col_fam_A and it had a suggestion to DECREASE Option_1.
This probably means that we should let Option_1 be as it is.

'disambiguate_guidelines' will remove Option_1 from the guidelines object, so that Option_1 is not picked by the optimizer in its optimization loop.

if sc1.intersection(sc2):
options_to_remove.append(option)
break
# if it's a database-wide option, only one action is possible on it,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if I follow the logic here.

final_guidelines.pop(option, None)
return final_guidelines

def get_guidelines(self, rules, suggestions_dict):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add an inline comment of what this function is doing?

triggered_rules, self.rule_parser.get_suggestions_dict()
)
# use the guidelines to improve the database configuration
working_config = new_options.get_options(list(guidelines.keys()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What this line is doing? What does get_options do here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gets the subset of options for which suggestions were triggered.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add inline comments then?

new_options.update_options(updated_config)
return new_options

def run_v2(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an inline comment of what run_v2 is doing and how is it different than run?

import time


NO_FAM = 'DB_WIDE'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you replace it with NO_COL_FAMILY?

if not self.column_family:
self.column_family = NO_FAM

def get_human_readable_time(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you inline a sample output?

self.message = self.message + remaining_log
self.message = self.message + '\n' + remaining_log.strip()

def get_timestamp(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you inline a sample output?

trigger = cond.get_trigger()
if not trigger:
trigger = {}
if log.get_column_family() not in trigger:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am not sure if I understand this. Why would a log have a column family?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some logs are printed for specific column families, example:
2018/07/20-14:21:37.111317 7feca0dff700 [WARN] [db/column_family.cc:743] [default] Stopping writes because we have 6 immutable memtables (waiting for flush), max_write_buffer_number is set to 6

for remove_cond in conditions_to_be_removed:
conditions.remove(remove_cond)
return conditions
trigger = cond.get_trigger()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does trigger exactly mean here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a comment to explain this in the trigger_conditions_for_log() method.

def get_column_families(self):
return self.column_families

def get_all_options(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add inline comments of what this function do.

reqd_options_dict[option] = {}
reqd_options_dict[option][NO_FAM] = self.misc_options[option]
else:
sec_type = '.'.join(option.split('.')[:-1])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline sample to explain the format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not explain [:-1] in the code.

# List[option] -> Dict[option, Dict[col_fam, value]]
reqd_options_dict = {}
for option in reqd_options:
if '.' not in option:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline comments to explain the significance of not having . in the option.

)
return reqd_options_dict

def update_options(self, options):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto: inline comment

copy.deepcopy(options[option][col_fam])
)

def generate_options_config(self, nonce):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we store the misc configs anywhere? How do we report the suggested misc configs to the user?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We report it in the tool's output to stdout at the end of its run.
An example output is:

Final configuration in: /home/poojamalik/workspace/rocksdb/tools/advisor/advisor/../temp/OPTIONS_final.tmp
Final miscellaneous options: {'bloom_bits': 7, 'cache_size': '16000000', 'rate_limiter_bytes_per_sec': None}

def parse_log_line_for_stats(log_line):
# Note: case insensitive stat names
token_list = log_line.strip().split()
stat_prefix = token_list[0] + '.'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline sample input please.

new_value = int(new_value)
elif action is Suggestion.Action.set:
# don't care about old value of option
new_value = chosen_sugg_val
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you assert it then?

This adds the ConfigOptimizer code, though testing the code is WIP.
It also adds some unit tests for the classes Log and DatabaseLog;
correspondingly fixes some issues in the 2 classes.

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Changes per file:
db_benchmark_client.py - name of the LOG file
db_options_parser.py - not storing file_name, since update_options does
not update in the original Options file; removed some redundant
code; added code for testing (to be moved to unit tests later)
rule_parser.py - modified OptionCondition to always contain list of
options
db_timeseries_parser.py - added some checks for case when map keys
from the TimeSeriesCondition may not be available in the provided
data source
db_stats_fetcher.py - modified LogStatsParser to add statistic to
keys_ts map only when it is present in the LOG file; added code for
testing (to be moved to unit test later)

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
config_optimizer_example.py - changed the way the bench runner is being
initialized; also send it ODS arguments
db_benchmark_client.py - added OdsStatsFetcher object to data_sources
returned by run_experiment
db_config_optimizer.py - added code to output results
db_stats_fetcher.py - changed the way TimeSeriesCondition keys are
processed by LogStatsParser and OdsStatsFetcher
rule_parser.py - added code for more information in output of
rule_parser

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
config_optimizer_example: calling the new optimizer method, taking
stats_dump_period_sec, db_log_dir as command-line args
db_benchmark_client: logic for the LOG file name; removed timeout from
_run_command(); run_experiment() also returns the throughput obtained at
the end of the db_bench run
db_timeseries_parser: added support for the case when condition requires
evaluate_expression at each epoch instead of with only aggregated
values; fetch_aggregated_values() returns aggregated values of
statistics for a given entity, earlier it used to return the same for
all entities
db_config_optimizer: moved the code that applies suggestion from
improve_db_config to apply_action_on_value; added run_v2() and
improve_db_config_v2(): in this one rule is picked at a time, all its
suggestions are applied, then bench_runner.run_experiment(new_config)
called, if throughput improves, then use the new_data_sources returned
for checking for more triggered rules, else, backtrack to the previous
config and pick another rule to apply.
db_stats_fetcher: added a parser for the ods cli output; added some more
code for testing
rule_parser: remove the check for 'bursty' conditions in the Rule's
is_triggered() method; since now the evalutate_expression (without
aggregation_op) also returns a list of epoch where the expression
evaluates to true

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Changes per file:
config_optimizer_example- command-line args support for options that are
not supported by options.ini file, but can be given to bench_runner
db_benchmark_client- fixed the bugs in the location and name of the LOG
files; support for using the misc_options as db_bench command-line args;
some testing code
db_options_parser- method for finding diff between 2 option_configs;
support for misc_options; some testing code
rules.ini- support for misc_options
db_config_optimizer- per-method changes:
  * apply_action_on_value: handle cases when old_value is None
  * improve_db_config: modified to handle the case when a suggestion's
    option was not in the existing config
  * improve_db_config_v2: same as above
  * disambiguate_guidelines: handle the case of disambiguation
    when a guideline's option is not in the existing config
  * run_v2: shifted code for picking a rule and getting
    updated_config to new method: apply_suggestions()
  * apply_suggestions: new method to pick new rule, and get
    updated_config
  * get_backtrack_config: new method to get config to update
    options to so that the latest changes applied, are reversed

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Changes per file:
config_optimizer_example - removed the db_log_dir option, since making
changes to it might cause DBBenchRunner to crash when it tries to use it
as a command-line arg
db_options_parser - added a method to return all the options in the
DatabaseOptions object
db_benchmark_client - added method to fetch default Rocksdb options used
by db_bench in case the default OPTIONS file is not provided; shifted
the db_bench output parsing code to a separate method; created a build
the appropriate command for db_bench; added some testing code

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
changes per file:
db_stats_fetcher: add the DatabasePerfContext class; some code for
testing
db_benchmark_client: modified DBBenchRunner to parse its own output and
return a DatabasePerfContext too; changed the return type of
run_experiment; added some code for testing
rule_parser: changes in trigger_conditions to take into account the
changes in the data_sources object returned by DBBenchRunner
db_timeseries_parser: initialise stats_freq_sec in TimeSeriesData
constructor

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Changes per file:
config_optimizer_example: command line args ldb, base_db_path
db_benchmark_client: run_experiment takes db_path
db_config_optimizer: bootstraps database before each experiment run
db_timeseries_parser: performing a common (bursty/evaluate_expression)
check for entities with all required stats (per condition) in
check_and_trigger_conditions

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Changes in files:
config_optimizer_example: removed the ldb argument
db_benchmark_client: handle the compression option as a command-line
arg; bootstrap the database according to the current options
db_config_optimizer: don't set up the database with default options;
leave it to the benchrunner to do the same on its own with the
applicable options
rules.ini: added some more rules and corrected some

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Changes per file:
db_benchmark_client: add staticmethod is_metric_better() to BenchRunner;
implement the same in DBBenchRunner to compare throughput
db_config_optimizer: use bench_runner.is_metric_better() method to
decide whether to backtrack or not in the optimization loop

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Changes per file:
bench_runner: shifted BenchmarkRunner abstract class to this file; also
shifted method get_info_log_file_name to this class from DBBenchRunner
as other bench runners may need it too
db_bench_runner: removed the OPTIONS diff logic; assumption is that
db_bench should be able to handle Rocksdb OPTIONS given as command-line
args even when options_file arg is given; added more comments to the
code
ALL FILES: added user-id to the TODO comments

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
@@ -0,0 +1,32 @@
from abc import ABC, abstractmethod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the copy-right header forgotten?

# these are options that are column-family agnostic and are not yet
# supported by the Rocksdb Options file: eg. bloom_bits=2
parser.add_argument('--base_db_path', required=True, type=str)
parser.add_argument('--misc_options', nargs='*')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is most likely to be read by non-experts. would be great to add an inline comment of what misc_option is. In fact would be great to add helper string for all these options.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

example of helper strings:

parser.add_argument('integers', metavar='N', type=int, nargs='+',
                    help='an integer for the accumulator')

CONFIG_OPT_NUM_ITER = 10


def main(args):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add an example command line here (without ods-related args)?

self.db_bench_binary = positional_args[0]
self.benchmark = positional_args[1]
self.db_bench_args = None
self.supported_benchmarks = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a TODO to move this line to unit test too?

if tk
}
# add timestamp information
timestamp = int(time.time())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a TODO that is a hack and should be replaced with the timestamp that db_bench will provide per printed perf context.

if not log_dir_path.endswith('/'):
log_dir_path += '/'

logs_file_prefix = log_dir_path + log_file_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is not prefix is it? then can we rename it to log_full_path?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

talked offline. lets keep the name but explain that old files (which share the same prefix) are not currently checked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added TODO's to the places where these names are being used to fetch logs:
-- in check_and_trigger_conditions() of DatabaseLogs class (db_log_parser.py)
-- in fetch_timeseries() of LogStatsParser class (db_stats_fetcher.py)

Changes per file:
bench_runner: added example in comment for explanation of
get_info_log_file_name
config_optimizer_example: added help strings to the argparser
command-line arguments
db_bench_runner: added TODO comments
db_config_optimizer: changed method apply_action_on_value() for better
readability, and it throws AssertionError if new_value is None; handling
AssertionError in places that call this method; added more comments to
the code; removed disambiguate_guidelines() and handling the scope of
database-wide options in get_guidelines()
db_log_parser: added comments for readability
db_stats_fetcher: added comments for readability

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
curr_options, curr_rule, suggestions_dict
)
conf_diff = DatabaseOptions.get_options_diff(curr_conf, updated_conf)
if not conf_diff: # the current and updated configs are the same
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a printf after this if-then so that we would notice if it got stuck in the loop?

self.rule_parser = rule_parser
self.base_db_path = base_db

def get_guidelines(self, rules, suggestions_dict):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we remove this function as it is only used in run()?

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
all_options = []
# Example: in the section header '[CFOptions "default"]' read from the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many section types we have? Can you point me to where it is documented?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enum OptionSection : char {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put inline comment like: "Refer to OptionSection for other type of section types."

@facebook-github-bot
Copy link
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!


def get_time(self):
self.column_family = None
for col_fam in column_families:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a sample string

self.message = self.message + '\n' + remaining_log.strip()

def get_timestamp(self):
# assumes that the LOG timestamp is in GMT; this means that if this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines are extra and distracting. Can you remove them:

+        this means that if this
+        # timestamp is to be converted to a human-readable value, one could
+        # either use the method get_human_readable_time() in this class or
+        # one can use 'datetime.utcfromtimestamp(<timestamp>).isoformat()';
+        # in either case the value returned would match the Log time in the LOG
+        # file

# supported by the Rocksdb OPTIONS file, so it is not prefixed
# by '<section_type>.' and must be stored in the separate
# misc_options dictionary
if NO_COL_FAMILY not in options[option]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the case that would make this if condition to be true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example, let there be a Suggestion (in the Rules spec) in which the 'option' field is incorrectly specified, say option=write_buffer_size instead of option=CFOptions.write_buffer_size.
This is an undesirable situation and I am adding a print statement here to print a Warning to the user.

for ix, option in enumerate(cond.options):
if option not in reqd_options_dict:
missing_reqd_option = True
break # required option is absent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the cases that this could happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example: there is an OptionCondition that requires the 'bloom_bits' option in its expression to be evaluated, however, the bloom filter is not enabled yet and this option is not part of the misc_options dictionary.

stat_values = [
token
for token in token_list[1:]
if token != ':'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens to the ones with :? like "rocksdb.db.get.micros P50 :"

def fetch_timeseries(self):
pass

def fetch_burst_epochs(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add inline comments of what each function do? ditto for the next functions as well.

@@ -62,7 +62,7 @@ def get_element(line):
def get_key_value_pair(line):
line = line.strip()
key = line.split('=')[0].strip()
value = line.split('=')[1].strip()
value = "=".join(line.split('=')[1:])
if not value:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what case value could be None?

Copy link
Contributor Author

@poojam23 poojam23 Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(The OptionsSpecParser is a subclass of IniParser)
I added this check for checking for empty string, for example in the OPTIONS file we can have a line:
"db_log_dir="
Changing the if-condition to explicitly check for empty string for clarity

in_seconds *= (24 * 60 * 60)
self.overlap_time_seconds = in_seconds

def get_overlap_timestamps(self, key1_trigger_epochs, key2_trigger_epochs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto: inline comments of what the function do.

from advisor.rule_parser import RulesSpec


def main(args):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a test file? If yes, can we rename it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used this file to run and manually test the Rules parser when that was the only code. I am not using it at all now, so I am deleting the file.

@@ -29,7 +29,6 @@ conditions=stall-too-many-memtables
[Condition "stall-too-many-memtables"]
source=LOG
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do we explain the format to the users? If we do not have such place already should we do it as inline comment on top of this file?


[Condition "log-2-true"]
source=LOG
regex=Stalling writes because we have \d+ level-0 files
scope=column_family
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did we remove the scope keyword?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that the code should be able to figure out the scope on its own when the triggers are set for the conditions.

…ions

This commit improves the comments that were added to the parsing methods
in the previous commit.
It adds a new method is_misc_option() to the DatabaseOptions class that
isolates the logic for checking whether a given option is supported by
the Rockdb OPTIONS file or not (in which case it is a 'misc_option').
Added the word 'WARNING' where the eval() method might fail.
Added assert() for a Suggestion's option and action fields to be
non-empty in the db_config_optimizer.

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
@@ -232,6 +239,10 @@ def update_options(self, options):
# by '<section_type>.' and must be stored in the separate
# misc_options dictionary
if NO_COL_FAMILY not in options[option]:
print(
'WARNING(DatabaseOptions): check format of option: ' +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you print a more descriptive warning?

print(
'WARNING(DatabaseOptions): condition ' + cond.name +
' requires option ' + option + ' but this option is' +
' not available'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also lets say that we are skipping the condition.

@maysamyabandeh maysamyabandeh changed the title Optimizer: merging commits from 2 PRs into one Optimizer's skeleton: use advisor to optimize config options Jul 26, 2018
Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maysamyabandeh is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

rcane pushed a commit to rcane/rocksdb that referenced this pull request Sep 13, 2018
…k#4169)

Summary:
In facebook#3934 we introduced advisor scripts that make suggestions in the config options based on the log file and stats from a run of rocksdb. The optimizer runs the advisor on a benchmark application in a loop and automatically applies the suggested changes until the config options are optimized. This is a work in progress and the patch is the initial skeleton for the optimizer. The sample application that is run in the loop is currently dbbench.
Pull Request resolved: facebook#4169

Reviewed By: maysamyabandeh

Differential Revision: D9023671

Pulled By: poojam23

fbshipit-source-id: a6192d475c462cf6eb2b316716f97cb400fcb64d
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants