Optimizer's skeleton: use advisor to optimize config options #4169

poojam23 · 2018-07-23T21:23:37Z

In #3934 we introduced advisor scripts that suggest changes to the config options based on the log file and stats from a run of RocksDB. The optimizer runs the advisor on a benchmark application in a loop and automatically applies the suggested changes until the config options are optimized. This is a work in progress, and this patch is the initial skeleton for the optimizer. The sample application currently run in the loop is db_bench.

maysamyabandeh · 2018-07-24T15:30:44Z

tools/advisor/advisor/db_benchmark_client.py

+    PERF_CON = "db perf context"
+
+    # Map from Rocksdb option to its corresponding db_bench command-line arg
+    OPTION_CMD_LINE_FLAG = {


Why are these few options excepted here?

This is for the case when options need to be given through the command line instead of the options file. Some options have a different name or value when given as command-line args; such options are added to this map.

Can you add this explanation to the inline comment?
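
A minimal sketch of how such a map could be used when building db_bench arguments. The map entry and the helper name `option_to_flag` are hypothetical and exist only for illustration; the real contents of `OPTION_CMD_LINE_FLAG` are not shown in this thread.

```python
# Hypothetical entry: the real map in the patch may contain different names.
OPTION_CMD_LINE_FLAG = {
    "example_option_in_options_file": "example_flag_for_db_bench",
}


def option_to_flag(option_name, value):
    """Build one command-line argument for an option.

    Options listed in OPTION_CMD_LINE_FLAG are renamed; all other options
    are passed through under their own name.
    """
    flag = OPTION_CMD_LINE_FLAG.get(option_name, option_name)
    return "--{}={}".format(flag, value)


print(option_to_flag("example_option_in_options_file", 4))
print(option_to_flag("write_buffer_size", 1048576))
```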

maysamyabandeh · 2018-07-24T15:32:30Z

tools/advisor/advisor/db_benchmark_client.py

+'''
+
+
+class BenchmarkRunner(ABC):


Shouldn't BenchmarkRunner be in a separate file? Isn't db_benchmark_client.py the file that is specific to each new benchmark that we have?

shifted BenchmarkRunner to new file bench_runner.py
renamed file db_benchmark_client.py --> db_bench_runner.py to make it more specific to db_bench

maysamyabandeh · 2018-07-24T15:36:03Z

tools/advisor/advisor/db_benchmark_client.py

+
+    @staticmethod
+    def get_info_log_file_name(db_path):
+        file_name = db_path[1:]


Why is the first letter removed? I would put an example input line as an inline comment.
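
A sketch of the kind of worked example the comment asks for. Only `db_path[1:]` comes from the diff; the character-replacement rule and the `_LOG` suffix are assumptions about how the info-log name is derived from the DB path, and the function name is hypothetical.

```python
import re


def info_log_file_name(db_path):
    """Derive an info-log file name from an absolute DB path (assumed scheme)."""
    # db_path[1:] drops the leading '/':
    # '/dev/shm/dbbench' -> 'dev/shm/dbbench'
    file_name = db_path[1:]
    # Assumption: every character outside [0-9a-zA-Z-_.] becomes '_'.
    file_name = re.sub(r"[^0-9a-zA-Z\-_.]", "_", file_name)
    if not file_name.endswith("_"):
        file_name += "_"
    return file_name + "LOG"


print(info_log_file_name("/dev/shm/dbbench"))
```

Without the slice, the leading '/' would itself be replaced and the name would start with an underscore.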

maysamyabandeh · 2018-07-24T15:44:18Z

tools/advisor/advisor/db_benchmark_client.py

+        with open(self.OUTPUT_FILE, 'r') as fp:
+            for line in fp:
+                if line.startswith(self.benchmark):
+                    print(line)  # print output of db_bench run


I would inline a sample line here so we understand what the rest of the parsing is doing.

maysamyabandeh · 2018-07-24T15:46:26Z

tools/advisor/advisor/db_benchmark_client.py

+                    perf_context = {
+                        tk.split('=')[0].strip(): tk.split('=')[1].strip()
+                        for tk in token_list
+                        if tk


A sample input line would clarify what these lines are doing.
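
To illustrate the comprehension in the diff, here is a sketch run on a made-up sample line. The sample text and the assumption that tokens are separated by commas are not taken from the patch; the function name is hypothetical.

```python
def parse_perf_context(line):
    """Turn 'name = value, name = value, ...' into a dict of strings."""
    token_list = line.strip().split(",")
    return {
        tk.split("=")[0].strip(): tk.split("=")[1].strip()
        for tk in token_list
        if tk.strip()  # skip empty or whitespace-only tokens
    }


# Made-up sample line, for illustration only.
sample = "user_key_comparison_count = 500, block_cache_hit_count = 468, "
print(parse_perf_context(sample))
```

The sketch filters on `tk.strip()` rather than `tk` so that a whitespace-only token after a trailing comma does not raise an IndexError.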

maysamyabandeh · 2018-07-24T16:01:27Z

tools/advisor/advisor/db_benchmark_client.py

+
+        return (logs_file_prefix, stats_freq_sec)
+
+    def _get_options_command_line_args_str(self, curr_options):


Can you add an inline comment to describe what this function is doing?

maysamyabandeh · 2018-07-24T16:13:59Z

tools/advisor/advisor/db_benchmark_client.py

+        data_sources = {
+            DataSource.Type.DB_OPTIONS: [db_options],
+            DataSource.Type.LOG: [db_logs],
+            DataSource.Type.TIME_SERIES: [db_log_stats, db_perf_context]


Why is perf context a time series?

maysamyabandeh · 2018-07-24T16:18:35Z

tools/advisor/advisor/db_benchmark_client.py

+            ))
+        return data_sources, parsed_output[self.THROUGHPUT]
+
+    def get_available_workloads(self):


This also seems to be used only for testing. If so, can we mark it as such?

maysamyabandeh · 2018-07-24T16:19:36Z

tools/advisor/advisor/db_benchmark_client.py

+        return self.supported_benchmarks
+
+
+# TODO: remove this method, used only for testing


Are you planning to remove this? Do we have any other kind of testing?

maysamyabandeh · 2018-07-24T16:39:24Z

tools/advisor/advisor/db_config_optimizer.py

+        if action is Suggestion.Action.increase:
+            if old_value:
+                old_value = float(old_value)
+            if (not old_value or old_value <= 0) and chosen_sugg_val:


Don't we have any legitimate config parameters with a negative value? How about 0?

I am not sure about this.
but in case the chosen_sugg_val is None, then it might be handled by the (old_value<10) condition.

maysamyabandeh · 2018-07-24T16:42:26Z

tools/advisor/advisor/db_config_optimizer.py

+            elif not old_value:
+                new_value = None
+            elif old_value < 10:
+                new_value = old_value + 2


Can you explain the rationale behind the numbers 10 and 2?

I added this condition to handle a negative or 0 or 'small' (between 0 and 10) old_value.
In case of 'small' old_value, I needed this condition to see a (substantial) difference in the option in a single iteration.
For example, if old_value is 1 / 2 / 3, then: int(1.3*old_value) will again give the same value.
(need the int() because most of Rocksdb options are integers afaik)
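
The effect described above can be seen in a small sketch. It covers only the two branches shown in the diff (`old_value < 10` and the 30% increase) and leaves out the handling of missing or non-positive values; the function name `increase` is hypothetical.

```python
def increase(old_value):
    """Increase step from the diff: +2 for small values, otherwise +30%."""
    if old_value < 10:
        new_value = old_value + 2
    else:
        new_value = 1.3 * old_value
    return int(new_value)


for v in (1, 2, 3):
    # int(1.3 * v) leaves these small values unchanged; adding 2 does not
    print(v, int(1.3 * v), increase(v))
```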

maysamyabandeh · 2018-07-24T16:42:46Z

tools/advisor/advisor/db_config_optimizer.py

+            elif old_value < 10:
+                new_value = old_value + 2
+            else:
+                new_value = 1.3 * old_value


Can you explain the rationale behind number 1.3?

There is no strong preference, but to see a substantial difference in an option's value in a single iteration, I increase/decrease its value by 30%.

maysamyabandeh · 2018-07-24T16:46:26Z

tools/advisor/advisor/db_config_optimizer.py

+            new_value = int(new_value)
+        elif action is Suggestion.Action.set:
+            # don't care about old value of option
+            new_value = chosen_sugg_val


Is it possible for chosen_sugg_val to be None here?

No
There is a check in the class Suggestion that throws an error if the action is 'SET' but there is no suggested value.

can you assert it then?

maysamyabandeh · 2018-07-24T16:53:56Z

tools/advisor/advisor/db_config_optimizer.py

+                new_value = ConfigOptimizer.apply_action_on_value(
+                    old_value, action, suggested_values
+                )
+                if new_value:


Do we need a bit of debugging printf to clarify why new_value is None? If it is due to a bug, we silently skip a suggestion.
Also does "if new_value" return false if new_value is 0?

You are right, if new_value == 0, if new_value returns false, thank you for pointing this out!
I am using AssertionExceptions now for handling this.
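
A short sketch of the pitfall discussed here: a truthiness check cannot tell 0 apart from None, while an explicit comparison can. The helper name `describe` is made up for the illustration.

```python
def describe(new_value):
    """Contrast a truthiness check with an explicit None check."""
    truthy = bool(new_value)              # what 'if new_value:' tests
    explicit = new_value is not None      # what the check should test
    return truthy, explicit


print(describe(0))     # a legitimate value of 0 fails the truthiness check
print(describe(None))
print(describe(5))
```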

maysamyabandeh · 2018-07-24T17:28:32Z

tools/advisor/advisor/db_config_optimizer.py

+        final_guidelines = copy.deepcopy(guidelines)
+        options_to_remove = []
+        acl = [action for action in Suggestion.Action]
+        # for any option, if there is any intersection of scopes between the


What is the rationale behind this? What would happen if we skip this entirely?

I will try to explain (through an example) what will happen if we skip calling this method:

Say Rule_ABC was triggered for col_fam_A and it had a suggestion to INCREASE Option_1 and
Rule_XYZ was also triggered for col_fam_A and it had a suggestion to DECREASE Option_1.
This probably means that we should let Option_1 be as it is.

'disambiguate_guidelines' will remove Option_1 from the guidelines object, so that Option_1 is not picked by the optimizer in its optimization loop.
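
The example above can be sketched as follows. The shape of the guidelines object (option to action to set of column families) and the function name are assumptions made for this illustration, not the patch's actual data structure.

```python
import copy
from itertools import combinations


def disambiguate(guidelines):
    """Drop options that have different actions suggested on overlapping scopes.

    guidelines: Dict[option, Dict[action, Set[column_family]]]
    (this shape is an assumption made for the sketch).
    """
    final_guidelines = copy.deepcopy(guidelines)
    for option, actions in guidelines.items():
        for (_, sc1), (_, sc2) in combinations(actions.items(), 2):
            if sc1.intersection(sc2):
                # conflicting suggestions for the same column family
                final_guidelines.pop(option, None)
                break
    return final_guidelines


guidelines = {
    "Option_1": {"increase": {"col_fam_A"}, "decrease": {"col_fam_A"}},
    "Option_2": {"increase": {"col_fam_A"}, "decrease": {"col_fam_B"}},
}
print(disambiguate(guidelines))
```

Option_1 is removed because both actions target col_fam_A; Option_2 stays because its two actions apply to different column families.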

maysamyabandeh · 2018-07-24T17:30:14Z

tools/advisor/advisor/db_config_optimizer.py

+                    if sc1.intersection(sc2):
+                        options_to_remove.append(option)
+                        break
+        # if it's a database-wide option, only one action is possible on it,


I am not sure if I follow the logic here.

maysamyabandeh · 2018-07-24T17:32:13Z

tools/advisor/advisor/db_config_optimizer.py

+            final_guidelines.pop(option, None)
+        return final_guidelines
+
+    def get_guidelines(self, rules, suggestions_dict):


can you add an inline comment of what this function is doing?

maysamyabandeh · 2018-07-24T17:48:17Z

tools/advisor/advisor/db_config_optimizer.py

+                triggered_rules, self.rule_parser.get_suggestions_dict()
+            )
+            # use the guidelines to improve the database configuration
+            working_config = new_options.get_options(list(guidelines.keys()))


What is this line doing? What does get_options do here?

This gets the subset of options for which suggestions were triggered.

can you add inline comments then?

maysamyabandeh · 2018-07-24T17:49:22Z

tools/advisor/advisor/db_config_optimizer.py

+            new_options.update_options(updated_config)
+        return new_options
+
+    def run_v2(self):


Can you add an inline comment explaining what run_v2 is doing and how it differs from run?

maysamyabandeh · 2018-07-24T18:03:11Z

tools/advisor/advisor/db_log_parser.py

+import time
+
+
+NO_FAM = 'DB_WIDE'


Can you replace it with NO_COL_FAMILY?

maysamyabandeh · 2018-07-24T18:07:49Z

tools/advisor/advisor/db_log_parser.py

+        if not self.column_family:
+            self.column_family = NO_FAM
+
+    def get_human_readable_time(self):


Can you inline a sample output?

maysamyabandeh · 2018-07-24T18:07:59Z

tools/advisor/advisor/db_log_parser.py

-        self.message = self.message + remaining_log
+        self.message = self.message + '\n' + remaining_log.strip()
+
+    def get_timestamp(self):


Can you inline a sample output?

maysamyabandeh · 2018-07-24T18:12:27Z

tools/advisor/advisor/db_log_parser.py

+                trigger = cond.get_trigger()
+                if not trigger:
+                    trigger = {}
+                if log.get_column_family() not in trigger:


I am not sure I understand this. Why would a log have a column family?

Some logs are printed for specific column families, example:
2018/07/20-14:21:37.111317 7feca0dff700 [WARN] [db/column_family.cc:743] [default] Stopping writes because we have 6 immutable memtables (waiting for flush), max_write_buffer_number is set to 6
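
A sketch of how a log line could be attributed to a column family, using the sample line above. Matching on the literal `[name]` substring is an assumption about the approach, and `find_column_family` is a hypothetical name; `DB_WIDE` is the fallback value from the diff.

```python
NO_COL_FAMILY = "DB_WIDE"


def find_column_family(message, column_families):
    """Return the first known column family that appears as '[name]'."""
    for col_fam in column_families:
        if "[" + col_fam + "]" in message:
            return col_fam
    return NO_COL_FAMILY  # the log is not specific to any column family


log_line = (
    "2018/07/20-14:21:37.111317 7feca0dff700 [WARN] "
    "[db/column_family.cc:743] [default] Stopping writes because we have "
    "6 immutable memtables (waiting for flush), "
    "max_write_buffer_number is set to 6"
)
print(find_column_family(log_line, ["default", "col_fam_A"]))
```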

maysamyabandeh · 2018-07-24T18:12:39Z

tools/advisor/advisor/db_log_parser.py

-        for remove_cond in conditions_to_be_removed:
-            conditions.remove(remove_cond)
-        return conditions
+                trigger = cond.get_trigger()


What does trigger exactly mean here?

Adding a comment to explain this in the trigger_conditions_for_log() method.

maysamyabandeh · 2018-07-24T20:22:37Z

tools/advisor/advisor/db_options_parser.py

+    def get_column_families(self):
+        return self.column_families
+
+    def get_all_options(self):


Add inline comments explaining what this function does.

maysamyabandeh · 2018-07-24T20:23:45Z

tools/advisor/advisor/db_options_parser.py

+                    reqd_options_dict[option] = {}
+                reqd_options_dict[option][NO_FAM] = self.misc_options[option]
+            else:
+                sec_type = '.'.join(option.split('.')[:-1])


inline sample to explain the format.

This does not explain [:-1] in the code.

maysamyabandeh · 2018-07-24T20:24:10Z

tools/advisor/advisor/db_options_parser.py

+        # List[option] -> Dict[option, Dict[col_fam, value]]
+        reqd_options_dict = {}
+        for option in reqd_options:
+            if '.' not in option:


inline comments to explain the significance of not having . in the option.
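
A sketch of the naming convention these two comments touch on: an option supported by the OPTIONS file is written as `<section_type>.<option_name>`, while an option without a '.' is treated as a miscellaneous option. The helper name is hypothetical, and the multi-dot example name is an assumption used to show why `[:-1]` is needed.

```python
def split_option(option):
    """Split '<section_type>.<name>'; options without '.' have no section."""
    if "." not in option:
        return None, option  # misc option, e.g. 'bloom_bits'
    # [:-1] keeps everything before the last '.', so a section type that
    # itself contains dots stays in one piece
    sec_type = ".".join(option.split(".")[:-1])
    opt_name = option.split(".")[-1]
    return sec_type, opt_name


print(split_option("CFOptions.write_buffer_size"))
print(split_option("TableOptions.BlockBasedTable.block_size"))  # assumed name
print(split_option("bloom_bits"))
```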

maysamyabandeh · 2018-07-24T20:25:13Z

tools/advisor/advisor/db_options_parser.py

+                        )
+        return reqd_options_dict
+
+    def update_options(self, options):


ditto: inline comment

maysamyabandeh · 2018-07-24T20:29:28Z

tools/advisor/advisor/db_options_parser.py

+                        copy.deepcopy(options[option][col_fam])
+                    )
+
+    def generate_options_config(self, nonce):


do we store the misc configs anywhere? How do we report the suggested misc configs to the user?

We report it in the tool's output to stdout at the end of its run.
An example output is:

Final configuration in: /home/poojamalik/workspace/rocksdb/tools/advisor/advisor/../temp/OPTIONS_final.tmp
Final miscellaneous options: {'bloom_bits': 7, 'cache_size': '16000000', 'rate_limiter_bytes_per_sec': None}

maysamyabandeh · 2018-07-24T20:39:46Z

tools/advisor/advisor/db_stats_fetcher.py

+    def parse_log_line_for_stats(log_line):
+        # Note: case insensitive stat names
+        token_list = log_line.strip().split()
+        stat_prefix = token_list[0] + '.'


inline sample input please.

maysamyabandeh · 2018-07-24T20:55:02Z

tools/advisor/advisor/db_config_optimizer.py

+            new_value = int(new_value)
+        elif action is Suggestion.Action.set:
+            # don't care about old value of option
+            new_value = chosen_sugg_val


can you assert it then?

This adds the ConfigOptimizer code, though testing the code is WIP. It also adds some unit tests for the classes Log and DatabaseLog; correspondingly fixes some issues in the 2 classes.

Changes per file:
- db_benchmark_client.py: name of the LOG file
- db_options_parser.py: not storing file_name, since update_options does not update in the original Options file; removed some redundant code; added code for testing (to be moved to unit tests later)
- rule_parser.py: modified OptionCondition to always contain list of options
- db_timeseries_parser.py: added some checks for case when map keys from the TimeSeriesCondition may not be available in the provided data source
- db_stats_fetcher.py: modified LogStatsParser to add statistic to keys_ts map only when it is present in the LOG file; added code for testing (to be moved to unit test later)

- config_optimizer_example.py: changed the way the bench runner is being initialized; also send it ODS arguments
- db_benchmark_client.py: added OdsStatsFetcher object to data_sources returned by run_experiment
- db_config_optimizer.py: added code to output results
- db_stats_fetcher.py: changed the way TimeSeriesCondition keys are processed by LogStatsParser and OdsStatsFetcher
- rule_parser.py: added code for more information in output of rule_parser

- config_optimizer_example: calling the new optimizer method, taking stats_dump_period_sec, db_log_dir as command-line args
- db_benchmark_client: logic for the LOG file name; removed timeout from _run_command(); run_experiment() also returns the throughput obtained at the end of the db_bench run
- db_timeseries_parser: added support for the case when a condition requires evaluate_expression at each epoch instead of with only aggregated values; fetch_aggregated_values() returns aggregated values of statistics for a given entity, earlier it used to return the same for all entities
- db_config_optimizer: moved the code that applies a suggestion from improve_db_config to apply_action_on_value; added run_v2() and improve_db_config_v2(): here one rule is picked at a time, all its suggestions are applied, then bench_runner.run_experiment(new_config) is called; if throughput improves, the new_data_sources returned are used to check for more triggered rules; otherwise, backtrack to the previous config and pick another rule to apply
- db_stats_fetcher: added a parser for the ods cli output; added some more code for testing
- rule_parser: removed the check for 'bursty' conditions in the Rule's is_triggered() method, since evaluate_expression (without aggregation_op) now also returns a list of epochs where the expression evaluates to true
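
The apply, measure, and backtrack loop described for run_v2() can be sketched roughly as below. This is a simplification under stated assumptions: the candidate changes are a fixed list rather than being re-derived from newly triggered rules, and `optimize`, `fake_run_experiment`, `higher_is_better` and the `write_buffer` key are hypothetical stand-ins, not the patch's API.

```python
def optimize(initial_config, candidate_changes, run_experiment, is_metric_better):
    """Greedy loop: keep a change only if the benchmark metric improves."""
    best_config = dict(initial_config)
    best_metric = run_experiment(best_config)
    for change in candidate_changes:
        trial_config = dict(best_config)
        trial_config.update(change)
        metric = run_experiment(trial_config)
        if is_metric_better(metric, best_metric):
            best_config, best_metric = trial_config, metric
        # otherwise backtrack: best_config is left untouched
    return best_config, best_metric


# Stand-in benchmark (hypothetical): throughput peaks at write_buffer=64.
def fake_run_experiment(config):
    return 1000 - abs(config["write_buffer"] - 64)


def higher_is_better(new, old):
    return new > old


config, throughput = optimize(
    {"write_buffer": 32},
    [{"write_buffer": 48}, {"write_buffer": 128}, {"write_buffer": 64}],
    fake_run_experiment,
    higher_is_better,
)
print(config, throughput)
```

In this run the change to 128 lowers the stand-in throughput and is discarded, while the changes to 48 and then 64 are kept.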

Changes per file:
- config_optimizer_example: command-line args support for options that are not supported by options.ini file, but can be given to bench_runner
- db_benchmark_client: fixed the bugs in the location and name of the LOG files; support for using the misc_options as db_bench command-line args; some testing code
- db_options_parser: method for finding diff between 2 option_configs; support for misc_options; some testing code
- rules.ini: support for misc_options
- db_config_optimizer: per-method changes:
  * apply_action_on_value: handle cases when old_value is None
  * improve_db_config: modified to handle the case when a suggestion's option was not in the existing config
  * improve_db_config_v2: same as above
  * disambiguate_guidelines: handle the case of disambiguation when a guideline's option is not in the existing config
  * run_v2: shifted code for picking a rule and getting updated_config to new method: apply_suggestions()
  * apply_suggestions: new method to pick new rule, and get updated_config
  * get_backtrack_config: new method to get config to update options to so that the latest changes applied, are reversed

Changes per file:
- config_optimizer_example: removed the db_log_dir option, since making changes to it might cause DBBenchRunner to crash when it tries to use it as a command-line arg
- db_options_parser: added a method to return all the options in the DatabaseOptions object
- db_benchmark_client: added method to fetch default Rocksdb options used by db_bench in case the default OPTIONS file is not provided; shifted the db_bench output parsing code to a separate method; created a method to build the appropriate command for db_bench; added some testing code

Changes per file:
- db_stats_fetcher: add the DatabasePerfContext class; some code for testing
- db_benchmark_client: modified DBBenchRunner to parse its own output and return a DatabasePerfContext too; changed the return type of run_experiment; added some code for testing
- rule_parser: changes in trigger_conditions to take into account the changes in the data_sources object returned by DBBenchRunner
- db_timeseries_parser: initialise stats_freq_sec in TimeSeriesData constructor

Changes per file:
- config_optimizer_example: command line args ldb, base_db_path
- db_benchmark_client: run_experiment takes db_path
- db_config_optimizer: bootstraps database before each experiment run
- db_timeseries_parser: performing a common (bursty/evaluate_expression) check for entities with all required stats (per condition) in check_and_trigger_conditions

Changes in files:
- config_optimizer_example: removed the ldb argument
- db_benchmark_client: handle the compression option as a command-line arg; bootstrap the database according to the current options
- db_config_optimizer: don't set up the database with default options; leave it to the benchrunner to do the same on its own with the applicable options
- rules.ini: added some more rules and corrected some

Changes per file:
- db_benchmark_client: add staticmethod is_metric_better() to BenchRunner; implement the same in DBBenchRunner to compare throughput
- db_config_optimizer: use bench_runner.is_metric_better() method to decide whether to backtrack or not in the optimization loop

Changes per file:
- bench_runner: shifted BenchmarkRunner abstract class to this file; also shifted method get_info_log_file_name to this class from DBBenchRunner as other bench runners may need it too
- db_bench_runner: removed the OPTIONS diff logic; assumption is that db_bench should be able to handle Rocksdb OPTIONS given as command-line args even when options_file arg is given; added more comments to the code
- ALL FILES: added user-id to the TODO comments

maysamyabandeh · 2018-07-24T23:33:30Z

tools/advisor/advisor/bench_runner.py

@@ -0,0 +1,32 @@
+from abc import ABC, abstractmethod


Is the copyright header missing?

maysamyabandeh · 2018-07-24T23:37:43Z

tools/advisor/advisor/config_optimizer_example.py

+    # these are options that are column-family agnostic and are not yet
+    # supported by the Rocksdb Options file: eg. bloom_bits=2
+    parser.add_argument('--base_db_path', required=True, type=str)
+    parser.add_argument('--misc_options', nargs='*')


This is most likely to be read by non-experts. It would be great to add an inline comment explaining what misc_options is. In fact, it would be great to add a help string for all these options.

Example of a help string:

parser.add_argument('integers', metavar='N', type=int, nargs='+', help='an integer for the accumulator')
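
Applied to the two arguments in the diff, help strings might look like the sketch below. The help text for `--misc_options` paraphrases the comment in the diff; the text for `--base_db_path` and the sample argument values are assumptions.

```python
import argparse

parser = argparse.ArgumentParser(description="Config optimizer example (sketch)")
parser.add_argument(
    "--base_db_path", required=True, type=str,
    help="path of the database the benchmark runs against (assumed meaning)",
)
parser.add_argument(
    "--misc_options", nargs="*",
    help="column-family agnostic options that are not yet supported by the "
         "Rocksdb OPTIONS file, e.g. bloom_bits=2",
)

# Parse a sample command line instead of sys.argv so the sketch is runnable.
args = parser.parse_args(
    ["--base_db_path", "/tmp/db", "--misc_options", "bloom_bits=2"]
)
print(args.base_db_path, args.misc_options)
```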

maysamyabandeh · 2018-07-24T23:40:45Z

tools/advisor/advisor/config_optimizer_example.py

+CONFIG_OPT_NUM_ITER = 10
+
+
+def main(args):


can you add an example command line here (without ods-related args)?

maysamyabandeh · 2018-07-24T23:45:27Z

tools/advisor/advisor/db_bench_runner.py

+        self.db_bench_binary = positional_args[0]
+        self.benchmark = positional_args[1]
+        self.db_bench_args = None
+        self.supported_benchmarks = None


can you add a TODO to move this line to unit test too?

maysamyabandeh · 2018-07-25T00:00:31Z

tools/advisor/advisor/db_bench_runner.py

+                        if tk
+                    }
+                    # add timestamp information
+                    timestamp = int(time.time())


Add a TODO noting that this is a hack and should be replaced with the timestamp that db_bench will provide for each printed perf context.

maysamyabandeh · 2018-07-25T00:04:28Z

tools/advisor/advisor/db_bench_runner.py

+        if not log_dir_path.endswith('/'):
+            log_dir_path += '/'
+
+        logs_file_prefix = log_dir_path + log_file_name


It is not a prefix, is it? Then can we rename it to log_full_path?

Talked offline. Let's keep the name but explain that old files (which share the same prefix) are not currently checked.

added TODO's to the places where these names are being used to fetch logs:
-- in check_and_trigger_conditions() of DatabaseLogs class (db_log_parser.py)
-- in fetch_timeseries() of LogStatsParser class (db_stats_fetcher.py)

Changes per file:
- bench_runner: added example in comment for explanation of get_info_log_file_name
- config_optimizer_example: added help strings to the argparser command-line arguments
- db_bench_runner: added TODO comments
- db_config_optimizer: changed method apply_action_on_value() for better readability, and it throws AssertionError if new_value is None; handling AssertionError in places that call this method; added more comments to the code; removed disambiguate_guidelines() and handling the scope of database-wide options in get_guidelines()
- db_log_parser: added comments for readability
- db_stats_fetcher: added comments for readability

maysamyabandeh · 2018-07-25T22:00:55Z

tools/advisor/advisor/db_config_optimizer.py

+            curr_options, curr_rule, suggestions_dict
+        )
+        conf_diff = DatabaseOptions.get_options_diff(curr_conf, updated_conf)
+        if not conf_diff:  # the current and updated configs are the same


can you add a printf after this if-then so that we would notice if it got stuck in the loop?

maysamyabandeh · 2018-07-25T22:03:33Z

tools/advisor/advisor/db_config_optimizer.py

+        self.rule_parser = rule_parser
+        self.base_db_path = base_db
+
+    def get_guidelines(self, rules, suggestions_dict):


can we remove this function as it is only used in run()?

maysamyabandeh · 2018-07-26T15:49:47Z

tools/advisor/advisor/db_options_parser.py

        all_options = []
+        # Example: in the section header '[CFOptions "default"]' read from the


How many section types do we have? Can you point me to where it is documented?

rocksdb/options/options_parser.h

Line 24 in 17731a4

enum OptionSection : char {

Can you put an inline comment like: "Refer to OptionSection for the other section types."

facebook-github-bot · 2018-07-26T16:07:37Z

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

maysamyabandeh · 2018-07-26T16:41:45Z

tools/advisor/advisor/db_log_parser.py

-
-    def get_time(self):
+        self.column_family = None
+        for col_fam in column_families:


add a sample string

maysamyabandeh · 2018-07-26T16:46:36Z

tools/advisor/advisor/db_log_parser.py

+        self.message = self.message + '\n' + remaining_log.strip()
+
+    def get_timestamp(self):
+        # assumes that the LOG timestamp is in GMT; this means that if this


These lines are extra and distracting. Can you remove them:

+ this means that if this
+ # timestamp is to be converted to a human-readable value, one could
+ # either use the method get_human_readable_time() in this class or
+ # one can use 'datetime.utcfromtimestamp(<timestamp>).isoformat()';
+ # in either case the value returned would match the Log time in the LOG
+ # file
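
A sketch of the round trip the removed comment describes, assuming the LOG time format seen elsewhere in this thread (for example 2018/07/20-14:21:37.111317) and GMT timestamps. The function names are hypothetical, and a timezone-aware `datetime.fromtimestamp` call is used here in place of `utcfromtimestamp`.

```python
import calendar
from datetime import datetime, timezone


def log_time_to_epoch(log_time):
    """Convert a LOG time string to epoch seconds, treating it as GMT."""
    parsed = datetime.strptime(log_time, "%Y/%m/%d-%H:%M:%S.%f")
    # timegm interprets the time tuple as UTC; microseconds are dropped
    return calendar.timegm(parsed.timetuple())


def epoch_to_iso(timestamp):
    """Convert epoch seconds back to a human-readable GMT time."""
    as_utc = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return as_utc.strftime("%Y-%m-%dT%H:%M:%S")


ts = log_time_to_epoch("2018/07/20-14:21:37.111317")
print(ts, epoch_to_iso(ts))
```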

maysamyabandeh · 2018-07-26T17:35:35Z

tools/advisor/advisor/db_options_parser.py

+                # supported by the Rocksdb OPTIONS file, so it is not prefixed
+                # by '<section_type>.' and must be stored in the separate
+                # misc_options dictionary
+                if NO_COL_FAMILY not in options[option]:


In what case would this if condition be true?

As an example, let there be a Suggestion (in the Rules spec) in which the 'option' field is incorrectly specified, say option=write_buffer_size instead of option=CFOptions.write_buffer_size.
This is an undesirable situation and I am adding a print statement here to print a Warning to the user.

maysamyabandeh · 2018-07-26T17:40:16Z

tools/advisor/advisor/db_options_parser.py

+            for ix, option in enumerate(cond.options):
+                if option not in reqd_options_dict:
+                    missing_reqd_option = True
+                    break  # required option is absent


In what cases could this happen?

Example: there is an OptionCondition that requires the 'bloom_bits' option in its expression to be evaluated, however, the bloom filter is not enabled yet and this option is not part of the misc_options dictionary.

maysamyabandeh · 2018-07-26T17:47:26Z

tools/advisor/advisor/db_stats_fetcher.py

+        stat_values = [
+            token
+            for token in token_list[1:]
+            if token != ':'


What happens to the ones with :? like "rocksdb.db.get.micros P50 :"
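
A sketch of how such a line could be parsed end to end. The first three statements follow the diff; the pairing of the remaining tokens into name and value, the lower-casing, and the numbers in the sample line are assumptions for illustration.

```python
def parse_log_line_for_stats(log_line):
    """Parse 'stat.name P50 : 8.4 P95 : 21.8 ...' into a dict of floats."""
    token_list = log_line.strip().split()
    stat_prefix = token_list[0] + "."  # 'rocksdb.db.get.micros.'
    # the ':' separators are dropped; metric names and values are kept
    stat_values = [token for token in token_list[1:] if token != ":"]
    # Assumption: what remains alternates metric name, metric value.
    stat_dict = {}
    for name, value in zip(stat_values[0::2], stat_values[1::2]):
        stat_dict[(stat_prefix + name).lower()] = float(value)
    return stat_dict


# Made-up values, for illustration only.
sample = "rocksdb.db.get.micros P50 : 8.4 P95 : 21.8 P99 : 33.9 P100 : 92.0"
print(parse_log_line_for_stats(sample))
```

Under these assumptions only the ':' tokens are discarded, so "P50" survives as part of the key 'rocksdb.db.get.micros.p50'.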

maysamyabandeh · 2018-07-26T18:08:35Z

tools/advisor/advisor/db_timeseries_parser.py

+    def fetch_timeseries(self):
+        pass
+
+    def fetch_burst_epochs(


Can you add inline comments explaining what each function does? Ditto for the next functions as well.

maysamyabandeh · 2018-07-26T18:36:01Z

tools/advisor/advisor/ini_parser.py

@@ -62,7 +62,7 @@ def get_element(line):
    def get_key_value_pair(line):
        line = line.strip()
        key = line.split('=')[0].strip()
-        value = line.split('=')[1].strip()
+        value = "=".join(line.split('=')[1:])
        if not value:


In what case could value be None?

(The OptionsSpecParser is a subclass of IniParser)
I added this check for checking for empty string, for example in the OPTIONS file we can have a line:
"db_log_dir="
Changing the if-condition to explicitly check for empty string for clarity
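
A sketch of the parsing behaviour discussed here: everything after the first '=' is kept as the value, and an empty value is checked for explicitly. Returning None for an empty value and stripping whitespace from the value are assumptions, as is the sample line containing a second '='.

```python
def get_key_value_pair(line):
    """Split 'key=value' on the first '=' only; empty values become None."""
    line = line.strip()
    key, _, value = line.partition("=")
    key = key.strip()
    value = value.strip()
    if value == "":  # explicit empty-string check, e.g. 'db_log_dir='
        value = None
    return key, value


print(get_key_value_pair("db_log_dir="))
print(get_key_value_pair("some_option=a=b"))  # hypothetical line
```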

maysamyabandeh · 2018-07-26T18:38:39Z

tools/advisor/advisor/rule_parser.py

+                    in_seconds *= (24 * 60 * 60)
+                self.overlap_time_seconds = in_seconds
+
+    def get_overlap_timestamps(self, key1_trigger_epochs, key2_trigger_epochs):


Ditto: inline comments explaining what the function does.

maysamyabandeh · 2018-07-26T18:42:41Z

tools/advisor/advisor/rule_parser_example.py

+from advisor.rule_parser import RulesSpec
+
+
+def main(args):


Is this a test file? If yes, can we rename it?

I used this file to run and manually test the Rules parser when that was the only code. I am not using it at all now, so I am deleting the file.

maysamyabandeh · 2018-07-26T18:44:13Z

tools/advisor/advisor/rules.ini

@@ -29,7 +29,6 @@ conditions=stall-too-many-memtables
 [Condition "stall-too-many-memtables"]
 source=LOG


Where do we explain the format to the users? If we do not already have such a place, should we do it as an inline comment at the top of this file?

maysamyabandeh · 2018-07-26T18:44:49Z

tools/advisor/test/input_files/test_rules.ini


 [Condition "log-2-true"]
 source=LOG
 regex=Stalling writes because we have \d+ level-0 files
-scope=column_family


Why did we remove the scope keyword?

I thought that the code should be able to figure out the scope on its own when the triggers are set for the conditions.

…ions This commit improves the comments that were added to the parsing methods in the previous commit. It adds a new method is_misc_option() to the DatabaseOptions class that isolates the logic for checking whether a given option is supported by the Rocksdb OPTIONS file or not (if it is not, it is a 'misc_option'). Added the word 'WARNING' where the eval() method might fail. Added assert() for a Suggestion's option and action fields to be non-empty in the db_config_optimizer.

maysamyabandeh · 2018-07-26T23:10:20Z

tools/advisor/advisor/db_options_parser.py

@@ -232,6 +239,10 @@ def update_options(self, options):
                # by '<section_type>.' and must be stored in the separate
                # misc_options dictionary
                if NO_COL_FAMILY not in options[option]:
+                    print(
+                        'WARNING(DatabaseOptions): check format of option: ' +


can you print a more descriptive warning?

maysamyabandeh · 2018-07-26T23:11:12Z

tools/advisor/advisor/db_options_parser.py

+                    print(
+                        'WARNING(DatabaseOptions): condition ' + cond.name +
+                        ' requires option ' + option + ' but this option is' +
+                        ' not available'


Also, let's say that we are skipping the condition.

facebook-github-bot

@maysamyabandeh is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

…k#4169) Summary: In facebook#3934 we introduced advisor scripts that make suggestions in the config options based on the log file and stats from a run of rocksdb. The optimizer runs the advisor on a benchmark application in a loop and automatically applies the suggested changes until the config options are optimized. This is a work in progress and the patch is the initial skeleton for the optimizer. The sample application that is run in the loop is currently dbbench. Pull Request resolved: facebook#4169 Reviewed By: maysamyabandeh Differential Revision: D9023671 Pulled By: poojam23 fbshipit-source-id: a6192d475c462cf6eb2b316716f97cb400fcb64d

poojam23 requested a review from maysamyabandeh July 23, 2018 21:23

facebook-github-bot added the CLA Signed label Jul 23, 2018

maysamyabandeh reviewed Jul 24, 2018

poojam23 added 16 commits July 24, 2018 15:05

Adding support for ODS metric rules to the advisor

a30e8ac

Fixing the rule_parser output and its unit tests

cfac73a

the wrong version of the optimizer

ea082d5

updated TimeSeriesData, OdsStatsFetcher, added DBBenchStatsParser

494ad84

updated DBBenchRunner, fixed some bugs in Stats classes

4d54e50

poojam23 force-pushed the optimizer-jul23 branch from fd5700c to 1bffadd Compare July 24, 2018 22:09

maysamyabandeh reviewed Jul 24, 2018

maysamyabandeh reviewed Jul 25, 2018

More comments for readability;NO_COL_FAMILY;remove old optimization loop

b2ac4b4

maysamyabandeh reviewed Jul 26, 2018

maysamyabandeh changed the title ~~Optimizer: merging commits from 2 PRs into one~~ Optimizer's skeleton: use advisor to optimize config options Jul 26, 2018

This was referenced Jul 26, 2018

[WIP] A base PR for the rocksdb configuration optimizer. #4047

Closed

[WIP] Optimizer code #4119

Closed

DatabaseOptions: use is_misc_option(); update WARNING statements

773fd8a

maysamyabandeh approved these changes Jul 26, 2018

facebook-github-bot reviewed Jul 26, 2018

facebook-github-bot closed this in 134a52e Jul 27, 2018

maysamyabandeh mentioned this pull request Jul 27, 2018

Adding support for ODS rules in advisor #3978

Closed

