Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reworked boolean-only Option processing framework for the OMR Compiler #3675

Open
wants to merge 10 commits into
base: master
from

Conversation

@nbhuiyan
Copy link
Contributor

commented Mar 18, 2019

This PR introduces some of the main components of the new option processing framework for the OMR Compiler that aims to replace the existing Options class. This was motivated by the fact that the current Options class is difficult for developers to work with and not in an ideal state for an open-source project (see #2835). This is the first of a handful of PRs in both OMR and OpenJ9 necessary to migrate to a new option processing framework. During the transition period, the new and old implementations will co-exist, but not used at the same time by default.

Overview

Some of the main components of the new options processing framework being introduced in this PR are the following:

  • OMROptions.json
  • options-gen
  • A new option table design
  • OptionProcessors
  • OptionsBuilder
  • CompilerOptionsManager (will be renamed as OptionsManager once existing Options class is deprecated)
  • CompilerOptions (will be renamed as Options once existing Options class is deprecated)

The sections below discusses the roles of each of these components, and how they work together.

OMROptions.json

OMROptions.json contains all the OMR compiler options as an array of JSON objects and is consumed by the options-gen script at build time. Here is an example of an entry in the JSON array followed by the description of the fields:

{
    "name": "x86UseMFENCE",
    "category": "M",
    "desc": "Enable to use mfence to handle volatile store",
    "option-member": "TR_X86UseMFENCE",
    "type": "bool",
    "default": "false",
    "processing-fn": "setTrue",
    "subsettable": "no"
}
  • name: The name field is used as an identifier for an option. Every entry should have a unique name. Options provided in the command line and the environment are matched against this field.
  • category: The category field is used to group the options into different categories (eg, codegen, optimization, logging, etc). In this case, "M" stands for miscellaneous options. This information will be used when outputting options help text.
  • desc: Option description text explaining what this option does. It will be used when outputting options help text.
  • option-member: This field is used to identify the field in the options class that stores the value of the option.
  • type: Type of the option data.
  • default: This is the value the option is initialized as, and will remain that way unless changed through the command line, environment variable, or some other logic after options have been processed. For boolean options, the default is false. Numeric and textual data may use this field to set a default value, and will also have the option of using an additional field in the json object to provide as a parameter to the processing function.
  • processing-fn: This is the name of the function in the TR::OptionProcessors class that will be used to process this option if it appears on the command line or environment variable.
  • subsettable: This field is used to set whether an option can be a part of an option subset to apply to specific method compilations. This capability will be added in the near future, so currently this field is not used during the processing of options.

To add a new option, all that you would need to do is add a new entry anywhere in the json file and re-build. If the available option processing functions do not meet your requirements, you may define a new function in TR::OptionProcessors and specify its name in the processing-fn field. Currenlty, OMROptions.json is a very large file because it contains downstream project options as well. Downstream project options should exist in their own Options.json. Since there are a lot of options that can be moved downstream, this migration will happen over time as downstream projects get support for the new option processing framework.

options-gen

Simplifying the process of adding new options was possible through the use of the options-gen script run at build time. The options-gen script takes 2 input files: OMROptions.json containing the OMR options, and Options.json (or any other name you may define as a command line option to the script) containing the project-specific options. Other command-line options include the path to the input files, as well as the -platform option that will exclude or include platform-specific options (this mechanism is yet to be implemented in the near future, but it will basically be another field in the JSON options object allowing you to specify platform(s) to apply an option to)

The script takes the entries in OMROptions.json and Options.json, combining them into a single list, and processing them into an internal representation as an instance of class OptionTable. The internal representation is then used to generate a hash table that is used at run-time when processing the options. At build time, options-gen determines the index The name of each entry is hashed using a hashing algorithm that is also used at run-time to get the index into the bucket of the hash table containing these processed entries.

The options-gen script currently outputs 6 files:

  • OptionCharMap.inc: an array of case-insensitive values associated with the characters used in the hashing function to determine the index into the hash table. This array is indexed into using the ascii values.
  • OptionTableProperties.inc: contains macro definitions regarding the option table properties such as min/max hash value, table size, etc
  • OptionTableEntries.inc: contains an aggregate initialization list for the hash table
  • Options.inc: contains the data members of the CompilerOptions class
  • OptionTranslatingSwitch.inc: contains the body of a switch statement that helps translating between the option word masks used in the existing Options class and the corresponding pointer to member of the new CompilerOptions class. This is temporarily in place to enable querying the new options class using the existing getOption(TR_CompilationOptions o) and setOption(TR_CompilationOption o) functions.
  • OptionEnumToStringSwitch.inc: contains the body of a switch statement that translates the existing option masks to strings that can be used for debugging cases of option setting mismatches

A new option table design

Previously, the option table was an array alphabetically ordered by the option names. Finding the entry corresponding to a command line option string in the array required a binary search in the array, comparing the string each time. In the new framework, the option table is a hash table that is indexed into using a hash function.

OLD DESIGN:

{ {aa},
  {ab},
  {ad},
  {ba},
  .
  .
  NULL
}

NEW DESIGN:

{ {{aa}},
  {{ab},{ba}},
  {},
  {{ad}},
  .
  .
  {}
}

In the example above, notice how the hash table is not perfect, or minimal, although a lot of work was done to get close to that trying to mimic gperf's perfect hash function generating mechanism. This is a possible area of improvement we could perhaps revisit later to completely eliminate string comparisons during the processing of command line options. The hash function simply gives us the index into a bucket in the hash table that contain a single entry, or a number of entries that hash to the same value, requiring us to perform string comparisons to get the right entry. At the moment, it doesn't seem worth the trouble as we only need to use the option table during once during the initialization phase.

OptionProcessors

OptionProcessors is an extensible class that contains the option processing functions. There will be a set of standard processing functions (eg, setTrue, setFalse, setInt32, setStringFromOptionArg, etc) that you may use for processing an option you add to the json file. If you would need your option to be processed in a very specific way not handled by existing processors (eg, option affecting multiple members of the option class), you may simply define your own and specify its name in the processing-fn field of the json object.

OptionsBuilder

This class handles the processing of command line and environment variable options and processing of the options (as well as the option sets) during the initialization.

CompilerOptions

This class contains all of the options data members by including the Options.inc file generated by options-gen.

CompilerOptionsManager

CompilerOptionsManager is an extensible class handling the initialization of options, looking up entries in the option table, obtaining the CompilerOptions object specific to a compilation, setting a project's default options (ones that can only be determined at runtime).


In addition to the components above, this PR also implements a solution using preprocessor directives to have both old and new option processing frameworks coexist within #if/#else blocks. It is also possible to verify that the outcome of querying CompilerOptions is same as the existing Options class when NEW_OPTIONS_DEBUG is defined.

Currently, the new options are initialized with the existing options to make the transition easier, but eventually this will be managed by the CompilerOptionsManager.

You may build using CMake to test adding and querying of options. To use the new options, run CMake configure with -DOMR_COMPILER_NEW_OPTIONS.

To enable checking for discrepancies between the new and old options, run CMake configure with the additional flag: -DOMR_COMPILER_NEW_OPTIONS_DEBUG.

@0xdaryl 0xdaryl self-assigned this Mar 21, 2019

@fjeremic

This comment has been minimized.

Copy link
Contributor

commented Mar 25, 2019

Simplifying the process of adding new options was possible through the use of the options-gen script run at build time.

@dnakamura FYI for CMake changes.

@fjeremic fjeremic added the cmake label Mar 25, 2019

@nbhuiyan nbhuiyan force-pushed the nbhuiyan:options-gen branch from e74c429 to 1bc4a50 Apr 3, 2019

@nbhuiyan nbhuiyan force-pushed the nbhuiyan:options-gen branch from 1bc4a50 to 953f9ed May 16, 2019

@nbhuiyan nbhuiyan force-pushed the nbhuiyan:options-gen branch 2 times, most recently from 8e06fa5 to 6192dc7 May 27, 2019

@nbhuiyan nbhuiyan marked this pull request as ready for review May 28, 2019

@nbhuiyan nbhuiyan changed the title WIP: Reworked boolean-only Option processing framework for the OMR Compiler Reworked boolean-only Option processing framework for the OMR Compiler May 29, 2019

@0xdaryl

This comment has been minimized.

Copy link
Contributor

commented Jul 30, 2019

I know this doesn't strictly need it (according to GitHub), but would you mind rebasing this? I'd like to schedule a few tests with it.

nbhuiyan added some commits Feb 14, 2019

WIP: Initial boolean-only option processing framework
Includes:
 * OMROptions.json
 * options-gen.py
 * A new option table design
 * A new option processing and querying framework

Signed-off-by: Nazim Uddin Bhuiyan <nazimudd@ualberta.ca>
Use member initializer list for CompilerOptions class members
This change was necessary due to some build compilers not having
full C++11 support. A new file, OptionInitializerList.inc is
generated by options-gen.

Signed-off-by: Nazim Uddin Bhuiyan <nazimudd@ualberta.ca>
Use a 2d array of OptionTableItem for the option hash table
Previously, a vector of OptionTableItem was used to represent
a bucket in the hash table, and the table consisted of an array
of such vectors representing buckets.

Unfortunately, older build compilers with incomplete C++11 support
may not have support for using initializer lists for initializing
std::vector, which is a non-aggregate type.

Signed-off-by: Nazim Uddin Bhuiyan <nazimudd@ualberta.ca>
Improve handling of the new Compiler Options in CMake builds
This commit adds a number of changes:
* Added new CMake configuration option OMR_COMPILER_NEW_OPTIONS. By
  default, this option is OFF, and would need to be turned on if you
  want to use the new options instead. Another configuration option,
  OMR_COMPILER_NEW_OPTIONS_DEBUG enables comparing results of querying
  the old and the new options.
* Add -outputdir commandline option to options-gen script to specify
  output dir to write generated files to
* Generate .inc files in build/compiler/control instead of generating
  them in the source tree
* Make generate-options target as a requirement for compiler projects
  instead of being part of ALL (provided OMR_COMPILER_NEW_OPTIONS is ON)

Signed-off-by: Nazim Uddin Bhuiyan <nazimudd@ualberta.ca>
Introduce a temporary solution for querying options using current API
options-gen now generates the body of a swith statement that returns
the pointer to member in the new CompilerOptions class based on the
TR_CompilationOptions enum value. This opens up a way to remove the
need to replace all instances of boolean option querying using
getOption(TR_CompilationOptions o), simplifying the transition phase.

Signed-off-by: Nazim Uddin Bhuiyan <nazimudd@ualberta.ca>
Add some missing options to the OMROptions.json file
Signed-off-by: Nazim Uddin Bhuiyan <nazimudd@ualberta.ca>
Generate OptionEnumToStringSwitch.inc
This is required for having a mechanism to translate option
masks to strings to help with debugging options.

Signed-off-by: Nazim Uddin Bhuiyan <nazimudd@ualberta.ca>
Add TR::CompilerOptions* _newOptions member to class OptionSets
This is implement the initial option set support in the new
options.

Signed-off-by: Nazim Uddin Bhuiyan <nazimudd@ualberta.ca>
Implement Option Set support in the new options, and more
Some of the major changes this commit introduces:
* The new options are initialized during the initialization
  of TR::Options as a temporary solution for the transition phase.
* The option processing responsibility of looking up entries
  in the options hash table is now performed by OptionsBuilder
  instead of CompilerOptionsManager.
* The TR::Options::getOptions and TR::Options::setOptions methods
  now set/query both old and new options, and comparing the results.
  This comparision is in place to identify any mismatches early on
  during the transition phase.
* Every instance of TR::Options class has a corresponding
  TR::CompilerOptions object.
* OptionsBuilder's option processing mechanism has been improved
  to handle option filter expressions and processing them into
  option sets.

Signed-off-by: Nazim Uddin Bhuiyan <nazimudd@ualberta.ca>
Sync OMROptions.json with OMROptions.hpp
Added/removed entries in OMROptions.json to reflect the
recent additions/deletions made in OMROptions.hpp and
the option table.

Signed-off-by: Nazim Uddin Bhuiyan <nazimudd@ualberta.ca>

@nbhuiyan nbhuiyan force-pushed the nbhuiyan:options-gen branch from 6192dc7 to cf5ad55 Jul 31, 2019

@nbhuiyan nbhuiyan requested a review from rwy0717 as a code owner Jul 31, 2019

@nbhuiyan

This comment has been minimized.

Copy link
Contributor Author

commented Jul 31, 2019

@0xdaryl I have rebased the branch. To get a build that uses the new options, you will need to run CMake configure with -DOMR_COMPILER_NEW_OPTIONS=ON as the new options are off by default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.