@vkarak
Contributor

@vkarak vkarak commented Jun 21, 2018

This PR provides two major contributions:

  1. A new build system infrastructure.
  2. A major redesign of the internal infrastructure of shell script generation.

New build system infrastructure

The compilation phase is now separated from the programming environment.
It is part of the new concept of build systems. A build system is
responsible for generating the required commands for compiling a
code. The framework then uses these commands to generate a build job
script which is then submitted. Currently, only local compilation is
supported.

The current behavior of the sourcepath and sourcesdir attributes is
maintained for convenience. Internally, they translate to concrete build
systems that are set up accordingly.

All build systems share some basic attributes and behavior:

  • The compilers and compilation flags. If these are not specified, the
    corresponding values from the current programming environment are
    used.
  • The ability to completely ignore the current programming
    environment. A build system may be configured independently of the
    current programming environment by explicitly setting the compilers
    and compilation flags.

The above design allows the programming environment to become immutable,
holding the global default values for each system.
For backward compatibility, it is not yet immutable, but
setting its attributes is now deprecated.

Two build systems are provided by this commit:

  1. SingleSource: This build system is responsible for compiling a
    single source file in any of the recognized programming languages,
    i.e., C, C++, Fortran and CUDA.

  2. Make: This build system is responsible for compiling a project
    using the make command.
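
The fallback behavior described above can be sketched as follows. This is only an illustration under stated assumptions, not ReFrame's actual implementation: the class names (`ProgEnvSketch`, `BuildSystemSketch`, `SingleSourceSketch`, `MakeSketch`), the `emit_build_commands()` method and the `CC=`/`CFLAGS=` make variables are all hypothetical and chosen for this example.

```python
class ProgEnvSketch:
    """Hypothetical programming environment holding the default values."""

    def __init__(self, cc, cflags):
        self.cc = cc
        self.cflags = list(cflags)


class BuildSystemSketch:
    """Hypothetical base class: unset attributes fall back to the environment."""

    def __init__(self, cc=None, cflags=None):
        self.cc = cc            # None means "use the environment's compiler"
        self.cflags = cflags    # None means "use the environment's flags"

    def _resolve(self, environ):
        cc = self.cc if self.cc is not None else environ.cc
        cflags = self.cflags if self.cflags is not None else environ.cflags
        return cc, list(cflags)

    def emit_build_commands(self, environ):
        raise NotImplementedError


class SingleSourceSketch(BuildSystemSketch):
    """Compiles a single source file into an executable."""

    def __init__(self, source, executable, **kwargs):
        super().__init__(**kwargs)
        self.source = source
        self.executable = executable

    def emit_build_commands(self, environ):
        cc, cflags = self._resolve(environ)
        return [' '.join([cc, *cflags, self.source, '-o', self.executable])]


class MakeSketch(BuildSystemSketch):
    """Builds a project by invoking make with the resolved compiler and flags."""

    def emit_build_commands(self, environ):
        cc, cflags = self._resolve(environ)
        cmd = ['make', 'CC=' + cc]
        if cflags:
            cmd.append('CFLAGS="' + ' '.join(cflags) + '"')
        return [' '.join(cmd)]


env = ProgEnvSketch(cc='gcc', cflags=['-O2'])

# Nothing set on the build system: the environment's values are used
print(SingleSourceSketch('hello.c', 'hello').emit_build_commands(env))

# Compiler and flags set explicitly: the environment is ignored
print(MakeSketch(cc='icc', cflags=['-O3']).emit_build_commands(env))
```

In this sketch the build system only returns a list of shell commands; turning them into a build job script is left to the caller, which mirrors the separation described in the text.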

Redesign of the shell script generation infrastructure

Shell script generation is revised significantly. Here are the key
points of this revision:

  • ReFrame generates only Bash. This has always been the case, but the
    previous design was meant to "enable" generation of other types of
    shell scripts. This pseudo-generic design was completely dropped and
    replaced by a Bash-only script generator. The rationale is that fully
    portable script generation would require an intermediate language
    that interfaces with the shell script generators, and users would
    have to program in that language when setting the pre_run, post_run
    etc. attributes. This is way too much effort with zero or negative
    value.

    Another solution considered was to globally control the shell type
    used by ReFrame from the settings. The downside of this solution is
    that user tests become inherently less portable if a user writes
    their pre_run, post_run etc. commands in, say, fish.

    Finally, supporting multiple shell script generation backends would
    make the internal design more complicated and less consistent, since
    the different parts of the framework that need to emit shell commands
    would be required to do that through another API and not directly.

    For all these reasons, and given that Bash is the "standard" POSIX
    shell, I decided to completely drop the old pseudo-generic design in
    favor of a simpler and more consistent behavior across the framework.

  • The generated shell scripts are more sophisticated now. They can trap
    different events (signals, errors and exit) during their execution and
    act accordingly. These additions are crucial for two parts of the
    framework:

    1. For the generated build script. We want to trap errors in order to
      exit immediately if any of the commands fails, without requiring the
      user to take extra precautions when setting prebuild_cmd and
      postbuild_cmd.

    2. For getting reliable exit and signal code information for job
      scheduler backends that do not support it. By trapping
      signals (with a "terminate" or "core dump" default action) and the
      shell exit, we can always record both the signal number and the
      exit code in the output. The scheduler backend can then retrieve
      this information.

      NOTE: This feature is not yet implemented.

  • A shell script inside ReFrame consists of four parts:

    1. The shebang, i.e., the very first line of the script.
    2. The prolog
    3. The body
    4. The epilog

    Anyone wanting to generate a script may decide in which part (except
    the shebang) to emit their commands. When the finalize()
    method is called, the whole script is generated. ReFrame emits its
    traps between the prolog and the body.

  • The Job abstract class no longer holds information that is not
    directly related to job creation and status, such as pre_run,
    post_run, executable etc. Instead, it provides a new, richer
    version of the prepare() method for generating the job script. This
    new prepare() has the following signature:

    def prepare(self, commands, environs, **gen_opts):

    The commands argument is a list of the actual shell commands to be
    emitted in the job script. The caller is responsible for filling up
    this list.

    The environs argument is a list of the environments to be loaded
    before emitting the commands.

    The gen_opts arguments are passed through to the Bash script generator.

  • This PR also establishes conventions for the functions emitting shell
    code. These should start with the emit_ prefix and must return a
    list of shell commands. They must not accept a shell script generator
    object as an argument. The standard way of generating a shell script is
    the following:

    import reframe.core.shell as shell
    
    with shell.generate_script(filename) as gen:
        gen.write_prolog(emit_foo1())
        gen.write_body(emit_bar())

    The generate_script context manager takes care of the file creation
    and the finalization of the shell script.
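
The script layout and the emit_ convention described above can be sketched as follows. This is a minimal illustration, not the code in reframe.core.shell: the `ScriptGeneratorSketch` class, the `generate_script_sketch()` helper, the `trap_errors` option, the `write_epilog()` method and the two `emit_` functions are assumptions made for this example. The `trap 'exit $?' ERR` line is standard Bash and is emitted between the prolog and the body.

```python
import contextlib
import os
import tempfile


class ScriptGeneratorSketch:
    """Hypothetical Bash-only generator: shebang, prolog, traps, body, epilog."""

    def __init__(self, trap_errors=False):
        self._prolog, self._body, self._epilog = [], [], []
        self._trap_errors = trap_errors

    def write_prolog(self, commands):
        self._prolog += commands

    def write_body(self, commands):
        self._body += commands

    def write_epilog(self, commands):
        self._epilog += commands

    def _emit_traps(self):
        # Exit with the failing command's status as soon as a command fails
        return ["trap 'exit $?' ERR"] if self._trap_errors else []

    def finalize(self):
        # Assemble the parts in order; traps go between prolog and body
        lines = ['#!/bin/bash', *self._prolog, *self._emit_traps(),
                 *self._body, *self._epilog]
        return '\n'.join(lines) + '\n'


@contextlib.contextmanager
def generate_script_sketch(filename, **gen_opts):
    """Create the generator and write the finalized script on normal exit."""
    gen = ScriptGeneratorSketch(**gen_opts)
    yield gen
    with open(filename, 'w') as fp:
        fp.write(gen.finalize())


# emit_ functions return plain lists of shell commands and never take
# the generator object as an argument
def emit_setup_env():
    return ['export OMP_NUM_THREADS=1']


def emit_build():
    return ['make -j 4']


script_path = os.path.join(tempfile.mkdtemp(), 'build.sh')
with generate_script_sketch(script_path, trap_errors=True) as gen:
    gen.write_prolog(emit_setup_env())
    gen.write_body(emit_build())

with open(script_path) as fp:
    print(fp.read())
```

Because the emit_ functions return plain lists, any part of the framework can produce shell commands without knowing about the generator, and the caller alone decides whether they land in the prolog, the body or the epilog.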

Fixes #295, #297, #296, #339, #176.

Still todo:

Vasileios Karakasis added 2 commits June 21, 2018 11:15
Create a different type that accepts Sequences.
self._fflags = fflags
self._ldflags = ldflags
self._include_search_path = []
self._propagate = True
Contributor Author

I am not treating this variable in the new build systems. I should fix that for compatibility reasons.

class BuildSystemError(ReframeError):
    """Raised when a build system is not configured properly."""


Contributor Author

I should also remove the CompilationError. Not valid any more.

Contributor Author

Actually, I need to keep the CompilationError and change it. Previously, we used to print the compilation output. We still need to treat such errors specially. Perhaps we could print the path to the compilation output.



class TestProgEnvironment(unittest.TestCase):
class _TestProgEnvironment:
Contributor Author

Haha, a workaround I forgot to remove. I should fix this file.


# FIXME: this check is not reliable for certain scheduler backends
if self._build_job.exitcode != 0:
    raise PipelineError('compilation failed')
Contributor Author

I should perhaps raise a more specific error here, e.g., BuildError. What do you think?

@codecov-io

codecov-io commented Jun 21, 2018

Codecov Report

Merging #340 into master will increase coverage by 0.06%.
The diff coverage is 91.65%.

@@            Coverage Diff            @@
##           master    #340      +/-   ##
=========================================
+ Coverage   91.13%   91.2%   +0.06%     
=========================================
  Files          68      70       +2     
  Lines        8268    8581     +313     
=========================================
+ Hits         7535    7826     +291     
- Misses        733     755      +22

| Impacted Files | Coverage Δ |
| --- | --- |
| reframe/core/runtime.py | 87.39% <ø> (ø) ⬆️ |
| reframe/core/schedulers/pbs.py | 65.38% <100%> (-2.48%) ⬇️ |
| reframe/settings.py | 100% <100%> (ø) ⬆️ |
| unittests/resources/checks/hellocheck_make.py | 100% <100%> (ø) ⬆️ |
| reframe/core/schedulers/local.py | 100% <100%> (ø) ⬆️ |
| reframe/frontend/executors/__init__.py | 97.61% <100%> (+0.02%) ⬆️ |
| unittests/test_shell.py | 100% <100%> (ø) |
| reframe/frontend/executors/policies.py | 96.56% <100%> (+0.03%) ⬆️ |
| unittests/test_launchers.py | 93.9% <100%> (-0.15%) ⬇️ |
| unittests/test_pipeline.py | 97.37% <100%> (+0.02%) ⬆️ |

... and 23 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cd295f1...12f682a.

This PR also brings some fixes in the unit tests regarding the
resources directory.
@vkarak vkarak changed the title WIP: Build systems infrastructure [WIP] Build systems infrastructure Jun 26, 2018
@vkarak
Contributor Author

vkarak commented Jun 27, 2018

@jenkins-cscs retry daint kesch monch

@vkarak
Contributor Author

vkarak commented Jul 11, 2018

@teojgo @victorusu When reviewing this PR, I suggest playing a bit with your tests and trying to adapt them to the build system syntax. This way you can get a better feeling for how it works and whether there are things that you don't like and that need to be done differently.

- The old `CompilationError` is removed.
- This new exception just prints the location of the standard output and
  error of the build job.
@vkarak
Contributor Author

vkarak commented Jul 11, 2018

The regression tests affected by this PR are a lot. I don't think it makes sense to include them in this one, cos it will become huge. I see two options: either we merge this to the master and cope with the deprecation warnings for some time in production or I will do a separate PR on this one (to be reviewed separately) that will update the affected tests. What do you think?

@teojgo
Contributor

teojgo commented Jul 12, 2018

I think that everyone can take some regression tests that they are most familiar with and adapt them to the new syntax/new build system. That way we can also practice both of these new features and complete the transition as quickly as possible.

@vkarak vkarak requested review from kraushm and rsarm July 18, 2018 09:02
@vkarak vkarak changed the title [WIP] Build systems infrastructure Build systems infrastructure Jul 18, 2018
@vkarak
Contributor Author

vkarak commented Jul 18, 2018

Removed WIP.

@vkarak vkarak changed the title Build systems infrastructure [feat] Add a new infrastructure for build systems Jul 18, 2018
Contributor

@victorusu victorusu left a comment

There are a lot of changes... I reviewed twice and I could not find anything that is broken...
I have also tested with the new build system with the DGEMM test and it worked fine for me...

Contributor

@teojgo teojgo left a comment

Other than the duplicate job name line it looks fine.

def emit_preamble(self):
    preamble = [
        self._format_option(self.name, '--job-name="{0}"'),
        self._format_option(self.name, '--job-name="{0}"'),
Contributor

The --job-name is set twice here.

Vasileios Karakasis added 2 commits July 24, 2018 20:32
Contributor

@teojgo teojgo left a comment

lgtm now

Development

Successfully merging this pull request may close these issues.

Create class infrastructure to support multiple build systems

4 participants