Rose suite run proposals cylc flow.rc #44

Merged 25 commits, Aug 27, 2019
c10ebdc
Proposed changes
wxtim Jul 24, 2019
b162aa3
moved some detail on future CLI into another folder.
wxtim Aug 1, 2019
22b6a4a
thoughts on cylc-flow.rc
wxtim Aug 1, 2019
227314b
changed a file name
wxtim Aug 1, 2019
1291a50
Work on cylc-flow
wxtim Aug 15, 2019
656304b
Update docs/rose-suite-run-proposal/cylc-flow-rc.md
wxtim Aug 19, 2019
f3ebca2
Update docs/proposal-rose-suite-run.md
wxtim Aug 19, 2019
9faf4ad
stuiff
wxtim Aug 19, 2019
f2c55da
Merge branch 'rose-suite-run-proposals-cylc-flow.rc' of github.com:wx…
wxtim Aug 19, 2019
0485e4e
Update docs/proposal-rose-suite-run.md
wxtim Aug 19, 2019
804e5c6
Update docs/proposal-rose-suite-run.md
wxtim Aug 19, 2019
3d8098c
Update docs/proposal-rose-suite-run.md
wxtim Aug 19, 2019
fb51d6e
fixes based on review
wxtim Aug 19, 2019
f155061
Merge branch 'rose-suite-run-proposals-cylc-flow.rc' of github.com:wx…
wxtim Aug 19, 2019
e921d1c
remove unwanted file
wxtim Aug 19, 2019
b5d78bd
spag changes
wxtim Aug 19, 2019
2c61713
modified suite-rc spec with thoughts from Dave Matthews
wxtim Aug 21, 2019
e0c418e
stiff
wxtim Aug 22, 2019
758a75c
Update docs/proposal-rose-suite-run.md
wxtim Aug 22, 2019
8f24575
Merge branch 'rose-suite-run-proposals-cylc-flow.rc' of github.com:wx…
wxtim Aug 22, 2019
e90cb34
deleted a file which sould never have been committed
wxtim Aug 22, 2019
aabe0c6
deleted a file which sould never have been committed
wxtim Aug 22, 2019
08d7de1
Refactored
wxtim Aug 22, 2019
d8534b5
Added thoughts on locking down global settings.
wxtim Aug 27, 2019
e1b6570
merged with master
wxtim Aug 27, 2019
21 changes: 15 additions & 6 deletions docs/proposal-rose-suite-run.md
@@ -12,6 +12,7 @@ The work described in this document aims to:
* Change the way hosts for jobs are selected to improve support for clusters.
* Rationalize the formats of settings `*.rc` files.


## Background

In the early days of Rose, there was the desire for Rose to be an umbrella
@@ -40,16 +41,18 @@ all suites will start up with only Cylc commands in the future.
- [ ] Add sections for `[job platforms]`


-- [ ] Implement new `cylc-flow.rc` form. [Cylc-flow #3260](https://github.com/cylc/cylc-flow/issues/3260)
+- [ ] Implement new `cylc-flow.rc` schema. [Cylc-flow #3260](https://github.com/cylc/cylc-flow/issues/3260)
- [ ] Check old tests for `global.rc` & `suite.rc` to ensure that functionality
is not lost.
-- [ ] Devise tests for the new flow.rc
-- [ ] Create new config specification, perhaps called `config_schema.py` &
-remove the config schemas folder.
+- [ ] Devise tests for the new `cylc-flow.rc`
+- [ ] Create new config schema module, called `cylc.flow.config_schema`
+- [ ] Remove the `cylc.flow.cfgspec` folder.


- [ ] Implement Cluster support functionality. [Cylc-flow #2199](https://github.com/cylc/cylc-flow/issues/2199)
-- [ ] Modify `task_job_mgr.py` to use the new variables.
+- [ ] Modify modules to use the new variables:
+- [ ] `task_job_mgr.py`
+- [ ] `task_remote_mgr`
- [ ] Randomize login host used


@@ -74,6 +77,7 @@ all suites will start up with only Cylc commands in the future.

These functionalities are currently provided by Rose, but should really be part
of Cylc:

* On start up and reload, install suite on suite server cluster (cylc servers).
Contributor

Maybe suite running hosts or suite running platform?

Member

@hjoliver, Aug 18, 2019

I'd prefer "suite run hosts" or "suite run platform".

I'm still not sure about "platform" in general. I forget the original argument against "cluster" ... but maybe "cluster" is OK even for a single host (the minimum size limit of a cluster!)

Member Author

The idea is that "platform" is a more general/neutral term and that, although clusters can include clusters of 1, the term generally implies clusters of size >> 1.
I think that a customer hearing "cluster" will automatically assume we are talking about HPC/large compute resources, whereas "platforms" covers submitting jobs to the raspi on my desk, my smartwatch, my desktop, Matt's desktop, a cylc server, or an HPC or SPICE system.

Member Author

I've made it "suite run platform" for the time being.

* On start up and reload, validate suite.
* On start up, archive old log directory.
@@ -107,6 +111,7 @@ While we consider the above, we may also want to consider the following:
subsets of settings:
![Venn Diagram showing expected common usage](img/flow_rc_settings_locs.svg)
* Migrate relevant settings from `rose.conf` and `rose-suite.conf`.
__UPDATE THIS__
* Settings such as `run directory` and `work directory` may need better names
(users think of "work" as a sub-directory of the run directory, but `run
directory` and `work directory` are configured separately, and the latter
@@ -149,10 +154,12 @@ configuration logic. Some points to consider:
* Users will configure tasks to run on clusters instead of hosts/batch systems.
* If relevant, improve alignment with DRMAA Open Grid Forum API?


So, for example, a suite `suite-flow.rc` might look like this:
(Although a detailed specification should also be created)
```ini
[cylc]

...

[scheduling]
@@ -244,8 +251,10 @@ sources on installation. Other things to consider:
### Suite Validation

The `rose suite-run` command calls `cylc validate --strict` by default.

Automatic suite validation should become the default behaviour for the new
command, as well as for `cylc reload`.


### Rationalise Suite Start Up Commands

184 changes: 184 additions & 0 deletions docs/rose-suite-run-proposal/cylc-flow-rc.md
@@ -0,0 +1,184 @@
# Proposal example of a new style `cylc-flow.rc` file

## References
* [cylc-admin PR #40](https://github.com/cylc/cylc-admin/pull/40)
* [cylc-admin work plan](../proposal-rose-suite-run.md)

## Purpose
As part of the work to transfer functionality from `rose suite-run` to Cylc,
it is proposed that the options available in the global, user and suite config
files are brought into alignment. This file sets out a specification for the
combined `cylc-flow.rc` file, although for the foreseeable future `suite.rc`
will also be read in the same way.

## Preamble

To reduce the changes required of end users, this file format is based on
`suite.rc`. Where aspects of the file are described as unchanged, the file
contents are exactly the same as they would be in a `suite.rc`.
Users of `global.rc` are usually admins or power users, so it is preferable
that any format changes fall on that file.

Although it will be possible to modify all settings in all contexts, some
settings are more likely to be used in global contexts and some are more
likely to be used in suite configs. It has been proposed that it ought to be
possible for sysadmins to lock some settings in site configurations.

****

## Specification for `cylc-flow.rc`

### 1. Top level sections to come from `suite.rc`

Most sites will leave these to users (although I could imagine a site adding a
copyright message to `[meta]` by default, for example, and one user has
suggested that a very simple runtime might be added too, for training and
debugging purposes).

These items are:
```ini
[meta]
[scheduling]
[visualization]
```

* `[cylc]` will be renamed `[general]`

### 2. Top level sections from site configuration

It is likely that most users will continue to have these set by site admins.

#### 2.1 Small changes

* `[authentication]` becomes `[authorization]`

* `[suite servers]` will be renamed `[suite run platforms]` for consistency
with job platforms.

* `[test battery]` will be removed entirely.

* `[task events]` will be moved to `[runtime][[root]][[[events]]]`
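
The renames above could be applied to an already-parsed configuration along the following lines. This is only a sketch: it assumes the configuration is held as nested Python dictionaries, and `SECTION_RENAMES`, `REMOVED_SECTIONS` and `upgrade_sections` are hypothetical names, not part of Cylc. The rule that an existing `[runtime][[root]][[[events]]]` value takes precedence over a moved `[task events]` value is also an assumption.

```python
import copy

# Old top-level section name -> proposed new name (from the renames in this document).
SECTION_RENAMES = {
    "cylc": "general",
    "authentication": "authorization",
    "suite servers": "suite run platforms",
}
# Sections the proposal removes entirely.
REMOVED_SECTIONS = {"test battery"}


def upgrade_sections(old_config):
    """Return a copy of a parsed config dict with the proposed renames applied."""
    old = copy.deepcopy(old_config)  # never mutate the caller's config
    task_events = old.pop("task events", None)

    new = {}
    for name, value in old.items():
        if name in REMOVED_SECTIONS:
            continue  # dropped in the new format
        new[SECTION_RENAMES.get(name, name)] = value

    if task_events is not None:
        # [task events] moves to [runtime][[root]][[[events]]].
        root = new.setdefault("runtime", {}).setdefault("root", {})
        events = root.setdefault("events", {})
        for key, value in task_events.items():
            # Assumption: a value already set under [[root]] wins.
            events.setdefault(key, value)
    return new
```

Working on a deep copy keeps the old-style configuration available, which may help while `suite.rc` continues to be read alongside the new file.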

#### 2.2 `[suite run platforms]`
This is the section formerly known as `[suite servers]`, renamed only to keep
the use of "platforms" consistent. It is expected to be set only by system
administrators. It should include the former top-level section
`[[suite host self-identification]]`.

#### 2.3 `[cylc]` -> `[general]`

`[cylc]` will be renamed `[general]`.

At present `global.rc[cylc]` contains a subset of the items available in
`suite.rc[cylc]`, so it is proposed that the new file simply has the larger
set. These items are:
```ini
[cylc]
health check interval = 600
task event mail interval = 300
[[events]]
```

### 3. Job Platforms and the deprecation of `[runtime][[TASK]][[[job]]]host`

#### 3.1 `[job platforms]`
Many of the options in this section will be very similar to those in `[hosts]`.
It is expected that these will mainly be set at site level, but that a
small number of power users may wish to over-ride them.

```ini
[job platforms]
[[example platform]]
run directory =
work directory =
task communication method =
submission polling intervals =
execution polling intervals =
scp command =
ssh command =
use login shell =
login hosts = # list of possible login hosts
batch system = # name of batch system
cylc executable =
global init-script =
copyable environment variables =
retrieve job logs =
retrieve job logs command =
retrieve job logs max size =

retrieve job logs retry delays =
task event handler retry delays =
tail command template =
[[batch systems]]
[[__MANY__]]
err tailer =
out tailer =
err viewer =
out viewer =
job name length maximum =
execution time limit polling intervals =


[[default directives]] # This is probably something to do
--some-directive="directive here!" # sometime after cylc8
```
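
Since `login hosts` is a list of possible login hosts, one of them has to be chosen when a job is submitted. The sketch below picks one at random. It assumes a platform is held as a Python dictionary keyed by setting name; `select_login_host` and the host names are hypothetical and are not part of Cylc.

```python
import random


def select_login_host(platform, rng=random):
    """Pick one entry at random from a platform's `login hosts` list."""
    hosts = platform.get("login hosts") or []
    if not hosts:
        raise ValueError("platform defines no login hosts")
    return rng.choice(hosts)


# Hypothetical platform definition, for illustration only.
example_platform = {"login hosts": ["login01", "login02", "login03"]}
chosen = select_login_host(example_platform)
```

Passing a seeded `random.Random` instance as `rng` makes the choice repeatable, which may be useful in tests.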

#### 3.2 Legacy Hosts behaviour
`[hosts]` will be deprecated, but many of its settings need to be kept in
`[job platforms]`. For backward compatibility, `host` should re-direct to
`[runtime][[__MANY__]][[[platform]]]`. If the re-mapped `host` belongs to a
platform defined in `[job platforms]` then that job will use that platform.
If a user wishes to over-ride this they can over-ride the
`[job platforms][[PLATFORM]]` section.
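
One way the legacy `host` re-mapping might work is sketched below. It assumes that a host "belongs to" a platform when it appears in that platform's `login hosts` list, which this proposal does not state; `platform_for_host` and the example platform names are hypothetical.

```python
def platform_for_host(host, job_platforms):
    """Return the name of the first platform listing `host`, or None."""
    for name, settings in job_platforms.items():
        if host in settings.get("login hosts", []):
            return name
    return None  # no match: the caller decides how to treat a bare host


# Hypothetical [job platforms] content, for illustration only.
job_platforms = {
    "desktop": {"login hosts": ["localhost"]},
    "hpc": {"login hosts": ["hpc-login1", "hpc-login2"]},
}
```

Returning `None` for an unmatched host leaves the fallback behaviour open, since the proposal does not yet say what should happen in that case.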


### 4. Top level sections to merge in a more complex way

#### 4.1 `[[[job]]]` & `[[[remote]]]`
The old `[runtime][[__MANY__]][[[job]]]` and `[[[remote]]]`
sections will be merged and rationalized, and replaced by a new
`[runtime][[__MANY__]][[[job]]]` section.

Here we should select a platform defined in `[job platforms]`:

```ini
[runtime]
[[__MANY__]]
[[[job]]]
platform =
```

I think that we will probably want users to set these options in the
`[job platforms]` section, leaving some of them here as due-to-be-deprecated
back-compat over-rides which will give warnings if set?

```ini
batch system =
batch submit command template =
execution polling intervals =
execution retry delays =
execution time limit =
submission polling intervals =
submission retry delays =
host =
owner =
suite definition directory =
retrieve job logs =
retrieve job logs max size =
retrieve job logs retry delays =
[[[batch systems]]]
err tailer =
out tailer =
err viewer =
out viewer =
job name length maximum =
execution time limit polling intervals =
```
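
If the task-level options do remain as deprecated over-rides, resolving a job's settings might look like the sketch below. It assumes both the platform and the task's `[[[job]]]` section are flat Python dictionaries and that an unset option is represented by `None`; `resolve_job_settings` is a hypothetical name, not an existing Cylc function.

```python
import warnings


def resolve_job_settings(platform_settings, task_job_settings):
    """Start from the platform's settings and apply task-level over-rides."""
    resolved = dict(platform_settings)
    for key, value in task_job_settings.items():
        if key == "platform" or value is None:
            continue  # `platform` selects the platform; unset options are skipped
        warnings.warn(
            f"[job]{key} is a deprecated over-ride; "
            "set it in [job platforms] instead",
            DeprecationWarning,
            stacklevel=2,
        )
        resolved[key] = value
    return resolved
```

Note that Python hides `DeprecationWarning` by default in most contexts, so a real implementation would more likely report this through Cylc's own logging.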

### Locking down global settings
There should be a mechanism by which system administrators can lock global
settings for their sites. This should probably be a `lock=True/False` switch
within each setting that we wish to make lockable. If unset this will
default to `False`. If set to `True`, a user who tries to over-ride that setting
will see a warning explaining that the setting has been over-ridden by a site
admin.
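
A minimal sketch of how such a lock might behave when site and user settings are merged is given below. The representation is an assumption: each site setting is a dictionary with a `value` and an optional `lock` flag, and user settings are plain values. `merge_with_locks` is a hypothetical name, not part of Cylc.

```python
import warnings


def merge_with_locks(site_settings, user_settings):
    """Merge user settings over site settings, honouring site locks."""
    merged = {name: entry["value"] for name, entry in site_settings.items()}
    for name, value in user_settings.items():
        entry = site_settings.get(name, {})
        if entry.get("lock", False):  # `lock` defaults to False when unset
            warnings.warn(
                f"'{name}' has been locked by a site admin; "
                "the user value is ignored"
            )
            continue
        merged[name] = value
    return merged
```

Here a locked setting keeps the site value and the user only sees a warning, rather than the suite failing to start. Whether a hard error would be preferable is left open by the proposal.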