Release the OCF RA API 1.1 standard #24

merged 8 commits into from Mar 26, 2021

kgaillot commented Mar 3, 2021

The main goals of this revision are to formalize existing widespread practice that deviates from or extends the 1.0 standard, and to add optional new meta-data hints that user interfaces can benefit from.

A summary of changes from the 1.0 standard:

  • Much of the document has been reworded and reorganized for consistency and clarity.

  • The official home of the standard has been changed from opencf.org to github.com/ClusterLabs.

  • The version number is now 1.1. (Obviously.)

  • Resource types (agent names) should be suitable for use as a file name.

  • Resource types and provider names that start with a dot (.) are a hint that the item should be omitted from lists provided to users.

  • The resource agent directory, which could only be /usr/ocf/resource.d in 1.0, is now left to installers. Resource managers may search multiple directories.

  • The resource agent meta-data syntax is now described in greater detail, and the schema is provided in RNG format rather than DTD. Changes in meta-data syntax:

    • The resource agent itself may have longdesc and shortdesc elements (previously, only parameters did).

    • It is now discouraged to use XML within longdesc, shortdesc, and desc, and instead limit the content to a text string.

    • The "unique" attribute is deprecated, and replaced with two new, optional attributes. "unique-group" is a hint that the combination of values of all parameters sharing the same "unique-group" value should be unique among instances of the resource type. "reloadable" is a hint that changes in the parameter can be made to take effect with the new, optional "reload-agent" action instead of a full stop and start.

    • The new "required" attribute may be given as a hint as to whether the user must specify the parameter.

    • The new "deprecated" child element is a hint that a parameter is deprecated and supported for backward compatibility only. It may contain zero or more "replaced-with" child elements to indicate parameters that should be used instead.

    • A parameter's content element may now have a type of "select", in which case it must have one or more "option" subelements listing the specific values allowed for the parameter.

    • The "special" element is now optional.

  • Changes in resource actions (besides "reload-agent" mentioned above):

    • Resource agents may optionally support the new notify action to coordinate multiple instances of the service running simultaneously in a cluster. The exact behavior is left to resource managers and agents, but may be clarified in a future version.

    • Resource agents may optionally support two modes of operation, called roles, named "unpromoted" and "promoted". The new promote action brings a service to the promoted role, and the start action and new demote action bring it to the unpromoted role. The role names were chosen to be service-agnostic (unlike master/slave, primary/secondary, controller/worker, etc.). The unpromoted role was not named demoted, to avoid implying that a particular instance was previously promoted (it may have been started in the unpromoted role and never promoted). The unpromoted role was not named base, default, or standard because those are vague and could be easily misunderstood when context is unavailable.

    • Resource agents may now define arbitrary additional actions, beyond the mandatory and optional actions whose behavior is described by the standard.

  • Changes in environment variables:

    • Resource managers and agents may optionally support the OCF_OUTPUT_FORMAT environment variable to select a format to be used (such as "text" or "xml") for user output.

    • Resource agents may optionally use the existing OCF_CHECK_LEVEL environment variable with validate-all actions (as well as monitor, as before). With validate-all, an OCF_CHECK_LEVEL of 0 or unspecified checks the internal consistency of parameters only, without regard to the local environment, while an OCF_CHECK_LEVEL of 10 may additionally verify the suitability of the local environment.

  • Changes in exit statuses:

    • Resource agents are recommended, but not required, to use exit status 2 for parameters that are invalid in the context of the local host (such as a nonexistent configuration file), and exit status 6 for parameters that are internally invalid (such as a string given where only an integer is allowed).

    • New exit statuses have been defined: 8 for properly running in the promoted role, 9 for failed in the promoted role, 190 for properly running but degraded, and 191 for properly running in the promoted role but degraded.

    • No undefined exit statuses are reserved.

  • The "to-do list" has been removed.
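
For user interface authors, the meta-data hints listed above could be read roughly as follows. This is only a sketch under stated assumptions, not text from the standard: the sample XML, the agent and parameter names, the "1"/"0" encoding of "required" and "reloadable", the "value" attribute on "option", and the "name" attribute on "replaced-with" are all illustrative, so check them against the RNG schema before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical meta-data fragment; every name and attribute value is illustrative.
SAMPLE = """\
<resource-agent name="example">
  <longdesc lang="en">Hypothetical agent used only for this sketch.</longdesc>
  <shortdesc lang="en">Example agent</shortdesc>
  <parameters>
    <parameter name="ip" unique-group="address" required="1" reloadable="0">
      <content type="string"/>
    </parameter>
    <parameter name="mode" reloadable="1">
      <content type="select" default="auto">
        <option value="auto"/>
        <option value="manual"/>
      </content>
    </parameter>
    <parameter name="old_mode">
      <deprecated>
        <replaced-with name="mode"/>
      </deprecated>
      <content type="string"/>
    </parameter>
  </parameters>
</resource-agent>
"""


def parameter_hints(xml_text):
    """Collect the 1.1 hints for each parameter into a plain dict."""
    root = ET.fromstring(xml_text)
    hints = {}
    for param in root.iter("parameter"):
        content = param.find("content")
        deprecated = param.find("deprecated")
        # Compare against None explicitly: an Element without children is falsy.
        replaced = [] if deprecated is None else [
            r.get("name") for r in deprecated.iter("replaced-with")
        ]
        options = [] if content is None else [
            o.get("value") for o in content.iter("option")
        ]
        hints[param.get("name")] = {
            "unique_group": param.get("unique-group"),
            "required": param.get("required") == "1",      # assumed encoding
            "reloadable": param.get("reloadable") == "1",  # assumed encoding
            "deprecated": deprecated is not None,
            "replaced_with": replaced,
            "options": options,
        }
    return hints
```

Since all of these are hints, a user interface might use "options" to build a drop-down and "replaced_with" to suggest a migration, while still accepting whatever the user enters.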
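
To make the exit status changes above concrete, here is a minimal sketch of how an agent written in Python might choose among them. The numeric values come from the list above; the member names, the function names, and the "port" and "config_path" parameters are hypothetical labels for illustration and are not quoted from the standard.

```python
import os
from enum import IntEnum


class ExitStatus(IntEnum):
    """Numeric values from the summary above; member names are illustrative."""
    SUCCESS = 0
    INVALID_ON_THIS_HOST = 2     # e.g. a nonexistent configuration file
    INTERNALLY_INVALID = 6       # e.g. a string where an integer is required
    RUNNING_PROMOTED = 8
    FAILED_PROMOTED = 9
    DEGRADED = 190
    DEGRADED_PROMOTED = 191


def validate_parameters(port, config_path):
    """Return the recommended status for two hypothetical parameters."""
    if not port.isdigit():
        # The value is wrong no matter which host evaluates it.
        return ExitStatus.INTERNALLY_INVALID
    if not os.path.exists(config_path):
        # The value is well formed but unusable on the local host.
        return ExitStatus.INVALID_ON_THIS_HOST
    return ExitStatus.SUCCESS


def monitor_status(promoted, degraded):
    """Map a running instance's role and health to a monitor exit status."""
    if promoted:
        return ExitStatus.DEGRADED_PROMOTED if degraded else ExitStatus.RUNNING_PROMOTED
    return ExitStatus.DEGRADED if degraded else ExitStatus.SUCCESS
```

A real agent would end with something like `sys.exit(validate_parameters(...))`. Because use of 2 and 6 is recommended rather than required, resource managers should not assume every agent distinguishes the two cases.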

nrwahl2 commented Mar 3, 2021

Love it. I especially like the improved definitions of exit status codes and OCF_CHECK_LEVEL values, and I think unique-group is a major improvement over unique.

kgaillot commented Mar 5, 2021

@tomjelinek @oalbrigt @liangxin1300 @MalloZup @ioguix : My goal is to merge this in a couple of weeks. Please look this over to consider how it might affect your projects, and whether any final changes are needed.

kgaillot force-pushed the ocf1.1 branch 2 times, most recently from 4337681 to c80de58 (March 22, 2021 16:34)

kgaillot commented Mar 22, 2021

I have updated this PR with four new commits:

  • c8d035a ra/next: Drop recommendation that resource agent directory be identical
  • 01b4e72 ra/next: OCF means Open Cluster Framework, not Clustering
  • f93046e ra/next: Rename reload-params action to reload-agent
  • 00c9ef5 ra/next: Capitalize headings consistently

which are carried over to 1.1 via the updated "bring in proposed changes" commit.

@nrwahl2 @ioguix and anyone else interested, please review and let me know what you think. I hope to merge this at the end of this week or early next week.

nrwahl2 commented Mar 22, 2021

@kgaillot Capitalization skips "Parameter passing" and "Global OCF attributes". Not sure if intentional.

Everything looks good to me otherwise.

I felt a little bit iffy about "reload-agent", since the agent is sort of getting reloaded on every recurring monitor as-is -- but I can't think of any alternative that's clearer. I get that there may be certain actions that the agent only takes during start or reload-agent. So I think this is fine and we can rely on the description for clarification.

... as suggested by Reid Wahl <nrwahl@protonmail.com>
... as suggested by Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
It's always good to get your own name right :)
It conflicted with allowing resource managers to search multiple directories.
The only (slight) differences from "next/" are that this uses "1.1" instead of
"next" in URLs, and that "status of this memo" is different (i.e., this is the
proposed draft of the 1.1 standard).

kgaillot commented

> @kgaillot Capitalization skips "Parameter passing" and "Global OCF attributes". Not sure if intentional.

Whoops, not intentional. Fixed.

kgaillot commented

Last chance for comments; I'll merge tomorrow.

kgaillot merged commit d22386d into ClusterLabs:master on Mar 26, 2021