Lighter Runtime #1762

Merged
merged 91 commits into from Mar 15, 2021

@refs refs commented Mar 4, 2021

What?

Replaces the current OS thread-based runtime with supervision trees and adds the option to run the entire set of extensions under a single process. All prior compatibility is preserved: extensions can still run individually, unsupervised.

Known issues

  • Config parsing using Viper is broken (see issue), and we have to find a way around this to support configuration via config files.
  • The entire flag parsing has to be rethought now that we dug into go-micro.
  • Reva services are not being stopped on context cancellation.

There are more TODOs and remarks on the JIRA ticket.

Ticket: https://jira.owncloud.com/browse/OCIS-1715

TODO

  • add JSON tags for config files. Currently, when configuring Reva extensions, keys are case sensitive because they get mapped to Go type names, and these must match. Look into using struct tags.
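
A minimal sketch of what the struct-tag TODO could look like, using only the standard library's encoding/json. The Proxy and HTTP types, the debug_addr key and the parseProxy helper are hypothetical and are not the real ocis-pkg/config structs; the point is only that a tag decouples the key in the file from the Go field name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HTTP and Proxy are hypothetical config types used for illustration only;
// they are not the actual ocis-pkg/config structs.
type HTTP struct {
	Addr string `json:"addr"`
}

type Proxy struct {
	HTTP HTTP `json:"http"`
	// Without the tag, the key "debug_addr" would not match the field DebugAddr.
	DebugAddr string `json:"debug_addr"`
}

// parseProxy decodes a JSON document into a Proxy value.
func parseProxy(data []byte) (Proxy, error) {
	var p Proxy
	err := json.Unmarshal(data, &p)
	return p, err
}

func main() {
	raw := []byte(`{"http": {"addr": "0.0.0.0:6200"}, "debug_addr": "0.0.0.0:6201"}`)
	p, err := parseProxy(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.HTTP.Addr, p.DebugAddr)
}
```

With tags in place, the keys in a config file can follow one naming convention regardless of how the Go fields are named.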

Some Considerations:

  • subcommands MUST also set MICRO_LOG_LEVEL to error.
  • normalize flag parsing (and fix hack of renaming ocis top level for the destination side effect)
  • Metrics endpoint in Proxy changed, now we do not panic. What happens if we kill the proxy and run it again? Same metrics?
  • config file parsing with Viper is no longer possible as viper is not thread-safe (Is viper thread safe? spf13/viper#19)
    • when an extension runs on supervised mode, ocis parses the global config
  • configure a maximum number of retries in suture before all initialization comes to a halt.
  • Reva runtime currently does not have support for closing a service when a context is canceled??
  • reuse the runtime service struct to control suture.
  • shutting down a service implies iterating over all the serviceToken for the given service name and terminating them.
  • to avoid this getting out of hand, a supervisor would need to be injected on each supervised service.
  • each service would then add its execute func to the supervisor, and return its token (?)
  • this runtime should only care about start / stop services, for that we use serviceTokens.
  • normalize the use of panics so that suture can restart services that die. return achieves the same effect.
  • the runtime should ideally run as an RPC service one can send requests to, like the good ol' pman, rest in pieces.
  • remove default log flagset values.

More Considerations

  • solve the annoying issue with:
    • Failed to shutdown server error="context canceled" server=debug service=proxy
  • scaling on a single node works. Example editing the config:
    • create /etc/ocis/proxy.json
    • start ocis using the runtime
    • modify the proxy.json config HTTP and debug ports
    • ocis run proxy
    • there are 2 proxy instances now
  • deal with the error:
    • panic: send on a closed channel
  • [tusd] logging format is not in sync with the rest of the services
  • sharing is broken, could it be the same issue with the home storage config we had a few days ago? duplicate config downstream.
  • when running supervised, the group run should not listen for OS interrupts
  • does web need: //Flags: flagset.RootWithConfig(cfg), ?

Docs


Configuration via config files

As soon as #1762 gets merged, oCIS admins will be able to define a global configuration file to configure the entire node. Values present in such a file correspond to those of ocis-pkg/config.

oCIS offers a wide range of ways to be configured, from admin-friendly config files to CLI flags and cloud / 12factor environment variables, but with so many sources it is really easy to get lost in translation. Let's dig into precedence and operation modes.

Modes of Operation

To understand precedence, let's first define the modes of operation. Depending on how an extension runs, it is either supervised or unsupervised. Think Erlang supervisors. Whenever the ocis single binary's server command is invoked, all default embedded extensions are started under supervision and bootstrap a fully functional oCIS instance with its default in-memory configuration. On the other hand, when an extension is run standalone with the single binary (picture an external extension built outside the single binary), i.e. ocis run idp, there is no supervision tree looking after this extension. In other words, the extension looks only after itself, and it is up to the extension to recover from panics, restart, or fork itself in case of unrecoverable states.

Simply put, ocis server runs using goroutines and supervision trees, everything else runs as a single, independent process.

Precedence

Supervised

An extension running supervised receives its configuration from the following sources (from highest to lowest precedence):

  1. cli flags
  2. environment variables
  3. config files

Given the nature of oCIS, cli flags can't make it all the way to subcommands, so oCIS does not propagate cli flags down to extensions. To see the list of supported CLI flags for oCIS run ocis help.
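
The supervised precedence listed above can be sketched as a simple "first value that is set wins" lookup. This is an illustration only, not how oCIS resolves configuration internally; resolveAddr, firstSet and the PROXY_HTTP_ADDR_EXAMPLE variable name are made up for the example, and the sketch treats an empty string as "not set".

```go
package main

import (
	"fmt"
	"os"
)

// firstSet returns the first non-empty candidate, or fallback if all are empty.
func firstSet(fallback string, candidates ...string) string {
	for _, c := range candidates {
		if c != "" {
			return c
		}
	}
	return fallback
}

// resolveAddr applies the precedence from the list above: CLI flag first,
// then environment variable, then config file, then the built-in default.
func resolveAddr(flagVal, envVal, fileVal, def string) string {
	return firstSet(def, flagVal, envVal, fileVal)
}

func main() {
	// PROXY_HTTP_ADDR_EXAMPLE is a made-up variable name for this sketch.
	env := os.Getenv("PROXY_HTTP_ADDR_EXAMPLE")

	// No flag given; the value from the config file applies unless env is set.
	fmt.Println(resolveAddr("", env, "0.0.0.0:6200", "127.0.0.1:0"))
}
```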

Unsupervised

  • TODO

Example Global Configuration

proxy:
  http:
    addr: "0.0.0.0:6200"
accounts:
  http:
    addr: "0.0.0.0:8888"
Storage:
  Reva:
    StorageMetadata:
      Port:
        HTTPAddr: "0.0.0.0:7777"

@refs refs mentioned this pull request Mar 14, 2021
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>

butonic commented Mar 15, 2021

@refs I think you got around the config parsing using a mutex, right? Tick it off from the 'known issues' list? We also redid the flag parsing. I'll check the reva cancelation.

sonarcloud bot commented Mar 15, 2021

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
15 Code Smells (rating A)

2.7% Coverage
31.9% Duplication

butonic commented Mar 15, 2021

ok, to shutdown reva we need a context option on the runtime.RunWithOptions()... but not as part of this PR

@butonic butonic merged commit 208f19c into master Mar 15, 2021
@delete-merged-branch delete-merged-branch bot deleted the ocis-1715-lighter-runtime branch March 15, 2021 16:28
ownclouders pushed a commit that referenced this pull request Mar 15, 2021
Merge: fa7cce9 9fa77a2
Author: Jörn Friedrich Dreyer <jfd@butonic.de>
Date:   Mon Mar 15 17:28:51 2021 +0100

    Merge pull request #1762 from owncloud/ocis-1715-lighter-runtime

    Lighter Runtime

refs commented Mar 15, 2021

ok, to shutdown reva we need a context option on the runtime.RunWithOptions()... but not as part of this PR

thanks for looking into this one :) I would add a follow-up ticket for the fast lane
