Skip to content
This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

Refactor platform handling #716

Open
johananl opened this issue Jul 13, 2020 · 8 comments
Open

Refactor platform handling #716

johananl opened this issue Jul 13, 2020 · 8 comments
Assignees
Labels
area/testing refactoring size/xl Issues which likely require multiple work weeks technical-debt Technical debt-related issues

Comments

@johananl
Copy link
Member

johananl commented Jul 13, 2020

General

At the moment the platform package has multiple responsibilities:

  • It defines which Terraform modules are used to deploy a cluster on a specific platform.
  • It handles the Terraform execution logic for each platform. This logic is largely generic and therefore duplicated among the platforms.
  • If Fix constant Terraform diff bug #655 is merged, the package will also define which Helm charts are used by a platform.
  • If Fix constant Terraform diff bug #655 is merged, the package will also handle the extraction of Helm-related assets to disk.

We should consider refactoring the platform package to have a single purpose: defining the contents of a Lokomotive platform. The rationale for doing the refactor:

  • Less code duplication.
  • Better delineation between the packages. This makes it easier to figure out where a piece of code should live and also makes testing easier.

Following the refactoring, the following should be true:

  • The platform package exposes a minimal API which allows specifying the "recipe" for deploying a Lokomotive cluster on a given cloud (or bare metal) platform. The package doesn't perform any API calls, file operations or other forms of I/O.
  • Specifically, the platform package doesn't depend on the terraform package. Any code related to executing Terraform operations should be generic (i.e. not platform specific) and reside in the terraform package.
  • Specifically, the platform package doesn't depend on the assets package. Any code related to generating assets into the binary and extracting assets to disk should be generic and reside in the assets package.

Implementation

When implementing this change, we should look for cases where we have "specialized" logic for a platform and ideally eliminate such cases. This is important if we want to make Terraform-related code generic. An example of such specialized logic:

func (c *config) Initialize(ex *terraform.Executor) error {
if c.AuthToken == "" && os.Getenv("PACKET_AUTH_TOKEN") == "" {
return fmt.Errorf("cannot find the Packet authentication token:\n" +
"either specify AuthToken or use the PACKET_AUTH_TOKEN environment variable")
}
if err := c.DNS.Validate(); err != nil {
return errors.Wrap(err, "parsing DNS configuration failed")
}
assetDir, err := homedir.Expand(c.AssetDir)
if err != nil {
return err
}
terraformRootDir := terraform.GetTerraformRootDir(assetDir)
return createTerraformConfigFile(c, terraformRootDir)
}

func (c *config) Initialize(ex *terraform.Executor) error {
assetDir, err := homedir.Expand(c.AssetDir)
if err != nil {
return err
}
terraformRootDir := terraform.GetTerraformRootDir(assetDir)
return createTerraformConfigFile(c, terraformRootDir)
}

As can be seen above, the Packet implementation of Initialize() contains things like API token validation and DNS config validation whereas the AWS implementation of the same function does not.

If we realize we can't make Terraform-related code completely generic, we can look into allowing arbitrary logic to be specified in a platform implementation, for example using callback functions. We should do our best to keep things well defined if we go in this direction. For example, we may want to consider defining a small set of "hooks" to allow a platform implementation to specific, for example, that some arbitrary logic needs to be run before/after executing terraform apply, before/after applying Lokomotive components etc.

A central part of this refactoring should be re-examining the Platform interface and changing it so that it properly matches the responsibility of the platform package and exposes a limited set of methods which every platform implementation should implement.

@invidian, @iaguis - FYI.

@johananl johananl added technical-debt Technical debt-related issues refactoring area/testing labels Jul 13, 2020
@johananl
Copy link
Member Author

A relevant discussion: #655 (comment)

@invidian
Copy link
Member

IMO The main constraint for the refactoring would be to make platform-specific packages (e.g. pkg/platform/aws, pkg/platform/packet) to contain only platform-unique information. This would include:

  • provide configuration scheme and configiration validation rules (validation rules should not validate environment variables, which are not part of the configuration, but of the runtime)
  • which Helm charts should be used as part of the controlplane (as the list would be based on the configuration)
  • Terraform apply/destroy execution flow (as platform may require different flows as it is currently the case)
  • provide generic platform information (like expected number of initial nodes)
  • Terraform code for cluster
  • define default configuration (currently NewConfig() method in aws package)

Following actions should be generic to all platforms:

  • terraform initialize
  • Extracting Helm charts
  • Extracting Terraform modules

Configuration loading should be done separately from the platform-specific package, perhaps in HCL package (which we do not have right now). I guess this can be left for later.

Also related: #666 #499 #478

@johananl johananl self-assigned this Jul 15, 2020
@johananl johananl added the size/xl Issues which likely require multiple work weeks label Sep 3, 2020
@rugwirobaker
Copy link

rugwirobaker commented Sep 4, 2020

Is it a must for baremetal to use matchbox? Cause I am exploring other projects in the same space. https://github.com/plunder-app/plunder and http://tinkerbell.org/ to name a few.

@invidian
Copy link
Member

invidian commented Sep 4, 2020

@rugwirobaker I'm not sure what you mean, but you might also be interested in #392, where experimental support for Tinkerbell is implemented.

@rugwirobaker
Copy link

@invidian just skimming through the code in pkg/platform the structure seemed it is intended to support just one bare-metal provisioner. I'm gonna go ahead and checkout it.

@invidian
Copy link
Member

invidian commented Sep 4, 2020

@rugwirobaker ah, I know what you mean now. We also have #383 to actually rename baremetal to matchbox, so we can support more.

@rugwirobaker
Copy link

@invidian is there somewhere a setup guide using Tinkerbell?

@invidian
Copy link
Member

invidian commented Sep 4, 2020

There are some hints here: https://github.com/kinvolk/lokomotive/pull/392/files#diff-7196abe142cd5822d69f345a960f986fR1

For setting up Tinkerbell, follow https://tinkerbell.org/docs/setup/.

For the configuration structure, you can look at https://github.com/kinvolk/lokomotive/pull/392/files#diff-f0ab264028d0de5d78c3665c00a126f2R31-R72.

I hope that helps, given that the PR is not fully ready yet.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/testing refactoring size/xl Issues which likely require multiple work weeks technical-debt Technical debt-related issues
Projects
None yet
Development

No branches or pull requests

3 participants