Scipian Customer Onboarding #3

Open
wants to merge 4 commits into
base: master
122 changes: 122 additions & 0 deletions 0001-customer-onboarding.md
@@ -0,0 +1,122 @@
- Feature Name: `scipian_customer_onboarding`

Is this feature generic customer onboarding or is it specifically onboarding with existing terraform infrastructure?

The rest of the document seems to lean towards onboarding with existing terraform infrastructure. For teams that are looking to onboard and use the terraform controller to create their infrastructure, I imagine the process would be pretty similar, minus the terraform importing, but can teams onboard to Scipian for non-terraform or infrastructure management use cases? For example, if they just wanted to use Scipian to deploy to their existing infrastructure?

Author

This first RFC is meant to stay relatively generic while focusing on some internal needs for customers currently using scipctl and stack who will need to onboard to Scipian. This will include importing existing Terraform state; however, this is not a requirement for Scipian, just a scenario we need to support in this first MVP.

- Start Date: 2019-06-21
- RFC PR: [scipian/rfcs#3](https://github.com/scipian/rfcs/pull/3)
- Community Issue: [scipian/community#6](https://github.com/scipian/community/issues/6)

# Summary
[summary]: #summary

Scipian customer onboarding will allow a seamless process for customers to
onboard to the Scipian platform. At a high level, this includes creation of
namespaces, RBAC rules, and importing Terraform state from a location external

If this is not specific to onboarding with existing terraform infrastructure, then we might want to note that part is optional here.

Author

@nicklathe nicklathe Jun 28, 2019

Agreed, it could say: "...creation of namespaces, RBAC rules, and optionally importing Terraform state..."

to Scipian.

# Motivation

In the motivation section here, we've covered why we're doing this, but we probably want to specifically lay out the use cases (onboarding for deployments, onboarding without existing infra, onboarding with existing infra) and enumerate the expected outcomes. It seems like the outcomes should be:

  • documented process for what a Scipian admin would need to do to onboard a new team (creating namespace and rbac roles)
  • new scipian functionality to import existing terraform infrastructure (this might even require its own separate rfc to really document what the functionality will be and how it will be supported)

Author

The steps to onboarding are laid out below, in the guide-level explanation. Should these steps be more explicit? I believe the guide-level section is the correct section to do this in.

[motivation]: #motivation

There is currently process for a new customer to be onboarded to Scipian. Having

currently no process?

Author

@nicklathe nicklathe Jun 28, 2019

Yep, typo. Should be no process.

a smooth process in place will be paramount to onboarding multiple customers
who want to use Scipian. Having a low barrier to entry will encourage
teams to use Scipian.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

Scipian builds on and uses many Kubernetes primitives. Some of these include
namespaces and RBAC rules. This allows multiple teams to use Scipian, while
keeping each team's data and interactions with Scipian separated from eachother.

Missing space in eachother.

For onboarding a new team to the Scipian platform, the following are created:

- **Namespace**. A new namespace in the Scipian cluster will be created for a
given team, under a unique name tied to a specific project or application.
- **Role**. This will define which operations are allowed in the cluster for
anyone assigned to that role. It will be tied to the created namespace, and
grant only the least privileges required to create Workspaces and Runs in
Scipian.
- **RoleBinding**. This will bind the created role to a set of users associated
with the onboarding team.
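As a rough illustration of the three resources listed above, the manifests might look like the sketch below. Everything specific in it is an assumption made for the example and is not defined by this RFC: the namespace name `team-a-project`, the Role and RoleBinding names, the user email, and in particular the API group `scipian.example.io` and the resource names `workspaces` and `runs`, which would need to match whatever Scipian's custom resources are actually called.

```yaml
# Sketch only: all names below are hypothetical placeholders.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a-project            # unique name tied to a project or application
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: scipian-user              # hypothetical Role name
  namespace: team-a-project       # Role is scoped to the team's namespace
rules:
  - apiGroups: ["scipian.example.io"]   # assumed API group for Scipian resources
    resources: ["workspaces", "runs"]   # assumed resource names
    verbs: ["get", "list", "watch", "create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: scipian-user-binding      # hypothetical RoleBinding name
  namespace: team-a-project
subjects:
  - kind: User
    name: alice@example.com       # email as known to the identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: scipian-user
  apiGroup: rbac.authorization.k8s.io
```

A Scipian admin could apply a file like this with `kubectl apply -f <file>`. If the onboarding team is also expected to create Secrets in its namespace, the Role would presumably need an additional rule granting `create` on `secrets` in the core API group.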

If a team is already using Terraform to manage their infrastructure, Scipian
supports the importing of Terraform state from a location external to Scipian.
Scipian will pull this state and push it to its own managed backend.

Below is the flow when onboarding a new customer to Scipian:

- New customer requests to be onboarded to Scipian
- Customer provides a list of users (the email address associated with the IDP
that Scipian uses) for all members who will need to use Scipian
- A Scipian admin will create the Namespace and RBAC roles for the onboarding
team
- Onboarding team members will use the Scipian Authentication CLI to set up
their kubeconfigs for interfacing with Scipian
- The onboarding team will create a Kubernetes secret in their namespace with
the AWS credentials needed for Scipian to pull Terraform state from an S3 bucket
- Scipian will pull that state into its own backend
- Customer will then be onboarded and ready to use the Scipian platform
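The credentials step in the flow above could be expressed as a Secret along the lines of the following sketch. The Secret name, the namespace, and the choice of key names are assumptions for illustration; this RFC does not specify which keys the state puller would read.

```yaml
# Sketch only: Secret name, namespace, and key names are hypothetical.
apiVersion: v1
kind: Secret
metadata:
  name: terraform-state-aws-credentials
  namespace: team-a-project
type: Opaque
stringData:
  # stringData accepts plain text; Kubernetes stores it base64-encoded.
  AWS_ACCESS_KEY_ID: "<access key id>"
  AWS_SECRET_ACCESS_KEY: "<secret access key>"
```

Credentials scoped to read-only access on the bucket holding the existing state would be in keeping with the least-privilege approach taken for the Role.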

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

- RBAC rules and the Namespace will be created manually by a Scipian admin using
YAML and `kubectl`. This could be automated later via an onboarding UI, but that
is beyond the scope of this RFC.
- The Terraform state "puller" will be a Kubernetes Job that runs a

This section is the part that I'm wondering about: should it be its own RFC?

Does this section need to include parts about how the onboarding team will need to provide their own docker container with the appropriate secret and terraform information inside in order to interact with their imported terraform?

We specified scipctl here, but will this import be limited to infrastructure created by scipctl and stack or will it be open to any terraform infrastructure? What would supporting any terraform managed infrastructure mean for mapping the current Terraform state workspace structure to the Scipian expected structure?

Author

I think that there's a line somewhere between getting Scipian ready for a new customer to use (RBAC, importing existing infra) and actually using Scipian. The latter will depend on good documentation, the former are steps we as Scipian developers/administrators need to have in place for a new customer.

I would have to check a version, but I believe we can take any version of Terraform after they introduced workspaces and runs. This should probably be documented as part of the using scipian documentation.

It should also be noted that this RFC is to support the onboarding of only a handful of internal customers we have right now. I expect more RFCs to follow as we iterate on this feature of Scipian in the future.

Author

Specifically, this first RFC and onboarding process will focus on customers already using stack, as that is the immediate need right now.

special Docker container that will pull state from an S3 bucket
and place it in the appropriate workspace in Scipian's backend. This could be
initiated by a Scipian admin or an onboarding user. The state puller will need
to map the current Terraform state workspace structure created by `scipctl` to
the new workspace structure that Scipian expects.
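To make the shape of the state puller more concrete, a Job for it might look roughly like the sketch below. The image, the environment variable names, the bucket name, and the target workspace value are all hypothetical; the sketch also assumes a Secret named `terraform-state-aws-credentials` holding the AWS credentials already exists in the team's namespace. How the container maps the `scipctl` workspace structure to Scipian's is left entirely to the image and is not shown.

```yaml
# Sketch only: image, env var names, and values are hypothetical placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: terraform-state-puller
  namespace: team-a-project
spec:
  backoffLimit: 2                 # retry a failed pull a couple of times
  template:
    spec:
      restartPolicy: Never        # Jobs require Never or OnFailure
      containers:
        - name: state-puller
          image: registry.example.com/scipian/state-puller:latest
          env:
            - name: STATE_BUCKET          # S3 bucket holding the existing state
              value: "example-terraform-state"
            - name: TARGET_WORKSPACE      # workspace in Scipian's backend
              value: "team-a-project"
          envFrom:
            - secretRef:
                name: terraform-state-aws-credentials   # AWS credentials Secret
```

Because a Job runs to completion and records success or failure, either a Scipian admin or an onboarding user could create it and inspect the outcome with `kubectl get jobs` and `kubectl logs`.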

# Drawbacks
[drawbacks]: #drawbacks

This is a necessary feature of Scipian, and no drawbacks to adding it have
been identified so far.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

The onboarding outlined in this RFC will be an MVP-level feature. Later, this

I'm wondering if we should break this down even further. Maybe the most basic MVP should just be the RBAC rules and Namespace creation. That would be enough to allow a team to onboard to Scipian and start creating brand new infrastructure.

I think importing existing infrastructure is a really useful feature, but I'm concerned the cost will also be much higher to implement that feature and it might be worth having the first part done to unblock anyone who wants to get going with new infra.

The impact of not doing the RBAC rules and Namespace would be that multiple teams could not securely onboard onto a single Scipian cluster.

The impact of not doing the terraform importing would be that a team with existing infrastructure would need to create brand new infrastructure with Scipian.

Author

I agree, I could do a better job of separating RBAC/Namespace creation and importing existing Terraform state, which is definitely an optional piece.

As for cost of implementing importing state, I'm not concerned about that. I wrote a quick proof of concept this week to test it out. Having a path for customers who don't want to tear down existing infra is an important piece of this onboarding process in my opinion.

functionality could be extended to include a self-service UI, which would
automate RBAC rules and Namespace creation, as well as the pulling of external
Terraform state, thereby removing the requirement for a Scipian admin's direct
involvement every time a new customer wishes to onboard to Scipian.

# Prior art
[prior-art]: #prior-art

Because Scipian extends Kubernetes and utilizes many of Kubernetes' primitives,

I haven't had time to really go through these two references yet, but I wanted to get them noted just to see if there's anything we should be looking into:

Author

I'll take a look at these, but Scipian is a platform that extends much beyond infrastructure management (i.e. just Terraform). I haven't dug into these resources yet, but at a high level, these seem very focused on Terraform. Our goal is to iterate towards a place where we abstract Terraform away from the customer, and eventually even stop using it altogether if it makes sense.

the approach outlined in this RFC will use typical Kubernetes cluster
administration flows to accomplish these tasks, even though it will require
manual action from a Scipian admin. Namely, a Scipian admin will create these
resources with `kubectl` and YAML to onboard a customer. Using the
primitives already built into Kubernetes allows us to stay unified with
Kubernetes, build this solution quickly, and iterate to improve this onboarding
process over time.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

The following are open questions the Scipian dev community should answer before
work commences:

- Should the Terraform state "puller", which will be a Kubernetes Job, be
initiated by the onboarding team, or by a Scipian admin?
- Somewhat tied to this process is the question of how best to distribute the
Scipian Authentication CLI to onboarding teams?
- What areas do we need to consider to align us for future iterations and
improvements to automate this process?
- Further research into Terraform will be needed, to determine the best way
to pull state and remap it into Scipian's backend in the correct workspace
structure expected by Scipian.

# Future possibilities
[future-possibilities]: #future-possibilities

As mentioned in this RFC, this is an MVP-level solution to onboarding. This
process should be automated and include a self-service UI in the future, as
the manual action required by a Scipian admin will not scale. However, this MVP
should be built so that future iteration can begin at any time, while the
proposed implementation supports current needs for the foreseeable
future.