Add the traffic flags to the apply command #1187

Closed
wants to merge 2 commits into from

Conversation

@jwilner jwilner commented Jan 8, 2021

Description

Adds the traffic flags to the apply command.

Changes

  • Adds the traffic flags to the apply command.

Reference

Fixes #1186

@google-cla google-cla bot added the cla: yes Indicates the PR's author has signed the CLA. label Jan 8, 2021
Contributor

@knative-prow-robot knative-prow-robot left a comment

@jwilner: 0 warnings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot knative-prow-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 8, 2021
@knative-prow-robot
Contributor

Hi @jwilner. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

@knative-prow-robot knative-prow-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jan 8, 2021
@navidshaikh
Collaborator

/ok-to-test

@knative-prow-robot knative-prow-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 11, 2021
@navidshaikh
Collaborator

@jwilner: The build tests are complaining about the missing auto-generated docs (for the new traffic flags). You can generate them by running ./hack/build.sh; please add those docs to the PR.

@jwilner
Author

jwilner commented Jan 11, 2021

Thanks for the tip @navidshaikh -- it should be fixed now.

Contributor

@maximilien maximilien left a comment

Thanks for the contribution. Can you please add a test for the new feature?

@rhuss
Contributor

rhuss commented Feb 2, 2021

Sorry for my late reply. I still have to wrap my head around how, and even whether, the traffic split should be included in kn service apply, as I'm not sure that we can provide the full state on the command line to regenerate the setup. For example, you need at least two revisions for a traffic split, which is not something that we can specify in a single CLI call. Remember the difference between kn service create / kn service update and kn service apply: the former are about an incremental, step-wise build-up of the state, while the latter always needs to provide the full state.

@jwilner
Author

jwilner commented Feb 2, 2021

Hi @rhuss. Thanks for responding.

It's certainly not beautiful to have to do a read and all of the parsing to construct the traffic split expression with old versions, but it's possible and useful. You can already control traffic splits with this command via the file; really, this is just providing parity via flags imo.

For context, my use case is really very standard I think.

I am trying to set up a simple CD pipeline. Like most CD pipelines, we want to track as much of our service's configuration as possible in source control. Using kn service {create,update} for this purpose is awkward. create is very limited in its scope; update fundamentally operates with deltas, so it is at odds with the notion of keeping the whole config in vcs. apply, of course, works for tracking and applying the whole state.

That said, we also want to do some very basic traffic splitting -- we want the ability to "dark deploy" a feature branch of our service, and it would be very cumbersome to have to edit and check in the file definition to achieve that. Permitting the traffic flags with the apply command gets us there.

I think the above is a very standard use case, and I think this change set is necessary to support it (which I can empirically confirm, b/c we're using my fork for CD).

I'm pasting below the wrapper I wrote because it may help provide additional context. The material part is the deploy function, but it might be helpful to understand that PLUGIN_ROUTE_TAG is conventionally set to a git-branch name.

#!/usr/bin/env bash

# A drone plugin which runs in two different logical modes:
# - `deploy` creates a new knative revision serving 100% of traffic; old revisions serving traffic are forgotten.
# - `dark deploy` deploys a new knative revision serving 0% of traffic that's exposed via knative tagged header routing;
#    it's intended for testing. Triggered by the presence of `route_tag`.
#
# See the README for a full documentation of available parameters.

main() {
  [[ -n "${PLUGIN_CREDENTIALS}" ]] || die "credentials is required"
  [[ -n "${PLUGIN_PROJECT}" ]] || die "project is required"
  [[ -n "${PLUGIN_REGION}" ]] || die "region is required"
  [[ -n "${PLUGIN_CLUSTER}" ]] || die "cluster is required"

  [[ -z "${PLUGIN_SERVICE_YAML}" ]] && export PLUGIN_SERVICE_YAML=service.yaml
  [[ -f "${PLUGIN_SERVICE_YAML}" ]] || die "service_yaml does not exist"

  local service_name
  service_name="$(yq --exit-status e .metadata.name "${PLUGIN_SERVICE_YAML}")"
  export SERVICE_NAME="${service_name}"

  if [[ -z "${PLUGIN_DOCKER_REPO}" ]]; then
    local repo
    repo="$(cut -d/ -f2 <<<"${DRONE_REPO}")"
    export PLUGIN_DOCKER_REPO="us.gcr.io/nyt-auth-dev/${repo}"
  fi

  if [[ -z "${PLUGIN_IMAGE_TAG}" ]] && [[ -f .version ]]; then
    PLUGIN_IMAGE_TAG="$(cat .version)"
    export PLUGIN_IMAGE_TAG
  fi

  log_in

  deploy
}

die() {
  echo "${@}" >&2
  exit 1
}

log_in() {
  local credentials="${PLUGIN_CREDENTIALS}"
  local project="${PLUGIN_PROJECT}"
  local region="${PLUGIN_REGION}"
  local cluster="${PLUGIN_CLUSTER}"

  local creds_file
  creds_file="$(mktemp)"
  # shellcheck disable=SC2064
  trap "rm ${creds_file}" EXIT

  echo "${credentials}" >"${creds_file}"
  gcloud auth activate-service-account --key-file="${creds_file}"

  gcloud container clusters get-credentials --region "${region}" --project "${project}" "${cluster}"
}

deploy() {
  local service_yaml="${PLUGIN_SERVICE_YAML}"
  local service_name="${SERVICE_NAME}"
  local docker_repo="${PLUGIN_DOCKER_REPO}"
  local image_tag="${PLUGIN_IMAGE_TAG}"
  local route_tag="${PLUGIN_ROUTE_TAG}" # empty if regular release

  # Traffic Settings
  #
  # Knative traffic settings look like:
  #
  # [{"tag": "foo", "revisionName": "rev2", "percent": 0}, {"revisionName": "rev1", "percent": 100}]
  #
  # The fields
  # - Tag exposes the revision at an alternate URL
  # - Revision name is the actual revision
  # - Percent is the amount of traffic it receives
  #
  # Note that a revision does not necessarily have a tag.
  #
  # We aim to always preserve "dark" revisions. If doing a non-dark release, we:
  # - Remove all revisions serving > 0% (other non-dark revisions)
  # - Add our revision at 100%
  #
  # If you wanted to add a new dark mode revision (fizzbuzz) to the service, you need to pass the whole state of the
  # world via key-value pairs on the CLI like so:
  #   --tag @latest=fizzbuzz,rev2=foo --traffic @latest=0,rev2=0,rev1=100
  #
  # If you wanted to redeploy -- overwrite -- the tag foo, you must first untag the existing revision:
  #   --untag foo
  # And then reapply it (also omitting the old rev from traffic at this point)
  #   --tag @latest=foo --traffic @latest=0,rev1=100

  # load traffic settings, defaulting to empty if service doesn't exist yet
  local traffic_settings="[]"
  {
    local svc_desc
    if svc_desc="$(kn service describe --output json "${service_name}" 2>/dev/null)"; then
      traffic_settings="$(jq '.status.traffic' <<<"${svc_desc}")"
    fi
  }

  if [[ -z "${route_tag}" ]]; then
    traffic_settings="$(
      jq '
        [{revisionName: "@latest", percent: 100}]
        + [.[] | select(.percent == 0)]
      ' <<<"${traffic_settings}"
    )"
  else
    # Does the desired tag already exist? If so, delete it:
    if jq --exit-status --arg tag "${route_tag}" 'any(.tag == $tag) ' <<<"${traffic_settings}" >/dev/null; then
      echo "Untagging old revision with tag ${route_tag}" >&2
      kn service update --untag "${route_tag}" "${service_name}"
    fi
    # Note that we filter out any overwritten tags
    traffic_settings="$(
      jq --arg tag "${route_tag}" '
        [{tag: $tag, revisionName: "@latest", percent: 0}]
        + [.[] | select(.tag != $tag)]
      ' <<<"${traffic_settings}"
    )"
  fi

  # [{"revisionName": "@latest", "percent": 100}, ...] -> "@latest=100,..."
  local traffic
  traffic="$(jq --raw-output '[.[] | "\(.revisionName)=\(.percent)"] | join(",")' <<<"${traffic_settings}")"

  if [[ "${traffic}" == "@latest=0" ]]; then
    echo "Cannot do a deploy if no service has been created yet." >&2
    exit 1
  fi

  local args=(
    kn service apply "${service_name}"
    --filename "${service_yaml}"
    --image "${docker_repo}:${image_tag}"
    --revision-name "${service_name}-${image_tag}-{{.Generation}}-{{.Random 5}}"
    --traffic "${traffic}"
  )
  if [[ -n "${route_tag}" ]]; then
    args+=(--label-revision "auth.dev.nyt.net/auto-cleanup=1")
  fi

  {
    local tags
    # [{"revisionName": "@latest", "tag": "foo"}, ...] -> "@latest=foo,..."
    tags="$(jq --raw-output '[.[] | select(.tag) | "\(.revisionName)=\(.tag)"] | join(",")' <<<"${traffic_settings}")"
    if [[ -n "${tags}" ]]; then
      args+=(--tag "${tags}")
    fi
  }

  "${args[@]}"
}

main "${@}"
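
To make the regular-release branch of deploy above more concrete, here is a minimal jq-free sketch of the same transformation, applied directly to the flag form of the traffic list. release_traffic is a hypothetical helper that is not part of the wrapper, and it assumes that dark targets are written with a literal 0 as their percentage.

```shell
#!/bin/sh
# Hypothetical helper (not part of the wrapper above): compute the --traffic
# value for a regular release from the current "revision=percent" pairs.
# Targets at 0% (the "dark" ones) are kept; every other target is replaced
# by @latest at 100%.
release_traffic() (
  # The body runs in a subshell, so the IFS and set -f changes stay local.
  IFS=','
  set -f
  result="@latest=100"
  for pair in $1; do
    percent="${pair##*=}"   # text after the last "="
    if [ "$percent" = "0" ]; then
      result="${result},${pair}"
    fi
  done
  printf '%s\n' "$result"
)

release_traffic "rev1=100,rev2=0"   # prints @latest=100,rev2=0
release_traffic ""                  # prints @latest=100
```

The second call corresponds to the case where the service does not exist yet and the current traffic list is empty.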

@rhuss
Contributor

rhuss commented Feb 9, 2021

@jwilner do you have an example of an initial Knative service YAML that already defines the traffic split at creation time (with at least one route that is not @latest)? I'm not sure how this is supposed to work without any existing revision, but I could have missed something obvious, of course.

@rhuss
Contributor

rhuss commented Feb 9, 2021

Thanks for the update. I can see your use case of setting the traffic split for an already running service. Still, this might be trickier than just adding those flags to kn service apply, especially for the case when you create the service. I have to test it, but I suspect we will get back an error from the server anyway in this case. If the error message is meaningful enough, then I'm happy to add this PR. If not, we might need to add some more checks to prevent adding a traffic split in this situation. An alternative behaviour would be to ignore a traffic specification on creation (which, by the way, cannot really be applied if you don't have an already existing revision).

I still have to play with the flow and the edge cases, but I think for a CD setup it totally makes sense to support the traffic split (which is an odd construct with respect to state anyway, so we might add a 'special' behaviour for this part).
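
One way to picture the kind of extra check on creation mentioned above is a small pre-flight guard. It is sketched here in shell for illustration rather than in kn's Go code. The function name traffic_ok_for_create and its error message are hypothetical. The sketch assumes that when the service does not exist yet, @latest is the only revision a traffic target can meaningfully reference.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if every target in a
# "revision=percent" list refers to @latest. Intended to run only when
# the service does not exist yet.
traffic_ok_for_create() (
  # Subshell body keeps the IFS and set -f changes local to the function.
  IFS=','
  set -f
  for pair in $1; do
    revision="${pair%%=*}"   # text before the first "="
    if [ "$revision" != "@latest" ]; then
      echo "cannot route traffic to '${revision}': the service does not exist yet" >&2
      return 1
    fi
  done
  return 0
)

traffic_ok_for_create "@latest=100" && echo "ok to create"
traffic_ok_for_create "@latest=50,rev1=50" || echo "rejected"
```

Whether to reject such a request, as this sketch does, or to silently ignore the traffic specification on creation is exactly the design question raised in the comment.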

@knative-prow-robot knative-prow-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 11, 2021
@google-cla

google-cla bot commented Feb 11, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added cla: no Indicates the PR's author has not signed the CLA. and removed cla: yes Indicates the PR's author has signed the CLA. labels Feb 11, 2021
@jwilner jwilner changed the base branch from release-0.19 to master February 11, 2021 23:51
@rhuss
Contributor

rhuss commented Mar 9, 2021

However, this error message looks strange:

./kn service apply random2 --image rhuss/random:1.0 --traffic "@latest=100" --tag random-0001=v4
Error: multiple traffic targets are impossible when creating a service
Run 'kn --help' for usage

There is only one target, but an additional tag option when creating the service. I would expect this command to succeed. Omitting the --tag indeed works:

./kn service apply random2 --image rhuss/random:1.0 --traffic "@latest=100"
Creating service 'random2' in namespace 'default':

  0.013s The Route is still working to reflect the latest desired specification.
  0.037s ...
  0.062s Configuration "random2" is waiting for a Revision to become ready.
  7.437s ...
  7.502s Ingress has not yet been reconciled.
  7.681s Waiting for load balancer to be ready
  7.972s Ready to serve.

Service 'random2' created to latest revision 'random2-00001' is available at URL:
http://random2-default.apps.rhuss-dev.devcluster.openshift.com
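
The expectation described above is that a single --traffic target plus a --tag should not count as "multiple traffic targets". A minimal sketch of that expectation follows: only the entries of the --traffic value are counted, and the --tag value is left out of the count. count_traffic_targets is a hypothetical name, and this illustrates the expected behaviour, not kn's actual logic.

```shell
#!/bin/sh
# Hypothetical helper: count the targets in a --traffic value such as
# "@latest=100,rev1=0". Tags are passed separately and are not counted.
count_traffic_targets() (
  # Subshell body keeps the IFS and set -f changes local to the function.
  IFS=','
  set -f
  # Split the list on commas into positional parameters and count them.
  set -- $1
  echo "$#"
)

targets="$(count_traffic_targets "@latest=100")"
if [ "$targets" -le 1 ]; then
  echo "single traffic target: acceptable when creating a service"
fi
```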

@rhuss
Contributor

rhuss commented Mar 9, 2021

/retest

@rhuss
Contributor

rhuss commented Mar 9, 2021

tl;dr - I'm in favor of integrating the PR, as I think it might be useful for exactly the use case that @jwilner mentioned above, and it can still be performed in an idempotent way.

The messaging needs to be improved, though, and the error check for create has some issues.

I'm just trying to find out why my tests sometimes hang when running apply, and will then jump on the review.

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 27, 2021
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 4, 2021
@github-actions

github-actions bot commented Aug 3, 2021

This Pull Request is stale because it has been open for 90 days with
no activity. It will automatically close after 30 more days of
inactivity. Reopen with /reopen. Mark as fresh by adding the
comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 3, 2021
@rhuss
Contributor

rhuss commented Aug 3, 2021

/remove-lifecycle stale

@knative-prow-robot knative-prow-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 3, 2021
Contributor

@maximilien maximilien left a comment

First, thanks for the contribution. Second, I'm wondering if the main logic could be simplified. And third, I did not see an integration test.
/ok-to-test

- waitDoing, waitVerb, err := examineServiceForApply(cmd, client, service.Name)
- if err != nil {
+ waitDoing, waitVerb := "Applying", "applied"
+ if isCreate, err := examineServiceForApply(cmd, client, service.Name); err != nil {

Why not simplify this if/else/if to simply:

err := examineServiceForApply(cmd, client, service.Name)
if err != nil {
  return err
}

// Do as in the else/if branch, since the isCreate condition applies to both parts of the statement.
// Just add an `if !isCreate ...` guard and return, or do what's needed in that case in a helper function.

@knative-metrics-robot

The following is the coverage report on the affected files.
Say /test pull-knative-client-go-coverage to re-run this coverage report

File                              Old Coverage  New Coverage  Delta
pkg/kn/commands/service/apply.go  84.4%         85.5%         1.0

@github-actions

github-actions bot commented Nov 9, 2021

This Pull Request is stale because it has been open for 90 days with
no activity. It will automatically close after 30 more days of
inactivity. Reopen with /reopen. Mark as fresh by adding the
comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 9, 2021
@github-actions github-actions bot closed this Dec 9, 2021
@rhuss
Contributor

rhuss commented Dec 9, 2021

/remove-lifecycle stale

@knative-prow-robot knative-prow-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 9, 2021
@rhuss rhuss reopened this Dec 9, 2021
@knative-prow-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jwilner
To complete the pull request process, please ask for approval from maximilien after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rhuss rhuss mentioned this pull request Jan 27, 2022
4 tasks
@github-actions

This Pull Request is stale because it has been open for 90 days with
no activity. It will automatically close after 30 more days of
inactivity. Reopen with /reopen. Mark as fresh by adding the
comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 10, 2022
@knative-prow-robot
Contributor

@jwilner: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                                             Commit   Details  Required  Rerun command
pull-knative-client-integration-tests                 00aef84  link     true      /test pull-knative-client-integration-tests
pull-knative-client-unit-tests                        00aef84  link     true      /test pull-knative-client-unit-tests
pull-knative-client-integration-tests-latest-release  00aef84  link     true      /test pull-knative-client-integration-tests-latest-release
pull-knative-client-build-tests                       00aef84  link     true      /test pull-knative-client-build-tests
build-tests_client_main                               00aef84  link     true      /test build-tests_client_main

@github-actions github-actions bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 6, 2022
@knative-prow-robot
Contributor

@jwilner: PR needs rebase.

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 16, 2022
@github-actions

This Pull Request is stale because it has been open for 90 days with
no activity. It will automatically close after 30 more days of
inactivity. Reopen with /reopen. Mark as fresh by adding the
comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 16, 2022
@github-actions github-actions bot closed this Aug 15, 2022
Labels
cla: yes Indicates the PR's author has signed the CLA. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Support traffic flags for kn service apply
6 participants