feat(spel): Moved stageByRefId to StageExpressionFunctionProvider #2721

Merged
merged 1 commit into spinnaker:master on Mar 9, 2019

Conversation

@ajordens ajordens commented Mar 9, 2019

No description provided.

@ajordens ajordens requested a review from robzienert March 9, 2019 00:07

@robzienert robzienert left a comment

I believe in you.

@ajordens ajordens merged commit 246365e into spinnaker:master Mar 9, 2019
link108 added a commit to armory-io/orca that referenced this pull request May 8, 2019
* feat(delivery): upsert and delete delivery configs through orca (spinnaker#2672)

* fix(redis): Fix intermittently failing test (spinnaker#2675)

Another test fails occasionally due to non-monotonic ULIDs over
short timescales.

* feat(cf): Move service polling into orca (spinnaker#2671)

spinnaker/spinnaker#3637

Co-Authored-By: Jason Chu <jchu@pivotal.io>
Co-Authored-By: Jammy Louie <jlouie@pivotal.io>

* chore(serviceaccounts): Do not update service account if no change in roles (spinnaker#2674)

* chore(serviceaccounts): Do not update service account if no change in roles

After the introduction of the OR mode for checking permissions, a
regular user should be able to modify a pipeline if they have any of the
roles in the pipeline (not necessarily all of them). However, the service
user is created on every save operation, which prevents users from
updating the pipeline when the OR mode is enabled.

This patch skips creating/updating the service user if the user
already exists and its roles are the same as in the pipeline
definition. The user can therefore update the pipeline only when
the roles are unchanged, avoiding privilege escalation.

* Return service account even though it's not updated

... so the save pipeline task updates the triggers accordingly.

* chore(dependencies): Autobump spinnaker-dependencies (spinnaker#2678)

* fix(clouddriver): Expose attributes of `StageDefinition` (spinnaker#2682)

* fix(MPT): Propagate notifications from pipeline config (spinnaker#2681)

One can specify notifications on the pipeline in Deck (even if the pipeline is templated).
However, those notifications are not respected because they don't end up in the
pipeline config. Copy them there so notifications work as expected.

* fix(execution): Ensure exceptions in stage planning are captured (spinnaker#2680)

Presently, any exceptions that occur during stage planning are not captured
and therefore will not show up in the execution JSON/UI for the end user to
see. This can be very confusing, as there is no explanation of why a pipeline
stage failed.

(reference SPIN-4518)

* fix(triggers): surface error if build not found (spinnaker#2683)

* fix(kayenta): fix NPE when moniker defined without a cluster name (spinnaker#2684)

* fix(orca-redis): delete stageIndex when deleting a pipeline (spinnaker#2676)

* test(core): Basic startup test (spinnaker#2685)

* test(core): Basic startup test

Test that orca starts up with a very basic config, with just the
baseUrl for each dependent service defined, and an in-memory
execution queue.

* test(core): Remove optional dependencies

Also add some mock beans and parameters to avoid creating any beans
depending on redis, and remove the redis url from the config.

* fix(core): Add omitted copyright header

* feat(kubernetes): add dynamic target selection to Patch Manifest Stage (spinnaker#2688)

* fix(execution): Honor 'if stage fails' config for synthetic stages. (spinnaker#2686)

* fix(templated-pipelines): handle a missing notifications field in the template config (spinnaker#2687)

* feat(gremlin): Adds a Gremlin stage (spinnaker#2664)

Gremlin is a fault-injection tool; this addition wraps its API to allow creating, monitoring, and halting a fault injection.

* feat(spel): add manifestLabelValue helper function (spinnaker#2691)

* fix(build): make gradle use https (spinnaker#2695)

spinnaker/spinnaker#3997

* fix(expressions): Fix ConcurrentModificationException (spinnaker#2698)

Fixes error introduced by me in spinnaker#2653

* chore(artifacts): Clean up generics in ArtifactResolver (spinnaker#2700)

* feat(artifacts): Add stage artifact resolver (spinnaker#2702)

The new method fully resolves a bound artifact on a stage. The stage can either select an expected artifact ID for an expected artifact defined in a prior stage or as a trigger constraint, or define an inline, expression-evaluable default artifact.

* refactor(core): Allow registering custom SpEL expression functions (spinnaker#2701)

Provides a strategy for extending Orca's SpEL expression language with new helper functions.
Additionally offers a more strongly typed and documented method of building these helpers, as
well as namespacing capabilities in case an organization wants to provide a canned group of
helper functions.

* feat(redblack): pin min size of source server group

* fix(redblack): unpin min size of source server group when deploy fails

* fix(artifacts): handle bound artifact account missing (spinnaker#2705)

the bound artifact should use the matched artifact account if it doesn’t have one

* feat(upsertScalingPolicyTask): make upsertScalingPolicyTask retryable (spinnaker#2703)

* feat(upsertScalingPolicyTask): make upsertScalingPolicyTask retryable

* feat(upsertScalingPolicyTask): make upsertScalingPolicyTask retryable

* feat(upsertScalingPolicyTask): make upsertScalingPolicyTask retryable

* feat(MPTv2): Support artifacts when executing v2 templated pipelines. (spinnaker#2710)

* fix(imagetagging): retry for missing namedImages (spinnaker#2713)

* refactor(cf): Adopt artifacts model for CF deployments (spinnaker#2714)

* chore(jinja): Upgrade jinjava (spinnaker#2717)

The ModuleTag class breaks with jinjava >= 2.2.8 because of a fix in
failsOnUnknownTokens. We need to catch this error in cases where we
expect that a token may be unknown. The test also uses True as a literal,
which should be true (i.e., lowercase); this worked before but breaks with
the upgrade.

* chore(dependencies): Autobump spinnaker-dependencies (spinnaker#2716)

* fix(artifacts): Revert double artifact resolution (spinnaker#2719)

This reverts commit 2f766c8.

* feat(spel): New `currentStage()` function (spinnaker#2718)

This fn provides a direct reference to the currently executing `Stage`.

* feat(core): add #stageByRefId SpEL helper function (spinnaker#2699)

* feat(spel): Moved `stageByRefId` to `StageExpressionFunctionProvider` (spinnaker#2721)
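
As an aside: once these helpers are registered, a pipeline expression can reference the running stage or a sibling stage directly, e.g. along the lines of `${#currentStage().name}` or `${#stageByRefId("2")}` (illustrative usages; see spinnaker#2718 and spinnaker#2699 for the actual signatures).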

* fix(MPTv2): Supports artifact resolution for v2 MPTs. (spinnaker#2725)

* fix(artifacts): Make artifact resolution idempotent (spinnaker#2731)

Artifact resolution is not currently idempotent; this causes
occasional bugs where trying to resolve again causes errors due
to duplicate artifacts. To make it truly idempotent, this change
also switches the Set instances used during resolution to
LinkedHashSet so that the order of operations (and the resulting
artifact lists in the pipeline) is stable and deterministic.
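
A small, generic illustration (not Orca's resolver code) of why LinkedHashSet helps: it de-duplicates while preserving insertion order, so resolving a second time yields the same artifact list.

```
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class LinkedHashSetDemo {
  public static void main(String[] args) {
    Set<String> resolved = new LinkedHashSet<>();
    // First resolution pass.
    resolved.addAll(Arrays.asList("gcr.io/app:1.0", "s3://bucket/manifest.yml"));
    // A repeated pass adds the same artifacts again; duplicates are dropped
    // and the original, deterministic order is kept.
    resolved.addAll(Arrays.asList("gcr.io/app:1.0", "s3://bucket/manifest.yml"));
    System.out.println(resolved); // [gcr.io/app:1.0, s3://bucket/manifest.yml]
  }
}
```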

* feat(rrb): add support for preconditions check as part of deploy stage

Surprising things can happen when we start a deployment with a starting
cluster configuration where there are multiple active server groups.

With this change, we make an attempt to fail the deployment earlier, in
particular before we make potentially dangerous changes to the
infrastructure.

* feat(core): Add support for an Artifactory Trigger (spinnaker#2728)

Co-Authored-By: Jammy Louie <jlouie@pivotal.io>

* debug(clouddriver): Log when initial target capacity cannot be found (spinnaker#2734)

* debug(clouddriver): Log when initial target capacity cannot be found (spinnaker#2736)

* fix(mpt): temporarily pin back jinjava (spinnaker#2741)

jinjava >= 2.2.9 breaks some v1 pipeline evaluation due to a change in
unknown token behavior that needs to be handled.

* debug(clouddriver): Include cloudprovider in server group capacity logs (spinnaker#2742)

* debug(clouddriver): Include cloudprovider in server group capacity logs (spinnaker#2744)

* feat(cloudformation): support YAML templates (spinnaker#2737)

Adds support for YAML templates by deserializing with SnakeYAML
instead of the object mapper. Since YAML is a superset of JSON,
SnakeYAML can process either format properly.
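
A minimal sketch of that approach (assuming SnakeYAML on the classpath; template contents are illustrative): the same `load` call handles YAML and JSON templates alike.

```
import org.yaml.snakeyaml.Yaml;

public class TemplateParsingDemo {
  public static void main(String[] args) {
    Yaml yaml = new Yaml();

    // YAML-formatted CloudFormation fragment...
    Object fromYaml = yaml.load("Resources:\n  MyBucket:\n    Type: AWS::S3::Bucket\n");

    // ...and the equivalent JSON, parsed by the very same call.
    Object fromJson = yaml.load("{\"Resources\": {\"MyBucket\": {\"Type\": \"AWS::S3::Bucket\"}}}");

    System.out.println(fromYaml.equals(fromJson)); // true
  }
}
```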

* fix(provider/azure): Failed to delete firewall (spinnaker#2747)

Background:
Deleting a firewall currently fails: a timeout exception is thrown in the force cache refresh stage while the firewall is being deleted. After investigation, the delete firewall task and the force cache refresh task were being processed concurrently; if the delete firewall task doesn't get a response from Azure within 20 seconds, the force cache refresh task throws a timeout exception.
Fix:
Update the execution order for deleting a firewall so that the force cache refresh task only runs once the monitor delete task has completed. After testing, the firewall is now deleted successfully.

* feat(core): add save pipelines stage (spinnaker#2715)

* feat(core): add save pipelines stage

This stage will be used to extract pipelines from an artifact and save them. Spinnaker is a tool for deploying code, so when we treat pipelines as code it makes sense to use Spinnaker to deploy them.
Imagine you have your pipelines in a GitHub repo, and on each build you create an artifact that describes your pipelines. This CircleCI build is an example:
https://circleci.com/gh/claymccoy/canal_example/14#artifacts/containers/0

You can now create a pipeline that is triggered by that build, grab the artifact produced, and (with this new stage) extract the pipelines and save them.
It performs an upsert: it looks for existing pipelines by app and name and, when a match is found, uses its id to update it.

The format of the artifact is a JSON object where the top level keys are application names with values that are a list of pipelines for the app. The nested pipeline JSON is standard pipeline JSON. Here is an example:
https://14-171799544-gh.circle-artifacts.com/0/pipelines/pipelines.json

For now this simply upserts every pipeline in the artifact, but in the future it could allow you to specify a subset of apps and pipelines and effectively test and increase the scope of pipeline roll out. It (or a similar stage) could also save pipeline templates in the future as well.

* Use constructors and private fields rather than auto wired

* summarize results of pipeline saves

* Summarize save pipelines results as created, updated, or failed

* fix(evaluateVariables): enable EvaluateVariables stage to run in dryRun (spinnaker#2745)

Variables are hard to get right, hence the EvaluateVariables stage, but
if it doesn't run in `dryRun` it makes quickly iterating on it harder.

* feat(clouddriver): Favor target capacity when current desired == 0 (spinnaker#2746)

This handles a situation where we appear to be getting an occasional
stale min/max/desired when an asg id is re-used across deployments.

ie. app-v000 was deleted and re-created

Also reaching out to AWS for clarification as to whether this is
expected or not.

* feat(buildservices): Permission support for build services (CI's) (spinnaker#2673)

* refactor(triggers): Gate calls to igor from orca (spinnaker#2748)

Now that echo handles augmenting triggers with build info, and that
manual triggers default to go through echo, all triggers should be
arriving in Orca with their build information already populated. We
should gate the logic in Orca to only run if it's not there.

We can't completely remove this logic yet because while manual
triggering via echo defaults to enabled, there's still a flag to
turn it off. Once that flag is deprecated, and we're confident that
all manual triggers (including any via the API) go through echo,
we can completely remove this block of code.

This commit also completely removes the populating of taggedImages,
which has not been used since spinnaker#837.

* feat(metrics): convert stage.invocations.duration to PercentileTimer (spinnaker#2743)

* feature(ci): Fetch artifacts from CI builds (spinnaker#2723)

* refactor(ci): Convert BuildService and IgorService to Java

Both of these files are trivial to convert to Java. Also, in
IgorService, replace the deprecated @EncodedPath with the equivalent
@Path(encode = false).

* refactor(ci): Clean up inheritance of CI stages

It's confusing that Travis and Wercker stages extend Jenkins stages;
make an abstract CIStage that these all extend. Also convert these
stages to Java, and clean up some of the string interpolation that
was hard to read.

* feature(ci): Fetch artifacts from CI builds

Jenkins triggers support inflating artifacts from the build results
based on a Jinja template specified in the build's properties. Add
the same functionality to Jenkins stages.

Use the property file defined in the CI stage to extract artifacts
from the CI stage build.

* refactor(ci): Pull getProperties into its own task

The prior commit added general support for retrying tasks that
communicate with Igor; update getProperties to be its own task
and have it use that new support.

* feat(cf): Add Sharing / Unsharing of services (spinnaker#2750)

- Also removed Autowired fields from com.netflix.spinnaker.orca.clouddriver.pipeline.servicebroker.*ServiceStage
- Also converted all service-related tests from Spock to JUnit

spinnaker/spinnaker#4065

Co-Authored-By: Jason Chu <jchu@pivotal.io>

* fix(core): Fix startup (spinnaker#2753)

Orca no longer starts up without front50 because a dependency on
front50 was added to a task bean. Halyard needs an orca without
front50 to bootstrap deploy to kubernetes.

* test(core): Remove front50 bean from startup test (spinnaker#2752)

When deploying to Kubernetes, halyard uses a bootstrap orca that
doesn't have front50 enabled. This means that we need to keep orca
starting without front50 around; set it to false in the test and
remove the mock bean.

* fix(clouddriver): Revert change to pin source capacity for redblack deploys (spinnaker#2756)

* fix(authz): Fix copying pipelines with managed service accounts (spinnaker#2754)

* fix(authz): Fix copying pipelines with managed service accounts

Copying pipelines currently fails when managed service accounts are enabled. This commit fixes that by generating a pipeline id (UUID) in Orca before generating the managed service account.

* Set flag for cron trigger update, overwrite old managed service accounts

* fix(pipelines): Remove gating of SavePipelineTask (spinnaker#2759)

* fix(ci): Fix cast error in CIStage (spinnaker#2760)

* fix(ci): Fix cast error in CIStage

waitForCompletion is a Boolean, but given that the prior Groovy
code accepted either a String or a Boolean, be defensive and
handle both.

* fix(ci): Fix default value

* feat(pipelines): if saving multiple pipelines, continue on failure (spinnaker#2755)

This is a minor change to SavePipelineTask. By default it will still fail with a TERMINAL status exactly as before. I’ve added a unit test for this original behavior. But if it detects that multiple pipelines are being saved, then the failure status will be FAILED_CONTINUE instead. This allows the SavePipelinesFromArtifactStage to attempt to save all the pipelines from the artifact and then give a summary of the results.

* fix(scriptStage): add get properties task (spinnaker#2762)

* chore(dependencies): Autobump spinnaker-dependencies (spinnaker#2763)

* fix(executions): Break up execution lookup script. (spinnaker#2757)

The previous implementation was expensive in terms of
redis memory usage and eventually would cause slowdowns
of redis command processing.

* feat(expressions): adding #pipeline function (spinnaker#2738)

Added #pipeline function for usage in SpEL expressions.
It returns the ID of the pipeline given the name of the pipeline
(within the same app only)
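
For example, an expression along the lines of `${#pipeline("Deploy to prod")}` would evaluate to the ID of the pipeline named "Deploy to prod" within the same application (the name here is purely illustrative).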

* fix(aws): fix NPE when image name is not yet available (spinnaker#2764)

* feat(kubernetes): add expression evaluation options to bake and deploy manifest stages (spinnaker#2761)

* fix(docker): fix gradle build step (spinnaker#2765)

* feat(core): Add support for Concourse triggers (spinnaker#2770)

* fix(tests): Use JUnit vintage engine so Spock tests still run (spinnaker#2766)

*  test(ci): Add tests to JenkinsStage and ScriptStage (spinnaker#2768)

* test(ci): Add tests to JenkinsStage and ScriptStage

There are no tests that waitForCompletion is properly read when
starting a Jenkins stage; this functionality recently broke, so
add some tests to it.  Also add some tests to verify that binding
of artifacts works correctly.

The ScriptStage has no tests at all; add a simple test that it
fetches the properties after the script finishes.

* refactor(ci): Move CIJobRequest and rename it CIStageDefinition

This definition was originally used by only one specific task, but
it really represents the full stage definition of a CIStage, so
rename it and move it to a more appropriate location.

* refactor(ci): Remove explicit casting from CIStage

In order to quickly fix a bug, I explicitly added some code to
coerce waitForCompletion to a boolean. Let Jackson handle this by
adding it to the CIStageDefinition model, and also do the same with
the pre-existing expectedArtifacts field.

* test(pipelinetemplates): Add tests to ModuleTag (spinnaker#2767)

I was trying to reproduce the particular case that broke on the
Jinjava upgrade.  So far I've been unsuccessful, as these tests
all pass both before and after the upgrade. But it's probably
worth committing the tests anyway, just to prevent other issues
in the future.

* fix(jenkins): Fix Jenkins trigger serialization (spinnaker#2774)

* chore(core): Remove noisy spel function registration log message (spinnaker#2776)

* fix(spel): Optionally auto-inject execution for spel helpers (spinnaker#2772)

* fix(MPTv2): Avoid resolving artifacts during v2 MPT plan. (spinnaker#2777)

* fix(unpin): bug caused us to not be able to unpin to 0

0 was being used in the Elvis operator but it is not Groovy-truthy.
Fix that to an actual null check.

* fix(unpin): touch up ResizeStrategySupportSpec

Delete unused methods and add a few comments

* chore(logging): add a log wrapper in chatty WaitForUpInstancesTask

This allows us to consolidate the multiple log messages into a single
one, and silences calls that don't pass a splainer object to
calculateTargetDesiredSize.

* chore(logging): add complete explanations

The goal is to provide a single consolidated log in
WaitForCapacityMatchTask and WaitForUpInstancesTask that explains every
step of the decision. No more guessing!

* fix(concourse): Fix concourse build info type (spinnaker#2779)

* feat(ecs): Grab container image from trigger or context (spinnaker#2751)

* fix(artifacts): Fix successful filter for find artifacts (spinnaker#2780)

When filtering to only include successful executions, the
front-end sets the flag 'successful' in the stage config.
Orca is incorrectly looking for a value 'succeeded', and
thus ignores the value that was set on the front-end.

While 'succeeded' is slightly better named as it matches the
execution status, changing this on the backend means that
existing pipelines will work automatically.  (Whereas changing
it on the front-end would require either a pipeline migrator
or for anyone affected to re-configure the stage.)

To avoid breaking anyone who manually edited their stage config to
set 'succeeded' because that's what the back-end was looking for,
continue to accept that in the stage config.

* chore(dependencies): Autobump spinnaker-dependencies (spinnaker#2784)

* feat(core): Add support for a condition aware deploy preprocessor (spinnaker#2749)

- adds ability to add a synthetic wait before a deploy stage
- pauses a deployment if certain conditions are not met
- provides visibility into currently unmet conditions

* fix(triggers): Add Jenkins and Concourse build info types to allowed deserialization types (spinnaker#2785)

* fix(clouddriver): Reduce jardiff connect/read timeouts (spinnaker#2786)

This handles situations where a security group may prevent ingress
and otherwise time out after the default (10s).

We automatically retry 5 times so our deployments are taking _at least_
50s longer than they need to.

This isn't perfect but that 50s will be lowered to 10s.

* fix(sql): Fix intermittently failing tests (spinnaker#2788)

Some of the execution repository tests assert the ordering of
executions retrieved from the database, but ULIDs are only
monotonic with millisecond resolution. The fix has been to sleep
for 1ms between creating executions, which mostly fixed the tests,
but there are still occasional failures where two executions have
the same timestamp bits in their ULID.

To reduce these failures, just wait 5ms between adding executions to
the database.

* fix(conditions): Make task conditional per config (spinnaker#2790)

- s/@ConditionalOnBean/@ConditionalOnExpression
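
A hedged sketch of the difference (property and class names below are illustrative, not the actual Orca ones): `@ConditionalOnExpression` lets a configuration property drive whether the bean is created, instead of requiring another bean to be present.

```
import org.springframework.boot.autoconfigure.condition.ConditionalOnExpression;
import org.springframework.stereotype.Component;

// Registered only when the (illustrative) property evaluates to true.
@Component
@ConditionalOnExpression("${tasks.evaluate-condition.enabled:false}")
public class EvaluateConditionTask {
  // Task body omitted; the point is the conditional registration above.
}
```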

* chore(dependencies): Autobump spinnaker-dependencies (spinnaker#2789)

* fix(queue): Ensure that after stages run with an authenticated context (spinnaker#2791)

This addresses a situation where the `DeployCanaryStage` failed to
plan after stages that involved a lookup against a restricted account.

* feat(gcb): Add Google Cloud Build stage (spinnaker#2787)

Add a stage that triggers a build using Google Cloud Build. At
this point, the stage only accepts a build definition directly
in the stage, and does not wait for the build to complete.

* feat(clouddriver): Remove ec2-classic migration code (spinnaker#2794)

* fix(loadbalancer): wait for onDemand cache processing when supported (spinnaker#2795)

* chore(clouddriver): rest helper for deploying clouddriver-sql with an empty cache (spinnaker#2690)

* chore(logging): remove high volume logs used to debug old issue

* fix(imagetagging): asset foundImage count >= upstreamImageIds count (spinnaker#2799)

* feat(runJob): support kubernetes jobs (spinnaker#2793)

adds support for preconfigured run job stages for kubernetes v2

* fix(preconfiguredJob): add tests preconfig job (spinnaker#2802)

adds tests for the preconfigured job stage for upcoming groovy -> java
refactor.

* Revert "fix(imagetagging): asset foundImage count >= upstreamImageIds count (spinnaker#2799)" (spinnaker#2807)

This reverts commit 0d25d50.

* fix(MPTv2): Fix pipeline triggers for v2 templated pipelines. (spinnaker#2803)

* chore(dependencies): Autobump spinnaker-dependencies (spinnaker#2800)

* feat(gremlin): Add Halyard config for Gremlin (spinnaker#2806)

* fix(clouddriver): Enable the parameter of allowDeleteActive (spinnaker#2801)

* refactor(MPTv2): Change nomenclature from version to tag. (spinnaker#2814)

* fix(MPTv2): Allow unresolved SpEL in v2 MPT plan. (spinnaker#2816)

* fix(provider/cf): Bind clone manifest artifacts (spinnaker#2815)

* fix(expressions): make sure trigger is a map (spinnaker#2817)

Some places use `ContextParameterProcessor.buildExecutionContext`, which (correctly) converts the
`trigger` in the pipeline to an object (instead of keeping it as the `Trigger` class).
However, when tasks run, they use `.withMergedContext`, which opts to build its own SpEL
executionContext (in `.augmentContext`). This is needed so that fancy lookups in ancestor stages work.
Fix `.augmentContext` to convert the trigger to an object/map.

This addresses issues evaluating expressions like
`${trigger.buildNumber ?: #stage('s1').context.buildNumber}` when no `buildNumber` is present on the `trigger`.

* feat(cf): Create Service Key Stage (spinnaker#2819)

spinnaker/spinnaker#4242

Co-Authored-By:  Ria Stein <eleftheria.kousathana@gmail.com>

* fix(MPTv2): Fix for spinnaker#2803 (spinnaker#2823)

Many templated pipelines don't have a `.schema` set and the code treats them as `v2`, but they should
stay `v1` and not be processed.
We saw a bunch of pipelines going through `V2Util.planPipeline`, increasing traffic to `front50`
about 200x, and many calls would fail (presumably due to `v1` being treated as `v2`?)

* feat(deleteSnapshot): Adding deleteSnapshot stage and deleteSnapshot … (spinnaker#2769)

* feat(deleteSnapshot): Adding deleteSnapshot stage and deleteSnapshot task

* feat(deleteSnapshot): Adding deleteSnapshot stage and deleteSnapshot task

* feat(deleteSnapshot): Adding deleteSnapshot stage and deleteSnapshot task

* fix(cloneservergrouptask): undo the move of CloneServerGroupTask (spinnaker#2826)

This change undoes the move of CloneServerGroupTask to a different package
(this was introduced in spinnaker#2815).

Today, tasks can not be moved once they have been created because Orca will
be unable to deserialize the tasks already in the queue created under a
different package.

* feat(conditions): Adding support for config based conditions (spinnaker#2822)

- support for config based conditions
- support for config based whitelisted clusters

* feat(exp): deployedServerGroups now grabs deployments from deploy result (spinnaker#2812)

* feat(exp): deployedServerGroups now grabs deployments from deploy results

The deployment result is the place that cloud providers can put info about their deployments, but it is deeply nested on the context and hard to find. That can be eased by appending it to the info returned from the existing deployedServerGroups property/function.

* Changed generics into nested classes

I considered making this model generally useful, but after casting the context just to get a very specific path it doesn’t feel like it belongs with the more general purpose stuff under model. It actually seems to conflict with the existing stage/task/context model and would cause confusion.

*  feat(provider/kubernetes): Add traffic options to deploy manifest (spinnaker#2829)

* refactor(provider/kubernetes): Use constructor injection

Refactor DeployManifestTask to use constructor injection, and
change implementation-specific fields to be private and final.

* feat(provider/kubernetes): Add traffic options to deploy manifest

Add functionality to the deploy manifest task to handle the fields
in the new trafficManagement section. In particular, pass along
any specified service as well as whether to send traffic to new
workloads.

* chore(expressions): Don't create new objectmapper all the time (spinnaker#2831)

This is from additional PR comments on spinnaker#2817

* chore(build): Upgrade to Gradle 5.0

* feat(webhook): retry on name resolution failures

This is adding a new narrow use case where we retry, instead of retrying
on broader exceptions. CreateWebhookTask was already quite specific about
the failure modes that are retryable, so this change is consistent with
that pattern. The reason for being conservative is that we don't want to
potentially cause side effects where we hit the same webhook multiple
times in a row.

I chose to rely on orca's built in scheduling mechanism (instead of
internal retries within the task) as it seems to be properly configured
already in terms of backoff and timeout.

* fix(webhook): catch URL validation failures in MonitorWebhookTask

Also actually fail on regular IllegalArgumentExceptions that are not
caused by UnknownHostException

* fix(k8s): Deploy manifest now accepts stage-inlined artifacts (spinnaker#2830)

* fix(travis): Support timestamp in JenkinsBuildInfo (provided by Travis) (spinnaker#2813)

(Travis is piggybacking on the Jenkins trigger implementation in Orca)

* feat(preconfiguredJobs): produces artifacts (spinnaker#2835)

Allow preconfigured jobs to produce artifacts. Kubernetes jobs default to
true; this can be overridden.

* fix(repository): don't generate invalid grammar (spinnaker#2833)

When performing a search with a pipeline name that doesn't exist,
the SQL repository will generate a query with an empty IN clause:
```
SELECT * FROM ... WHERE IN ()
```

This change short-circuits that evaluation.

Additionally, set H2 to `MODE=MYSQL` so that we can actually catch
this invalid SQL in unit tests.
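
A generic sketch of the short-circuit (not the actual sqlrepository code): when the name lookup yields no IDs, return early instead of emitting a `WHERE ... IN ()` clause.

```
import java.util.Collections;
import java.util.List;

public class ExecutionLookup {
  // Illustrative helper: avoid building a query with an empty IN (...) clause.
  static List<String> findExecutionsForPipelines(List<String> pipelineIds) {
    if (pipelineIds == null || pipelineIds.isEmpty()) {
      return Collections.emptyList(); // nothing to look up; invalid SQL avoided
    }
    // Otherwise build and run: SELECT ... WHERE pipeline_id IN (?, ?, ...)
    return runQuery(pipelineIds);
  }

  private static List<String> runQuery(List<String> ids) {
    return ids; // stand-in for the real database call
  }
}
```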

* fix(aws/imagetag): ensure matched images by name includes all upstream ids (spinnaker#2839)

* chore(*):  Seeding initial OWNERS file (spinnaker#2840)

* fix(expressions): populate context for evaluateExpression endpoint (spinnaker#2841)

When calling `pipelines/{id}/evaluateExpression`
we don't populate the eval context in the same way as we do for regular pipeline execution.
This means expressions that work during regular execution don't work when using this endpoint.
Most notably, nothing in `triggers` (or, more importantly, `trigger.parameters`) is accessible via
`${parameters["myParam"]}`; instead one must modify the expression to be
`${execution.trigger.parameters["myParam"]}`.

* feat(MPTv2): Inverts inherit -> exclude in template inheritance. (spinnaker#2842)

Previously, by default, template consumers had to specifically
opt in to including triggers, parameters, and notifications from
the parent template. This is an inversion of what template
consumers expect and has resulted in a bunch of confusion and
misuse. We've changed the behavior to inherit by default and have
users manually opt in to exclude the fields previously targeted by
"inherit".

* fix(logging): Correctly propagate stage IDs for logging (spinnaker#2847)

Currently, the stage ID isn't sent over when we make e.g. clouddriver calls.
This makes tracing the operation of a given stage VERY difficult.
Add stage IDs to the MDC context for propagation via HTTP headers.

* chore(migrations): ignore empty migrations (spinnaker#2846)

Currently, any non-standard migrations are defined in orca.yml as a list.
Sometimes, it's nice to be able to run without those migrations
(e.g. when developing both in OSS and private land). However, due to an issue in
Spring (which, I believe, should be fixed in Boot 2.0.3) it's impossible to
override a list with an empty list, but you can override a list with a new list.
Hence this change to skip empty entries.

* fix(MPTv2): Restricts config var scope based on declared template vars. (spinnaker#2849)

* feat(kubernetes): support redblack and highlander strategies (spinnaker#2844)

* fix(provider/azure): Failed to disable azure server group when rollback (spinnaker#2848)

* Revert "fix(provider/azure): Failed to disable azure server group when rollback (spinnaker#2848)" (spinnaker#2850)

This reverts commit 4c1a896.

* feat(cf): Fetch created service key via SpEL (spinnaker#2827)

spinnaker/spinnaker#4260

Co-Authored-By: Ria Stein <eleftheria.kousathana@gmail.com>
Co-Authored-By: Stu Pollock <spollock@pivotal.io>

* refactor(gcb): Use generic maps for GCB objects (spinnaker#2853)

Orca currently deserializes the build configuration in the stage to
a Build object only to send it immediately over to igor (which
re-serializes it). As orca doesn't actually need to know any of the
details about the Build object it's sending over, just use a generic
Map.

This addresses an issue where some fields are not correctly
deserializing using the default objectMapper; I'll need to
work around this in igor, but this reduces the scope of where
that workaround needs to live.  (This also allows us to
better encapsulate which microservices actually need to know
about GCB objects.)

* fix(kubernetes): remove unused imports from DeployManifestStage

* feat(kubernetes): pass DeployManifestTask strategy to Clouddriver to enable downstream validation

* fix(orca-core): Add CANCELED to list of COMPLETED statuses (spinnaker#2845)

This will enable cleanup of old CANCELED tasks

* fix(FindImageFromCluster): only infer regions from deploy for aws (spinnaker#2851)

* fix(orca): if build stage fails and prop file exists, try and fetch (spinnaker#2855)

* chore(dependencies): Autobump spinnaker-dependencies (spinnaker#2838)

* chore(dependencies): Autobump spinnaker-dependencies (spinnaker#2856)

* fix(webhooks): Avoid a nasty HTTP 500 fetching preconfigured webhooks (spinnaker#2858)

This PR offers protection if `fiat` were unavailable when attempting
to fetch preconfigured webhooks.

The current behavior results in an HTTP 500 even if there were no
restricted preconfigured webhooks.

The proposed behavior would result in the unrestricted subset being
returned with an error logged.

* refactor(conditions): Do not inject waitForCondition on no conditions (spinnaker#2859)

- Updated interface to specify cluster, region and account
- Make injecting the stage optional based on whether there are active conditions

- Move to orca-clouddriver as this is deploy-centric

* chore(gradle): Convert orca to use kork-bom (spinnaker#2860)

* chore(BOM): Make orca use kork-bom

* Use `kork-bom`
* Remove dependence on `netflix.servo`
* Fixed a bunch of `TODOs` in gradle files along the way

* chore(cf): Move cfServiceKey from orca-core to orca-integrations-cloudfoundry (spinnaker#2857)

* Revert "chore(gradle): Convert orca to use kork-bom (spinnaker#2860)" (spinnaker#2863)

This reverts commit fedf50d.

* chore(openstack): remove openstack provider (spinnaker#2865)

* chore(conditions): Adding logging (spinnaker#2866)

- added more logging around pausing deploys

*  refactor(provider/kubernetes): Add tests and simplify cache refresh (spinnaker#2869)

* test(provider/kubernetes): Add tests to ManifestForceCacheRefresh

Most of the functionality in ManifestForceCacheRefreshTask is not
tested. Add significant test coverage to this class to prepare for
a bug fix/refactor.

* refactor(provider/kubernetes): Replace nested maps with an object

The ManifestForceCacheRefreshTask currently keeps track of its manifests
as a Map<String, List<String>> which leads to a lot of complex iteration
over this structure. Create a new class ScopedManifest and flatten this
structure into a List<ScopedManifest> so that it's much easier to
follow the processing that the class is doing.

* refactor(provider/kubernetes): Add account to ScopedManifest

Rather than thread account through multiple calls in this class, add
it to the ScopedManifest class and set the account on each manifest
when we create the initial list.

This also makes the helper class Details identical to ScopedManifest,
so replace instances of Details with ScopedManifest.

Finally, replace pendingRefreshProcessed (which both returns a status
and mutates the stage context) with getRefreshStatus, which only
returns a status. Leave mutation of the context to the caller,
checkPendingRefreshes. This allows us to avoid needing to mutate
a ScopedManifest and keep the class immutable.

* refactor(provider/kubernetes): Track ScopedManifests directly

Now that we have a simple data class to represent a manifest to
refresh that implements equals and hashCode, we don't need to
manually serialize it with toManifestIdentifier. Just directly
add these manifests to the collections we're using to track
them.

* chore(dependencies): ensure direct dependency on kork-secrets-aws (spinnaker#2870)

Prereq for spinnaker/kork#273 merge

* fix(provider/kubernetes): Don't poll immediately after cache refresh (spinnaker#2871)

* test(provider/kubernetes): Re-order statements in tests

This commit has no functional effect; it just makes the
tests much easier to read in the next commit, where I change the
order between checking on pending cache refresh requests and sending
new ones.

* refactor(provider/kubernetes): Change control flow in refresh

Instead of having refreshManifests and checkPendingRefreshes
contain logic for determining if the task is done, have them
focus on mutating refreshedManifests and deployedManifests
as they take actions.

Then add allManifestsProcessed to check whether we're done, which
is just checking if all deployed manifests have been processed.
This removes the need to track the state of all manifests in the
mutating functions and allows us to consolidate the return value
of the task to one place.

* fix(provider/kubernetes): Don't poll immediately after cache refresh

We currently poll clouddriver to get the status of a cache refresh
immediately after requesting the cache refresh, and schedule a
re-refresh if we don't see the request we just made.

For users with a read-only clouddriver pointed at a replica of redis,
there will be some replication lag before the pending refresh
appears in the cache, which will cause us to keep re-scheduling the
same cache refresh.

To address this, wait one polling cycle before checking on the status of
a pending refresh. Do this by changing the order of operations so that
we first check on any pending requests from the last cycle, then
schedule any needed new requests.

* fix(clouddriver): Hoist inferredRegions var to parent scope so it is accessible to groovy code down below (spinnaker#2874)

* chore(conditions): Adding better log messages (spinnaker#2867)

- update log messages

* feat(cf): Added support for rolling red black deployments (spinnaker#2864)

Co-Authored-By: Joris Melchior <joris.melchior@gmail.com>
Co-Authored-By: Ria Stein <eleftheria.kousathana@gmail.com>

* feat(cf): Delete Service Key pipeline stage (spinnaker#2834)

- Also introduced an intermediate package called `cf`

spinnaker/spinnaker#4250

Co-Authored-By: Ria Stein <eleftheria.kousathana@gmail.com>
Co-Authored-By: Stu Pollock <spollock@pivotal.io>

*  feat(gcb): Monitor GCB build status after starting a build (spinnaker#2875)

* refactor(gcb): Use a GoogleCloudBuild type as return from igor

To make the task logic simpler, create a class GoogleCloudBuild
that has the build fields that Orca cares about and have retrofit
deserialize the result from igor. Also, add the resulting field
to the stage context so it can be used by downstream tasks.

* feat(gcb): Monitor GCB build status after starting a build

Instead of immediately returning success once the new GCB build is
accepted, wait until the build completes and set the status of the
stage based on the result of the build.

* fix(MPTv2): Fails plan on missing template variable value. (spinnaker#2876)

* feat(core): Delegate task/stage lookup to `TaskResolver` and `StageResolver` respectively (spinnaker#2868)

The `*Resolver` implementations will be able to look at both raw
implementation classes _as well as_ any specified aliases.

This supports (currently unsupported!) use cases that would be made
easier if a `Task` or `StageDefinitionBuilder` could be renamed.

* feat(core): Add support for more flexible execution preprocessors (spinnaker#2798)

* chore(conditions): Adding metrics around deploy pauses (spinnaker#2877)

- added a metric around deploy pauses

* chore(*): Bump dependencies to 1.42.0 (spinnaker#2879)

* feat(core): Allow `ExecutionPreprocessor` to discriminate on type (spinnaker#2881)

This supports existing use cases around pipeline-centric preprocessors.

* refactor(ci): Generify RetryableIgorTask (spinnaker#2880)

* refactor(ci): Generify RetryableIgorTask

I'd like to re-use the logic in RetryableIgorTask for some GCB
tasks, but it requires that the stage map to a CIStageDefinition.
Make the stage definition a parameter to the class so it can be
re-used.

At some point it might make sense to make this even more general than
just for igor tasks, but at least this makes it a bit more general
than it is now.

* fix(ci): RetryableIgorTask should also retry on network errors

We're currently immediately looking up the response status, which
will NPE (and fail the task) if the error is a network error. We
should retry network errors as these are likely to succeed on
retry.
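
A sketch of that distinction, assuming Retrofit 1.x-style errors (which Orca used at the time); the helper name is illustrative.

```
import retrofit.RetrofitError;

public class IgorRetrySupport {
  // Retry network errors (no HTTP response to inspect) and retryable HTTP statuses.
  static boolean shouldRetry(RetrofitError error) {
    if (error.getKind() == RetrofitError.Kind.NETWORK) {
      return true; // likely transient; reading the status here would NPE
    }
    if (error.getResponse() == null) {
      return false;
    }
    int status = error.getResponse().getStatus();
    return status == 429 || status >= 500;
  }
}
```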

* fix(web): s/pipeline/config (spinnaker#2883)

Doh!

* fix(web): s/pipeline/config (spinnaker#2884)

* fix(cloudformation): Scope force update by account and region (spinnaker#2843)

When force updating the CloudFormation cache on the
AmazonCloudFormationCachingAgent, it can take some time if the number of
accounts is quite big, because the on-demand update iterates over all
accounts for a given type.

This makes the force refresh task fail because of a timeout.

This patch sends scoping information to clouddriver so the caching
agents can skip the update if the force refresh does not involve
their own account and/or region, making the on-demand update more efficient.

* fix(gremlin): Remove Gremlin config template and let Halyard do it from scratch (spinnaker#2873)

* feat(gcb): Fetch artifacts produced by a GCB build (spinnaker#2882)

* feat(gcb): Fetch artifacts produced by a GCB build

* refactor(gcb): Refactor monitor task to use generic retry logic

Now that RetryableIgorTask is generic, MonitorGoogleCloudBuildTask
can extend it instead of having its own retry logic.

* fix(gcb): Add longer timeout to GCB polling stage

We're currently using the default timeout of 1 minute, which will
likely be too short.  As a starting point, use the timeout and
backoff period we use for other CI jobs, though we can change
these in the future if needed.
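
A self-contained sketch of how a polling task typically declares its cadence (the interface below stands in for Orca's retryable-task contract; names and values are illustrative):

```
import java.util.concurrent.TimeUnit;

// Stand-in for a retryable/polling task contract; times are in milliseconds.
interface PollingTask {
  long getBackoffPeriod(); // wait between polls
  long getTimeout();       // give up after this long
}

class MonitorBuildTask implements PollingTask {
  @Override
  public long getBackoffPeriod() {
    return TimeUnit.SECONDS.toMillis(15); // illustrative polling interval
  }

  @Override
  public long getTimeout() {
    return TimeUnit.HOURS.toMillis(2); // longer than the 1-minute default noted above
  }
}
```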

* refactor(TaskResult): Add a TaskResultBuilder and use it everywhere (spinnaker#2872)

There are a significant number of times when a variable named 'outputs'
is passed to the 'context' field of TaskResult instead of the 'outputs'
field. I don't know if these are bugs, but it should be less error-prone
this way.
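
A self-contained sketch of the idea (a stand-in builder, not Orca's actual TaskResult API): the builder forces the call site to say explicitly whether a value is stage-local `context` or downstream-visible `outputs`.

```
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for a TaskResult builder; real names and fields may differ.
class TaskResult {
  final String status;
  final Map<String, Object> context; // visible to later tasks in the same stage
  final Map<String, Object> outputs; // propagated to downstream stages

  private TaskResult(String status, Map<String, Object> context, Map<String, Object> outputs) {
    this.status = status;
    this.context = context;
    this.outputs = outputs;
  }

  static Builder builder(String status) {
    return new Builder(status);
  }

  static class Builder {
    private final String status;
    private final Map<String, Object> context = new HashMap<>();
    private final Map<String, Object> outputs = new HashMap<>();

    Builder(String status) { this.status = status; }

    Builder context(String key, Object value) { context.put(key, value); return this; }
    Builder outputs(String key, Object value) { outputs.put(key, value); return this; }

    TaskResult build() { return new TaskResult(status, context, outputs); }
  }
}

// Usage (illustrative):
// TaskResult.builder("SUCCEEDED").context("buildInfo", info).outputs("artifacts", artifacts).build();
```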

* feat(gce): Add SetStatefulDisk{Task,Stage} (spinnaker#2887)

* fix(gcb): Properly set buildInfo in the context (spinnaker#2890)

The refactor of TaskResult dropped setting the context in
StartGoogleCloudBuildTask; restore it. We also should update the
context each time we run MonitorGoogleCloudBuildTask so the
context has an up-to-date status; add this.

* chore(conditions): Adding a flag to skip wait (spinnaker#2885)

- added config property tasks.evaluateCondition.skipWait
- when skipWait=true, all paused deployments will proceed

* refactor(headers): Update spinnaker headers (spinnaker#2861)

Update the headers to match spinnaker/kork#270

* feat(core): ability to resolve targeted SG on different accounts and regions (spinnaker#2862)

* feat(kayenta): pass the accountId to Kayenta for deployments (spinnaker#2889)

* refactor(logging): improve logging for AbstractWaitForClusterWideClouddriverTask (spinnaker#2888)

Add specificity to the log message so you can tell which task is actually waiting for something to happen.
So.. instead of:
```
Pipeline 01D9QSYBNK7ENG0BSM7V9V07ZZ is looking for [us-east-1->serverlabmvulfson-dev-v044]
Server groups matching AbstractWaitForClusterWideClouddriverTask$DeployServerGroup(region:us-east-1, name:serverlabmvulfson-dev-v044) ...
```
we will have:
```
Pipeline 01D9R3HMZHZ82H0RNKG1D8JBRW:WaitForClusterDisableTask looking for server groups: [us-east-1->serverlabmvulfson-dev-v047] found: [[instances...
```

* fix(logging): use logger.GetFactory over @slf4j (spinnaker#2894)

This gets the proper class/logger name instead of using the parent `AbstractWaitForClusterWideClouddriverTask`.
Fixup for spinnaker#2888.

* chore(dependencies): Autobump spinnaker-dependencies (spinnaker#2891)

*  feat(gcb): Allow the build definition to come from an artifact  (spinnaker#2896)

* feat(gcb): Allow the build definition to come from an artifact

As an alternative to having the build definition inline in the stage,
allow it to come from an artifact.

* chore(gcb): nest gcb artifact properties under one key

* feat(runJob/kubernetes): extract log annotation (spinnaker#2893)

* refactor(jobRunner): refactor k8s job runner

refactor job runner from groovy to java

* feat(runJob/kubernetes): extract log url template

Pulls the annotation `jobs.spinnaker.io/logs` and injects it into the
execution context where most UI components look for this link. This will
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment