
feat(core): execution strategy #758

Merged 8 commits into develop on Mar 6, 2023

Conversation

@Malinskiy (Member) commented Mar 4, 2023

Removes strictMode in favor of executionStrategy, consisting of:

mode:

ANY_SUCCESS: the test passes if any of its executions passes. This mode works only if there is no complex sharding strategy applied.

Why: it doesn't make sense to pass on the first execution when the user explicitly asks for N runs of a test. Sharding is used to verify the probability of passing with an explicit boundary for precision.

ALL_SUCCESS: the test passes if and only if all of its executions pass. This mode works only if there are no retries, i.e. no complex flakiness strategy and no retry strategy.

Why: when adding retries via retry and flakiness strategies, users want to trade off cost for reliability, i.e. add more retries and pass if one of them passes, so retries only make sense for the ANY_SUCCESS mode. The ALL_SUCCESS mode means the user wants to verify each test with a number of tries (they are not retries per se) and pass only if all of them succeed. This is the case when fixing a flaky test or adding a new test, where we want a signal that the test is fixed/not flaky.

fast: fail-fast or success-fast depending on the value of mode. This doesn't affect the result of the run; it only saves compute time.
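For reference, a minimal sketch of both modes expressed through the ExecutionStrategyConfiguration data class added in this PR (construction is illustrative; the import location of ExecutionMode is assumed):

```kotlin
import com.malinskiy.marathon.config.strategy.ExecutionStrategyConfiguration
import com.malinskiy.marathon.config.strategy.ExecutionMode // assumed location

// Default: pass on the first successful execution; fast = true lets
// marathon cancel the remaining retries early to save compute time.
val anySuccess = ExecutionStrategyConfiguration(
    mode = ExecutionMode.ANY_SUCCESS,
    fast = true,
)

// Verification mode: every execution must pass. fast = false keeps the
// remaining tries running even after a failure, e.g. to estimate the
// probability of the test passing.
val allSuccess = ExecutionStrategyConfiguration(
    mode = ExecutionMode.ALL_SUCCESS,
    fast = false,
)
```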

codecov bot commented Mar 4, 2023

Codecov Report

Merging #758 (7df6db2) into develop (48c3c9f) will increase coverage by 0.49%.
The diff coverage is 85.53%.

❗ Current head 7df6db2 differs from the pull request's most recent head 07b25ea. Consider uploading reports for the commit 07b25ea to get more accurate results.

@@              Coverage Diff              @@
##             develop     #758      +/-   ##
=============================================
+ Coverage      59.06%   59.55%   +0.49%     
+ Complexity       832      820      -12     
=============================================
  Files            210      208       -2     
  Lines           4241     4285      +44     
  Branches         669      698      +29     
=============================================
+ Hits            2505     2552      +47     
+ Misses          1407     1406       -1     
+ Partials         329      327       -2     
Impacted Files Coverage Δ
...y/marathon/config/LogicalConfigurationValidator.kt 4.54% <0.00%> (-3.79%) ⬇️
...src/main/kotlin/com/malinskiy/marathon/Marathon.kt 64.91% <0.00%> (-0.61%) ⬇️
...iy/marathon/execution/progress/ProgressReporter.kt 0.00% <0.00%> (-78.58%) ⬇️
...m/malinskiy/marathon/execution/queue/QueueActor.kt 74.78% <57.14%> (-0.22%) ⬇️
...lin/com/malinskiy/marathon/config/Configuration.kt 50.94% <90.00%> (ø)
...thon/execution/progress/PoolProgressAccumulator.kt 92.85% <92.85%> (ø)
.../config/strategy/ExecutionStrategyConfiguration.kt 100.00% <100.00%> (ø)
...c/main/kotlin/com/malinskiy/marathon/di/Modules.kt 44.00% <100.00%> (-2.16%) ⬇️
...om/malinskiy/marathon/execution/DevicePoolActor.kt 73.91% <100.00%> (-0.38%) ⬇️
...tlin/com/malinskiy/marathon/execution/Scheduler.kt 83.60% <100.00%> (+0.84%) ⬆️
... and 7 more


private val poolId: DevicePoolId,
attachmentProviders: List<AttachmentProvider>,
private val attachmentCollector: AttachmentCollector = AttachmentCollector(attachmentProviders),
) : AccumulatingResultTestRunListener(timer), AttachmentListener by attachmentCollector {

private val logger = MarathonLogging.logger("TestRunResultsListener")
private val progressReporter = ProgressReporter(testBatch, poolId, device.toDeviceInfo())
Contributor:

Do we need this? I see that you dropped line 102, and it looks like it's unused now.

Member Author:

Great question. Currently, printing any of these doesn't mean marathon's core will actually see the events; it only sees them after the batch results are calculated. I think these would confuse users, so I'll remove them.

Member Author:

The iOS implementation should probably also not print these. If users want this level of information, it's accessible via debug mode.

}
onTransition {
if (it as? StateMachine.Transition.Valid !is StateMachine.Transition.Valid) {
logger.error { "from ${it.fromState} event ${it.event}" }
Contributor:

It would be nice to add more context here.

Member Author:

At this point such a transition is impossible: all states respond to every event.
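For context, the guard in the snippet above is a roundabout way of detecting an invalid transition. A more direct equivalent, assuming the Tinder StateMachine API that `StateMachine.Transition.Valid` points to (sketch of the handler only, not the committed code):

```kotlin
onTransition { transition ->
    // Log only transitions the state machine rejected. Since every state
    // responds to every event, this branch should be unreachable.
    if (transition is StateMachine.Transition.Invalid) {
        logger.error { "from ${transition.fromState} event ${transition.event}" }
    }
}
```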

private val poolingStrategy = configuration.poolingStrategy.toPoolingStrategy()

private val logger = MarathonLogging.logger("Scheduler")

suspend fun execute() {
suspend fun execute() : Boolean {


It's not clear what this bool means

Member Author:

Added a quickdoc for the return value.
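A hypothetical shape of that quickdoc (the actual wording lives in Scheduler.kt):

```kotlin
/**
 * Runs the scheduled tests across all device pools and awaits completion.
 *
 * @return true if the run is successful according to the configured
 *         execution strategy, false if it should be reported as failed
 */
suspend fun execute(): Boolean = TODO("actual implementation lives in Scheduler.kt")
```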

fun testEnded(device: DeviceInfo, testResult: TestResult, final: Boolean = false): TestAction? {
return when (testResult.status) {
TestStatus.FAILURE -> {
println("${toPercent(progress())} | [${poolId.name}]-[${device.serialNumber}] ${testResult.test.toTestName()} failed")


Replace with logger?

Member Author:

This is intended: it's the default way marathon prints progress when users have debug == false. We don't have an abstraction for this.

* Should always be called before testEnded, otherwise the FSM might transition into a terminal state prematurely
*/
fun retryTest(test: Test): TestAction? {
return tests[test.toTestName()]?.transition(TestEvent.AddRetry).sideffect()


Should we log a warning when the tests lookup returns null?

Member Author:

Good call, reworked
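One possible shape of the rework, building on the snippet above (the warning message is illustrative, not the exact committed code):

```kotlin
fun retryTest(test: Test): TestAction? {
    val accumulator = tests[test.toTestName()]
    if (accumulator == null) {
        // A retry was requested for a test this accumulator has never seen;
        // surface it instead of silently returning null
        logger.warn { "Adding retry for unknown test ${test.toTestName()}" }
        return null
    }
    return accumulator.transition(TestEvent.AddRetry).sideffect()
}
```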

fun addRetries(poolId: DevicePoolId, count: Int) {
execute(poolId) { it.addTestRetries(count) }
fun testFailed(test: Test) {
println("${batch.id} | [${poolId.name}]-[${device.serialNumber}] ${test.toTestName()} failed")


Logger?

Member Author:

Same as above: this is the old mechanism for showing progress on stdout. I've removed it since most of this info should come from PoolProgressAccumulator now.

sealed class TestAction {
data class SaveReport(val deviceInfo: DeviceInfo, val testResult: TestResult) : TestAction()
object Conclude : TestAction()


Maybe use a more common name for this action, like End or Complete.

Contributor:

Or Finish?

Member Author:

One of the hardest problems in CS :) Let's use 'Complete'
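Applied to the snippet above, the rename is a one-word change (sketch):

```kotlin
sealed class TestAction {
    data class SaveReport(val deviceInfo: DeviceInfo, val testResult: TestResult) : TestAction()

    // Renamed from Conclude per this thread
    object Complete : TestAction()
}
```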

val device: DeviceInfo,
val testResult: TestResult,
val count: Int
data class Failing(


Right now the difference between Failed and Failing isn't fully clear.
Maybe we could add another layer of differentiation: terminal and non-terminal states.

Member Author:

My impression is that adjectives ending in -ing are quite common for this purpose in English, e.g. https://buildkite.com/docs/pipelines/defining-steps#build-states

Comment on lines +71 to +73
executionStrategy:
mode: ANY_SUCCESS
fast: true
Contributor:

Will you be updating the documentation in this PR?

Member Author:

Yes, already updated

Comment on lines +6 to +11
* @property fast fail-fast or success-fast depending on the value of [mode]. This doesn't affect the result
* of the run, it only saves compute time
*/
data class ExecutionStrategyConfiguration(
@JsonProperty("mode") val mode: ExecutionMode = ExecutionMode.ANY_SUCCESS,
@JsonProperty("fast") val fast: Boolean = true,
Contributor:

Given it doesn't affect the result of the run, why is fast configurable by the user?

Member Author:

This is explained in the docs. The short version: in some scenarios you might want to run the remaining retries anyway, because test status is not the only output of a test; the test's probability of passing is one example.

Comment on lines +15 to +26
* @property ANY_SUCCESS test passes if any of executions is passing
* this mode works only if there is no complex sharding strategy applied
*
* Why: it doesn't make sense when user asks for N number of tests
* to run explicitly, and we pass on the first one. Sharding used to verify probability of passing with an explicit boundary for precision
* @property ALL_SUCCESS test passes if and only if all the executions are passing
* this mode works only if there are no retries, i.e. no complex flakiness strategy, no retry strategy
*
* Why: when adding retries to tests with retry+flakiness strategies users want to trade-off cost for reliability, i.e. add more retries
* and pass if one of them passes, so retries only make sense for the [ANY_SUCCESS] mode. When we use [ALL_SUCCESS] mode it means user
* wants to verify each test with a number of tries (they are not retries per se) and pass only if all of them succeed. This is the case
* when fixing a flaky test or adding a new test, and we want to have a signal that the test is fixed/not flaky.
Contributor:

Is there a file where the full configuration defined by the user is parsed and potentially conflicting configurations such as the ones described here are identified? If so, perhaps this explanation belongs there. Furthermore, would such conflicts be presented to the user at runtime, either as a warning or error?

Member Author:

The configuration consists of multiple files; there is no single place where all of it is defined. The defaults should work for most people, and if something is off, marathon produces a logical validation error at runtime. Doing this in YAML is not possible unless we add custom linting for YAML.

I've updated LogicalConfigurationValidator to fail at runtime under the edge cases outlined here.
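A sketch of the kind of runtime check this implies; the helper predicates and the exception name are illustrative stand-ins, not Marathon's actual API:

```kotlin
// Hypothetical validation rule enforcing the constraints described in this PR
fun validateExecutionStrategy(configuration: Configuration) {
    when (configuration.executionStrategy.mode) {
        // ALL_SUCCESS cannot be combined with retry/flakiness retries
        ExecutionMode.ALL_SUCCESS ->
            if (hasRetries(configuration)) {
                throw ConfigurationException(
                    "Execution mode ALL_SUCCESS requires no retry and no flakiness strategy"
                )
            }
        // ANY_SUCCESS cannot be combined with a complex sharding strategy
        ExecutionMode.ANY_SUCCESS ->
            if (hasComplexSharding(configuration)) {
                throw ConfigurationException(
                    "Execution mode ANY_SUCCESS is incompatible with count sharding"
                )
            }
    }
}
```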

Comment on lines +146 to +147
val name: String,
val outputDir: File,
Contributor:

It doesn't really matter, but should these spaces have been added?
Also, I think there should be a newline above this class, between lines 144 & 145.

Member Author:

Not everything is reformatted according to the embedded code style. It works as an incremental code style formatter: you change something in a file, and once you commit, it will have the appropriate format.

Member Author:

Added a newline

Comment on lines +379 to +380
val state = it.state
when (state) {
Contributor:

Today I learned you could do this when the IDE recommended it 🙂

Suggested change
val state = it.state
when (state) {
when (val state = it.state) {

Member Author:

You can also ignore IDE and language features for the sake of simplicity and readability ;)
I prefer to define variables separately from the branching logic.

Contributor:

Agreed.

executed and all non-started retries are removed.

#### Any success
Test passes if any of executions is passing. This mode works only if there is no complex sharding strategy applied. This is the default.
Contributor:

Suggested change
Test passes if any of executions is passing. This mode works only if there is no complex sharding strategy applied. This is the default.
Test passes if any of its executions are passing. This mode works only if there is no complex sharding strategy applied. This is the default.

Member Author:

done

:::

#### All success
Test passes if and only if all the executions are passing. This mode works only if there are no retries, i.e. no complex flakiness strategy, no retry strategy.
Contributor:

Suggested change
Test passes if and only if all the executions are passing. This mode works only if there are no retries, i.e. no complex flakiness strategy, no retry strategy.
Test passes if and only if all its executions are passing. This mode works only if there are no retries, i.e. no complex flakiness strategy, no retry strategy.

Member Author:

done


:::info

Adding retries with retry/flakiness strategies means users wants to trade-off cost for reliability, i.e. add more retries and pass if one
Contributor @VictorIreri commented Mar 5, 2023:

Suggested change
Adding retries with retry/flakiness strategies means users wants to trade-off cost for reliability, i.e. add more retries and pass if one
Adding retries with retry/flakiness strategies means users want to trade off cost for reliability, i.e. add more retries and pass if one

Also, is execution time the only "cost" being referred to here and elsewhere on this page? If so, perhaps we can be more specific.

Member Author:

Cost is how much one pays in $ for the test runs.

Member Author @Malinskiy commented Mar 6, 2023:

Changed the wording as suggested.

Adding retries with retry/flakiness strategies means users wants to trade-off cost for reliability, i.e. add more retries and pass if one
of test retries passes, so retries only make sense for any success mode.

When we use all success mode it means user wants to verify each test with a number of tries (they are not retries per se) and pass only if
Contributor:

Suggested change
When we use all success mode it means user wants to verify each test with a number of tries (they are not retries per se) and pass only if
When we use all success mode it means the user wants to verify each test with a number of tries (they are not retries per se) and pass only if

Member Author:

done

:::

#### Fast execution mode
When test reaches a state when a decision about it's state can be made marathon can remove additional in-progress retries.
Contributor @VictorIreri commented Mar 5, 2023:

Suggested change
When test reaches a state when a decision about it's state can be made marathon can remove additional in-progress retries.
When the test reaches a state where a decision about its state can be made, marathon can remove additional in-progress retries.

Member Author:

done

This decision point is different depending on the execution mode used. Let's walk through two examples.

Assume the any success strategy is used and 100 retries are scheduled for a test A via the flakiness strategy. Let's say the first 3 attempts failed and the 4th succeeded. At
this point the test should already be considered passed, since any success out of all retries leads to the result by definition of any success
Contributor @VictorIreri commented Mar 5, 2023:

Except for the first use of "any success" on line 468, is it worth changing all instances of "any success" & "all success" in this doc to ANY_SUCCESS & ALL_SUCCESS for easier reading?

Member Author:

replaced
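To make the decision point in the walkthrough above concrete, an illustrative helper (hypothetical names, not Marathon's API):

```kotlin
// When `fast` is enabled, remaining retries can be cancelled as soon as
// the mode has already decided the outcome
fun isDecided(mode: ExecutionMode, passed: Int, failed: Int): Boolean =
    when (mode) {
        ExecutionMode.ANY_SUCCESS -> passed > 0 // first success decides: passed
        ExecutionMode.ALL_SUCCESS -> failed > 0 // first failure decides: failed
    }
```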

@Malinskiy merged commit 6e87c35 into develop on Mar 6, 2023
@Malinskiy deleted the feature/execution-strategy branch on March 6, 2023 at 06:00