Te 4789 add plan command #374

malclocke · 2025-10-21T02:11:20Z

This change adds a new command, bktec plan [...], which generates a test plan without running any tests.

The command prints select metadata about the plan to STDOUT:

$ bktec plan
{"BKTEC_PLAN_IDENTIFIER":"facecafe","BKTEC_PARALLELISM":"42"}

This output format is intended to be used as input for buildkite-agent env set e.g.:

$ bktec plan | buildkite-agent env set --input-format json
Added:
+ BKTEC_PARALLELISM
+ BKTEC_PLAN_IDENTIFIER

malclocke · 2025-10-21T02:13:49Z

main.go


 func logErrorAndExit(err error) {
-	exitError := new(exec.ExitError)
+	fmt.Fprintln(os.Stderr, err)


This was accidentally removed during the review of #371 😅

Without this line logErrorAndExit becomes andExit!

malclocke · 2025-10-21T02:20:38Z

internal/command/plan.go

+// Structure of the JSON that is output when running `bktec plan`.
+type TestPlanSummary struct {
+	Identifier  string `json:"BKTEC_PLAN_IDENTIFIER"`
+	Parallelism string `json:"BKTEC_PARALLELISM"`


Parallelism is strictly an int, not a string, and its value is converted from the int representation in testPlan.Parallelism on line 68.

It's represented as a string here because this struct is output to STDOUT in JSON format with the intention that it be piped into buildkite-agent env set --input-format=json -, which requires string keys and values.

I wonder if a shorter version of this comment could be made into a code comment?

Yeh, sounds good. Will update.

malclocke · 2025-10-21T02:23:54Z

internal/command/request_param.go

+// createRequestParam generates the parameters needed for a test plan request.
+// For runners other than "rspec", it constructs the test plan parameters with all test files.
+// For the "rspec" runner, it filters the test files through the Test Engine API and splits the filtered files into examples.
+func createRequestParam(ctx context.Context, cfg config.Config, files []string, client api.Client, runner TestRunner) (api.TestPlanParams, error) {


This method is moved from command.Run as it's now used in both command.Run and command.Plan.

Apart from the addition of MaxParallelism it's unchanged.

malclocke · 2025-10-21T02:25:04Z

internal/command/files.go

+	}
+}
+
+func getTestFilesFromFile(path string) ([]string, error) {


This method is moved unchanged from command.Run as it's now used in both command.Run and command.Plan.

malclocke · 2025-10-22T01:47:39Z

internal/command/plan_test.go

@@ -0,0 +1,86 @@
+package command_test


As this is conceptually a "black box" test I've put it in a different package so it can only test the exported methods of command.Plan.

malclocke · 2025-10-22T21:46:23Z

internal/command/testdata/rspec/Gemfile

+
+source "https://rubygems.org"
+
+gem "rspec"


The new test in internal/command/run_test.go calls rspec --dry-run during the plan generation, so needs a Gemfile to be present.

malclocke · 2025-10-22T21:47:40Z

internal/plan/type.go

 // TestPlan represents the entire test plan.
 type TestPlan struct {
+	Identifier   string           `json:"identifier"`
+	Parallelism  int              `json:"parallelism"`


These fields are already present in the JSON response from the test plan API, but were not declared in this struct.

malclocke · 2025-10-22T21:48:06Z

main.go

+						Value: 0,
+						Usage: "instruct the test planner to calculate optimal parallelism for the build, not to exceed the provided value. When 0 this flag is ignored and the test plan parallelism will be derived from the BUILDKITE_PARALLEL_JOB_COUNT environment variable",
+						Action: func(ctx context.Context, cmd *cli.Command, v int) error {
+							if v < 0 || v > 1000 {


The API for max parallelism has this same constraint.
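For reference, the range check can be sketched on its own, outside the CLI flag's Action callback. The function name, constant name, and error wording here are assumptions for illustration; only the 0 to 1000 bounds come from the diff.

```go
package main

import "fmt"

// maxAllowedParallelism is the upper bound shown in the diff.
const maxAllowedParallelism = 1000

// validateMaxParallelism is a hypothetical standalone version of the check.
// Per the flag's usage text, 0 means the flag is ignored, so it is valid.
func validateMaxParallelism(v int) error {
	if v < 0 || v > maxAllowedParallelism {
		return fmt.Errorf("max-parallelism must be between 0 and %d, got %d", maxAllowedParallelism, v)
	}
	return nil
}

func main() {
	for _, v := range []int{-1, 0, 42, 1000, 1001} {
		if err := validateMaxParallelism(v); err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Println("valid:", v)
	}
}
```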

nprizal · 2025-10-22T22:24:58Z

internal/command/plan.go

+	env := env.OS{}
+
+	debug.SetDebug(env.Get("BUILDKITE_TEST_ENGINE_DEBUG_ENABLED") == "true")
+	debug.SetOutput(os.Stderr)


Is this intentional? If yes, is it because we want to output the plan summary to stdout?

Yes it's intentional, because the output on STDOUT for this command is intended to be parsed. If we "cross the streams" the output won't be valid JSON:

DEBUG: some message {"BKTEC_PARALLELISM":"5","BKTEC_PLAN_IDENTIFIER":"abc123/zxy987"}

I'll probably pitch for all debug to go to STDERR at some point, although I see looking at the git history this has flip-flopped over time between STDOUT / STDERR so I imagine there is some background there.

I can't remember why we flip-flopped between streaming the debug output to stdout and stderr, but it would be nice if it all went to the same place.

Here's the origin story #85 (comment)

I think we should codify that outputting to stdout by default instead of stderr is an explicit decision to conform to 12-factor principles.

Requiring everything to go to STDOUT will require a complete rethink of the dynamic parallelism mechanism so I'm pretty keen not to maintain that restriction.

Thanks for digging. Outputting the log to stderr for the plan command is fine by me. We might need to redesign the whole bktec logging/debugging at some point.

nprizal · 2025-10-22T22:27:11Z

internal/command/plan.go

+var planWriter io.Writer = os.Stdout
+
+// Structure of the JSON that is output when running `bktec plan`.
+type TestPlanSummary struct {


Is this a placeholder or the final structure we want for triggering the dynamic pipeline? We’re using BUILDKITE_TEST_ENGINE_XX for other environment variables, so I’m wondering if using BKTEC_XX would be inconsistent.

Consistency seems desirable.

My understanding is that these are temporary env vars that exist only for the purpose of generating the dynamic pipeline steps, so perhaps we want to differentiate these from the other BUILDKITE_TEST_ENGINE_XX env vars?

BUILDKITE_TEST_ENGINE_XX sounds good, I'll update the naming.

nprizal · 2025-10-22T22:30:12Z

internal/command/plan.go

+}
+
+// By default command.Plan writes to os.Stdout but the output can be changed here.
+func SetPlanWriter(writer io.Writer) {


Any reason this has to be public? If it's just for testing, we don't need to make it public because the test lives in the same package.

I'll "unexport" it.

@nprizal the test actually isn't in the same package, see my comment at the top of the test file about it being a "black box" test.

Would you prefer I leave the SetPlanWriter method exported, or move the test into package plan?

I think the question is whether we want the consumer to set/change the writer or not. If this is solely for testing purposes, I'd rather keep it private and use another way to test the output. But if we expect the consumer to set/change the writer, then making it public is the way to go.

nprizal · 2025-10-22T22:35:07Z

internal/command/run.go

[nit] I think we can move this to request_param.go as we call this fn as part of param creation process.

nprizal

Tested locally and it works. But I have a question around the test plan summary structure and env vars key.

The other thing I noticed is that the default error message for an invalid option value is a bit cryptic. I wonder if we can customize the error message:

Incorrect Usage: invalid value "foobar" for flag -max-parallelism: strconv.ParseInt: parsing "foobar": invalid syntax
invalid value "foobar" for flag -max-parallelism: strconv.ParseInt: parsing "foobar": invalid syntax

gchan · 2025-10-22T23:32:28Z

internal/command/plan.go

Comparing this file to run.go was useful for my review.

test-engine-client/internal/command/run.go

Line 1 in d17feb2

package command

Will we add debugging statements later?
Is it worth adding TODOs about handling server errors and connectivity issues gracefully?

Will we add debugging statements later?

@gchan I think most of the debug output is generated in the methods that are called from this method. E.g. command.Plan calls createRequestParam and the latter has debug statements. Did you have something else in mind?

Is it worth adding TODOs about handling server errors and connectivity issues gracefully?

The apiClient calls down to api.Client.DoWithRetry, which has some error handling. Did you have something more in mind?

test-engine-client/internal/api/client.go

Lines 96 to 102 in d17feb2

// DoWithRetry sends http request with retries.
// Successful API response (status code 200) is JSON decoded and stored in the value pointed to by v.
// The request will be retried when the server returns 429 or 5xx status code, or when there is a network error.
// After reaching the retry timeout, the function will return ErrRetryTimeout.
// The request will not be retried when the server returns 4xx status code,
// and the error message will be returned as an error.
func (c *Client) DoWithRetry(ctx context.Context, reqOptions httpRequest, v interface{}) (*http.Response, error) {

Nothing extra to add regarding debugging output!

Regarding error handling, I was wondering if we have considered what this may look like should we support an offline/fallback mode and whether we should add any code comments.
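As a reference point for the error handling discussion, the retry rule stated in the DoWithRetry doc comment can be sketched as a simple status classification. shouldRetry is a hypothetical name used only here; the actual implementation in client.go is not shown in this thread and also handles network errors and the retry timeout.

```go
package main

import (
	"fmt"
	"net/http"
)

// shouldRetry is a hypothetical helper mirroring the doc comment: retry on
// 429 or any 5xx status; other 4xx statuses are returned as errors.
func shouldRetry(statusCode int) bool {
	if statusCode == http.StatusTooManyRequests {
		return true
	}
	return statusCode >= 500 && statusCode <= 599
}

func main() {
	for _, code := range []int{200, 404, 429, 503} {
		fmt.Printf("%d -> retry=%t\n", code, shouldRetry(code))
	}
}
```

An offline or fallback mode would presumably hook in after retries are exhausted (ErrRetryTimeout), but that design is not part of this PR.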

gchan · 2025-10-22T23:45:43Z

Looks good to me (as someone new to bktec and refreshing on GoLang). Reviewing commit-by-commit was easy and helped me understand the changes better, thanks!

malclocke · 2025-10-23T02:30:53Z

Tested locally and it works. But I have a question around the test plan summary structure and env vars key.

@nprizal thanks, I've updated these if you wouldn't mind having another look.

The other thing I noticed is that the default error message for an invalid option value is a bit cryptic. I wonder if we can customize the error message:

Incorrect Usage: invalid value "foobar" for flag -max-parallelism: strconv.ParseInt: parsing "foobar": invalid syntax
invalid value "foobar" for flag -max-parallelism: strconv.ParseInt: parsing "foobar": invalid syntax

I don't see a way to do this, I'll have more of a dig.

nprizal

Looks good! I’ll leave the decision around making the SetPlanWriter public or private to you.

malclocke · 2025-10-23T22:05:44Z

Looks good! I’ll leave the decision around making the SetPlanWriter public or private to you.

I think I'll leave it exported, I'll probably remove the method entirely in #379

malclocke marked this pull request as ready for review October 22, 2025 02:06

malclocke requested a review from a team as a code owner October 22, 2025 02:06

nprizal reviewed Oct 22, 2025

gchan reviewed Oct 22, 2025

malclocke added 7 commits October 23, 2025 15:14

Add plan command

9bbdc2e

Extract command.getTestFiles()

ad807f9

Extract this duplicate behaviour from command.Run and command.Plan into a separate func and file.

Fix logging in logErrorAndExit

3e6b03b

The `log` part of `logErrorAndExit` was accidentally removed in a recent change.

Add --max-parallelism flag to bktec run

0709c7e

Move command.createRequestParam to separate file

e3069c1

This method is used in both `command.Run` and `command.Plan`.

Add test for command.Plan

d8bba17

Move filterAndSplitFiles to request_param.go

1ba62a6

malclocke force-pushed the te-4789-add-plan-command branch from 8d64b31 to 1ba62a6 Compare October 23, 2025 02:15

malclocke requested review from gchan and nprizal October 23, 2025 02:29

Add comment on string conversion of parallelism

5d68063

nprizal approved these changes Oct 23, 2025

gchan approved these changes Oct 23, 2025

malclocke merged commit 28fd4ed into main Oct 23, 2025
1 check passed

malclocke deleted the te-4789-add-plan-command branch October 23, 2025 22:06
