@malclocke (Contributor) commented Oct 21, 2025

This change adds a new command, bktec plan [...], which generates a test plan without running any tests.

The command prints select metadata about the plan to STDOUT:

$ bktec plan
{"BKTEC_PLAN_IDENTIFIER":"facecafe","BKTEC_PARALLELISM":"42"}

This output format is intended as input for buildkite-agent env set, e.g.:

$ bktec plan | buildkite-agent env set --input-format json
Added:
+ BKTEC_PARALLELISM
+ BKTEC_PLAN_IDENTIFIER


func logErrorAndExit(err error) {
	exitError := new(exec.ExitError)
	fmt.Fprintln(os.Stderr, err)
Contributor Author

This was accidentally removed during the review of #371 😅

Without this line logErrorAndExit becomes andExit!

// Structure of the JSON that is output when running `bktec plan`.
type TestPlanSummary struct {
	Identifier  string `json:"BKTEC_PLAN_IDENTIFIER"`
	Parallelism string `json:"BKTEC_PARALLELISM"`
Contributor Author (@malclocke, Oct 21, 2025)

Parallelism is strictly an int, not a string, and its value is converted from the int representation in testPlan.Parallelism on line 68.

It's represented as a string here because this struct is output to STDOUT in JSON format with the intention that it be piped into buildkite-agent env set --input-format=json -, which requires string keys and values.
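A minimal sketch of the int-to-string conversion described above, assuming the two struct shapes shown in this diff. The helper names summarize and summaryJSON are hypothetical and are not taken from the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// TestPlan mirrors the two API response fields discussed in this PR.
type TestPlan struct {
	Identifier  string `json:"identifier"`
	Parallelism int    `json:"parallelism"`
}

// TestPlanSummary is the string-only shape written to STDOUT.
type TestPlanSummary struct {
	Identifier  string `json:"BKTEC_PLAN_IDENTIFIER"`
	Parallelism string `json:"BKTEC_PARALLELISM"`
}

// summarize (hypothetical name) converts the int parallelism to a string,
// because the consumer of the JSON requires string keys and values.
func summarize(p TestPlan) TestPlanSummary {
	return TestPlanSummary{
		Identifier:  p.Identifier,
		Parallelism: strconv.Itoa(p.Parallelism),
	}
}

// summaryJSON (hypothetical name) encodes the summary as a JSON object.
func summaryJSON(p TestPlan) (string, error) {
	b, err := json.Marshal(summarize(p))
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := summaryJSON(TestPlan{Identifier: "facecafe", Parallelism: 42})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Keeping the int in TestPlan and converting only at the output boundary means the rest of the code can still treat parallelism as a number.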

Contributor

I wonder if a shorter version of this comment could be made into a code comment?

Contributor Author

Yeh, sounds good. Will update.

// createRequestParam generates the parameters needed for a test plan request.
// For runners other than "rspec", it constructs the test plan parameters with all test files.
// For the "rspec" runner, it filters the test files through the Test Engine API and splits the filtered files into examples.
func createRequestParam(ctx context.Context, cfg config.Config, files []string, client api.Client, runner TestRunner) (api.TestPlanParams, error) {
Contributor Author

This method is moved from command.Run as it's now used in both command.Run and command.Plan.

Apart from the addition of MaxParallelism it's unchanged.

}
}

func getTestFilesFromFile(path string) ([]string, error) {
Contributor Author

This method is moved unchanged from command.Run as it's now used in both command.Run and command.Plan.

@@ -0,0 +1,86 @@
package command_test
Contributor Author

As this is conceptually a "black box" test I've put it in a different package so it can only test the exported methods of command.Plan.

@malclocke marked this pull request as ready for review October 22, 2025 02:06
@malclocke requested a review from a team as a code owner October 22, 2025 02:06

source "https://rubygems.org"

gem "rspec"
Contributor Author

The new test in internal/command/run_test.go calls rspec --dry-run during the plan generation, so needs a Gemfile to be present.

// TestPlan represents the entire test plan.
type TestPlan struct {
	Identifier  string `json:"identifier"`
	Parallelism int    `json:"parallelism"`
Contributor Author

These fields are already present in the JSON response from the test plan API, but were not declared in this struct.

	Value: 0,
	Usage: "instruct the test planner to calculate optimal parallelism for the build, not to exceed the provided value. When 0 this flag is ignored and the test plan parallelism will be derived from the BUILDKITE_PARALLEL_JOB_COUNT environment variable",
	Action: func(ctx context.Context, cmd *cli.Command, v int) error {
		if v < 0 || v > 1000 {
Contributor Author

The API for max parallelism has this same constraint.
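The bounds check can be sketched as a standalone function, which is easy to unit test outside the flag's Action callback. The function name, the constant name, and the error wording below are assumptions for illustration; only the 0 to 1000 range comes from the diff.

```go
package main

import "fmt"

// maxParallelismLimit is the upper bound from the check in the diff above.
const maxParallelismLimit = 1000

// validateMaxParallelism (hypothetical name) rejects values outside 0..1000.
// A value of 0 is valid and means "flag ignored", per the flag's usage text.
func validateMaxParallelism(v int) error {
	if v < 0 || v > maxParallelismLimit {
		return fmt.Errorf("max-parallelism must be between 0 and %d, got %d", maxParallelismLimit, v)
	}
	return nil
}

func main() {
	for _, v := range []int{0, 1000, -1, 1001} {
		fmt.Printf("%d: %v\n", v, validateMaxParallelism(v))
	}
}
```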

env := env.OS{}

debug.SetDebug(env.Get("BUILDKITE_TEST_ENGINE_DEBUG_ENABLED") == "true")
debug.SetOutput(os.Stderr)
Contributor

Is this intentional? If yes, is it because we want to output the plan summary to stdout?

Contributor Author

Yes it's intentional, because the output on STDOUT for this command is intended to be parsed. If we "cross the streams" the output won't be valid JSON:

DEBUG: some message
{"BKTEC_PARALLELISM":"5","BKTEC_PLAN_IDENTIFIER":"abc123/zxy987"}

I'll probably pitch for all debug to go to STDERR at some point, although I see looking at the git history this has flip-flopped over time between STDOUT / STDERR so I imagine there is some background there.

Contributor

I can't remember why we flip-flopped between streaming the debug output to stdout and stderr, but it would be nice if it all went to the same place.

Contributor Author

Here's the origin story #85 (comment)

> I think we should codify that outputting to stdout by default instead of stderr is an explicit decision to conform to 12-factor principles.

Requiring everything to go to STDOUT will require a complete rethink of the dynamic parallelism mechanism so I'm pretty keen not to maintain that restriction.

Contributor

Thanks for digging. Outputting the log to stderr for the plan command is fine by me. We might need to redesign the whole bktec logging/debugging at some point.

var planWriter io.Writer = os.Stdout

// Structure of the JSON that is output when running `bktec plan`.
type TestPlanSummary struct {
Contributor

Is this a placeholder or the final structure we want for triggering the dynamic pipeline? We’re using BUILDKITE_TEST_ENGINE_XX for other environment variables, so I’m wondering if using BKTEC_XX would be inconsistent.

Contributor

Consistency seems desirable.

My understanding is that these are temporary env vars that exist only for the purpose of generating the dynamic pipeline steps, so perhaps we want to differentiate these from the other BUILDKITE_TEST_ENGINE_XX env vars?

Contributor Author

BUILDKITE_TEST_ENGINE_XX sounds good, I'll update the naming.

}

// By default command.Plan writes to os.Stdout but the output can be changed here.
func SetPlanWriter(writer io.Writer) {
Contributor

Any reason this has to be public? If it's just for testing, we don't need to make it public because the test lives in the same package.

Contributor Author

I'll "unexport" it.

Contributor Author

@nprizal the test actually isn't in the same package, see my comment at the top of the test file about it being a "black box" test.

Would you prefer I leave the SetPlanWriter method exported, or move the test into package plan?

Contributor

I think the question is whether we want the consumer to set/change the writer or not. If this is solely for testing purposes, I'd rather keep it private and use another way to test the output. But if we expect the consumer to set/change the writer, then making it public is the way to go.

Contributor

[nit] I think we can move this to request_param.go as we call this fn as part of param creation process.

@nprizal (Contributor) left a comment

Tested locally and it works. But I have a question around the test plan summary structure and env vars key.

The other thing I noticed is that the default error message for an invalid option value is a bit cryptic. I wonder if we can customize the error message:

Incorrect Usage: invalid value "foobar" for flag -max-parallelism: strconv.ParseInt: parsing "foobar": invalid syntax
invalid value "foobar" for flag -max-parallelism: strconv.ParseInt: parsing "foobar": invalid syntax

Contributor

Comparing this file to run.go was useful for my review.

Will we add debugging statements later?
Is it worth adding TODOs about handling server errors and connectivity issues gracefully?

Contributor Author

> Will we add debugging statements later?

@gchan I think most of the debug output is generated in the methods that are called from this method. E.g. command.Plan calls createRequestParam, and the latter has debug statements. Did you have something else in mind?

> Is it worth adding TODOs about handling server errors and connectivity issues gracefully?

The apiClient calls down to api.Client.DoWithRetry, which has some error handling. Did you have something more in mind?

// DoWithRetry sends http request with retries.
// Successful API response (status code 200) is JSON decoded and stored in the value pointed to by v.
// The request will be retried when the server returns 429 or 5xx status code, or when there is a network error.
// After reaching the retry timeout, the function will return ErrRetryTimeout.
// The request will not be retried when the server returns 4xx status code,
// and the error message will be returned as an error.
func (c *Client) DoWithRetry(ctx context.Context, reqOptions httpRequest, v interface{}) (*http.Response, error) {
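The retry rule described in that doc comment can be sketched as a small predicate on the status code. This is an illustration only: shouldRetry is a hypothetical name, and the real DoWithRetry also handles network errors and a retry timeout, which are not modeled here.

```go
package main

import (
	"fmt"
	"net/http"
)

// shouldRetry (hypothetical name) reports whether a response status is one
// the doc comment says is retried: 429 or any 5xx. Other 4xx codes are not.
func shouldRetry(statusCode int) bool {
	if statusCode == http.StatusTooManyRequests {
		return true
	}
	return statusCode >= 500 && statusCode <= 599
}

func main() {
	for _, code := range []int{200, 404, 429, 500, 503} {
		fmt.Printf("%d retry=%t\n", code, shouldRetry(code))
	}
}
```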

Contributor

Nothing extra to add regarding debugging output!

Regarding error handling, I was wondering if we have considered what this may look like should we support an offline/fallback mode and whether we should add any code comments.

@gchan (Contributor) commented Oct 22, 2025

Looks good to me (as someone new to bktec and refreshing on GoLang). Reviewing commit-by-commit was easy and helped me understand the changes better, thanks!

Extract this duplicate behaviour from command.Run and command.Plan into a separate func and file.

The `log` part of `logErrorAndExit` was accidentally removed in a recent change.

This method is used in both `command.Run` and `command.Plan`.

@malclocke force-pushed the te-4789-add-plan-command branch from 8d64b31 to 1ba62a6 October 23, 2025 02:15
@malclocke requested review from gchan and nprizal October 23, 2025 02:29
@malclocke (Contributor Author)

> Tested locally and it works. But I have a question around the test plan summary structure and env vars key.

@nprizal thanks, I've updated these if you wouldn't mind having another look.

> The other thing I noticed is that the default error message for an invalid option value is a bit cryptic. I wonder if we can customize the error message:
>
> Incorrect Usage: invalid value "foobar" for flag -max-parallelism: strconv.ParseInt: parsing "foobar": invalid syntax
> invalid value "foobar" for flag -max-parallelism: strconv.ParseInt: parsing "foobar": invalid syntax

I don't see a way to do this, I'll have more of a dig.

@nprizal (Contributor) left a comment

Looks good! I’ll leave the decision around making the SetPlanWriter public or private to you.

@malclocke (Contributor Author)

> Looks good! I’ll leave the decision around making the SetPlanWriter public or private to you.

I think I'll leave it exported, I'll probably remove the method entirely in #379

@malclocke merged commit 28fd4ed into main Oct 23, 2025
1 check passed
@malclocke deleted the te-4789-add-plan-command branch October 23, 2025 22:06