Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
71 changes: 71 additions & 0 deletions docs/adr/29084-json-serialization-for-array-typed-import-inputs.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
# ADR-29084: JSON Serialization for Array-Typed Import-Inputs in Compiled Env Vars

**Date**: 2026-04-29
**Status**: Draft
**Deciders**: pelikhan, copilot-swe-agent

---

## Part 1 — Narrative (Human-Friendly)

### Context

Agentic workflows support shared workflow imports with typed `import-schema` fields, including `array`-typed inputs. When a caller passes an array value via `with:` and the shared workflow references it with `${{ github.aw.import-inputs.X }}` inside a step or job `env:` block, the compiler must serialize that value to a string. Prior to this fix, the serialization fell through to Go's `fmt.Sprint` fallback, producing the non-standard Go slice format `[a b]` instead of valid JSON `["a","b"]`. A compounding issue is that `goccy/go-yaml` — the YAML parser used in this repo — may deserialize YAML sequences as typed Go slices (`[]string`) rather than `[]any`, bypassing the existing `case []any:` JSON serialization branch entirely.

### Decision

We will serialize all array- and map-typed import-input values as JSON when writing them into compiled `env:` blocks, using a shared `marshalEnvValue()` helper that handles `[]any`, `map[string]any`, and typed slice/map variants via reflection. This applies consistently across all three serialization sites: `MapToStep` (step-level env), `buildCustomJobs` (job-level env), and `marshalImportInputValue` / `substituteImportInputsInContent` (content substitution). The `fmt.Sprint` fallback is retained only for scalar types (int, bool, float64, etc.) that do not require structured encoding.

### Alternatives Considered

#### Alternative 1: Comma-Separated String Join for Arrays

Join array elements with a comma separator (e.g., `microsoft/apm#main,github/awesome-copilot/skills/foo`) instead of JSON encoding. This would produce a simpler string that some shell scripts could split with `IFS=,`. However, it is not valid JSON, breaks for values containing commas, and is incompatible with `jq --argjson` and other JSON-consuming tools. It also doesn't compose well with map values.

#### Alternative 2: Normalize YAML Deserialization to Always Return `[]any`

Fix the root cause at the YAML parsing layer by wrapping `goccy/go-yaml` to always convert typed slices to `[]interface{}` immediately after deserialization. This would eliminate the need for reflection at the serialization sites. It was not chosen because it would require modifying shared parsing infrastructure used across many code paths, increasing the blast radius of the change. The reflection fallback is more localized and can be removed later if the YAML layer is standardized.

### Consequences

#### Positive
- Shell consumers that use `jq --argjson $VAR` now receive valid JSON arrays and objects.
- All three serialization paths (step env, job env, content substitution) are consistent and produce the same output for the same input type.
- Defense-in-depth: even if a future YAML parser upgrade changes the Go type returned for sequences, the reflection fallback ensures correct JSON output.

#### Negative
- The `reflect` package is now a dependency of three additional files in the hot compilation path, adding marginal complexity.
- Scalar non-string values (int, bool) continue to use `fmt.Sprint`, meaning there is no single uniform serialization strategy across all value types.
- Any tooling that previously relied on the `[a b]` format (unlikely, as it was a bug) would break.

#### Neutral
- The `marshalEnvValue()` helper is defined in `step_types.go` and shared with `compiler_jobs.go` via package scope; `marshalImportInputValue` and `substituteImportInputsInContent` retain their own local reflection logic in the `parser` package due to package boundaries.
- New regression tests were added covering `[]string`, `[]int`, `map[string]any`, and `[]any` inputs.

---

## Part 2 — Normative Specification (RFC 2119)

> The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** in this section are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119).

### Env Value Serialization

1. Implementations **MUST** serialize array-typed import-input values as valid JSON arrays (e.g., `["a","b"]`) when writing them into compiled `env:` blocks at both step level and job level.
2. Implementations **MUST** serialize map-typed import-input values as valid JSON objects when writing them into compiled `env:` blocks.
3. Implementations **MUST NOT** use Go's `fmt.Sprint` or equivalent default string formatting for slice or map values in `env:` blocks, as this produces non-JSON output such as `[a b]`.
4. Implementations **MUST** handle typed Go slices (e.g., `[]string`, `[]int`) produced by the YAML parser via reflection, normalizing them to `[]any` before JSON marshaling.
5. Implementations **SHOULD** apply this serialization consistently across all env-writing code paths: step-level env (`MapToStep`), job-level env (`buildCustomJobs`), and content substitution (`marshalImportInputValue`, `substituteImportInputsInContent`).
6. Implementations **MAY** retain `fmt.Sprint` as a fallback for scalar non-string types (int, bool, float64) where JSON encoding is not required.

### Serialization Helper Scope

1. Implementations **MUST** use a shared serialization helper (e.g., `marshalEnvValue`) for step-level and job-level env serialization within the same package to avoid code duplication.
2. Implementations **MAY** duplicate the reflection logic in other packages (e.g., `parser`) where package boundaries prevent sharing the helper, provided the behavior is equivalent.

### Conformance

An implementation is considered conformant with this ADR if it satisfies all **MUST** and **MUST NOT** requirements above. Specifically, conformance requires that any array or map value passed as an import-input and referenced in a compiled `env:` block is serialized as a valid JSON string, and that Go's default slice formatting (`[a b]`) never appears as an env value for structured types. Failure to meet any **MUST** or **MUST NOT** requirement constitutes non-conformance.

---

*ADR created by [adr-writer agent]. Review and finalize before changing status from Draft to Accepted.*
32 changes: 32 additions & 0 deletions pkg/parser/import_field_extractor.go
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,9 @@ import (
"fmt"
"maps"
"path/filepath"
"reflect"
"regexp"
"sort"
"strings"
)

Expand Down Expand Up @@ -795,6 +797,8 @@ func substituteImportInputsInContent(content string, inputs map[string]any) stri
}
// Serialize the value: arrays and maps as JSON (valid YAML inline syntax),
// scalars with fmt.Sprintf.
// goccy/go-yaml may produce typed slices (e.g. []string) instead of []any,
// so a reflection fallback handles those cases.
switch v := value.(type) {
case []any:
if b, err := json.Marshal(v); err == nil {
Expand All @@ -804,6 +808,34 @@ func substituteImportInputsInContent(content string, inputs map[string]any) stri
if b, err := json.Marshal(v); err == nil {
return string(b), true
}
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

substituteImportInputsInContent uses reflect.ValueOf(v) in the default branch. If an input key is present with a nil value (e.g. with: { packages: } in YAML), reflect.ValueOf(nil) produces an invalid Value and rv.Kind() will panic. Add an explicit nil check / case nil: before the reflection fallback so null inputs don’t crash compilation.

Suggested change
}
}
case nil:
// Preserve the existing scalar fallback behavior for null inputs,
// but avoid calling reflect.ValueOf(nil), which yields an invalid
// Value and would panic on Kind().

Copilot uses AI. Check for mistakes.
case nil:
// Null import input — skip substitution to avoid panicking on reflect.ValueOf(nil).
return "", false
default:
rv := reflect.ValueOf(v)
switch rv.Kind() {
case reflect.Slice:
normalized := make([]any, rv.Len())
for i := range rv.Len() {
normalized[i] = rv.Index(i).Interface()
}
if b, err := json.Marshal(normalized); err == nil {
return string(b), true
}
case reflect.Map:
keys := make([]string, 0, rv.Len())
for _, key := range rv.MapKeys() {
keys = append(keys, key.String())
}
sort.Strings(keys)
normalized := make(map[string]any, rv.Len())
for _, k := range keys {
normalized[k] = rv.MapIndex(reflect.ValueOf(k)).Interface()
}
if b, err := json.Marshal(normalized); err == nil {
return string(b), true
}
}
}
return fmt.Sprintf("%v", value), true
}
Expand Down
6 changes: 4 additions & 2 deletions pkg/workflow/compiler_jobs.go
Original file line number Diff line number Diff line change
Expand Up @@ -670,8 +670,10 @@ func (c *Compiler) buildCustomJobs(data *WorkflowData, activationJobCreated bool
for key, val := range envMap {
if valStr, ok := val.(string); ok {
job.Env[key] = valStr
} else {
compilerJobsLog.Printf("Warning: env '%s' in job '%s' has non-string value (type: %T), ignoring", key, jobName, val)
} else if val != nil {
// Arrays and maps are serialized as JSON so that shell consumers
// (e.g. jq --argjson) receive valid JSON.
job.Env[key] = marshalEnvValue(val)
}
}
}
Expand Down
34 changes: 34 additions & 0 deletions pkg/workflow/expression_extraction.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ import (
"encoding/json"
"fmt"
"os"
"reflect"
"regexp"
"sort"
"strings"
Expand Down Expand Up @@ -327,6 +328,9 @@ func SubstituteImportInputs(content string, importInputs map[string]any) string
// substitution into both YAML frontmatter and markdown prose.
// Arrays and maps are serialized as JSON (which is valid YAML inline syntax).
// Scalar values use Go's default string formatting.
//
// goccy/go-yaml may produce typed slices (e.g. []string) instead of []any, so
// a reflection fallback converts any slice kind to []any before JSON marshaling.
func marshalImportInputValue(value any) string {
switch v := value.(type) {
case []any:
Expand All @@ -337,6 +341,36 @@ func marshalImportInputValue(value any) string {
if b, err := json.Marshal(v); err == nil {
return string(b)
}
case nil:
// Null import input — return empty string rather than panicking.
return ""
default:
// Handle typed slices (e.g. []string) that goccy/go-yaml may produce
// instead of []any, and typed maps.
rv := reflect.ValueOf(v)
switch rv.Kind() {
Comment on lines +347 to +351
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marshalImportInputValue now uses reflect.ValueOf(v) in the default branch. If the resolved import input exists but is explicitly null in YAML (so the map contains the key with a nil value), reflect.ValueOf(nil) yields an invalid Value and rv.Kind() will panic at runtime. Add a case nil: (or a if value == nil guard) to bypass the reflection path and preserve the previous non-panicking behavior.

Copilot uses AI. Check for mistakes.
case reflect.Slice:
normalized := make([]any, rv.Len())
for i := range rv.Len() {
normalized[i] = rv.Index(i).Interface()
}
if b, err := json.Marshal(normalized); err == nil {
return string(b)
}
case reflect.Map:
keys := make([]string, 0, rv.Len())
for _, key := range rv.MapKeys() {
keys = append(keys, key.String())
}
sort.Strings(keys)
normalized := make(map[string]any, rv.Len())
for _, k := range keys {
normalized[k] = rv.MapIndex(reflect.ValueOf(k)).Interface()
}
if b, err := json.Marshal(normalized); err == nil {
return string(b)
}
}
}
return fmt.Sprintf("%v", value)
}
Expand Down
67 changes: 67 additions & 0 deletions pkg/workflow/expression_extraction_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -478,3 +478,70 @@ func TestApplyWorkflowDispatchFallbacks(t *testing.T) {
})
}
}

// TestMarshalImportInputValue tests the marshalImportInputValue helper for correct
// JSON serialization of array and map types, including typed slices produced by
// goccy/go-yaml (e.g. []string instead of []any).
func TestMarshalImportInputValue(t *testing.T) {
tests := []struct {
name string
value any
want string
}{
{
name: "string scalar",
value: "hello",
want: "hello",
},
{
name: "int scalar",
value: 42,
want: "42",
},
{
name: "bool scalar",
value: true,
want: "true",
},
{
name: "[]any slice",
value: []any{"a", "b", "c"},
want: `["a","b","c"]`,
},
{
// goccy/go-yaml produces []string instead of []any for string arrays
name: "[]string typed slice (goccy/go-yaml output)",
value: []string{"microsoft/apm#main", "github/awesome-copilot/skills/foo"},
want: `["microsoft/apm#main","github/awesome-copilot/skills/foo"]`,
},
{
name: "[]int typed slice",
value: []int{1, 2, 3},
want: `[1,2,3]`,
},
{
name: "map[string]any",
value: map[string]any{"key": "val"},
want: `{"key":"val"}`,
},
{
name: "empty []string",
value: []string{},
want: `[]`,
},
{
name: "nil value",
value: nil,
want: "",
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := marshalImportInputValue(tt.value)
if got != tt.want {
t.Errorf("marshalImportInputValue(%v) = %q, want %q", tt.value, got, tt.want)
}
})
}
}
85 changes: 85 additions & 0 deletions pkg/workflow/imports_inputs_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -277,3 +277,88 @@ This workflow tests that string imports still work.
t.Error("Expected shared content to NOT be inlined (should use runtime-import)")
}
}

// TestImportWithArrayInputs tests that array-typed import-inputs are serialized as
// JSON (e.g. ["a","b"]) rather than the Go default slice format ("[a b]").
// This is the regression test for the bug where goccy/go-yaml returns []string
// instead of []any, causing the Go fmt.Sprintf fallback to produce "[a b]".
func TestImportWithArrayInputs(t *testing.T) {
tempDir := testutil.TempDir(t, "test-import-array-inputs-*")

sharedDir := filepath.Join(tempDir, "shared")
if err := os.MkdirAll(sharedDir, 0755); err != nil {
t.Fatalf("Failed to create shared directory: %v", err)
}

// Shared workflow that accepts an array input and references it in a run: step
sharedContent := `---
import-schema:
packages:
type: array
items:
type: string
required: true
description: "List of packages"
on: workflow_call
jobs:
show:
runs-on: ubuntu-latest
steps:
- env:
PKGS: ${{ github.aw.import-inputs.packages }}
run: echo "$PKGS"
---

Process packages ${{ github.aw.import-inputs.packages }}.
`
sharedPath := filepath.Join(sharedDir, "pkgs.md")
if err := os.WriteFile(sharedPath, []byte(sharedContent), 0644); err != nil {
t.Fatalf("Failed to write shared file: %v", err)
}

// Caller workflow that provides a YAML array for the 'packages' input
callerContent := `---
on: push
permissions:
contents: read
engine: copilot
imports:
- uses: shared/pkgs.md
with:
packages:
- microsoft/apm#main
- github/awesome-copilot/skills/foo
---

Caller workflow.
`
callerPath := filepath.Join(tempDir, "caller.md")
if err := os.WriteFile(callerPath, []byte(callerContent), 0644); err != nil {
t.Fatalf("Failed to write caller file: %v", err)
}

compiler := workflow.NewCompiler()
if err := compiler.CompileWorkflow(callerPath); err != nil {
t.Fatalf("CompileWorkflow failed: %v", err)
}

lockFilePath := stringutil.MarkdownToLockFile(callerPath)
lockFileContent, err := os.ReadFile(lockFilePath)
if err != nil {
t.Fatalf("Failed to read lock file: %v", err)
}
lockContent := string(lockFileContent)

// The substituted value must be valid JSON, not the Go slice format "[a b]"
if strings.Contains(lockContent, "[microsoft/apm#main github/awesome-copilot/skills/foo]") {
t.Error("Array input was serialized as Go slice format '[a b]'; expected JSON array")
}
if !strings.Contains(lockContent, `["microsoft/apm#main","github/awesome-copilot/skills/foo"]`) {
t.Errorf("Expected JSON array in lock file, got:\n%s", lockContent)
}

// No unresolved expressions should remain
if strings.Contains(lockContent, "github.aw.import-inputs.packages") {
t.Error("Unresolved import-inputs expression remained in lock file")
}
}
Loading