Skip to content

feat: add CNCF ModelPack type definitions#859

Open
ilopezluna wants to merge 9 commits intomainfrom
cncf-packaging-types
Open

feat: add CNCF ModelPack type definitions#859
ilopezluna wants to merge 9 commits intomainfrom
cncf-packaging-types

Conversation

@ilopezluna
Copy link
Copy Markdown
Contributor

  • Add CNCF ModelPack media type constants and Model type (modelpack/types.go)
  • Add DockerConfigToModelPack conversion utility (modelpack/convert.go)
  • Add CNCFModel partial type for CNCF format support (partial/cncf_model.go)
  • Add ArtifactType field to OCI Manifest struct (oci/manifest.go)
  • Update partial/mutate layers for format-aware manifest options
  • Replace deprecated WithConfigMediaType with WithManifestOptions
  • Add model-spec v0.0.7 dependency

…manifest ArtifactType support

- Add CNCF ModelPack media type constants and Model type (modelpack/types.go)
- Add DockerConfigToModelPack conversion utility (modelpack/convert.go)
- Add CNCFModel partial type for CNCF format support (partial/cncf_model.go)
- Add ArtifactType field to OCI Manifest struct (oci/manifest.go)
- Update partial/mutate layers for format-aware manifest options
- Replace deprecated WithConfigMediaType with WithManifestOptions
- Add model-spec v0.0.7 dependency
…builder

- Add BuildFormat type and WithFormat option to builder for Docker/CNCF selection
- Implement CNCF artifact creation with media type remapping and config conversion
- Add --format CLI flag to 'docker model package' command with validation
- Improve context size error handling (return error for invalid values)
- Add comprehensive builder tests for CNCF format (artifact validation, media types)
- Add integration tests for CNCF format packaging
- Update CLI reference documentation
- Add DiffID extraction support for CNCF remote layers (oci/remote)
- Add CNCF layer handling in bundle unpack with media type classification
- Detect mmproj files by filepath annotation in CNCF generic weight layers
- Add findModelFileExcluding helper for mmproj-aware model file discovery
- Add comprehensive tests for DiffID extraction, CNCF layer unpacking,
  mmproj detection, and parse with CNCF mmproj GGUF files
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements native support for the CNCF ModelPack specification by integrating the official model-spec dependency and introducing logic to handle CNCF-compliant manifests and configurations. Key changes include the addition of a CNCFModel artifact type, layer classification heuristics in modelpack/convert.go, and support for the artifactType field in OCI manifests. A critical issue was identified in pkg/distribution/internal/partial/partial.go:260, where the current implementation for MediaTypeMultimodalProjector lacks a mapping for CNCF types, which breaks multimodal support for CNCF artifacts and requires a resolution mechanism via annotations or metadata.

@ilopezluna ilopezluna marked this pull request as ready for review April 13, 2026 13:05
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • In DockerConfigToModelPack, using time.Now() as a fallback for CreatedAt makes the converted configs non-deterministic and time-dependent; consider leaving CreatedAt nil or passing an explicit time from the caller to keep artifacts reproducible.
  • CNCFModel.Descriptor currently only surfaces Created from the ModelPack descriptor and drops other identity fields (name, version, etc.); if downstream code relies on richer descriptor metadata, consider mapping additional fields or documenting that this descriptor is intentionally minimal.
  • model.m.isCNCFBase reads and inspects the base config on every RawConfigFile call; if this path is hot, consider caching the format decision or the parsed result inside the model wrapper to avoid repeated JSON work.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `DockerConfigToModelPack`, using `time.Now()` as a fallback for `CreatedAt` makes the converted configs non-deterministic and time-dependent; consider leaving `CreatedAt` nil or passing an explicit time from the caller to keep artifacts reproducible.
- CNCFModel.Descriptor currently only surfaces `Created` from the ModelPack descriptor and drops other identity fields (name, version, etc.); if downstream code relies on richer descriptor metadata, consider mapping additional fields or documenting that this descriptor is intentionally minimal.
- model.m.isCNCFBase reads and inspects the base config on every RawConfigFile call; if this path is hot, consider caching the format decision or the parsed result inside the model wrapper to avoid repeated JSON work.

## Individual Comments

### Comment 1
<location path="pkg/distribution/modelpack/convert.go" line_range="142-149" />
<code_context>
+	}
+}
+
+// normalizeParamSize converts a Docker-format parameters string (e.g.
+// "8.03B", "70B") to a model-spec paramSize string (e.g. "8b", "70b").
+// Returns empty string if s is empty.
+func normalizeParamSize(s string) string {
+	if s == "" {
+		return ""
+	}
+	return strings.ToLower(s)
+}
</code_context>
<issue_to_address>
**issue:** The normalizeParamSize comment overpromises compared to the implementation

The current behavior only lowercases the string, while the comment suggests semantic normalization (e.g., converting "8.03B" to "8b"). Either update the comment to describe case normalization only, or extend the function to actually parse and canonicalize the parameter size if that’s required.
</issue_to_address>

### Comment 2
<location path="pkg/distribution/modelpack/convert.go" line_range="113-128" />
<code_context>
+// DockerConfigToModelPack converts a Docker-format model config into a
+// CNCF ModelPack Model config. The diffIDs should already be in
+// digest.Digest ("algorithm:hex") format.
+func DockerConfigToModelPack(
+	cfg types.Config,
+	desc types.Descriptor,
+	diffIDs []digest.Digest,
+) Model {
+	now := time.Now()
+	createdAt := desc.Created
+	if createdAt == nil {
+		createdAt = &now
+	}
+	return Model{
+		Descriptor: ModelDescriptor{
+			CreatedAt: createdAt,
+			// Map architecture to family as the closest available field.
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Defaulting CreatedAt to time.Now can hurt reproducibility of converted configs

Because `desc.Created` is nil, this helper always fills `CreatedAt` with `time.Now()`, so the same Docker config converts to different CNCF configs over time, breaking determinism for digests/manifests. If reproducibility matters here, it’s safer to leave `CreatedAt` nil in that case or make this behavior configurable at the call site.

```suggestion
func DockerConfigToModelPack(
	cfg types.Config,
	desc types.Descriptor,
	diffIDs []digest.Digest,
) Model {
	return Model{
		Descriptor: ModelDescriptor{
			// Preserve CreatedAt from the original descriptor; if it's nil,
			// leave it unset to keep conversions deterministic.
			CreatedAt: desc.Created,
			// Map architecture to family as the closest available field.
			Family: cfg.Architecture,
		},
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Address review feedback from PR #860:

1. remappedLayer.GetDescriptor: pre-compute digest and size at construction
   time via newRemappedLayer(), surfacing errors eagerly instead of silently
   producing invalid OCI descriptors.

2. FromModel format inheritance: when --format is not explicitly set, auto-
   detect the source model's format (Docker vs CNCF) from its config. This
   prevents producing inconsistent artifacts when repackaging a CNCF model
   without an explicit --format flag.

3. CLI conditional --format: only pass WithFormat to the builder when the
   user explicitly sets --format, allowing the builder's auto-detection to
   work for the repackaging case.
…eParamSize comment

- Remove time.Now() fallback in DockerConfigToModelPack: propagate
  desc.Created directly (nil when unset) so callers control timestamps
  and converted configs remain deterministic across runs.
- Fix normalizeParamSize comment to accurately describe its behavior
  (lowercases, does not truncate decimal precision).
@ilopezluna
Copy link
Copy Markdown
Contributor Author

@sourcery-ai review

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • In model.rawCNCFConfigFile, the base config is unmarshalled on every call to RawConfigFile; consider caching the parsed modelpack.Model (or at least the fact that the base is CNCF) to avoid repeated JSON decode work when configs are requested multiple times.
  • The fallback behavior in ClassifyLayer/classifyByPath treats unknown media types and FileTypeUnknown as KindWeightConfig; if there are likely to be truly unknown weight files, you may want to expand the heuristics (e.g., by extension) or make this default configurable to avoid silently misclassifying weights as config.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `model.rawCNCFConfigFile`, the base config is unmarshalled on every call to `RawConfigFile`; consider caching the parsed `modelpack.Model` (or at least the fact that the base is CNCF) to avoid repeated JSON decode work when configs are requested multiple times.
- The fallback behavior in `ClassifyLayer`/`classifyByPath` treats unknown media types and `FileTypeUnknown` as `KindWeightConfig`; if there are likely to be truly unknown weight files, you may want to expand the heuristics (e.g., by extension) or make this default configurable to avoid silently misclassifying weights as config.

## Individual Comments

### Comment 1
<location path="pkg/distribution/modelpack/convert.go" line_range="112-121" />
<code_context>
+// DockerConfigToModelPack converts a Docker-format model config into a
+// CNCF ModelPack Model config. The diffIDs should already be in
+// digest.Digest ("algorithm:hex") format.
+func DockerConfigToModelPack(
+	cfg types.Config,
+	desc types.Descriptor,
+	diffIDs []digest.Digest,
+) Model {
+	// Preserve determinism by propagating desc.Created directly.
+	// Callers that require a concrete timestamp should set desc.Created
+	// explicitly before calling this function.
+	return Model{
+		Descriptor: ModelDescriptor{
+			CreatedAt: desc.Created,
+			// Map architecture to family as the closest available field.
+			Family: cfg.Architecture,
+		},
+		Config: ModelConfig{
+			Architecture: cfg.Architecture,
+			Format:       string(cfg.Format),
</code_context>
<issue_to_address>
**suggestion (bug_risk):** DockerConfigToModelPack omits some relevant fields (e.g., Precision, Capabilities) when mapping Docker config to CNCF ModelPack

This mapping currently only forwards Architecture, Format, ParamSize (via Parameters), and Quantization into CNCF ModelConfig, dropping other available fields from the upstream spec/Docker config. If this is meant to be the canonical conversion, please also map any supported fields such as Precision (and, if applicable, Capabilities) so that information isn’t silently lost.

Suggested implementation:

```golang
		Config: ModelConfig{
			Architecture: cfg.Architecture,
			Format:       string(cfg.Format),
			ParamSize:    normalizeParamSize(cfg.Parameters),
			Quantization: cfg.Quantization,
			Precision:    cfg.Precision,
			Capabilities: cfg.Capabilities,
		},

```

This assumes that:
1. `types.Config` already has fields `Precision` and `Capabilities`.
2. `ModelConfig` already defines corresponding fields `Precision` and `Capabilities` with compatible types.

If either struct does not yet define these fields, you will need to:
- Add `Precision` and `Capabilities` to `ModelConfig` with the appropriate types and JSON tags, consistent with the CNCF ModelPack spec and existing style in this package.
- Ensure `types.Config` defines `Precision` and `Capabilities` fields that represent the upstream Docker/spec config. If names or types differ, adjust the assignment accordingly (e.g., perform any needed conversions).
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

…nfig

The function only lowercased the string, which is unnecessary — the CNCF
model-spec does not mandate lowercase paramSize values. Pass cfg.Parameters
directly to keep the same value as the Docker format.
@ilopezluna ilopezluna requested a review from a team April 13, 2026 13:34
… caching

The CNCF path in FromDirectory was constructing remappedLayer directly,
bypassing the digest/size pre-computation added in newRemappedLayer.
This left cachedDigest/cachedSize zero-valued for layers that aren't
descriptor providers, producing invalid OCI descriptors.
feat: add --format flag to package command and CNCF build support in builder
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants