Skip to content

feat(topograph): support trimming upper network tiers#233

Merged
dmitsh merged 1 commit into
mainfrom
ds-tree
Mar 11, 2026
Merged

feat(topograph): support trimming upper network tiers#233
dmitsh merged 1 commit into
mainfrom
ds-tree

Conversation

@dmitsh
Copy link
Copy Markdown
Collaborator

@dmitsh dmitsh commented Mar 9, 2026

No description provided.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 9, 2026

Greptile Summary

This PR adds support for trimming upper network tiers (spine and/or core switch levels) from the three-tier topology graph, configurable via a new trimTiers parameter (valid range 0–2). The feature is wired end-to-end across all providers (AWS, GCP, LambdaI, Nebius, OCI API, OCI IMDS), the gRPC forwarding path, and simulation loaders, and is backed by a new integration test and thorough unit tests.

All previously raised review concerns have been addressed in this revision:

  • trimmedTiers now returns a local slice copy — no mutation of the original InstanceTopology.
  • GetTrimTiers correctly handles both int (YAML) and float64 (JSON) values, and rejects values outside [0, 2].
  • grpc_client.go now reads trimTiers from the request params instead of hardcoding 0.
  • All providers (including the previously missing Nebius path) are fully wired.

Remaining concern:

  • All LoaderSim functions read trimTiers from SimulationParams.TrimTiers (decoded via mapstructure), bypassing the 0–2 range validation that GetTrimTiers enforces for real providers. A simulation request with "trimTiers": 3 is silently accepted and trims all three tiers, while the same value is rejected for real providers.

Confidence Score: 4/5

  • Safe to merge with one minor inconsistency in simulation provider validation.
  • All previously flagged issues are resolved. The only remaining issue is that simulation providers accept out-of-range trimTiers values (> 2) without error because they use mapstructure decoding rather than GetTrimTiers. This only affects simulation paths and does not impact production providers.
  • pkg/providers/providers_sim.go and all *_sim.go provider files — LoaderSim implementations do not call GetTrimTiers and therefore skip range validation.

Important Files Changed

Filename Overview
pkg/topology/graph.go Adds trimTiers int parameter to ToThreeTierGraph and introduces the trimmedTiers helper that returns a local copy of the tier IDs with upper levels zeroed — no mutation of the original InstanceTopology.
pkg/providers/providers.go Refactors StringFromMap into a generic FromMap[T], adds GetTrimTiers with proper float64 coercion (for JSON-decoded params) and a 0–2 range guard. Error messages corrected to "must be of type %T".
pkg/providers/providers_sim.go Adds TrimTiers int to SimulationParams; however all LoaderSim functions use p.TrimTiers directly (via mapstructure) rather than calling GetTrimTiers, bypassing the 0–2 range validation enforced for real providers.
pkg/providers/nebius/provider.go Previously flagged as missing trimTiers support — now fully wired: reads GetTrimTiers, stores in baseProvider, and passes to ToThreeTierGraph.
pkg/server/grpc_client.go Previously flagged hardcoded 0 replaced: now calls GetTrimTiers(tr.Provider.Params) and passes the result to ToThreeTierGraph, keeping the forwarded gRPC path consistent with the local path.
pkg/server/http_server_test.go Adds an integration test (Case 7) for trimmed tree topology via gcp-sim. Minor mixed tab/space indentation on line 102 of the slurmTrimmedTreePayload constant.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[HTTP/gRPC Request\nwith trimTiers param] --> B{Provider type?}
    B -- Real provider --> C[Loader\nGetTrimTiers validates 0-2\nstores in baseProvider.trimTiers]
    B -- Sim provider --> D[LoaderSim\nGetSimulationParams via mapstructure\np.TrimTiers - no range check]
    C --> E[GenerateTopologyConfig]
    D --> E
    E --> F[ClusterTopology.ToThreeTierGraph\nprovider, cis, trimTiers, normalize]
    F --> G[trimmedTiers helper\nreturns local copy of tier IDs\nzeros out top N entries]
    G --> H{swID empty?}
    H -- yes --> I[skip - tier trimmed from graph]
    H -- no --> J[add switch vertex to tree]
    J --> K[Return Vertex tree\nwithout trimmed upper tiers]
    I --> K
Loading

Comments Outside Diff (2)

  1. pkg/providers/providers_sim.go, line 31 (link)

    Simulation providers bypass GetTrimTiers validation

    All LoaderSim functions extract trimTiers from p.TrimTiers (populated via GetSimulationParams → mapstructure), completely bypassing the 0–2 range validation that GetTrimTiers enforces for every real provider. A simulation request with "trimTiers": 3 will succeed — trimmedTiers will happily zero out all three switch levels — while the identical value is rejected for real providers.

    For example, in pkg/providers/gcp/provider_sim.go:

    p, err := providers.GetSimulationParams(cfg.Params)
    ...
    return NewSim(clientFactory, p.TrimTiers), nil  // no range check

    Consider calling GetTrimTiers inside LoaderSim (same as real providers) and using its validated result instead of the raw p.TrimTiers:

    trimTiers, err := providers.GetTrimTiers(cfg.Params)
    if err != nil {
        return nil, httperr.NewError(http.StatusBadRequest, "parameters error: "+err.Error())
    }
    return NewSim(clientFactory, trimTiers), nil

    The same inconsistency exists in the LoaderSim of every provider that was updated in this PR (aws, gcp, lambdai, nebius, oci).

  2. pkg/server/http_server_test.go, line 102 (link)

    Mixed tab/space indentation in JSON string literal

    Line 102 uses a hard tab character (\t) for indentation while every other line in slurmTrimmedTreePayload uses spaces. JSON is whitespace-agnostic so this won't affect parsing, but it creates an inconsistency visible in editors and diffs.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Last reviewed commit: 4e20a43

Comment thread pkg/providers/lambdai/provider.go Outdated
Comment thread pkg/providers/providers.go Outdated
Comment thread pkg/topology/graph.go Outdated
Comment thread pkg/providers/providers.go
Comment thread pkg/providers/nebius/provider.go Outdated
Comment thread pkg/server/grpc_client.go Outdated
Comment thread pkg/topology/graph_test.go
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 10, 2026

Codecov Report

❌ Patch coverage is 63.41463% with 45 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.96%. Comparing base (a9b6076) to head (4e20a43).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pkg/providers/aws/provider.go 8.33% 11 Missing ⚠️
pkg/providers/gcp/provider.go 10.00% 9 Missing ⚠️
pkg/providers/nebius/provider.go 10.00% 9 Missing ⚠️
pkg/providers/oci/provider_api.go 11.11% 8 Missing ⚠️
pkg/providers/oci/provider_imds.go 0.00% 6 Missing ⚠️
pkg/server/grpc_client.go 50.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #233      +/-   ##
==========================================
- Coverage   66.99%   66.96%   -0.04%     
==========================================
  Files          82       82              
  Lines        4572     4646      +74     
==========================================
+ Hits         3063     3111      +48     
- Misses       1398     1424      +26     
  Partials      111      111              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread pkg/providers/providers.go
@dmitsh dmitsh force-pushed the ds-tree branch 4 times, most recently from 2043ae3 to e4d05d5 Compare March 10, 2026 22:36
ravisoundar
ravisoundar previously approved these changes Mar 10, 2026
Copy link
Copy Markdown
Collaborator

@ravisoundar ravisoundar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Signed-off-by: Dmitry Shmulevich <dshmulevich@nvidia.com>
@dmitsh dmitsh merged commit d79be88 into main Mar 11, 2026
6 checks passed
@dmitsh dmitsh deleted the ds-tree branch March 11, 2026 18:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants