…re devops scraper

Fetch complete release definition JSON for better config tracking, expose deploy steps in environment details, and use environment names as change types for more granular release tracking.
Adds support for extracting release and environment-level configuration variables (excluding secrets), artifact definitions with pipeline/version/branch references, and release reason/description fields from Azure DevOps releases. Includes new data structures for artifacts and configuration variables, and helper functions to flatten and summarize this metadata.
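The variable flattening described above can be sketched as a small helper. This is a minimal sketch, assuming a `Variable` type with an `IsSecret` field and environment-over-release precedence; none of these names are confirmed by the PR itself.

```go
package main

import "fmt"

// Variable loosely mirrors an Azure DevOps release variable; the field names
// here are illustrative assumptions, not the scraper's actual types.
type Variable struct {
	Value    string
	IsSecret bool
}

// flattenVariables merges release-level and environment-level variables into a
// flat map, skipping secrets; environment values take precedence.
func flattenVariables(releaseVars, envVars map[string]Variable) map[string]string {
	out := map[string]string{}
	for name, v := range releaseVars {
		if !v.IsSecret {
			out[name] = v.Value
		}
	}
	for name, v := range envVars {
		if !v.IsSecret {
			out[name] = v.Value
		}
	}
	return out
}

func main() {
	flat := flattenVariables(
		map[string]Variable{"region": {Value: "us-east"}, "token": {Value: "s3cret", IsSecret: true}},
		map[string]Variable{"region": {Value: "eu-west"}},
	)
	fmt.Println(flat) // secrets are dropped; the environment value for "region" wins
}
```

Letting environment values override release values matches the usual Azure DevOps scoping, but the actual precedence used by the PR is not visible in this excerpt.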
Caution: Review failed. The pull request is closed.
Walkthrough

Enhances Azure DevOps release scraping to fetch full release definitions and enrich release results with artifacts, variables, deploy steps, and metadata; adds relationship-extraction fixtures and e2e specs; expands the extract e2e test harness to pre-populate and assert relationships; simplifies stale-item timeout handling and updates dependencies.
Sequence Diagrams

```mermaid
sequenceDiagram
participant Scraper as Release Scraper
participant Client as AzureDevopsClient
participant API as Azure DevOps API
participant Processor as buildReleaseResult
Scraper->>Client: GetReleaseDefinition(ctx, project, defID)
Client->>API: Fetch release definition JSON
API-->>Client: defJSON (map[string]any)
Client-->>Scraper: defJSON
Scraper->>Client: GetReleases(ctx, project, defID)
Client->>API: Fetch releases list
API-->>Client: []Release
Client-->>Scraper: releases
Scraper->>Processor: buildReleaseResult(ctx, config, project, def, defJSON, releases, cutoff)
Processor->>Processor: flattenVariables(release.Variables, env.Variables)
Processor->>Processor: summarizeArtifacts(release.Artifacts)
Processor->>Processor: Populate result.Config from defJSON
Processor-->>Scraper: ScrapeResult (enriched)
```

```mermaid
sequenceDiagram
participant Test as E2E Test
participant Loader as Fixture Loader
participant Store as Config Store
participant Extractor as Extract Scraper
participant Repo as ConfigRelationship Repo
Test->>Loader: Load fixtures/data/relationships_basic.json
Loader-->>Test: Service entity
Test->>Store: Pre-populate Deployment (deploy-web-frontend)
Store-->>Test: Stored
Test->>Extractor: Run extract with transform (backend_ref → App::Deployment)
Extractor->>Repo: Query ConfigRelationship for scraper
Repo-->>Extractor: []ConfigRelationship
Extractor-->>Test: Serialized relationships injected into env
Test->>Test: Assert relationship count/properties
```
🚥 Pre-merge checks: 2 passed, 1 failed (1 warning)
Caution: Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
scrapers/stale.go (1)
19-36: ⚠️ Potential issue | 🟠 Major

Pipeline failure: `staleDuration` from scrape start is never used.

The static analysis correctly identifies that `staleDuration`, computed from `time.Since(start)` on line 22, is always overwritten in the switch statement. According to the summary, this PR removed the comparison logic that previously enforced taking the greater of the parsed value and the scrape duration.

If the scrape-start-based duration is no longer needed, remove the dead code (lines 19-24). Otherwise, if the intent was to preserve the "don't delete items scraped recently" behavior, the comparison logic needs to be restored.
Option 1: Remove dead code if scrape duration is no longer needed
```diff
 func DeleteStaleConfigItems(ctx context.Context, staleTimeout string, scraperID uuid.UUID) (int64, error) {
 	var staleDuration time.Duration
-	if val := ctx.Value(contextKeyScrapeStart); val != nil {
-		if start, ok := val.(time.Time); ok {
-			staleDuration = time.Since(start)
-		}
-	}
 	switch staleTimeout {
 	case "keep":
```

Option 2: Restore max comparison if scrape duration should be respected
```diff
 switch staleTimeout {
 case "keep":
 	return 0, nil
 case "":
-	staleDuration = ctx.Properties().Duration("config.retention.stale_item_age", DefaultStaleTimeout)
+	configDuration := ctx.Properties().Duration("config.retention.stale_item_age", DefaultStaleTimeout)
+	if configDuration > staleDuration {
+		staleDuration = configDuration
+	}
 default:
 	if parsed, err := duration.ParseDuration(staleTimeout); err != nil {
 		return 0, fmt.Errorf("failed to parse stale timeout %s: %w", staleTimeout, err)
 	} else {
-		staleDuration = time.Duration(parsed)
+		if time.Duration(parsed) > staleDuration {
+			staleDuration = time.Duration(parsed)
+		}
 	}
 }
```
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scrapers/stale.go` around lines 19 - 36: the variable staleDuration is computed from ctx.Value(contextKeyScrapeStart) but then always overwritten by the switch on staleTimeout; either remove the unused scrape-start code (the ctx.Value(contextKeyScrapeStart) block and the initial staleDuration declaration) or restore the original "take the max" behavior by keeping the scrape-start-derived staleDuration and, after resolving staleTimeout (using ctx.Properties().Duration or duration.ParseDuration), setting staleDuration = max(staleDuration, parsedDuration), so the logic using staleDuration preserves the "don't delete items scraped more recently" rule.

scrapers/azure/devops/pipelines_test.go (1)
367-378: ⚠️ Potential issue | 🔴 Critical

Replace `env.Name` with the mapped `changeType` variable in the ChangeType assignment.

The code computes a status-based `changeType` value via `releaseEnvStatusToChangeType[env.Status]` and uses it in conditional logic (line 274), but then incorrectly assigns `env.Name` to `ChangeType` instead. This causes:

- Semantic mismatch: `ChangeType` now holds environment names (`"Staging"`, `"Prod"`) instead of status constants (`ChangeTypeSucceeded`, `ChangeTypeInProgress`) used elsewhere in the codebase (see `run_state.go` for pipeline runs).
- Downstream breakage: Code in `db/update.go:366` calls `events.GetSeverity(change.ChangeType)`, which expects status constants, not environment names. This will produce incorrect severity levels.
- Inconsistent semantics: Approval changes correctly use `ChangeType: changeType` for status values; environment changes should do the same. The actual deployment status should remain in `details["status"]`.

The fix is straightforward: change line ~330 from `ChangeType: env.Name,` to `ChangeType: changeType,` to use the computed status value.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/azure/devops/pipelines_test.go` around lines 367 - 378, In buildReleaseResult, the ChangeType for environment changes is incorrectly set to env.Name (environment label) instead of the mapped status constant; locate the block where env is processed (uses releaseEnvStatusToChangeType[env.Status] assigned to changeType) and replace the assignment to the ChangeType field so it uses changeType (the mapped status) rather than env.Name, leaving details["status"] populated with env.Status/name as before; this keeps semantics consistent with approval changes (which already use ChangeType: changeType) and ensures downstream callers like events.GetSeverity receive the expected status constants.
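The fix the review describes can be sketched in miniature. The map contents, the `EnvChange` struct, and `buildEnvChange` below are stand-ins invented for illustration; only `releaseEnvStatusToChangeType` and the ChangeType-vs-Name distinction come from the review itself.

```go
package main

import "fmt"

// releaseEnvStatusToChangeType stands in for the mapping the review refers to;
// the concrete keys and values below are illustrative, not the repo's constants.
var releaseEnvStatusToChangeType = map[string]string{
	"succeeded":  "ChangeTypeSucceeded",
	"inProgress": "ChangeTypeInProgress",
	"rejected":   "ChangeTypeFailed",
}

// EnvChange is a simplified stand-in for the scraper's change record.
type EnvChange struct {
	ChangeType string
	Details    map[string]string
}

// buildEnvChange assigns the mapped status constant to ChangeType while the
// environment name stays in Details, as the review recommends.
func buildEnvChange(envName, envStatus string) EnvChange {
	changeType := releaseEnvStatusToChangeType[envStatus]
	return EnvChange{
		ChangeType: changeType, // the mapped status constant, not envName
		Details:    map[string]string{"environment": envName, "status": envStatus},
	}
}

func main() {
	c := buildEnvChange("Staging", "succeeded")
	fmt.Println(c.ChangeType, c.Details["environment"])
}
```

Keeping the environment label in `Details` means downstream severity mapping sees only status constants, while the UI can still display which environment changed.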
🧹 Nitpick comments (3)
scrapers/extract_e2e_test.go (1)
252-268: Cleanup order is correct but has minor redundancy.

The teardown properly handles FK constraints by deleting `config_relationships` first. However, there's minor redundancy: lines 256-258 delete `config_changes` for pre-populated items by `config_id`, while line 265 also deletes `config_changes` for all items by `scraper_id`. This is harmless (DELETE is idempotent) but could be simplified.

♻️ Optional: Remove redundant config_changes deletion
```diff
 for _, id := range createdItems {
-	DefaultContext.DB().Where("config_id = ?", id).Delete(&models.ConfigChange{})
 	DefaultContext.DB().Delete(&models.ConfigItem{}, "id = ?", id)
 }
```

Since line 265 already deletes all `config_changes` for items belonging to this scraper, the per-item deletion in the loop is redundant.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/extract_e2e_test.go` around lines 252 - 268, The teardown currently deletes per-item ConfigChange records inside the createdItems loop via DefaultContext.DB().Where("config_id = ?", id).Delete(&models.ConfigChange{}), but later there is a bulk Exec that already deletes config_changes for all items of this scraper (the Exec "DELETE FROM config_changes WHERE config_id IN (SELECT id FROM config_items WHERE scraper_id = ?)"). Remove the per-item deletion line from the loop (leave deletion of ConfigItem and other per-item cleanup intact) so only the bulk Exec handles config_changes; reference symbols: createdItems, models.ConfigChange, DefaultContext.DB().Where(...).Delete, and the Exec("DELETE FROM config_changes WHERE config_id IN (SELECT id FROM config_items WHERE scraper_id = ?)").

scrapers/azure/devops/releases_test.go (1)
487-492: Consider using a safer type assertion in test.

The unchecked type assertion `details["artifacts"].([]map[string]any)` will panic if the type doesn't match. While in test code a panic effectively fails the test, using Gomega's type-safe assertions would provide clearer error messages.

♻️ Suggested improvement
```diff
-artifacts := details["artifacts"].([]map[string]any)
-Expect(artifacts).To(HaveLen(1))
-Expect(artifacts[0]["definition"]).To(Equal("my-pipeline"))
+Expect(details).To(HaveKey("artifacts"))
+artifacts, ok := details["artifacts"].([]map[string]any)
+Expect(ok).To(BeTrue(), "artifacts should be []map[string]any")
+Expect(artifacts).To(HaveLen(1))
+Expect(artifacts[0]["definition"]).To(Equal("my-pipeline"))
```
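Outside of Gomega, the same comma-ok pattern works in plain Go; this sketch wraps it in a hypothetical helper (`safeArtifacts` is not a name from the PR) to show how the assertion fails gracefully instead of panicking.

```go
package main

import "fmt"

// safeArtifacts shows the comma-ok form of the type assertion: it reports
// failure instead of panicking when the value has an unexpected type.
func safeArtifacts(details map[string]any) ([]map[string]any, bool) {
	v, present := details["artifacts"]
	if !present {
		return nil, false
	}
	artifacts, ok := v.([]map[string]any)
	return artifacts, ok
}

func main() {
	details := map[string]any{
		"artifacts": []map[string]any{{"definition": "my-pipeline"}},
	}
	if artifacts, ok := safeArtifacts(details); ok {
		fmt.Println(artifacts[0]["definition"])
	}
}
```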
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scrapers/azure/devops/releases_test.go` around lines 487 - 492: the test currently does an unchecked type assertion on details["artifacts"] which can panic; change it to a safe assertion by first retrieving the value and asserting its type with Gomega (e.g., use a type-check Expect on details["artifacts"] or assert ok after a comma-ok cast) before accessing fields, then continue with Expect checks on artifacts[0]["definition"], artifacts[0]["version"], artifacts[0]["branch"], and artifacts[0]["isPrimary"].

scrapers/azure/devops/releases.go (1)
97-105: Consider soft-failing GetReleaseDefinition to improve resilience.

The current implementation fails the entire scrape if fetching a release definition fails. Consider whether this should be a soft failure (log a warning, use nil for defJSON, continue) to improve resilience when a single definition is temporarily unavailable.
♻️ Alternative: soft-fail approach
```diff
 defJSON, err := releaseClient.GetReleaseDefinition(ctx, project.Name, def.ID)
 if err != nil {
-	return results.Errorf(err, "failed to get release definition %d", def.ID)
+	ctx.Logger.Warnf("failed to get release definition %d: %v (proceeding without definition config)", def.ID, err)
+	defJSON = nil
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scrapers/azure/devops/releases.go` around lines 97 - 105, The call to releaseClient.GetReleaseDefinition currently returns an error that aborts the whole scrape; change this to soft-fail by catching the error from releaseClient.GetReleaseDefinition(ctx, project.Name, def.ID), logging a warning (including def.ID and project.Name and the error), setting defJSON to nil, and continuing so buildReleaseResult(ctx, config, project, def, defJSON, releases, cutoff) runs with a nil defJSON; ensure buildReleaseResult can handle a nil defJSON (or add a nil check before calling it) and do not return the error from the failed GetReleaseDefinition so a single missing definition doesn't fail the entire scrape.
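The soft-fail flow can be demonstrated in isolation. `fetchDefinition` and `scrapeOne` below are hypothetical stand-ins for `releaseClient.GetReleaseDefinition` and the surrounding scrape loop, invented only to show the log-and-continue shape.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// fetchDefinition is a hypothetical stub standing in for
// releaseClient.GetReleaseDefinition; it always fails to exercise the path.
func fetchDefinition(defID int) (map[string]any, error) {
	return nil, errors.New("definition temporarily unavailable")
}

// scrapeOne demonstrates the soft-fail flow: log a warning and continue with a
// nil definition instead of aborting the whole scrape.
func scrapeOne(defID int) map[string]any {
	defJSON, err := fetchDefinition(defID)
	if err != nil {
		log.Printf("failed to get release definition %d: %v (proceeding without definition config)", defID, err)
		defJSON = nil
	}
	// downstream code must tolerate a nil defJSON
	return defJSON
}

func main() {
	fmt.Println(scrapeOne(42) == nil)
}
```

The trade-off: transient API errors no longer abort the scrape, but every consumer of `defJSON` must handle nil, which is why the review asks to verify `buildReleaseResult` first.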
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 10160dcd-4e0d-42be-b733-b779851e8385
⛔ Files ignored due to path filters (1)
`go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (15)
- fixtures/data/relationships_basic.json
- fixtures/data/relationships_unmatched.json
- go.mod
- jobs/cleanup.go
- scrapers/azure/devops/client.go
- scrapers/azure/devops/pipelines_test.go
- scrapers/azure/devops/releases.go
- scrapers/azure/devops/releases_test.go
- scrapers/extract/testdata/e2e/relationships_lookup.yaml
- scrapers/extract/testdata/e2e/relationships_matched.yaml
- scrapers/extract/testdata/e2e/relationships_parent.yaml
- scrapers/extract/testdata/e2e/relationships_unmatched.yaml
- scrapers/extract_e2e_test.go
- scrapers/logs/logs.go
- scrapers/stale.go
…dling

Updates commons, duty, gomplate, and is-healthy packages. Refactors stale-item age handling to use centralized property resolution instead of repeated fallback logic. Adds LogGroup to log scraper results and simplifies duration parsing in stale config cleanup.
16743b1 to de0d230