fix migration hook lifecycle #560

Merged: fernandol-nvidia merged 1 commit into main from vpan/fix-pgroll-job on Feb 27, 2026
Conversation

@vvnpn-nv
Contributor

Title: Fix migration hook lifecycle and align postgres secret defaults

Summary

  • Fix ConfigMap not found during migration: Changed hook-delete-policy from hook-succeeded to before-hook-creation on both the ConfigMap and the Job. With hook-succeeded, the ConfigMap was deleted immediately after the Job completed, which caused mount failures on subsequent upgrades when the new Job pod was scheduled before the new ConfigMap was available.
  • Keep Job logs accessible: With the Job's delete policy set to before-hook-creation, Helm no longer deletes a completed Job as soon as it succeeds, so its logs stay available for inspection. Kubernetes still cleans the Job up through the existing ttlSecondsAfterFinished: 300, and Helm removes any leftover Job before creating the next one on upgrade.
  • Align postgres secret defaults with rest of chart: The migration Job used configurable secret name/key defaulting to postgres-secret/password, while every other template hardcodes db-secret/db-password. Updated the defaults to match.
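
The three changes above can be pictured with a minimal sketch of the rendered hook resources. This is illustrative only and is not copied from the chart: the resource names, hook events, hook weights, image, mount path, and environment variable name are all assumptions. Only the hook-delete-policy value, ttlSecondsAfterFinished: 300, and the db-secret/db-password defaults come from the description.

```yaml
# Hypothetical rendered output; names, hook events, and weights are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-migration-config        # hypothetical name
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade        # assumed hook events
    "helm.sh/hook-weight": "-1"                    # assumed: created before the Job
    # Was hook-succeeded: Helm deleted the ConfigMap once the hook completed.
    # Now Helm deletes the previous copy only right before creating a new one.
    "helm.sh/hook-delete-policy": before-hook-creation
data:
  migration.json: "{}"                  # placeholder content
---
apiVersion: batch/v1
kind: Job
metadata:
  name: example-migration-job           # hypothetical name
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade        # assumed hook events
    "helm.sh/hook-weight": "0"                     # assumed: runs after the ConfigMap
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  # Kubernetes-side cleanup: the finished Job is removed after 5 minutes.
  ttlSecondsAfterFinished: 300
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: example.invalid/migrations:latest   # placeholder image
          env:
            - name: POSTGRES_PASSWORD   # hypothetical variable name
              valueFrom:
                secretKeyRef:
                  name: db-secret       # default aligned with the other templates
                  key: db-password      # previously postgres-secret/password
          volumeMounts:
            - name: migrations
              mountPath: /migrations    # hypothetical mount path
      volumes:
        - name: migrations
          configMap:
            name: example-migration-config
```

With before-hook-creation on both resources, neither is removed when the hook succeeds, so the Job pod can always mount a ConfigMap that still exists, and a leftover resource from an earlier release is cleared just before its replacement is created.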

Issue - None

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vvnpn-nv vvnpn-nv requested a review from a team as a code owner February 27, 2026 07:09
@fernandol-nvidia fernandol-nvidia merged commit bcf110b into main Feb 27, 2026
9 checks passed
@fernandol-nvidia fernandol-nvidia deleted the vpan/fix-pgroll-job branch February 27, 2026 14:53
RyaliNvidia added a commit that referenced this pull request Feb 28, 2026
* fix: remove role_arn from router cloudwatch log agent config (#430)

* fix: move backports-tarfile comment to its own line to prevent it from being embedded in wheel metadata (#541)

* Enable Workflow Events in CLI (#533)

* Enable Workflow Events in CLI

* Remove error events from workflow events CLI subcommand

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Remove last_n_lines argument from workflow events subcommand

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Remove old UI (#535)

* Remove //ui from github action

* Remove //ui

* Remove unneeded Node from Bazel

* Address follow ups during the //src/ui move (#536)

* Update to pnpm dev

* Give build/push scripts default arguments

* Add instructions

* Update BUILD_AND_TEST

* buildProductionCsp only if we are indeed a production build

* Add navigation progress bar for slow connections (#540)

* Refine Log Viewer (#539)

* Trim extra space at beginning of log

* Log-viewer expand/collapse row

* Clean up envoy response headers (#519)

* Fix resource table listing (#534)

* Fix resource table listing

* lint

* Lint CSS (#542)

* Lint CSS

* Fix css linting

* fix: default database schema, and redeployment schema drop (#544)

* update default database schema

* fix issues with re-deploy

* update schema version variable to be same across charts

* Allow setting _osmo_session cookies for local -> prod development (#548)

* Allow setting _osmo_session cookies for local -> prod development

* Format

* gh-pages: Switch to action deploy (#551)

For the repo size reduction #543 switch to using actions based
deployment of our github pages.

The pr-preview functionality will return in a later PR.

* Add AI Agentic Skills (#555)

* fix: upgrade fastapi to 0.125.0 to resolve starlette CVE (#556)

Upgrade fastapi from 0.115.5 to 0.125.0, which allows starlette
to resolve from 0.41.3 to 0.50.0, fixing GHSA-7f5h-v6ap-rcq8.

FastAPI 0.125.0 is the latest version that still supports pydantic v1,
maintaining compatibility with the existing pydantic==1.10.13 pin.

* Fix OTel collector memory pressure after SDK upgrade (#558)

Increase collector memory_limiter from 30 MiB to 128 MiB and reduce
metric export frequency from 6s to 15s to prevent data drops under
high-cardinality metric load.

* fix migration hook lifecycle (#560)

* Upgrade UI codegen tooling (Orval v7 -> v8) and regenerate (#557)

* Cleanup backend_todos 14, 17

* Fix backend_todo #3

* Orval v7 -> v8 migration + regenerate autogen code

* Format

* Use stronger types

* Remove unused import

* Fix osmo_barrier.py bug with num_nodes=1 (#561)

Fix bug where osmo_barrier.py hangs when num_nodes=1

* Fix oauth2-proxy TOML parse error when using Kubernetes secrets (#563)

* Tweak Cancel/Resubmit to gracefully handle related/unrelated errors (#564)

* Tweak Cancel/Resubmit to gracefully handle related/unrelated errors

* Add back refresh button into cancel toast

* Stabilize UI CI (#567)

* Add helm upgrade validation and remove deprecated values (#568)

* Add helm upgrade validation for 6.0 → 6.2 breaking changes

* Remove deprecated oauth2Filter and secretPaths from chart defaults

* feat: datasets collections, file browser overhaul, and mock fidelity improvements (#569)

* fix: use push history for file browser path navigation to enable back button

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: surface S3 URI through DatasetFile type

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use basePath-aware proxy, expand text type support, and copy S3 path in file preview

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: replace double-click with always-visible leading open-panel button in datasets table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add Browse files button and clickable version rows to dataset details panel

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: move copy button to fixed leading column in file browser, always visible, copy S3 path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: replace version switcher dropdown with prev/next nav and Details panel on file browser page

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: add Home > Datasets prefix links to file browser breadcrumb

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: collapse deep file browser breadcrumb paths with ellipsis

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: sibling folder popover on breadcrumb segment click

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add MidTruncate component and apply to dataset and file name columns

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: format ternary in sibling popover button

* refactor: move open-details button inline in name cell, right-aligned with tooltip

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: hoist dataset details panel to layout level

The details slideout is now mounted once at the /datasets/** route
layout, so it persists across navigation between the list and the file
browser pages. Clicking "Browse files" or a version in the panel no
longer closes and reopens the panel.

Changes:
- Add datasets-panel-store.ts: ephemeral Zustand store (bucket/name/isOpen)
- Add datasets-panel-context.tsx: passes isPanelOpen/openPanel/closePanel to pages
- Add datasets-panel-layout.tsx: ResizablePanel + DatasetPanel at layout level
- Add src/app/(dashboard)/datasets/layout.tsx: Next.js route layout
- DatasetsPageContent: remove local ResizablePanel, use store + context
- DatasetDetailContent: remove outer ResizablePanel, use context for Details toggle
- DatasetPanel: revert accidental ?details=true navigation param

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: move copy-path button inline in file browser name cell

* fix: breadcrumb last segment truncation and copy tooltip confirmation

* feat: merge dataset file browser header into chrome nav

* feat: add onFocusedRowChange callback and j/k/l vim bindings to DataTable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add keyboard navigation to dataset file browser

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: autoplay and loop video in file preview panel

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add CollectionMember type and discriminated DetailResponse to datasets adapter

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add Type column to datasets list table with Collection badge

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add CollectionPanelMembers and update DatasetPanel to handle collections

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: generalize VersionSwitcher and FileBrowserControls to generic SwitcherItem[]

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: render collection members as top-level entries in the file browser

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add collection mock data and interleaved list/info handlers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add mock file manifest for dataset file browser in dev:mock

* fix: route location-files through MSW in mock mode via impl split pattern

* fix: use text/json files in mock manifest so preview panel can render them

* feat: add private dataset 401 and file-proxy MSW interception in mock mode

* fix: isolate copy-path tooltip state per button and add s3 storage_path to mock manifests

- PreviewError now owns its own useCopy() instance so clicking its "Copy path"
  button doesn't also trigger the header copy tooltip
- generateFlatManifest accepts optional locationBase and populates storage_path
  on every RawFileItem so the Copy button appears in the file browser table
- Pass locationUrl from the location-files MSW handler into generateFlatManifest
- Replace http.head + http.get file-proxy handlers with http.all to fix HEAD
  interception failure through the mock port-9999 tunnel

* style: format server-mock-utils.ts

* fix: remove copy path button from error states in file preview panel

* fix: match file browser table header height and border to preview panel header

* style: format data-table and file-preview-panel

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix pool quota api logic and add tests (#570)

* Add RBAC to docs + move keycloak to optional (#562)

* Remove keycloak from docs

* update pat

* lint

* lint

* spell

* Update docs for rbac

* lint

* clean

* update grid;

* update doc

* update names

* remove

* update

* Add keycloak in the appendix

* remove

* Update docs/deployment_guide/appendix/authentication/identity_provider_setup.rst

Co-authored-by: Vivian Pan <vivianp@nvidia.com>

* comments

* more comments

---------

Co-authored-by: Vivian Pan <vivianp@nvidia.com>

---------

Co-authored-by: Ethan Look-Potts <elookpotts@nvidia.com>
Co-authored-by: ethany-nv <ethany@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Fernando L <fernandol@nvidia.com>
Co-authored-by: Vivian Pan <vivianp@nvidia.com>
Co-authored-by: Hans Arnholm <harnholm@nvidia.com>
Co-authored-by: tdewanNvidia <tdewan@nvidia.com>
Co-authored-by: ecolternv <ecolter@nvidia.com>
