VPR-141 feat(healthchecks): add /health endpoints and UI dashboard#159
VPR-141 feat(healthchecks): add /health endpoints and UI dashboard#159
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #159 +/- ##
==========================================
- Coverage 43.40% 42.95% -0.46%
==========================================
Files 863 873 +10
Lines 50476 51046 +570
Branches 4724 4762 +38
==========================================
+ Hits 21909 21925 +16
- Misses 28044 28598 +554
Partials 523 523
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Pull request overview
Adds operational health checking to the web app: anonymous liveness, IP-gated readiness details, and an internal HealthChecks.UI dashboard with UC Davis branding and reduced external probe traffic via adaptive polling.
Changes:
- Introduces
/healthand/health/detailendpoints plus the/healthchecksUI, including IP allowlisting and CSP bypass for the UI bundle. - Adds multiple new health checks (DB contexts, disk space, LDAP, SMTP, CAS/VMACs HTTP probes, AWS SSM) and an adaptive polling decorator to reduce probe frequency when healthy.
- Updates Jenkins deploy stages to poll
/healthafter deploy; adds UI branding assets and a small injected JS enhancer.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| web/wwwroot/js/healthchecks-ui-extras.js | Injected UI script to humanize duration cells and show a campus-status banner. |
| web/wwwroot/healthchecks-ui-logo.png | New logo asset for HealthChecks.UI branding. |
| web/wwwroot/css/healthchecks-ui-branding.css | UC Davis palette + UI CSS tweaks (contrast, layout, banner styling). |
| web/appsettings.json | Expands InternalAllowlist to CIDR ranges for health detail/UI access. |
| web/Viper.csproj | Adds DotNetDiag HealthChecks.UI packages and EF health checks package reference. |
| web/Program.cs | Hooks health check DI + pipeline wiring; conditionally applies CSP outside UI paths. |
| web/Classes/HealthChecks/SmtpHealthCheck.cs | MailKit-based SMTP reachability/TLS probe. |
| web/Classes/HealthChecks/LdapHealthCheck.cs | Real LDAPS bind probe for directory health. |
| web/Classes/HealthChecks/HttpEndpointHealthCheck.cs | Generic HTTP endpoint reachability probe. |
| web/Classes/HealthChecks/HealthCheckExtensions.cs | Centralized health checks registration + endpoint/UI mapping + UI HTML injection. |
| web/Classes/HealthChecks/DiskSpaceHealthCheck.cs | Drive space (and optional writability) checks for app/photos/CMS/log paths. |
| web/Classes/HealthChecks/AwsSsmHealthCheck.cs | AWS SSM reachability probe using a lightweight DescribeParameters call. |
| web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs | Caches health results with status-dependent TTLs to reduce probe load. |
| JenkinsFile | Adds post-deploy /health polling for test and prod. |
There was a problem hiding this comment.
Pull request overview
Adds first-class health check endpoints and an operator-facing HealthChecks.UI dashboard to VIPER, including custom probes (DB, disk, LDAP/CAS/SMTP/SSM) and Jenkins post-deploy verification.
Changes:
- Introduces
/healthliveness,/health/detailreadiness JSON, and/healthchecksUI with IP allowlisting. - Adds multiple custom
IHealthCheckimplementations plus an adaptive polling decorator to reduce external traffic. - Updates Jenkins deploy stages to poll
/healthafter deployment.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| web/wwwroot/js/healthchecks-ui-extras.js | UI-side DOM tweaks (duration humanizer + campus-status banner). |
| web/wwwroot/healthchecks-ui-logo.png | Adds UC Davis-branded logo asset for the dashboard. |
| web/wwwroot/css/healthchecks-ui-branding.css | Custom palette/branding + minor UI layout/accessibility tweaks. |
| web/appsettings.json | Expands InternalAllowlist to CIDR ranges for readiness/UI access. |
| web/Viper.csproj | Adds HealthChecks.UI + EF Core health check package references. |
| web/Program.cs | Hooks health check DI + pipeline wiring; skips CSP on UI paths. |
| web/Classes/HealthChecks/SmtpHealthCheck.cs | Adds SMTP relay probe via MailKit connect/noop/disconnect. |
| web/Classes/HealthChecks/LdapHealthCheck.cs | Adds LDAPS bind probe matching existing LDAP service settings. |
| web/Classes/HealthChecks/HttpEndpointHealthCheck.cs | Adds generic HTTP reachability probe for CAS/VMACs. |
| web/Classes/HealthChecks/HealthCheckExtensions.cs | Centralizes health check registration, endpoint mapping, UI config, and response-body script injection. |
| web/Classes/HealthChecks/DiskSpaceHealthCheck.cs | Adds disk free-space (and optional writability) probe for key volumes. |
| web/Classes/HealthChecks/AwsSsmHealthCheck.cs | Adds lightweight SSM reachability probe. |
| web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs | Adds status-based caching to reduce expensive probe frequency. |
| JenkinsFile | Adds post-deploy /health polling for test and prod stages. |
Bundle ReportBundle size has no change ✅ |
80e474c to
24f416c
Compare
There was a problem hiding this comment.
Pull request overview
Adds first-class health checking to VIPER, including liveness/readiness endpoints for deploy automation and an IP-gated HealthChecks.UI dashboard tailored to campus ops needs.
Changes:
- Introduces
/health(anonymous liveness) and/health/detail(IP-gated readiness with tagged checks) plus HealthChecks.UI at/healthchecks. - Adds health check implementations (LDAP, SMTP, HTTP endpoint probes, disk space, AWS SSM) and an adaptive polling decorator to reduce external probe traffic.
- Updates Jenkins deploy stages to poll
/2/healthpost-deploy; adds UC Davis branding + UI tweaks (duration humanizer + campus-status banner).
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| web/wwwroot/js/healthchecks-ui-extras.js | Injected UI tweaks (duration humanizer, campus-status banner) via MutationObserver. |
| web/wwwroot/css/healthchecks-ui-branding.css | UC Davis palette + UI readability adjustments + campus-status banner styling. |
| web/appsettings.json | Replaces a single internal allowlisted IP with CIDR ranges for staff + infra. |
| web/Viper.csproj | Adds DotNetDiag HealthChecks.UI packages and EF Core health check package. |
| web/Program.cs | Hooks health check DI + pipeline wiring; skips CSP on HealthChecks.UI paths. |
| web/Classes/HealthChecks/SmtpHealthCheck.cs | MailKit-based SMTP reachability probe. |
| web/Classes/HealthChecks/LdapHealthCheck.cs | Real LDAPS bind probe using existing LDAP service credentials. |
| web/Classes/HealthChecks/HttpEndpointHealthCheck.cs | HTTP(S) reachability probe (treats non-5xx as healthy). |
| web/Classes/HealthChecks/HealthCheckExtensions.cs | Centralizes health check registration, endpoint mapping, UI wiring, and IP gating. |
| web/Classes/HealthChecks/DiskSpaceHealthCheck.cs | Disk free-space (and optional writability) probe for key volumes/paths. |
| web/Classes/HealthChecks/AwsSsmHealthCheck.cs | AWS SSM reachability probe via DescribeParameters. |
| web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs | Caches healthy vs unhealthy results for different durations to reduce probe load. |
| JenkinsFile | Adds post-deploy polling of /2/health for TEST and PROD. |
There was a problem hiding this comment.
Pull request overview
Adds operational health monitoring to VIPER by introducing anonymous liveness and IP-gated readiness endpoints, plus a branded HealthChecks.UI dashboard tailored for the campus environment (path base, reduced external probing, and operator-focused UI tweaks).
Changes:
- Adds
/healthliveness and/health/detailreadiness endpoints plus a HealthChecks.UI dashboard (IP-restricted, CSP-exempted for UI paths). - Implements custom health checks (LDAP bind, SMTP connect, HTTP probe, AWS SSM probe, disk space) and an adaptive polling decorator to reduce external traffic.
- Updates Jenkins deploy stages to poll
/healthafter deployment and updates allowlists to CIDR ranges.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| web/wwwroot/js/healthchecks-ui-extras.js | Injected client-side UI enhancements (duration formatting + campus-status banner). |
| web/wwwroot/css/healthchecks-ui-branding.css | HealthChecks.UI UC Davis branding + accessibility/UX tweaks. |
| web/appsettings.json | Replaces single internal IP with CIDR-based allowlist ranges. |
| web/Viper.csproj | Adds DotNetDiag HealthChecks.UI packages and EF Core health check package. |
| web/Program.cs | Wires health checks via extensions and skips CSP for HealthChecks.UI paths. |
| web/Classes/HealthChecks/SmtpHealthCheck.cs | New MailKit-based SMTP probe health check. |
| web/Classes/HealthChecks/LdapHealthCheck.cs | New LDAPS bind health check (Windows-only). |
| web/Classes/HealthChecks/HttpEndpointHealthCheck.cs | New HTTP reachability probe health check. |
| web/Classes/HealthChecks/HealthCheckExtensions.cs | Central DI + pipeline wiring for health endpoints/UI, IP gating, and script injection. |
| web/Classes/HealthChecks/DiskSpaceHealthCheck.cs | New disk space (and optional writability) health check. |
| web/Classes/HealthChecks/AwsSsmHealthCheck.cs | New AWS SSM reachability health check. |
| web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs | New decorator to cache healthy results longer than unhealthy ones. |
| JenkinsFile | Adds post-deploy /health polling for test and prod stages. |
a465446 to
77f2c9f
Compare
864b4a4 to
462496d
Compare
1058a5c to
e2e1665
Compare
|
@bsedwards The detailed health check is at https://secure-test.vetmed.ucdavis.edu/2/healthchecks It is IP restricted to:
|
bsedwards
left a comment
There was a problem hiding this comment.
A couple things:
- Cloudflare is only in front of secure-test at the moment. We are working on getting internal NAT'ed IPs not routed through Cloudflare so we can "see" the internal IP and not the F5's external IP. Cloudflare will be in front of Prod servers at some point, but Ops is still working out the kinks.
- Can we either restrict the detail and UI health check endpoints to just developer and system (e.g. Jenkins) IPs, or require some other token based or basic authentication? Allowing access to entire networks is overly broad.
|
Important Review skippedAuto reviews are limited based on label configuration. 🏷️ Required labels (at least one) (1)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughIssue: insufficient health observability and incorrect client-IP forwarding through Cloudflare/F5. Fix: add comprehensive HealthChecks + UI, collector auth and token handler, Cloudflare CIDR startup fetch & forwarded-headers wiring, UI assets/tests/project deps, and Jenkins post-deploy /2/health probe steps. ChangesHealth checks, UI, forwarding, and startup
CI post-deploy health verification
Sequence Diagram(s)sequenceDiagram
participant Jenkins as Jenkins Pipeline
participant App as Application
participant CF as Cloudflare
participant Forward as ForwardedHeaders middleware
participant UI as HealthChecks.UI Collector
participant Probe as /2/health endpoint
App->>CF: GET ips-v4 / ips-v6
alt fetch success
CF-->>App: CIDR lists
else fetch failure
CF-->>App: Hardcoded CIDR snapshot
end
App->>Forward: Configure KnownIPNetworks with CF CIDRs\nand set ForwardLimit=2
App->>UI: Register health checks and configure HealthChecks.UI (token handler)
Jenkins->>Probe: Post-deploy retry loop calling /2/health
alt 200 OK
Probe-->>Jenkins: 200 OK (pipeline continues)
else final failure
Probe-->>Jenkins: non-200 / timeout (pipeline fails)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@JenkinsFile`:
- Around line 146-165: The JenkinsFile contains duplicated PowerShell
health-check blocks; extract them into a single reusable Groovy function (e.g.,
verifyHealth(String url)) that wraps the existing powershell step logic and
accepts the URL as a parameter, then replace the two inline powershell blocks
with calls to verifyHealth("<env-specific-url>"); ensure the function preserves
the maxAttempts, delay logic, try/catch and exit codes exactly and keep the
powershell label 'Verify /health returns 200' so behavior and logs remain
unchanged.
In `@web/Classes/CloudflareNetworks.cs`:
- Around line 40-57: The current catch in FetchOrFallback only handles
HttpRequestException and TaskCanceledException, but DNS/socket failures can
surface as other exception types; change the exception handling on
FetchOrFallback to catch broader failures (e.g., catch Exception ex or add
SocketException) so any network/DNS/socket error is logged via logger.Warn and
the method returns HardcodedFallback; keep the existing log message and return
behavior, and ensure the method still uses HardcodedFallback when any
network-related exception occurs.
In `@web/Classes/HealthChecks/AwsSsmHealthCheck.cs`:
- Around line 36-46: The health check in AwsSsmHealthCheck currently calls
DescribeParametersAsync which requires different IAM rights than the config
loader; change the probe to call GetParametersByPathAsync (or GetParameterAsync
for a known key) against the same path/key your configuration loader uses so the
health check uses the same IAM permissions. In practice, inside the health check
method replace the DescribeParametersAsync call with a call to
client.GetParametersByPathAsync(new GetParametersByPathRequest { Path = <your
configured path prefix>, MaxResults = 1 }, cancellationToken) (or
client.GetParameterAsync for a single known parameter), ensure you pass
cancellationToken, and keep the existing exception handling that checks
_healthyWhenMissing and returns HealthCheckResult accordingly; update references
in AwsSsmHealthCheck to use the configuration path field (e.g. the class’s
path/prefix member) along with existing _region and _healthyWhenMissing.
In `@web/Classes/HealthChecks/HealthCheckExtensions.cs`:
- Around line 139-143: Wrap the call that constructs a Uri from
EmailSettings:BaseUrl in a safe check inside
AddViperHealthChecks/HealthCheckExtensions: instead of directly calling new
Uri(baseUrl).AbsolutePath, use Uri.TryCreate to validate the baseUrl (or catch
UriFormatException around new Uri) and if it fails log a warning and fall back
to a safe default path (e.g., "/") before continuing to register the health
check; update the code around the new Uri(baseUrl).AbsolutePath reference so
malformed EmailSettings:BaseUrl no longer crashes startup and the
HttpEndpointHealthCheck receives a validated/fallback path.
- Around line 194-203: HealthCheckExtensions.cs currently constructs a Uri from
configuration["EmailSettings:BaseUrl"] which can throw UriFormatException on
malformed/scheme-less values; change the logic that computes healthEndpointUrl
to validate baseUrl using Uri.TryCreate(..., UriKind.Absolute, out var uri)
(matching the pattern used in VerificationService.cs) and when TryCreate fails
or baseUrl is null/whitespace, set healthEndpointUrl to "/health/detail"; when
TryCreate succeeds use uri.AbsolutePath.TrimEnd('/') to build
$"http://localhost{pathBase}/health/detail".
In `@web/Classes/HealthChecks/HttpEndpointHealthCheck.cs`:
- Around line 39-43: The health check in HttpEndpointHealthCheck currently calls
client.GetAsync(_url, cancellationToken) which buffers the entire response body;
change the call in the CheckHealthAsync method to use the overload that includes
HttpCompletionOption.ResponseHeadersRead so only headers are read
(client.GetAsync(_url, HttpCompletionOption.ResponseHeadersRead,
cancellationToken)), preserving the existing response disposal and status code
logic so the probe returns based on StatusCode without downloading the full
body.
In `@web/wwwroot/css/healthchecks-ui-branding.css`:
- Around line 49-55: Update the decorative left border in the
`#campus-status-banner` rule to use rem units instead of pixels; replace the
hardcoded "4px" value for "border-left" with an equivalent rem measurement
(e.g., 0.25rem or whatever matches your base font sizing) so the selector
"#campus-status-banner" follows the project's sizing guideline using rem units.
In `@web/wwwroot/js/healthchecks-ui-extras.js`:
- Around line 43-54: The hasUnhealthyCampusCheck function uses a fragile DOM
selector (row.querySelector(".hc-status .material-icons") and comparing
icon.textContent to "check_circle") that is tightly coupled to Xabaril UI
internals; add an inline comment immediately above this selector explaining that
this check depends on the current forked UI structure/icon name, that changes
upstream could break it, and mark a TODO to replace with a more robust status
API or data-attribute if/when available (include function name
hasUnhealthyCampusCheck and the selector string in the comment for clarity).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 60ec9da2-248f-413f-bcda-83460be8a905
📒 Files selected for processing (14)
JenkinsFileweb/Classes/CloudflareNetworks.csweb/Classes/HealthChecks/AdaptivePollingHealthCheck.csweb/Classes/HealthChecks/AwsSsmHealthCheck.csweb/Classes/HealthChecks/DiskSpaceHealthCheck.csweb/Classes/HealthChecks/HealthCheckExtensions.csweb/Classes/HealthChecks/HttpEndpointHealthCheck.csweb/Classes/HealthChecks/LdapHealthCheck.csweb/Classes/HealthChecks/SmtpHealthCheck.csweb/Program.csweb/Viper.csprojweb/appsettings.jsonweb/wwwroot/css/healthchecks-ui-branding.cssweb/wwwroot/js/healthchecks-ui-extras.js
a83c9f1 to
b8de30c
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (4)
web/Classes/CloudflareNetworks.cs (1)
52-56:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winException filter may not catch all network/DNS failures.
SocketExceptionand DNS failures are not always wrapped asHttpRequestExceptiondepending on runtime. Past review flagged this.Proposed fix
- catch (Exception ex) when (ex is HttpRequestException or TaskCanceledException) + catch (Exception ex)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/Classes/CloudflareNetworks.cs` around lines 52 - 56, The current exception filter in CloudflareNetworks.cs (the catch block that logs via logger.Warn and returns HardcodedFallback) misses some network/DNS failures like SocketException or cases where those are wrapped as inner exceptions; update the catch logic in the method that fetches Cloudflare IP ranges to also handle SocketException and to inspect inner exceptions for socket/DNS errors (or fall back to a non-filtered catch that detects network-related inner exceptions) so those failures are caught, logged via logger.Warn with the exception instance, and HardcodedFallback is returned as before.web/wwwroot/css/healthchecks-ui-branding.css (1)
53-53:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
4pxborder-left should be0.25rem.Per coding guidelines,
remis required for sizing.Proposed fix
- border-left: 4px solid var(--warningColor); + border-left: 0.25rem solid var(--warningColor);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/wwwroot/css/healthchecks-ui-branding.css` at line 53, The CSS rule in healthchecks-ui-branding.css uses a pixel value for the left border; replace border-left: 4px solid var(--warningColor); with a rem-based value per guidelines by changing it to border-left: 0.25rem solid var(--warningColor); (locate the declaration of border-left in the healthchecks-ui-branding.css rule where var(--warningColor) is used and update the unit to rem).web/Classes/HealthChecks/HttpEndpointHealthCheck.cs (1)
39-39:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
GetAsyncbuffers the full response body; only the status code is needed.Use
HttpCompletionOption.ResponseHeadersReadto avoid downloading potentially large bodies on every health-check poll. Past review flagged this.Proposed fix
- using var response = await client.GetAsync(_url, cancellationToken); + using var response = await client.GetAsync( + _url, + HttpCompletionOption.ResponseHeadersRead, + cancellationToken);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/Classes/HealthChecks/HttpEndpointHealthCheck.cs` at line 39, The health-check is currently calling client.GetAsync(_url, cancellationToken) which buffers the entire response body; change the call in HttpEndpointHealthCheck (where GetAsync is invoked) to use the overload with HttpCompletionOption.ResponseHeadersRead so only headers are read (e.g., call client.GetAsync(_url, HttpCompletionOption.ResponseHeadersRead, cancellationToken)), and ensure you do not attempt to read the response content body afterwards so the response can be disposed promptly.web/wwwroot/js/healthchecks-ui-extras.js (1)
43-54: 🧹 Nitpick | 🔵 Trivial | 💤 Low valueFragile DOM coupling on
.hc-status .material-icons/"check_circle"text.This selector chain and icon name are tightly coupled to the pinned fork's internal bundle. Silent breakage if the fork changes icon naming or structure. Past review flagged this.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/wwwroot/js/healthchecks-ui-extras.js` around lines 43 - 54, The helper hasUnhealthyCampusCheck() is fragile because it relies on the icon DOM structure and literal icon text ("check_circle"); update it to detect health via a stable semantic attribute or class instead: locate the row by the campus name (function: hasUnhealthyCampusCheck), then read a stable indicator from the row's hc-status container (e.g. a data-status attribute like data-health="healthy"/"unhealthy" or a status class such as "status-healthy"/"status-unhealthy") rather than querying ".hc-status .material-icons" and comparing text to "check_circle"; add a safe fallback (treat unknown/missing indicators as unhealthy) so changes to icon names/markup won't silently break checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/HealthChecks/HealthCheckCollectorTokenHandlerTests.cs`:
- Line 33: In HealthCheckCollectorTokenHandlerTests update the call to
invoker.SendAsync so the HttpResponseMessage returned is captured and disposed
same as the other tests: wrap the await of invoker.SendAsync(request,
CancellationToken.None) in a using declaration (matching the pattern used on
lines 15 and 48) to ensure the response is disposed; locate the invocation in
the test method inside HealthCheckCollectorTokenHandlerTests and apply the using
var _ = pattern around the await expression.
In `@web/Classes/HealthChecks/DiskSpaceHealthCheck.cs`:
- Around line 128-131: The calculation of freePercent can divide by zero because
drive.TotalSize (totalBytes) may be 0; update the DiskSpaceHealthCheck code
where freeBytes, totalBytes, freeGb and freePercent are computed to defensively
handle totalBytes == 0 by checking totalBytes before dividing (e.g., if
totalBytes > 0 compute freePercent = Math.Round(freeBytes * 100.0 / totalBytes,
1) else set freePercent to 0 or a safe default/nullable), keep the freeGb
computation unchanged, and ensure any downstream consumers of freePercent can
handle the default value.
In `@web/Classes/HealthChecks/LdapHealthCheck.cs`:
- Around line 35-46: Task.Run wrapping the synchronous LdapConnection.Bind()
call doesn't allow cancellationToken to abort an in-progress Bind(), so external
cancellation (e.g., shutdown) can hang until lc.Timeout elapses; change the
approach in LdapHealthCheck by replacing the direct Task.Run(() => { ...
lc.Bind(); }, cancellationToken) with an async pattern that races the bind
against cancellation—create a Task that performs the synchronous Bind on the
LdapConnection (or use a Task.Run without the token) and then use
Task.WhenAny(bindTask, cancellationToken.AsTask()) to detect cancellation, and
if cancellation wins throw/return early; ensure you still set lc.Timeout and
dispose LdapConnection (LdapDirectoryIdentifier/LdapConnection/lc) properly when
cancelling.
- Around line 60-63: The catch for OperationCanceledException in LdapHealthCheck
(the block returning HealthCheckResult.Unhealthy("LDAP bind timed out.")) has a
misleading message; update the handler in the method that performs the LDAP bind
to return a result that reflects cancellation instead of a timeout (e.g., "LDAP
bind was cancelled") or inspect cancellationToken.IsCancellationRequested and
include that fact in the message; do not claim a timeout (which throws
LdapException) — update the HealthCheckResult.Unhealthy message accordingly in
the OperationCanceledException catch block.
---
Duplicate comments:
In `@web/Classes/CloudflareNetworks.cs`:
- Around line 52-56: The current exception filter in CloudflareNetworks.cs (the
catch block that logs via logger.Warn and returns HardcodedFallback) misses some
network/DNS failures like SocketException or cases where those are wrapped as
inner exceptions; update the catch logic in the method that fetches Cloudflare
IP ranges to also handle SocketException and to inspect inner exceptions for
socket/DNS errors (or fall back to a non-filtered catch that detects
network-related inner exceptions) so those failures are caught, logged via
logger.Warn with the exception instance, and HardcodedFallback is returned as
before.
In `@web/Classes/HealthChecks/HttpEndpointHealthCheck.cs`:
- Line 39: The health-check is currently calling client.GetAsync(_url,
cancellationToken) which buffers the entire response body; change the call in
HttpEndpointHealthCheck (where GetAsync is invoked) to use the overload with
HttpCompletionOption.ResponseHeadersRead so only headers are read (e.g., call
client.GetAsync(_url, HttpCompletionOption.ResponseHeadersRead,
cancellationToken)), and ensure you do not attempt to read the response content
body afterwards so the response can be disposed promptly.
In `@web/wwwroot/css/healthchecks-ui-branding.css`:
- Line 53: The CSS rule in healthchecks-ui-branding.css uses a pixel value for
the left border; replace border-left: 4px solid var(--warningColor); with a
rem-based value per guidelines by changing it to border-left: 0.25rem solid
var(--warningColor); (locate the declaration of border-left in the
healthchecks-ui-branding.css rule where var(--warningColor) is used and update
the unit to rem).
In `@web/wwwroot/js/healthchecks-ui-extras.js`:
- Around line 43-54: The helper hasUnhealthyCampusCheck() is fragile because it
relies on the icon DOM structure and literal icon text ("check_circle"); update
it to detect health via a stable semantic attribute or class instead: locate the
row by the campus name (function: hasUnhealthyCampusCheck), then read a stable
indicator from the row's hc-status container (e.g. a data-status attribute like
data-health="healthy"/"unhealthy" or a status class such as
"status-healthy"/"status-unhealthy") rather than querying ".hc-status
.material-icons" and comparing text to "check_circle"; add a safe fallback
(treat unknown/missing indicators as unhealthy) so changes to icon names/markup
won't silently break checks.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f4de8484-510e-475e-9fd3-5365e84a4997
📒 Files selected for processing (18)
JenkinsFiletest/HealthChecks/HealthCheckCollectorAuthTests.cstest/HealthChecks/HealthCheckCollectorTokenHandlerTests.csweb/Classes/CloudflareNetworks.csweb/Classes/HealthChecks/AdaptivePollingHealthCheck.csweb/Classes/HealthChecks/AwsSsmHealthCheck.csweb/Classes/HealthChecks/DiskSpaceHealthCheck.csweb/Classes/HealthChecks/HealthCheckCollectorAuth.csweb/Classes/HealthChecks/HealthCheckCollectorTokenHandler.csweb/Classes/HealthChecks/HealthCheckExtensions.csweb/Classes/HealthChecks/HttpEndpointHealthCheck.csweb/Classes/HealthChecks/LdapHealthCheck.csweb/Classes/HealthChecks/SmtpHealthCheck.csweb/Program.csweb/Viper.csprojweb/appsettings.jsonweb/wwwroot/css/healthchecks-ui-branding.cssweb/wwwroot/js/healthchecks-ui-extras.js
0061f85 to
9c925bc
Compare
There was a problem hiding this comment.
Actionable comments posted: 6
♻️ Duplicate comments (1)
web/wwwroot/css/healthchecks-ui-branding.css (1)
49-54:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winReplace
pxunit withremin banner border width.Line 53 uses
4px; switch to an equivalentremvalue to match project sizing rules.🛠️ Proposed fix
- border-left: 4px solid var(--warningColor); + border-left: 0.25rem solid var(--warningColor);As per coding guidelines, "Use
rem(notpx) for spacing, sizing, and typography."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/wwwroot/css/healthchecks-ui-branding.css` around lines 49 - 54, The CSS rule for `#campus-status-banner` uses a px unit for the left border width; replace "border-left: 4px solid var(--warningColor);" with the equivalent rem unit (4px → 0.25rem assuming a 16px root) so the line becomes a rem-based border width while keeping the same color and style.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@JenkinsFile`:
- Around line 113-114: Update the health probe URL assignments that currently
set $url to "https://secure-test.vetmed.ucdavis.edu/2/health" so they point to
the mapped endpoint "/health" instead (i.e., use
"https://secure-test.vetmed.ucdavis.edu/health"); locate both assignments of the
$url variable in the JenkinsFile (the two places that currently contain
"/2/health") and change them to "/health" to prevent liveness probe false
negatives.
In `@web/Classes/HealthChecks/DiskSpaceHealthCheck.cs`:
- Around line 119-123: The current catch block in DiskSpaceHealthCheck that uses
"catch (Exception ex) when (ex is IOException or UnauthorizedAccessException)"
should be replaced with two specific catches to preserve identical behavior: add
a "catch (IOException ex)" and a "catch (UnauthorizedAccessException ex)" each
containing the same best-effort cleanup comment/body that exists now; this keeps
functionality the same but avoids catching broad Exception. Locate the cleanup
logic in the method on the DiskSpaceHealthCheck class and replace the filtered
Exception catch with these two concrete exception handlers (maintain the comment
about unique name / 0-byte file).
In `@web/Classes/HealthChecks/HealthCheckExtensions.cs`:
- Around line 85-99: The code currently only registers the "disk-space-logs"
readiness check when configuration["LoggingPath"] is non-empty, which hides
missing/misspelled LoggingPath; remove the if-check so the builder.AddCheck call
always runs and pass the configuration value (variable loggingPath from
configuration["LoggingPath"]) into DiskSpaceHealthCheck (explicitDrivePath:
loggingPath, requirePathExists: true, verifyWritable: true) so the health check
itself will fail/alert if the path is missing or invalid; keep the same check
name ("disk-space-logs") and tags ("ready").
In `@web/Classes/HealthChecks/SmtpHealthCheck.cs`:
- Around line 82-85: The catch for OperationCanceledException in SmtpHealthCheck
should distinguish external caller cancellation from internal timeout: when
using CreateLinkedTokenSource(cancellationToken) + CancelAfter(_timeout),
inspect the original cancellationToken.IsCancellationRequested inside the catch
to determine if the cancellation came from the caller (return Unhealthy("SMTP
cancelled by caller.") or similar) versus the linked token timing out (return
Unhealthy($"SMTP timed out after {_timeout}.")). Update the catch block that
handles OperationCanceledException accordingly and keep references to
CreateLinkedTokenSource, cancellationToken, _timeout, and the
OperationCanceledException handling in SmtpHealthCheck.
In `@web/wwwroot/js/healthchecks-ui-extras.js`:
- Around line 13-25: The formatDuration function and DURATION_PATTERN only match
when a fractional part exists, so whole-second TimeSpan strings like "00:00:02"
bypass humanization; update DURATION_PATTERN to allow an optional fractional
part (e.g. make the \.\d+ portion optional) and change the regex used inside
formatDuration to handle a missing group 4 by treating the fractional string as
"0" (pad/slice as before) so fracMs computes correctly; ensure formatDuration
still returns the original text if the overall pattern fails but otherwise
proceeds to compute hours/mins/secs/fracMs and humanize as before.
- Around line 49-51: The current check treats missing status icons as healthy
because it only flags rows when an icon exists and is not "check_circle"; update
the condition around the icon (the variable from row.querySelector(".hc-status
.material-icons")) so that missing icons are considered a problem—i.e., change
the logic from "if (icon && icon.textContent.trim() !== 'check_circle') return
true;" to treat "!icon" as a failure case as well (use a condition that returns
true when icon is absent or its textContent trimmed is not "check_circle").
---
Duplicate comments:
In `@web/wwwroot/css/healthchecks-ui-branding.css`:
- Around line 49-54: The CSS rule for `#campus-status-banner` uses a px unit for
the left border width; replace "border-left: 4px solid var(--warningColor);"
with the equivalent rem unit (4px → 0.25rem assuming a 16px root) so the line
becomes a rem-based border width while keeping the same color and style.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b5c0a639-13ee-40ec-914d-c42822f2584a
📒 Files selected for processing (18)
JenkinsFiletest/HealthChecks/HealthCheckCollectorAuthTests.cstest/HealthChecks/HealthCheckCollectorTokenHandlerTests.csweb/Classes/CloudflareNetworks.csweb/Classes/HealthChecks/AdaptivePollingHealthCheck.csweb/Classes/HealthChecks/AwsSsmHealthCheck.csweb/Classes/HealthChecks/DiskSpaceHealthCheck.csweb/Classes/HealthChecks/HealthCheckCollectorAuth.csweb/Classes/HealthChecks/HealthCheckCollectorTokenHandler.csweb/Classes/HealthChecks/HealthCheckExtensions.csweb/Classes/HealthChecks/HttpEndpointHealthCheck.csweb/Classes/HealthChecks/LdapHealthCheck.csweb/Classes/HealthChecks/SmtpHealthCheck.csweb/Program.csweb/Viper.csprojweb/appsettings.jsonweb/wwwroot/css/healthchecks-ui-branding.cssweb/wwwroot/js/healthchecks-ui-extras.js
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@web/Classes/HealthChecks/DiskSpaceHealthCheck.cs`:
- Around line 43-57: The DiskSpaceHealthCheck constructor currently accepts any
numeric thresholds; add validation in the DiskSpaceHealthCheck constructor to
enforce invariants: ensure criticalFreePercent and warningFreePercent are within
a sensible range (e.g., 0.0–100.0), that warningFreePercent is strictly greater
than criticalFreePercent, and reject negative values; if validation fails, throw
ArgumentOutOfRangeException or ArgumentException with a clear message. Update
checks around the fields _criticalFreePercent and _warningFreePercent so
construction fails fast when parameters violate these rules.
In `@web/Program.cs`:
- Around line 95-98: The loop over cloudflareCidrs should guard against
malformed CIDRs by wrapping the call to System.Net.IPNetwork.Parse(cidr) in a
try-catch that catches FormatException, skips the invalid cidr, and logs the
error; update the foreach (var cidr in cloudflareCidrs) block that calls
options.KnownIPNetworks.Add(System.Net.IPNetwork.Parse(cidr)) to perform Parse
inside try, catch FormatException, and emit a descriptive log entry (including
the offending cidr and exception) before continuing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a49a9bf8-a096-49ce-9c9d-9c25e42d84b1
📒 Files selected for processing (2)
web/Classes/HealthChecks/DiskSpaceHealthCheck.csweb/Program.cs
576b611 to
ab04023
Compare
ab04023 to
37a7f95
Compare
64e16bb to
5a816db
Compare
- Cloudflare fronts vetmed.ucdavis.edu, so the proxy chain is User -> CF -> F5 -> app. Without trusting CF as a known proxy, UseForwardedHeaders ignores XFF and RemoteIpAddress stays at the CF edge - breaking every IP-based allowlist in the app, not just health checks. - Fetches CF's published v4/v6 CIDRs from cloudflare.com at startup (cached for the process lifetime) and adds them to KnownIPNetworks. Falls back to a hardcoded snapshot on fetch failure so a CF blip during deploy doesn't break startup. - Bumps ForwardLimit to 2 so the chain walk continues through both proxy hops (F5 and CF) and lands on the real client IP rather than stopping at the CF edge.
- /health: anonymous liveness for Jenkins probes. - /health/detail: per-check JSON (UI format) gated by InternalAllowlist (developer IPs only). Intentionally not CAS-gated so it stays reachable when auth subsystems are degraded. - /healthchecks: HealthChecks.UI dashboard, IP-gated to the same list. - Database, disk, AWS SSM, LDAP, CAS, SMTP, and VMACs probes; campus-* external probes wrapped in adaptive polling so a healthy check is cached for an hour and a failing one re-probes every 5 min. - The in-app HealthChecksUI collector polls /health/detail through the IP filter without widening the allowlist: a process-startup random token is stamped on every outbound poll by a delegating handler (UseApiEndpointDelegatingHandler), and the endpoint filter bypasses the IP check on a constant-time match. Token cycles per process, so it can't be replayed across deploys. - Unit tests cover the constant-time matcher, header lookup (case-insensitive), and the delegating handler's stamp/replace/ forward behaviors.
5a816db to
28e397f
Compare
Summary by CodeRabbit
New Features
Improvements
Tests