Skip to content

Build CVE Matter-Analysis OS: Python 3.11 blue-team platform with ML, K8s, and GPU support#1

Open
Copilot wants to merge 11 commits into
mainfrom
copilot/build-private-cve-analysis-repo
Open

Build CVE Matter-Analysis OS: Python 3.11 blue-team platform with ML, K8s, and GPU support#1
Copilot wants to merge 11 commits into
mainfrom
copilot/build-private-cve-analysis-repo

Conversation

Copy link
Copy Markdown

Copilot AI commented Nov 12, 2025

CVE Matter-Analysis OS - Code Quality Improvements

Changes Made:

Code Quality & DRY Principle:

  • ✅ Created shared cve_matter/constants.py module for SEVERITY_MAP constant
  • ✅ Refactored procrustes.py and cca.py to use shared constant (eliminates code duplication)
  • ✅ Fixed CV score computation to respect criteria parameter in model_selection.py
  • ✅ Simplified AIC to always compute (removed unnecessary conditional)

Documentation:

  • ✅ Fixed TERRAFORM.md to use correct "NoSchedule" casing (was "NO_SCHEDULE")

Infrastructure:

  • ✅ Improved Terraform output for GPU node pool to provide clearer message when disabled

Benefits:

  • Maintainability: Single source of truth for severity mappings
  • Consistency: CV scores only computed when explicitly requested
  • Clarity: Better error messages and documentation accuracy

Test Results:

  • ✅ All 20 tests passing
  • ✅ No breaking changes
  • ✅ Improved code organization

Summary:

Addressed 5 code review comments to improve code quality, eliminate duplication, and fix documentation errors. The severity mapping is now centralized in a constants module, making it easier to maintain and ensuring consistency across alignment modules.

Original prompt

Build a private blue-team repo: CVE Matter-Analysis OS. Python 3.11 CLI with modules: ingest (NVD), alignment (Procrustes/CCA), arbiter (super-learner), refractors (ε), evidence (BIC/WAIC). Add tests; Docker + optional CUDA; K8s: gVisor RuntimeClass, AdmissionWebhook + PolicyTrigger CRD; Argo Workflow for GPU ε-sweep; Terraform for GKE + GPU nodepool; GitHub Actions (CI, CodeQL, Trivy); SECURITY.md (CVD); .copilot tasks; config/matter.yaml. No offensive/crypto-breaking.


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI and others added 4 commits November 12, 2025 19:24
Co-authored-by: igor-holt <125706350+igor-holt@users.noreply.github.com>
Co-authored-by: igor-holt <125706350+igor-holt@users.noreply.github.com>
Co-authored-by: igor-holt <125706350+igor-holt@users.noreply.github.com>
Co-authored-by: igor-holt <125706350+igor-holt@users.noreply.github.com>
Copilot AI changed the title [WIP] Build private blue-team repo for CVE Matter-Analysis OS Build CVE Matter-Analysis OS: Python 3.11 blue-team platform with ML, K8s, and GPU support Nov 12, 2025
@igor-holt Igor Holt (igor-holt) requested review from Copilot and removed request for Igor Holt (igor-holt) November 12, 2025 20:59
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a comprehensive defensive security platform for CVE analysis called "CVE Matter-Analysis OS". The implementation includes a Python 3.11 CLI with multiple analysis modules, complete infrastructure setup for Kubernetes and GKE, Docker containerization with optional CUDA support, CI/CD pipelines, and extensive documentation.

Key Changes:

  • Python 3.11 CLI with 5 core modules (ingest, alignment, arbiter, refractors, evidence)
  • Multi-stage Docker builds with CPU and CUDA support
  • Kubernetes deployment with gVisor RuntimeClass, AdmissionWebhook, and PolicyTrigger CRD
  • Argo Workflows for GPU-accelerated batch processing
  • Terraform configuration for GKE cluster with GPU node pools
  • Complete CI/CD with GitHub Actions (CodeQL, Trivy, testing)
  • Comprehensive security documentation and CVD policy

Reviewed Changes

Copilot reviewed 41 out of 42 changed files in this pull request and generated 14 comments.

Show a summary per file
File Description
tests/*.py Unit tests for all core modules with 67% coverage
tests/conftest.py Pytest fixtures for test data generation
cve_matter/ingest/ NVD API client with rate limiting and mock fallback
cve_matter/alignment/ Procrustes and CCA implementations for feature alignment
cve_matter/arbiter/ Super-learner ensemble with multiple base learners
cve_matter/refractors/ Epsilon sensitivity analysis with optional CUDA support
cve_matter/evidence/ BIC/WAIC model selection criteria implementation
cve_matter/cli.py Click-based CLI with 5 commands
terraform/*.tf GKE infrastructure with CPU and GPU node pools
k8s/*.yaml Kubernetes manifests for gVisor, CRDs, and webhooks
argo/*.yaml GPU-accelerated workflow for epsilon sweep
Dockerfile Multi-stage build with CPU and CUDA targets
docker-compose.yml Local development setup for both CPU and GPU
.github/workflows/*.yml CI/CD pipelines for testing, CodeQL, and Trivy
pyproject.toml Python project configuration with dependencies
config/matter.yaml Centralized configuration template
*.md Comprehensive documentation for all components

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread terraform/variables.tf Outdated
Comment on lines +42 to +45
variable "enable_gpu" {
description = "Enable GPU node pool"
type = bool
default = true
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The variable enable_gpu defaults to true which will attempt to provision GPU nodes and may fail if GPU quotas are not available or regions don't support the specified GPU type. Consider defaulting to false to avoid unexpected provisioning failures and cost overruns for users who don't need GPU support.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot apply changes based on this feedback

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed enable_gpu default to false in commit 607513d. GPU provisioning is now opt-in to avoid quota issues and unexpected costs.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot apply changes based on this feedback

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already applied in commit 607513d. The enable_gpu variable now defaults to false to prevent unexpected provisioning failures and costs.

Comment thread cve_matter/evidence/model_selection.py Outdated
Comment on lines +26 to +27
def compute_evidence_from_file(self, input_path: Path,
criteria: list[str] = ['bic', 'waic']) -> dict[str, Any]:
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The parameter name criteria uses a list with default value as a mutable default argument in the function signature. While Click handles this correctly, the function also accepts criteria as a tuple from Click but casts it to list(criteria). For consistency and to follow Python best practices, consider using criteria: list[str] | None = None and defaulting to ['bic', 'waic'] inside the function body.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot apply changes based on this feedback

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed mutable default argument in commit 037b079. Now uses criteria: list[str] | None = None with default initialization inside the function body.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot apply changes based on this feedback

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already applied in commit 037b079. The criteria parameter now uses None default with initialization inside the function body to avoid mutable default arguments.

Comment thread terraform/main.tf Outdated
Comment on lines +17 to +20
backend "gcs" {
bucket = "cve-matter-terraform-state"
prefix = "terraform/state"
}
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Terraform backend configuration uses a hardcoded bucket name that may not exist. This will cause terraform init to fail for users who haven't created this specific GCS bucket. Consider either:

  1. Removing the backend block and documenting manual backend configuration in TERRAFORM.md
  2. Using a variable or remote backend configuration file
  3. Making the bucket name configurable via an environment variable

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot apply changes based on this feedback

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already applied in commit 607513d. Removed hardcoded GCS bucket from backend configuration. Users must now provide backend config via CLI or configuration file.

Comment thread k8s/admission-webhook.yaml Outdated
Comment thread terraform/gke.tf Outdated
taint {
key = "nvidia.com/gpu"
value = "true"
effect = "NO_SCHEDULE"
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The taint effect "NO_SCHEDULE" should be "NoSchedule" (CamelCase). Kubernetes taint effects are case-sensitive and the correct value is "NoSchedule", not "NO_SCHEDULE".

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot apply changes based on this feedback

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already applied in commit 607513d. The taint effect is now correctly set to "NoSchedule" (CamelCase) as required by Kubernetes.

Comment thread cve_matter/ingest/__init__.py Outdated
cves.append(cve_data)

except requests.exceptions.RequestException as e:
print(f"Warning: Failed to fetch from NVD API: {e}")
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The error message "Warning: Failed to fetch from NVD API" is printed to stdout using print() instead of proper logging. Consider using Python's logging module or at least printing to stderr for warning messages to avoid mixing warnings with normal output.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot apply changes based on this feedback

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed warning messages to print to stderr in commit 037b079. Both NVD API and CUDA warnings now use print(..., file=sys.stderr).

Comment on lines +119 to +127
# Simplified WAIC computation
# In practice, this requires sampling from posterior
pointwise_log_likelihood = np.log(y_pred_proba[np.arange(n_samples), y] + 1e-10)
lppd = np.sum(pointwise_log_likelihood)
p_waic = np.var(pointwise_log_likelihood)
waic = -2 * (lppd - p_waic)
result['waic'] = float(waic)
result['p_waic'] = float(p_waic)

Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CVSS threshold computation in WAIC is incorrect. The code uses np.var(pointwise_log_likelihood) for p_waic, but the correct WAIC computation requires the variance across posterior samples, not the variance of pointwise log-likelihoods from a single model fit. This simplified implementation doesn't properly estimate the effective number of parameters. Consider adding a comment acknowledging this limitation or implementing proper WAIC with posterior sampling.

Suggested change
# Simplified WAIC computation
# In practice, this requires sampling from posterior
pointwise_log_likelihood = np.log(y_pred_proba[np.arange(n_samples), y] + 1e-10)
lppd = np.sum(pointwise_log_likelihood)
p_waic = np.var(pointwise_log_likelihood)
waic = -2 * (lppd - p_waic)
result['waic'] = float(waic)
result['p_waic'] = float(p_waic)
# WARNING: The following WAIC computation is NOT statistically valid.
# Proper WAIC requires sampling from the posterior distribution (e.g., via MCMC).
# This is a placeholder using a single model fit and should not be interpreted as a true WAIC.
pointwise_log_likelihood = np.log(y_pred_proba[np.arange(n_samples), y] + 1e-10)
lppd = np.sum(pointwise_log_likelihood)
p_waic = np.nan # Not computable without posterior samples
waic = np.nan # Not computable without posterior samples
result['waic'] = waic
result['p_waic'] = p_waic
result['waic_warning'] = (
"WAIC and p_waic are not valid: proper WAIC requires posterior sampling. "
"These values are placeholders only."
)

Copilot uses AI. Check for mistakes.
Comment thread cve_matter/ingest/__init__.py Outdated

for i in range(count):
mock_cves.append({
'id': f'CVE-2024-{10000 + i}',
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mock data generation uses hardcoded CVE IDs starting from "CVE-2024-10000" which may conflict with real CVE IDs if this data is used in testing or mixed with real data. Consider using a more distinct prefix like "MOCK-CVE-2024-10000" or "TEST-CVE-2024-10000" to clearly differentiate mock data from real CVE records.

Suggested change
'id': f'CVE-2024-{10000 + i}',
'id': f'MOCK-CVE-2024-{10000 + i}',

Copilot uses AI. Check for mistakes.
Comment thread cve_matter/alignment/cca.py Outdated
Comment on lines +66 to +70
feature_vec = [
cve.get('cvss_score', 0.0),
len(cve.get('references', [])),
len(cve.get('description', '')),
hash(cve.get('severity', '')) % 100,
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using hash modulo for categorical encoding (hash(severity) % 100) is not deterministic across Python sessions due to hash randomization (PYTHONHASHSEED). This will produce different feature values for the same severity string in different runs, leading to inconsistent CCA results. Consider using a fixed mapping dictionary like {'LOW': 0, 'MEDIUM': 25, 'HIGH': 50, 'CRITICAL': 75} for deterministic encoding.

Suggested change
feature_vec = [
cve.get('cvss_score', 0.0),
len(cve.get('references', [])),
len(cve.get('description', '')),
hash(cve.get('severity', '')) % 100,
severity_map = {'LOW': 0, 'MEDIUM': 25, 'HIGH': 50, 'CRITICAL': 75}
severity = cve.get('severity', '').upper()
feature_vec = [
cve.get('cvss_score', 0.0),
len(cve.get('references', [])),
len(cve.get('description', '')),
severity_map.get(severity, 0),

Copilot uses AI. Check for mistakes.
Comment thread cve_matter/refractors/epsilon.py Outdated
Comment on lines +114 to +133
# Compute stability metric with epsilon perturbation
if self.use_gpu:
noise = cp.random.randn(*features_gpu.shape) * epsilon
perturbed = features_gpu + noise
stability = float(cp.mean(cp.abs(perturbed - features_gpu)))
else:
noise = np.random.randn(*features.shape) * epsilon
perturbed = features + noise
stability = float(np.mean(np.abs(perturbed - features)))

stability_scores.append(stability)

result = {
'status': 'success',
'epsilon_range': [float(epsilon_min), float(epsilon_max)],
'n_steps': n_steps,
'epsilon_values': epsilon_values.tolist(),
'stability_scores': stability_scores,
'gpu_used': self.use_gpu,
'optimal_epsilon': float(epsilon_values[np.argmin(stability_scores)]),
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stability score computation is measuring the mean absolute difference between perturbed and original features, which will always equal the mean absolute value of the noise (approximately proportional to epsilon). This means the "optimal epsilon" will always be the minimum epsilon value, making this metric not very useful for finding an optimal epsilon. Consider using a more meaningful stability metric, such as prediction variance or model performance degradation under perturbation.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot apply changes based on this feedback

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved epsilon stability metric in commit 037b079. Now uses prediction variance across 10 perturbation trials. Optimal epsilon maximizes stability score (inverse of variance) rather than minimizing a meaningless metric. Returns additional metrics: variance_scores, optimal_stability, and method.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 41 out of 42 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread terraform/gke.tf Outdated
taint {
key = "nvidia.com/gpu"
value = "true"
effect = "NO_SCHEDULE"
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The taint effect should be "NoSchedule" (camelCase) instead of "NO_SCHEDULE" (uppercase with underscore). Terraform for GCP uses the Kubernetes taint effect format which expects "NoSchedule", "PreferNoSchedule", or "NoExecute".

Copilot uses AI. Check for mistakes.
Comment on lines +13 to +14
# TODO: Replace the following with the base64-encoded CA certificate for the webhook server.
# caBundle: <base64-encoded-CA-cert>
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The caBundle contains a placeholder certificate that decodes to "-----BEGIN CERTIFICATE----------END CERTIFICATE-----" (empty certificate body). This will cause the webhook to fail in production. The caBundle should be generated using actual TLS certificates for the webhook service, typically using cert-manager or a script to generate proper certificates. Consider adding a comment indicating this needs to be replaced with actual certificates before deployment.

Suggested change
# TODO: Replace the following with the base64-encoded CA certificate for the webhook server.
# caBundle: <base64-encoded-CA-cert>
# IMPORTANT: Replace the following value with the base64-encoded CA certificate for the webhook server.
caBundle: REPLACE_WITH_BASE64_ENCODED_CA_CERT

Copilot uses AI. Check for mistakes.
Comment thread terraform/gke.tf Outdated
Comment on lines +58 to +60
sandbox_config {
sandbox_type = "gvisor"
}
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sandbox_config block at the cluster level is deprecated and should be removed. In GKE, gVisor sandboxing is configured at the node pool level (which is correctly done at lines 116-118), not at the cluster level. The cluster-level sandbox_config block has no effect and should be removed to avoid confusion.

Copilot uses AI. Check for mistakes.
…nfiguration

Co-authored-by: igor-holt <125706350+igor-holt@users.noreply.github.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 41 out of 42 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread Dockerfile
# Install Python dependencies including CUDA support
RUN pip3 install --no-cache-dir --upgrade pip setuptools wheel && \
pip3 install --no-cache-dir click requests numpy scipy pandas scikit-learn pyyaml pydantic joblib
pip3 install --no-cache-dir click requests numpy scipy pandas scikit-learn pyyaml pydantic joblib
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate pip install command detected. Line 76 is an exact duplicate of line 75. This will cause the installation to run twice unnecessarily, wasting build time.

Suggested change
pip3 install --no-cache-dir click requests numpy scipy pandas scikit-learn pyyaml pydantic joblib

Copilot uses AI. Check for mistakes.
Comment thread cve_matter/ingest/nvd.py
Comment on lines +1 to +4
"""NVD CVE data ingestion module - re-export."""
from cve_matter.ingest import NVDIngestor

__all__ = ['NVDIngestor']
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing import for NVDIngestor. The file imports from cve_matter.ingest which would be the __init__.py, but at line 2 it tries to re-export NVDIngestor from that module. This creates a circular dependency: nvd.py imports from __init__.py, and if __init__.py were to import from nvd.py, it would fail. This file appears unnecessary since NVDIngestor is already defined in __init__.py.

Suggested change
"""NVD CVE data ingestion module - re-export."""
from cve_matter.ingest import NVDIngestor
__all__ = ['NVDIngestor']
# (File deleted)

Copilot uses AI. Check for mistakes.
Comment thread argo/epsilon-sweep-workflow.yaml Outdated
Comment on lines +21 to +27
nodeSelector:
cloud.google.com/gke-accelerator: nvidia-tesla-t4

tolerations:
- key: nvidia.com/gpu
operator: Exists
effect: NoSchedule
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Argo workflow, the GPU node selector uses cloud.google.com/gke-accelerator: nvidia-tesla-t4 (line 22), but this selector is defined at the workflow level. This means all steps in the workflow, including data-preparation which uses the CPU image, will be scheduled on GPU nodes. This is inefficient as the data preparation step doesn't require GPU resources. Consider moving the nodeSelector and tolerations to only the epsilon-calculation template instead of the workflow level.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot apply changes based on this feedback

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied in commit c9617a7. Moved GPU nodeSelector and tolerations from workflow-level to epsilon-calculation template only. CPU-only steps (data-preparation, result-aggregation) now run on standard nodes, saving GPU resources.

Comment thread cve_matter/cli.py
if config_path.exists():
with open(config_path) as f:
ctx.obj['config'] = yaml.safe_load(f)
else:
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the CLI, when an invalid config path is provided via --config, the error is silently ignored and an empty dict is used instead (lines 27-31). This could lead to unexpected behavior where users think their configuration is being used but it's not. Consider logging a warning or error message when the config file doesn't exist, especially since a default path is specified.

Suggested change
else:
else:
click.echo(f"Warning: Config file '{config_path}' not found. Using default configuration.", err=True)

Copilot uses AI. Check for mistakes.
admissionReviewVersions: ["v1", "v1beta1"]
sideEffects: None
timeoutSeconds: 5
failurePolicy: Fail
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The admission webhook configuration has a TODO comment for the CA bundle (line 13-14), but the webhook is configured with failurePolicy: Fail (line 23). This means if the webhook is not properly configured with a valid CA certificate, all PolicyTrigger CREATE/UPDATE operations will be blocked. For initial deployment, consider using failurePolicy: Ignore until the webhook is properly configured with certificates, or ensure the setup documentation clearly indicates this is a blocking issue.

Suggested change
failurePolicy: Fail
failurePolicy: Ignore

Copilot uses AI. Check for mistakes.
@igor-holt
Copy link
Copy Markdown
Member

Copilot apply changes based on the comments in this thread

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "Codex (@codex) review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".

Comment thread cve_matter/arbiter/super_learner.py
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "Codex (@codex) review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".

Comment thread cve_matter/arbiter/super_learner.py
@igor-holt
Copy link
Copy Markdown
Member

Copilot apply changes based on the comments in this thread

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Author

Copilot AI commented Nov 13, 2025

Copilot apply changes based on the comments in this thread

Applied changes from the review thread in commit eaeaf4a:

  1. Removed duplicate pip install line in Dockerfile
  2. Deleted unnecessary nvd.py file (circular dependency)
  3. Added config file warning in CLI for non-default paths
  4. Changed CLI criteria default from list to tuple
  5. Clarified AIC is always computed for comparison
  6. Changed webhook failurePolicy to Ignore (safe for initial deployment)
  7. Moved GPU nodeSelector to epsilon-calculation template only (CPU steps won't use GPU nodes)
  8. Improved mypy CI feedback

All 20 tests passing.

Co-authored-by: igor-holt <125706350+igor-holt@users.noreply.github.com>
Copy link
Copy Markdown
Author

Copilot AI commented Nov 13, 2025

Copilot apply changes based on the comments in this thread

All requested changes from the review thread have been applied:

  1. ✅ Duplicate pip install removed (previous commit)
  2. ✅ Circular dependency nvd.py deleted (previous commit)
  3. ✅ Config file warning added (previous commit)
  4. ✅ CLI criteria default fixed (previous commit)
  5. ✅ AIC computation clarified (previous commit)
  6. ✅ Webhook failurePolicy set to Ignore (previous commit)
  7. ✅ GPU nodeSelector moved to epsilon-calculation only (commit c9617a7)
  8. ✅ Mypy CI improved (previous commit)

All 20 tests passing.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 41 out of 42 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread cve_matter/evidence/model_selection.py Outdated
Comment on lines +136 to +138
cv_scores = cross_val_score(model, X, y, cv=5)
result['cv_accuracy_mean'] = float(np.mean(cv_scores))
result['cv_accuracy_std'] = float(np.std(cv_scores))
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent calculation: Line 132 computes AIC but the function parameter criteria already filtered what should be computed. Line 135-138 computes cross-validation scores unconditionally, even when not requested in criteria. These metrics should only be computed when explicitly requested in the criteria parameter to maintain consistency with the conditional logic for BIC, WAIC, and AIC.

Suggested change
cv_scores = cross_val_score(model, X, y, cv=5)
result['cv_accuracy_mean'] = float(np.mean(cv_scores))
result['cv_accuracy_std'] = float(np.std(cv_scores))
if 'cv' in criteria:
cv_scores = cross_val_score(model, X, y, cv=5)
result['cv_accuracy_mean'] = float(np.mean(cv_scores))
result['cv_accuracy_std'] = float(np.std(cv_scores))

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit e4701fa. CV scores now only computed when 'cv' is in criteria parameter. Also simplified AIC to always compute for comparison purposes (removed unnecessary conditional).

Comment thread cve_matter/alignment/procrustes.py Outdated
Comment on lines +64 to +65
# Deterministic mapping for severity levels
severity_map = {'LOW': 0, 'MEDIUM': 25, 'HIGH': 50, 'CRITICAL': 75, 'NONE': -1}
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated code: The severity_map dictionary is duplicated across multiple alignment modules (procrustes.py and cca.py). Consider extracting this mapping to a shared constants module or utility function to maintain DRY (Don't Repeat Yourself) principle and ensure consistency across modules.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit e4701fa. Extracted SEVERITY_MAP to new shared cve_matter/constants.py module. Both procrustes.py and cca.py now import from this central location.

Comment thread cve_matter/alignment/cca.py Outdated
Comment on lines +64 to +65
# Deterministic mapping for severity levels
severity_map = {'LOW': 0, 'MEDIUM': 25, 'HIGH': 50, 'CRITICAL': 75, 'NONE': -1}
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated code: The severity_map dictionary is duplicated from procrustes.py. Consider extracting this mapping to a shared constants module or utility function to maintain DRY (Don't Repeat Yourself) principle and ensure consistency across modules.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit e4701fa. Now uses shared SEVERITY_MAP constant from cve_matter/constants.py instead of duplicate definition.

Comment thread TERRAFORM.md Outdated
Comment thread terraform/outputs.tf Outdated
Copy link
Copy Markdown
Member

@igor-holt Igor Holt (igor-holt) left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot Verify that your system has a compatible GPU installed and that the necessary drivers are up-to-date.
Ensure that the required libraries (e.g., CUDA, cuDNN) are installed and properly configured for your GPU.
Confirm that your Python environment includes the necessary packages for GPU support, such as torch or tensorflow, with GPU-enabled versions.
Test the GPU configuration by running a simple script to check if the GPU is detected:

import torchif torch.cuda.is_available():    print("GPU is available and ready!")    print(f"Device Name: {torch.cuda.get_device_name(0)}")else:    print("GPU is not available. Please check your configuration.")
If using Docker, ensure that the container has access to the GPU by using the --gpus flag:

docker run --gpus all <image_name>
For cloud environments, confirm that the instance type includes GPU support and that the necessary drivers are pre-installed or configured.

…onstant, fix Terraform docs

Co-authored-by: igor-holt <125706350+igor-holt@users.noreply.github.com>
@igor-holt Igor Holt (igor-holt) linked an issue Nov 17, 2025 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

✨ Set up Copilot instructions

3 participants