Merge pull request #3389 from jparrill/release-4.14
[release-4.14] OCPBUGS-26526: Documented to disable UWM telemetry writer in disconnected envs
openshift-merge-bot[bot] committed Feb 5, 2024
2 parents 2762195 + 2e77fe3 commit ad3bd20
Showing 3 changed files with 53 additions and 4 deletions.
46 changes: 46 additions & 0 deletions docs/content/how-to/disconnected/known-issues.md
@@ -0,0 +1,46 @@
# Known Issues

## OLM default catalog sources in ImagePullBackOff state

When you work in a disconnected environment, the OLM catalog sources still point to their original sources, so their container images cannot be pulled and the catalog pods stay in the ImagePullBackOff state, regardless of whether OLMCatalogPlacement is set to `Management` or `Guest`. You have two options:

1. Disable the OLM default catalog sources and, using the oc-mirror binary, mirror the desired images into your private registry, creating a new custom catalog source.
2. Mirror all the container images from all the catalog sources and apply an ImageContentSourcePolicy so that those images are pulled from the private registry.

The first option is the most practical. To proceed with it, follow [these instructions](https://docs.openshift.com/container-platform/4.14/installing/disconnected_install/installing-mirroring-disconnected.html). The process mirrors all the required images and generates the ImageContentSourcePolicy (ICSP) for you.

Additionally, when you provision the HostedCluster you need to add a flag that sets OLMCatalogPlacement to `Guest`; otherwise you will not be able to disable the default catalog sources.
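
The disabling step of the first option can be sketched as follows. This is a minimal sketch, assuming the default sources are turned off through the `OperatorHub` resource named `cluster` on the hosted cluster, as is done on standalone OpenShift. `GUEST_KUBECONFIG` and the `run` helper are hypothetical names introduced for illustration. The script only prints the command unless `APPLY=1` is set, so it can be reviewed first.

```shell
# JSON patch that turns off every default OLM catalog source.
DISABLE_PATCH='[{"op":"add","path":"/spec/disableAllDefaultSources","value":true}]'

# Hypothetical helper: prints the command, or executes it when APPLY=1.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# GUEST_KUBECONFIG is an assumed path to the hosted cluster's kubeconfig.
disable_default_catalogs() {
  run oc --kubeconfig "${GUEST_KUBECONFIG:-guest.kubeconfig}" \
    patch OperatorHub cluster --type json -p "$DISABLE_PATCH"
}

disable_default_catalogs
```

The exact name of the placement flag is not given here; check `hypershift create cluster --help` for your version before provisioning.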

## HyperShift operator fails to reconcile in disconnected environments

If you are operating in a disconnected environment and have deployed the HyperShift operator, you may encounter an issue with the UWM telemetry writer. It sends OpenShift deployment data to your Red Hat account, but this functionality does not work in disconnected environments.

**Symptoms:**

- The HyperShift operator appears to be running correctly in the `hypershift` namespace, but nothing happens when you create a HostedCluster.
- The HyperShift operator logs contain entries like these:

```
{"level":"error","ts":"2023-12-20T15:23:01Z","msg":"Reconciler error","controller":"deployment","controllerGroup":"apps","controllerKind":"Deployment","Deployment":{"name":"operator","namespace":"hypershift"},"namespace":"hypershift","name":"operator","reconcileID":"451fde3c-eb1b-4cf0-98cb-ad0f8c6a6288","error":"cannot get telemeter client secret: Secret \"telemeter-client\" not found","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
{"level":"debug","ts":"2023-12-20T15:23:01Z","logger":"events","msg":"Failed to ensure UWM telemetry remote write: cannot get telemeter client secret: Secret \"telemeter-client\" not found","type":"Warning","object":{"kind":"Deployment","namespace":"hypershift","name":"operator","uid":"c6628a3c-a597-4e32-875a-f5704da2bdbb","apiVersion":"apps/v1","resourceVersion":"4091099"},"reason":"ReconcileError"}
```
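
To check whether a running operator is hitting this error, the logs can be searched for the message shown in the entries above. This is a minimal sketch, assuming the operator Deployment is named `operator` in the `hypershift` namespace (as the log entries indicate) and that the `oc` client is logged in to the management cluster. `count_telemeter_errors` is a hypothetical helper name.

```shell
# Pattern taken from the error message in the operator log entries.
PATTERN='cannot get telemeter client secret'

# Counts the lines on stdin that contain the telemeter secret error.
# grep -c exits non-zero when the count is 0, so the status is ignored.
count_telemeter_errors() {
  grep -c -F "$PATTERN" || true
}

# Only query the cluster when the oc client is available.
if command -v oc >/dev/null 2>&1; then
  oc logs deployment/operator -n hypershift | count_telemeter_errors
fi
```

A count greater than zero suggests the operator is affected by this issue.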

**Solution:**

The solution depends on how you deployed HyperShift:

- **The HO was deployed using ACM/MCE:** Create a ConfigMap called `hypershift-operator-install-flags` in the `local-cluster` namespace (neither the ConfigMap name nor the namespace can be changed) with this content:

```
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: hypershift-operator-install-flags
  namespace: local-cluster
data:
  installFlagsToRemove: --enable-uwm-telemetry-remote-write
```

- **The HO was deployed using the HyperShift binary:** Remove the `--enable-uwm-telemetry-remote-write` flag from the command you use to deploy HyperShift.
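
After applying either solution, you can verify that the flag is gone from the operator Deployment. This is a minimal sketch, assuming the Deployment is named `operator` in the `hypershift` namespace (as in the log entries) and that the flag appears somewhere in its JSON spec when enabled. `has_uwm_flag` is a hypothetical helper name.

```shell
FLAG='--enable-uwm-telemetry-remote-write'

# Succeeds when the given text still contains the telemetry flag.
has_uwm_flag() {
  case "$1" in
    *"$FLAG"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only query the cluster when the oc client is available.
if command -v oc >/dev/null 2>&1; then
  spec=$(oc get deployment operator -n hypershift -o json)
  if has_uwm_flag "$spec"; then
    echo "telemetry remote write flag is still set"
  else
    echo "telemetry remote write flag is not set"
  fi
fi
```

Operator builds that include the `main.go` change from this commit also log `UWM telemetry remote write controller disabled` at startup when the flag is absent, which gives a second way to confirm the result.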
6 changes: 3 additions & 3 deletions docs/content/how-to/metrics-sets.md
@@ -3,8 +3,8 @@
HyperShift creates ServiceMonitor resources in each control plane namespace that allow
a Prometheus stack to scrape metrics from the control planes. ServiceMonitors use metrics relabelings
to define which metrics are included or excluded from a particular component (etcd, Kube API server, etc.).
The number of metrics produced by control planes has a direct impact on resource requirements of
the monitoring stack scraping them.
The number of metrics produced by control planes has a direct impact on resource requirements of
the monitoring stack scraping them.

Instead of producing a fixed number of metrics that apply to all situations, HyperShift allows
configuration of a "metrics set" that identifies a set of metrics to produce per control plane.
@@ -13,7 +13,7 @@ The following metrics sets are supported:

* `Telemetry` - metrics needed for telemetry. This is the default and the smallest
set of metrics.
* `SRE` - Configurable metrics set, intended to include necessary metrics to produce alerts and
* `SRE` - Configurable metrics set, intended to include necessary metrics to produce alerts and
allow troubleshooting of control plane components.
* `All` - all the metrics produced by standalone OCP control plane components.
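
Selecting one of these sets can be sketched as follows. This is a hedged sketch, assuming the set is chosen through a `METRICS_SET` environment variable on the operator Deployment (`operator` in the `hypershift` namespace); confirm this against the full metrics sets documentation for your version. `valid_metrics_set` and `metrics_set_command` are hypothetical helper names, and the script only prints the command rather than running it.

```shell
# Succeeds only for the metrics set names listed above.
valid_metrics_set() {
  case "$1" in
    Telemetry|SRE|All) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints the command that would select the given metrics set
# (assumes the METRICS_SET variable on the operator Deployment).
metrics_set_command() {
  if ! valid_metrics_set "$1"; then
    echo "unknown metrics set: $1" >&2
    return 1
  fi
  printf 'oc set env -n hypershift deployment/operator METRICS_SET=%s\n' "$1"
}

metrics_set_command SRE
```

Validating the name first avoids setting a value the operator does not recognize, since the names are case-sensitive as written in the list.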

5 changes: 4 additions & 1 deletion hypershift-operator/main.go
@@ -376,7 +376,7 @@ func run(ctx context.Context, opts *StartOptions, log logr.Logger) error {
	}

	// If enabled, start controller to ensure UWM stack is enabled and configured
	// to remotely write telemetry metrics
	// to remotely write telemetry metrics.
	if opts.EnableUWMTelemetryRemoteWrite {
		if err := (&uwmtelemetry.Reconciler{
			Namespace: opts.Namespace,
@@ -385,6 +385,9 @@ func run(ctx context.Context, opts *StartOptions, log logr.Logger) error {
		}).SetupWithManager(mgr); err != nil {
			return fmt.Errorf("unable to create uwm telemetry controller: %w", err)
		}
		log.Info("UWM telemetry remote write controller enabled")
	} else {
		log.Info("UWM telemetry remote write controller disabled")
	}

	// Start controllers to manage dedicated request serving isolation