Conversation

@iambriccardo (Contributor) commented Oct 28, 2025

This PR introduces a new notification system for the ETL pipeline that forwards errors to the Supabase API's system endpoint, enabling real-time alerting of pipeline failures to users.

The notification system captures errors from ETL and transmits them to the Supabase API endpoint. Each error is hashed before transmission, which the receiving endpoint uses as a deduplication key to filter repeated occurrences of the same failure. This ensures users receive timely failure notifications while maintaining signal quality by suppressing noise from repetitive errors.
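The receiving side's use of the hash as a deduplication key could be sketched as follows (illustrative only; the `ErrorDeduplicator` type and `should_notify` method are assumptions, not the Supabase API's actual implementation):

```rust
use std::collections::HashSet;

/// Illustrative dedup gate on the receiving side: the first occurrence of a
/// given error hash triggers a notification; repeats of the same hash are
/// suppressed, cutting noise from repetitive failures.
struct ErrorDeduplicator {
    seen: HashSet<String>,
}

impl ErrorDeduplicator {
    fn new() -> Self {
        Self { seen: HashSet::new() }
    }

    /// Returns true if this hash has not been seen before, i.e. a
    /// notification should be sent to the user.
    fn should_notify(&mut self, error_hash: &str) -> bool {
        // HashSet::insert returns false when the value was already present.
        self.seen.insert(error_hash.to_string())
    }
}
```

A production version would likely add an expiry window so that a long-resolved error can alert again, but the set-membership check captures the core idea.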

The PR also cleans up some configuration and improves the design to ensure, at compile time, that we never serialize a configuration that contains secrets.

@iambriccardo iambriccardo changed the title riccardobusetti/feat/add error notifications feat(errors): Implement mechanism for sending errors as notifications Oct 28, 2025
}
}

pub async fn read_pipeline_components(
@iambriccardo (Contributor, Author):

Just moved this here to better organize the code.

pub destination: DestinationConfigWithoutSecrets,
/// Configuration for the replication pipeline.
pub pipeline: PipelineConfigWithoutSecrets,
/// Optional Sentry configuration for error tracking.
@iambriccardo (Contributor, Author) commented Oct 29, 2025:

We don't want the Sentry config in the secret-free replicator config, since it risks exposing secrets; this is now enforced at compile time.
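One way the compile-time guarantee could work is to keep the secret-bearing fields in a type that simply cannot be serialized, and make the serializer accept only the secret-free type. A minimal sketch (all type and function names here are illustrative stand-ins, not the PR's actual code):

```rust
/// Full config; holds the Sentry DSN, a secret. It deliberately does NOT
/// implement SerializableConfig below, so passing it to serialize_config
/// is a compile error rather than a runtime leak.
struct ReplicatorConfig {
    destination: String,
    sentry_dsn: String, // secret; must never be serialized
}

/// Secret-free view of the config: the only type the serializer accepts.
struct ReplicatorConfigWithoutSecrets {
    destination: String,
}

/// Marker trait that only secret-free config types opt into.
trait SerializableConfig {
    fn to_key_values(&self) -> Vec<(String, String)>;
}

impl SerializableConfig for ReplicatorConfigWithoutSecrets {
    fn to_key_values(&self) -> Vec<(String, String)> {
        vec![("destination".to_string(), self.destination.clone())]
    }
}

impl From<&ReplicatorConfig> for ReplicatorConfigWithoutSecrets {
    fn from(full: &ReplicatorConfig) -> Self {
        // The secret field is dropped here and never reaches serialization.
        Self {
            destination: full.destination.clone(),
        }
    }
}

fn serialize_config<C: SerializableConfig>(config: &C) -> String {
    config
        .to_key_values()
        .into_iter()
        .map(|(k, v)| format!("{k}={v}"))
        .collect::<Vec<_>>()
        .join("\n")
}
```

With serde the same effect falls out of deriving `Serialize` only on the `...WithoutSecrets` struct; the sketch spells the mechanism out without the dependency.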

/// source and destination configurations, builds the replicator configuration, and applies
/// all resources to the cluster.
#[allow(clippy::too_many_arguments)]
pub async fn create_or_update_pipeline_resources_in_k8s(
@iambriccardo (Contributor, Author):

Also moved all of these functions for better organization.

@iambriccardo iambriccardo marked this pull request as ready for review October 29, 2025 15:53
@iambriccardo iambriccardo requested a review from a team as a code owner October 29, 2025 15:53
@chatgpt-codex-connector (bot) left a comment:

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +161 to +170
/// Computes a stable hash for an error.
///
/// This provides a consistent identifier across multiple occurrences of the
/// same error type, enabling grouping and deduplication in monitoring systems.
pub fn compute_error_hash<H: Hash>(error_hash: H) -> String {
    let mut hasher = DefaultHasher::new();
    error_hash.hash(&mut hasher);
    let hash_value = hasher.finish();

    format!("{hash_value:016x}")
}


P1: Use deterministic hasher for error dedup keys

The compute_error_hash helper builds the deduplication key with std::collections::hash_map::DefaultHasher. That hasher is intentionally randomized per process, so the same EtlError will produce different hashes after each pod restart or across replicas. Because the notification service relies on a stable hash to collapse repeated failures, this randomness prevents deduplication from ever kicking in once the replicator restarts or scales out. Consider switching to a hasher with fixed seeds (e.g. SipHasher with constant keys or a cryptographic digest) so identical errors produce identical hashes across runs.
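If deduplication across restarts were wanted, a fixed-seed hasher can be dropped in without any new dependency. The sketch below uses FNV-1a (a well-known deterministic hash with fixed constants) as a stand-in; it is an illustration of the suggestion, not code from the PR:

```rust
use std::hash::{Hash, Hasher};

/// Deterministic FNV-1a hasher: the same input yields the same hash in every
/// process, unlike DefaultHasher, whose seed is randomized per process.
struct Fnv1aHasher {
    state: u64,
}

impl Fnv1aHasher {
    fn new() -> Self {
        Self {
            state: 0xcbf29ce484222325, // FNV-1a 64-bit offset basis
        }
    }
}

impl Hasher for Fnv1aHasher {
    fn finish(&self) -> u64 {
        self.state
    }

    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.state ^= b as u64;
            self.state = self.state.wrapping_mul(0x100000001b3); // FNV prime
        }
    }
}

/// Deterministic variant of compute_error_hash: identical errors produce
/// identical dedup keys across pod restarts and replicas.
fn compute_error_hash_deterministic<H: Hash>(error: H) -> String {
    let mut hasher = Fnv1aHasher::new();
    error.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}
```

FNV-1a is not collision-resistant against adversarial input; if that matters, a cryptographic digest (as the review suggests) is the safer swap.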


@iambriccardo (Contributor, Author):

We are fine with this: we assume ETL is a long-running process, so multiple hashes for the same error may add some noise, but for now that doesn't seem worth an extra dependency.
