Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redshift: Quote column names and send recovery statsd metrics #1329

Closed
wants to merge 3 commits into from

Conversation

istreeter
Copy link
Contributor

No description provided.

@istreeter istreeter force-pushed the fix/redshift-quote-column-names branch from ac8d459 to 2f46859 Compare December 14, 2023 11:57
@istreeter istreeter force-pushed the fix/redshift-quote-column-names branch from 2f46859 to 02da1fc Compare December 22, 2023 09:11
@istreeter istreeter changed the title Quote column names in Redshift load statements Redshift: Quote column names and send recovery statsd metrics Dec 22, 2023
Comment on lines +149 to +152
for {
_ <- setStage(Stage.Committing)
_ <- Manifest.add[F](discovery.origin.toManifestItem)
} yield LoadSuccess(loadedRecoveryTableNames)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drawing attention to this change. Previously, we did an extra lookup to the manifest table, and the only reason was to get a timestamp (essentially "now"). After this change we simply use the loader's local version of when is "now". The value is used in monitoring payloads.

Arguably it doesn't belong in this PR, which is supposed to be about adding a statsd metric for recovery tables.

But.... it is related to how we pass around data relating to the load. Data which is used for building metrics and monitoring payloads. So I smuggled it into this PR.

maxTstamp,
Some(shredderStart),
Some(shredderEnd),
Some(recoveryTablesLoaded)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if we should omit (not emit) the metric if it is zero? Or if we're loading to snowflake/databricks for which it has no meaning.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this question depends on the downstream dependencies - we have statsD, then we have the metrics relay/custom tooling to push that data to eg. cloudwatch. I'm not sure if I remember this well enough but I think some of those things can fail if they expect a metric that isn't present - but I might be wrong.

I think my opinion stands whatever the answer to the above consideration is though:

  • For Snowflake/Databricks, I would omit it, assuming the tf stacks are sufficiently separated that doing so is trivial and doesn't introduce complexity.
  • For Redshift I would not omit it, even if there were no recovery tables produced - in the case of some confusing issue, knowing that 0 were produced can be as valuable as knowing that 1 or more were produced. I encountered such a scenario recently in QA.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pushed an extra commit, which disables the metric for Snowflake/Databricks loaders.

val expected = Metrics.KVMetrics.LoadingCompleted(
KVMetric.CountGood(countGood),
KVMetric.CountBad(countBad),
Some(KVMetric.CollectorLatencyMin(collectorLatencyMin)),
Some(KVMetric.CollectorLatencyMax(collectorLatencyMax)),
KVMetric.ShredderLatencyStart(shredderStartLatency),
KVMetric.ShredderLatencyEnd(shredderEndLatency)
KVMetric.ShredderLatencyEnd(shredderEndLatency),
KVMetric.RecoveryTablesLoaded(2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this tells us how many recovery tables were used at all, which would include tables that existed before this load, is that correct?

I think this would be useful, but I'd also like to know whether a new recovery table was created in this batch - since I imagine support will usually be most interested in whether or not there is a new problem, when there's more than one existing recovery table in use. Is this possible to do?

(Not entirely sure I've explained the thinking correctly but shout and I'll be happy to clarify!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed in a meeting - we provide the ability to see that information in the data we send to ops1 - it would be possible to construct a query that tells us when a new recovery table is used.

The implementation in this PR gives us the ability to get fast feedback for rollout, and tells us broadly about recovery table usage, so I'm happy that it's sufficient.

Copy link
Contributor

@colmsnowplow colmsnowplow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good as far as I can see. :)

Copy link
Contributor

@colmsnowplow colmsnowplow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the additional commit looks like a clean way to handle things. :)

@istreeter
Copy link
Contributor Author

I manually merged this into develop branch, so closing it here.

@istreeter istreeter closed this Jan 3, 2024
@istreeter istreeter deleted the fix/redshift-quote-column-names branch January 3, 2024 09:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants