bug: possible regression in state messages and progress_markers handling #1704

laurentS · 2023-05-15T14:01:22Z

Singer SDK Version

0.23.0

Is this a regression?

Yes

Python Version

NA

Bug scope

Taps (catalog, state, etc.)

Operating System

linux

Description

CI checks in MeltanoLabs/tap-github#209 are now failing on a test related to progress_markers.

I've narrowed down the regression (?) to between sdk v0.22.1 (pass) and v0.23.0 (fail), and I think it is specifically #1436 which caused the regression. Tagging @aaronsteers as the author of that PR.

The test that fails was added to tap-github in this PR after this related PR was merged in the SDK. In short, progress_markers now appears in the last state message issued by the sdk, which used to signify that the message wasn't "final".
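As a rough illustration of the check described above, the sketch below scans a tap's output for the last STATE message and reports whether any bookmark still carries `progress_markers`. The helper names and the sample messages are made up for illustration. This is not the actual tap-github test, and the bookmark layout is an assumption based on typical SDK state output.

```python
import json


def contains_key(node, key):
    """Return True if `key` appears anywhere inside a nested dict/list."""
    if isinstance(node, dict):
        return key in node or any(contains_key(v, key) for v in node.values())
    if isinstance(node, list):
        return any(contains_key(item, key) for item in node)
    return False


def last_state_is_final(lines):
    """Check that the last STATE message has no `progress_markers` left."""
    last_state = None
    for line in lines:
        message = json.loads(line)
        if message.get("type") == "STATE":
            last_state = message["value"]
    if last_state is None:
        raise ValueError("no STATE message found")
    return not contains_key(last_state, "progress_markers")


# Illustrative output only: stream name, context and values are hypothetical.
unfinalized = [
    json.dumps({"type": "RECORD", "stream": "issues", "record": {"id": 1}}),
    json.dumps({
        "type": "STATE",
        "value": {"bookmarks": {"issues": {"partitions": [{
            "context": {"repo": "a/b"},
            "progress_markers": {
                "replication_key": "updated_at",
                "replication_key_value": "2023-05-01",
            },
        }]}}},
    }),
]
finalized = [
    json.dumps({
        "type": "STATE",
        "value": {"bookmarks": {"issues": {"partitions": [{
            "context": {"repo": "a/b"},
            "replication_key": "updated_at",
            "replication_key_value": "2023-05-01",
        }]}}},
    }),
]

print(last_state_is_final(unfinalized))  # False
print(last_state_is_final(finalized))    # True
```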

It looks like there are some changes in how state is handled, so I'm opening this as a discussion to understand whether there was indeed a regression, or whether the test in tap-github needs to be modified to accommodate the new behaviour in the sdk.

Code

No response


aaronsteers · 2023-05-15T14:37:19Z

@laurentS - thanks for reporting. 👍👍

Cc @kgpayne for an extra pair of eyes, since he also contributed to the noted PR.

aaronsteers · 2023-05-15T14:44:34Z

@laurentS - it's possible that this is functioning as designed and that the test should be modified... it will depend on the desired output of the test.

Essentially, the key question is whether the test actually did "complete" the stream that is being tested - or if the stream sync was "aborted abnormally" due to the max record abort signal. If the latter, then the correct result would be an unfinalized state.

If you want to update the stream to deal with abort scenarios cleanly, you can mark the stream as sorted/resumable, and/or you can add handling to deal with abort-type exceptions.
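To make the sorted/resumable distinction concrete, here is a toy sync loop. It is not the SDK's implementation: `SyncAborted`, `sync` and the bookmark keys are hypothetical stand-ins, and the rule it encodes (an aborted sync can only be finalized when the stream is sorted) is my reading of the behaviour described above.

```python
class SyncAborted(Exception):
    """Hypothetical signal raised when the record limit is reached."""


def sync(records, is_sorted, abort_at=None):
    """Toy sync loop; returns the bookmark left behind for the stream."""
    bookmark = {}
    high_water = None
    try:
        for index, record in enumerate(records, start=1):
            value = record["updated_at"]
            if high_water is None or value > high_water:
                high_water = value
            # While the sync is running, progress is only provisional.
            bookmark["progress_markers"] = {"replication_key_value": high_water}
            if abort_at is not None and index >= abort_at:
                raise SyncAborted
    except SyncAborted:
        if not is_sorted:
            # Unsorted stream: unseen records may be older than the ones
            # already read, so the provisional value cannot be promoted.
            return bookmark
    # Normal completion, or an abort on a sorted stream: finalize.
    markers = bookmark.pop("progress_markers", None)
    if markers:
        bookmark["replication_key_value"] = markers["replication_key_value"]
    return bookmark


rows = [
    {"updated_at": "2023-05-01"},
    {"updated_at": "2023-05-02"},
    {"updated_at": "2023-05-03"},
]

print(sync(rows, is_sorted=False, abort_at=2))
# {'progress_markers': {'replication_key_value': '2023-05-02'}}
print(sync(rows, is_sorted=True, abort_at=2))
# {'replication_key_value': '2023-05-02'}
print(sync(rows, is_sorted=False))
# {'replication_key_value': '2023-05-03'}
```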

There are probably other ways to address this as well, and it's possible that there's another issue going on, but this at least should kick off the conversation... 😁

laurentS · 2023-05-16T10:04:36Z

Thanks for clarifying @aaronsteers!

My understanding so far:

- The test in tap-github (test_last_state_message_is_valid) was added to check that we do get a valid final state message in cases where state_partitioning_keys is overridden in a stream. With sdk v0.13.1, we were having problems with streams restarting on each run because of a missing final state message. I think this test needs to remain as is.
- In "fix: handle sync abort, reduce duplicate STATE messages, rename _MAX_RECORD_LIMIT as ABORT_AT_RECORD_COUNT #1436", you mention:

  > Tests which actually want a sync_all() behavior can call sync_all() directly, as this change reverts the behavior so that sync_all() will once again sync all records. This is the only way to ensure tests will get valid and finalized state messages. Otherwise, the aborted streams will have bookmarks left in an unfinalized, non-resumeable state.

This is what the test does (call sync_all()), so I'd expect it to pass based on the info I have so far. My understanding is that ABORT_AT_RECORD_COUNT is aimed at speeding up tests when lots of records are expected, but this isn't really what we're testing here. Did I get this right?

kgpayne · 2023-05-16T15:26:06Z

@laurentS @aaronsteers @edgarrmondragon I think the ABORT_AT_RECORD_COUNT related changes may be a red herring - as part of that work we also 'fixed' the noisy duplicate STATE messages. I believe that either there is a bug in our record_index logic for choosing when to write state messages, or there is a missing self._is_state_flushed = False somewhere, which would keep the final state from firing 🤔 tap-github uses state partitioning, which also affects how/when state is updated.

Looking into this now 👍
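A minimal sketch of the second hypothesis, assuming a writer that deduplicates STATE messages with a "flushed" flag. `StateWriter` and its methods are hypothetical and only borrow the `_is_state_flushed` name from the comment above; the real SDK code is structured differently.

```python
import copy


class StateWriter:
    """Toy writer that skips STATE messages when nothing has changed."""

    def __init__(self):
        self.state = {}
        self.emitted = []  # stands in for messages written to stdout
        self._is_state_flushed = True

    def update(self, stream, **values):
        """Record new bookmark values and mark the state as dirty."""
        self.state.setdefault(stream, {}).update(values)
        self._is_state_flushed = False

    def write_state(self):
        """Emit a STATE message only if something changed since the last one."""
        if self._is_state_flushed:
            return
        self.emitted.append(copy.deepcopy(self.state))
        self._is_state_flushed = True

    def finalize(self, stream):
        """Promote progress markers, then write one last STATE message."""
        bookmark = self.state.get(stream, {})
        markers = bookmark.pop("progress_markers", None)
        if markers:
            bookmark.update(markers)
        # Finalizing changes the state, so the flag has to be reset here.
        # Without this line the deduplication swallows the final message.
        self._is_state_flushed = False
        self.write_state()


writer = StateWriter()
writer.update("issues", progress_markers={"replication_key_value": "2023-05-02"})
writer.write_state()
writer.write_state()  # deduplicated, nothing emitted
writer.finalize("issues")

print(len(writer.emitted))  # 2
print(writer.emitted[-1])   # {'issues': {'replication_key_value': '2023-05-02'}}
```

Removing the flag reset in `finalize` reproduces the symptom from this issue: the last emitted message still contains `progress_markers`.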

edgarrmondragon · 2023-05-16T15:41:53Z

> Looking into this now 👍

Thanks @kgpayne!

aaronsteers · 2023-05-16T17:58:27Z

> I think the ABORT_AT_RECORD_COUNT related changes may be a red herring...

Yeah, makes sense. Your draft PR #1708 looks like it addresses the issue. 👍 I added a comment there in the PR about some internals, but nothing that would block your PR.

tayloramurphy · 2023-07-10T18:18:52Z

@edgarrmondragon reassigning this to you since Ken is out. Can you take a look and let me know if we need to make progress on this?

edgarrmondragon · 2023-07-14T16:20:38Z

~~I'll be taking a look at this next week~~ Done this already

laurentS added the kind/Bug and valuestream/SDK labels May 15, 2023

kgpayne self-assigned this May 16, 2023

kgpayne mentioned this issue May 16, 2023

fix: Finalize and write last state message with dedupe #1708

Merged

laurentS mentioned this issue May 29, 2023

Invalid SCHEMA messages are produced for deselected streams MeltanoLabs/tap-github#212

Open

tayloramurphy assigned edgarrmondragon and unassigned kgpayne Jul 10, 2023

edgarrmondragon closed this as completed in #1708 Jul 14, 2023
