Improve error message on state initialization #39553

Merged
maxi297 merged 1 commit into master from issue-8257/improve-error-message-on-init-state
Jun 18, 2024

@maxi297 maxi297 commented Jun 18, 2024

What

Addresses https://github.com/airbytehq/airbyte-internal-issues/issues/8257

The goal is to improve our visibility into errors that are actually unexpected and where we need to take action. All the cases we've seen for this error are users not performing the proper migration steps.

How

Raising a config error if a state is provided but it does not contain the "states" key
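
As a rough illustration of the check described above, here is a minimal, self-contained sketch. It is not the CDK code from this PR: `TracedError` and `FailureType` are hypothetical stand-ins for the CDK's traced exception and failure types, `validate_per_partition_state` is a made-up name, and the user-facing message is paraphrased.

```python
from enum import Enum
from typing import Any, Mapping


class FailureType(Enum):
    # Hypothetical stand-in mirroring the failure_type values seen in the traces.
    system_error = "system_error"
    config_error = "config_error"


class TracedError(Exception):
    """Hypothetical stand-in for the CDK's traced exception."""

    def __init__(self, message: str, internal_message: str, failure_type: FailureType):
        super().__init__(internal_message)
        self.message = message
        self.internal_message = internal_message
        self.failure_type = failure_type


def validate_per_partition_state(stream_state: Mapping[str, Any]) -> None:
    """Raise a config error when a non-empty state lacks the "states" key."""
    if not stream_state:
        return  # no state yet, nothing to validate
    if "states" not in stream_state:
        # Assumption: a state without "states" comes from a skipped migration/reset,
        # so it is reported as a config error instead of surfacing as a KeyError.
        raise TracedError(
            message="The state format is invalid. Validate that the migration steps included a reset.",
            internal_message=f"Could not parse the following state: {stream_state}",
            failure_type=FailureType.config_error,
        )
```

With a check like this, a legacy state such as `{"25157276": {"created_at": "..."}}` fails early with a `config_error` rather than reaching the `stream_state["states"]` lookup.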

User Impact

This error should now be reported as a config error, which improves the accuracy of our error reporting. For the user who is trying to sync the data, it shouldn't change anything.

Can this PR be safely reverted and rolled back?

  • YES 💚
  • NO ❌

Tests

This was tested by manually upgrading the version of the CDK to point to my local version using airbyte-cdk = {path = "../../../airbyte-cdk/python/", develop = true} in source-gitlab.

Before

{
    "type": "TRACE",
    "trace": {
        "type": "ERROR",
        "emitted_at": 1718728438219.623,
        "error": {
            "message": "Something went wrong in the connector. See the logs for more details.",
            "internal_message": "'states'",
            "stack_trace": "Traceback (most recent call last):\n  File \"/Users/maxime/devel/code/airbyte/airbyte-integrations/connectors/source-gitlab/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py\", line 136, in read\n    yield from self._read_stream(\n  File \"/Users/maxime/devel/code/airbyte/airbyte-integrations/connectors/source-gitlab/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py\", line 222, in _read_stream\n    stream_instance.state = stream_state  # type: ignore # we check that state in the dir(stream_instance)\n  File \"/Users/maxime/devel/code/airbyte/airbyte-integrations/connectors/source-gitlab/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/declarative_stream.py\", line 85, in state\n    self.retriever.state = state\n  File \"/Users/maxime/devel/code/airbyte/airbyte-integrations/connectors/source-gitlab/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/retrievers/simple_retriever.py\", line 385, in state\n    self.cursor.set_initial_state(value)\n  File \"/Users/maxime/devel/code/airbyte/airbyte-integrations/connectors/source-gitlab/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/incremental/per_partition_cursor.py\", line 86, in set_initial_state\n    for state in stream_state[\"states\"]:\nKeyError: 'states'\n",
            "failure_type": "system_error",
            "stream_descriptor": {
                "name": "commits"
            }
        }
    }
}

After

{
    "type": "TRACE",
    "trace": {
        "type": "ERROR",
        "emitted_at": 1718728359946.674,
        "error": {
            "message": "The state for is format invalid. Validate that the migration steps included a reset and that it was performed properly. Otherwise, please contact Airbyte support.",
            "internal_message": "Could not sync parse the following state: {'25157276': {'created_at': '2121-03-18T12:51:05+00:00'}}",
            "stack_trace": "Traceback (most recent call last):\n  File \"/Users/maxime/devel/code/airbyte/airbyte-cdk/python/airbyte_cdk/sources/abstract_source.py\", line 135, in read\n    yield from self._read_stream(\n  File \"/Users/maxime/devel/code/airbyte/airbyte-cdk/python/airbyte_cdk/sources/abstract_source.py\", line 217, in _read_stream\n    stream_instance.state = stream_state  # type: ignore # we check that state in the dir(stream_instance)\n  File \"/Users/maxime/devel/code/airbyte/airbyte-cdk/python/airbyte_cdk/sources/declarative/declarative_stream.py\", line 87, in state\n    self.retriever.state = state\n  File \"/Users/maxime/devel/code/airbyte/airbyte-cdk/python/airbyte_cdk/sources/declarative/retrievers/simple_retriever.py\", line 436, in state\n    self.cursor.set_initial_state(value)\n  File \"/Users/maxime/devel/code/airbyte/airbyte-cdk/python/airbyte_cdk/sources/declarative/incremental/per_partition_cursor.py\", line 118, in set_initial_state\n    raise AirbyteTracedException(\nairbyte_cdk.utils.traced_exception.AirbyteTracedException: Could not sync parse the following state: {'25157276': {'created_at': '2121-03-18T12:51:05+00:00'}}\n",
            "failure_type": "config_error",
            "stream_descriptor": {
                "name": "commits"
            }
        }
    }
}
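
To show why the change in failure_type matters for error reporting, here is a small sketch that reads trace messages shaped like the ones above and flags only the `system_error` ones. The function names are hypothetical, and treating `system_error` as the category that needs investigation is an assumption about how the reporting is consumed, not something stated in this PR.

```python
import json
from typing import Optional


def failure_type_of(raw_message: str) -> Optional[str]:
    """Return the failure_type of a TRACE/ERROR message, or None otherwise."""
    msg = json.loads(raw_message)
    if msg.get("type") != "TRACE":
        return None
    trace = msg.get("trace", {})
    if trace.get("type") != "ERROR":
        return None
    return trace.get("error", {}).get("failure_type")


def needs_investigation(raw_message: str) -> bool:
    # Assumption: only system errors are unexpected and require action on our side.
    return failure_type_of(raw_message) == "system_error"
```

Under that assumption, the "Before" message would be flagged for investigation and the "After" message would not.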

Release Plan

Following the release of this change, I'll update the following connectors as we are seeing issues for them in prod:

  • Gitlab
  • Jira
  • Pinterest
  • Typeform

@maxi297 maxi297 requested a review from a team as a code owner June 18, 2024 16:38

@octavia-squidington-iii octavia-squidington-iii added the CDK Connector Development Kit label Jun 18, 2024
@maxi297 maxi297 requested a review from lazebnyi June 18, 2024 16:46
@maxi297 maxi297 merged commit 7d56e19 into master Jun 18, 2024
30 checks passed
@maxi297 maxi297 deleted the issue-8257/improve-error-message-on-init-state branch June 18, 2024 16:58