-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Alex/limit paginator #17197
Alex/limit paginator #17197
Conversation
inject_into: request_parameter | ||
field_name: page_size | ||
field_name: limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed to limit as per comment at https://github.com/airbytehq/airbyte/pull/17197/files#diff-b36b8735005712a942bee18c2b6b9b824d9b08be821b16c4ea527d07e2f8901cR26
|
||
Iterating over pages of result is different from iterating over stream slices. | ||
Stream slices have semantic value, for instance, a Datetime stream slice defines data for a specific date range. Two stream slices will have data for different date ranges. | ||
Conversely, pages don't have semantic value. More pages simply means that more records are to be read, without specifying any meaningful difference between the records of the first and later pages. | ||
|
||
The paginator is defined by | ||
|
||
- `page_size`: The number of records to fetch in a single request | ||
- `limit_option`: How to specify the page size in the outgoing HTTP request | ||
- `page_size_option`: How to specify the page size in the outgoing HTTP request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will add complete schema definitions in a follow up PR
page_token_option (RequestOption): the request option to set the page token | ||
pagination_strategy (PaginationStrategy): Strategy defining how to get the next page token | ||
config (Config): connection config | ||
url_base (Union[InterpolatedString, str]): endpoint's base url | ||
decoder (Decoder): decoder to decode the response | ||
""" | ||
|
||
page_size: int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved to the strategy
@@ -16,7 +16,7 @@ | |||
author="Airbyte", | |||
author_email="contact@airbyte.io", | |||
packages=find_packages(), | |||
install_requires=["airbyte-cdk", "dataclasses-jsonschema==2.15.1"], | |||
install_requires=["airbyte-cdk==0.1.89", "dataclasses-jsonschema==2.15.1"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
freeze the version until greenhouse.yaml
is updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
@@ -16,7 +16,7 @@ definitions: | |||
type: "BearerAuthenticator" | |||
api_token: "{{ config.apikey }}" | |||
cursor_paginator: | |||
type: LimitPaginator | |||
type: DefaultPaginator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sendgrid is safe to update because it is not used in production
@@ -18,7 +18,7 @@ definitions: | |||
type: "BearerAuthenticator" | |||
api_token: "{{ config.auth_token }}" | |||
paginator: | |||
type: LimitPaginator | |||
type: DefaultPaginator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sentry is safe to update because it is not used in production
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i had a question related to why the cursor paginator needs to have a field for the page size. If anything we might want to leave it out of language til proven we needed it. But nothing to block on!
airbyte-cdk/python/CHANGELOG.md
Outdated
## 0.1.89 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Were these formatting changes intended? Did the linter/formatter get changed?
inject_into: header | ||
field_name: page_size | ||
pagination_strategy: | ||
type: "OffsetIncrement" | ||
page_size: 5 | ||
page_token: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this also be renamed to page_token_option
? not the changed line, but the one below it (line 54)
@@ -30,6 +31,7 @@ class CursorPaginationStrategy(PaginationStrategy, JsonSchemaMixin): | |||
cursor_value: Union[InterpolatedString, str] | |||
config: Config | |||
options: InitVar[Mapping[str, Any]] | |||
page_size: Optional[int] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the use case for page_size
in a cursor paginator? We need the method to be defined to satisfy the interface, but maybe we should just return None
? Also useful for making it easy to throw an error in the DefaultPaginator if the developer sets a page_size_option
for a cursor paginator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some APIs use a page size with a cursor (request 5 records per page and get the next page from the cursor)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah okay, makes sense!
@@ -16,7 +16,7 @@ | |||
author="Airbyte", | |||
author_email="contact@airbyte.io", | |||
packages=find_packages(), | |||
install_requires=["airbyte-cdk", "dataclasses-jsonschema==2.15.1"], | |||
install_requires=["airbyte-cdk==0.1.89", "dataclasses-jsonschema==2.15.1"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
/publish-cdk dry-run=true
|
/publish-cdk dry-run=false
|
* remove page size from limit paginator * Rename LimitPaginator to DefaultPaginator * rename limit_options * rename method * cleanup * comment * update schema file * Update changelog * pin greenhouse connector * fix doc in comment * fix changelog * missing - * bump * bump
What
page_size
from the LimitPaginator definition #16562 and Renamelimit_option
inLimitPaginator
to matchpage_size
#15575 by removing duplicate field from LimitPaginatorLimitPaginator
toDefaultPaginator
because it's not specific to limitsFollow up item:
Recommended reading order
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/default_paginator.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/strategies/pagination_strategy.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/strategies/offset_increment.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/strategies/page_increment.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/parsers/class_types_registry.py
🚨 User Impact 🚨
Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.
Pre-merge Checklist
Expand the relevant checklist and delete the others.
New Connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampledocs/integrations/README.md
airbyte-integrations/builds.md
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing/publish
command described hereUpdating a connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampleAirbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing/publish
command described hereConnector Generator
-scaffold
in their name) have been updated with the latest scaffold by running./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates
then checking in your changesTests
Unit
Put your unit tests output here.
Integration
Put your integration tests output here.
Acceptance
Put your acceptance tests output here.