Commit

Merge branch 'master' into docs/schema_evolution_docs
AstrakhantsevaAA committed Mar 18, 2024
2 parents 46e101a + 4c1f4d3 commit b000b25
Showing 1 changed file with 12 additions and 1 deletion.
13 changes: 12 additions & 1 deletion docs/website/docs/general-usage/incremental-loading.md
@@ -294,7 +294,18 @@ def repo_events(
We just yield all the events and `dlt` does the filtering (using `id` column declared as
`primary_key`).

GitHub returns events ordered from newest to oldest, so we declare the `rows_order` as **descending** to [stop requesting more pages once the incremental value is out of range](#declare-row-order-to-not-request-unnecessary-data). We stop requesting more data from the API after finding the first event with `created_at` earlier than `initial_value`.
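The early-stop behavior can be sketched in plain Python (a minimal illustration of the idea, not dlt's actual implementation; `load_new_events` and `pages` are hypothetical names, with `pages` standing in for successive GitHub API responses):

```python
def load_new_events(pages, initial_value):
    """Consume pages ordered newest-to-oldest and stop paging as soon as an
    event older than initial_value appears: descending order guarantees that
    every remaining event is older still."""
    loaded = []
    for page in pages:
        for event in page:
            if event["created_at"] < initial_value:
                return loaded  # no need to request further pages
            loaded.append(event)
    return loaded

# ISO 8601 timestamps compare correctly as strings
pages = [
    [{"id": 3, "created_at": "2024-03-03"}, {"id": 2, "created_at": "2024-03-02"}],
    [{"id": 1, "created_at": "2024-03-01"}],  # this page is never requested
]
print([e["id"] for e in load_new_events(pages, "2024-03-02")])  # [3, 2]
```

With an ascending feed this shortcut would be unsafe, which is why the row order must be declared explicitly.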

:::note
**Note on Incremental Cursor Behavior:**
When loading data with an incremental cursor, it's essential to understand how `dlt` handles records relative to the cursor's last value. By default, `dlt` loads only records whose incremental cursor value is higher than the last known value of the cursor; records with a cursor value lower than or equal to the last recorded value are ignored. This avoids reprocessing records that have already been loaded, but it can cause confusion if you expect to load older records that fall below the current cursor threshold. If your use case requires such records, consider adjusting your data extraction logic, using a full refresh strategy where appropriate, or using `last_value_func` as discussed in the subsequent section.
:::
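The filtering described in the note can be sketched as a small pure-Python model (a conceptual illustration only, not dlt's internal code; `incremental_filter` and its arguments are hypothetical names):

```python
def incremental_filter(records, cursor_path, last_value, last_value_func=max):
    """Keep only records whose cursor value lies strictly beyond last_value
    under last_value_func (max by default), and return the updated state."""
    kept = [
        r for r in records
        if last_value_func((last_value, r[cursor_path])) == r[cursor_path]
        and r[cursor_path] != last_value
    ]
    new_last = last_value_func([last_value] + [r[cursor_path] for r in kept])
    return kept, new_last

rows = [{"updated_at": 1}, {"updated_at": 3}, {"updated_at": 2}, {"updated_at": 5}]

# Default max: rows at or below the stored cursor value (2) are ignored
kept, state = incremental_filter(rows, "updated_at", last_value=2)
print([r["updated_at"] for r in kept], state)  # [3, 5] 5

# A custom last_value_func of min flips the direction: only older rows load
kept, state = incremental_filter(rows, "updated_at", last_value=2, last_value_func=min)
print([r["updated_at"] for r in kept], state)  # [1] 1
```

This is why rows below the threshold silently disappear under the default `max`: they never advance the cursor, so the filter drops them.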


### max, min or custom `last_value_func`

