Commit f1e2d8b

Apply suggestions from code review
Co-authored-by: Daniil Gusev <daniil@quix.io>
1 parent 1d8700c commit f1e2d8b

2 files changed: +6 -5 lines changed

docs/joins.md

Lines changed: 5 additions & 4 deletions
@@ -115,7 +115,6 @@ if __name__ == '__main__':
 ```
 
 
-
 #### State expiration
 `StreamingDataFrame.join_asof` stores the right records to the state.
 The `grace_ms` parameter regulates the state's lifetime (default - 7 days) to prevent it from growing in size forever.
@@ -131,6 +130,7 @@ Adjust `grace_ms` based on the expected time gap between the left and the right
 ### Limitations
 
 - Joining dataframes belonging to the same topics (aka "self-join") is not supported.
+- Join types "right" and "outer" are not supported.
 - As-of join preserves headers only for the left dataframe. If you need headers of the right side records, consider adding them to the value.
 
 ### Message ordering between partitions
@@ -283,7 +283,8 @@ The merging behavior is controlled by the `on_merge` parameter, which works the
 
 ### Limitations
 
-- Joining dataframes belonging to the same topics (aka "self-join") is not supported
-- The `backward_ms` must not be greater than the `grace_ms` to avoid losing data
+- Joining dataframes belonging to the same topic (aka "self-join") is not supported.
+- The `backward_ms` must not be greater than the `grace_ms` to avoid losing data.
 - Interval join does not preserve any headers. If you need headers from any side, consider adding them to the value.
-- Performance of the join depends on the density of the data. If streams on both sides move very fast (a lot of messages) then the performance may drop significantly.
+- The performance of the interval join depends on the density of the data.
+If both streams have too many matching messages falling within the interval, the performance may drop significantly due to the large number of produced outputs.
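
Note on the reworded performance bullet: the sketch below is a plain-Python illustration of why dense streams produce many outputs. It is not the library's implementation. The helper `interval_join_pairs` is hypothetical, and the matching rule (a right record matches when its timestamp lies within `backward_ms` before the left record) is a simplifying assumption made only for this example.

```python
from typing import List, Tuple


def interval_join_pairs(
    left_ts: List[int], right_ts: List[int], backward_ms: int
) -> List[Tuple[int, int]]:
    """Hypothetical helper: pair each left timestamp with every right
    timestamp that falls in the window [left - backward_ms, left]."""
    return [
        (l, r)
        for l in left_ts
        for r in right_ts
        if l - backward_ms <= r <= l
    ]


# Sparse streams: each left record matches a single right record.
sparse = interval_join_pairs([1000, 2000], [900, 1900], backward_ms=200)

# Dense streams: 100 records per side, one per millisecond,
# so every left record matches all right records at or before it.
dense = interval_join_pairs(
    list(range(1000, 1100)), list(range(1000, 1100)), backward_ms=200
)

print(len(sparse), len(dense))
```

In this toy setup the output size grows roughly with the product of the record counts inside the window, which is the effect the new wording attributes the slowdown to.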

quixstreams/dataframe/dataframe.py

Lines changed: 1 addition & 1 deletion
@@ -1737,7 +1737,7 @@ def join_asof(
     def join_interval(
         self,
         right: "StreamingDataFrame",
-        how: Literal["inner", "left"] = "inner",
+        how: JoinHow = "inner",
         on_merge: Union[OnOverlap, Callable[[Any, Any], Any]] = "raise",
         grace_ms: Union[int, timedelta] = timedelta(days=7),
         name: Optional[str] = None,
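
Note on the signature change: the diff does not show how `JoinHow` is defined. The sketch below assumes it is an alias for the literal it replaces, `Literal["inner", "left"]`. It shows how such an alias can be shared between type hints and a runtime check. `validate_how` is a hypothetical helper and not part of the library.

```python
from typing import Literal, get_args

# Assumption: the alias mirrors the literal it replaced in the signature.
JoinHow = Literal["inner", "left"]


def validate_how(how: str) -> str:
    """Hypothetical helper: reject join types outside the alias."""
    allowed = get_args(JoinHow)  # ("inner", "left")
    if how not in allowed:
        raise ValueError(
            f'Unsupported join type "{how}"; expected one of {allowed}'
        )
    return how


print(validate_how("left"))
```

Defining the allowed values once keeps the annotation and the validation in sync, so a call with "right" or "outer" fails early with a clear error.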
