I am using Spark Structured Streaming with azure-cosmos-spark_3-1_2-12 % "4.16.0".

I need to cap how much data is read per micro-batch. I tried setting the following:

spark.cosmos.changeFeed.itemCountPerTriggerHint = 10
spark.cosmos.read.maxItemCount = 10
spark.cosmos.changeFeed.startFrom = "Beginning"
spark.cosmos.changeFeed.mode = "Incremental"

using

val changeFeedDF = spark.readStream
  .schema(customSchema)
  .format("cosmos.oltp.changeFeed")
  .options(readConfig)
  .load()
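For context, here is a minimal, self-contained sketch of how such a capped change-feed read could be wired up end to end. The connection values (account endpoint, key, database and container names) and the example schema are placeholders, not taken from the original report:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

val spark = SparkSession.builder()
  .appName("cosmos-changefeed-capped-read")
  .getOrCreate()

// Placeholder connection settings, replace with real values.
val readConfig = Map(
  "spark.cosmos.accountEndpoint" -> "https://<account>.documents.azure.com:443/",
  "spark.cosmos.accountKey" -> "<key>",
  "spark.cosmos.database" -> "<database>",
  "spark.cosmos.container" -> "<container>",
  // Hint that caps roughly how many items each micro-batch pulls.
  "spark.cosmos.changeFeed.itemCountPerTriggerHint" -> "10",
  "spark.cosmos.changeFeed.startFrom" -> "Beginning",
  "spark.cosmos.changeFeed.mode" -> "Incremental"
)

// Change-feed streaming reads require an explicit schema; this one is illustrative.
val customSchema = StructType(Seq(
  StructField("id", StringType),
  StructField("_ts", LongType)
))

val changeFeedDF = spark.readStream
  .schema(customSchema)
  .format("cosmos.oltp.changeFeed")
  .options(readConfig)
  .load()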
Hi, itemCountPerTriggerHint lets you cap the maximum memory/resource consumption per micro-batch. It is only a hint because the change feed in Cosmos DB will always include at least all documents of a single atomic transaction (all sharing the same LSN, log sequence number, because they were modified in the same atomic transaction). So you will always get at least the documents of a single atomic transaction per physical partition. From a memory-footprint/resource-consumption perspective that should be more than sufficient, because the number of documents updated in a transaction is also capped (worst case, a single bulk/batch might update around 1,000 to 5,000 documents if they are really very small).
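To make the hint semantics concrete, here is a small sketch, building on the hypothetical stream above, that logs how many documents each micro-batch actually delivered. The checkpoint path is a placeholder; per the explanation above, counts will usually stay at or under the hint but can exceed it when all documents of one atomic transaction (same LSN) arrive together:

import org.apache.spark.sql.DataFrame

// Log the number of rows delivered per trigger; with
// itemCountPerTriggerHint = 10 this is usually <= 10, but may be larger
// when a transactional batch sharing one LSN is included in full.
val query = changeFeedDF.writeStream
  .foreachBatch { (batchDF: DataFrame, batchId: Long) =>
    println(s"micro-batch $batchId delivered ${batchDF.count()} documents")
  }
  .option("checkpointLocation", "/tmp/cosmos-changefeed-checkpoint") // placeholder path
  .start()

query.awaitTermination()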