Problem description
Consider an example where hundreds of cameras are writing video to a single Pravega stream with dozens of segments. The routing key would be the camera ID. One use case is to perform analytics on the entire stream (all cameras) using Flink. This is covered perfectly by the current event reader API. However, another use case on the same stream is to read the video of just a single camera and display it on the screen or perform ad hoc analysis on it. Our event reader API will not work here because it would require reading all segments, something that a simple single-threaded app cannot do.
I propose a new event reader API that accepts the stream name and a routing key as input. It would then return only the events in the segments that contain that routing key, limiting the quantity of events a reader needs to read to those of a single segment.
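To make the proposal concrete, here is a rough sketch of what such an API could look like. Everything named here is hypothetical (`KeyedReaderFactory` and `createReaderForKey` do not exist in the Pravega client); only `EventStreamReader` and `Serializer` are existing client types.

```java
import io.pravega.client.stream.EventStreamReader;
import io.pravega.client.stream.Serializer;

// Hypothetical API -- nothing named here exists in the client yet.
public interface KeyedReaderFactory {

    // Returns a reader restricted to the segment whose key-hash range covers
    // routingKey. On a scale event the reader would transparently move to the
    // successor segment owning that key.
    <T> EventStreamReader<T> createReaderForKey(String scope,
                                                String stream,
                                                String routingKey,
                                                Serializer<T> serializer);
}

// Example usage: tail the video of a single camera.
//   EventStreamReader<byte[]> reader =
//       factory.createReaderForKey("examples", "cameras", "camera-42", serializer);
//   EventRead<byte[]> frame = reader.readNextEvent(1000);
```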
An alternative solution would be to use multiple streams, each with a small number of segments (e.g., cameras1-4, cameras5-8, cameras9-12). This is not ideal, though, because apps that do need to read all cameras would have to determine the names of all of the streams, and adding or removing cameras, as well as handling changing data rates, could be difficult. These are exactly the problems Pravega solves well when a single large stream is used.
Problem location
Pravega client event reader
Suggestions for an improvement
See above.
Bonus: Instead of limiting reads to a single routing key, a reader will sometimes want to read the segments of two or three routing keys. As long as the number of routing keys to read is smaller than the number of segments, this is still better than reading the entire stream.
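As a back-of-the-envelope illustration of why this helps: each open segment owns a sub-range of the keyspace [0.0, 1.0) and each routing key hashes to a single point in it, so reading k keys touches at most k segments no matter how many segments the stream has. The helper below is purely illustrative; the type names are invented, not Pravega client classes.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.ToDoubleFunction;

// Illustrative stand-ins only; these are not Pravega client classes.
class KeyRange {
    final double low, high;          // half-open interval [low, high)
    KeyRange(double low, double high) { this.low = low; this.high = high; }
}

class SegmentSelector {
    // Given the current segment -> key-range map, return only the segments
    // whose range covers at least one of the requested routing keys.
    static Set<String> segmentsForKeys(Map<String, KeyRange> segmentRanges,
                                       Set<String> routingKeys,
                                       ToDoubleFunction<String> keyHash) {
        Set<String> selected = new HashSet<>();
        for (String key : routingKeys) {
            double h = keyHash.applyAsDouble(key);
            for (Map.Entry<String, KeyRange> e : segmentRanges.entrySet()) {
                if (h >= e.getValue().low && h < e.getValue().high) {
                    selected.add(e.getKey());
                    break;  // ranges are disjoint, so at most one segment matches
                }
            }
        }
        return selected;
    }
}
```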
One way to do this would be to have a reader-side model where readers are 1:1 with segments, so every time there is a scaling event the number of readers changes. This would require a different API to manage this sort of dynamic creation of readers.
On the writer side, it is possible to create a model where keys map directly to segments. (This would obviously be bad if there were a very large number of keys.)
Taken together, this would essentially provide a dynamic group of streams that are all related.
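A very rough sketch of what that reader-side model could look like from the application's point of view, with readers kept 1:1 with segments. All names here are hypothetical; the real client exposes no such API.

```java
import io.pravega.client.stream.EventStreamReader;

import java.util.List;

// Hypothetical API for the reader-side model above: because readers stay
// 1:1 with segments, the client must surface scale events so the
// application can adjust its set of readers.
interface PerSegmentReaderGroup<T> {

    // Invoked when a scale event seals segments and creates successors, so
    // the application can retire old readers and start consuming new ones.
    interface ScaleListener<T> {
        void onSegmentsOpened(List<EventStreamReader<T>> newReaders);
        void onSegmentsSealed(List<EventStreamReader<T>> retiredReaders);
    }

    // The readers currently open, one per live segment.
    List<EventStreamReader<T>> currentReaders();

    void register(ScaleListener<T> listener);
}
```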