-
Notifications
You must be signed in to change notification settings - Fork 64
[RSDK 4654] Add 'Include Binary' Option #396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this! One minor spelling fix and one question, otherwise lgtm :)
src/viam/app/data_client.py
Outdated
self, | ||
filter: Optional[Filter] = None, | ||
dest: Optional[str] = None, | ||
self, filter: Optional[Filter] = None, dest: Optional[str] = None, include_binary: bool = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is defaulting to False
the right behavior here? I would've expected True
to be a better default but will defer to @dannenberg on this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you'll wanna do some extra work to make this nice (ex add nice errors and a better bool name)... but defaulting to false is likely correct as you can only return one actual file. https://viaminc.slack.com/archives/C0307PGHR7E/p1692649912349079?thread_ts=1692649715.146639&cid=C0307PGHR7E
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can totally see it going both ways. I only made it default to false because True would make the amount of data received of course much larger and noticeably slow down the execution of the function, so I opted for the faster choice. Also @dannenberg the method itself will return as many files as fit into the filter. The 100 vs 1 limit thing only affects how many files are included in a single request. The method repeatedly makes requests over and over again until it gets the right number of files, regardless of the value of include_binary. So if a filter matches 7 files, calling this method will always return a list of those 7 files, they just may or may not actually contain the binary data of the file depending on the value of this boolean. Do you still think it should default to false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as i understand it (based on devin hilly's comment), you will not get any binary data unless you set the limit to 1. the code should be changed to reflect this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dannenberg this true on a per-request basis. In a single request for data, the limit must be 1 if you want the binary data to be included. The limit can be something else if include_binary is false. We account for this here:
data_request = DataRequest(filter=filter, limit=1 if include_file_data else 100, last=last)
This method makes multiple requests to get the relevant files. So if there 10 files matching the filter you pass the method AND you want the binary file data included, the method will make 10 individual gRPC requests (each with a limit of 1, so that the requests actually work) and store each response in a list that is eventually returned. If you don't want the binary file data, then all 10 relevant files can and will be returned in a single response following a single request that has a limit that isn't 1. None of these internal mechanics are exposed to the caller of the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha. thanks.
i see. thank you
…On Tue, Aug 22, 2023 at 10:31 AM Bashar ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/viam/app/data_client.py
<#396 (comment)>
:
> @@ -174,15 +174,15 @@ async def tabular_data_by_filter(
return data
async def binary_data_by_filter(
- self,
- filter: Optional[Filter] = None,
- dest: Optional[str] = None,
+ self, filter: Optional[Filter] = None, dest: Optional[str] = None, include_binary: bool = False
@dannenberg <https://github.com/dannenberg> this true on a per-request
basis. In a single request for data, the limit must be 1 if you want the
binary data to be included. The limit can be something else if
include_binary is false. We account for this here:
data_request = DataRequest(filter=filter, limit=1 if include_file_data
else 100, last=last)
This method makes multiple requests to get the relevant files. So if there
10 files matching the filter you pass the method AND you want the binary
file data included, the method will make 10 individual gRPC requests (each
with a limit of 1, so that the requests actually work) and store each
response in a list that is eventually returned. If you don't want the
binary file data, then all 10 relevant files can and will be returned in a
single response following a single request that has a limit that isn't 1.
None of these internal mechanics are exposed to the caller of the function.
—
Reply to this email directly, view it on GitHub
<#396 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AACPWCFFE7PS5SNAULZ2PWDXWS7E3ANCNFSM6AAAAAA3ZKM4VQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
--
--
Matt Dannenberg
Solution Engineering
viam.com
|
@dannenberg @stuqdog Okay I just changed up the method a bit so that it takes in another variable capping the number of binary files returned. it defaults to 100 if we are not including the actual binary data, but 10 if we are. The user can of course set a number to override either of these values, or pass "0" to indicate no limit (i.e., just return everything that matches). Also, the new boolean now defaults to True. |
|
This PR introduces a new input parameter - a boolean specifying whether or not to actually include the binary file data associated with the requested files - to a binary data retrieval method. Before, this variable was simply never passed when the requests were made, and so no actual data was ever actually returned, only metadata. The PR allows for actual data to be retrieved. Also included in the PR is an update to the relevant class's testing suite and some necessary resulting changes to the method taking on this new input parameter.. and also a docstring update.