-
Notifications
You must be signed in to change notification settings - Fork 755
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add generic protobuf #2634
Merged
Merged
Changes from 5 commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
5b6e1c6
Added generic proto file with basic Value message
Proto007 3231233
Added basic Array message to represent arrays
Proto007 513c0f0
Added Tuple message to represent tuples
Proto007 8f9dca2
Added timestamp and duration datatypes
Proto007 66114a7
Added concise todo comments
Proto007 0ae6f11
Changed field names, message name `NumpyNdarray`
Proto007 b4e5139
Applied changed message name, moved todo comment
Proto007 447dc75
Added trailing newline
Proto007 5e512f3
Fixed sint32 and sint64 field names
Proto007 File filter
Filter by extension
Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
syntax = "proto3"; | ||
|
||
import "google/protobuf/timestamp.proto"; | ||
import "google/protobuf/duration.proto"; | ||
|
||
/* Value represents a single instance of the supported datatypes. | ||
* Naming convention: Follows `numpy.dtype` where appropriate. | ||
* `numpy.dtype`: https://numpy.org/doc/stable/reference/arrays.dtypes.html#specifying-and-constructing-data-types */ | ||
message Value{ | ||
oneof dtype{ | ||
float f4=1; | ||
double f8=2; | ||
uint32 u4=3; | ||
uint64 u8=4; | ||
sint32 i4=5; | ||
sint64 i8=6; | ||
google.protobuf.Timestamp timestamp_=7; | ||
google.protobuf.Duration duration_=8; | ||
bool bool_=9; | ||
string str_=10; | ||
bytes bytes_=11; | ||
Value value_=12; | ||
Array array_=13; | ||
Tuple tuple_=14; | ||
// TODO: int32, int64, fixed32, fixed64, sfixed32, sfixed64 | ||
} | ||
} | ||
|
||
/* Tuple represents a repeated field containing data with same or different datatypes */ | ||
message Tuple{ | ||
repeated Value value_=1; | ||
} | ||
|
||
// TODO: message complex types | ||
|
||
/* Array contains a dtype which identifies the type of the array. | ||
* The repeated field for the identified dtype contains the array. | ||
* Naming convention: Follows `numpy.dtype` where appropriate. | ||
* `numpy.dtype`: https://numpy.org/doc/stable/reference/arrays.dtypes.html#specifying-and-constructing-data-types */ | ||
message Array { | ||
string dtype=1; | ||
repeated float f4=2; | ||
repeated double f8=3; | ||
repeated uint32 u4=4; | ||
repeated uint64 u8=5; | ||
repeated sint32 i4=6; | ||
repeated sint64 i8=7; | ||
repeated google.protobuf.Timestamp timestamp_=8; | ||
repeated google.protobuf.Duration duration_=9; | ||
repeated bool bool_=10; | ||
repeated string str_=11; | ||
repeated bytes bytes_=12; | ||
repeated Array array_=13; | ||
repeated Tuple tuple_=14; | ||
// TODO: int32, int64, fixed32, fixed64, sfixed32, sfixed64 | ||
} | ||
Proto007 marked this conversation as resolved.
Show resolved
Hide resolved
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also i think the filename should be
protos/io_descriptors/numpy.proto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the message name, I thought we were keeping it
Array
to prevent confusion from clients using languages other than python. If I separate all the io descriptors in their own proto, there will be one generated file for each proto. Having one io_descriptor proto means users can access them from one generated file. For example:io_descriptor_pb.Array()
to create a new array andio_descriptor_pb.PandasDataframe()
to create a pandas dataframe and so on.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think NumpyNdarray provides a better distinction than normal Array. Since a lot of the proto message is Numpy specific, thus the name
NumpyNdarray
makes it more clear than justArray
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't do this in JSON, and I'm not sure we want to do this here. Numpy is what's being used in the server under the hood to process generic array data coming from clients, I don't think clients should care what the input format is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made the change to
NumpyNdarray
. If you guys want, I can change it back toArray
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sauyon imo we should follow the naming in io_descriptors for consistency