Port FlatBuffers to Python. #112
rw
Dec 23, 2014
Collaborator
This is a feature-complete Python port, derived from the Go version. I'm not trying to collide with other PRs, but instead provide more options for whatever ends up being merged.
rw
Dec 23, 2014
Collaborator
Please provide feedback on my use of ctypes; it seems mostly unnecessary now but I would benefit from additional opinions.
shaxbee
Dec 23, 2014
Contributor
I've tried ctypes and ended up using struct because it enforces value ranges automatically, takes care of endianness, and returns native Python data types. It also seems to be much faster:
ctypes:
%%timeit raw = struct.pack('<i', 534217); head = 0
n = 0
n |= raw[head]
n |= raw[head + 1] << 8
n |= raw[head + 2] << 16
n |= raw[head + 3] << 24
ctypes.c_int32(n).value
....:
1000000 loops, best of 3: 994 ns per loop
struct:
%%timeit raw = struct.pack('<i', 534217); fmt = struct.Struct('<i'); offset = 0
fmt.unpack_from(raw, offset)[0]
....:
1000000 loops, best of 3: 268 ns per loop
I'll steal the idea of explicit type definitions though.
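For readers following along, here is a minimal sketch of the two decoding approaches compared above, written for Python 3 (where indexing a bytes object yields integers). The function names are made up for illustration and are not taken from either PR.

```python
import ctypes
import struct

# Precompiled little-endian signed 32-bit format, as in the benchmark above.
INT32 = struct.Struct('<i')


def read_int32_shifts(buf, head):
    """Assemble the value byte by byte, then wrap it to a signed int32 via ctypes."""
    n = buf[head]
    n |= buf[head + 1] << 8
    n |= buf[head + 2] << 16
    n |= buf[head + 3] << 24
    return ctypes.c_int32(n).value


def read_int32_struct(buf, head):
    """Let struct handle byte order and sign in a single call."""
    return INT32.unpack_from(buf, head)[0]


raw = INT32.pack(534217)
assert read_int32_shifts(raw, 0) == read_int32_struct(raw, 0) == 534217

negative = INT32.pack(-2)
assert read_int32_shifts(negative, 0) == read_int32_struct(negative, 0) == -2

# struct enforces the value range when packing; ctypes wraps silently.
try:
    INT32.pack(2 ** 31)
    range_checked = False
except struct.error:
    range_checked = True

wrapped = ctypes.c_int32(2 ** 31).value
```

Both readers agree on valid input. The difference shows up on out-of-range values: struct raises an error, while ctypes wraps around without complaint.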
rw
Dec 23, 2014
Collaborator
Nice microbenchmarks.
I am personally more interested in correctness than speed at this point,
hence my emphasis on quite a bit of testing.
rw
Dec 23, 2014
Collaborator
Also, what do you mean by steal?
shaxbee
Dec 23, 2014
Contributor
If you don't mind, I'll use parts of your builder implementation, put the datatype definitions in a separate module as you did, and remove the hardcoded datatype sizes.
rw
Dec 23, 2014
Collaborator
I'd rather you keep your PR at its current scope..."lifting" code is not
the kind way to do things.
shaxbee
Dec 23, 2014
Contributor
I don't understand - you've mentioned combining my PR with yours in #110. I'll go ahead and provide my own implementation of Builder then...
rw
Dec 23, 2014
Collaborator
I think we are both miscommunicating. Reconciling our efforts is my goal.
I'd rather not have flames here :-)
I think we are talking about the same thing then :-)
Where do our implementations overlap?
shaxbee
Jan 2, 2015
Contributor
They don't seem to overlap that much as #110 is based on C++ API and uses struct / array heavily.
My focus was on performance and therefore I reduced the scope of v1 features to reading.
A joint PR would be great :)
shaxbee
Jan 6, 2015
Contributor
@rw: I've merged encode / numtypes code, using struct module but keeping the methods. Could we sync via email?
@shaxbee Yes, please email me. Address is on my profile page. :-)
added a commit to shaxbee/flatbuffers that referenced this pull request Jan 29, 2015
rw
Mar 4, 2015
Collaborator
Beta testers: please evaluate this PR for merging.
I force-pushed some updates:
- Add read benchmark for the gold example data (currently at 2100+ traversals/sec).
- Add write benchmark for the gold example data (currently at 1300+ builds/sec).
- Add the PythonUsage.md documentation file.
- Include dependencies when installing with setup.py install.
- Add code coverage reports to the test runner (currently reports 87%).
- Use faster struct packing.
This branch has feature parity with the Java and Go versions, supports both Python 2 and 3, and is thoroughly tested.
gkogan
commented
Mar 5, 2015
Looks good to me.
jordan52
commented
Mar 7, 2015
This looks to be in really good shape.
rw
Mar 8, 2015
Collaborator
- Reified exceptions into their own types (in the exceptions.py file).
- Sped up read traversals by 50%, to 3000/sec, by removing a supermethod call in the packer.py hotpath.
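The packer.py change itself is not shown in this thread. As a rough illustration of why dropping a super() call can matter on a hot path, the sketch below uses hypothetical classes (none of these names come from the PR) and times both variants with timeit. Actual numbers vary by interpreter and machine.

```python
import struct
import timeit


class Packer(object):
    """Hypothetical base class wrapping a precompiled struct format."""

    def __init__(self, fmt):
        self._struct = struct.Struct(fmt)

    def unpack_from(self, buf, offset):
        return self._struct.unpack_from(buf, offset)[0]


class SuperPacker(Packer):
    """Delegates to the parent through super() on every read."""

    def unpack_from(self, buf, offset):
        return super(SuperPacker, self).unpack_from(buf, offset)


class DirectPacker(Packer):
    """Binds the underlying method once and skips the extra dispatch."""

    def __init__(self, fmt):
        super(DirectPacker, self).__init__(fmt)
        self._unpack_from = self._struct.unpack_from

    def unpack_from(self, buf, offset):
        return self._unpack_from(buf, offset)[0]


buf = struct.pack('<i', 534217)
via_super = SuperPacker('<i')
direct = DirectPacker('<i')

# Both variants must decode the same value; only the call overhead differs.
assert via_super.unpack_from(buf, 0) == direct.unpack_from(buf, 0) == 534217

t_super = timeit.timeit(lambda: via_super.unpack_from(buf, 0), number=100000)
t_direct = timeit.timeit(lambda: direct.unpack_from(buf, 0), number=100000)
print('super(): %.3fs, direct: %.3fs' % (t_super, t_direct))
```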
rw
Mar 9, 2015
Collaborator
Increase code coverage to 97%: Add cases to generate and test conditionals not traversed with the 'gold' example Monster data.
shaxbee
Mar 9, 2015
Contributor
+1 for this PR, I don't have capacity at the moment to finish up #110.
gwvo
Apr 8, 2015
Contributor
Robert, do you want to do more to this commit, or is it "good enough" for now?
thedrow
commented
Apr 12, 2015
This PR needs to be rebased.
rw
Apr 12, 2015
Collaborator
Pushed some updates:
- Rebase on top of latest master.
- Update use of GenComment to match the new type signature.
- Squash commits.
- Comment out the 'typed Python'-style asserts.
ready then? :)
@gwvo Yep!
gwvo
Apr 13, 2015
Contributor
Python2 tests:
Traceback (most recent call last):
File "py_test.py", line 17, in <module>
import MyGame.Example.Any # refers to generated code
File "/usr/local/google/home/wvo/rep/vendor/unbundled_google/libs/flatbuffers/tests/MyGame/Example/Any.py", line 5, in <module>
from enum import Enum
ImportError: No module named enum
rw
Apr 17, 2015
Collaborator
Improved compatibility:
- Remove all external dependencies for the runtime library and generated code.
- Add Python 2.6 compatibility. CPython 2.6, 2.7, 3+, and PyPy are now passing, using the same generated code.
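For context, the enum module only entered the standard library in Python 3.4, which explains the ImportError reported on Python 2. One dependency-free way to represent a schema enum is a plain class of integer constants, sketched below. The member names, values, and the lookup helper are illustrative assumptions and may not match what the generator emits.

```python
class Any(object):
    """Plain-class stand-in for an enum; needs no imports on any Python version."""
    NONE = 0
    Monster = 1


def any_name(value):
    """Hypothetical helper: map a numeric value back to its member name."""
    for name, member in vars(Any).items():
        if not name.startswith('_') and member == value:
            return name
    raise ValueError('unknown Any value: %r' % (value,))


assert Any.Monster == 1
assert any_name(0) == 'NONE'
```

The same source runs unchanged on Python 2.6, 2.7, and 3.x, at the cost of losing the type safety that a real enum type would provide.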
gwvo
Apr 20, 2015
Contributor
Still get this: ImportError: No module named enum in __init__.py on all 3 tests. My actual Python version is 2.7.6. On Linux.
Also maybe have the files generated in tests rather than tests/py_gen, that way they line up with Java/C#.
rw
Apr 21, 2015
Collaborator
More tweaks:
- Lift generated Python files into tests/MyGame, like we do for the other ports.
- Tests will run if at least one Python interpreter is found.
- Tests will run if no coverage utility is found.
- Disable benchmarks by default.
Now we disable Go/Java comparison checks by default.
layzerar
Apr 29, 2015
Contributor
@rw I found a small bug in the setup script; the following patch should fix it.
python/setup.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/python/setup.py b/python/setup.py
index f067b38..44f5ee7 100644
--- a/python/setup.py
+++ b/python/setup.py
@@ -8,7 +8,7 @@ setup(
author_email='me@rwinslow.com',
url='https://github.com/python/flatbuffers/python',
long_description='Python runtime library and code generator for use with the Flatbuffers serialization format.',
- packages=['flatbuffers'],
+ packages=['flatbuffers', 'flatbuffers.vendor'],
include_package_data=True,
requires=[],
description='Runtime library and code generator for use with the Flatbuffers serialization format.',
After reinstalling this package, everything is OK for me.
Thanks @layzerar! Pushed.
Life is short, how about this?
@layzerar Almost there, incorporating some out-of-band feedback.
rw
May 12, 2015
Collaborator
Factored out some duplicated code, made the runtime library PEP8-compliant, and made a number of other stylistic fixes. The structure of the code has not changed.
This is looking OK to me. Anyone else want to give feedback?
layzerar
May 13, 2015
Contributor
- flatc failed to compile in VS2013; the error log is here:
1>------ Rebuild All started: Project: flatc, Configuration: Release Win32 ------
2>------ Rebuild All started: Project: flatsamplebinary, Configuration: Release Win32 ------
3>------ Rebuild All started: Project: flatsampletext, Configuration: Release Win32 ------
4>------ Rebuild All started: Project: flattests, Configuration: Release Win32 ------
1> idl_gen_fbs.cpp
2> sample_binary.cpp
4> idl_gen_fbs.cpp
3> idl_parser.cpp
1> idl_gen_general.cpp
4> idl_gen_general.cpp
3> idl_gen_text.cpp
1> idl_gen_go.cpp
4> idl_parser.cpp
3> sample_text.cpp
2> flatsamplebinary.vcxproj -> C:\Users\layz\Desktop\flatbuffers\flatbuffers\build\VS2010\Release\flatsamplebinary.exe
1> idl_gen_python.cpp
3> Generating Code...
4> idl_gen_text.cpp
1>..\..\src\idl_gen_python.cpp(577): error C2065: 'S_IRWXU' : undeclared identifier
1>..\..\src\idl_gen_python.cpp(577): error C2065: 'S_IRGRP' : undeclared identifier
1>..\..\src\idl_gen_python.cpp(577): error C2065: 'S_IXGRP' : undeclared identifier
1>..\..\src\idl_gen_python.cpp(577): error C2065: 'S_IROTH' : undeclared identifier
1>..\..\src\idl_gen_python.cpp(577): error C2065: 'S_IXOTH' : undeclared identifier
1> idl_parser.cpp
4> test.cpp
1> idl_gen_cpp.cpp
4> Generating Code...
1>..\..\src\idl_gen_cpp.cpp(276): error C2220: warning treated as error - no 'object' file generated
1>..\..\src\idl_gen_cpp.cpp(276): warning C4189: 'nested_root' : local variable is initialized but not referenced
1> idl_gen_text.cpp
3> flatsampletext.vcxproj -> C:\Users\layz\Desktop\flatbuffers\flatbuffers\build\VS2010\Release\flatsampletext.exe
1> flatc.cpp
1> Generating Code...
4> flattests.vcxproj -> C:\Users\layz\Desktop\flatbuffers\flatbuffers\build\VS2010\Release\flattests.exe
========== Rebuild All: 3 succeeded, 1 failed, 0 skipped ==========
- py_test.py fails with the following error (Python 2.6):
Traceback (most recent call last):
File "C:\Users\layz\Desktop\flatbuffers\flatbuffers\tests\py_test.py", line 1331, in <module>
main()
File "C:\Users\layz\Desktop\flatbuffers\flatbuffers\tests\py_test.py", line 1294, in main
('<benchmark read count> <benchmark build count>')
TypeError: 'str' object is not callable
- this patch would fix it:
tests/py_test.py | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/tests/py_test.py b/tests/py_test.py
index c6861f1..2934394 100644
--- a/tests/py_test.py
+++ b/tests/py_test.py
@@ -1290,13 +1290,13 @@ def main():
import os
import sys
if not len(sys.argv) == 4:
- sys.stderr.write(('Usage: %s <benchmark vtable count>')
- ('<benchmark read count> <benchmark build count>')
- ('\n' % sys.argv[0]))
- sys.stderr.write((' Provide COMPARE_GENERATED_TO_GO=1 to check')
- ('for bytewise comparison to Go data.\n'))
- sys.stderr.write((' Provide COMPARE_GENERATED_TO_JAVA=1 to check')
- ('for bytewise comparison to Java data.\n'))
+ sys.stderr.write('Usage: %s <benchmark vtable count>'
+ '<benchmark read count> <benchmark build count>'
+ '\n' % sys.argv[0])
+ sys.stderr.write(' Provide COMPARE_GENERATED_TO_GO=1 to check'
+ 'for bytewise comparison to Go data.\n')
+ sys.stderr.write(' Provide COMPARE_GENERATED_TO_JAVA=1 to check'
+ 'for bytewise comparison to Java data.\n')
sys.stderr.flush()
sys.exit(1)
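The TypeError in the traceback above comes from how Python parses adjacent parenthesized strings: ('a')('b') is a call expression, whereas adjacent string literals with no parentheses between them are concatenated at compile time. A minimal sketch with made-up message text (the function names are hypothetical):

```python
def broken_usage(prog):
    # ('...')('...') parses as calling the first string with the second
    # as its argument, which raises TypeError at runtime.
    return (('Usage: %s <a> ')
            ('<b> <c>')
            ('\n' % prog))


def fixed_usage(prog):
    # Adjacent literals are joined into one string at compile time,
    # so % formats the whole message.
    return ('Usage: %s <a> '
            '<b> <c>\n') % prog


try:
    broken_usage('py_test.py')
    raised = False
except TypeError:
    raised = True

assert raised
assert fixed_usage('py_test.py') == 'Usage: py_test.py <a> <b> <c>\n'
```

Recent CPython versions also emit a SyntaxWarning when compiling the broken form, which can help catch this mistake before the code path is ever executed.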
gwvo
May 13, 2015
Contributor
For S_IRWXU etc: make sure you use the mkdir related function in util.h instead.
The nested_root thing I can fix.
rw
May 13, 2015
Collaborator
@layzerar Thanks, fixed string catenation in py_test.py's error message.
@gwvo Switched from
Ok, merge at will :)
added a commit that referenced this pull request May 13, 2015
rw merged commit f8139b0 into google:master May 13, 2015
thedrow
commented
May 13, 2015
Woot!
Brilliant!
We've got it on PyPI:
Downchuck
Sep 25, 2015
Is ./flatc still needed to do code gen? I'm hitting issues with its python output.
EDIT: My bad -- looks like "namespace" is required when generating python output.
@rw Congrats on getting this through! :-)
rw commented Dec 23, 2014
Implement code generation and runtime library for Python 2 and 3, derived
from the Go implementation. Additionally, the test suite verifies:
- the exact bytes in the Builder buffer during many scenarios,
- vtable deduplication, and
- table construction, via a fuzzer derived from the Go implementation.