Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added serialization, bitstreams, traits for int sign, TagPayloadType, some fixes to std #1775

Merged
merged 6 commits into from Feb 1, 2019
Merged

Added serialization, bitstreams, traits for int sign, TagPayloadType, some fixes to std #1775

merged 6 commits into from Feb 1, 2019

Conversation

tgschultz
Copy link
Contributor

Added BitInStream/BitOutStream to std.io:
These streams wrap any other In/Out stream and allow you to read/write bitfields with a specified endianess.

Added Serialzer/Deserializer to std.io:
These wrap any other In/Out stream and provide the ability to read/write almost any type, including packed types, with specified endianess. Operates in either packed or unpacked mode. The former will bit-pack everything, the latter will byte-pack everything except packed structs which are bit-packed and then padded to the nearest byte.

Added traits for int sign to std.meta.trait:
trait.isSignedInt, trait.isUnsignedInt. These should be self-explanatory.

Added TagPayloadType to std.meta:
Given an enum tag and a union type, returns the type of the union field corresponding to the tag. Solves #1765.

Fixes:
std.meta.bitCount and std.meta.alignment now return comptime_int. std.meta.eql now handles Optionals.

@jayschwa
Copy link
Sponsor Contributor

jayschwa commented Nov 23, 2018

The [de]serializer doc comments should probably mention the format used (e.g. JSON, CBOR, Flat Buffers, Protocol Buffers, a custom format, etc.)

@winksaville
Copy link
Contributor

I'd like to suggest assert tests for edge cases such as u0, {ui}1,7,8,9 15,16,17,31,32,33,63,64,65,127,128 in big and little endian. As well as NaN and Inf for f32, f64.

@tgschultz
Copy link
Contributor Author

I'd like to suggest assert tests for edge cases such as u0, {ui}1,7,8,9 15,16,17,31,32,33,63,64,65,127,128 in big and little endian. As well as NaN and Inf for f32, f64.

If you look at this function in io_test.zig, you will see that every bit width between 0 and max_test_bitsize is tested for both signed and unsigned, big endian and little endian, bit-packed and byte-packed. I set max_test_bitsize to 17 for this push, but before the push I had it at 128 (which added significantly to the compile time). I brought it down because I had no reason to believe that any larger bit width would fail if 17 succeeded, excepting due to compiler and llvm bugs.

I didn't test float edge cases, which is a good idea, however I expect they will work fine since floats are just bit-casted to an equivelent unsigned integer before being serialized since zig guarantees a float representation.

@winksaville
Copy link
Contributor

Thanks for pointing out testIntSerializerDeserializer, I missed that, I saw the explict tests in BitInStream and BitOutStream and thought there should be more tests. My suggestion would be if its not pratical to test everything with testIntSerializerDeserializer the maybe add additional tests to BitInStream/BitOutStream probably adding the float tests there too.

One other suggestion, prepend all of the test names with "io_test.", like test "io_test.BitInStream", find it makes running tests much faster.

With --test-filter io the time is less than a second:

$ time zig test std/io.zig --test-filter io
Test 1/10 io.SliceOutStream...OK
Test 2/10 io.NullOutStream...OK
Test 3/10 io.CountingOutStream...OK
Test 4/10 import io tests...OK
Test 5/10 io_test.write a file, read it, then delete it...OK
Test 6/10 io_test.BufferOutStream...OK
Test 7/10 io_test.SliceInStream...OK
Test 8/10 io_test.PeekStream...OK
Test 9/10 io_test.io.SliceOutStream...OK
Test 10/10 math overflow functions...OK
All tests passed.

real	0m0.747s
user	0m0.675s
sys	0m0.062s

Compared to nearly 10s with no filter:

$ time zig test std/io.zig
Test 1/694 io.SliceOutStream...OK
Test 2/694 io.NullOutStream...OK
Test 3/694 io.CountingOutStream...OK
Test 4/694 import io tests...OK
Test 5/694 std...OK
Test 6/694 mem.secureZero...OK
...
Test 691/694 zig fmt: Block after if...OK
Test 692/694 zig fmt: use...OK
Test 693/694 zig fmt: string identifier...OK
Test 694/694 zig fmt: error return...OK
692 passed; 2 skipped.

real	0m9.673s
user	0m9.072s
sys	0m0.685s
```

@winksaville
Copy link
Contributor

I see you made the test changes just as I was commenting, LGTM!

Please consider renaming the tests.

@tgschultz
Copy link
Contributor Author

I normally name my tests after the package path (i.e: std.meta.x, std.meta.trait.x), but in this case I followed the practice of the files I added to because I didn't want to arbitrarily change a bunch of other people's stuff, you know?

@winksaville
Copy link
Contributor

I understand, if you don't want to, I'll create a PR after this one gets merged. Making the name change as a separate PR is probably a good idea anyway.

@andrewrk
Copy link
Member

This looks like good stuff, it's just a lot, so I hope you don't mind if it takes me a while to merge.

Do you mind if I re-organize your commits? I'll try to make sure you still get authorship, but it'll end up breaking stuff into smaller changes (and it would remove the "verified" status of the commits). The other option is for you to submit each independent thing separately.

@tgschultz
Copy link
Contributor Author

I said it in IRC but I'll repeat it here for the record, I don't mind if you re-organize my commits.

@bfloch
Copy link

bfloch commented Dec 8, 2018

Great work @tgschultz !

I'm only struggling with the POD requirement for serialization.

How do you feel about an optional callback function for handling pointers as opposed to a compiletime error.
Use cases:

  • Nested serialization (store address as unique id of current state an serialize target at offset with index)
  • Nulling pointers and let deserialization in client code reinitialize it
  • Serialization of a programs state where client knows it is safe to de-serialize to same address as state did not change

If the callback is not given the current behaviour would persist.

Currently I write my own serialization and I can not replace it with the proposal because of the limitation.
But if you feel the scope is too broad for std or there is too much rope involved I wouldn't mind. Just curious about opinions. Thanks.

@bfloch
Copy link

bfloch commented Dec 8, 2018

Just notice that you can implement fn serialize on your type. That's were serializations should go that require domain knowledge. Awesome!

@tgschultz
Copy link
Contributor Author

tgschultz commented Dec 8, 2018

Thanks, bfloch.

The idea behind this set of tools was to be very generic, and the problem with serializing pointers in structs, in my opinion, is that there just isn't an obvious way to do it that isn't dependent on the use case and is unsurprising (extern unions have a similar problem). As you've discovered, I included custom serializer/deserializer support to allow the user to handle such cases in whatever way their use case demands. Hopefully that will cover your needs.

…side of the struct and aliased inside it. This will enable alternate generic serializers (i.e. one that follows pointers) on a struct-by-struct basis.
@andrewrk andrewrk merged commit bbe857b into ziglang:master Feb 1, 2019
@tgschultz tgschultz deleted the stdlib-serialization branch February 6, 2019 03:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants