Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add async serialize option via coroutines #4

Merged
merged 4 commits into from
May 11, 2024

Conversation

oratory
Copy link
Contributor

@oratory oratory commented Apr 12, 2021

This adds an async=true config option to coroutine.yield() in the serialize function. This removes or reduces framerate loss while processing large tables.

Copy link
Owner

@rossnichols rossnichols left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great idea, but it needs some more work before I can merge it.

LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
tests.lua Outdated Show resolved Hide resolved
tests.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated
@@ -1265,10 +1279,13 @@ end

function LibSerialize:SerializeEx(opts, ...)
self:_ClearReferences()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the library stores state on itself during serialization, you're going to have a bad time if someone (perhaps another addon) tries to serialize something while another serialization is happening asynchronously. You'll need to update all such state to be managed at the function call level. Also add a test case for this (make sure it fails in your current version).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probably rename combinedOpts and use it as the state for the current serialization, since it's already plumbed everywhere. Just need to make sure the loop that writes user-specified keys is restricted to valid options.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not store any global states in the library.

The state can be a function local variable, because the state is cleared at the beginning and at the end of Serialization/Deserialization,

Change the prototype of self:_ClearReferences() to self:_ClearReferences(state), where "state" is initialized as a table.
Store "refsDirty", "stringRefs" and "tableRefs" in the variable "state" instead.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I probably shouldn't have done it that way to begin with, but it didn't make a difference until now :) unless people tried to dig into the library's internals, I suppose, but that was never a support scenario and I'm not concerned about breaking them.

LibSerialize.lua Show resolved Hide resolved
@SafeteeWoW
Copy link

LibDeflate got a similar PR.

I suggest a DeserializeBegin/Continue/End API, instead of coroutine, so the serialization can be done by chunk.
Need to add a state machine in the code.
Much more complicated than coroutine though

@rossnichols
Copy link
Owner

rossnichols commented Apr 12, 2021

LibDeflate got a similar PR.

I suggest a DeserializeBegin/Continue/End API, instead of coroutine, so the serialization can be done by chunk.
Need to add a state machine in the code.
Much more complicated than coroutine though

Yeah I was thinking about that, but as you said it's much more disruptive to the entire serialization process, and given that the current code uses recursion and iterators as implicit state while serializing, it'd be quite a large undertaking. We do have regression tests, but a change of that magnitude still worries me.

Given that the coroutine is opt-in and quite undisruptive to the code, I think it's preferable - you're taking on the overhead of the coroutine, sure, but it's an opt-in feature to avoid mini-hangs so I think the tradeoff is reasonable. But in case we did want to refactor to remove it in the future, I'd still want the coroutine management to be moved into the library, as I suggested above.

For reference, the LibDeflate PR: SafeteeWoW/LibDeflate#6

@rossnichols
Copy link
Owner

@oratory i haven't heard from you since you first opened the PR - are you planning on making these changes?

@oratory
Copy link
Contributor Author

oratory commented Apr 13, 2021

Hey yes I'm hoping to make these changes today or tomorrow.

@oratory
Copy link
Contributor Author

oratory commented Apr 14, 2021

Coroutines are now all managed within the lib, and cleaned up a bunch. SerializeAsync() and DeserializeAsync() are separate functions rather than having an async option.

One thing I'd still like to add is a check if an async process is currently running since it will break if a 2nd is started.

@rossnichols
Copy link
Owner

One thing I'd still like to add is a check if an async process is currently running since it will break if a 2nd is started.

This needs to be supported - this is a global library and may be used by any number of addons, which should not have to handle the case of "I want to serialize something but I'm told to come back later".

Copy link
Owner

@rossnichols rossnichols left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current iteration still has issues:

  • Introduces non-WoW dependencies
  • Doesn't support multiple simultaneous serialization/deserialization
  • Insufficient test coverage

The API shape as implemented isn't possible to implement in pure Lua, so it'll need updating. I'd recommend adding a section to the readme to describe how to incorporate it into WoW's environment, though, using an OnUpdate frame script to drive the API.

LibSerialize.lua Outdated
Comment on lines 158 to 152
Callback arguments:
* `...`: the deserialized value(s)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What callback is this referring to?

LibSerialize.lua Outdated

Arguments:
* `callback`: a function called when serialization is complete
`callback` is called with the serialized string as its argument, or nil on failure
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this, you have the explicit section below.

The serialization API didn't have a concept of a non-error()ing failure mode before - can you propagate any failure from within the coroutine and "rethrow" it?

LibSerialize.lua Outdated
Arguments:
* `input`: a string previously returned from `LibSerialize:Serialize()`
* `callback`: a function called when deserialization is complete
`callback` is called with `success` (Boolean) as the first parameter,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - remove it since you have the specific section below.

LibSerialize.lua Outdated

Arguments:
* `input`: a string previously returned from `LibSerialize:Serialize()`
* `callback`: a function called when deserialization is complete
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - remove it since you have the specific section below. But this time you actually don't have the section, so you need to add it :)

LibSerialize.lua Outdated
@@ -182,6 +230,14 @@ The following serialization options are supported:
table encountered during serialization. The function must return true for
the pair to be serialized. It may be called multiple times on a table for
the same key/value pair. See notes on reeentrancy and table modification.
* `asyncYieldSize`: `number` (default 50000)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be documented and enforced as only being valid from SerializeAsync().

LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated
}

local canSerializeFnOptions = {
errorOnUnserializableType = false
}

local CreateFrame = CreateFrame
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The library is designed to be runnable on Lua 5.1.4+ with no modifications - you can't introduce WoW-specific dependencies. Unfortunately this will complicate the API, since you can't drive the async loop yourself, but it'll fix the other issue of not supporting multiple simultaneous callers.

For reference, this is what I do on Windows:

  1. Get Lua from https://sourceforge.net/projects/luabinaries/files/5.1.4/Tools%20Executables/lua5_1_4_Win64_bin.zip/download
  2. lua.exe tests.lua

All tests should pass.

LibSerialize.lua Outdated
return FlushWriter()
end

function LibSerialize:SerializeAsyncEx(callback, opts, ...)
if not CreateFrame then
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this isn't sufficient of a deterrent, since the tests still need to run outside the WoW environment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an idea here that if the lib is run outside of WoW (or without a callback) then the async function will instead return some simple handlers to operate the coroutine. I'll work on that and update.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That won't be sufficient - the async version needs to be able to be tested outside WoW.

LibSerialize.lua Outdated
self:_ClearReferences()
local ReadBytes, ReaderBytesLeft = CreateReader(input)

self._readBytes = ReadBytes
opts = opts or {}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and elsewhere - inconsistent indentation (you're using 2 instead of 4?)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not modify function arguments. Make a copy

tests.lua Outdated
@@ -66,6 +66,21 @@ function LibSerialize:RunTests()
assert(tab.nested.c == nil)
end

do
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need more test cases for the other scenarios we've brought up, e.g. actually triggering a yield, multiple simultaneous async serializations/deserializations (including mixing those two categories), propagating an error from a failed serialization/deserialization.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to remove all global states that may share between API calls.
Test for multiple simultaneous async serializations/deserializations is still needed.

LibSerialize.lua Outdated
@@ -1265,10 +1279,13 @@ end

function LibSerialize:SerializeEx(opts, ...)
self:_ClearReferences()
local WriteString, FlushWriter = CreateWriter()
local WriteString, FlushWriter, DoYield = CreateWriter()

self._writeString = WriteString

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not modify "self" in API. This introduces global states

LibSerialize.lua Outdated

self._writeString = WriteString
self:_WriteByte(SERIALIZATION_VERSION)
if opts.async then
self._doYield = DoYield

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not modify "self" in API. This introduces global states

@SafeteeWoW
Copy link

Currently, this library uses lots of global states, so coroutine will have lots of problems.

Do not modify "self" in API.
Do not modify file-scope variable in API.

Store the states in a local table variable, and pass it through function

LibSerialize.lua Outdated
@@ -1346,6 +1464,26 @@ function LibSerialize:DeserializeValue(input)
return unpack(output, 1, outputSize)
end

function LibSerialize:DeserializeValueAsync(input, callback, opts)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a WoW specific API is needed here.
Better to put it in a example code

LibSerialize.lua Outdated
function LibSerialize:DeserializeAsync(input, callback, opts)
opts = opts or {}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not modify user arguments. Make a copy

@rossnichols
Copy link
Owner

@oratory as mentioned in several comments, adding this functionality is difficult due to the library storing state effectively globally. Would you like me to make a separate PR fixing that, so you don't have to worry about it in this one? You'd just have to rebase onto it.

@oratory
Copy link
Contributor Author

oratory commented Apr 15, 2021

If you want to make that change anyway, then sure. Then we won't need to worry about multiple serializing processes running at once.

@rossnichols
Copy link
Owner

If you want to make that change anyway, then sure. Then we won't need to worry about multiple serializing processes running at once.

Well this PR won't be merged until that's supported, so the work has to be done by one of us :). It's a hard requirement that calling into the library to serialize/deserialize doesn't fail because some random other addon happens to be doing the same thing. Especially since it impacts synchronous users of the API as well.

@oratory
Copy link
Contributor Author

oratory commented Apr 15, 2021

My original plan was to cheat and just add a isWorking boolean to the state and make new requests wait, but this method seems better. :)

@rossnichols
Copy link
Owner

My original plan was to cheat and just add a isWorking boolean to the state and make new requests wait, but this method seems better. :)

Your way wouldn't work either, since as I said even synchronous API requests are impacted if there's an ongoing async request. There's no way to block a synchronous request unless you decide to immediately/synchronously finish the work for the async one.

I'll take a look at updating state tracking over the weekend.

@rossnichols
Copy link
Owner

@oratory I've just merged the update to encapsulate state - see #5. Please rebase your branch before your next round of changes.

@rossnichols
Copy link
Owner

I wonder if the yield time should be expressed in microseconds, rather than number of items?

@oratory
Copy link
Contributor Author

oratory commented Apr 16, 2021

I was thinking that - then on the WoW side it could be set by the current FPS.

@rossnichols
Copy link
Owner

I was thinking that - then on the WoW side it could be set by the current FPS.

I don't think the library itself would do that, but rather you'd include an end-to-end example in the readme that covers the WoW-specific integration (using a frame's OnUpdate handler, calculating the yield time based on current FPS, etc.

@rossnichols
Copy link
Owner

@oratory any update?

@oratory
Copy link
Contributor Author

oratory commented Apr 22, 2021

Ya I've got some updates in the works, just a bit swamped lately. Will update this week.

@oratory
Copy link
Contributor Author

oratory commented Apr 26, 2021

I changed things around a bit. Instead of being a typical async function, I'm calling it Chunks Mode which returns a handler which gets called on repeat until processing is finished.

Some rough benchmarks

Seralize & Compress

Small table: 115 Bytes

Standard Chunks Mode
Serialize: 0.21 ms Serialize: 1.19 ms
Compress: 1.30 ms Compress: 10.59 ms
Total: 1.51 ms Total: 11.78 ms

Medium Table: 163584 Bytes

Standard Chunks Mode
Serialize: 10 ms Serialize: 12 ms
Compress: 579 ms Compress: 674 ms
Total: 589 ms Total: 686 ms

Large Table: 1750073 Bytes

Standard Chunks Mode
Serialize: 560 ms Serialize: 609 ms
Compress: 7578 ms Compress: 8323 ms
Total: 8128 ms Total: 8932 ms

X-Large Table: 3511073 Bytes

Standard Chunks Mode
Serialize: 1130 ms Serialize: 1223 ms
Compress: 16489 ms Compress: 16700 ms
Total: 17589 ms Total: 17923 ms

Decompress and Deserialize

Small table: 115 Bytes

Standard Chunks Mode
Decompress: 0.24 ms Decompress: 9 ms
Deserialize: 0.10 ms Deserialize: 11 ms
Total: 0.34 ms Total: 20 ms

Medium Table: 163584 Bytes

Standard Chunks Mode
Decompress: 192 ms Decompress: 264 ms
Deserialize: 16 ms Deserialize: 27 ms
Total: 208 ms Total: 291 ms

Large Table: 1750073 Bytes

Standard Chunks Mode
Decompress: 3094 ms Decompress: 3668 ms
Deserialize: 574 ms Deserialize: 772 ms
Total: 3668 ms Total: 4440 ms

X-Large Table: 3511073 Bytes

Standard Chunks Mode
Decompress: 6308 ms Decompress: 7465 ms
Deserialize: 1074 ms Deserialize: 1536 ms
Total: 7382 ms Total: 9001 ms

Chunks Mode does have a small increase in processing time but has the benefit (in WoW and likely other environments) of not locking up the game and making the OS think it's non-responsive when processing large data.

@rossnichols
Copy link
Owner

Please rebase on top of latest master commit instead of merging, so that the PR doesn't have my recent changes in it.

@oratory
Copy link
Contributor Author

oratory commented Apr 26, 2021

Alrighty I think that did it.

Copy link
Owner

@rossnichols rossnichols left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's still a lot to be done before I can approve this. If you'd like me to work on this feature instead of iterating on this PR, let me know - I could probably spend some time on it in the next week or so.

LibSerialize.lua Outdated
@@ -479,7 +558,7 @@ local function CreateWriter()
bufferSize = 0
return flushed
end

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this extra space (and reconfigure your editor to trim trailing spaces on save imo!)

LibSerialize.lua Outdated
@@ -633,6 +712,7 @@ end
local LibSerializeInt = {}

local function CreateSerializer(opts)
opts = opts or {}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary - remove.

LibSerialize.lua Outdated
@@ -700,6 +790,15 @@ end
function LibSerializeInt:_ReadObject()
local value = self:_ReadByte()

self.currentSize = (self.currentSize or 0) + value/8
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to _currentSize, initialize in CreateDeserializer().

LibSerialize.lua Outdated
@@ -700,6 +790,15 @@ end
function LibSerializeInt:_ReadObject()
local value = self:_ReadByte()

self.currentSize = (self.currentSize or 0) + value/8
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value has no relationship with the size.

LibSerialize.lua Outdated
if self._opts.chunksMode and (
(self.currentTime and self._opts.timeFn() - self.currentTime > self._opts.yieldOnChunkTime)
or self.currentSize > self._opts.yieldOnChunkSize) then
if self._opts.timeFn then self.currentTime = self._opts.timeFn() end
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

timeFn isn't included in the default options table (even as a nil entry, just for code documentation)

LibSerialize.lua Outdated
@@ -182,6 +209,24 @@ The following serialization options are supported:
table encountered during serialization. The function must return true for
the pair to be serialized. It may be called multiple times on a table for
the same key/value pair. See notes on reeentrancy and table modification.
* `chunksMode`: `boolean` (default false)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to async

LibSerialize.lua Outdated
* `chunksMode`: `boolean` (default false)
* `true`: the serialize function will return a coroutine handler to process
the serialization. If true, the following additional options are considered:
* `yieldOnChunkSize`: `number` How large to allow the buffer before yielding
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to yieldOnObjectCount

LibSerialize.lua Outdated
* `true`: the serialize function will return a coroutine handler to process
the serialization. If true, the following additional options are considered:
* `yieldOnChunkSize`: `number` How large to allow the buffer before yielding
* `yieldOnChunkTime`: `number` Max duration between yields
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to yieldOnElapsedTime

LibSerialize.lua Outdated
* `yieldOnChunkTime`: `number` Max duration between yields
* `timeFn`: `function` To return the time in `number` format relevant to the
environment.
* `false`: the serialize function will return a the serialized string directly.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo

@@ -145,6 +145,30 @@ function LibSerialize:RunTests()
end


--[[---------------------------------------------------------------------------
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing is still insufficient.

@oratory
Copy link
Contributor Author

oratory commented Apr 30, 2021

Tests 1 (nil) and 35 (0/0) are currently failing on async, not sure why yet. Otherwise the other requested updates are completed.

LibSerialize.lua Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
tests.lua Show resolved Hide resolved
@rossnichols
Copy link
Owner

@oratory any update?

@oratory
Copy link
Contributor Author

oratory commented May 16, 2021

Ya I've been working on it here and there, should have something to post tomorrow.

LibSerialize.lua Outdated
* `timeFn`: `function` To return the time in `number` format relevant to the
environment.
* `false`: the serialize function will return the serialized string directly.
When using `SerializeAsyncEx()`, these additional options are supported:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per my comments in the LibDeflate PR, I think these can be collapsed into a single option, where you supply either:

yieldCheckFn: function(objectCount: number) => boolean - a yield will occur if this function returns true, or
yieldOnObjectCount: number - a yield will occur after this number of objects

And then if neither are provided, yield after objects. With this approach, the library code just calls yieldCheckFn and does bookkeeping on the count of objects since yield, and the caller (or default args) handles the specifics. A time-based yield is then up to the caller to implement, which is easy (they just need to reset their own "elapsed" when returning true).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The yieldCheckFn works well, makes yieldOnObjectCount redundant even since it's now all self contained.

LibSerialize.lua Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
LibSerialize.lua Outdated Show resolved Hide resolved
tests.lua Show resolved Hide resolved
LibSerialize.lua Show resolved Hide resolved
LibSerialize.lua Show resolved Hide resolved
LibSerialize.lua Show resolved Hide resolved
LibSerialize.lua Show resolved Hide resolved
LibSerialize.lua Show resolved Hide resolved
LibSerialize.lua Show resolved Hide resolved
LibSerialize.lua Show resolved Hide resolved
tests.lua Show resolved Hide resolved
@rossnichols
Copy link
Owner

Can you please retarget your PR to be into the new async-staging branch I just created?

@Stanzilla
Copy link

@rossnichols hey we are revisiting this and were wondering if you can take a look at this again?

@rossnichols
Copy link
Owner

@Stanzilla sorry for the delay here. I've stopped playing WoW and would be happy to let others from the community help make changes here. Would you like to be added?

@Stanzilla
Copy link

@Stanzilla sorry for the delay here. I've stopped playing WoW and would be happy to let others from the community help make changes here. Would you like to be added?

Hey there, I wouldn't say "want" but I think it might be a good idea. We were considering a fork for Wago related content but I really don't want more on my/our plate so rather avoid that. So far Async is all we need so If we can get this PR into a good shape and merge (Can do that If you don't have time) that would be great.

cc @oratory @Nnoggie

@rossnichols
Copy link
Owner

Sounds good - if anyone from the community wants to be added to the repo I'm happy to do so. For this PR specifically, IIRC I was mostly wanting to ensure adequate test coverage and validating that existing tests are unaffected. I'd have to spend some time going back through it to refresh my memory on the actual code changes. It'd be great if you can take a pass and try to make it ready, I'll spend some time looking at it as well.

@Nnoggie
Copy link

Nnoggie commented May 1, 2024

Hello! I tested this PR against our use cases at Wago.
We aim to serialize (and compress) lots of big tables at once, hence the need for async versions of all the libraries that do that.
In the specific case for LibSerialize we want to export a lot of WeakAuras to export strings.

When accounting for the above mentioned issue it looks to be working nicely for our purposes and has similar performance compared to a seperate implementation that i temporarily used.
image

As Stan mentioned we are intested in this PR getting merged. I am more than happy to assist with the tests or other work.
Let me know what you think!

@rossnichols
Copy link
Owner

Alright well since this is targeting my async-staging branch, I'll go ahead and merge this and then take a look at merging into master. Thanks!

@rossnichols rossnichols merged commit b871125 into rossnichols:async-staging May 11, 2024
rossnichols pushed a commit that referenced this pull request May 16, 2024
* Add chunks mode for coroutine processing

* Requested updates

* Split sync/async functions, others

* Further updates
rossnichols pushed a commit that referenced this pull request May 16, 2024
* Add chunks mode for coroutine processing

* Requested updates

* Split sync/async functions, others

* Further updates
rossnichols added a commit that referenced this pull request May 20, 2024
* Add async serialize option via coroutines (#4)

* Add chunks mode for coroutine processing

* Requested updates

* Split sync/async functions, others

* Further updates

* Implement support for custom readers and writers (#12)

* Implement generic support for custom readers

This permits a new options table to be supplied to the Deserialize
and DeserializeValue functions on the library which may contain two
keys that control how the library reads bytes from the stream for
deserialization, and how it detects the end-of-stream state.

As-is this implementation is fully backwards compatible and allows
users of the library to throw processing into a coroutine for large
inputs if required, meeting some use cases that have prompted similar
PRs whilst also being more flexible and much simpler.

The new signatures for Deserialize and DeserializeValue are as follows:

  function LibSerialize:DeserializeValue(input[, options])
  function LibSerialize:Deserialize(input[, options])

The options parameter, if supplied, is expected to be a table that
may contain the following optional keys:

  - readBytes
  - atEnd

The "readBytes" key if present is expected to point to a function
of the signature "readBytes(input, i, j)" and must return a string
whose length is equal to (j - i) + 1). The "input" parameter is the
same as-supplied to Deserialize/DeserializeValue verbatim. The "i"
and "j" parameter represent an inclusive range of indices ([i, j])
typically as byte offsets within the input value. This convention is
identical to that provided by many Lua functions, such as string.sub,
table.concat, and unpack.

The "atEnd" key if present is expected to point to a function of the
signature "atEnd(input, i)". As with "readBytes", the "input" parameter
is whatever value was supplied to Deserialize/DeserializeValue, and the
"i" parameter represents the offset of the stream.

By default if neither of these keys are supplied, or no options table
is given, then the implementation will just default to using string.sub
for "readBytes" and for "atEnd" will test if the offset exceeds the
length of the input as-provided by the length operator.

Examples are provided in the test cases, where we test the following
scenarios:

  - A case where "input" may have additional data after the payload that
    we don't want deserializing.
  - A case where the "input" is a table of character bytes.
  - A case where we yield from within "readBytes" and feed data from the
    main thread, simulating receiving data with a potentially unknown
    bound.

* Implement support for custom writers

This mirrors recent changes submitted for the deserialization API and
provides a generic system for custom writer support to control how and
where the byte stream generated during serialization is written. The
change as-is is fully backwards compatible and passes the existing and
expanded test cases.

Primary use cases would align with existing async serialization attempts
but don't put the complexity of thread management or throttling logic
into the library, instead leaving the minutia of how those aspects are
handled to callers. It also enables some other advanced use cases such
as streaming the data to a destination without buffering it fully
beforehand, which could enable efficiencies elsewhere.

* Convert generic reader/writer to use objects instead of closures

* Hoist async into reader/writer layer instead of serialize/deserialize

* Rename function to GenericReader_AtEnd

* Expand tests, add example for reader/writer protocol

* Update readme

* Fix formatting

* Globals cleanup

---------

Co-authored-by: Mark Bosley <mark.bosley@gmail.com>
Co-authored-by: Daniel Yates <me@meorawr.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants