-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve stack usage for flood filling #100
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lib/identify.c
Outdated
if (depth >= FLOOD_FILL_MAX_DEPTH) | ||
return; | ||
/* Set up the first context */ | ||
next_vars = &stack[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpicking:
next_vars = &stack[0]; | |
next_vars = stack; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
lib/quirc.c
Outdated
*/ | ||
|
||
num_vars = h * 2 / 3; | ||
vars = malloc(sizeof(*vars) * num_vars); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason to use malloc instead of calloc here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just because it doesn't need to be initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uninitialized is fine, I meant for the overflow check safety net.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i added a few checks. is this what you meant?
Apologies for the delay in getting to this. I think you might end up with a better result by implementing the span-based flood-fill directly with a simpler stack structure (you probably only need to keep track of fixed y, and incrementing x for each span) and a loop that examines the top of the stack looking for new seeds to push. I understand that this is a direct translation of the recursive procedure, and I think that's probably led to a lot of unnecessary book-keeping. |
i completely agree. |
@dlbeer |
On Mon, May 10, 2021 at 06:34:53PM -0700, YAMAMOTO Takashi wrote:
> > Apologies for the delay in getting to this. I think you might end up with a better result by implementing the span-based flood-fill directly with a simpler stack structure (you probably only need to keep track of fixed y, and incrementing x for each span) and a loop that examines the top of the stack looking for new seeds to push. I understand that this is a direct translation of the recursive procedure, and I think that's probably led to a lot of unnecessary book-keeping.
>
> i completely agree.
> i don't want to do it in this PR though.
@dlbeer
do you want me to do it within this PR?
generally, i don't think it's a good idea to do too many things in a single PR.
but i can do that if it's considered as a blocker of this PR.
Well, unless you wanted to create a new one. I like what you're trying
to do here, but I think the mechanical translation from a recursive
algorithm is a bit too hard to follow and modify.
…--
Daniel Beer ***@***.***> http://dlbeer.co.nz/
PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B
|
|
* Make flood filling logic iterative (vs recursive) I basically tried one-to-one conversions here to avoid mistakes. probably it has a room for later optimizations. * Use explicit malloc (vs variables on stack) to allocate the work area. * Estimate the amount of memory for the work area dynamically from the image size, instead of using a constant FLOOD_FILL_MAX_DEPTH, which is too big in the most cases.
Also, avoid malloc(0), which is not too portable.
btw,
do you mean to have a single "y" var? |
On Mon, May 10, 2021 at 07:09:07PM -0700, YAMAMOTO Takashi wrote:
btw,
> you probably only need to keep track of fixed y, and incrementing x for each span
do you mean to have a single "y" var?
i don't think it works for complex shapes.
No, I meant that each span would need a y value and a next x-value (in
other words, each stack level).
…--
Daniel Beer ***@***.***> http://dlbeer.co.nz/
PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B
|
On Mon, May 10, 2021 at 07:04:01PM -0700, YAMAMOTO Takashi wrote:
> On Mon, May 10, 2021 at 06:34:53PM -0700, YAMAMOTO Takashi wrote: > > Apologies for the delay in getting to this. I think you might end up with a better result by implementing the span-based flood-fill directly with a simpler stack structure (you probably only need to keep track of fixed y, and incrementing x for each span) and a loop that examines the top of the stack looking for new seeds to push. I understand that this is a direct translation of the recursive procedure, and I think that's probably led to a lot of unnecessary book-keeping. > > i completely agree. > i don't want to do it in this PR though. @dlbeer do you want me to do it within this PR? generally, i don't think it's a good idea to do too many things in a single PR. but i can do that if it's considered as a blocker of this PR.
> Well, unless you wanted to create a new one. I like what you're trying to do here, but I think the mechanical translation from a recursive algorithm is a bit too hard to follow and modify.
* i want to work on it. (just expressing honest intention. no promise)
* i want to do it in a separate PR.
* if this PR will not likely be merged anytime soon, i can add it as separate commits to this PR.
Are you talking about doing a separate PR with just a direct replacement
of the existing code with a simpler iterative implementation? If that's
what you mean, then I'd definitely be happy to merge such a PR.
…--
Daniel Beer ***@***.***> http://dlbeer.co.nz/
PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B
|
i meant a separate PR on the top of this one.
|
On Mon, May 10, 2021 at 07:59:21PM -0700, YAMAMOTO Takashi wrote:
> On Mon, May 10, 2021 at 07:04:01PM -0700, YAMAMOTO Takashi wrote: > On Mon, May 10, 2021 at 06:34:53PM -0700, YAMAMOTO Takashi wrote: > > Apologies for the delay in getting to this. I think you might end up with a better result by implementing the span-based flood-fill directly with a simpler stack structure (you probably only need to keep track of fixed y, and incrementing x for each span) and a loop that examines the top of the stack looking for new seeds to push. I understand that this is a direct translation of the recursive procedure, and I think that's probably led to a lot of unnecessary book-keeping. > > i completely agree. > i don't want to do it in this PR though. @dlbeer do you want me to do it within this PR? generally, i don't think it's a good idea to do too many things in a single PR. but i can do that if it's considered as a blocker of this PR. > Well, unless you wanted to create a new one. I like what you're trying to do here, but I think the mechanical translation from a recursive algorithm is a bit too hard to follow and modify. * i want to work on it. (just expressing honest intention. no promise) * i want to do it in a separate PR. * if this PR will not likely be merged anytime soon, i can add it as separate commits to this PR.
> Are you talking about doing a separate PR with just a direct replacement of the existing code with a simpler iterative implementation? If that's what you mean, then I'd definitely be happy to merge such a PR.
i meant a separate PR on the top of this one.
Ah ok, yes that's fine -- once the second PR is ready I can merge both.
…--
Daniel Beer ***@***.***> http://dlbeer.co.nz/
PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B
|
i'm not sure if i understand.
|
well, if you don't want to merge this PR until both are ready,
|
This reduces the memory usage a bit.
On Mon, May 10, 2021 at 08:29:51PM -0700, YAMAMOTO Takashi wrote:
> On Mon, May 10, 2021 at 07:59:21PM -0700, YAMAMOTO Takashi wrote: > On Mon, May 10, 2021 at 07:04:01PM -0700, YAMAMOTO Takashi wrote: > On Mon, May 10, 2021 at 06:34:53PM -0700, YAMAMOTO Takashi wrote: > > Apologies for the delay in getting to this. I think you might end up with a better result by implementing the span-based flood-fill directly with a simpler stack structure (you probably only need to keep track of fixed y, and incrementing x for each span) and a loop that examines the top of the stack looking for new seeds to push. I understand that this is a direct translation of the recursive procedure, and I think that's probably led to a lot of unnecessary book-keeping. > > i completely agree. > i don't want to do it in this PR though. @dlbeer do you want me to do it within this PR? generally, i don't think it's a good idea to do too many things in a single PR. but i can do that if it's considered as a blocker of this PR. > Well, unless you wanted to create a new one. I like what you're trying to do here, but I think the mechanical translation from a recursive algorithm is a bit too hard to follow and modify. * i want to work on it. (just expressing honest intention. no promise) * i want to do it in a separate PR. * if this PR will not likely be merged anytime soon, i can add it as separate commits to this PR. > Are you talking about doing a separate PR with just a direct replacement of the existing code with a simpler iterative implementation? If that's what you mean, then I'd definitely be happy to merge such a PR. i meant a separate PR on the top of this one.
> Ah ok, yes that's fine -- once the second PR is ready I can merge both.
well, if you don't want to merge this PR until both are ready,
i will just add commits to this PR.
Ok, sounds good to me.
…--
Daniel Beer ***@***.***> http://dlbeer.co.nz/
PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B
|
On Mon, May 10, 2021 at 08:28:28PM -0700, YAMAMOTO Takashi wrote:
> On Mon, May 10, 2021 at 07:09:07PM -0700, YAMAMOTO Takashi wrote: btw, > you probably only need to keep track of fixed y, and incrementing x for each span do you mean to have a single "y" var? i don't think it works for complex shapes.
> No, I meant that each span would need a y value and a next x-value (in other words, each stack level).
i'm not sure if i understand.
as far as we keep the current processing order, i think we need to keep some context
(left, right, i, which y-direction we were looking at) to restore the context.
maybe we can unify x and left though.
Yeah, you're right -- I think the bare minimum you can get away with is
3 pieces of state: y value, current/leftmost x, and rightmost x.
Rough outline of a depth-first span-based flood-fill then is:
* Examine top of stack
- if top.x == top.right, pop frame and try again
- otherwise:
- seed (top.x, top.y - 1)
- seed (top.x, top.y + 1)
The procedure seed (x, y) is:
* If (x, y) is already filled, do nothing
* Otherwise:
- x = left-most unfilled pixel in this span
- right = right-most unfilled pixel in this span
- push a new record { y, x, right }
Haven't thought that through very carefully, so beware of bugs. The main
thing though is to make sure the whole span gets filled before seeding
above/below, so you avoid looping over the same pixels.
…--
Daniel Beer ***@***.***> http://dlbeer.co.nz/
PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B
|
This restructure allows us to unify "call" and "return_from_call".
This is also a preparation to eliminate the uses of goto.
Just for readability.
thank you for the comment. |
except that my implementation uses separate "counters" for above/below seeding, i think it's basically the same as what you explained. |
Ok, looks good -- thanks! |
Make flood filling logic iterative (vs recursive)
I basically tried one-to-one conversions here to avoid mistakes.
probably it has a room for later optimizations.
Use explicit malloc (vs variables on stack) to allocate the work area.
Estimate the amount of memory for the work area dynamically
from the image size, instead of using a constant FLOOD_FILL_MAX_DEPTH,
which is too big in the most cases.