Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve stack usage for flood filling #100

Merged
merged 13 commits into from
May 11, 2021
Merged

Conversation

yamt
Copy link
Contributor

@yamt yamt commented Apr 5, 2021

  • Make flood filling logic iterative (vs recursive)
    I basically tried one-to-one conversions here to avoid mistakes.
    probably it has a room for later optimizations.

  • Use explicit malloc (vs variables on stack) to allocate the work area.

  • Estimate the amount of memory for the work area dynamically
    from the image size, instead of using a constant FLOOD_FILL_MAX_DEPTH,
    which is too big in the most cases.

Copy link
Collaborator

@kaworu kaworu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @yamt and thanks for the PR. Honestly I'm not smart enough to understand the goto flow into various loops, so I'll defer to @dlbeer here.

lib/identify.c Outdated
if (depth >= FLOOD_FILL_MAX_DEPTH)
return;
/* Set up the first context */
next_vars = &stack[0];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpicking:

Suggested change
next_vars = &stack[0];
next_vars = stack;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

lib/quirc.c Outdated
*/

num_vars = h * 2 / 3;
vars = malloc(sizeof(*vars) * num_vars);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to use malloc instead of calloc here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just because it doesn't need to be initialized.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uninitialized is fine, I meant for the overflow check safety net.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i added a few checks. is this what you meant?

@dlbeer
Copy link
Owner

dlbeer commented Apr 18, 2021

Apologies for the delay in getting to this. I think you might end up with a better result by implementing the span-based flood-fill directly with a simpler stack structure (you probably only need to keep track of fixed y, and incrementing x for each span) and a loop that examines the top of the stack looking for new seeds to push. I understand that this is a direct translation of the recursive procedure, and I think that's probably led to a lot of unnecessary book-keeping.

@yamt
Copy link
Contributor Author

yamt commented Apr 18, 2021

Apologies for the delay in getting to this. I think you might end up with a better result by implementing the span-based flood-fill directly with a simpler stack structure (you probably only need to keep track of fixed y, and incrementing x for each span) and a loop that examines the top of the stack looking for new seeds to push. I understand that this is a direct translation of the recursive procedure, and I think that's probably led to a lot of unnecessary book-keeping.

i completely agree.
i don't want to do it in this PR though.

@yamt
Copy link
Contributor Author

yamt commented May 11, 2021

Apologies for the delay in getting to this. I think you might end up with a better result by implementing the span-based flood-fill directly with a simpler stack structure (you probably only need to keep track of fixed y, and incrementing x for each span) and a loop that examines the top of the stack looking for new seeds to push. I understand that this is a direct translation of the recursive procedure, and I think that's probably led to a lot of unnecessary book-keeping.

i completely agree.
i don't want to do it in this PR though.

@dlbeer
do you want me to do it within this PR?
generally, i don't think it's a good idea to do too many things in a single PR.
but i can do that if it's considered as a blocker of this PR.

@dlbeer
Copy link
Owner

dlbeer commented May 11, 2021 via email

@yamt
Copy link
Contributor Author

yamt commented May 11, 2021

On Mon, May 10, 2021 at 06:34:53PM -0700, YAMAMOTO Takashi wrote: > > Apologies for the delay in getting to this. I think you might end up with a better result by implementing the span-based flood-fill directly with a simpler stack structure (you probably only need to keep track of fixed y, and incrementing x for each span) and a loop that examines the top of the stack looking for new seeds to push. I understand that this is a direct translation of the recursive procedure, and I think that's probably led to a lot of unnecessary book-keeping. > > i completely agree. > i don't want to do it in this PR though. @dlbeer do you want me to do it within this PR? generally, i don't think it's a good idea to do too many things in a single PR. but i can do that if it's considered as a blocker of this PR.
Well, unless you wanted to create a new one. I like what you're trying to do here, but I think the mechanical translation from a recursive algorithm is a bit too hard to follow and modify.

  • i want to work on it. (just expressing honest intention. no promise)
  • i want to do it in a separate PR.
  • if this PR will not likely be merged anytime soon, i can add it as separate commits to this PR.

yamt added 2 commits May 11, 2021 11:05
* Make flood filling logic iterative (vs recursive)
  I basically tried one-to-one conversions here to avoid mistakes.
  probably it has a room for later optimizations.

* Use explicit malloc (vs variables on stack) to allocate the work area.

* Estimate the amount of memory for the work area dynamically
  from the image size, instead of using a constant FLOOD_FILL_MAX_DEPTH,
  which is too big in the most cases.
Also, avoid malloc(0), which is not too portable.
@yamt
Copy link
Contributor Author

yamt commented May 11, 2021

btw,

you probably only need to keep track of fixed y, and incrementing x for each span

do you mean to have a single "y" var?
i don't think it works for complex shapes.

@dlbeer
Copy link
Owner

dlbeer commented May 11, 2021 via email

@dlbeer
Copy link
Owner

dlbeer commented May 11, 2021 via email

@yamt
Copy link
Contributor Author

yamt commented May 11, 2021

On Mon, May 10, 2021 at 07:04:01PM -0700, YAMAMOTO Takashi wrote: > On Mon, May 10, 2021 at 06:34:53PM -0700, YAMAMOTO Takashi wrote: > > Apologies for the delay in getting to this. I think you might end up with a better result by implementing the span-based flood-fill directly with a simpler stack structure (you probably only need to keep track of fixed y, and incrementing x for each span) and a loop that examines the top of the stack looking for new seeds to push. I understand that this is a direct translation of the recursive procedure, and I think that's probably led to a lot of unnecessary book-keeping. > > i completely agree. > i don't want to do it in this PR though. @dlbeer do you want me to do it within this PR? generally, i don't think it's a good idea to do too many things in a single PR. but i can do that if it's considered as a blocker of this PR. > Well, unless you wanted to create a new one. I like what you're trying to do here, but I think the mechanical translation from a recursive algorithm is a bit too hard to follow and modify. * i want to work on it. (just expressing honest intention. no promise) * i want to do it in a separate PR. * if this PR will not likely be merged anytime soon, i can add it as separate commits to this PR.
Are you talking about doing a separate PR with just a direct replacement of the existing code with a simpler iterative implementation? If that's what you mean, then I'd definitely be happy to merge such a PR.

i meant a separate PR on the top of this one.


-- Daniel Beer @.***> http://dlbeer.co.nz/ PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B

@dlbeer
Copy link
Owner

dlbeer commented May 11, 2021 via email

@yamt
Copy link
Contributor Author

yamt commented May 11, 2021

On Mon, May 10, 2021 at 07:09:07PM -0700, YAMAMOTO Takashi wrote: btw, > you probably only need to keep track of fixed y, and incrementing x for each span do you mean to have a single "y" var? i don't think it works for complex shapes.
No, I meant that each span would need a y value and a next x-value (in other words, each stack level).

i'm not sure if i understand.
as far as we keep the current processing order, i think we need to keep some context
(left, right, i, which y-direction we were looking at) to restore the context.
maybe we can unify x and left though.


-- Daniel Beer @.***> http://dlbeer.co.nz/ PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B

@yamt
Copy link
Contributor Author

yamt commented May 11, 2021

On Mon, May 10, 2021 at 07:59:21PM -0700, YAMAMOTO Takashi wrote: > On Mon, May 10, 2021 at 07:04:01PM -0700, YAMAMOTO Takashi wrote: > On Mon, May 10, 2021 at 06:34:53PM -0700, YAMAMOTO Takashi wrote: > > Apologies for the delay in getting to this. I think you might end up with a better result by implementing the span-based flood-fill directly with a simpler stack structure (you probably only need to keep track of fixed y, and incrementing x for each span) and a loop that examines the top of the stack looking for new seeds to push. I understand that this is a direct translation of the recursive procedure, and I think that's probably led to a lot of unnecessary book-keeping. > > i completely agree. > i don't want to do it in this PR though. @dlbeer do you want me to do it within this PR? generally, i don't think it's a good idea to do too many things in a single PR. but i can do that if it's considered as a blocker of this PR. > Well, unless you wanted to create a new one. I like what you're trying to do here, but I think the mechanical translation from a recursive algorithm is a bit too hard to follow and modify. * i want to work on it. (just expressing honest intention. no promise) * i want to do it in a separate PR. * if this PR will not likely be merged anytime soon, i can add it as separate commits to this PR. > Are you talking about doing a separate PR with just a direct replacement of the existing code with a simpler iterative implementation? If that's what you mean, then I'd definitely be happy to merge such a PR. i meant a separate PR on the top of this one.
Ah ok, yes that's fine -- once the second PR is ready I can merge both.

well, if you don't want to merge this PR until both are ready,
i will just add commits to this PR.


-- Daniel Beer @.***> http://dlbeer.co.nz/ PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B

@dlbeer
Copy link
Owner

dlbeer commented May 11, 2021 via email

@dlbeer
Copy link
Owner

dlbeer commented May 11, 2021 via email

@yamt
Copy link
Contributor Author

yamt commented May 11, 2021

On Mon, May 10, 2021 at 08:28:28PM -0700, YAMAMOTO Takashi wrote: > On Mon, May 10, 2021 at 07:09:07PM -0700, YAMAMOTO Takashi wrote: btw, > you probably only need to keep track of fixed y, and incrementing x for each span do you mean to have a single "y" var? i don't think it works for complex shapes. > No, I meant that each span would need a y value and a next x-value (in other words, each stack level). i'm not sure if i understand. as far as we keep the current processing order, i think we need to keep some context (left, right, i, which y-direction we were looking at) to restore the context. maybe we can unify x and left though.
Yeah, you're right -- I think the bare minimum you can get away with is 3 pieces of state: y value, current/leftmost x, and rightmost x. Rough outline of a depth-first span-based flood-fill then is: * Examine top of stack - if top.x == top.right, pop frame and try again - otherwise: - seed (top.x, top.y - 1) - seed (top.x, top.y + 1) The procedure seed (x, y) is: * If (x, y) is already filled, do nothing * Otherwise: - x = left-most unfilled pixel in this span - right = right-most unfilled pixel in this span - push a new record { y, x, right } Haven't thought that through very carefully, so beware of bugs. The main thing though is to make sure the whole span gets filled before seeding above/below, so you avoid looping over the same pixels.

-- Daniel Beer @.***> http://dlbeer.co.nz/ PGP: BA6E 0B26 1F89 246C E3F3 C910 1E58 C43A 160A 553B

thank you for the comment.
unfortunately i've just implemented what i was thinking before i noticed your comment. :-)
i pushed what i implemented to this PR.

@yamt
Copy link
Contributor Author

yamt commented May 11, 2021

except that my implementation uses separate "counters" for above/below seeding, i think it's basically the same as what you explained.

@dlbeer dlbeer merged commit 5adb759 into dlbeer:master May 11, 2021
@dlbeer
Copy link
Owner

dlbeer commented May 11, 2021

Ok, looks good -- thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants