Add Single Fold to Paper #21

Closed
jboarman opened this issue Jul 8, 2021 · 19 comments
Labels
enhancement New feature or request

Comments

@jboarman
Sponsor Member

jboarman commented Jul 8, 2021

A separate feature request has been made that damages the paper with a crumpled / wrinkled effect (see #17). This request, however, starts with a simpler problem: a single fold.

The effect is characterized by the following:

  • a small amount of affine warp at the fold
  • bands of darkness and brightness aligned with the dip and rise of the fold's mesh

The complexity here is creating a nice fold (either inward or outward) with any kind of ridge definition (most obvious in the last image in the set of examples). I propose that the scope of this first issue relating to 3D transformations be kept simple: rather than attempting to create hard-creased edges, focus on smoother folds like those in the first couple of example images.

image
image
image
image

@jboarman jboarman added the enhancement New feature or request label Jul 8, 2021
@shaheryar1
Contributor

This feature was also in my mind and I would like to work on it in my next PR.

@jboarman
Sponsor Member Author

jboarman commented Jul 11, 2021

@shaheryar1 Have you had a chance to work out some ideas for generating random 2D maps that simulate these 3D effects? If so, can you share a code branch where you are working on this so that others can provide feedback and ideas?

DocCreator's 3D transforms use a library of 3D meshes that warp the document and apply a ray casting type shader. However, that method is substantially more complicated than what is needed to emulate the intended effects. If we can use simpler transforms, then we can maintain higher performance and more maintainable code.

For example, we can start with just the affine warps and worry about the variable brightness as a secondary concern. For the warp, we have to start with just the question of how we create the 2D transformation map (without actually adding a 3rd dimension for the depth/altitude of the positional shift, which would be used for relative darkness/brightness based on direction of light source).

As parameters to generating the 2D mesh, I figure we might randomize things like the gradient (angle of ascent/descent), intensity (max peak/valley), angle of fold's central line relative to page -- there are probably better params and names than these.

But, I'm not sure how one would create 2D meshes using math functions or these params. Do you or others have some ideas?
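
One possible starting point, offered only as a sketch: treat the fold as a tent-shaped profile across the columns and tile it down the rows to get a 2D map of vertical shifts. The function name and the parameters `fold_x`, `half_width`, and `peak` are hypothetical stand-ins for the "gradient" and "intensity" ideas above, and the sketch assumes a vertical fold line.

```python
import numpy as np

def fold_displacement_map(height, width, fold_x, half_width, peak):
    """Return a (height, width) array of vertical shifts, in pixels.

    The shift equals `peak` on the fold line (column `fold_x`) and falls
    off linearly to 0 at `half_width` columns on either side.
    All names here are hypothetical, chosen for illustration only.
    """
    cols = np.arange(width)
    # Distance of each column from the fold line, as a fraction of half_width.
    rel = np.abs(cols - fold_x) / float(half_width)
    # Tent profile: 1 at the fold line, 0 at and beyond half_width.
    profile = peak * np.clip(1.0 - rel, 0.0, 1.0)
    # A vertical fold line means every row shares the same profile.
    return np.tile(profile, (height, 1))

dy = fold_displacement_map(height=4, width=9, fold_x=4, half_width=2, peak=6.0)
print(dy[0])
```

A smaller `half_width` gives a steeper ascent and descent, and a larger `peak` gives a stronger fold. Rotating the map would be one way to handle the fold angle, but that is not shown here.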

Example fold warp by

@jboarman
Sponsor Member Author

In addition to the scikit-image affine warp implementation mentioned in the issue, we might borrow ideas from the Albumentations implementation of the elastic warp transform.

@kwcckw
Collaborator

kwcckw commented Jul 12, 2021

@shaheryar1 are you still working on this? Otherwise I will look further into how to implement it. I think it would be better to avoid those 3D meshes, since they would make this much more complicated. An affine warp should be able to do something like this, but it will definitely need more work if we want to make it look natural.

@jboarman
Sponsor Member Author

The affine warp requires a 2D "mesh" of sorts. Basically, the mesh represents the "before" and "after" positions of key points fed into the warp processing.
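
To make the "before" and "after" idea concrete, here is a minimal sketch that builds a regular grid of key points and a shifted copy of it. The function and parameter names are hypothetical, and the tent-shaped shift is just one assumed fold shape. Pairs like these are the kind of input a piecewise-affine warp such as scikit-image's `PiecewiseAffineTransform` estimates from, though that step is left out so the sketch depends only on NumPy.

```python
import numpy as np

def fold_control_points(height, width, fold_x, half_width, peak, step=4):
    """Return (src, dst): two (N, 2) arrays of (x, y) key points.

    `src` is a regular grid ("before"); `dst` is the same grid with each
    point moved down by a tent-shaped amount around column `fold_x` ("after").
    Names and the tent shape are illustrative assumptions.
    """
    xs = np.arange(0, width, step)
    ys = np.arange(0, height, step)
    grid_x, grid_y = np.meshgrid(xs, ys)
    src = np.stack([grid_x.ravel(), grid_y.ravel()], axis=1).astype(float)

    # Shift is `peak` on the fold line and 0 at half_width columns away.
    rel = np.abs(src[:, 0] - fold_x) / float(half_width)
    shift = peak * np.clip(1.0 - rel, 0.0, 1.0)

    dst = src.copy()
    dst[:, 1] += shift  # only the y coordinate moves
    return src, dst

src, dst = fold_control_points(height=8, width=12, fold_x=4, half_width=4, peak=2.0)
print(np.hstack([src, dst]))
```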

If we get to adding lighting or differing brightness/darkness levels, then a 3rd dimension of the mesh will be required to represent a "z index". But, we will want to get the warp functional before worrying with brightness or darkness.

@shaheryar1 is going to work on #13 first. So @kwcckw, feel free to work on an implementation of this feature and we can all collaborate on a solution based on the results of your initial experiments.

@shaheryar1
Contributor

I researched this and did not find much literature on it. For initial experiments, I thought of applying a 2-D affine transformation to a sub-part of the image, i.e. a straight vertical/horizontal strip 5-10 pixels wide. But I did not get a chance to try this approach. @kwcckw you can try this idea to kick-start this.

@kwcckw
Collaborator

kwcckw commented Jul 12, 2021

> I researched this and did not find much literature on it. For initial experiments, I thought of applying a 2-D affine transformation to a sub-part of the image, i.e. a straight vertical/horizontal strip 5-10 pixels wide. But I did not get a chance to try this approach. @kwcckw you can try this idea to kick-start this.

Thanks. Following the suggestion by @jboarman, I'm thinking of trying the grid distortion and elastic transform from here:
https://github.com/albumentations-team/albumentations

Possibly the combination of them as well.
image

@kwcckw
Collaborator

kwcckw commented Jul 12, 2021

This is the result from 2 affine transforms:

image

Looks like we need further lighting-related processing to make it look more natural, specifically in the folding area.

@jboarman
Sponsor Member Author

That’s a great first start. Can you share your code in some way? A Colab notebook would be ideal, but even a gist would be OK.

To get the lighting, we have to translate the implied mesh into a mask that we add to the warped image. So, for each position, we are either descending or ascending, and the value is therefore a positive or negative number. Adding that layer to the base, ensuring a pixel does not go below 0 or above 255, will then darken or lighten the image.

For a simple BW image like that example, we’ll only see darkening. Direction of lighting can be dealt with later.
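
A minimal sketch of that idea, assuming a grayscale `uint8` image and a 1-D height profile with one value per column. The helper names and the `strength` parameter are hypothetical, and which side of the fold gets brighter is an arbitrary choice here since light direction is deferred.

```python
import numpy as np

def shading_mask(height_profile, n_rows, strength=10.0):
    """Signed mask from the slope of a 1-D fold height profile.

    Ascending columns get positive values, descending columns negative.
    Names and the sign convention are illustrative assumptions.
    """
    slope = np.gradient(np.asarray(height_profile, dtype=float))
    row = np.rint(strength * slope).astype(np.int16)
    return np.tile(row, (n_rows, 1))

def apply_shading(image, mask):
    """Add the signed mask to a uint8 image, clamping the result to 0..255."""
    # Widen to int16 first so the sum can leave 0..255 before clamping.
    shaded = image.astype(np.int16) + mask
    return np.clip(shaded, 0, 255).astype(np.uint8)

profile = [0, 3, 6, 3, 0]  # rise to the fold line, then fall
mask = shading_mask(profile, n_rows=2)
gray = apply_shading(np.full((2, 5), 128, dtype=np.uint8), mask)
white = apply_shading(np.full((2, 5), 255, dtype=np.uint8), mask)
print(gray[0], white[0])
```

On the all-white input the brightening side is clamped at 255, so only the darkening band is visible, which matches the expectation for a simple black-and-white page.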

@kwcckw
Collaborator

kwcckw commented Jul 12, 2021

> That’s a great first start. Can you share your code in some way? A Colab notebook would be ideal, but even a gist would be OK.
>
> To get the lighting, we have to translate the implied mesh into a mask that we add to the warped image. So, for each position, we are either descending or ascending, and the value is therefore a positive or negative number. Adding that layer to the base, ensuring a pixel does not go below 0 or above 255, will then darken or lighten the image.
>
> For a simple BW image like that example, we’ll only see darkening. Direction of lighting can be dealt with later.

Okay, but I need to figure out the general function to shift the image after the transformation first. Google Colab should be good enough to test the code.

@kwcckw
Collaborator

kwcckw commented Jul 13, 2021

Here is the draft of the code in Google Colab:
https://drive.google.com/drive/folders/1t4UBurLbD-asR9NvERhoOVQSnkgKyrL0?usp=sharing

I added some noise, and we are able to fold at 2 locations too:

1 folding:
image

multiple foldings:
image

I think it looks much better now with the noise.

@jboarman
Sponsor Member Author

This is a great starting point. We should go ahead and proceed with creating a PR based on this implementation since it is viable as it is now. We should continue to track this as an open issue, or split out a V2 for a more advanced semi-3D implementation down the road, but this is great as it is now.

What parameters do you see that we should randomize for this version of the implementation?

Some ideas for options and potential default values:

  • fold_count = [0,1,2,3] - randomly selected # of folds on page
  • orientation = [90, 45, 0, (35, 55)] - angle of fold's central line relative to page; where tuple represents a fluid range of angle values
  • orientation_jitter = 10 - degrees of fold angle variation
  • fold_noise = 0.5 - intensity of noise from 0..1; should just embed some level of jitter in this as I don't think it's necessary to parameterize noise jitter
  • gradient_width = 0.01 - simplistic measure of the space affected by fold prior to being warped (in units of percentage of width of page)
  • gradient_height = 0.01 - simplistic measure of depth of fold (unit measured as percentage page width)

This sample sketch below provides a half-baked concept for a gradient intensity measure. This could very easily change in the future as we advance this feature, and we should not be afraid of breaking changes later with a feature that is so new. I'm pretty open to how we parameterize that fold intensity measure as there are clearly lots of ways to handle that.

gradient params
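
As a rough illustration of how the proposed options might be randomized, the sketch below draws a fold count, then an angle per fold, treating a tuple in `orientation` as a continuous range. The function name, the returned dictionary keys, and the random `x` position are hypothetical and not part of any existing API.

```python
import random

def sample_fold_params(rng,
                       fold_count=(0, 1, 2, 3),
                       orientation=(90, 45, 0, (35, 55)),
                       orientation_jitter=10):
    """Draw one random set of fold settings (hypothetical helper).

    Returns a list with one dict per fold: an `angle` in degrees and an
    `x` position expressed as a fraction of page width.
    """
    folds = []
    for _ in range(rng.choice(fold_count)):
        choice = rng.choice(orientation)
        if isinstance(choice, tuple):
            # A tuple means a fluid range of angle values.
            angle = rng.uniform(*choice)
        else:
            angle = float(choice)
        # Vary the chosen angle by up to +/- orientation_jitter degrees.
        angle += rng.uniform(-orientation_jitter, orientation_jitter)
        folds.append({"angle": angle, "x": rng.random()})
    return folds

rng = random.Random(0)  # seeded only so the sketch is repeatable
print(sample_fold_params(rng))
```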

@kwcckw
Collaborator

kwcckw commented Jul 14, 2021

> This is a great starting point. We should go ahead and proceed with creating a PR based on this implementation since it is viable as it is now. We should continue to track this as an open issue, or split out a V2 for a more advanced semi-3D implementation down the road, but this is great as it is now.

Okay, I will put it up as a draft and create a pull request later once we have finalized the parameters below.

> What parameters do you see that we should randomize for this version of the implementation?
>
> Some ideas for options and potential default values:
>
> * `fold_count = [0,1,2,3]` - randomly selected # of folds on page

Yes, I think choosing randomly would be better than letting the user specify the folding x location. We can also create a folding effect at the edges of the image, so that it looks like the page is curling up or down:
image

> * `orientation = [90, 45, 0, (35, 55)]` - angle of fold's central line relative to page; where tuple represents a fluid range of angle values
>
> * `orientation_jitter = 10` - degrees of fold angle variation

I need to think about this and test a bit first, since it might not be straightforward to create diagonal lines. One workaround is to rotate the image and fold it multiple times, for example:

image

But do note that right now there are artifacts (in the red circle) from the perspective transformation, and I haven't checked for a possible workaround yet.

> * `fold_noise = 0.5` - intensity of noise from 0..1; should just embed some level of jitter in this as I don't think it's necessary to parameterize noise jitter

Yeah, just noise should be sufficient. I think the level of noise should be controlled by the distance from the center of the fold, and right now the code does that: the noise is heaviest at the center of the fold.
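
One way to express "heaviest at the center" is to make the per-pixel noise probability fall off linearly with distance from the fold line. This is only a sketch under that assumption, not the Colab draft's code. The function name, `half_width`, and `seed` are hypothetical, and `fold_noise` is read as the peak probability in 0..1.

```python
import numpy as np

def fold_noise_mask(height, width, fold_x, half_width, fold_noise=0.5, seed=0):
    """Boolean mask of pixels to perturb, densest on the fold line.

    The probability of noise is `fold_noise` at column `fold_x` and drops
    linearly to 0 at `half_width` columns away. Illustrative names only.
    """
    rng = np.random.default_rng(seed)
    cols = np.arange(width)
    weight = np.clip(1.0 - np.abs(cols - fold_x) / float(half_width), 0.0, 1.0)
    prob = fold_noise * weight  # shape (width,), broadcast over rows below
    return rng.random((height, width)) < prob

noise = fold_noise_mask(height=50, width=20, fold_x=10, half_width=3, fold_noise=1.0)
print(noise.sum(axis=0))  # per-column counts, peaking at column 10
```

The mask could then be used to darken or jitter only the selected pixels.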

> * `gradient_width = 0.01` - simplistic measure of the space affected by fold prior to being warped (in units of percentage of width of page)
>
> * `gradient_height = 0.01` - simplistic measure of depth of fold (unit measured as percentage page width)
>
> This sample sketch below provides a half-baked concept for a gradient intensity measure. This could very easily change in the future as we advance this feature, and we should not be afraid of breaking changes later with a feature that is so new. I'm pretty open to how we parameterize that fold intensity measure as there are clearly lots of ways to handle that.

Yes, I think this should be feasible, where gradient width would be the x length of the fold and gradient height would be the distortion in depth. Right now the algorithm uses 2 transforms, on the left and right sides of the fold:
image

It could be smoother with more than 2 transforms:
image

So at this point, do you think we need this kind of complexity?
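
For readers without access to the Colab draft, the two-sided scheme can be approximated by shifting each column down by a tent-shaped amount: one linear ramp on the left of the fold and one on the right. This is a simplified stand-in and not the perspective transforms used in the draft. Reading `gradient_width` and `gradient_height` as fractions of page width, and padding with a `fill` value, are assumptions.

```python
import numpy as np

def fold_columns(image, fold_x, gradient_width=0.2, gradient_height=0.1, fill=255):
    """Shift columns of a 2-D image downward around a vertical fold line.

    gradient_width:  half-extent of the fold, as a fraction of page width.
    gradient_height: depth at the fold line, as a fraction of page width.
    Both interpretations are assumptions made for this sketch.
    """
    h, w = image.shape[:2]
    half = max(1, int(round(gradient_width * w)))
    depth = gradient_height * w
    out = np.full_like(image, fill)
    for x in range(w):
        # Two linear ramps: rising toward fold_x on the left, falling on the right.
        t = max(0.0, 1.0 - abs(x - fold_x) / half)
        dy = min(h, int(round(depth * t)))
        # Move the column down by dy pixels; vacated rows keep the fill value.
        out[dy:, x] = image[:h - dy, x]
    return out

page = np.arange(60, dtype=np.uint8).reshape(6, 10)
folded = fold_columns(page, fold_x=5, gradient_width=0.2, gradient_height=0.2)
print(folded)
```

Using more than two ramps would amount to replacing the tent with a smoother curve in the line that computes `t`.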

@jboarman
Sponsor Member Author

> It could be smoother with more than 2 transforms ... So at this point, do you think we need this kind of complexity?

Regarding your last question, I think 2 transforms are good enough for V1 of this augmentation. Taking it further would likely involve some semi-3D aspects, the direction of the light source, etc., which will require more thought and are not worth it for where we are with the library.

Also, regarding the folds at arbitrary angles, that too can be deferred to a future date since it's not truly necessary for a V1 implementation. It's something we can improve later as needed.

@kwcckw
Collaborator

kwcckw commented Jul 14, 2021

> It could be smoother with more than 2 transforms ... So at this point, do you think we need this kind of complexity?
>
> Regarding your last question, I think 2 transforms are good enough for V1 of this augmentation. Taking it further would likely involve some semi-3D aspects, the direction of the light source, etc., which will require more thought and are not worth it for where we are with the library.
>
> Also, regarding the folds at arbitrary angles, that too can be deferred to a future date since it's not truly necessary for a V1 implementation. It's something we can improve later as needed.

Alright got it, thanks for the input.

@proofconstruction
Collaborator

proofconstruction commented Jul 15, 2021

This looks really good already, great work @kwcckw! Please submit a PR adding this augmentation.

I think the artifacts you circled above look pretty natural; the edge of the "page" looks clean except for the corner where it was folded, and the dark spots would appear when copied by a real scanner (see images in #13 ).

In the future, we can probably also approximate part of the "crumpled paper" effect I was thinking of in #17 by adding lots of these folds with small gradient widths and varying lengths (so the fold doesn't extend across the whole image).

@kwcckw
Collaborator

kwcckw commented Jul 15, 2021

> This looks really good already, great work @kwcckw! Please submit a PR adding this augmentation.

Thanks, I just submitted the pull request, but it might still be buggy under some conditions; it needs further testing.

> I think the artifacts you circled above look pretty natural; the edge of the "page" looks clean except for the corner where it was folded, and the dark spots would appear when copied by a real scanner (see images in #13 ).

Yeah, the edges may look more natural if we apply the perspective transform a few more times to smooth them, but that would probably be overkill for now.

> In the future, we can probably also approximate part of the "crumpled paper" effect I was thinking of in #17 by adding lots of these folds with small gradient widths and varying lengths (so the fold doesn't extend across the whole image).

Right, but the challenge I can see is how to make it look natural: if the fold doesn't extend across the whole image, a lot of work needs to be done on the starting line of the folding effect. Right now I don't have a clear idea of how to address this problem.

@proofconstruction
Collaborator

PR #28 has been merged.

@jboarman
Sponsor Member Author

jboarman commented Aug 9, 2022

For future reference, this Stack Overflow thread provides insight into some options that give us better mesh-oriented folds:

https://stackoverflow.com/a/53908416/764307

image

4 participants