Add Single Fold to Paper #21
Comments
This feature was also on my mind, and I would like to work on it in my next PR.
@shaheryar1 Have you had a chance to work out some ideas for generating random 2D maps that simulate these 3D effects? If so, can you share a code branch where you are working on this so that others can provide feedback and ideas? DocCreator's 3D transforms use a library of 3D meshes that warp the document and apply a ray-casting-type shader. However, that method is substantially more complicated than what is needed to emulate the intended effects. If we can use simpler transforms, then we can maintain higher performance and more maintainable code. For example, we can start with just the affine warps and treat the variable brightness as a secondary concern. For the warp, we have to start with just the question of how we create the 2D transformation map (without actually adding a 3rd dimension for the depth/altitude of the positional shift, which would be used for relative darkness/brightness based on the direction of the light source). As parameters for generating the 2D mesh, I figure we might randomize things like the gradient (angle of ascent/descent), intensity (max peak/valley), and angle of the fold's central line relative to the page; there are probably better params and names than these. But I'm not sure how one would create 2D meshes using math functions or these params. Do you or others have some ideas?
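One possible reading of the parameters above, as a minimal sketch and not the project's implementation: for a single vertical fold, the 2D map can be reduced to one vertical shift per column that peaks on the fold line and falls off linearly on both sides. The names `fold_displacement`, `apply_column_shift`, `fold_x`, `gradient_width`, and `intensity` are hypothetical, and the fold angle is not handled here.

```python
import numpy as np

def fold_displacement(width, fold_x, gradient_width, intensity):
    """Per-column vertical shift (pixels) for one vertical fold.

    Hypothetical parameters: the shift equals `intensity` at column
    `fold_x` and decays linearly to 0 over `gradient_width` columns
    on each side. A negative intensity gives a valley instead of a peak.
    """
    cols = np.arange(width)
    distance = np.abs(cols - fold_x)
    profile = np.clip(1.0 - distance / gradient_width, 0.0, 1.0)
    return np.rint(intensity * profile).astype(int)

def apply_column_shift(image, shifts):
    """Move column x of a 2D grayscale image down by shifts[x] pixels.

    The canvas is padded with white (255) so no pixel is lost.
    """
    height, width = image.shape
    pad = int(np.max(np.abs(shifts)))
    out = np.full((height + 2 * pad, width), 255, dtype=image.dtype)
    for x in range(width):
        top = pad + shifts[x]
        out[top:top + height, x] = image[:, x]
    return out

# Usage: an 11-column black strip folded at its centre column.
shifts = fold_displacement(width=11, fold_x=5, gradient_width=5, intensity=10)
warped = apply_column_shift(np.zeros((4, 11), dtype=np.uint8), shifts)
```

Randomizing `fold_x`, `gradient_width`, and `intensity` per call would give the variation discussed above; a fold at an arbitrary angle would need a true 2D map instead of a per-column shift.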
In addition to the scikit-image affine warp implementation mentioned in the issue, we might borrow ideas from the Albumentations implementation of the elastic warp transform.
@shaheryar1 are you still working on this? Otherwise I will look further into how to implement this. Actually, I think it will be better if we avoid those 3D meshes; it would be much more complicated with them. An affine warp should be able to do something like this, but it definitely needs more work if we want to make it look natural.
The affine warp requires a 2D "mesh" of sorts. Basically, the mesh represents the "before" and "after" positions of key points fed into the warp processing. If we get to adding lighting or differing brightness/darkness levels, then a 3rd dimension of the mesh will be required to represent a "z index". But we will want to get the warp functional before worrying about brightness or darkness. @shaheryar1 is going to work on #13 first. So @kwcckw, feel free to work on an implementation of this feature, and we can all collaborate on a solution based on the results of your initial experiments.
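To make the "before" and "after" key points concrete, here is a rough sketch that builds the two point sets for one vertical fold. Everything in it is an assumption for illustration: the function name `fold_mesh`, its parameters, and the linear falloff are hypothetical. The resulting `src` and `dst` arrays are in (x, y) order and could be handed to a piecewise affine warp such as the scikit-image one mentioned in the issue.

```python
import numpy as np

def fold_mesh(height, width, fold_x, gradient_width, intensity, rows=5, cols=9):
    """Return (src, dst) control points, each of shape (rows * cols, 2).

    src is a regular grid over the page ("before"); dst is the same grid
    with points near the fold line pushed down by up to `intensity`
    pixels ("after"). Points farther than `gradient_width` from the
    fold line do not move.
    """
    ys = np.linspace(0, height - 1, rows)
    xs = np.linspace(0, width - 1, cols)
    grid_x, grid_y = np.meshgrid(xs, ys)
    src = np.column_stack([grid_x.ravel(), grid_y.ravel()])

    # 1.0 on the fold line, falling linearly to 0.0 at gradient_width away.
    falloff = np.clip(1.0 - np.abs(src[:, 0] - fold_x) / gradient_width, 0.0, 1.0)

    dst = src.copy()
    dst[:, 1] += intensity * falloff  # only the y coordinate changes
    return src, dst

# Usage: a 101 x 201 page folded along the column x = 100.
src, dst = fold_mesh(height=101, width=201, fold_x=100, gradient_width=50, intensity=8)
```

The per-point offset `dst[:, 1] - src[:, 1]` is also the quantity that a later "z index" for lighting could be derived from.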
I researched this and did not find much literature on it. For initial experiments, I thought of applying a 2D affine transformation to a sub-part of the image, i.e., to a straight vertical/horizontal line with a 5-10 pixel width. But I did not get a chance to try this approach. @kwcckw, you can try this idea to kick-start this.
Thanks. Following the suggestion by @jboarman, I'm thinking of trying grid distortion and elastic transform from here:
That’s a great start. Can you share your code in some way? A Colab notebook would be ideal, but even a gist would be OK. To get the lighting, we have to translate the implied mesh into a mask that we add to the warped image. So, for each position, we are either descending or ascending, and the value is therefore a positive or negative number. Adding that layer to the base, ensuring a pixel does not go below 0 or above 255, will then darken or lighten the image. For a simple BW image like that example, we’ll only see darkening. Direction of lighting can be dealt with later.
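The masking step described above might look roughly like this. It is a sketch under assumptions, not the notebook's code: the function name `shade_from_displacement`, the `strength` factor, and the convention that ascending columns get lighter are all hypothetical, and `shifts` is taken to be a per-column vertical displacement of the fold.

```python
import numpy as np

def shade_from_displacement(image, shifts, strength=4.0):
    """Add a signed shading layer derived from the fold's slope.

    The slope of `shifts` is positive where the fold ascends and
    negative where it descends. Scaling it by `strength` gives one
    brightness offset per column, which is added to the image and
    clipped to the valid 0..255 range.
    """
    slope = np.gradient(shifts.astype(float))
    mask = np.rint(strength * slope).astype(np.int16)
    # Work in int16 so the sum can leave 0..255 before clipping.
    shaded = image.astype(np.int16) + mask[np.newaxis, :]
    return np.clip(shaded, 0, 255).astype(np.uint8)

# Usage: a mid-gray strip with a peak at the centre column.
shifts = np.array([0, 2, 4, 6, 8, 10, 8, 6, 4, 2, 0])
gray = np.full((3, 11), 128, dtype=np.uint8)
shaded = shade_from_displacement(gray, shifts)
```

Because of the clipping, pixels that are already at 255 cannot get lighter and pixels at 0 cannot get darker, so on a pure black-and-white page only part of the mask is visible.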
Okay, but I first need to figure out the general function to shift the image after the transformation. Google Colab should be good enough to test the code.
Here is the draft of the code in Google Colab: I added some noise, and we are able to fold at 2 locations too: I think it looks much better now with the noise.
This is a great starting point. We should go ahead and proceed with creating a PR based on this implementation since it is viable as it is now. We should continue to track this as an open issue, or split out a V2 for a more advanced semi-3D implementation down the road, but this is great as it is now. What parameters do you see that we should randomize for this version of the implementation? Some ideas for options and potential default values:
The sample sketch below provides a half-baked concept for a gradient intensity measure. This could very easily change in the future as we advance this feature, and we should not be afraid of breaking changes later with a feature that is so new. I'm pretty open to how we parameterize that fold intensity measure, as there are clearly lots of ways to handle that.
Regarding your last question, I think 2 transforms is good enough for V1 of this augmentation. Taking it further would likely involve some semi-3D aspects, direction of the light source, etc., which will require more thought and is not worth it for where we are with the library. Also, regarding the folds at arbitrary angles, that too can be deferred to a future date since it's not truly necessary for a V1 implementation. It's something we can improve later as needed.
Alright, got it. Thanks for the input.
This looks really good already, great work @kwcckw! Please submit a PR adding this augmentation. I think the artifacts you circled above look pretty natural; the edge of the "page" looks clean except for the corner where it was folded, and the dark spots would appear when copied by a real scanner (see images in #13). In the future, we can probably also approximate part of the "crumpled paper" effect I was thinking of in #17 by adding lots of these folds with small gradient widths and varying lengths (so the fold doesn't extend across the whole image).
Thanks, I just submitted the pull request, but it might still be buggy in some conditions; I need to test it further.
Yeah, the edges may look more natural if we apply the perspective transform a few more times to smooth them, but that would probably be overkill for now.
Right, but the challenge I can see is how to make it look natural: if the fold doesn't extend across the whole image, there is a lot of work to be done on the starting line of the folding effect. Right now I still don't have a clear idea of how to address this problem.
PR #28 has been merged.
For future reference, this Stack Overflow thread provides insight into some options that give us better mesh-oriented folds:
A separate feature request has been made for an effect that damages the paper with a crumpled / wrinkled look (see #17). This request, however, starts with a simpler problem: a single fold.
The effect is characterized by the following:
The complexity here is creating a nice fold (either inward or outward) with any kind of ridge definition (most obvious in the last image in the set of examples). I propose that the scope of this first issue relating to 3D transformations be kept simple: do not attempt to create hard-creased edges, but instead focus on smoother folds like the first couple of example images.