How tf.image.extract_image_patches works ? #29857

Closed
Khoa-NT opened this issue Jun 17, 2019 · 5 comments
Assignees
Labels
comp:ops (OPs related issues), TF 2.0 (Issues relating to TensorFlow 2.0), type:support (Support issues)

Comments

@Khoa-NT

Khoa-NT commented Jun 17, 2019

Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): 16.04
  • TensorFlow installed from (source or binary): Anaconda for 1.13.1 and Pip for TF 2 Beta
  • TensorFlow version (use command below): TF 1.13.1 and TF 2 Beta
  • Python version: 3.6.8
  • CUDA/cuDNN version: CUDA 10 / cuDNN 7.6
  • GPU model and memory: GTX 1080 , 11GB

Describe the current behavior
I want to split a large grayscale image (1250 x 1250) into patches of 512 x 512. So I tried running tf.image.extract_image_patches on both versions, TF 1.13.1 and TF 2 Beta, following this tutorial.
My parameters:

import tensorflow as tf
import matplotlib.pyplot as plt

# my_input_image: batch of grayscale images, shape = (batch, size_x, size_y)
my_input_image = tf.expand_dims(my_input_image, -1)  # add a "depth" channel as the last axis
ksizes = [1, 512, 512, 1]  # size of each output patch
strides = [1, 256, 256, 1]  # stride between patch origins
rates = [1, 1, 1, 1]  # sampling rate (1 = no dilation)
padding = 'SAME'  # I want zero padding when a patch extends past my_input_image

image_patches = tf.image.extract_patches(my_input_image, ksizes, strides, rates, padding)


image_patches.shape  # => TensorShape([125, 5, 5, 262144]). Why do we have 5 patches in a row?

patch1 = image_patches[0, 0, 0]  # get the 1st patch of the 1st image
patch1 = tf.reshape(patch1, [512, 512, 1])  # reshape the flattened patch back to an image
patch1 = tf.squeeze(patch1)  # remove the depth channel
plt.imshow(patch1)

tf.image.extract_patches outputs a 5x5 grid of image patches.
The zero padding is 143 on each edge.
I don't understand why there are 5 patches in a row.
How does tf.image.extract_patches work?

Describe the expected behavior
It should be a 4x4 grid of image patches with zero padding = 15 on each edge.
Let n denote the number of stride steps.
The number of image patches = n + 1

Input_size = 1250
Output_size= 512
Stride = 256
Padding:

We have the equation:
2*Padding + Input_size = n*Stride + Output_size (1)
We don't know how much zero padding we need, but Padding >= 0, so:
Input_size <= n*Stride + Output_size
1250 <= n*256 + 512
Then 2.88 <= n.
We choose the smallest integer that satisfies this, n = 3.
(1) => zero padding = 15

We will have a 4x4 grid of image patches with zero padding = 15 on each edge.

Please correct me if my calculation is wrong.
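
The manual calculation above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in equation (1); the function name `manual_patch_layout` is hypothetical and it does not describe what TensorFlow computes.

```python
import math

def manual_patch_layout(input_size, patch_size, stride):
    """Solve 2*padding + input_size = n*stride + patch_size for the smallest integer n >= 0."""
    # Smallest number of stride steps whose last patch reaches the end of the input.
    n = max(math.ceil((input_size - patch_size) / stride), 0)
    # Whatever is still missing has to be supplied as zero padding, split over both edges.
    total_padding = n * stride + patch_size - input_size
    return n + 1, total_padding / 2

num_patches, padding_per_edge = manual_patch_layout(1250, 512, 256)
print(num_patches, padding_per_edge)  # 4 15.0
```

With the values from this issue it gives 4 patches per axis and 15 pixels of padding per edge, matching the hand calculation.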

@martinwicke
Member

Please see this excellent explanation on StackOverflow: https://stackoverflow.com/questions/40731433/understanding-tf-extract-image-patches-for-extracting-patches-from-an-image

I don't think there's a bug here, so I will close this issue.

@martinwicke
Member

(That said, I am adding a proper docstring to tf.image.extract_patches)

@Khoa-NT
Author

Khoa-NT commented Jun 23, 2019

@martinwicke : Hi, I've read it.
But I would like to know why the manual calculation and the tf.image.extract_patches result differ.

@martinwicke
Member

1250/256 = 4.8.., because you use 'same' padding, you round up.

The patch size (512) is not relevant to how many patches there are, only to how big they are (they can overlap).
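
The rounding rule described here can be written out as a short sketch. It assumes that tf.image.extract_patches follows the 'SAME' padding convention TensorFlow documents for its convolution ops (output count = ceil(input / stride), padding split as evenly as possible). The function name `same_padding_layout` is hypothetical, and the result is consistent with the 5 patches and the padding of 143 reported in this issue.

```python
import math

def same_padding_layout(input_size, patch_size, stride, rate=1):
    """Patch count and zero padding along one axis under the assumed 'SAME' convention."""
    # The patch count depends only on the input size and the stride.
    out = math.ceil(input_size / stride)
    # A dilated patch covers (patch_size - 1) * rate + 1 input pixels.
    effective_size = (patch_size - 1) * rate + 1
    # Padding is whatever the last patch needs beyond the end of the input.
    total = max((out - 1) * stride + effective_size - input_size, 0)
    before = total // 2
    return out, before, total - before

print(same_padding_layout(1250, 512, 256))  # (5, 143, 143)
```

The patch size enters only the padding amount, not the patch count, which is why the manual estimate of 4 does not apply under 'SAME'.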

tensorflow-copybara pushed a commit that referenced this issue Jun 24, 2019
Fixes #29857.

PiperOrigin-RevId: 254797260
@Khoa-NT
Author

Khoa-NT commented Jun 25, 2019

Example:
Input_picture = np.zeros((1,1250,1250,1))
From tf.extract_image_patches

def extract_patches(x):
    return tf.extract_image_patches(
        x,
        (1, 512, 512, 1),
        (1, 256, 256, 1),
        (1, 1, 1, 1),
        padding="SAME"
        )
Output_extracted = extract_patches(Input_picture)

Output_extracted will have shape:
TensorShape([Dimension(1), Dimension(5), Dimension(5), Dimension(262144)])
( 1, 5, 5, 262144 )
It means there are 5 x 5 = 25 patches.

But we calculate:

1250/256 = 4.8.., because you use 'same' padding, you round up.

The patch size (512) is not relevant to how many patches there are, only to how big they are (they can overlap).

4.8 rounded up = 5
256*5 = 1280
It means we can add zero padding = 1280 - 1250 = 30
=> Padded image size = 1280 x 1280
Then we extract 4 x 4 = 16 patches:
1st patch: 0----------->512
2nd patch: 256------------>768
3rd patch: 512-------------->1024
4th patch: 768---------------->1280

So the correct answer should be ( 1, 4, 4, 262144 )
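
To compare the two layouts side by side, the following sketch lists the patch windows along one axis. It assumes the 'SAME' layout discussed earlier in the thread (5 patches, 143 pixels of padding before the image) and treats the 4-patch layout as what you would get by padding the image to 1280 yourself and then extracting without any implicit padding (the 'VALID' behaviour, stated here as an assumption). The helper names are hypothetical.

```python
def patch_windows(num_patches, patch_size, stride, pad_before):
    """Return (start, end) of each patch along one axis, relative to the unpadded input."""
    return [(i * stride - pad_before, i * stride - pad_before + patch_size)
            for i in range(num_patches)]

def valid_patch_count(input_size, patch_size, stride):
    """Number of patches that fit entirely inside the input, with no implicit padding."""
    return (input_size - patch_size) // stride + 1

# Assumed 'SAME' layout on the 1250-pixel axis: negative or >1250 coordinates are zero padding.
same_windows = patch_windows(5, 512, 256, pad_before=143)
# Manual layout: the image is already padded to 1280, so no further padding is added.
manual_windows = patch_windows(valid_patch_count(1280, 512, 256), 512, 256, pad_before=0)

print(same_windows)    # [(-143, 369), (113, 625), (369, 881), (625, 1137), (881, 1393)]
print(manual_windows)  # [(0, 512), (256, 768), (512, 1024), (768, 1280)]
```

Under these assumptions both layouts are self-consistent; they differ in how much padding is added and therefore in how many windows fit.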

@lvenugopalan lvenugopalan added the TF 2.0 Issues relating to TensorFlow 2.0 label Apr 29, 2020

5 participants