
Reference for image clipping in image recreation step? #70

Closed
kai-tub opened this issue Jan 13, 2020 · 5 comments

Comments

kai-tub commented Jan 13, 2020

Hi,
first of all, thanks for providing this repository! I really like it and compare it to my own implementation, trying to understand the algorithms. I am very new to this, so sorry if this is a stupid question, but could you/anyone point me to a reference of why you clip instead of normalizing the images during the recreation phase?
The specific lines are the following, in misc_functions.py:

recreated_im[recreated_im > 1] = 1
recreated_im[recreated_im < 0] = 0

Why shouldn't we rescale them to a range between 0 and 1?

Thanks :)

@utkuozbulak (Owner) commented:

Hey, it is already scaled between 0 and 1. The reason for those lines is to force values that become higher than 1 (white in the image domain) or lower than 0 (black in the image domain) back into that range. Otherwise the tensor can't be converted to an image.
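A minimal sketch of what those two lines do (an illustration assuming a NumPy array, not the repository's exact code):

```python
import numpy as np

# An optimized tensor whose values have drifted slightly outside [0, 1].
recreated_im = np.array([-0.05, 0.2, 0.1, 0.9, 1.1])

recreated_im[recreated_im > 1] = 1  # anything brighter than white becomes white
recreated_im[recreated_im < 0] = 0  # anything darker than black becomes black

# Now every value is in [0, 1], so it can be scaled to an 8-bit image.
image = (recreated_im * 255).astype(np.uint8)
print(image.tolist())  # [0, 51, 25, 229, 255]
```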


kai-tub commented Jan 18, 2020

Sorry, but I am still confused.
I understand the process of getting the tensor of the image. But during the optimization process, I wouldn't assume the values to still be mostly between 0 and 1. As I understand the code, everything greater than 1 or lower than 0 is clipped to the corresponding bound. Without thinking too hard about it, I would instead rescale ALL the values to the range between 0 and 1, so that my old minimum and maximum become my new 0 and 1 respectively:

Your method:
[-2, 0, 2] -> [0, 0, 1]

What I thought:
[-2, 0, 2] -> [0, 0.5, 1]

I am missing an explanation of why one method makes more sense than the other and would be grateful for a pointer. :) I bet your method is correct, but I wonder what speaks against the other method and would like to hear your thoughts about this. :)

So, as I understand it, you treat the tensor as an "image" at all times and clip at the end of the optimization process to be able to display the output. I instead thought the tensor could be optimized however it wants, without being treated as an image during the process, and simply "converted" back to an image range when it is done.
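The two candidate mappings above can be sketched like this (a hedged illustration assuming NumPy, not code from either implementation):

```python
import numpy as np

t = np.array([-2.0, 0.0, 2.0])

# Clipping: out-of-range values are saturated, in-range values untouched.
clipped = np.clip(t, 0.0, 1.0)

# Min-max rescaling: every value is shifted and scaled.
shifted = t - t.min()
rescaled = shifted / shifted.max()

print(clipped.tolist())   # [0.0, 0.0, 1.0]
print(rescaled.tolist())  # [0.0, 0.5, 1.0]
```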

@utkuozbulak (Owner) commented:

So, in the beginning there are no negative values in the image because the values are between [0, 255]; from there we normalize to the range [0, 1] by dividing by 255. I think you have no issue with this part.

Now, let's say there is a tensor that we are optimizing/modifying/generating. Its values can become less than 0 or greater than 1. The problem is: how are we going to convert this back to an image?

1-
One way is clipping and then multiplying by 255:

Image = clip(Tensor, 0, 1) * 255

Let's say we have the following values in the tensor:
[-0.05, 0.2, 0.1, 0.9, 1.1]. With the clipping it becomes [0, 0.2, 0.1, 0.9, 1], because we clip everything that is bigger than 1 or less than 0, and the image will be [0, 0.2, 0.1, 0.9, 1] * 255.

2-
Another way (I think this is what your intuition tells you) is min-max rescaling: shift by the minimum value, then divide by the new maximum:

Image = ((Tensor - min(Tensor)) / max(Tensor - min(Tensor))) * 255

Following the same example, [-0.05, 0.2, 0.1, 0.9, 1.1] will become
a) [0, 0.25, 0.15, 0.95, 1.15] after the first step (subtracting min(Tensor), which is -0.05)
b) [0, 0.217, 0.13, 0.826, 1] after the second step (dividing by the new maximum, 1.15)

Now, again, all values are between [0, 1] and the output image is [0, 0.217, 0.13, 0.826, 1] * 255.
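For completeness, the min-max variant from the example above can be sketched in NumPy (an illustration, not code from the repository):

```python
import numpy as np

t = np.array([-0.05, 0.2, 0.1, 0.9, 1.1])

shifted = t - t.min()               # step a: [0, 0.25, 0.15, 0.95, 1.15]
rescaled = shifted / shifted.max()  # step b: roughly [0, 0.217, 0.13, 0.826, 1]
image = (rescaled * 255).astype(np.uint8)

print(np.round(rescaled, 3).tolist())  # [0.0, 0.217, 0.13, 0.826, 1.0]
```

Note that, unlike clipping, this changes every value in the tensor, not just the out-of-range ones.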

Which method is correct? You can make arguments for both. The reason I use (1) is that it modifies the original tensor less: (2), by definition, modifies every value in the tensor. You can also argue that with (1) the scaling is not correct, but in my experience, for visualization it really doesn't matter. This is because the values that fall below 0 or above 1 usually do so by a very small margin, so (1) and (2) produce almost identical results.

However, if you are talking about adversarial examples, by modifying every pixel during scaling you lose important properties of adversarial attacks (which is the reason I continued using clipping after working so long with adversarial attacks).

This is my reason for using (1) instead of (2). Feel free to use either; try it yourself and see whether it makes a large difference.

I hope this makes things clear now.
Take care.

@utkuozbulak utkuozbulak reopened this Jan 18, 2020

kai-tub commented Jan 20, 2020

Thanks!
Yes, I also tested both versions and didn't find big visual differences, although I did find max and min values that exceeded the range by a "bigger" margin, on the same order of magnitude. The issue can be marked as solved, but you ended on a cliffhanger: what important properties of adversarial attacks are lost during scaling? I am happy to read through a reference. :)

@utkuozbulak (Owner) commented:

Glad I could help.

I don't have a specific paper on scaling and adversariality; it is mostly my experience and what I heard from other researchers in discussions. I vaguely remember reading something about box constraints from Nicholas Carlini, though: https://nicholas.carlini.com/papers.

Take care.
