Shouldn't it be "pil_img" instead of "input"? #20

PytaichukBohdan · 2021-11-05T16:16:20Z

Line 59 in 54d273e

pil_img = resize_right.resize(input, out_shape=[smallest_side],

PytaichukBohdan · 2021-11-05T16:19:27Z

Or even np.array(pil_img)
But still, the code isn't running properly after those changes

Got an error:
RuntimeError: adaptive_avg_pool2d(): Expected input to have non-zero size for non-batch dimensions, but input has sizes [1, 1066, 0, 226] with dimension 2 being empty

afiaka87 · 2021-11-06T06:29:25Z

@PytaichukBohdan

Indeed, thanks for filing an issue! I'll patch it.

afiaka87 · 2021-11-06T06:34:09Z

c81fde0
should have fixed it. Let me know if it works for you.

afiaka87 · 2021-11-06T06:35:15Z

@PytaichukBohdan - ah yes, another issue you may be facing is that you have to use multiples of (I believe) 16 (for size < 128) and 32 for the offsets.
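
If the offsets do have to be multiples of a fixed step, one way to satisfy that is to snap them down before passing them in. This is only a sketch: `snap_down` is a hypothetical helper, not part of this repository, and the step values 16 and 32 come from the comment above, which is itself a guess.

```python
def snap_down(value: int, multiple: int) -> int:
    """Round value down to the nearest multiple of `multiple`."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return (value // multiple) * multiple


# Assumed step sizes, taken from the comment above rather than from the code.
print(snap_down(100, 32))  # 96
print(snap_down(50, 16))   # 48
```

An offset that is already a multiple of the step is returned unchanged.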

htoyryla · 2022-02-19T07:56:47Z

Still does not work. ResizeRight expects either a numpy array or a torch tensor, but now it gets a PIL image, which does not have a shape attribute.

This is what I tried, and at least it runs without an error:

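   # Convert the PIL image to a float tensor in (c, h, w) order
   t_img = tvf.to_tensor(pil_img)
   # Resize to smallest_side x smallest_side
   t_img = resize_right.resize(t_img, out_shape=(smallest_side, smallest_side),
                                 interp_method=lanczos3, support_sz=None,
                                 antialiasing=True, by_convs=False, scale_tolerance=None)
   # Add a batch dimension and move to the device before making cutouts
   batch = make_cutouts(t_img.unsqueeze(0).to(device))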

I am not sure what output shape was intended here. As it was, the code produced 1024x512 from a 1024x1024 original with image_size 512; this version produces 512x512.

I am not using offsets, BTW.

As to the images produced, I can't see much happening, but I guess that is another story. In my experience, guidance by comparing CLIP-encoded images is not very useful as such, so I'll probably go my own way and add other forms of image-based guidance. This might depend on the kind of images I work with and how: more visuality than semantics.

PS. I see now that the init image actually means using perceptual losses as guidance, rather than initialising something (like one can do with VQGAN latents for instance). So that's more like what I am after.

htoyryla · 2022-02-19T09:25:36Z

> Or even np.array(pil_img)
> But still, the code isn't running properly after those changes
>
> Got an error: RuntimeError: adaptive_avg_pool2d(): Expected input to have non-zero size for non-batch dimensions, but input has sizes [1, 1066, 0, 226] with dimension 2 being empty

I also tried that first. I guess it fails because the numpy array has shape (h, w, c), while (I think) (c, h, w) is expected. Using to_tensor takes care of this.
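
To make the layout difference concrete, here is a minimal NumPy-only sketch. The array `hwc` is a made-up stand-in for `np.array(pil_img)`, and the transpose plus rescale only mirrors what torchvision's `to_tensor` does for 8-bit images. It is not the code from this repository.

```python
import numpy as np

# Stand-in for np.array(pil_img): an RGB image of height 4 and width 6,
# stored as uint8 in (h, w, c) order, which is how PIL images convert to NumPy.
hwc = np.zeros((4, 6, 3), dtype=np.uint8)
hwc[..., 0] = 255  # fill the red channel

# Move channels first and rescale to [0, 1], mirroring what to_tensor does
# for uint8 input.
chw = hwc.transpose(2, 0, 1).astype(np.float32) / 255.0

print(hwc.shape)  # (4, 6, 3)
print(chw.shape)  # (3, 4, 6)
```

Code that assumes channels-first input would read the first array as a 4-channel image of size 6x3, which is the kind of mismatch that can produce odd or empty dimensions downstream.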

afiaka87 closed this as completed Nov 12, 2021

htoyryla mentioned this issue Feb 19, 2022

Issue #20 still not working. #24

Open
