
1024x1024 model question #12

Closed
zelenooki87 opened this issue Sep 12, 2021 · 17 comments
Labels
question Further information is requested

Comments

@zelenooki87

Hi. Do you plan to release a 1024x1024 model?
Or could it be done by modifying the code to cascade all the models?
I think it is better not to downscale images to 128 pixels.
Thank you very much in advance

@Janspiry (Owner)

  1. 1024x1024 model
    Judging from the 64→512 super-resolution task, it needs a large amount of model parameters to recover adequate image detail. I am afraid I cannot train a 1024x1024 super-resolution model in the short term because of GPU limitations.
  2. Cascading all models
    The current pre-trained models cannot be cascaded directly. You may need to train them yourself for new tasks : )
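For illustration, cascading would conceptually mean feeding each stage's output into the next. This hypothetical sketch is not part of the repo's actual API; the `cascade` helper and the toy stages are made up to show the idea:

```python
def cascade(image, stages):
    """Run an image through a chain of SR stages, each stage
    consuming the previous stage's output (hypothetical helper)."""
    for stage in stages:
        image = stage(image)
    return image

# Toy stages that just double a (width, height) tuple, standing in
# for real 64->128 and 128->256 models.
double = lambda size: (size[0] * 2, size[1] * 2)
print(cascade((64, 64), [double, double]))  # (256, 256)
```

In practice each stage would also have to be trained on the distribution of the previous stage's outputs, which is why the existing pre-trained checkpoints cannot simply be chained.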

@zelenooki87 (Author)

zelenooki87 commented Sep 12, 2021

OK.
Could you please tell me how to test your code on 512px cropped face images without downscaling them, if possible? Thanks

@Janspiry (Owner)

Janspiry commented Sep 12, 2021

I do not quite understand. Do you mean upscaling a 512x512 image to a higher resolution?

@zelenooki87 (Author)

I already cropped faces to 512px with another project (GFPGAN). Now I want to try your code on those images without changing the resolution: I want to enhance the faces at the same resolution (512x512 pixels), and then paste them back directly via GFPGAN. If that is not possible, please tell me how I could do it. If the output face photo must be upscaled, no problem, I will downscale it with Photoshop.

@Janspiry (Owner)

I see. You can run the task directly on 512-pixel images, since the model produces a super-resolution result from the 512-pixel "low-resolution" image. You should change the config to `"mode": "HR"` (i.e. no LR image needed) and comment out all code that touches LR or lr, such as `lr_img = Metrics.tensor2img(visuals['LR'])` in sr.py.
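As a sketch of that second step, the LR-dependent lines in sr.py could be guarded instead of deleted outright. This assumes `visuals` is a dict keyed by `'SR'`/`'HR'`/`'LR'` as in the traceback later in this thread; `tensor2img` is stubbed so the snippet stands alone:

```python
def collect_visuals(visuals):
    """Convert only the tensors that are actually present, so running
    with "mode": "HR" (no LR image) does not crash on a missing key.
    tensor2img is stubbed here; the real one is Metrics.tensor2img."""
    tensor2img = lambda t: t  # stand-in for Metrics.tensor2img
    images = {}
    for key in ('SR', 'HR', 'LR'):
        if visuals.get(key) is not None:
            images[key] = tensor2img(visuals[key])
    return images

print(collect_visuals({'SR': 'sr_tensor', 'HR': 'hr_tensor', 'LR': None}))
# {'SR': 'sr_tensor', 'HR': 'hr_tensor'}
```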

@zelenooki87 (Author)

zelenooki87 commented Sep 12, 2021

When I comment out that line, everything goes OK until:

```
21-09-12 15:40:59.820 - INFO: Begin Model Evaluation.
sampling loop time step: 100%|████████| 2000/2000 [05:54<00:00, 5.64it/s]
Traceback (most recent call last):
  File "C:\Users\KaMi\SR3\sr.py", line 150, in <module>
    hr_img = Metrics.tensor2img(visuals['HR'])  # uint8
TypeError: 'NoneType' object is not subscriptable
```

Do you have a solution for this? Thank you

@Janspiry (Owner)

[image]
Actually we take the 512-pixel image as SR and ignore LR, but the vanilla code also needs HR to calculate SSIM. You can comment out all the code that touches HR, or copy the SR image into HR as the ground truth.
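A minimal sketch of the second option (copying SR into HR so the metric code has something to index; the key names are taken from the traceback above, and `ensure_hr` is a hypothetical helper):

```python
def ensure_hr(visuals):
    """If no real HR ground truth exists, reuse the SR input so the
    downstream metric code has something to index. This is only a
    workaround: SSIM against a copy of the input is not meaningful."""
    if visuals.get('HR') is None:
        visuals['HR'] = visuals['SR']
    return visuals

v = ensure_hr({'SR': 'img512', 'HR': None})
print(v['HR'])  # img512
```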

@zelenooki87 (Author)

I got it working. But with the 64→512 model the results are too sharp and not much better than the original.
Could you please suggest which model best suits my case? Or is it possible to upscale photos at sizes other than 512 pixels?
Thanks

@Janspiry (Owner)

I think this has something to do with the unstable pre-trained model. I have some suggestions:

  1. Retrain it with adequate model parameters.
  2. Sample from the SR image rather than from random noise during validation, which requires changing the code `img = torch.randn(shape, device=device)` to `img = x` in sr3_modules/diffusion.py, line 191, then adjusting the timestep range.
  3. Use more than 2000 timesteps directly.
  4. Use another super-resolution model.
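Suggestion 2 amounts to starting the reverse diffusion from the SR image at some intermediate step instead of from pure noise at step T. A pure-Python sketch of picking that subinterval; the start fraction is an assumption you would tune, not a value from the repo:

```python
def reverse_timesteps(total_steps, start_fraction):
    """Return the reversed subrange of timesteps to denoise over when
    sampling from the SR image rather than random noise. With
    start_fraction=1.0 this reduces to the full vanilla schedule."""
    start = int(total_steps * start_fraction)
    return list(range(start - 1, -1, -1))

steps = reverse_timesteps(2000, 0.25)  # denoise over the last quarter
print(steps[0], steps[-1], len(steps))  # 499 0 500
```

The intuition: a 512px input already contains most of the structure, so only the final (low-noise) portion of the chain is needed to refine it.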

@Janspiry Janspiry added the good first issue Good for newcomers label Sep 14, 2021
@lmvgjp

lmvgjp commented Sep 28, 2021

Hi,
I tried your implementation and ran into the same problems as zelenooki87.
Since you have experience in the field, can you please recommend a particular other super-resolution model that could work for this task?
Thanks a lot for your awesome code!

@Janspiry (Owner)

Sorry, I do not know much about other models.
Maybe you can find them via the rankings on https://www.paperswithcode.com/ or the SOTA comparisons in some papers.

@lmvgjp

lmvgjp commented Sep 28, 2021

Thank you very much for your reply, Janspiry!

@lmvgjp

lmvgjp commented Oct 7, 2021

> I think this has something to do with the unstable pre-trained model. I have some suggestions:
>
> 1. Retrain it with adequate model parameters.
> 2. Sample from the SR image rather than from random noise during validation, which requires changing the code `img = torch.randn(shape, device=device)` to `img = x` in sr3_modules/diffusion.py, line 191, then adjusting the timestep range.
> 3. Use more than 2000 timesteps directly.
> 4. Use another super-resolution model.

Hello, I have some questions about training your implementation of the model on my own dataset, and I would be very grateful if you could help me.
What main parameters should be changed during training to improve the quality of the SR output?
What should the timestep range be?
What is the beta schedule?
Do you recommend keeping the learning rate so small (it is 10^-6)?

Thank you very much!

@Janspiry (Owner)

Janspiry commented Oct 8, 2021

Hi, thanks for your attention.

  1. Parameters
    It should be helpful to change the channel parameters and the number of blocks in the ResNet blocks, like this:
    [image]
  2. Range and beta schedule
    The model gradually adds noise until the image becomes a random Gaussian distribution, and then learns the process of removing that noise. The beta schedule decides this noising process; more details can be found in the follow-up version. The timestep range should be a subinterval of the vanilla range, since we sample from the SR image rather than from random noise. You can change linear_end from 1e-2 to 1e-3 or smaller.
    [image]
  3. Learning rate
    It should be larger, e.g. 1e-4, if the model has adequate parameters.
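The linear beta schedule mentioned in point 2 can be sketched in pure Python. The parameter names `linear_start`/`linear_end` mirror the config, but the default values here are illustrative assumptions, with 1e-3 being the smaller end value suggested above:

```python
def linear_beta_schedule(timesteps, linear_start=1e-6, linear_end=1e-3):
    """Evenly spaced betas from linear_start to linear_end inclusive.
    A smaller linear_end means gentler noising near the end of the
    chain, which suits sampling that starts from an SR image."""
    step = (linear_end - linear_start) / (timesteps - 1)
    return [linear_start + i * step for i in range(timesteps)]

betas = linear_beta_schedule(2000)
print(betas[0], betas[-1])  # first and last beta values
```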

@lmvgjp

lmvgjp commented Oct 13, 2021

Hello, thank you so much for your helpful reply!
Very kind of you!
Greetings

@Janspiry Janspiry added question Further information is requested and removed good first issue Good for newcomers labels Oct 23, 2021
@Janspiry Janspiry mentioned this issue Feb 7, 2022
@AndreyStille

> Hello, thank you so much for your helpful reply! Very kind of you! Greetings

Hi,
can you please give some advice if you succeeded with your task? What beta schedule did you use?
I am trying to train 256→1024 with quite a big dataset; the loss is decreasing, but the validation result stays the same as after the first epoch.

@lmvgjp

lmvgjp commented Feb 10, 2022

Hi,
I could not make it work, sorry... the result was too noisy. I think the model needed many more images than I gave it.
I got good super-resolution results directly with this pretrained model, without training it myself:
https://github.com/yangxy/GPEN
