
Greatly increase max resolution output by taking advantage of this chrominance optimization #17

Closed
jantic opened this issue Nov 6, 2018 · 5 comments
jantic commented Nov 6, 2018

Source: MayeulC on HackerNews, thread:

https://news.ycombinator.com/item?id=18363870#18369410

"Now, there seems to be a distinct loss of details in the restored images. The network being resolution-limited, is the black-and-white image displayed at full resolution besides the restored one?

What I would like to see is the output of the network to be treated as chrominance only.

Take the YUV transform of both the input and output images, scale back the UV matrix of the restored one to match the input, and replace the original channels. I'd be really curious to look at the output (and would do it myself if I was not on a smartphone)!"
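The recombination described above can be sketched with Pillow. This is a sketch under assumptions: the function name is mine, and it uses YCbCr (Pillow's built-in YUV-style colorspace) rather than a literal YUV transform; the thread does not specify either.

```python
from PIL import Image

def recombine_chroma(original_gray, colorized):
    """Keep full-resolution luma from the original image; take only the
    chroma (Cb/Cr) channels from the network's colorized output."""
    orig_ycc = original_gray.convert("YCbCr")
    # Upscale the colorized output so its chroma matches the input size
    # (Pillow's resize defaults to bicubic resampling).
    col_ycc = colorized.convert("YCbCr").resize(orig_ycc.size)
    y, _, _ = orig_ycc.split()    # detail comes from the input
    _, cb, cr = col_ycc.split()   # color comes from the model
    return Image.merge("YCbCr", (y, cb, cr)).convert("RGB")
```

The point is that the network only needs to produce plausible color at whatever resolution it can manage; the fine detail always comes from the untouched source image.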

@jantic jantic self-assigned this Nov 14, 2018

jantic commented Nov 16, 2018

And... this is done! So happy about this one.


MayeulC commented Nov 22, 2018

Hey, I just read your answer on HN. I'm glad to have been helpful, and that you were able to take advantage of this. Also, thank you for making this an issue, it makes further discussion easier.

For future reference, here is some material that was part of the original comment thread:


I had a look at dabb3a0 but couldn't determine whether you reduced the dimensionality of the input/output data. Unfortunately, I am not familiar enough with the code to tell, or to contribute in a meaningful way.

I touched on this idea here and there, but the basic idea is that you should be able to reduce the size of the input data a lot (by a factor of 3) by feeding your network the luma channel instead of RGB, and have it output only the chroma channels. You can probably leave most hyperparameters untouched, although you might be able to reduce the network's size further (I am by no means an authority on this, so take it with a grain of salt). This could provide quite sizeable performance improvements, mostly for training, but also at runtime.
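The "factor of 3" above is just channel arithmetic. A back-of-envelope sketch (the image size is an arbitrary example, not from the thread):

```python
# Per-image data volume for one H x W training sample, counted in channels.
H, W = 256, 256

rgb_in,  rgb_out    = 3 * H * W, 3 * H * W  # current: RGB in, RGB out
luma_in, chroma_out = 1 * H * W, 2 * H * W  # proposed: Y in, (U, V) out

print(rgb_in / luma_in)                             # 3.0: input shrinks 3x
print((rgb_in + rgb_out) / (luma_in + chroma_out))  # 2.0: 2x end to end
```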


jantic commented Nov 22, 2018

@MayeulC So yes, I did consider reducing the dimensionality of the input like you suggested. But here's the thing: as far as I can tell, it wouldn't actually make a huge impact on model size/efficiency. Reason being: the grayscale input currently comes in as 3 channels in the input layer, but is immediately expanded into much higher-dimensional data as the model processes it: 3 to 64 to 256 to 512, etc. At that point I'd expect the model to be effectively consolidating the redundancies in the channels (when I reduce the dimensions I get reduced performance). So really, as soon as they're processed, the fact that there were 3 channels or one quickly becomes almost irrelevant; the relevant information, redundant or not, has already been extracted.

So in other words, I'd expect reducing the input channels to make almost no difference here. To complicate matters, I'm also using a pretrained network that expects 3 channels. That pretrained network (Resnet34) has hard-earned weights (coefficients) that took a lot of time to train on somebody else's machine, so I think those are worth keeping as well; there isn't going to be a 1-channel pretrained version of it.
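A rough parameter count illustrates why the input-channel saving vanishes inside the network. The 7x7 kernel and 64 filters match ResNet-34's first conv layer; the ~21.8M total is an approximate published figure, used here only for scale:

```python
# Weights in a first conv layer: in_channels * k * k * filters (no bias).
k, filters = 7, 64
params_rgb  = 3 * k * k * filters  # 3-channel (RGB) input
params_gray = 1 * k * k * filters  # 1-channel (luma) input

resnet34_total = 21_800_000        # approximate total parameters, for scale

saved = params_rgb - params_gray
print(saved)                             # 6272 weights saved
print(f"{saved / resnet34_total:.4%}")   # a tiny fraction of the model
```

Everything downstream of that first layer (the 3 → 64 → 256 → 512 expansion) is identical either way, which matches the intuition above.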

I might be wrong somehow here. Please correct me if that's the case!


MayeulC commented Nov 25, 2018

All right, your explanation makes sense. I was expecting the dimensionality gains to propagate down the layers, but it's true that they might only provide tangible gains in the first (and last) layers. And the fact that you are using a pre-trained network also makes sense!

I am afraid I can't really provide much more interesting input for your project, I wish you the best of luck with it, and I look forward to its next iterations!


jantic commented Nov 25, 2018

@MayeulC Dude, you made such a huge impact on this project already! That was the single most impactful improvement I've been able to make on the rendering. So thank you thank you thank you.

This issue was closed.