prisma seems to preserve more detail #58
I find that Prisma seems to preserve more detail when doing style transfer. Here is an example:

[image: original content image]
[image: udnie style image]
[image: Prisma result using the udnie style]
[image: fast-neural-style result using the udnie style]

The main difference is that Prisma's result keeps noticeably more of the original content detail. I do not know how Prisma achieves this; I have already tuned many hyperparameters during training.
Comments
Here's the result from my newly trained model after 40k iterations. The "colored mesh" did not spread throughout the sky as I feared, but in fact retreated, and the overall look improved. Still, it has almost no similarity to the original style. In this image I'd point out the white "ghosts" behind the bridge and the buildings on the left. Here they blend quite well into the background, but in my experiments I've seen images almost totally dominated by such "ghost" shapes, especially in the sky, and especially with higher style weights.
I repeated the training with content_weight=1 and style_weight=3, up to 40k iterations.
Here's the resulting image. I then wrote a script to copy the original colors into an image (https://gist.github.com/htoyryla/147f641f2203ad01b040f4b568e98260) and used it to make the second image. I think it would be possible to get even closer to the Prisma look by further fine-tuning the weights, and, to recover a touch of detail, by blending the different channels of the original image into the result in a suitable mix, instead of simply copying the color information.
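The gist itself isn't reproduced above, but the core idea is straightforward. Here is a minimal Torch sketch of it, with hypothetical filenames (not necessarily the gist's actual implementation): convert both images to YUV, keep the stylized luminance, and take the color channels from the original.

```lua
require 'image'

-- Minimal sketch of the color-copy idea (hypothetical filenames):
-- take the luminance (Y) from the stylized result and the color
-- channels (U, V) from the original photo.
local original = image.rgb2yuv(image.load('original.jpg', 3, 'float'))
local stylized = image.load('stylized.png', 3, 'float')
stylized = image.rgb2yuv(image.scale(stylized, original:size(3), original:size(2)))

local out = original:clone()
out[1]:copy(stylized[1])  -- channel 1 is Y; U and V stay as in the original
image.save('stylized_with_original_colors.png', image.yuv2rgb(out))
```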
Wow, nice work @htoyryla! I think your Udnie model is better than mine :) In general I don't think Prisma is doing anything fundamentally different from fast-neural-style; they have probably just spent a lot of time and effort carefully tuning the hyperparameters of their models to give nice effects. I think they also do some post-processing to blend the raw neural-net output with the content image; there are a lot of different ways to blend images, and I suspect they tune the post-processing per style as well to make sure the results look good.
"I think they also do some post-processing to blend the raw neural-net output with the content image; there are a lot of different ways to blend images, and I think they also tune the post-processing per style to make sure their results are nice." That's exactly what I was thinking when I wrote about blending. I simply copied the Y channel, but if one wants a touch of detail then I think one could blend some of the other channels. And the optimum way to do this is likely to be specific to style. I used my own dataset consisting of 2500 photos, places and landscapes. I have noticed that using it can give quite different results from mscoco. I'll check now the same training but using mscoco. |
Interesting; I have only tried training with COCO, but I'm pretty sure the training images matter. I think the number of training images is also important; in Dmitry's instance normalization paper (https://arxiv.org/abs/1607.08022) he mentions that his best results were trained with only 16 content images. I haven't experimented much with training sets, but this seems to be an important area for exploration.
The results look awesome! Thanks! I will try to do more work on post-processing.
@jcjohnson I have tried training with 200K images from COCO, MIT Places, and ImageNet. It does not seem to give better results. I will try this again later.
@jcjohnson what are your parameters for the_wave style? I can't seem to reproduce your results with the parameters from print_options.lua. Here is my result, compared with your result and Prisma's:
My wave model does not use instance norm, so you should set -use_instance_norm 0 when training.
Is there a way to change parameters manually inside a model, something like selecting the weights that interest us and changing their values? This script uses the VGG-16 model, but VGG-19 gave me better results in the slow neural-style from fzliu. Can I switch to training with VGG-19 using your script?
The first model is the best, but it has some glitches that I want to erase in future training sessions.
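On the first question above, a rough, hypothetical sketch of editing a Torch checkpoint's weights by hand (made-up path and layer index; fast-neural-style checkpoints may also need the repo's custom layers required before torch.load will succeed):

```lua
require 'torch'
require 'nn'

local checkpoint = torch.load('model.t7')     -- hypothetical path
local model = checkpoint.model or checkpoint  -- depends on how it was saved
print(model)                                  -- inspect the layer structure

local layer = model:get(2)                    -- pick a layer by index
if layer.weight then
  layer.weight:mul(0.9)                       -- e.g. scale its weights by 0.9
end
torch.save('model_edited.t7', model)
```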
I see you have copied the arch c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,U2,c9s1-3 from my comment (c9s1-16 means a 9x9 conv with stride 1 and 16 filters, d a downsampling conv, R a residual block, and U2 a 2x upsampling layer). As @jcjohnson commented in another thread, that might be a poor choice, and perhaps one should add a conv layer between the two U2 layers (even though for me it worked without one). A possible modified arch string is sketched below.
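For concreteness, the suggested modification might look like the following arch string (an unverified guess: the c3s1-32 token assumes the arch parser accepts an extra 3x3 conv between the upsampling layers):

```
c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,c3s1-32,U2,c9s1-3
```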
@htoyryla could you make your dataset and your pretrained models' parameters available?
After almost three years of doing other things, no, sorry. |