training datasets #5

Closed
dexhunter opened this issue Aug 17, 2017 · 13 comments

@dexhunter
Contributor

Hi! As @lllyasviel asked in makegirls, I was wondering whether @lllyasviel would be willing to share the training datasets for this repo.

I have collected and preprocessed a dataset of anime faces with corresponding sketches (96x96 x2). Here is the link:

Link: https://pan.baidu.com/s/1o87HtRc Password: yjmi

@lllyasviel
Owner

As you wish.
400,000 images: https://nico-opendata.jp/en/seigadata/index.html
sketch maker: https://github.com/lllyasviel/sketchKeras

@dexhunter
Contributor Author

dexhunter commented Aug 18, 2017

@lllyasviel Hi! Having checked out the niconico dataset, I found too many 'noisy' images (weird cartoons, not usable). So do you use all of the images as input (without labels?), or do you pick some of them, by hand or with a script?

Edit:
Or do you use other data sets as well?

@dexhunter dexhunter reopened this Aug 18, 2017
@lllyasviel
Owner

just download the dataset in reversed order.

@dexhunter
Contributor Author

dexhunter commented Aug 18, 2017 via email

@lllyasviel
Owner

You can download in reversed order and skip part of the dataset.
The quality can be relatively higher that way.
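
To illustrate the advice above, here is a minimal sketch of choosing which archives to fetch first. It is not the owner's actual script. It assumes the archives follow the `dataset_NNN` naming seen elsewhere in this thread; the helper `pick_archives`, the index range 1 to 67, and the limit of 3 are hypothetical, and the download step itself is left out because the thread does not describe it.

```python
import re


def pick_archives(names, limit):
    """Return up to `limit` archive names, highest index first.

    Assumes names end in `dataset_NNN` (hypothetical, based on the
    archive names mentioned in this thread).
    """
    pattern = re.compile(r"dataset_(\d+)$")
    indexed = []
    for name in names:
        match = pattern.search(name)
        if match:  # ignore anything that does not follow the assumed naming
            indexed.append((int(match.group(1)), name))
    indexed.sort(reverse=True)  # reversed order: largest index first
    return [name for _, name in indexed[:limit]]


# Hypothetical listing of archive names; only 066 and 067 appear in the thread.
available = [f"dataset_{i:03d}" for i in range(1, 68)]
selected = pick_archives(available, limit=3)
print(selected)  # ['dataset_067', 'dataset_066', 'dataset_065']
```

Limiting the selection keeps only part of the dataset, which matches the suggestion not to use all of it.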

@dexhunter
Contributor Author

Hi! I think it's okay for you to collect your own dataset, but you should at least be honest about what kind of data you are using.

Clearly, the open dataset you pointed to, with so much noisy data, cannot produce the illustrations in the README. Here is a peek at dataset_067 & dataset_066:
[image: out3]

[image: out4]

@lllyasviel
Owner

lllyasviel commented Aug 19, 2017

The sketches in the README are real sketches drawn by humans, plus sketches that sketchKeras produced from some random Google image search results.
It would be meaningless to use sketches from the training database to show the model's results.

@dexhunter
Contributor Author

So you hand-picked the sketches & non-character images out of the datasets?

@dexhunter dexhunter reopened this Aug 19, 2017
@lllyasviel
Owner

Just train on the noisy dataset and then test your model with your favourite sketch.
Good luck.

@lllyasviel
Owner

"Noisy" data is important for these models, including paintschainer. paintschainer even has its own way of making data noisy. Noisy is fine.

@dexhunter
Contributor Author

Hmm... Interesting, I will take a look after the training. Thanks anyway.

@lllyasviel
Owner

lllyasviel commented Aug 19, 2017

I also did not believe in the power of this noisy dataset at the beginning, but I changed my mind after finding this excellent project based on it.
https://nico-opendata.jp/ja/casestudy/neural_style_synthesizer/index.html

@diyseguy

diyseguy commented Apr 4, 2018

Has anyone tried other datasets? I would like to run this locally (since the website has been down for a while). I am still waiting for approval from nico-opendata to access these images. Thank you.

@ghost ghost mentioned this issue May 22, 2019
3 participants