
Why does the input image of the generator contain the target text? #25

Closed
ChuangLee opened this issue Jul 27, 2017 · 10 comments


@ChuangLee

ChuangLee commented Jul 27, 2017

The shape of the input is (batch_size, 256, 256, 6), and the first 3 channels are the image of the target font. Why? I thought it would let the generator cheat...
In my test, if I set all values of the first 3 channels to zero, the generated image was not the same. Although the difference was not as big as I expected, that is not what we want, right?

@kaonashi-tyc
Owner

It is concatenated for ease of processing; the two halves are split apart before being fed into the encoder.
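For intuition, here is a minimal NumPy sketch of that concatenate-then-split pattern (array names are illustrative; this is not the repo's actual code):

```python
import numpy as np

# Hypothetical 6-channel batch: target image in channels 0-2, source in 3-5
batch = np.random.random((16, 256, 256, 6)).astype(np.float32)

# Split back into the two 3-channel halves before the encoder sees anything
target_half = batch[..., :3]   # used for the loss during training
source_half = batch[..., 3:]   # the actual encoder input

assert target_half.shape == (16, 256, 256, 3)
assert source_half.shape == (16, 256, 256, 3)
```

Nothing about the concatenation itself mixes the channels; it is purely a packing convenience.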

@ChuangLee
Author

@kaonashi-tyc
Thanks, I will read the code carefully. But when inferring, how do the values of the first 3 channels affect the result, even slightly?

@kaonashi-tyc
Owner

kaonashi-tyc commented Jul 29, 2017

@ChuangLee

Inference still accepts a 6-channel input, but the target part is not used and has no effect on the output; it is just easier to reuse the preprocessing code for inference as well.
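If the target half is truly ignored at inference time, it can be zero-filled when building the 6-channel input, which also makes the result deterministic. A sketch under that assumption (the function name is hypothetical, not from the repo):

```python
import numpy as np

def build_infer_input(source_img):
    """Pack a normalized 256x256x3 source image into the 6-channel layout,
    zero-filling the target slot (channels 0-2), which should have no
    effect on the output at inference time."""
    pad = np.zeros_like(source_img)  # deterministic placeholder
    return np.concatenate([pad, source_img], axis=2)

x = build_infer_input(np.random.random((256, 256, 3)).astype(np.float32))
assert x.shape == (256, 256, 6)
assert np.all(x[..., :3] == 0)
```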

@ChuangLee
Author

ChuangLee commented Aug 1, 2017

@kaonashi-tyc That's what confuses me. In my test, the target part does make the output different. This is my code:

        import numpy as np
        from scipy import misc  # scipy.misc.imread (removed in newer SciPy)

        mat = misc.imread('my.jpg').astype(np.float)   # 256x256x3 source image
        img = normalize_image(mat)
        # fill the target half (first 3 channels) with random noise
        img_pad = (np.random.random((256, 256, 3)) * 255).astype(np.float)
        img_pad = normalize_image(img_pad)
        img = np.concatenate([img_pad, img], axis=2)   # 6-channel input

img is the input; you will get a different result every time, but the difference is not as big as I expected.

If you change the img_pad line to

        img_pad = np.zeros((256, 256, 3)).astype(np.float)

you will get the same result every time.

@xsmxsm

xsmxsm commented Aug 8, 2017

@ChuangLee

What is the purpose of this code?

mat = misc.imread('my.jpg').astype(np.float)
img = normalize_image(mat)
img_pad = (np.random.random((256, 256, 3)) * 255).astype(np.float)
img_pad = normalize_image(img_pad)
img = np.concatenate([img_pad, img], axis=2)

Do you want to use "my.jpg" instead of the input font in font2img?
How is the result?

@ChuangLee
Author

@xsmxsm
Yeah, I only have source images of the characters I want to infer, of size 256×256.
The result is good; I just don't know how the values of the first 3 channels (the target part) affect the output.

@xsmxsm

xsmxsm commented Aug 8, 2017

@ChuangLee

In the author's code, font2img.py reads characters from a line of a text file and renders them into images. Do you mean you skipped the font2img.py step and used your own images in place of the generated ones? Then what is the style of the characters in your images? Could one use someone's handwritten characters, photographed, to make the images?

Did you also run package.py and train.py, or did you just take the author's trained result and run inference on it?

Can you show me your whole inference code?

Thanks a lot!

@ChuangLee
Author

@xsmxsm
I just modified the code of the infer function so that it takes 256×256 images as input; you have seen the code above.
The font of the source images is still SimSun.

@xsmxsm

xsmxsm commented Aug 9, 2017

@ChuangLee

In infer, it just has these parameters (below is what's in my code):

    python infer.py --model_dir=font_27 \
        --batch_size=4 \
        --source_obj=train.obj \
        --embedding_ids=4 \
        --save_dir=save_dir

If you did the three steps before infer.py, you already have the train.obj or val.obj files.
What is the function of your 256×256 input images, and how do you acquire them?

Thank you so much.

@ChuangLee
Author

@xsmxsm
train.obj or val.obj is just a package of images, you know?
I changed --source_obj to --source_dir, which means a directory containing the source images you want to infer.
That's easy. I think you need to read the project's code more; perhaps debugging can help you understand.
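A rough sketch of what such a --source_dir modification could boil down to (the function and the normalization stand-in are hypothetical, not the project's actual code):

```python
import numpy as np

def normalize_image(img):
    # stand-in for the project's normalize_image helper:
    # scale pixel values from [0, 255] to [-1, 1]
    return img / 127.5 - 1.0

def load_source_batch(image_arrays):
    """Pack decoded 256x256x3 source images into the 6-channel layout,
    with zeros in the target slot, ready for the infer function."""
    batch = []
    for img in image_arrays:
        norm = normalize_image(img.astype(np.float32))
        pad = np.zeros_like(norm)
        batch.append(np.concatenate([pad, norm], axis=2))
    return np.stack(batch)
```

Reading and decoding the files from the directory (with scipy.misc.imread or a modern replacement) would happen before this step.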
