
Original Model 128 worth it? #385

Closed
andenixa opened this issue May 2, 2018 · 89 comments

Comments

@andenixa
Contributor

andenixa commented May 2, 2018

I have created a rough version of Original Model with dimensions 128, 128, 3.

Rationale:

There seems to be increasing demand for HD face-swapping, while no one has had any luck with GAN128 as far as I can tell from the issues and the playground. In addition, it could also cover more of the face area.

Is releasing Original128 worth it? I'm still assessing its efficiency. I had to sacrifice some color data to keep within memory / speed limitations, but overall it's not very visible (as opposed to GAN128, which discards the original color data). Speed seems to be up to snuff. I'm trying one-to-many scenarios as well. A LowMem version probably won't ever be created for this.

Could also do Orig256 and Orig512, but those definitely won't fit in consumer GPU RAM.
--cheers
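For readers who want a concrete picture of what "dimensions 128, 128, 3" means for the plugin, here is a minimal illustrative Keras sketch of a 128px encoder front-end. The layer counts, filter sizes and names (IMAGE_SHAPE, ENCODER_DIM) are placeholders in the spirit of the Original model, not the actual code in the PR:

```python
from keras.layers import Input, Conv2D, Dense, Flatten, Reshape
from keras.models import Model

IMAGE_SHAPE = (128, 128, 3)   # Original is (64, 64, 3); doubling the side quadruples the pixel count
ENCODER_DIM = 1024            # bottleneck width; VRAM use grows with this and the image size

inp = Input(shape=IMAGE_SHAPE)
x = Conv2D(128, 5, strides=2, padding='same', activation='relu')(inp)   # 64x64
x = Conv2D(256, 5, strides=2, padding='same', activation='relu')(x)     # 32x32
x = Conv2D(512, 5, strides=2, padding='same', activation='relu')(x)     # 16x16
x = Conv2D(1024, 5, strides=2, padding='same', activation='relu')(x)    # 8x8
x = Dense(ENCODER_DIM)(Flatten()(x))         # the expensive bottleneck
x = Dense(8 * 8 * 512)(x)
x = Reshape((8, 8, 512))(x)                  # a decoder then upscales 8 -> 128
encoder = Model(inp, x)
```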

@Kirin-kun

Post results eventually.

Something like a short clip, before and after, so we can judge whether it's worth it?

I think the main problem isn't the resolution, but the averaging.

@andenixa
Contributor Author

andenixa commented May 2, 2018

Sure I will. I don't fully understand the theory, though a high-res data-set may contain more detail for reconstruction (and more face coverage as well). I'm also working on the idea that concatenated Conv2D tensors with different kernel sizes could preserve facial features such as wrinkles, freckles, and moles.
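The "concatenated Conv2D tensors with different kernels" idea is essentially an Inception-style block: run the same input through several kernel sizes and stack the results, so both fine detail (wrinkles, moles) and broader structure stay in the feature map. A hedged sketch, not code from the PR:

```python
from keras.layers import Conv2D, Concatenate

def multi_kernel_block(x, filters=64):
    """Convolve the same input with several kernel sizes and concatenate the results."""
    fine   = Conv2D(filters, 3, padding='same', activation='relu')(x)   # small details
    medium = Conv2D(filters, 5, padding='same', activation='relu')(x)
    broad  = Conv2D(filters, 7, padding='same', activation='relu')(x)   # larger structure
    return Concatenate(axis=-1)([fine, medium, broad])
```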

@ruah1984

ruah1984 commented May 2, 2018

You can share it out, and we'll run it and give it a try.

@torzdf
Collaborator

torzdf commented May 2, 2018

Yeah, I think it's worth it. If you can add a new model, then choice is good.

If it's in a state you can share, then please raise a Pull Request so others can test.

Thanks 👍

@gessyoo

gessyoo commented May 2, 2018

I'm willing to test it. Is the Dfaker plugin/model still planned for integration at some point? I can run the Original model with no issues, but can't get the proposed Dfaker code here to work.

@ruah1984

ruah1984 commented May 3, 2018

@andenixa we can test it; I believe a 1080 Ti can support your request.

@andenixa
Contributor Author

andenixa commented May 3, 2018

@torzdf yes, it's in a state I can share. The major issue is to see whether it gives any meaningful results, and since 128 models train longer I am still checking whether it can perform at the level of Original quality-wise. At this stage it learns rather well, but the decoder part significantly lags behind the encoder and I can't predict its limitations.
The funny part is that I can run it with an ENCODER_DIM of ~3k and a batch size of ~42 (there is no -bs limitation such as even numbers only or powers of 2) and it still fits in 1080 Ti memory.
PS: I shamelessly peeked at how GAN128 is implemented, but my model doesn't share its architecture. I only use some GAN128 tricks to conserve GPU RAM.

@torzdf
Collaborator

torzdf commented May 3, 2018

Excellent. Well, whenever you're ready, please raise a PR pointing at the Staging branch. Thanks!

@andenixa
Contributor Author

andenixa commented May 4, 2018

@torzdf
I might be making a PR to Staging as you've suggested. Perhaps someone could get better results, either by using a better data-set and giving it more training or by tweaking the model itself, while I work on my version.
Though I am yet to see consistent results that would at least surpass GAN128. I am pretty happy with its learning ability, but the result it generates is a little "low-fi". On the plus side, it doesn't create aberrations such as twisted lines out of nowhere (the major reason I started making this one instead of the GAN128 trainer).

@andenixa
Contributor Author

andenixa commented May 5, 2018

Still tuning the net. Memory consumption is modest, even for mid-range cards. Speed is quite good, but I can't get a crisp picture, even from the decoder.

@tjdwo13579

tjdwo13579 commented May 7, 2018

I'm not trying to nitpick but is conversion not possible with this model?

I've tried adding "-t OriginalHighRes" to the conversion command, but it's not working.

It says:
Reason: Error when checking : expected input_4 to have shape (None, 128, 128, 3) but got array with shape (1, 64, 64, 3)

Was this commit only meant for training as of now?
Mind my ignorance.. I'm not an expert in this field
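For what it's worth, the error just says the converter is still feeding 64x64 crops to a network whose input layer now expects 128x128, so a converter path for the new size is needed. A quick way to confirm what a loaded model expects (the variable names here are hypothetical):

```python
# Compare what the network expects with what the converter actually feeds it.
print(autoencoder_A.input_shape)   # e.g. (None, 128, 128, 3) for the 128 model
print(face_batch.shape)            # (1, 64, 64, 3) from the old convert path -> mismatch
```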

@andenixa
Contributor Author

andenixa commented May 7, 2018

@tjdwo13579 you might be right; I had forgotten to add the conversion code. I did a PR, yet I wasn't able to test it with the latest git version.
Somehow the new releases became less Windows-path friendly, especially if you are using SMB paths like in my case.
@iperov I shall be sure to check it. Thanks.

@tjdwo13579

@andenixa Thanks for adding the conversion code! I'll try it out now.

@andenixa
Contributor Author

andenixa commented May 7, 2018

@iperov I shall try your interleaved Upscale/ResBlock approach on the decoder, if you don't mind. I also like the face extractor you are using. I want to create something akin to H128, yet maskless. I noticed you reduce memory consumption by using smaller batch sizes. Does it play well with diverse (different lighting condition) data-sets? I've noticed bigger batches contribute to more accurate / generalized models. I wasn't able to refine anything with bs < ~45-48.
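For readers unfamiliar with that decoder style, "interleaved Upscale/ResBlock" roughly means alternating an upscaling step with a residual block at each resolution on the way back up to 128x128. A minimal sketch, assuming a plain UpSampling2D-based upscale rather than whatever pixel-shuffler variant the other repo uses:

```python
from keras.layers import Add, Conv2D, LeakyReLU, UpSampling2D

def res_block(x, filters):
    """Two 3x3 convolutions with a skip connection around them."""
    shortcut = x
    x = Conv2D(filters, 3, padding='same')(x)
    x = LeakyReLU(0.2)(x)
    x = Conv2D(filters, 3, padding='same')(x)
    x = Add()([shortcut, x])
    return LeakyReLU(0.2)(x)

def upscale(x, filters):
    """Double the spatial resolution, then convolve."""
    x = UpSampling2D()(x)
    x = Conv2D(filters, 3, padding='same')(x)
    return LeakyReLU(0.2)(x)

def decoder(x):
    # 8x8 -> 16 -> 32 -> 64 -> 128, with a residual block after every upscale
    for filters in (512, 256, 128, 64):
        x = upscale(x, filters)
        x = res_block(x, filters)
    return Conv2D(3, 5, padding='same', activation='sigmoid')(x)
```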

@andenixa
Contributor Author

andenixa commented May 7, 2018

@iperov looks excellent. Do you think it's possible to preserve the TARGET face's details, freckles, perhaps through another special layer? Do you feel that additional conv layers (in the Encoder) contribute to better detail preservation? I also want to try a deeper approach with an additional Dense+Dropout layer in the middle of the Encoder.
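The "Dense+Dropout layer in the middle of the Encoder" idea would sit at the bottleneck, roughly like this (a sketch assuming a standard flatten/bottleneck layout; the sizes and the 0.4 rate are illustrative):

```python
from keras.layers import Dense, Dropout, Flatten, Reshape

def bottleneck(x, encoder_dim=1024, drop_rate=0.4):
    """Compress the conv features to a vector, drop units to discourage
    memorising identities, then expand back to a feature map."""
    x = Flatten()(x)
    x = Dense(encoder_dim)(x)
    x = Dropout(drop_rate)(x)       # the extra regularisation in the middle
    x = Dense(8 * 8 * 512)(x)
    x = Reshape((8, 8, 512))(x)
    return x
```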

@iperov
Contributor

iperov commented May 7, 2018

@andenixa model experiments with result comparisons are very welcome.

@andenixa
Contributor Author

andenixa commented May 7, 2018

Thanks to @iperov, I am currently testing another revision of the HighRes model adopting their re-scaling idea.
Memory consumption is somewhat high, but you guys with 6 to 8 GB should be fine. Training speed is slower, as there are many more deep layers in the Encoder. When I get a model with reasonably good clarity, I shall adjust it for more face coverage.

@torzdf
Collaborator

torzdf commented May 7, 2018

I'll leave it open as a pr for now. Let me know when you think it's ready for merging.

@andenixa
Contributor Author

andenixa commented May 8, 2018

@torzdf sure, I am just trying to make sure it's not worse than the previous one, considering the decreased learning speed and increased memory demands. I am also trying a sliced-bread design with a dropout layer in the middle, because the previous 64x model (which is the basis for HighResv2) overtrained due to the increased number of Conv2D layers.
I shall of course credit the ideas I may have borrowed from other contributors.
Generally I just want a working 128x tensor with HD quality ;)

@andenixa
Contributor Author

andenixa commented May 12, 2018

Still working on the model. The clarity is fascinating now, but the target vectors sometimes match wrongly aligned faces. I am trying to reduce the number of deep layers to see if it helps, but I shall leave the high-clarity (very deep) Encoder in the code as well, for those who want to experiment.

@andenixa
Contributor Author

@torzdf I've updated my PR for the new model. It seems to be rather sane and stable. It takes some time to train, and resource consumption is around 5 GB at a batch size of 24. The clarity is rather good with a nice data-set. It seems to work with the multi-GPU model as well.

@iperov
Contributor

iperov commented May 13, 2018

@andenixa is SeparableConv useful? What benefits does it provide? Have you compared it against the regular approach with residual layers?

@andenixa
Contributor Author

andenixa commented May 13, 2018

@iperov I think it's slightly faster and less accurate with colors, as it processes the color channels separately (presumably sequentially). It consumes less memory, though it probably has worse convergence in general. I'm trying to squeeze in more layers while keeping reasonable training speed and RAM requirements. Also, ideally the first conv layer has to be 2x the retina side size, yet I think that's unfeasible with Conv2D memory-wise.
If you can fit a proper 128x HalfFace using regular Conv2D, I'd appreciate it.

PS: The reason I can't use OpenFaceSwap is that it's not compatible with current training sets, and I have a lot of manually crafted sets.
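For reference, the conv vs conv_sep trade-off being discussed: a depthwise-separable convolution factorises spatial filtering and channel mixing, so at the same width it needs far fewer weights (and less memory) than a regular convolution. A quick illustrative comparison:

```python
from keras.layers import Conv2D, Input, SeparableConv2D
from keras.models import Model

inp = Input(shape=(32, 32, 256))
regular   = Model(inp, Conv2D(512, 5, padding='same')(inp))
separable = Model(inp, SeparableConv2D(512, 5, padding='same')(inp))

print(regular.count_params())     # 256*512*5*5 + 512       = 3,277,312
print(separable.count_params())   # 256*5*5 + 256*512 + 512 =   137,984
```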

@torzdf
Collaborator

torzdf commented May 14, 2018

Ok, I haven't got time to test this at the moment, but I will merge it into staging.

If anyone wants to check out the staging branch, give it a go and report back your findings, that would be appreciated.

@iperov
Contributor

iperov commented May 14, 2018

@andenixa I made the best H128, without the suxx residual blocks. I removed res blocks from all models. A new super update for all models is upcoming...

@andenixa
Contributor Author

andenixa commented May 15, 2018

@iperov sounds fascinating if you can make it happen. In fact, perhaps we should aim for H256 next. I'm very excited to give your H128 a try; I just need some time to make a training set.

Is H128 considerably different from full-face? For regular faceswap it's just a matter of adjusting the margin matrix and, of course, training it to catch more "space". I actually changed the new HighRes model to cover most of the face, which is going to be in the next revision.

@iperov
Contributor

iperov commented May 15, 2018

H128 has more detail than full-face 128, but doesn't cover the cheeks and beard.
Half-face is good for fakes of women whose cheeks are occluded by hair.

@andenixa
Contributor Author

andenixa commented May 15, 2018

@iperov I am not exactly aiming to create fakes, but rather to have a one-to-many model where I merge multiple faces in the target data-set to catch the unique features of each face. I have been successful with the basic Model by adding extra Conv layer(s) and increasing the neuron count in the dense layers. The problem of poor generalization and over-fitting still persists. It needs some learning rate decay and a lot of training epochs, and it still sucks quality-wise.
The other major problem is that the approach faceswap uses puts too much emphasis on matching color rather than shape, which makes it difficult to "melt" multiple sets together.
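On the learning-rate-decay point: in Keras this can be as simple as giving the optimiser a decay term when compiling the two autoencoders. The values below are placeholders in the style of the Original trainer settings, not something tuned for a one-to-many set-up:

```python
from keras.optimizers import Adam

# lr shrinks each update roughly as lr / (1 + decay * iterations)
optimizer = Adam(lr=5e-5, beta_1=0.5, beta_2=0.999, decay=1e-6)
autoencoder_A.compile(optimizer=optimizer, loss='mean_absolute_error')
autoencoder_B.compile(optimizer=optimizer, loss='mean_absolute_error')
```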

@iperov
Contributor

iperov commented May 15, 2018

Then what are you doing in the faceswap repo?

@andenixa
Contributor Author

andenixa commented May 15, 2018

@iperov faceswap serves my purpose to some extent. It also doesn't have any working 128 model, so I thought I could provide one. Still not sure if my "concoction" works well enough (though it's gotten much better now). Perhaps you could donate some of your code to create a basic H128 with decent quality and speed for the faceswap repo.

@tjess78

tjess78 commented Jun 15, 2018

hmn, latest push still says OOM in both modes, even with batch size 2 on GTX 1060 6Gb

@Kirin-kun

@tjess78 I think you have to checkout the andenixa-patch-1 branch

@andenixa
Contributor Author

andenixa commented Jun 15, 2018

I tested it. It seems to be learning rather well, both performance- and quality-wise.
Further increasing the image size would require a mask, otherwise it won't converge because of the background and Dense-layer limitations (due to memory).
You may expect an increase in image quality, as there is much room for improvement. I might also implement re-loading old saved weights and resizing them to fit the new topology, if that is reasonably practical.
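On re-loading old saved weights into a changed topology: one common Keras approach is loading by layer name and skipping layers whose shapes no longer match, so at least the unchanged convolutional layers keep their training. A sketch, assuming a recent-enough Keras and layer names that line up between the old and new models:

```python
# Layers whose names and shapes still match keep their trained weights;
# resized layers (e.g. the Dense bottleneck) start from fresh initialisation.
new_model.load_weights("encoder.h5", by_name=True, skip_mismatch=True)
```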

@Jasas9754

Jasas9754 commented Jun 15, 2018

@andenixa
1. What are the benefits of the 'shaoanlu' type? I don't think it's learning faster than the original, or more clearly. What's the advantage?
2. You have raised the encoder dim to 1024 (not 512) and changed conv to conv_sep. Do you think the encoder dim is more important?

Thank you

@andenixa
Contributor Author

andenixa commented Jun 15, 2018

@Jasas9754

  1. I think the shaoanlu type has better generalization, but you are not required to use it.
  2. conv_sep takes less memory, but the final result is no different from conv, though it might be slower to catch up at later stages. The encoder dim is very important, especially with unmasked auto-encoders; in fact it plays a great role in learning, clarity, and reconstruction (rough numbers below). Though we don't always have the luxury of making it big enough, because of video memory.
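To make the encoder-dim point concrete with rough numbers: the two Dense layers around the bottleneck dominate the parameter count, and both scale linearly with the encoder dim. An illustrative calculation, assuming an 8x8x1024 feature map before the Flatten (not the exact model shape):

```python
flat = 8 * 8 * 1024                             # flattened encoder output: 65,536 values
for encoder_dim in (512, 1024, 2048):
    down = flat * encoder_dim + encoder_dim     # Flatten -> Dense(ENCODER_DIM)
    up   = encoder_dim * flat + flat            # Dense back up to the feature map
    print(encoder_dim, round((down + up) / 1e6, 1), "M parameters")
# 512 -> ~67.2M, 1024 -> ~134.3M, 2048 -> ~268.6M
```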

@tjess78

tjess78 commented Jun 15, 2018

Narf, I don't get it. No matter what I do, it will not run on my 1060 6GB.
Here is the error. I tried it with the last push from gui 3.0 and andenixa patches 1, 2 and 2-1. Could it be an error in my environment?

Exception in thread Thread-1:
Traceback (most recent call last):
File "d:\ProgramData\Anaconda3\envs\gui\lib\threading.py", line 914, in _bootstrap_inner
self.run()
File "d:\ProgramData\Anaconda3\envs\gui\lib\threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "D:\faceswap-and\scripts\train.py", line 97, in process_thread
raise err
File "D:\faceswap-and\scripts\train.py", line 89, in process_thread
self.run_training_cycle(model, trainer)
File "D:\faceswap-and\scripts\train.py", line 124, in run_training_cycle
trainer.train_one_step(epoch, viewer)
File "D:\faceswap-and\plugins\Model_OriginalHighRes\Trainer.py", line 39, in train_one_step
loss_B = self.model.autoencoder_B.train_on_batch(warped_B, target_B)
File "d:\ProgramData\Anaconda3\envs\gui\lib\site-packages\keras\engine\training.py", line 1220, in train_on_batch
outputs = self.train_function(ins)
File "d:\ProgramData\Anaconda3\envs\gui\lib\site-packages\keras\backend\tensorflow_backend.py", line 2661, in call
return self._call(inputs)
File "d:\ProgramData\Anaconda3\envs\gui\lib\site-packages\keras\backend\tensorflow_backend.py", line 2631, in _call
fetched = self._callable_fn(*array_vals)
File "d:\ProgramData\Anaconda3\envs\gui\lib\site-packages\tensorflow\python\client\session.py", line 1454, in call
self._session._session, self._handle, args, status, None)
File "d:\ProgramData\Anaconda3\envs\gui\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 519, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[65536,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: training_1/Adam/gradients/model_1/dense_1/MatMul_grad/MatMul_1 = MatMul[T=DT_FLOAT, _class=["loc:@training_1/Adam/gradients/model_1/dense_1/MatMul_grad/MatMul"], transpose_a=true, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](model_1/flatten_1/Reshape, training_1/Adam/gradients/model_1/dense_2/MatMul_grad/MatMul)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
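Reading that OOM message: the tensor TensorFlow fails to allocate is a gradient for the bottleneck Dense weight matrix, and at shape [65536, 1024] a single float32 copy is already 256 MiB; Adam keeps extra per-parameter state on top, which is why a 6 GB card can run out even at small batch sizes. A quick sanity check of the size:

```python
print(65536 * 1024 * 4 / 2**20)   # 256.0 MiB for one float32 copy of that matrix
```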

@andenixa
Contributor Author

andenixa commented Jun 15, 2018

@tjess78 seems to be a genuine OOM. What batch size do you use, and are there any other programs using video memory running in the background?
I generally test with a clean environment and a dedicated card. You probably want to disable the Aero theme and switch to Basic altogether. Also close any applications, such as web browsers, that heavily utilize video memory. You should be able to run it with batch sizes of 8-13 (depending on free RAM).

@Jasas9754

People who are short on VRAM can adjust the settings:

ENCODER_DIM = 512

x = Dense(dense_shape * dense_shape * 1024)(x)
x = Reshape((dense_shape, dense_shape, 1024))(x)

x = self.upscale(512, kernel_initializer=RandomNormal(0, 0.02))(inpt)
x = self.upscale(256, kernel_initializer=RandomNormal(0, 0.02))(x)

I trained up to 100k iterations with it (-bs 8). The quality is still better than the original 64. The speed is also acceptable.

@ruah1984

Can anyone integrate iperov's DeepFaceLab here? I think the deepfakes master here should include all kinds of faceswap model studies. I'm not sure what the differences would be, but I hope the master here can include all the different deepfakes models, such as DFaker, the original model, the low memory model, and LIAEF128YAW (5GB+).

@Kirin-kun

Since it's open source, I suppose anyone, with proper credit given, can port iperov's ideas here.

But don't expect him to help.

@andenixa
Contributor Author

andenixa commented Jun 19, 2018

@ruah1984 I don't want to put you off, as we might actually port some of these models eventually.
But if you are really interested in testing iperov's models, I'd suggest just using his fork for that purpose, as everything is already in place there.

@Kirin-kun iperov says he doesn't mind if some of his work is ported to faceswap. I actually asked his permission twice, and he said I don't need his permission.

@ruah1984

I have been trying his work, and the results look good. I just hope we can have a GUI version and put it all together in here as a team.

@torzdf
Collaborator

torzdf commented Jun 19, 2018

As @andenixa says. Anyone is welcome to open a PR to port other models.

@oracle9i88

TypeError: join() argument must be str or bytes, not 'PosixPath'

@oracle9i88

Other models are fine.
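For context, that TypeError usually means a pathlib.Path object reached an os.path.join() (or json) call that, on older Python versions, only accepts strings; the usual workaround is an explicit str() conversion. A rough illustration, not the actual fix that landed:

```python
import os
from pathlib import Path

model_dir = Path("models") / "OriginalHighRes"
# Older os.path.join() rejects Path objects ("must be str or bytes, not 'PosixPath'"),
# so convert explicitly before joining.
weights_path = os.path.join(str(model_dir), "encoder.h5")
```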

@torzdf
Collaborator

torzdf commented Jun 26, 2018

Should be fixed in latest commit.

@agilebean

Yes, I can confirm that the last reported bug (json expects str or bytes) is fixed.
OriginalHighRes works!
Thanks a lot to @andenixa and @torzdf for this extremely short turnaround time, really incredible.

@Kirin-kun

Kirin-kun commented Jul 3, 2018

I tested OriginalHighRes with a small dataset and it looks really good. The faces are more detailed and have more depth / better-defined features than with Original.

A few differences:

  • OriginalHighRes seems to learn gaze direction better
  • OriginalHighRes pupils are a little blurrier than Original's, and the color is lighter/grayish
  • OriginalHighRes conversion seems to blend colors better. I had some patches of lighter skin on the cheeks with Original, but OriginalHighRes gave a better skin tone overall (same convert params)
  • OriginalHighRes seems to have difficulty learning half-open/closed eyes and open/smiling mouths. I'm at 130k iterations and the smiles are just starting to look like the ones produced much faster by Original.

In the end, the only caveats are the pupils, which look a little too much like lifeless gray circles (in Original they are more black, so it's less visible), and eventually the smiles showing teeth that look blurred, but I will see if that improves with more training.

Overall, I'm really satisfied with this model. It might become my preferred model. And its memory management is amazing: with a GTX 1060 6GB, I can manage a batch size of 16.

@andenixa
Contributor Author

andenixa commented Jul 4, 2018

@Kirin-kun thank you for your feedback. I think we would essentially need to add a mask for the eyes to work.
The batch size of 16 has nothing to do with that, as it's quite sufficient. For now, if you are doing something to produce a video, not just testing the model, I'd suggest running the training to 300 epochs. I know it's overkill, but that would probably make the situation a little better with the pupils and eye positions.
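On the idea of a mask for the eyes: since the extractor already produces 68-point landmarks, a toy sketch of an eye-region mask could look like the following (points 36-47 are the standard dlib eye landmarks; the function and array names are illustrative, not existing faceswap API):

```python
import cv2
import numpy as np

def eye_mask(landmarks, size=128):
    """Binary mask over both eye regions, from a (68, 2) array of landmark coordinates."""
    mask = np.zeros((size, size), dtype=np.uint8)
    for start, end in ((36, 42), (42, 48)):        # right eye, left eye point ranges
        hull = cv2.convexHull(landmarks[start:end].astype(np.int32))
        cv2.fillConvexPoly(mask, hull, 255)
    return mask
```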

@Kirin-kun

For the moment, I'm not doing videos.

I use photos as source material. They look a lot better than flickering videos, and it's easier to adjust the convert parameters than with a video of thousands of frames with different zooms on the faces. I tried to do videos, but I had mixed results (with just about all the models). In the end, still pictures of models posing give the best visually seamless faceswap.

I trained more on the same dataset, from 130k to 150k iterations, and the changes are really minute on the converted faces. When comparing, the differences are barely visible.

Further improvements I'd like to see are the pupils eventually looking more lifelike, and a way to handle, at convert time, obstacles like glasses, hair, hands, hats, etc. that cover parts of the destination face.

@torzdf
Collaborator

torzdf commented Sep 22, 2018

Closed as OriginalHighRes is implemented.

@andenixa feel free to open new issues for your new models.

@torzdf torzdf closed this as completed Sep 22, 2018
Repository owner deleted a comment from iperov Jun 29, 2019
Repository owner deleted a comment from iperov Jun 29, 2019
Repository owner deleted a comment from iperov Jun 29, 2019