Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 71 changed files with 8,448 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
## Expected behavior | ||
|
||
*Describe, in some detail, what you are trying to do and what the output is that you expect from the program.* | ||
|
||
## Actual behavior | ||
|
||
*Describe, in some detail, what the program does instead. Be sure to include any error message or screenshots.* | ||
|
||
## Steps to reproduce | ||
|
||
*Describe, in some detail, the steps you tried that resulted in the behavior described above.* | ||
|
||
## Other relevant information | ||
- **Command line used (if not specified in steps to reproduce)**: main.py ... | ||
- **Operating system and version:** Windows, macOS, Linux | ||
- **Python version:** 3.5, 3.6.4, ... |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
* | ||
!*.py | ||
!*.md | ||
!*.txt | ||
!*.jpg | ||
!requirements* | ||
!doc | ||
!facelib | ||
!gpufmkmgr | ||
!localization | ||
!mainscripts | ||
!mathlib | ||
!models | ||
!nnlib | ||
!utils |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
Please don't ruin the code and this good (as I think) architecture. | ||
|
||
Please follow the same logic and brevity/pithiness. | ||
|
||
Don't abstract the code into huge classes if you only win some lines of code in one place, because this can prevent programmers from understanding it quickly. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
## **DeepFaceLab** is a tool that utilizes deep learning to recognize and swap faces in pictures and videos. | ||
|
||
Based on the original FaceSwap repo. **Facesets** from FaceSwap or FakeApp are **not compatible** with this repo. You need to run extract again. | ||
|
||
### **Features**: | ||
|
||
- new models | ||
|
||
- new architecture, easy to experiment with models | ||
|
||
- works on old 2GB cards, such as the GT730. Example of a fake trained on a 2GB GTX 850M notebook in 18 hours: https://www.youtube.com/watch?v=bprVuRxBA34 | ||
|
||
- face data embedded to png files | ||
|
||
- automatic GPU manager, chooses best gpu(s) and supports --multi-gpu | ||
|
||
- new preview window | ||
|
||
- extractor in parallel | ||
|
||
- converter in parallel | ||
|
||
- added **--debug** option for all stages | ||
|
||
- added **MTCNN extractor**, which produces less jittered aligned faces than DLIBCNN but can produce more false faces. Comparison of dlib (left) vs. mtcnn on a hard case: | ||
 | ||
MTCNN produces less jitter. | ||
|
||
- added **Manual extractor**. You can fix missed faces manually or do a full manual extract. Click to watch the video: | ||
[](https://webm.video/i/ogL0DL.mp4) | ||
 | ||
|
||
- standalone, zero-dependency, ready-to-use prebuilt binary for all Windows versions, see below | ||
|
||
### **Model types**: | ||
|
||
- **H64 (2GB+)** - half face with 64 resolution. It is the same as the original FakeApp or FaceSwap, but with the new TensorFlow 1.8 DSSIM loss function, a separate mask decoder, and a better ConverterMasked. With 2GB and 3GB VRAM the model works in reduced mode. | ||
* H64 Robert Downey Jr.: | ||
*  | ||
*  | ||
|
||
- **H128 (3GB+)** - as H64, but with 128 resolution. Better face details. With 3GB and 4GB VRAM the model works in reduced mode. | ||
* H128 Cage: | ||
*  | ||
* H128 Asian face on blurry target: | ||
*  | ||
*  | ||
- **DF (5GB+)** - @dfaker model. As H128, but fullface model. | ||
* DF example - later | ||
|
||
- **LIAEF128 (5GB+)** - new model. Result of combining DF, IAE, and experiments. The model tries to morph the src face to dst while keeping the facial features of the src face, but with less aggressive morphing. The model has problems recognizing closed eyes. | ||
* LIAEF128 Cage: | ||
*  | ||
*  | ||
* LIAEF128 Cage video: | ||
* [](https://www.youtube.com/watch?v=mRsexePEVco) | ||
- **LIAEF128YAW (5GB+)** - currently testing. Useful when your src faceset has too many side faces compared with the dst faceset. It feeds the NN samples sorted by yaw. | ||
- **MIAEF128 (5GB+)** - as LIAEF128, but it also tries to match brightness/color features. | ||
* MIAEF128 model diagram: | ||
*  | ||
* MIAEF128 Ford success case: | ||
*  | ||
*  | ||
* MIAEF128 Cage fail case: | ||
*  | ||
- **AVATAR (4GB+)** - face controlling model. Usage: | ||
* src - controllable face (Cage) | ||
* dst - controller face (your face) | ||
* converter --input-dir contains aligned dst faces in sequence to be converted. This means you can train on 1500 dst faces but use only 100 for conversion. | ||
|
||
### **Sort tool**: | ||
|
||
`hist` groups images by similar content | ||
|
||
`hist-dissim` places the images most similar to each other at the end. | ||
|
||
`hist-blur` sorts by blur within groups of similar content | ||
|
||
`brightness` | ||
|
||
`hue` | ||
|
||
`face` and `face-dissim` currently useless | ||
|
||
Best practice for gathering the src faceset: | ||
|
||
1) first, before sorting, delete whatever groups of aligned images you can. Don't touch images where the target face is mixed with others. | ||
2) `blur` -> delete ~half of them | ||
3) `hist` -> delete groups of similar images and leave only the target face | ||
4) `hist-blur` -> delete the blurred images at the end of each group of similar images | ||
5) `hist-dissim` -> leave only the first **1000-1500 faces**, because the number of src faces can affect the result. For the YAW feeder model, skip this step. | ||
6) `face-yaw` -> just to finalize the faceset | ||
|
||
Best practice for dst faces: | ||
|
||
1) first, before sorting, delete whatever groups of aligned images you can. Don't touch images where the target face is mixed with others. | ||
2) `hist` -> delete groups of similar images and leave only the target face | ||
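
The idea behind `hist`-style sorting can be sketched in a few lines. This is a minimal illustration, not the tool's actual implementation: the function names, the 8-bin histogram, the L1 distance and the greedy chaining are all assumptions made for the example, and images are represented as flat lists of 8-bit grayscale values.

```python
from typing import Dict, List, Sequence


def histogram(pixels: Sequence[int], bins: int = 8) -> List[float]:
    """Normalized intensity histogram of 8-bit grayscale pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    total = float(len(pixels)) or 1.0  # avoid dividing by zero for empty input
    return [c / total for c in counts]


def distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """L1 distance between two histograms (0 means identical)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def sort_by_hist(images: Dict[str, Sequence[int]]) -> List[str]:
    """Greedy chain: start with the first image, then repeatedly append
    the remaining image whose histogram is closest to the last one."""
    hists = {name: histogram(px) for name, px in images.items()}
    remaining = list(images)
    order = [remaining.pop(0)]
    while remaining:
        last = hists[order[-1]]
        nxt = min(remaining, key=lambda n: distance(last, hists[n]))
        remaining.remove(nxt)
        order.append(nxt)
    return order


# Hypothetical toy data: two dark and two bright "images".
toy_images = {
    "dark_a": [10, 20, 30, 15],
    "bright_a": [240, 250, 230, 245],
    "dark_b": [12, 22, 28, 18],
    "bright_b": [235, 251, 228, 240],
}
sorted_names = sort_by_hist(toy_images)
```

With such an ordering, near-duplicates end up next to each other, which is what makes deleting whole groups of similar images practical.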
|
||
### **Prebuilt binary**: | ||
|
||
A zero-dependency binary for Windows 7, 8, 8.1, and 10 (only NVIDIA video drivers are required) can be downloaded via torrent. | ||
|
||
Torrent page: https://rutracker.org/forum/viewtopic.php?p=75318742 (magnet link inside) | ||
|
||
### **Facesets**: | ||
|
||
- Nicolas Cage. | ||
|
||
- Cage/Trump workspace | ||
|
||
download from here: https://mega.nz/#F!y1ERHDaL!PPwg01PQZk0FhWLVo5_MaQ | ||
|
||
### **Pull requesting**: | ||
|
||
I understand that some people want to help, but the result of mass contributions can be seen in deepfakes\faceswap. | ||
There is a high chance I will decline a PR. To save your time, ask me before opening a PR about what you want to change or add. |
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
import numpy as np | ||
import os | ||
import cv2 | ||
|
||
from pathlib import Path | ||
|
||
class DLIBExtractor(object): | ||
    def __init__(self, dlib): | ||
        self.scale_to = 1850 | ||
        # 3100 eats ~1.687GB VRAM on a 2GB 730 desktop card, but >4GB on a 6GB card, | ||
        # yet 3100 doesn't work on a 2GB 850M notebook card; I can't explain this behaviour. | ||
        # 1850 works on a 2GB 850M notebook card, runs faster than 3100, and produces good results. | ||
        self.dlib = dlib | ||
|
||
    def __enter__(self): | ||
        self.dlib_cnn_face_detector = self.dlib.cnn_face_detection_model_v1( str(Path(__file__).parent / "mmod_human_face_detector.dat") ) | ||
        self.dlib_cnn_face_detector ( np.zeros ( (self.scale_to, self.scale_to, 3), dtype=np.uint8), 0 ) | ||
        return self | ||
|
||
    def __exit__(self, exc_type=None, exc_value=None, traceback=None): | ||
        del self.dlib_cnn_face_detector | ||
        return False # propagate any exception raised between __enter__ and __exit__ to the outer level | ||
|
||
    def extract_from_bgr (self, input_image): | ||
        # BGR -> RGB | ||
        input_image = input_image[:,:,::-1].copy() | ||
        (h, w, ch) = input_image.shape | ||
|
||
        detected_faces = [] | ||
        # scale so that the longer side equals self.scale_to | ||
        input_scale = self.scale_to / (w if w > h else h) | ||
        input_image = cv2.resize (input_image, ( int(w*input_scale), int(h*input_scale) ), interpolation=cv2.INTER_LINEAR) | ||
        detected_faces = self.dlib_cnn_face_detector(input_image, 0) | ||
|
||
        result = [] | ||
        for d_rect in detected_faces: | ||
            if isinstance(d_rect, self.dlib.mmod_rectangle): | ||
                d_rect = d_rect.rect | ||
            left, top, right, bottom = d_rect.left(), d_rect.top(), d_rect.right(), d_rect.bottom() | ||
            # map the rectangle back to the coordinates of the original image | ||
            result.append ( (int(left/input_scale), int(top/input_scale), int(right/input_scale), int(bottom/input_scale)) ) | ||
|
||
        return result |
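
The coordinate handling in `extract_from_bgr` can be checked in isolation. The sketch below is only an illustration of that arithmetic, with no dlib or OpenCV involved; the helper names `scale_factor` and `to_original_coords` are hypothetical and do not exist in the file above.

```python
def scale_factor(width, height, scale_to=1850):
    """Factor that resizes the longer image side to `scale_to` pixels."""
    return scale_to / max(width, height)


def to_original_coords(rect, input_scale):
    """Map a (left, top, right, bottom) rectangle detected on the resized
    image back to pixel coordinates of the original image."""
    return tuple(int(v / input_scale) for v in rect)


# A 3700x1000 frame is shrunk by half before detection ...
s = scale_factor(3700, 1000)
# ... so a box found on the resized frame is doubled on the way back.
box = to_original_coords((100, 50, 300, 250), s)
```

Because `int()` truncates, the mapped box can be off by up to one pixel per edge, which is usually negligible at these sizes.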
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
from enum import IntEnum | ||
|
||
class FaceType(IntEnum): | ||
    HALF = 0 | ||
    FULL = 1 | ||
    HEAD = 2 | ||
    AVATAR = 3 #centered nose only | ||
    MARK_ONLY = 4 #no align at all, just embedded faceinfo | ||
    QTY = 5 | ||
|
||
    @staticmethod | ||
    def fromString (s): | ||
        r = from_string_dict.get (s.lower()) | ||
        if r is None: | ||
            raise Exception ('FaceType.fromString value error') | ||
        return r | ||
|
||
    @staticmethod | ||
    def toString (face_type): | ||
        return to_string_list[face_type] | ||
|
||
from_string_dict = {'half_face': FaceType.HALF, | ||
                    'full_face': FaceType.FULL, | ||
                    'head' : FaceType.HEAD, | ||
                    'avatar' : FaceType.AVATAR, | ||
                    'mark_only' : FaceType.MARK_ONLY, | ||
                    } | ||
to_string_list = [ 'half_face', | ||
                   'full_face', | ||
                   'head', | ||
                   'avatar', | ||
                   'mark_only' | ||
                   ] | ||
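
The string mapping above keeps a dict and a list in sync by hand. As a hedged alternative sketch (not the code in this commit), the dict can be derived from the list so the two cannot drift apart; the names `NAMES`, `BY_NAME`, `from_string` and `to_string` are hypothetical, and the `QTY` sentinel is left out for brevity.

```python
from enum import IntEnum


class FaceType(IntEnum):
    HALF = 0
    FULL = 1
    HEAD = 2
    AVATAR = 3
    MARK_ONLY = 4


# Single source of truth: list index == enum value.
NAMES = ['half_face', 'full_face', 'head', 'avatar', 'mark_only']
BY_NAME = {name: FaceType(i) for i, name in enumerate(NAMES)}


def from_string(s):
    """Case-insensitive lookup; raises ValueError for unknown names."""
    try:
        return BY_NAME[s.lower()]
    except KeyError:
        raise ValueError('unknown face type: %r' % s) from None


def to_string(face_type):
    """Inverse of from_string; works because IntEnum members index lists."""
    return NAMES[face_type]
```

A round trip such as `to_string(from_string('Full_Face'))` returns `'full_face'`.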
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,133 @@ | ||
import numpy as np | ||
import os | ||
import cv2 | ||
from pathlib import Path | ||
|
||
from utils import std_utils | ||
|
||
|
||
|
||
def transform(point, center, scale, resolution): | ||
    pt = np.array ( [point[0], point[1], 1.0] ) | ||
    h = 200.0 * scale | ||
    m = np.eye(3) | ||
    m[0,0] = resolution / h | ||
    m[1,1] = resolution / h | ||
    m[0,2] = resolution * ( -center[0] / h + 0.5 ) | ||
    m[1,2] = resolution * ( -center[1] / h + 0.5 ) | ||
    m = np.linalg.inv(m) | ||
    return np.matmul (m, pt)[0:2] | ||
|
||
def crop(image, center, scale, resolution=256.0): | ||
    ul = transform([1, 1], center, scale, resolution).astype( np.int32 ) | ||
    br = transform([resolution, resolution], center, scale, resolution).astype( np.int32 ) | ||
    if image.ndim > 2: | ||
        newDim = np.array([br[1] - ul[1], br[0] - ul[0], image.shape[2]], dtype=np.int32) | ||
        newImg = np.zeros(newDim, dtype=np.uint8) | ||
    else: | ||
        newDim = np.array([br[1] - ul[1], br[0] - ul[0]], dtype=np.int32) | ||
        newImg = np.zeros(newDim, dtype=np.uint8) | ||
    ht = image.shape[0] | ||
    wd = image.shape[1] | ||
    newX = np.array([max(1, -ul[0] + 1), min(br[0], wd) - ul[0]], dtype=np.int32) | ||
    newY = np.array([max(1, -ul[1] + 1), min(br[1], ht) - ul[1]], dtype=np.int32) | ||
    oldX = np.array([max(1, ul[0] + 1), min(br[0], wd)], dtype=np.int32) | ||
    oldY = np.array([max(1, ul[1] + 1), min(br[1], ht)], dtype=np.int32) | ||
    newImg[newY[0] - 1:newY[1], newX[0] - 1:newX[1] ] = image[oldY[0] - 1:oldY[1], oldX[0] - 1:oldX[1], :] | ||
    newImg = cv2.resize(newImg, dsize=(int(resolution), int(resolution)), interpolation=cv2.INTER_LINEAR) | ||
    return newImg | ||
|
||
def get_pts_from_predict(a, center, scale): | ||
    b = a.reshape ( (a.shape[0], a.shape[1]*a.shape[2]) ) | ||
    c = b.argmax(1).reshape ( (a.shape[0], 1) ).repeat(2, axis=1).astype(np.float64) | ||
    c[:,0] %= a.shape[2] | ||
    c[:,1] = np.apply_along_axis ( lambda x: np.floor(x / a.shape[2]), 0, c[:,1] ) | ||
|
||
    for i in range(a.shape[0]): | ||
        pX, pY = int(c[i,0]), int(c[i,1]) | ||
        if pX > 0 and pX < 63 and pY > 0 and pY < 63: | ||
            diff = np.array ( [a[i,pY,pX+1]-a[i,pY,pX-1], a[i,pY+1,pX]-a[i,pY-1,pX]] ) | ||
            c[i] += np.sign(diff)*0.25 | ||
|
||
    c += 0.5 | ||
    return [ transform (c[i], center, scale, a.shape[2]) for i in range(a.shape[0]) ] | ||
|
||
|
||
class LandmarksExtractor(object): | ||
    def __init__ (self, keras): | ||
        self.keras = keras | ||
        K = self.keras.backend | ||
        class TorchBatchNorm2D(self.keras.engine.topology.Layer): | ||
            def __init__(self, axis=-1, momentum=0.99, epsilon=1e-3, **kwargs): | ||
                super(TorchBatchNorm2D, self).__init__(**kwargs) | ||
                self.supports_masking = True | ||
                self.axis = axis | ||
                self.momentum = momentum | ||
                self.epsilon = epsilon | ||
|
||
            def build(self, input_shape): | ||
                dim = input_shape[self.axis] | ||
                if dim is None: | ||
                    raise ValueError('Axis ' + str(self.axis) + ' of ' 'input tensor should have a defined dimension ' 'but the layer received an input with shape ' + str(input_shape) + '.') | ||
                shape = (dim,) | ||
                self.gamma = self.add_weight(shape=shape, name='gamma', initializer='ones', regularizer=None, constraint=None) | ||
                self.beta = self.add_weight(shape=shape, name='beta', initializer='zeros', regularizer=None, constraint=None) | ||
                self.moving_mean = self.add_weight(shape=shape, name='moving_mean', initializer='zeros', trainable=False) | ||
                self.moving_variance = self.add_weight(shape=shape, name='moving_variance', initializer='ones', trainable=False) | ||
                self.built = True | ||
|
||
            def call(self, inputs, training=None): | ||
                input_shape = K.int_shape(inputs) | ||
|
||
                broadcast_shape = [1] * len(input_shape) | ||
                broadcast_shape[self.axis] = input_shape[self.axis] | ||
|
||
                broadcast_moving_mean = K.reshape(self.moving_mean, broadcast_shape) | ||
                broadcast_moving_variance = K.reshape(self.moving_variance, broadcast_shape) | ||
                broadcast_gamma = K.reshape(self.gamma, broadcast_shape) | ||
                broadcast_beta = K.reshape(self.beta, broadcast_shape) | ||
                invstd = K.ones (shape=broadcast_shape, dtype='float32') / K.sqrt(broadcast_moving_variance + K.constant(self.epsilon, dtype='float32')) | ||
|
||
                return (inputs - broadcast_moving_mean) * invstd * broadcast_gamma + broadcast_beta | ||
|
||
            def get_config(self): | ||
                config = { 'axis': self.axis, 'momentum': self.momentum, 'epsilon': self.epsilon } | ||
                base_config = super(TorchBatchNorm2D, self).get_config() | ||
                return dict(list(base_config.items()) + list(config.items())) | ||
        self.TorchBatchNorm2D = TorchBatchNorm2D | ||
|
||
    def __enter__(self): | ||
        keras_model_path = Path(__file__).parent / "2DFAN-4.h5" | ||
        if not keras_model_path.exists(): | ||
            return None | ||
|
||
        self.keras_model = self.keras.models.load_model ( str(keras_model_path), custom_objects={'TorchBatchNorm2D': self.TorchBatchNorm2D} ) | ||
|
||
        return self | ||
|
||
    def __exit__(self, exc_type=None, exc_value=None, traceback=None): | ||
        del self.keras_model | ||
        return False # propagate any exception raised between __enter__ and __exit__ to the outer level | ||
|
||
    def extract_from_bgr (self, input_image, rects): | ||
        input_image = input_image[:,:,::-1].copy() | ||
        (h, w, ch) = input_image.shape | ||
|
||
        landmarks = [] | ||
        for (left, top, right, bottom) in rects: | ||
|
||
            center = np.array( [ (left + right) / 2.0, (top + bottom) / 2.0] ) | ||
            center[1] -= (bottom - top) * 0.12 | ||
            scale = (right - left + bottom - top) / 195.0 | ||
|
||
            image = crop(input_image, center, scale).transpose ( (2,0,1) ).astype(np.float32) / 255.0 | ||
            image = np.expand_dims(image, 0) | ||
|
||
            with std_utils.suppress_stdout_stderr(): | ||
                predicted = self.keras_model.predict (image) | ||
|
||
            pts_img = get_pts_from_predict ( predicted[-1][0], center, scale) | ||
            pts_img = [ ( int(pt[0]), int(pt[1]) ) for pt in pts_img ] | ||
            landmarks.append ( ( (left, top, right, bottom),pts_img ) ) | ||
|
||
        return landmarks |
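
The heatmap decoding done by `get_pts_from_predict` and `transform` can be followed on a tiny example. The sketch below is a pure-Python restatement under stated assumptions, not the code of this commit: the helper names are hypothetical, the border check uses the heatmap's own size instead of the hard-coded 63, and the matrix inverse in `transform` is replaced by its closed form `x = center_x + 200 * scale * (u / resolution - 0.5)`.

```python
def sign(v):
    """-1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def decode_peak(heatmap):
    """Return the (x, y) peak of one heatmap (list of rows), refined by a
    quarter pixel toward the higher neighbour and shifted to the pixel centre."""
    rows, cols = len(heatmap), len(heatmap[0])
    flat = max(range(rows * cols), key=lambda i: heatmap[i // cols][i % cols])
    py, px = divmod(flat, cols)
    x, y = float(px), float(py)
    if 0 < px < cols - 1 and 0 < py < rows - 1:
        x += 0.25 * sign(heatmap[py][px + 1] - heatmap[py][px - 1])
        y += 0.25 * sign(heatmap[py + 1][px] - heatmap[py - 1][px])
    return x + 0.5, y + 0.5


def to_image_coords(pt, center, scale, resolution):
    """Closed-form inverse of the crop mapping: heatmap point -> image point."""
    h = 200.0 * scale
    return (center[0] + h * (pt[0] / resolution - 0.5),
            center[1] + h * (pt[1] / resolution - 0.5))


# Toy 4x4 heatmap with its peak at row 1, column 2.
toy = [
    [0.0, 0.0, 0.3, 0.0],
    [0.0, 0.2, 1.0, 0.5],
    [0.0, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
peak = decode_peak(toy)                                   # sub-pixel peak
mid = to_image_coords((32, 32), (100, 200), 1.0, 64)      # heatmap centre
```

The heatmap centre always maps back to `center`, which is a quick sanity check when changing the crop parameters.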