Squashed commit of the following:

commit 1976ac5
Author: mdraw <martin.drawitsch@gmail.com>
Date:   Wed Jul 19 16:06:50 2023 +0200

    readme: Add docs for mosaic filter

commit 4a91126
Merge: 7082b35 d9393c6
Author: Martin Drawitsch <martin.drawitsch@gmail.com>
Date:   Wed Jul 19 16:01:07 2023 +0200

    Merge pull request ORB-HD#42 from Shazi199/master

    add replace with mosaic option

commit d9393c6
Author: shazi199 <shazi199@gmail.com>
Date:   Fri Jul 14 16:40:00 2023 +0800

    add replace with mosaic option

commit 7082b35
Author: mdraw <martin.drawitsch@gmail.com>
Date:   Tue Jul 4 15:36:59 2023 +0200

    Use LRU cache for shape transformation

    Only cache transform if shapes are actually the same

commit 814e6a2
Author: mdraw <martin.drawitsch@gmail.com>
Date:   Tue Jul 4 15:22:23 2023 +0200

    Don't cache input shapes because they can vary

    Input shape transforms were previously only calculated once (i.e. for
    one frame / one image) and then cached for all subsequent calls. This
    is fine in 99% of use cases but if you want to process differently
    shaped inputs in the same deface process (given multiple input paths),
    this leads to crashes or even silently incorrect outputs.
    Transforms now have to be recalculated for each frame but this is very
    cheap, so it's okay to do this for the sake of better stability.

    Fixes ORB-HD#41

commit a023f97
Author: mdraw <martin.drawitsch@gmail.com>
Date:   Mon Jul 3 22:54:34 2023 +0200

    Switch model to bn-optimized version, update docs

    Slightly improves resource usage. End results are the same.

    Closes ORB-HD#39.

commit c555264
Author: mdraw <martin.drawitsch@gmail.com>
Date:   Mon Jun 5 03:24:33 2023 +0200

    Auto-select optimal ORT Execution Provider

    Accelerated onnxruntime execution providers like OpenVINO are now
    automatically selected for inference if available. The presumably
    fastest provider is chosen by default. To override this choice you can
    use the new --execution-provider CLI argument.

    Fixes ORB-HD#40.

commit ca5c651
Author: Martin Drawitsch <martin.drawitsch@gmail.com>
Date:   Tue May 2 01:05:37 2023 +0200

    readme: Add links to badges

commit 5647935
Author: Martin Drawitsch <martin.drawitsch@gmail.com>
Date:   Mon May 1 22:34:42 2023 +0200

    readme: Add badges for PyPI and build

commit f80dfaa
Author: mdraw <martin.drawitsch@gmail.com>
Date:   Mon May 1 22:19:12 2023 +0200

    readme: Update description

commit 79f4cb4
Author: Martin Drawitsch <martin.drawitsch@gmail.com>
Date:   Mon May 1 14:44:15 2023 +0200

    Use github workflow for publishing

    Publish on PyPI automatically when a tagged release is detected.

commit 5ae6ded
Author: mdraw <martin.drawitsch@gmail.com>
Date:   Mon Apr 24 02:32:55 2023 +0200

    Switch to pyproject.toml for meta data and build

    Eliminating the need for many setup files and versioneer

commit 48b655e
Author: mdraw <martin.drawitsch@gmail.com>
Date:   Mon Apr 24 01:44:02 2023 +0200

    Add --keep-audio option

    With --keep-audio the audio track is carried over to the output file
    if the input is a video.

    Fixes ORB-HD#4

commit 0d2eba8
Merge: 852286e d013a08
Author: Martin Drawitsch <martin.drawitsch@gmail.com>
Date:   Tue Jan 10 17:47:42 2023 +0100

    Merge pull request ORB-HD#28 from mysablehats/master

    Fixes inconsistent fps issues (fps reading vs fps writing)

commit 852286e
Author: mdraw <martin.drawitsch@gmail.com>
Date:   Tue Jan 10 17:25:40 2023 +0100

    Automatically convert images to RGB color space

    Inference only works in RGB, so RGBA and grayscale images need to be
    transformed first. Output is always RGB, regardless of input format.

commit edfe190
Author: mdraw <martin.drawitsch@gmail.com>
Date:   Tue Jan 10 17:11:56 2023 +0100

    Fix centerface type hint

commit d013a08
Author: Frederico B. K <frekle@kth.se>
Date:   Thu Jun 9 10:36:07 2022 +0200

    fix 25fps issue

commit b1f0656
Author: Frederico B. K <frekle@kth.se>
Date:   Thu Jun 9 10:34:37 2022 +0200

    fix 25fps issue

commit b80a601
Author: mdraw <martin.drawitsch@gmail.com>
Date:   Wed May 18 15:48:53 2022 +0200

    Support overriding fps in --ffmpeg-config flag

    fps value still defaults to source fps, but you can now override it to
    e.g. 15 through --ffmpeg-config='{"fps": 15}'.

    Thanks for the report @Karl48071

    Closing ORB-HD#24

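
Review note: the fps override and --keep-audio changes described in the log both amount to filling in defaults on a copy of the user-supplied ffmpeg config before it reaches the writer. A minimal sketch of that merge follows. The helper name `build_writer_kwargs` is hypothetical (the commit does this inline in `video_detect`); the keys `fps`, `audio_path` and `audio_codec` are the ones that appear in the diff.

```python
from typing import Any, Dict


def build_writer_kwargs(
    ffmpeg_config: Dict[str, Any],
    source_fps: float,
    ipath: str,
    keep_audio: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: merge user ffmpeg options with per-video defaults."""
    # Work on a copy so the caller's dict is not mutated between input files.
    kwargs = dict(ffmpeg_config)
    # An explicit "fps" from --ffmpeg-config wins; otherwise use the source fps.
    kwargs.setdefault("fps", source_fps)
    if keep_audio:
        # Take audio from the input file and copy the stream without transcoding,
        # unless the user already chose other values for these keys.
        kwargs.setdefault("audio_path", ipath)
        kwargs.setdefault("audio_codec", "copy")
    return kwargs


default_cfg = {"codec": "libx264"}
print(build_writer_kwargs(default_cfg, 25.0, "in.mp4"))
print(build_writer_kwargs({"codec": "libx264", "fps": 15}, 25.0, "in.mp4", keep_audio=True))
```

Using `setdefault` on a copy keeps the precedence simple: anything passed through --ffmpeg-config overrides the derived defaults, and the shared config object stays unchanged when several input paths are processed in one run.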
StealUrKill · Sep 21, 2023 · 63729ed
1 parent d45a37c
commit 63729ed
Showing 9 changed files with 261 additions and 78 deletions.
diff --git a/.github/workflows/python-publish.yml b/.github/workflows/python-publish.yml
@@ -0,0 +1,34 @@
+# This workflow will upload a Python Package using Twine when a release is created
+# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python#publishing-to-package-registries
+
+name: Upload Python Package
+
+on:
+  release:
+    types: [published]
+
+permissions:
+  contents: read
+
+jobs:
+  deploy:
+
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v3
+    - name: Set up Python
+      uses: actions/setup-python@v3
+      with:
+        python-version: '3.x'
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install build
+    - name: Build package
+      run: python -m build
+    - name: Publish package
+      uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
+      with:
+        user: __token__
+        password: ${{ secrets.DEFACE_PYPI_PUBLISH }}
diff --git a/.gitignore b/.gitignore
@@ -107,3 +107,5 @@ ENV/
 
 # VS Code
 .vscode/
+
+deface/_version.py
diff --git a/README.md b/README.md
diff --git a/anonfaces/anonfaces.py b/anonfaces/anonfaces.py
@@ -1,11 +1,9 @@
 #!/usr/bin/env python3
 
 import argparse
-import glob
 import json
 import mimetypes
 import os
-import sys
 from typing import Dict, Tuple
 
 import tqdm
@@ -19,9 +17,6 @@
 from anonfaces.centerface import CenterFace
 
 
-# TODO: Optionally preserve audio track?
-
-
 def scale_bb(x1, y1, x2, y2, mask_scale=1.0):
     s = mask_scale - 1.0
     h, w = y2 - y1, x2 - x1
@@ -38,7 +33,8 @@ def draw_det(
         ellipse: bool = True,
         draw_scores: bool = False,
         ovcolor: Tuple[int] = (0, 0, 0),
-        replaceimg = None
+        replaceimg = None,
+        mosaicsize: int = 20
 ):
     if replacewith == 'solid':
         cv2.rectangle(frame, (x1, y1), (x2, y2), ovcolor, -1)
@@ -63,6 +59,13 @@ def draw_det(
             frame[y1:y2, x1:x2] = resized_replaceimg
         elif replaceimg.shape[2] == 4:  # RGBA
             frame[y1:y2, x1:x2] = frame[y1:y2, x1:x2] * (1 - resized_replaceimg[:, :, 3:] / 255) + resized_replaceimg[:, :, :3] * (resized_replaceimg[:, :, 3:] / 255)
+    elif replacewith == 'mosaic':
+        for y in range(y1, y2, mosaicsize):
+            for x in range(x1, x2, mosaicsize):
+                pt1 = (x, y)
+                pt2 = (min(x2, x + mosaicsize - 1), min(y2, y + mosaicsize - 1))
+                color = (int(frame[y, x][0]), int(frame[y, x][1]), int(frame[y, x][2]))
+                cv2.rectangle(frame, pt1, pt2, color, -1)
     elif replacewith == 'none':
         pass
     if draw_scores:
@@ -74,7 +77,7 @@ def draw_det(
 
 def anonymize_frame(
         dets, frame, mask_scale,
-        replacewith, ellipse, draw_scores, replaceimg
+        replacewith, ellipse, draw_scores, replaceimg, mosaicsize
 ):
     for i, det in enumerate(dets):
         boxes, score = det[:4], det[4]
@@ -89,7 +92,8 @@ def anonymize_frame(
             replacewith=replacewith,
             ellipse=ellipse,
             draw_scores=draw_scores,
-            replaceimg=replaceimg
+            replaceimg=replaceimg,
+            mosaicsize=mosaicsize
         )
 
 
@@ -101,7 +105,7 @@ def cam_read_iter(reader):
 def video_detect(
         ipath: str,
         opath: str,
-        centerface: str,
+        centerface: CenterFace,
         threshold: float,
         enable_preview: bool,
         cam: bool,
@@ -111,10 +115,16 @@ def video_detect(
         ellipse: bool,
         draw_scores: bool,
         ffmpeg_config: Dict[str, str],
-        replaceimg = None
+        replaceimg = None,
+        keep_audio: bool = False,
+        mosaicsize: int = 20,
 ):
     try:
-        reader: imageio.plugins.ffmpeg.FfmpegFormat.Reader = imageio.get_reader(ipath)
+        if 'fps' in ffmpeg_config:
+            reader: imageio.plugins.ffmpeg.FfmpegFormat.Reader = imageio.get_reader(ipath, fps=ffmpeg_config['fps'])
+        else:
+            reader: imageio.plugins.ffmpeg.FfmpegFormat.Reader = imageio.get_reader(ipath)
+
         meta = reader.get_meta_data()
         _ = meta['size']
     except:
@@ -136,8 +146,14 @@ def video_detect(
         bar = tqdm.tqdm(dynamic_ncols=True, total=nframes)
 
     if opath is not None:
+        _ffmpeg_config = ffmpeg_config.copy()
+        #  If fps is not explicitly set in ffmpeg_config, use source video fps value
+        _ffmpeg_config.setdefault('fps', meta['fps'])
+        if keep_audio:  # Carry over audio from input path, use "copy" codec (no transcoding) by default
+            _ffmpeg_config.setdefault('audio_path', ipath)
+            _ffmpeg_config.setdefault('audio_codec', 'copy')
         writer: imageio.plugins.ffmpeg.FfmpegFormat.Writer = imageio.get_writer(
-            opath, format='FFMPEG', mode='I', fps=meta['fps'], **ffmpeg_config
+            opath, format='FFMPEG', mode='I', **_ffmpeg_config
         )
 
     for frame in read_iter:
@@ -147,7 +163,7 @@ def video_detect(
         anonymize_frame(
             dets, frame, mask_scale=mask_scale,
             replacewith=replacewith, ellipse=ellipse, draw_scores=draw_scores,
-            replaceimg=replaceimg
+            replaceimg=replaceimg, mosaicsize=mosaicsize
         )
 
         if opath is not None:
@@ -168,14 +184,15 @@ def video_detect(
 def image_detect(
         ipath: str,
         opath: str,
-        centerface: str,
+        centerface: CenterFace,
         threshold: float,
         replacewith: str,
         mask_scale: float,
         ellipse: bool,
         draw_scores: bool,
         enable_preview: bool,
-        replaceimg = None
+        replaceimg = None,
+        mosaicsize: int = 20,
 ):
     frame = imageio.imread(ipath)
 
@@ -185,7 +202,7 @@ def image_detect(
     anonymize_frame(
         dets, frame, mask_scale=mask_scale,
         replacewith=replacewith, ellipse=ellipse, draw_scores=draw_scores,
-        replaceimg=replaceimg
+        replaceimg=replaceimg, mosaicsize=mosaicsize
     )
 
     if enable_preview:
@@ -264,18 +281,27 @@ def parse_cli_args():
         '--mask-scale', default=1.3, type=float, metavar='M',
         help='Scale factor for face masks, to make sure that masks cover the complete face. Default: 1.3.')
     parser.add_argument(
-        '--replacewith', default='blur', choices=['blur', 'solid', 'none', 'img'],
-        help='Anonymization filter mode for face regions. "blur" applies a strong gaussian blurring, "solid" draws a solid black box, "none" does leaves the input unchanged and "img" replaces the face with a custom image. Default: "blur".')
+        '--replacewith', default='blur', choices=['blur', 'solid', 'none', 'img', 'mosaic'],
+        help='Anonymization filter mode for face regions. "blur" applies a strong gaussian blurring, "solid" draws a solid black box, "none" does leaves the input unchanged, "img" replaces the face with a custom image and "mosaic" replaces the face with mosaic. Default: "blur".')
     parser.add_argument(
         '--replaceimg', default='replace_img.png',
         help='Anonymization image for face regions. Requires --replacewith img option.')
+    parser.add_argument(
+        '--mosaicsize', default=20, type=int, metavar='width',
+        help='Setting the mosaic size. Requires --replacewith mosaic option. Default: 20.')
+    parser.add_argument(
+        '--keep-audio', '-k', default=False, action='store_true',
+        help='Keep audio from video source file and copy it over to the output (only applies to videos).')
     parser.add_argument(
         '--ffmpeg-config', default={"codec": "libx264"}, type=json.loads,
         help='FFMPEG config arguments for encoding output videos. This argument is expected in JSON notation. For a list of possible options, refer to the ffmpeg-imageio docs. Default: \'{"codec": "libx264"}\'.'
     )  # See https://imageio.readthedocs.io/en/stable/format_ffmpeg.html#parameters-for-saving
     parser.add_argument(
         '--backend', default='auto', choices=['auto', 'onnxrt', 'opencv'],
         help='Backend for ONNX model execution. Default: "auto" (prefer onnxrt if available).')
+    parser.add_argument(
+        '--execution-provider', '--ep', default=None, metavar='EP',
+        help='Override onnxrt execution provider (see https://onnxruntime.ai/docs/execution-providers/). If not specified, the presumably fastest available one will be automatically selected. Only used if backend is onnxrt.')
     parser.add_argument(
         '--version', action='version', version=__version__,
         help='Print version number and exit.')
@@ -316,9 +342,12 @@ def main():
     threshold = args.thresh
     ellipse = not args.boxes
     mask_scale = args.mask_scale
+    keep_audio = args.keep_audio
     ffmpeg_config = args.ffmpeg_config
     backend = args.backend
     in_shape = args.scale
+    execution_provider = args.execution_provider
+    mosaicsize = args.mosaicsize
     replaceimg = None
     if in_shape is not None:
         w, h = in_shape.split('x')
@@ -329,7 +358,7 @@ def main():
 
 
     # TODO: scalar downscaling setting (-> in_shape), preserving aspect ratio
-    centerface = CenterFace(in_shape=in_shape, backend=backend)
+    centerface = CenterFace(in_shape=in_shape, backend=backend, override_execution_provider=execution_provider)
 
     multi_file = len(ipaths) > 1
     if multi_file:
@@ -361,8 +390,10 @@ def main():
                 draw_scores=draw_scores,
                 enable_preview=enable_preview,
                 nested=multi_file,
+                keep_audio=keep_audio,
                 ffmpeg_config=ffmpeg_config,
-                replaceimg=replaceimg
+                replaceimg=replaceimg,
+                mosaicsize=mosaicsize
             )
         elif filetype == 'image':
             image_detect(
@@ -375,7 +406,8 @@ def main():
                 ellipse=ellipse,
                 draw_scores=draw_scores,
                 enable_preview=enable_preview,
-                replaceimg=replaceimg
+                replaceimg=replaceimg,
+                mosaicsize=mosaicsize
             )
         elif filetype is None:
             print(f'Can\'t determine file type of file {ipath}. Skipping...')

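Review note on the `mosaic` branch of `draw_det`: the tiling can be looked at in isolation from OpenCV. The sketch below reproduces only the corner arithmetic from the diff as a generator. The function name `mosaic_tiles` and the explicit size check are assumptions added for illustration and are not part of the commit.

```python
from typing import Iterator, Tuple

Point = Tuple[int, int]


def mosaic_tiles(
    x1: int, y1: int, x2: int, y2: int, mosaicsize: int = 20
) -> Iterator[Tuple[Point, Point]]:
    """Yield (top-left, bottom-right) corners of each mosaic tile in a box."""
    # Assumption for this sketch: reject non-positive sizes up front.
    # range() raises on a step of 0 and yields nothing for a negative step
    # when x1 < x2, in which case no tile would be drawn at all.
    if mosaicsize < 1:
        raise ValueError("mosaicsize must be a positive integer")
    for y in range(y1, y2, mosaicsize):
        for x in range(x1, x2, mosaicsize):
            pt1 = (x, y)
            # Clamp the far corner so edge tiles do not extend past the box.
            pt2 = (min(x2, x + mosaicsize - 1), min(y2, y + mosaicsize - 1))
            yield pt1, pt2


for tile in mosaic_tiles(10, 10, 50, 50, mosaicsize=20):
    print(tile)
```

In the diff each tile is then filled with the colour of its top-left pixel (`frame[y, x]`), so the mosaic samples one pixel per tile instead of averaging the tile. It may be worth validating `--mosaicsize` at argument parsing time for the reason given in the comment above.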
diff --git a/anonfaces/centerface.onnx b/anonfaces/centerface.onnx
diff --git a/anonfaces/centerface.py b/anonfaces/centerface.py
@@ -1,6 +1,7 @@
-import datetime
 import os
 
+from functools import lru_cache
+
 import numpy as np
 import cv2
 
@@ -9,8 +10,17 @@
 default_onnx_path = f'{os.path.dirname(__file__)}/centerface.onnx'
 
 
+def ensure_rgb(img: np.ndarray) -> np.ndarray:
+    """Convert input image to RGB if it is in RGBA or L format"""
+    if img.ndim == 2:  # 1-channel grayscale -> RGB
+        img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
+    elif img.shape[2] == 4:  # 4-channel RGBA -> RGB
+        img = cv2.cvtColor(img, cv2.COLOR_RGBA2RGB)
+    return img
+
+
 class CenterFace:
-    def __init__(self, onnx_path=None, in_shape=None, backend='auto'):
+    def __init__(self, onnx_path=None, in_shape=None, backend='auto', override_execution_provider=None):
         self.in_shape = in_shape
         self.onnx_input_name = 'input.1'
         self.onnx_output_names = ['537', '538', '539', '540']
@@ -41,11 +51,24 @@ def __init__(self, onnx_path=None, in_shape=None, backend='auto'):
 
             static_model = onnx.load(onnx_path)
             dyn_model = self.dynamicize_shapes(static_model)
-            self.sess = onnxruntime.InferenceSession(dyn_model.SerializeToString(), providers=['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'])
+
+            # onnxruntime.get_available_providers() Returns a list of all
+            #  available providers in a reasonable ordering (GPU providers
+            #  first, then accelerated CPU providers like OpenVINO, then
+            #  CPUExecutionProvider as the last choice).
+            #  In normal conditions, overriding this choice won't be necessary.
+            available_providers = onnxruntime.get_available_providers()
+            if override_execution_provider is None:
+                ort_providers = available_providers
+            else:
+                if override_execution_provider not in available_providers:
+                    raise ValueError(f'{override_execution_provider=} not found. Available providers are: {available_providers}')
+                ort_providers = [override_execution_provider]
+
+            self.sess = onnxruntime.InferenceSession(dyn_model.SerializeToString(), providers=ort_providers)
 
             preferred_provider = self.sess.get_providers()[0]
-            preferred_device = 'GPU' if preferred_provider.startswith('CUDA') else 'CPU'
-            # print(f'Running on {preferred_device}.')
+            print(f'Running on {preferred_provider}.')
 
     @staticmethod
     def dynamicize_shapes(static_model):
@@ -71,14 +94,14 @@ def dynamicize_shapes(static_model):
         return dyn_model
 
     def __call__(self, img, threshold=0.5):
-        self.orig_shape = img.shape[:2]
-        if self.in_shape is None:
-            self.in_shape = self.orig_shape[::-1]
-        if not hasattr(self, 'h_new'):  # First call, need to compute sizes
-            self.w_new, self.h_new, self.scale_w, self.scale_h = self.transform(self.in_shape)
+        img = ensure_rgb(img)
+        orig_shape = img.shape[:2]
+        in_shape = orig_shape[::-1] if self.in_shape is None else self.in_shape
+        # Compute sizes
+        w_new, h_new, scale_w, scale_h = self.shape_transform(in_shape, orig_shape)
 
         blob = cv2.dnn.blobFromImage(
-            img, scalefactor=1.0, size=(self.w_new, self.h_new),
+            img, scalefactor=1.0, size=(w_new, h_new),
             mean=(0, 0, 0), swapRB=False, crop=False
         )
         if self.backend == 'opencv':
@@ -88,18 +111,20 @@ def __call__(self, img, threshold=0.5):
             heatmap, scale, offset, lms = self.sess.run(self.onnx_output_names, {self.onnx_input_name: blob})
         else:
             raise RuntimeError(f'Unknown backend {self.backend}')
-        dets, lms = self.decode(heatmap, scale, offset, lms, (self.h_new, self.w_new), threshold=threshold)
+        dets, lms = self.decode(heatmap, scale, offset, lms, (h_new, w_new), threshold=threshold)
         if len(dets) > 0:
-            dets[:, 0:4:2], dets[:, 1:4:2] = dets[:, 0:4:2] / self.scale_w, dets[:, 1:4:2] / self.scale_h
-            lms[:, 0:10:2], lms[:, 1:10:2] = lms[:, 0:10:2] / self.scale_w, lms[:, 1:10:2] / self.scale_h
+            dets[:, 0:4:2], dets[:, 1:4:2] = dets[:, 0:4:2] / scale_w, dets[:, 1:4:2] / scale_h
+            lms[:, 0:10:2], lms[:, 1:10:2] = lms[:, 0:10:2] / scale_w, lms[:, 1:10:2] / scale_h
         else:
             dets = np.empty(shape=[0, 5], dtype=np.float32)
             lms = np.empty(shape=[0, 10], dtype=np.float32)
 
         return dets, lms
 
-    def transform(self, in_shape):
-        h_orig, w_orig = self.orig_shape
+    @staticmethod
+    @lru_cache(maxsize=128)
+    def shape_transform(in_shape, orig_shape):
+        h_orig, w_orig = orig_shape
         w_new, h_new = in_shape
         # Make spatial dims divisible by 32
         w_new, h_new = int(np.ceil(w_new / 32) * 32), int(np.ceil(h_new / 32) * 32)

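Review note on the caching change: `shape_transform` is now a static method keyed on both `in_shape` and `orig_shape`, so a cached result is reused only when both shapes match. The sketch below shows that behaviour with the standard library only. The hunk is cut off after the rounding step, so the scale-factor line is an assumption (new size divided by original size, which matches how `__call__` divides detections by `scale_w` and `scale_h`), and `math.ceil` stands in for `np.ceil`.

```python
import math
from functools import lru_cache
from typing import Tuple


@lru_cache(maxsize=128)
def shape_transform(
    in_shape: Tuple[int, int], orig_shape: Tuple[int, int]
) -> Tuple[int, int, float, float]:
    """Sketch of a cached shape transform; arguments must be hashable tuples."""
    h_orig, w_orig = orig_shape  # image shape order: (height, width)
    w_new, h_new = in_shape      # network input order: (width, height)
    # Make spatial dims divisible by 32 by rounding up.
    w_new = math.ceil(w_new / 32) * 32
    h_new = math.ceil(h_new / 32) * 32
    # Assumption: scale factors are new size over original size.
    scale_w, scale_h = w_new / w_orig, h_new / h_orig
    return w_new, h_new, scale_w, scale_h


print(shape_transform((640, 360), (360, 640)))   # computed
print(shape_transform((640, 360), (360, 640)))   # served from the cache
print(shape_transform((640, 360), (720, 1280)))  # different input shape: recomputed
print(shape_transform.cache_info())
```

Because `lru_cache` hashes its arguments, both shapes have to be tuples; passing a list raises `TypeError`. `img.shape[:2]` is already a tuple for NumPy arrays.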
diff --git a/examples/city_anonymized_mosaic.jpg b/examples/city_anonymized_mosaic.jpg