transnetv2 as python package, readme update

soCzech · Jun 8, 2020 · commit 8c4254c
1 parent feea1ad
Showing 6 changed files with 147 additions and 21 deletions.
diff --git a/README.md b/README.md
@@ -1,6 +1,6 @@
 # TransNet V2: Shot Boundary Detection Neural Network
 
-This repository contains extension of [TransNet: A deep network for fast detection of common shot transitions](https://arxiv.org/abs/1906.03363).
+This repository contains code for [TransNet V2: An effective deep network architecture for fast shot transition detection](#TBA) (link will be added in the coming weeks).
 
 Our reevaluation of other publicly available state-of-the-art shot boundary methods (F1 scores):
 
@@ -12,12 +12,13 @@ TransNet V2 (this repo) | **77.9** | **96.2** | 93.9
 [Tang et al., ResNet baseline](https://arxiv.org/abs/1808.04234) [(github)](https://github.com/Tangshitao/ClipShots_basline) | 76.1 | 89.3 | 92.8
 
 
-### USE IT ON YOUR DATA
-See [_inference_ folder](https://github.com/soCzech/TransNetV2/tree/master/inference) and its _README_ file.
+### USE IT
+**See [_inference_ folder](https://github.com/soCzech/TransNetV2/tree/master/inference) and its _README_ file.**
 
 
 ### REPLICATE THE WORK
 This repository contains all that is needed to run any experiment for TransNet V2 network including network training and dataset creation.
+All experiments should be runnable in [this NVIDIA DOCKER file](https://github.com/soCzech/TransNetV2/blob/master/Dockerfile).
 
 In general these steps need to be done in order to replicate our work:
 
@@ -33,12 +34,12 @@ In general these steps need to be done in order to replicate our work:
 
 ### CREDITS
 If you find this useful, please cite us ;)
-At the moment there is only the older paper [TransNet: A deep network for fast detection of common shot transitions](https://arxiv.org/abs/1906.03363) available so please cite that one.
 ```
-@article{soucek2019transnet,
-    title={TransNet: A deep network for fast detection of common shot transitions},
-    author={Sou{\v{c}}ek, Tom{\'a}{\v{s}} and Moravec, Jaroslav and Loko{\v{c}}, Jakub},
-    journal={arXiv preprint arXiv:1906.03363},
-    year={2019}
+@article{soucek2020transnetv2,
+    title={TransNet V2: An effective deep network architecture for fast shot transition detection},
+    author={Sou{\v{c}}ek, Tom{\'a}{\v{s}} and Loko{\v{c}}, Jakub},
+    year={2020}
 }
-```
+```
+
+The older paper [TransNet: A deep network for fast detection of common shot transitions](https://arxiv.org/abs/1906.03363).
diff --git a/inference/Dockerfile b/inference/Dockerfile
@@ -0,0 +1,13 @@
+FROM tensorflow/tensorflow:2.1.1-gpu
+
+RUN pip3 --no-cache-dir install \
+    Pillow \
+    ffmpeg-python
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    ffmpeg
+
+COPY setup.py /tmp
+COPY inference /tmp/inference
+
+RUN cd /tmp && python3 setup.py install && rm -r *
diff --git a/inference/README.md b/inference/README.md
@@ -1,19 +1,55 @@
 # TransNet V2: Shot Boundary Detection Neural Network
 
+Inference code for [TransNet V2: An effective deep network architecture for fast shot transition detection](#TBA) (link will be added in the coming weeks).
 
-### INSTALLATION
+### INSTALL REQUIREMENTS
 ```bash
 pip install tensorflow==2.1
 ```
 
 If you want to predict directly on video files, install `ffmpeg`.
-If you want to visualize results also install `pillow`.
+If you want to visualize results, also install `pillow` (simple usage requires both).
 ```bash
 apt-get install ffmpeg
 pip install ffmpeg-python pillow
 ```
 
-### USAGE
+or **use NVIDIA DOCKER**!
+```
+# run from the root directory of the repository
+docker build -t transnet -f inference/Dockerfile .
+```
+Then simply use it the following way:
+```
+docker run -it --rm --gpus 1 -v /path/to/video/dir:/tmp transnet transnetv2_predict /tmp/video.mp4 [--visualize]
+```
+
+> Note: the `transnetv2-weights` directory contains files stored in git-lfs.
+> You may need to install git-lfs and run `git lfs pull` from the root directory of the repository
+> (or you can download the `transnetv2-weights` directory manually).
+
+### INSTALL AS PYTHON PACKAGE (optional)
+Run `python setup.py install` from the root directory of the repository.
+
+
+### SIMPLE USAGE
+
+```
+# run from this directory
+python transnetv2.py /path/to/video.mp4 [--visualize]
+# or if installed as python package, run from anywhere
+transnetv2_predict /path/to/video.mp4 [--visualize]
+```
+
+It creates:
+- `/path/to/video.mp4.scenes.txt` file containing a list of scenes - pairs of
+  *start-frame-index*, *end-frame-index* (indexed from zero, both limits inclusive).
+- `/path/to/video.mp4.predictions.txt` file with each line containing raw predictions for the corresponding frame
+  (the first number is from the 'single-frame-per-transition' head, the second from the 'all-frames-per-transition' head).
+- optionally it creates visualization in file `/path/to/video.mp4.vis.png`
+
+
+### ADVANCED USAGE
 - Get predictions:
 ```python
 from transnetv2 import TransNetV2
@@ -44,12 +80,12 @@ model.visualize_predictions(
 
 ### CREDITS
 If you find this useful, please cite us ;)
-At the moment there is only the older paper [TransNet: A deep network for fast detection of common shot transitions](https://arxiv.org/abs/1906.03363) available so please cite that one.
 ```
-@article{soucek2019transnet,
-    title={TransNet: A deep network for fast detection of common shot transitions},
-    author={Sou{\v{c}}ek, Tom{\'a}{\v{s}} and Moravec, Jaroslav and Loko{\v{c}}, Jakub},
-    journal={arXiv preprint arXiv:1906.03363},
-    year={2019}
+@article{soucek2020transnetv2,
+    title={TransNet V2: An effective deep network architecture for fast shot transition detection},
+    author={Sou{\v{c}}ek, Tom{\'a}{\v{s}} and Loko{\v{c}}, Jakub},
+    year={2020}
 }
-```
+```
+
+The older paper [TransNet: A deep network for fast detection of common shot transitions](https://arxiv.org/abs/1906.03363).
diff --git a/inference/__init__.py b/inference/__init__.py
@@ -0,0 +1 @@
+from .transnetv2 import TransNetV2
diff --git a/inference/transnetv2.py b/inference/transnetv2.py
@@ -1,10 +1,18 @@
+import os
 import numpy as np
 import tensorflow as tf
 
 
 class TransNetV2:
 
-    def __init__(self, model_dir: str):
+    def __init__(self, model_dir=None):
+        if model_dir is None:
+            model_dir = os.path.join(os.path.dirname(__file__), "transnetv2-weights/")
+            if not os.path.isdir(model_dir):
+                raise FileNotFoundError(f"[TransNetV2] ERROR: {model_dir} is not a directory.")
+            else:
+                print(f"[TransNetV2] Using weights from {model_dir}.")
+
         self._input_size = (27, 48, 3)
         self._model = tf.saved_model.load(model_dir)
 
@@ -135,3 +143,46 @@ def visualize_predictions(frames: np.ndarray, predictions):
                 if value != 0:
                     draw.line((x + j, y, x + j, y - value), fill=tuple(color), width=1)
         return img
+
+
+def main():
+    import sys
+    import argparse
+
+    parser = argparse.ArgumentParser()
+    parser.add_argument("files", type=str, nargs="+", help="path to video files to process")
+    parser.add_argument("--weights", type=str, default=None,
+                        help="path to TransNet V2 weights, tries to infer the location if not specified")
+    parser.add_argument('--visualize', action="store_true",
+                        help="save a png file with prediction visualization for each extracted video")
+    args = parser.parse_args()
+
+    model = TransNetV2(args.weights)
+    for file in args.files:
+        if os.path.exists(file + ".predictions.txt") or os.path.exists(file + ".scenes.txt"):
+            print(f"[TransNetV2] {file}.predictions.txt or {file}.scenes.txt already exists. "
+                  f"Skipping video {file}.", file=sys.stderr)
+            continue
+
+        video_frames, single_frame_predictions, all_frame_predictions = \
+            model.predict_video(file)
+
+        predictions = np.stack([single_frame_predictions, all_frame_predictions], 1)
+        np.savetxt(file + ".predictions.txt", predictions, fmt="%.6f")
+
+        scenes = model.predictions_to_scenes(single_frame_predictions)
+        np.savetxt(file + ".scenes.txt", scenes, fmt="%d")
+
+        if args.visualize:
+            if os.path.exists(file + ".vis.png"):
+                print(f"[TransNetV2] {file}.vis.png already exists. "
+                      f"Skipping visualization of video {file}.", file=sys.stderr)
+                continue
+
+            pil_image = model.visualize_predictions(
+                video_frames, predictions=(single_frame_predictions, all_frame_predictions))
+            pil_image.save(file + ".vis.png")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/setup.py b/setup.py
@@ -0,0 +1,24 @@
+from setuptools import setup
+
+setup(
+    name="transnetv2",
+    version="1.0.0",
+    # let user install tensorflow, etc. manually
+    # install_requires=[
+    #     "tensorflow>=2.0",
+    #     "ffmpeg-python",
+    #     "pillow"
+    # ],
+    entry_points={
+        "console_scripts": [
+            "transnetv2_predict = transnetv2.transnetv2:main",
+        ]
+    },
+    packages=["transnetv2"],
+    package_dir={"transnetv2": "./inference"},
+    package_data={"transnetv2": [
+        "transnetv2-weights/*",
+        "transnetv2-weights/variables/*"
+    ]},
+    zip_safe=False
+)