
Merge branch 'master' of github.com:rezoo/illustration2vec

rezoo committed Jan 19, 2016
2 parents af42fda + 3cc78f2 commit c99e1007210a7c5b70aaff9c6275f6ffd2de2090
Showing with 32 additions and 14 deletions.
  1. +25 −7 README.md
  2. +5 −5 get_models.sh
  3. +2 −2 i2v/base.py
README.md
@@ -1,9 +1,8 @@
 # Illustration2Vec
 ``illustration2vec (i2v)`` is a simple library for estimating a set of tags and
-extracting semantic feature vectors from given illustrations with
-Convolutional Neural Networks. For details, please see
-[our project page](http://illustration2vec.net) or
+extracting semantic feature vectors from given illustrations.
+For details, please see [our project page](http://illustration2vec.net) or
 [our main paper](http://illustration2vec.net/papers/illustration2vec-main.pdf).
 # Demo
@@ -12,8 +11,9 @@ Convolutional Neural Networks. For details, please see
 # Requirements
-* Pre-trained CNN models (please download them from
-http://illustration2vec.net or run ``get_models.sh`` in this repository).
+* Pre-trained models (``i2v`` uses Convolutional Neural Networks. Please download
+several pre-trained models from http://illustration2vec.net,
+or execute ``get_models.sh`` in this repository).
 * ``numpy`` and ``scipy``
 * ``PIL`` (Python Imaging Library) or its alternatives (e.g., ``Pillow``)
 * ``skimage`` (Image processing library for python)
@@ -36,6 +36,9 @@ This image is licensed under the Creative Commons - Attribution-NonCommercial,
 3.0 Unported (CC BY-NC).
 ## Tag prediction
+``i2v`` estimates a number of semantic tags from given illustrations
+in the following manner.
 ```python
 import i2v
 from PIL import Image
@@ -51,7 +54,9 @@ illust2vec = i2v.make_i2v_with_chainer(
 img = Image.open("images/miku.jpg")
 illust2vec.estimate_plausible_tags([img], threshold=0.5)
 ```
-The returned result is the following:
+``estimate_plausible_tags()`` returns dictionaries that have a pair of
+tag and its confidence.
 ```python
 [{'character': [(u'hatsune miku', 0.9999994039535522)],
 'copyright': [(u'vocaloid', 0.9999998807907104)],
@@ -70,10 +75,21 @@ The returned result is the following:
 (u'questionable', 0.020535090938210487),
 (u'explicit', 0.0006299660308286548)]}]
 ```
+These tags are classified into the following four categories:
+*general tags* representing general attributes included in an image,
+*copyright tags* representing the specific name of the copyright,
+*character tags* representing the specific name of the characters,
+and *rating tags* representing X ratings.
+If you want to focus on several specific tags, use ``estimate_specific_tags()`` instead.
+```python
+illust2vec.estimate_specific_tags([img], ["1girl", "blue eyes", "safe"])
+# -> [{'1girl': 0.9873462319374084, 'blue eyes': 0.01301183458417654, 'safe': 0.9785731434822083}]
+```
 ## Feature vector extraction
-``i2v`` supports the two types of feature vectors: the real vectors and the binary ones.
+``i2v`` can extract a semantic feature vector from an illustration.
 ```python
 import i2v
 from PIL import Image
@@ -86,10 +102,12 @@ illust2vec = i2v.make_i2v_with_chainer("illust2vec_ver200.caffemodel")
 img = Image.open("images/miku.jpg")
+# extract a 4,096-dimensional feature vector
 result_real = illust2vec.extract_feature([img])
 print("shape: {}, dtype: {}".format(result_real.shape, result_real.dtype))
 print(result_real)
+# i2v also supports a 4,096-bit binary feature vector
 result_binary = illust2vec.extract_binary_feature([img])
 print("shape: {}, dtype: {}".format(result_binary.shape, result_binary.dtype))
 print(result_binary)
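Note on the result format documented in the "Tag prediction" hunk above: each result is a dictionary mapping a category to a list of (tag, confidence) pairs. A minimal sketch of flattening one such dictionary into sorted triples follows. The names `flatten_tags` and `min_confidence` are hypothetical and not part of `i2v`; the sample values are copied from the README output shown in this diff, trimmed to three categories.

```python
def flatten_tags(result, min_confidence=0.5):
    """Flatten {category: [(tag, confidence), ...]} into sorted triples.

    Hypothetical helper for illustration only; it is not an i2v function.
    """
    triples = []
    for category, tags in result.items():
        for tag, confidence in tags:
            if confidence >= min_confidence:
                triples.append((category, tag, confidence))
    # Highest confidence first
    return sorted(triples, key=lambda t: t[2], reverse=True)


# One result dictionary, using values from the README output (trimmed)
sample = {
    'character': [(u'hatsune miku', 0.9999994039535522)],
    'copyright': [(u'vocaloid', 0.9999998807907104)],
    'rating': [(u'questionable', 0.020535090938210487),
               (u'explicit', 0.0006299660308286548)],
}

top = flatten_tags(sample)
for category, tag, confidence in top:
    print("{}: {} ({:.4f})".format(category, tag, confidence))
```

With the default cutoff of 0.5, the two low-confidence rating tags in the sample are dropped; passing `min_confidence=0.0` keeps all four pairs.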
get_models.sh
@@ -5,11 +5,11 @@ DIR="$( cd "$(dirname "$0")" ; pwd -P )"
 cd $DIR
 echo "Downloading pre-trained models..."
-wget --no-check-certificate http://illustration2vec.net/models/tag_list.json.gz
-wget --no-check-certificate http://illustration2vec.net/models/illust2vec_tag.prototxt
-wget --no-check-certificate http://illustration2vec.net/models/illust2vec_tag_ver200.caffemodel
-wget --no-check-certificate http://illustration2vec.net/models/illust2vec.prototxt
-wget --no-check-certificate http://illustration2vec.net/models/illust2vec_ver200.caffemodel
+wget http://illustration2vec.net/models/tag_list.json.gz
+wget http://illustration2vec.net/models/illust2vec_tag.prototxt
+wget http://illustration2vec.net/models/illust2vec_tag_ver200.caffemodel
+wget http://illustration2vec.net/models/illust2vec.prototxt
+wget http://illustration2vec.net/models/illust2vec_ver200.caffemodel
 gunzip tag_list.json.gz
 echo "Done."
i2v/base.py
@@ -27,8 +27,8 @@ def _convert_image(self, image):
         arr = np.asarray(image, dtype=np.float32)
         if arr.ndim == 2:
             # convert a monochrome image to a color one
-            ret = np.zeros((arr.shape[0], arr.shape[1], 3), dtype=np.float32)
-            ret += arr.reshape(arr.shape[0], arr.shape[1], 1)
+            ret = np.empty((arr.shape[0], arr.shape[1], 3), dtype=np.float32)
+            ret[:] = arr.reshape(arr.shape[0], arr.shape[1], 1)
             return ret
         elif arr.ndim == 3:
             # if arr contains alpha channel, remove it
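
Note on the `i2v/base.py` change: the old code zero-filled the output and then added the grayscale values, while the new code allocates an uninitialized buffer and assigns into it, so NumPy broadcasts the `(H, W, 1)` array across the three channels without a separate zero fill. A minimal standalone sketch of that pattern follows, assuming only `numpy`. The function name `gray_to_rgb` is hypothetical and is not the method in `i2v/base.py`.

```python
import numpy as np


def gray_to_rgb(arr):
    """Replicate a 2-D grayscale array across three channels.

    Hypothetical standalone helper mirroring the pattern in this commit.
    """
    if arr.ndim != 2:
        raise ValueError("expected a 2-D array, got ndim={}".format(arr.ndim))
    height, width = arr.shape
    # np.empty skips the zero fill; every element is overwritten below
    ret = np.empty((height, width, 3), dtype=np.float32)
    # (H, W, 1) broadcasts across the last axis of (H, W, 3)
    ret[:] = arr.reshape(height, width, 1)
    return ret


gray = np.arange(6, dtype=np.float32).reshape(2, 3)
rgb = gray_to_rgb(gray)
print(rgb.shape)  # (2, 3, 3)
```

Each of the three channels of `rgb` holds a copy of `gray`, which is the same result the previous `np.zeros` plus `+=` version produced.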
