
Merge branch 'master' of github.com:rezoo/illustration2vec

rezoo committed Jan 19, 2016
2 parents af42fda + 3cc78f2 commit c99e1007210a7c5b70aaff9c6275f6ffd2de2090
Showing with 32 additions and 14 deletions.
  1. +25 −7 README.md
  2. +5 −5 get_models.sh
  3. +2 −2 i2v/base.py
README.md
@@ -1,9 +1,8 @@
# Illustration2Vec
``illustration2vec (i2v)`` is a simple library for estimating a set of tags and
-extracting semantic feature vectors from given illustrations with
-Convolutional Neural Networks. For details, please see
-[our project page](http://illustration2vec.net) or
+extracting semantic feature vectors from given illustrations.
+For details, please see [our project page](http://illustration2vec.net) or
[our main paper](http://illustration2vec.net/papers/illustration2vec-main.pdf).
# Demo
@@ -12,8 +11,9 @@ Convolutional Neural Networks. For details, please see
# Requirements
-* Pre-trained CNN models (please download them from
-http://illustration2vec.net or run ``get_models.sh`` in this repository).
+* Pre-trained models (``i2v`` uses Convolutional Neural Networks. Please download
+several pre-trained models from http://illustration2vec.net,
+or execute ``get_models.sh`` in this repository).
* ``numpy`` and ``scipy``
* ``PIL`` (Python Imaging Library) or its alternatives (e.g., ``Pillow``)
* ``skimage`` (Image processing library for python)
@@ -36,6 +36,9 @@ This image is licensed under the Creative Commons - Attribution-NonCommercial,
3.0 Unported (CC BY-NC).
## Tag prediction
+``i2v`` estimates a number of semantic tags from given illustrations
+in the following manner.
```python
import i2v
from PIL import Image
@@ -51,7 +54,9 @@ illust2vec = i2v.make_i2v_with_chainer(
img = Image.open("images/miku.jpg")
illust2vec.estimate_plausible_tags([img], threshold=0.5)
```
-The returned result is the following:
+``estimate_plausible_tags()`` returns dictionaries that have a pair of
+tag and its confidence.
```python
[{'character': [(u'hatsune miku', 0.9999994039535522)],
'copyright': [(u'vocaloid', 0.9999998807907104)],
@@ -70,10 +75,21 @@ The returned result is the following:
(u'questionable', 0.020535090938210487),
(u'explicit', 0.0006299660308286548)]}]
```
+These tags are classified into the following four categories:
+*general tags* representing general attributes included in an image,
+*copyright tags* representing the specific name of the copyright,
+*character tags* representing the specific name of the characters,
+and *rating tags* representing X ratings.
+If you want to focus on several specific tags, use ``estimate_specific_tags()`` instead.
+```python
+illust2vec.estimate_specific_tags([img], ["1girl", "blue eyes", "safe"])
+# -> [{'1girl': 0.9873462319374084, 'blue eyes': 0.01301183458417654, 'safe': 0.9785731434822083}]
+```
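The nested structure returned by ``estimate_plausible_tags()`` can be post-processed with plain Python. The following is a minimal sketch and not part of ``i2v``: the helper names ``tag_names`` and ``top_rating`` are hypothetical, and ``sample`` is a hand-written stand-in shaped like the output shown above (its ``general`` and ``safe`` entries reuse the values listed for ``estimate_specific_tags()``), so it runs without any model.

```python
# Hand-written stand-in for the list returned by estimate_plausible_tags();
# it is NOT real model output, only an illustration of the structure.
sample = [{
    "character": [("hatsune miku", 0.9999994039535522)],
    "copyright": [("vocaloid", 0.9999998807907104)],
    "general": [("1girl", 0.9873462319374084),
                ("blue eyes", 0.01301183458417654)],
    "rating": [("safe", 0.9785731434822083),
               ("questionable", 0.020535090938210487),
               ("explicit", 0.0006299660308286548)],
}]


def tag_names(result, category, min_confidence=0.5):
    """Return the tag names in one category at or above min_confidence."""
    return [tag for tag, conf in result.get(category, [])
            if conf >= min_confidence]


def top_rating(result):
    """Return the (name, confidence) rating pair with the highest confidence."""
    return max(result["rating"], key=lambda pair: pair[1])


for result in sample:  # one dictionary per input image
    print(tag_names(result, "general"))
    print(top_rating(result))
```

Keeping the confidence alongside each tag, as the returned pairs do, lets a caller apply a stricter cut-off later without running the model again.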
## Feature vector extraction
-``i2v`` supports the two types of feature vectors: the real vectors and the binary ones.
+``i2v`` can extract a semantic feature vector from an illustration.
```python
import i2v
from PIL import Image
@@ -86,10 +102,12 @@ illust2vec = i2v.make_i2v_with_chainer("illust2vec_ver200.caffemodel")
img = Image.open("images/miku.jpg")
# extract a 4,096-dimensional feature vector
result_real = illust2vec.extract_feature([img])
print("shape: {}, dtype: {}".format(result_real.shape, result_real.dtype))
print(result_real)
# i2v also supports a 4,096-bit binary feature vector
result_binary = illust2vec.extract_binary_feature([img])
print("shape: {}, dtype: {}".format(result_binary.shape, result_binary.dtype))
print(result_binary)
get_models.sh
@@ -5,11 +5,11 @@ DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd $DIR
echo "Downloading pre-trained models..."
-wget --no-check-certificate http://illustration2vec.net/models/tag_list.json.gz
-wget --no-check-certificate http://illustration2vec.net/models/illust2vec_tag.prototxt
-wget --no-check-certificate http://illustration2vec.net/models/illust2vec_tag_ver200.caffemodel
-wget --no-check-certificate http://illustration2vec.net/models/illust2vec.prototxt
-wget --no-check-certificate http://illustration2vec.net/models/illust2vec_ver200.caffemodel
+wget http://illustration2vec.net/models/tag_list.json.gz
+wget http://illustration2vec.net/models/illust2vec_tag.prototxt
+wget http://illustration2vec.net/models/illust2vec_tag_ver200.caffemodel
+wget http://illustration2vec.net/models/illust2vec.prototxt
+wget http://illustration2vec.net/models/illust2vec_ver200.caffemodel
gunzip tag_list.json.gz
echo "Done."
i2v/base.py
@@ -27,8 +27,8 @@ def _convert_image(self, image):
         arr = np.asarray(image, dtype=np.float32)
         if arr.ndim == 2:
             # convert a monochrome image to a color one
-            ret = np.zeros((arr.shape[0], arr.shape[1], 3), dtype=np.float32)
-            ret += arr.reshape(arr.shape[0], arr.shape[1], 1)
+            ret = np.empty((arr.shape[0], arr.shape[1], 3), dtype=np.float32)
+            ret[:] = arr.reshape(arr.shape[0], arr.shape[1], 1)
             return ret
         elif arr.ndim == 3:
             # if arr contains alpha channel, remove it
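
The ``np.zeros`` and ``np.empty`` variants in this hunk should produce the same array: both rely on NumPy broadcasting to copy the ``(H, W, 1)`` view of the monochrome image into all three channels, and ``np.empty`` only skips a zero-fill that the assignment overwrites anyway. A minimal sketch to check this, assuming only NumPy; the helper names ``gray_to_rgb_zeros`` and ``gray_to_rgb_empty`` are hypothetical and not part of ``i2v``:

```python
import numpy as np


def gray_to_rgb_zeros(arr):
    """np.zeros variant: zero-fill, then add the image with broadcasting."""
    ret = np.zeros((arr.shape[0], arr.shape[1], 3), dtype=np.float32)
    ret += arr.reshape(arr.shape[0], arr.shape[1], 1)
    return ret


def gray_to_rgb_empty(arr):
    """np.empty variant: uninitialized buffer, then broadcast-assign."""
    ret = np.empty((arr.shape[0], arr.shape[1], 3), dtype=np.float32)
    ret[:] = arr.reshape(arr.shape[0], arr.shape[1], 1)
    return ret


# A tiny 2x3 "monochrome image" used only for illustration.
gray = np.arange(6, dtype=np.float32).reshape(2, 3)
old, new = gray_to_rgb_zeros(gray), gray_to_rgb_empty(gray)
print(old.shape, np.array_equal(old, new))
```

Because every element of ``ret`` is assigned before it is read, starting from an uninitialized buffer is safe here.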
