# Dataset for Other Classes

## Data for "Other" Category

#### 2D Shape Structure Dataset

Over 1,200 shapes in 70 shape classes, stored as polygons with xy coordinates
in JSON format. Class device3 contains some rectangles, device4 contains
triangles, device8 contains T shapes, and device9 contains circles.

- http://ubee.enseeiht.fr/ShapesDataset/
- http://ubee.enseeiht.fr/ShapesDataset/data/NamesJSON.zip
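
Because this dataset ships polygons rather than images, the shapes need to be rasterized before they can sit next to the bitmap datasets below. The following is a minimal sketch of that step, not code from this notebook: `rasterize_polygon` and its `size` and `margin` parameters are hypothetical, and it deliberately takes plain `(x, y)` pairs because the exact key layout of the JSON files is not described here and has to be checked against the downloaded data.

```python
from PIL import Image, ImageDraw


def rasterize_polygon(points, size=128, margin=4):
    """Draw a filled polygon as a white-on-black square image.

    `points` is a sequence of (x, y) pairs in arbitrary units. The shape is
    scaled uniformly so that it fits inside the image with `margin` pixels
    left free on every side.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    # Use the larger extent so the aspect ratio is preserved.
    span = max(max(xs) - x0, max(ys) - y0) or 1.0
    scale = (size - 1 - 2 * margin) / span
    scaled = [(margin + (x - x0) * scale, margin + (y - y0) * scale)
              for x, y in points]
    img = Image.new("L", (size, size), 0)
    ImageDraw.Draw(img).polygon(scaled, fill=255)
    return img


# Hypothetical input: a 10x10 square given as xy pairs.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
img = rasterize_polygon(square)
```

The returned PIL image can be written with `img.save(...)` in the same way as the converted bitmaps.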

#### Mythological creatures 2D

Two-dimensional articulated shapes (silhouettes) for partial similarity experiments. The data set contains 15 shapes: 5 humans, 5 horses and 5 centaurs. Each shape differs by an articulation and additional parts. The shapes are represented as binary images in .bmp format.

- http://visl.technion.ac.il/bron/publications/BroBroBruKimIJCV07.pdf
- http://tosca.cs.technion.ac.il/book/resources_data.html
- http://tosca.cs.technion.ac.il/data/myth.zip

#### Tools 2D

35 images in 5 classes (scissors, tools), in .bmp format.

- http://visl.technion.ac.il/bron/publications/BroBroBruKimIJCV07.pdf 
- http://tosca.cs.technion.ac.il/book/resources_data.html
- http://tosca.cs.technion.ac.il/data/tools.zip

#### MPEG-7 Shape Dataset

- http://www.cis.temple.edu/~latecki/TestData/mpeg7shapeB.tar.gz

In [22]:
%matplotlib inline
import os
import numpy as np
import matplotlib.pyplot as plt
import cv2
from PIL import Image

In [29]:
dir_myth = "raw_other/myth"
dir_tool = "raw_other/tools/"
dir_mpeg = "raw_other/mpeg7shapeB/"

dir_target = "raw_other/_converted/"

In [62]:
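# the normal ones: raster images (.bmp, .gif) converted to .png
os.makedirs(dir_target, exist_ok=True)  # make sure the output folder exists
for dirname in (dir_myth, dir_tool, dir_mpeg):
    for fname in os.listdir(dirname):
        if not fname.endswith((".bmp", ".gif")):
            continue
        base, _ = os.path.splitext(fname)
        target = os.path.join(dir_target, base + ".png")

        img = Image.open(os.path.join(dirname, fname))
        if fname.endswith(".gif"):
            # negate the pixel values to invert the image, then rescale to 0..255
            img = -1.0 * np.array(img)
            imin, imax = np.min(img), np.max(img)
            # guard against division by zero for constant images
            img = 255 * (img - imin) / max(imax - imin, 1)
            # cast explicitly so the PNG is written as 8-bit
            cv2.imwrite(target, img.astype(np.uint8))
        else:
            img.save(target)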
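
The `.gif` branch above negates the pixel values and stretches the result to the 0..255 range. As an isolated illustration of that arithmetic, here is a small sketch on a toy array. The helper name `invert_to_uint8` is hypothetical, and both the rounding and the all-zeros result for constant images are assumptions made for this example, not behaviour taken from the cell above.

```python
import numpy as np


def invert_to_uint8(pixels):
    """Invert an image array and stretch it to the full 0..255 range."""
    arr = -1.0 * np.asarray(pixels, dtype=np.float64)
    lo, hi = arr.min(), arr.max()
    if hi == lo:
        # Constant image: there is nothing to stretch, so return all zeros.
        return np.zeros(arr.shape, dtype=np.uint8)
    return np.round(255 * (arr - lo) / (hi - lo)).astype(np.uint8)


# Toy 2x2 mask: after inversion the zeros map to 255 and the ones to 0.
demo = invert_to_uint8([[0, 1], [1, 0]])
print(demo)
```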


In [67]:
n = len(os.listdir(dir_target))
print(n)

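r_test = 0.85  # fraction kept for train + val; the remaining 15% is the test share
n_val = int((1-r_test)*n)  # validation size, also used as the test size
n_train = n - 2*n_val  # everything left after reserving val and test
print(n, n_train, n_val)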

1452
1452 1018 217
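
The numbers printed above follow from reserving 15% of the files (rounded down) for validation, the same count again for test, and giving the rest to training. A minimal sketch of that arithmetic as a reusable function, where `split_sizes` and `holdout_fraction` are hypothetical names introduced only for this illustration:

```python
def split_sizes(n, holdout_fraction=0.15):
    """Return (n_train, n_val, n_test) for n files.

    Validation and test each get int(holdout_fraction * n) files,
    and training gets whatever remains.
    """
    n_val = int(holdout_fraction * n)
    n_test = n_val
    n_train = n - n_val - n_test
    return n_train, n_val, n_test


print(split_sizes(1452))  # (1018, 217, 217)
```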


In [73]:
## Shuffle, and split into train, val and test
import random
import shutil

In [71]:
fnames = os.listdir(dir_target)
random.shuffle(fnames)
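
The shuffle above gives a different order on every run, and `os.listdir` itself returns names in arbitrary order. If a repeatable split is wanted, one option is to sort the names first and shuffle with a seeded generator. This is a sketch of that alternative, not what the cell above does. The helper `shuffled_copy` and the default seed are assumptions.

```python
import random


def shuffled_copy(names, seed=0):
    """Return the names in a shuffled order that depends only on the seed."""
    rng = random.Random(seed)  # private generator, global state is untouched
    out = sorted(names)        # remove any dependence on the listing order
    rng.shuffle(out)
    return out


# Two listings of the same files in different order give the same result.
a = shuffled_copy(["b.png", "a.png", "c.png"])
b = shuffled_copy(["c.png", "b.png", "a.png"])
print(a == b)  # True
```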

In [76]:
# train gets the first n_train files, val the next n_val, test the remainder
splits = {
    "train": fnames[:n_train],
    "val": fnames[n_train:n_train+n_val],
    "test": fnames[n_train+n_val:],
}
for split, split_fnames in splits.items():
    tdir = os.path.join("dataset", split, "OTHER")
    os.makedirs(tdir, exist_ok=True)  # also creates missing parent folders
    for fname in split_fnames:
        source = os.path.join(dir_target, fname)
        target = os.path.join(tdir, fname)
        shutil.copy(source, target)
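
After copying, a quick sanity check is to count the files that ended up in each split folder and compare the totals with `n_train` and `n_val`. A minimal sketch, assuming the `dataset/<split>/OTHER` layout used above. The helper `count_split_files` is hypothetical.

```python
import os


def count_split_files(root="dataset", label="OTHER",
                      splits=("train", "val", "test")):
    """Count the files in root/<split>/<label> for every split."""
    counts = {}
    for split in splits:
        folder = os.path.join(root, split, label)
        # A missing folder counts as zero files instead of raising.
        counts[split] = len(os.listdir(folder)) if os.path.isdir(folder) else 0
    return counts


print(count_split_files())
```

A split that reports zero files is the signal to recheck the slice boundaries.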