<a href="https://colab.research.google.com/github/thejawker/architector/blob/master/Architector.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Architector

> Algorithm to detect features of residential architecture.


## Possible Pipeline
1. Detect whether the image contains a house
2. Do a rough classification of the architectural style
  - a. find a large dataset of house fronts
  - b. label these with the right architecture style
  - c. train a model
  - d. win
3. Recognise features (roof, windows, shape of building, color, walls, etc.)
  - Either use supervised learning:
    - this needs a large amount of labeled data for each individual feature
  - Or unsupervised learning:
    - separate the most distinct features in the images

## Dataset
> How to get a good dataset?

#### Google Maps Street View

#### Scrape from Zillow, HotPads, Funda, etc.

### Labeling

## Things to test
Try running a simple algorithm on lots of data and see whether it picks up on good categories through **unsupervised learning**.
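
As a minimal sketch of that experiment, the snippet below runs plain k-means on feature vectors using only the standard library. It assumes each image has already been turned into a numeric vector (for example a color histogram or an embedding); that feature extraction step, the function names, and the parameters are all hypothetical and not part of this notebook.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(vectors, k, iterations=20, seed=0):
    """Cluster vectors into k groups; returns (assignments, centroids)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid
        assignments = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = mean_vector(members)
    return assignments, centroids
```

Comparing the resulting clusters with the hand-picked style labels would show whether the unsupervised categories line up with recognisable architecture styles.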

## Roadmap

### Proof of concept
1. First do some testing, find 10 popular architecture styles
2. Find at least 100 images of those architecture styles
  - check [Pinterest](https://github.com/ankitshekhawat/pinterest-scraper)
  - google images
  - zillow
  - hotpads
  - maybe use cgi to generate houses
  - [cool house concepts](https://coolhouseconcepts.com/house-plans/3-bedroom-bungalow-house-plan/)
  - check [houseplans.com](https://www.houseplans.com/collection/design-styles)
3. Train with labels
4. Put on server
5. Make an API and demo it with a test frontend
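
Step 2 asks for at least 100 images per style, which is easy to check automatically. The sketch below assumes the images are stored as one sub-directory per style (the layout the notebook cells further down use); the helper names and the list of extensions are assumptions made for illustration.

```python
import os

# Assumption: these are the image formats worth counting
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')


def count_images_per_style(root):
    """Return {style: image_count} for every sub-directory of root."""
    counts = {}
    for style in sorted(os.listdir(root)):
        style_dir = os.path.join(root, style)
        if not os.path.isdir(style_dir):
            continue  # skip stray files next to the style folders
        counts[style] = sum(
            1 for name in os.listdir(style_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts


def styles_below_minimum(counts, minimum=100):
    """List the styles that still need more images."""
    return sorted(style for style, n in counts.items() if n < minimum)
```

Calling `styles_below_minimum(count_images_per_style(path))` on the dataset folder would list the styles that still need scraping before training.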

### Do the stuff above




# Links to possibly interesting things

* [Darknet (fast and accurate object / feature detection)](https://github.com/AlexeyAB/darknet)
* [Tutorial for Darknet implementation](https://www.youtube.com/watch?v=10joRJt39Ns)
* [Darknet TensorFlow implementation](https://github.com/wizyoung/YOLOv3_TensorFlow)




In [0]:
!rm -rf /tmp/houses
!rm -rf /data

In [72]:
import zipfile

!wget https://github.com/thejawker/architector/raw/master/data/picked_house_plans.zip -P /tmp/houses

# Extract the archive; the context manager closes the file afterwards
with zipfile.ZipFile('/tmp/houses/picked_house_plans.zip', 'r') as zip_ref:
  zip_ref.extractall('/tmp/houses')

--2020-04-19 19:34:22--  https://github.com/thejawker/architector/raw/master/data/picked_house_plans.zip
Resolving github.com (github.com)... 140.82.114.3
Connecting to github.com (github.com)|140.82.114.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/thejawker/architector/master/data/picked_house_plans.zip [following]
--2020-04-19 19:34:22--  https://raw.githubusercontent.com/thejawker/architector/master/data/picked_house_plans.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16784973 (16M) [application/zip]
Saving to: ‘/tmp/houses/picked_house_plans.zip’


2020-04-19 19:34:23 (96.4 MB/s) - ‘/tmp/houses/picked_house_plans.zip’ saved [16784973/16784973]



In [73]:
# Prepare the data

import os
from shutil import copyfile


training_split = 0.7

styles = [
    'contemporary',
    'craftsman',
    'beach',
    'bungalow',
    'classical',
    'colonial',
    'country',
    'european',
    'mediterranean',
    'modern',
    'southern',
    'victorian',
]

def copy_dataset(images, split, style):
  split_at = int(len(images) * split)
  dataset = {
    'training': images[:split_at],
    'testing': images[split_at:]
  }

  # Use a separate name for the subset so the `images` argument is not shadowed
  for (directory, subset) in dataset.items():
    print(directory)
    print(subset)
    for image in subset:
      source = '/tmp/houses/picked_house_plans/{}/{}'.format(style, image)
      dest = '/data/{}/{}/{}'.format(directory, style, image)
      copyfile(source, dest)
      

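# Create the per-style output directories; exist_ok=True makes reruns safe
# and ensures the testing directory is created even if training already exists
for style in styles:
  os.makedirs('/data/training/{}'.format(style), exist_ok=True)
  os.makedirs('/data/testing/{}'.format(style), exist_ok=True)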

for style in styles:
  images = os.listdir('/tmp/houses/picked_house_plans/{}'.format(style))
  copy_dataset(images, training_split, style)


training
['contemporary-1-9.jpg', 'contemporary-2-4.jpg', 'contemporary-12-2.jpg', 'contemporary-23-6.jpg', 'contemporary-11-3.jpg', 'contemporary-1-7.jpg', 'contemporary-1-2.jpg', 'contemporary-18-2.jpg', 'contemporary-9-2.jpg', 'contemporary-11-5.jpg', 'contemporary-6-3.jpg', 'contemporary-13-5.jpg', 'contemporary-11-2.jpg', 'contemporary-8-2.jpg', 'contemporary-9-5.jpg', 'contemporary-1-6.jpg', 'contemporary-3-2.jpg', 'contemporary-28-2.jpg', 'contemporary-22-2.jpg', 'contemporary-18-1.jpg', 'contemporary-19-5.jpg', 'contemporary-29-1.jpg', 'contemporary-28-6.jpg', 'contemporary-2-1.jpg', 'contemporary-4-4.jpg', 'contemporary-9-4.jpg', 'contemporary-4-2.jpg', 'contemporary-7-1.jpg', 'contemporary-24-4.png', 'contemporary-28-4.jpg', 'contemporary-1-10.jpg', 'contemporary-11-4.jpg', 'contemporary-22-15.jpg', 'contemporary-18-4.jpg', 'contemporary-9-3.jpg', 'contemporary-15-3.jpg', 'contemporary-13-4.jpg', 'contemporary-24-1.png', 'contemporary-6-1.jpg', 'contemporary-28-1.jpg', 'conte
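
`os.listdir` returns files in arbitrary order, so the split above can differ between runs and machines. A minimal sketch of a reproducible alternative, assuming a seeded shuffle is acceptable; `split_images` and its `seed` parameter are hypothetical and not part of the notebook:

```python
import random


def split_images(images, split, seed=42):
    """Return (training, testing) lists after a seeded shuffle."""
    ordered = sorted(images)              # fix the starting order first
    random.Random(seed).shuffle(ordered)  # same seed gives the same shuffle
    split_at = int(len(ordered) * split)
    return ordered[:split_at], ordered[split_at:]
```

The two returned lists could replace the `images[:split_at]` and `images[split_at:]` slices inside `copy_dataset`.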

In [77]:
import tensorflow as tf
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.optimizers import RMSprop

model = tf.keras.models.Sequential([
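    # ReLU activations added: without them the convolutions have no non-linearity of their own
    Conv2D(16, (3, 3), activation='relu', input_shape=(180, 120, 3)),
    MaxPooling2D(2, 2),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Conv2D(64, (3, 3), activation='relu'),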
    # MaxPooling2D(2, 2),
    # Conv2D(128, (3, 3)),
    Flatten(),
    Dense(1024, activation='relu'),
    # Dropout(0.2),
    Dense(512, activation='relu'),
    # Dense(256),
    Dense(len(styles), activation='softmax'),
])

model.summary()
# `learning_rate` replaces the deprecated `lr` argument
model.compile(optimizer=RMSprop(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_22 (Conv2D)           (None, 178, 118, 16)      448       
_________________________________________________________________
max_pooling2d_16 (MaxPooling (None, 89, 59, 16)        0         
_________________________________________________________________
conv2d_23 (Conv2D)           (None, 87, 57, 32)        4640      
_________________________________________________________________
max_pooling2d_17 (MaxPooling (None, 43, 28, 32)        0         
_________________________________________________________________
conv2d_24 (Conv2D)           (None, 41, 26, 64)        18496     
_________________________________________________________________
flatten_5 (Flatten)          (None, 68224)             0         
_________________________________________________________________
dense_20 (Dense)             (None, 1024)             
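
The shapes and parameter counts in the summary can be checked by hand. The sketch below assumes the Keras defaults the model relies on (3x3 kernels, `valid` padding, stride 1, 2x2 pooling that floors odd sizes); the helper functions are written for illustration only.

```python
def conv2d_valid(shape, filters, kernel=3):
    """Output shape and parameter count of a stride-1 'valid' convolution."""
    height, width, channels = shape
    out_shape = (height - kernel + 1, width - kernel + 1, filters)
    # One kernel*kernel*channels weight block plus one bias per filter
    params = (kernel * kernel * channels + 1) * filters
    return out_shape, params


def max_pool(shape, pool=2):
    """Output shape of non-overlapping max pooling (odd sizes are floored)."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)


shape = (180, 120, 3)
conv_params = []
for filters, pool_after in [(16, True), (32, True), (64, False)]:
    shape, params = conv2d_valid(shape, filters)
    conv_params.append(params)
    if pool_after:
        shape = max_pool(shape)

flattened = shape[0] * shape[1] * shape[2]
print(conv_params, shape, flattened)
```

The flattened size feeds straight into the 1024-unit dense layer, so most of the model's weights sit in that first dense layer rather than in the convolutions.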

In [78]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

batch_size = 10

TRAINING_DIR = '/data/training'
train_datagen = ImageDataGenerator(
    rescale=1.0/255.0, 
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

train_generator = train_datagen.flow_from_directory(TRAINING_DIR, 
                                                    batch_size=batch_size,
                                                    target_size=(180, 120))

VALIDATION_DIR = '/data/testing'
# Validation images are only rescaled; augmenting them would distort the metrics
validation_datagen = ImageDataGenerator(rescale=1.0/255.0)

validation_generator = validation_datagen.flow_from_directory(VALIDATION_DIR,
                                                              batch_size=batch_size,
                                                              target_size=(180, 120))

Found 780 images belonging to 12 classes.
Found 328 images belonging to 12 classes.


In [79]:
STEP_SIZE_TRAIN=train_generator.n//train_generator.batch_size
STEP_SIZE_VALID=validation_generator.n//validation_generator.batch_size

# model.fit accepts generators directly; fit_generator is deprecated in recent TensorFlow 2 releases
history = model.fit(train_generator,
                    epochs=200,
                    steps_per_epoch=STEP_SIZE_TRAIN,
                    validation_steps=STEP_SIZE_VALID,
                    validation_data=validation_generator)

Epoch 1/200
Epoch 2/200
Epoch 3/200
15/78 [====>.........................] - ETA: 50s - loss: 2.6698 - accuracy: 0.1200

KeyboardInterrupt: ignored
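
The progress bar shows 78 steps per epoch, which is the 780 training images divided by the batch size of 10. The floor division behaves differently for the 328 validation images, as this small sketch shows; `steps_and_leftover` is a hypothetical helper, not part of the notebook.

```python
import math


def steps_and_leftover(n_samples, batch_size):
    """Full batches per epoch and the samples left out by floor division."""
    return n_samples // batch_size, n_samples % batch_size


train_steps = steps_and_leftover(780, 10)
valid_steps = steps_and_leftover(328, 10)
# Rounding up instead would include the final, smaller batch
valid_steps_rounded_up = math.ceil(328 / 10)
print(train_steps, valid_steps, valid_steps_rounded_up)
```

With floor division, 8 validation images are left out of each validation pass; rounding up is one way to include them.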

In [0]:
# PLOT LOSS AND ACCURACY
%matplotlib inline

import matplotlib.pyplot as plt

#-----------------------------------------------------------
# Retrieve the per-epoch metrics recorded for the training
# and validation sets
#-----------------------------------------------------------
acc=history.history['accuracy']
val_acc=history.history['val_accuracy']
loss=history.history['loss']
val_loss=history.history['val_loss']

epochs=range(len(acc)) # Get number of epochs

#------------------------------------------------
# Plot training and validation accuracy per epoch
#------------------------------------------------
# Labels must be passed as keyword arguments, and legend() displays them
plt.plot(epochs, acc, 'r', label="Training Accuracy")
plt.plot(epochs, val_acc, 'b', label="Validation Accuracy")
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()

#------------------------------------------------
# Plot training and validation loss per epoch
#------------------------------------------------
plt.plot(epochs, loss, 'r', label="Training Loss")
plt.plot(epochs, val_loss, 'b', label="Validation Loss")
plt.title('Training and validation loss')
plt.legend()

# Desired output. Charts with training and validation metrics. No crash :)