# Project: Automated front-end development using deep learning

## Project Statement:
Implement a Keras model to generate HTML code from hand-drawn website mockups.

![](https://78.media.tumblr.com/3fc281aa001f1f17c0ce8d461cdfabeb/tumblr_nky6c83pNG1rob81ao4_r1_250.gif)


## Intuition:
SketchCode is a deep learning model that takes hand-drawn web mockups and converts them into working HTML code. It uses an image captioning architecture to generate its HTML markup from hand-drawn website wireframes.
![](https://github.com/ashnkumar/sketch-code/raw/master/header_image.png)

In a typical design workflow, the length of the development cycle can quickly become a bottleneck.
![](https://cdn-images-1.medium.com/max/1000/1*zUhBTc7gJ985IGdjyinKGQ.png)

Companies like Airbnb have already started to use [machine learning](https://airbnb.design/sketching-interfaces/) to make this process more efficient.

This problem falls under a broader class of tasks known as [program synthesis](https://en.wikipedia.org/wiki/Program_synthesis), the automatic generation of working code. We also draw on [image captioning](https://cs.stanford.edu/people/karpathy/deepimagesent/), a field that learns models tying images to text, specifically to generate descriptions of the contents of a source image.

![](https://cdn-images-1.medium.com/max/1000/1*_sY92m24szeF5pFUCXWpNA.png)

We reframe our task as image captioning: the input image is a drawn website wireframe, and the output text is its corresponding HTML code.

## Getting the dataset:
![](https://media1.giphy.com/media/bh6fRkvDxsLSw/200w.webp)

For our use case, we need thousands of hand-drawn HTML pages that contain various Bootstrap elements such as buttons, dropdowns, and text boxes, along with their corresponding tokens.
The [dataset](https://github.com/tonybeltramelli/pix2code) from the pix2code paper consists of 1,750 screenshots of synthetically generated websites and their source code. The source code for each sample consists of tokens from a domain-specific language (DSL) that the authors of the paper created for their task. Each token corresponds to a snippet of HTML and CSS, and a compiler translates the DSL into working HTML code.

In order to generate hand-drawn images from the dataset, the following operations were performed:


*   Change the border radius of elements on the page to curve the corners of buttons and divs.
*   Adjust the thickness of borders to mimic drawn sketches, and add drop shadows.
*   Change the font to one that looks like handwriting.

You can download the dataset used for this project [here](https://github.com/ashnkumar/sketch-code/tree/master/scripts).

We also add Bootstrap elements to our training set using the Python library **OpenCV**, which is covered in a later section of this notebook.




## Model Architecture
The model architecture consists of three major parts:
* A computer vision model that uses a Convolutional Neural Network (CNN) to extract image features from the source images
* A language model consisting of a Gated Recurrent Unit (GRU) that encodes sequences of source code tokens
* A decoder model (also a GRU), which takes in the output from the previous two steps as its input, and predicts the next token in the sequence


![](https://cdn-images-1.medium.com/max/1250/1*vZX3R1neqV6Rf9y_baJCYw.png)

### Implementation

* To train the model, we split the source code into sequences of tokens. A single input for the model is one of these sequences along with its source image, and its label is the next token in the document. The model uses the cross-entropy cost as its loss function, which compares the model’s next token prediction with the actual next token.

############################################################################################################
   ### Image
   ![](https://github.com/karanpatel22/Machine-Learning/blob/master/01B5B64A-658F-411D-82B7-63EF0D80E8E6.png?raw=true)

### Sequence of Tokens
**header {
btn-inactive, btn-inactive, btn-inactive, btn-inactive
}
row {
double {
small-title, text, btn-orange
}
double {
small-title, text, btn-orange
}
}
row {
quadruple {
small-title, text, btn-orange
}
quadruple {
small-title, text, btn-orange
}
quadruple {
small-title, text, btn-orange
}
quadruple {
small-title, text, btn-orange
}
}
row {
single {
small-title, text, btn-orange
}
}**

############################################################################################################
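The training setup described above can be sketched in plain Python. The helper names (`tokenize_dsl`, `make_training_pairs`), the `<PAD>` marker, and the small window of 6 tokens are illustrative assumptions; the notebook itself relies on Keras's `Tokenizer` and `pad_sequences` and keeps the last 48 tokens of each prefix.

```python
def tokenize_dsl(gui_text):
    """Split DSL source into tokens, treating commas as separate tokens."""
    wrapped = "<START> " + gui_text + " <END>"
    return wrapped.replace(",", " , ").split()


def make_training_pairs(tokens, window=6, pad="<PAD>"):
    """Return (input_window, next_token) pairs, left-padded to a fixed length."""
    pairs = []
    for i in range(1, len(tokens)):
        prefix = tokens[:i][-window:]              # keep only the most recent tokens
        padded = [pad] * (window - len(prefix)) + prefix
        pairs.append((padded, tokens[i]))
    return pairs


tokens = tokenize_dsl("header {\nbtn-inactive, btn-inactive\n}")
pairs = make_training_pairs(tokens)
for context, target in pairs[:3]:
    print(context, "->", target)
```

Every pair built from one document shares the same source image, so a document with N tokens contributes N - 1 training examples.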

* At inference time, when the model must generate code from scratch, the process is slightly different. The image is still processed by the CNN, but the text input is seeded with only a starting sequence. At each step, the model’s prediction for the next token is appended to the current input sequence, which is then fed back into the model as the new input sequence. This repeats until the model predicts an `<END>` token or the sequence reaches a predefined maximum number of tokens per document.
  
### Example of predicted sequence of tokens

`['<START>', 'btn-inactive', 'btn-inactive', 'row', 'row', '{', 'quadruple', '{', 'small-title', ',', 'text', ',', 'btn-orange', '}', 'quadruple', '{', 'small-title', ',', 'text', ',', 'btn-orange', '}', 'quadruple', '{', 'small-title', ',', 'text', ',', 'btn-orange', '}', 'quadruple', '{', 'small-title', ',', 'text', ',', 'btn-orange', '}', '}', 'row', '{', '<END>']`
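The inference loop can be sketched as a greedy decoder. This is a minimal illustration, not the project's sampler: `greedy_decode` and `scripted_model` are hypothetical names, and the stand-in "model" simply replays a fixed list of tokens. In the real pipeline the prediction function would also receive the image features.

```python
def greedy_decode(predict_next, max_tokens=150, window=48):
    """Generate tokens one at a time until <END> or the length limit."""
    sequence = ["<START>"]
    while len(sequence) < max_tokens:
        next_token = predict_next(sequence[-window:])  # model sees the recent context
        sequence.append(next_token)
        if next_token == "<END>":
            break
    return sequence


# Stand-in for the trained model: replays a fixed script of tokens.
script = iter(["header", "{", "btn-inactive", "}", "<END>"])


def scripted_model(context):
    return next(script)


print(greedy_decode(scripted_model))
```

The `max_tokens` cap matters because nothing guarantees that the model ever emits `<END>`.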

* Once the model has generated the predicted tokens, a compiler converts the DSL tokens into HTML, which can be rendered in any browser. **Refer to the file 'default-dsl-mapping.json' in the folder structure for the token-to-HTML mapping used by the compiler.**
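To make the compilation step concrete, here is a small recursive sketch. It assumes the two placeholder conventions visible in the mapping file, `{}` for child elements and `[]` for text. The function name `compile_dsl`, the `header` template, and the filler text are hypothetical and far simpler than the project's compiler.

```python
def compile_dsl(tokens, mapping, text="Lorem"):
    """Recursively expand DSL tokens into HTML using a token -> template mapping."""
    pos = 0

    def parse_block():
        nonlocal pos
        html = []
        while pos < len(tokens) and tokens[pos] != "}":
            token = tokens[pos]
            pos += 1
            if token == ",":
                continue
            template = mapping[token]
            if pos < len(tokens) and tokens[pos] == "{":
                pos += 1                      # consume "{"
                children = parse_block()
                pos += 1                      # consume "}"
                html.append(template.replace("{}", children))
            else:
                html.append(template.replace("[]", text))
        return "".join(html)

    return parse_block()


# Hypothetical, simplified mapping for illustration only.
mapping = {
    "header": "<ul class=\"nav\">{}</ul>",
    "row": "<div class=\"row\">{}</div>",
    "btn-inactive": "<li><a href=\"#\">[]</a></li>",
}
tokens = "header { btn-inactive , btn-inactive } row { }".split()
html = compile_dsl(tokens, mapping)
print(html)
```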


# BLEU Score

We will be using BLEU score to evaluate the model. This is a common metric used in machine translation tasks, which seeks to measure how closely a machine-generated text resembles what a human would have generated, given the same input.

Essentially, BLEU compares n-gram sequences of the generated text and the reference text to compute a modified form of precision. It is well suited to this project because it accounts for the actual elements in the generated HTML as well as where they are in relation to each other.

A perfect BLEU score of 1.0 means the generated markup has the right elements in the right locations given the source image, while a lower score indicates wrong elements, elements placed in the wrong locations relative to each other, or both. The final model achieved a BLEU score of 0.76 on the evaluation set.
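The "modified precision" at the core of BLEU can be illustrated in a few lines. This sketch covers only clipped n-gram precision for a single reference. Full BLEU, as computed by NLTK's `sentence_bleu` and `corpus_bleu` in the Evaluator below, also combines several n-gram orders and applies a brevity penalty. The function names here are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(reference, hypothesis, n):
    """Clipped precision: a hypothesis n-gram counts at most as often as it occurs in the reference."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    total = sum(hyp_counts.values())
    return clipped / total if total else 0.0


reference = "row { single { small-title , text , btn-orange } }".split()
hypothesis = "row { single { small-title , text , text } }".split()
for n in (1, 2):
    print(n, modified_precision(reference, hypothesis, n))
```

Replacing a single token lowers the bigram precision more than the unigram precision, which is how the metric captures element placement as well as element choice.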

# BONUS: Scaling the model to add additional bootstrap elements
* We used the Python library **OpenCV** to randomly add a 'dropdown' element to images in our training dataset.
### Example:
![](https://github.com/karanpatel22/Machine-Learning/blob/master/dropdown.png?raw=true)

* Add an appropriate **token-html mapping** for the dropdown element in the file **'default-dsl-mapping.json'**.

### Code Snippet
. . . . . . . . . . . . . . 

"btn-inactive": "< li >< a href=\"#\">[]< /a >< /li >\n",

  "dropdown": "< li class=\"dropdown\" > < a href=\"#\" class=\"dropdown-toggle\" data-toggle=\"dropdown\" >[]< span class=\"caret\" >< /span >< /a >< ul class=\"dropdown-menu\" role=\"menu\">< li >< a href=\"#tab1\" data-toggle=\"tab\">[]< /a >< /li >< li >< a href=\"#tab2\" data-toggle=\"tab\">[]< /a >< /li >< /ul >< /li >\n",
  
  "row": "< div class=\"row\">{}< /div >\n",

. . . . . . . . . . . . . . 

* Add the new Bootstrap element to the **vocabulary.vocab** file.
* Add the logic to generate random text for the **dropdown** button in Node's `rendering_function`.
* Train the model and evaluate it on sample images.
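The vocabulary and mapping updates listed above can be sketched as follows. The helper `add_token` is hypothetical, the vocabulary line and mapping are toy stand-ins for `vocabulary.vocab` and `default-dsl-mapping.json`, and the dropdown template is abbreviated. The one assumption taken from the notebook is that the vocabulary is a single line of space-separated tokens, which is how `Dataset.load_vocab` reads it.

```python
import json


def add_token(vocab_line, mapping, token, template):
    """Return an updated (vocab_line, mapping) pair that includes the new token."""
    vocab = vocab_line.split()
    if token not in vocab:                  # avoid duplicate vocabulary entries
        vocab.append(token)
    updated_mapping = dict(mapping)         # leave the caller's mapping untouched
    updated_mapping[token] = template
    return " ".join(vocab), updated_mapping


# Toy stand-ins for vocabulary.vocab and default-dsl-mapping.json.
vocab_line = "<START> <END> , { } header row btn-inactive"
mapping = {"btn-inactive": "<li><a href=\"#\">[]</a></li>\n"}

vocab_line, mapping = add_token(
    vocab_line, mapping, "dropdown",
    "<li class=\"dropdown\"><a href=\"#\" class=\"dropdown-toggle\">[]</a></li>\n",
)
print(vocab_line)
print(json.dumps(mapping, indent=2))
```

Because the vocabulary size determines the width of the model's output layer, a model trained before the token was added has to be retrained.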




In [0]:
#Google drive authentication

!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
!mkdir -p drive
!google-drive-ocamlfuse drive



# Getting the Data and relevant files
## Download the sketch-code-master.zip folder from the given link and store it in colab.

In [0]:
!wget -O /content/sketch-code-master.zip 'https://www.dropbox.com/s/0b17nnfuwd1u9wc/sketch-code.zip?dl=0'
!unzip sketch-code-master.zip

In [0]:
# Folder structure
!tree sketch-code-master -I data

# Class: ImagePreprocessor
Used for performing preprocessing steps like image augmentation and resizing.

# Methods:


1.   build_image_dataset
2.   get_img_features
3.   save_resized_img_arrays
4.   get_resized_images
5.   resize_img

![](https://media2.giphy.com/media/11R7nCyinm0awo/giphy.gif)

In [0]:
#ImagePreprocessor.py
from __future__ import absolute_import

import os
import sys
import shutil

import numpy as np
from PIL import Image
import cv2
from keras.preprocessing.image import ImageDataGenerator

class ImagePreprocessor:

    def __init__(self):
        pass

    def build_image_dataset(self, data_input_folder, augment_data=True):

        print("Converting images from {} into arrays, augmentation: {}".format(data_input_folder, augment_data))
        resized_img_arrays, sample_ids = self.get_resized_images(data_input_folder)

        if augment_data == 1:
            self.augment_and_save_images(resized_img_arrays, sample_ids, data_input_folder)
        else:
            self.save_resized_img_arrays(resized_img_arrays, sample_ids, data_input_folder)

    def get_img_features(self, png_path):
        img_features = self.resize_img(png_path)
        assert(img_features.shape == (256,256,3))
        return img_features


   ##########################################
   ####### PRIVATE METHODS ##################
   ##########################################



    def save_resized_img_arrays(self, resized_img_arrays, sample_ids, output_folder):
        count = 0
        for img_arr, sample_id in zip(resized_img_arrays, sample_ids):
            npz_filename = "{}/{}.npz".format(output_folder, sample_id)
            np.savez_compressed(npz_filename, features=img_arr)
            retrieve = np.load(npz_filename)["features"]
            assert np.array_equal(img_arr, retrieve)
            count += 1
        print("Saved down {} resized images to folder {}".format(count, output_folder))
        del resized_img_arrays

    def augment_and_save_images(self, resized_img_arrays, sample_ids, data_input_folder):
        datagen = ImageDataGenerator(
                                 rotation_range=2,
                                 width_shift_range=0.05,
                                 height_shift_range=0.05,
                                 zoom_range=0.05
                                )
        keras_generator = datagen.flow(resized_img_arrays,sample_ids,batch_size=1)
        count = 0
        for i in range(len(resized_img_arrays)):
            img_arr, sample_id = next(keras_generator)
            img_arr = np.squeeze(img_arr)
            npz_filename = "{}/{}.npz".format(data_input_folder, sample_id[0])
            im = Image.fromarray(img_arr.astype('uint8'))
            np.savez_compressed(npz_filename, features=img_arr)
            retrieve = np.load(npz_filename)["features"]
            assert np.array_equal(img_arr, retrieve)
            count += 1
        print("Saved down {} augmented images to folder {}".format(count, data_input_folder))
        del resized_img_arrays

    def get_resized_images(self, pngs_input_folder):
        all_files = os.listdir(pngs_input_folder)
        png_files = [f for f in all_files if f.find(".png") != -1]
        images = []
        labels = []
        for png_file_path in png_files:
            png_path = "{}/{}".format(pngs_input_folder, png_file_path)
            sample_id = png_file_path[:png_file_path.find('.png')]
            resized_img_arr = self.resize_img(png_path)
            images.append(resized_img_arr)
            labels.append(sample_id)
        return np.array(images), np.array(labels)

    def resize_img(self, png_file_path):
        img_rgb = cv2.imread(png_file_path)
        img_grey = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
        img_adapted = cv2.adaptiveThreshold(img_grey, 255, cv2.ADAPTIVE_THRESH_MEAN_C,cv2.THRESH_BINARY, 101, 9)
        img_stacked = np.repeat(img_adapted[...,None],3,axis=2)
        resized = cv2.resize(img_stacked, (200,200), interpolation=cv2.INTER_AREA)
        bg_img = 255 * np.ones(shape=(256,256,3))
        bg_img[27:227, 27:227,:] = resized
        bg_img /= 255
        return bg_img

Using TensorFlow backend.


# Class: Dataset
Used for tasks such as splitting the dataset into training and validation sets.

# Methods:


1.   split_datasets
2.   split_samples
3.   preprocess_data
4.   load_vocab
5.   create_generator
6.   data_generator
7.   process_data_for_generator
8.   load_data
9.   create_data_folders
10.  copy_files_to_folders
11.  delete_existing_folders
12.  populate_sample_ids
13.  get_all_id_sets
14.  split_paths

![](https://media1.giphy.com/media/t1lmQR7sXeZJC/giphy.gif)
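The validation split hashes each sample's DSL text with whitespace removed and only admits a sample into the validation set if that hash has not been seen earlier in the pass. A minimal sketch of that idea follows; the function name and the in-memory `samples` dictionary are assumptions for illustration, whereas `split_paths` below reads `.gui` files from disk.

```python
import hashlib


def split_with_unique_validation(samples, val_count):
    """Split {sample_id: dsl_text}; a sample joins validation only if its layout hash is new."""
    seen_hashes = set()
    train_ids, val_ids = [], []
    for sample_id, dsl_text in samples.items():
        normalized = dsl_text.replace(" ", "").replace("\n", "")
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if len(val_ids) < val_count and digest not in seen_hashes:
            val_ids.append(sample_id)
        else:
            train_ids.append(sample_id)
        seen_hashes.add(digest)
    return train_ids, val_ids


samples = {
    "a": "row { single { text } }",
    "b": "row {\nsingle { text }\n}",   # same layout as "a" after normalization
    "c": "header { btn-inactive }",
}
train_ids, val_ids = split_with_unique_validation(samples, val_count=2)
print(train_ids, val_ids)
```

Note that hashes must be compared by value. Comparing two separately computed hash strings with `is` tests object identity and would not reliably detect duplicates.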

In [0]:
#Dataset.py
from __future__ import absolute_import

import os
import shutil
import pdb
import hashlib
import numpy as np

from keras.preprocessing.text import Tokenizer, one_hot
from keras.preprocessing.sequence import pad_sequences
from keras.utils import to_categorical


VOCAB_FILE              = 'sketch-code-master/vocabulary.vocab'
TRAINING_SET_NAME       = "training_set"
VALIDATION_SET_NAME     = "validation_set"
BATCH_SIZE              = 64

class Dataset:

    def __init__(self, data_input_folder, test_set_folder=None):
        self.data_input_folder = data_input_folder
        self.test_set_folder   = test_set_folder

    def split_datasets(self, validation_split):
        sample_ids = self.populate_sample_ids()
        print("Total number of samples: ", len(sample_ids))

        train_set_ids, val_set_ids, shuffled_sampled_ids = self.get_all_id_sets(validation_split, sample_ids)
        training_path, validation_path = self.split_samples(train_set_ids, val_set_ids)

        return training_path, validation_path

    def split_samples(self, train_set_ids, val_set_ids):
        training_path, validation_path = self.create_data_folders()
        self.copy_files_to_folders(train_set_ids, training_path)
        self.copy_files_to_folders(val_set_ids, validation_path)
        return training_path, validation_path

    def preprocess_data(self, training_path, validation_path, augment_training_data):
        train_img_preprocessor = ImagePreprocessor()
        train_img_preprocessor.build_image_dataset(training_path, augment_data=augment_training_data)
        val_img_preprocessor = ImagePreprocessor()
        val_img_preprocessor.build_image_dataset(validation_path, augment_data=0)




    ##########################################
    ####### PRIVATE METHODS ##################
    ##########################################

    @classmethod
    def load_vocab(cls):
        file = open(VOCAB_FILE, 'r')
        text = file.read().splitlines()[0]
        file.close()
        tokenizer = Tokenizer(filters='', split=" ", lower=False)
        tokenizer.fit_on_texts([text])
        vocab_size = len(tokenizer.word_index) + 1
        return tokenizer, vocab_size

    @classmethod
    def create_generator(cls, data_input_path, max_sequences):
        img_features, text_features = Dataset.load_data(data_input_path)
        total_sequences = 0
        for text_set in text_features: total_sequences += len(text_set.split())
        steps_per_epoch = total_sequences // BATCH_SIZE
        tokenizer, vocab_size = Dataset.load_vocab()
        data_gen = Dataset.data_generator(text_features, img_features, max_sequences, tokenizer, vocab_size)
        return data_gen, steps_per_epoch

    @classmethod
    def data_generator(cls, text_features, img_features, max_sequences, tokenizer, vocab_size):
        while 1:
            for i in range(0, len(text_features), 1):
                Ximages, XSeq, y = list(), list(),list()
                for j in range(i, min(len(text_features), i+1)):
                    image = img_features[j]
                    desc = text_features[j]
                    in_img, in_seq, out_word = Dataset.process_data_for_generator([desc], [image], max_sequences, tokenizer, vocab_size)
                    for k in range(len(in_img)):
                        Ximages.append(in_img[k])
                        XSeq.append(in_seq[k])
                        y.append(out_word[k])
                yield [[np.array(Ximages), np.array(XSeq)], np.array(y)]

    @classmethod
    def process_data_for_generator(cls, texts, features, max_sequences, tokenizer, vocab_size):
        X, y, image_data = list(), list(), list()
        sequences = tokenizer.texts_to_sequences(texts)
        for img_no, seq in enumerate(sequences):
            for i in range(1, len(seq)):
                in_seq, out_seq = seq[:i], seq[i]
                in_seq = pad_sequences([in_seq], maxlen=max_sequences)[0]
                out_seq = to_categorical([out_seq], num_classes=vocab_size)[0]
                image_data.append(features[img_no])
                X.append(in_seq[-48:])
                y.append(out_seq)
        return np.array(image_data), np.array(X), np.array(y)

    @classmethod
    def load_data(cls, data_input_path):
        text = []
        images = []
        all_filenames = os.listdir(data_input_path)
        all_filenames.sort()
        for filename in all_filenames:
            if filename[-3:] == "npz":
                image = np.load(data_input_path+'/'+filename)
                images.append(image['features'])
            elif filename[-3:] == 'gui':
                file = open(data_input_path+'/'+filename, 'r')
                texts = file.read()
                file.close()
                syntax = '<START> ' + texts + ' <END>'
                syntax = ' '.join(syntax.split())
                syntax = syntax.replace(',', ' ,')
                text.append(syntax)
        images = np.array(images, dtype=float)
        return images, text

    def create_data_folders(self):
        training_path = "{}/{}".format(os.path.dirname(self.data_input_folder), TRAINING_SET_NAME)
        validation_path = "{}/{}".format(os.path.dirname(self.data_input_folder), VALIDATION_SET_NAME)

        self.delete_existing_folders(training_path)
        self.delete_existing_folders(validation_path)

        if not os.path.exists(training_path): os.makedirs(training_path)
        if not os.path.exists(validation_path): os.makedirs(validation_path)
        return training_path, validation_path

    def copy_files_to_folders(self, sample_ids, output_folder):
        copied_count = 0
        for sample_id in sample_ids:
            sample_id_png_path = "{}/{}.png".format(self.data_input_folder, sample_id)
            sample_id_gui_path = "{}/{}.gui".format(self.data_input_folder, sample_id)
            if os.path.exists(sample_id_png_path) and os.path.exists(sample_id_gui_path):
                output_png_path = "{}/{}.png".format(output_folder, sample_id)
                output_gui_path = "{}/{}.gui".format(output_folder, sample_id)
                shutil.copyfile(sample_id_png_path, output_png_path)
                shutil.copyfile(sample_id_gui_path, output_gui_path)
                copied_count += 1
        print("Moved {} files from {} to {}".format(copied_count, self.data_input_folder, output_folder))

    def delete_existing_folders(self, folder_to_delete):
        if os.path.exists(folder_to_delete):
            shutil.rmtree(folder_to_delete)
            print("Deleted existing folder: {}".format(folder_to_delete))

    def populate_sample_ids(self):
        all_sample_ids = []
        full_path = self.data_input_folder
        for f in os.listdir(full_path):
            if f.find(".gui") != -1:
                file_name = f[:f.find(".gui")]
                if os.path.isfile("{}/{}.png".format(self.data_input_folder, file_name)):
                    all_sample_ids.append(file_name)
        return all_sample_ids

    def get_all_id_sets(self, validation_split, sample_ids):
        np.random.shuffle(sample_ids)
        val_count = int(validation_split * len(sample_ids))
        train_count = len(sample_ids) - val_count
        print("Splitting datasets, training samples: {}, validation samples: {}".format(train_count, val_count))
        train_set, val_set = self.split_paths(sample_ids, train_count, val_count)

        return train_set, val_set, sample_ids

    def split_paths(self, sample_ids, train_count, val_count):
        count = 0
        train_set = []
        val_set = []
        hashes = []
        for sample_id in sample_ids:
            f = open("{}/{}.gui".format(self.data_input_folder, sample_id), 'r', encoding='utf-8')

            with f:
                chars = ""
                for line in f:
                    chars += line
                content_hash = chars.replace(" ", "").replace("\n", "")
                content_hash = hashlib.sha256(content_hash.encode('utf-8')).hexdigest()

                if len(val_set) == val_count:
                    train_set.append(sample_id)
                else:
                    is_unique = True
                    for h in hashes:
                        if h == content_hash:  # compare by value; "is" only tests object identity
                            is_unique = False
                            break

                    if is_unique:
                        val_set.append(sample_id)
                    else:
                        train_set.append(sample_id)
                
                hashes.append(content_hash)
        assert len(val_set) == val_count

        return train_set, val_set


# Class: ModelUtils
# Methods:
1. prepare_data_for_training

In [0]:
#ModelUtils.py
from __future__ import absolute_import

class ModelUtils:

    @staticmethod
    def prepare_data_for_training(data_input_folder, validation_split, augment_training_data):
        dataset = Dataset(data_input_folder)
        training_path, validation_path = dataset.split_datasets(validation_split)
        dataset.preprocess_data(training_path, validation_path, augment_training_data)

        return training_path, validation_path

# Class: SketchCodeModel
The model architecture, which consists of a CNN, a language model, and a decoder (GRU).

# Methods:


1.   load_model
2.   save_model
3.   create_model
4.   train
5.   construct_callbacks

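**Winograd Implementation**

We implemented what this notebook calls the Winograd variant of the CNN. In the code below, each convolution block is a 3×1 convolution followed by a 1×3 convolution.
#### Observations:

| Metric | Normal Convolution | Winograd Convolution |
|:-:|:-:|:-:|
| Parameters | 139,723,686 | 7,587,814 |
| Execution time | 219.33 min (approx.) | 196.66 min (approx.) |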
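Some of the arithmetic behind these numbers can be checked from the layer definitions in `create_model` below. The sketch compares the parameter count of a single 3×3 layer with that of a 3×1 layer followed by a 1×3 layer, then traces the feature-map size through the encoder. The 3×3 baseline is an assumption (the notebook does not show the original CNN), the helper names are illustrative, and the size formulas follow Keras's `'same'` and `'valid'` padding conventions.

```python
import math


def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus biases of a single 2D convolution layer."""
    return (kernel_h * kernel_w * in_channels + 1) * out_channels


def out_size(size, kernel, stride, padding):
    """Output length along one axis for 'same' or 'valid' padding."""
    if padding == "same":
        return math.ceil(size / stride)
    return math.ceil((size - kernel + 1) / stride)


# One 3x3 layer versus a 3x1 layer followed by a 1x3 layer, 64 channels throughout.
full = conv_params(3, 3, 64, 64)
factorized = conv_params(3, 1, 64, 64) + conv_params(1, 3, 64, 64)
print("3x3:", full, "3x1 + 1x3:", factorized)

# Spatial size through the encoder: one 'valid' pair, then three stride-2 pairs
# in which both the 3x1 and the 1x3 layer use strides=2 (stride-1 pairs keep the size).
size = out_size(256, 3, 1, "valid")          # each axis shrinks once in the first pair
for _ in range(3):                           # three downsampling pairs
    size = out_size(out_size(size, 3, 2, "same"), 3, 2, "same")
flatten_units = size * size * 128
print("feature map:", size, "x", size, "flattened:", flatten_units)
```

Because both layers in each downsampling pair use `strides=2`, every such pair shrinks the feature map by a factor of four per axis. This keeps the input to the first dense layer small, and that input size has a large effect on the total parameter count.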

![](https://media2.giphy.com/media/fo7URGO8QtGPToGZrk/giphy.gif)





In [0]:
#SketchCodeModel.py
from __future__ import absolute_import

from keras.models import Model, Sequential, model_from_json
from keras.callbacks import ModelCheckpoint, CSVLogger, Callback
from keras.layers.core import Dense, Dropout, Flatten
from keras.layers import Embedding, GRU, TimeDistributed, RepeatVector, LSTM, concatenate, Concatenate , Input, Reshape, Dense
from keras.layers.convolutional import Conv2D
from keras.optimizers import RMSprop


MAX_LENGTH = 48
MAX_SEQ    = 150
Conv_master = None

class SketchCodeModel():

    def __init__(self, model_output_path, model_json_file=None, model_weights_file=None):

        # Create model output path
        self.model_output_path = model_output_path

        # If we have an existing model json / weights, load in that model
        if model_json_file is not None and model_weights_file is not None:
            self.model = self.load_model(model_json_file, model_weights_file)
            optimizer = RMSprop(lr=0.0001, clipvalue=1.0)
            self.model.compile(loss='categorical_crossentropy', optimizer=optimizer)
            print("Loaded pretrained model from disk")

        # Create a new model if we don't have one
        else:
            self.create_model()
            print("Created new model, vocab size: {}".format(self.vocab_size))

        print(self.model.summary())

    def load_model(self, model_json_file, model_weights_file):
        json_file = open(model_json_file, 'r')
        loaded_model_json = json_file.read()
        json_file.close()
        loaded_model = model_from_json(loaded_model_json)
        loaded_model.load_weights(model_weights_file)
        return loaded_model
      
    def save_model(self):
        model_json = self.model.to_json()
        with open("{}/model_json.json".format(self.model_output_path), "w") as json_file:
            json_file.write(model_json)
        self.model.save_weights("{}/weights.h5".format(self.model_output_path))

    def create_model(self):
        tokenizer, vocab_size = Dataset.load_vocab()
        self.vocab_size = vocab_size
        
        ## WINOGRAD IMPLEMENTATION
        visual_input = Input(shape=(256,256,3))
        Conv2D_1 = Conv2D(16, (3,1), activation = 'relu',padding='valid')(visual_input)
        Conv2D_1 = Conv2D(16, (1,3), activation = 'relu',padding='valid')(Conv2D_1)
        Conv2D_2 = Conv2D(16, (3,1), activation = 'relu',padding='same',strides = 2)(Conv2D_1)
        Conv2D_2 = Conv2D(16, (1,3), activation = 'relu',padding='same',strides = 2)(Conv2D_2)
        Conv2D_3 = Conv2D(32, (3,1), activation = 'relu',padding='same')(Conv2D_2)
        Conv2D_3 = Conv2D(32, (1,3), activation = 'relu',padding='same')(Conv2D_3)
        Conv2D_4 = Conv2D(32, (3,1), activation = 'relu',padding='same',strides = 2)(Conv2D_3)
        Conv2D_4 = Conv2D(32, (1,3), activation = 'relu',padding='same',strides = 2)(Conv2D_4)
        Conv2D_5 = Conv2D(64, (3,1), activation = 'relu',padding='same')(Conv2D_4)
        Conv2D_5 = Conv2D(64, (1,3), activation = 'relu',padding='same')(Conv2D_5)
        Conv2D_6 = Conv2D(64, (3,1), activation = 'relu',padding='same',strides = 2)(Conv2D_5)
        Conv2D_6 = Conv2D(64, (1,3), activation = 'relu',padding='same',strides = 2)(Conv2D_6)
        Conv2D_7 = Conv2D(128, (3,1), activation = 'relu',padding='same')(Conv2D_6)
        Conv2D_7 = Conv2D(128, (1,3), activation = 'relu',padding='same')(Conv2D_7)
        flat = Flatten()(Conv2D_7)
        dense1 = Dense(1024, activation='relu')(flat)
        dense1 = Dropout(0.3)(dense1)
        dense2 = Dense(1024, activation='relu')(dense1)
        dense2 = Dropout(0.3)(dense2)
        encoded_image = RepeatVector(MAX_LENGTH)(dense2)
        
        
        # Language encoder
        language_input = Input(shape=(MAX_LENGTH,))
        language_model = Embedding(vocab_size, 50, input_length=MAX_LENGTH, mask_zero=True)(language_input)
        language_model = GRU(128, return_sequences=True)(language_model)
        language_model = GRU(128, return_sequences=True)(language_model)

        # Decoder
        decoder = concatenate([encoded_image, language_model])
        decoder = GRU(512, return_sequences=True)(decoder)
        decoder = GRU(512, return_sequences=False)(decoder)
        decoder = Dense(vocab_size, activation='softmax')(decoder)

        # Compile the model
        self.model = Model(inputs=[visual_input, language_input], outputs=decoder)
        optimizer = RMSprop(lr=0.0001, clipvalue=1.0)
        self.model.compile(loss='categorical_crossentropy', optimizer=optimizer)

    def train(self, training_path, validation_path, epochs):

        # Setup data generators
        training_generator, train_steps_per_epoch = Dataset.create_generator(training_path, max_sequences=MAX_SEQ)
        validation_generator, val_steps_per_epoch = Dataset.create_generator(validation_path, max_sequences=MAX_SEQ)

        # Setup model callbacks
        callbacks_list = self.construct_callbacks(validation_path)

        # Begin training
        print("\n### Starting model training ###\n")
        self.model.fit_generator(generator=training_generator, validation_data=validation_generator, epochs=epochs, shuffle=False, validation_steps=val_steps_per_epoch, steps_per_epoch=train_steps_per_epoch, callbacks=callbacks_list, verbose=1)
        print("\n### Finished model training ###\n")
        self.save_model()

    def construct_callbacks(self, validation_path):
        checkpoint_filepath="{}/".format(self.model_output_path) + "weights-epoch-{epoch:04d}--val_loss-{val_loss:.4f}--loss-{loss:.4f}.h5"
        csv_logger = CSVLogger("{}/training_val_losses.csv".format(self.model_output_path))
        checkpoint = ModelCheckpoint(checkpoint_filepath,
                                    verbose=0,
                                    save_weights_only=True,
                                    save_best_only=True,
                                    mode= 'min',
                                    period=2)
        callbacks_list = [checkpoint, csv_logger]
        return callbacks_list


# Class: Evaluator
Used to evaluate the conversion of a sample image to HTML code. 
Current BLEU Score: **0.76**

# Methods:
1. get_sentence_bleu
2. get_corpus_bleu
3. load_gui_doc
4. load_guis_from_folder

![](https://media3.giphy.com/media/7SX08DeQl8WEgGxRVZ/giphy.gif)

In [0]:
#Evaluator.py
from __future__ import print_function
from __future__ import absolute_import

import pdb
import os
import operator
from nltk.translate.bleu_score import sentence_bleu, corpus_bleu

class Evaluator:
    def __init__(self):
        pass

    @classmethod
    def get_sentence_bleu(cls, original_gui_filepath, generated_gui_filepath):
        original_gui = Evaluator.load_gui_doc(original_gui_filepath)
        generated_gui = Evaluator.load_gui_doc(generated_gui_filepath)
        hypothesis = generated_gui[1:-1]
        reference = original_gui
        references = [reference]
        return sentence_bleu(references, hypothesis)

    @classmethod
    def get_corpus_bleu(cls, original_guis_filepath, predicted_guis_filepath):
        actuals, predicted = Evaluator.load_guis_from_folder(original_guis_filepath, predicted_guis_filepath)
        regular_bleu = corpus_bleu(actuals, predicted)
        return regular_bleu

    @classmethod
    def load_gui_doc(cls, gui_filepath):
        file = open(gui_filepath, 'r')
        gui = file.read()
        file.close()
        gui = ' '.join(gui.split())
        gui = gui.replace(',', ' ,')
        gui = gui.split()

        # Predicted images don't have color so we normalize all buttons to btn-orange or btn-active
        btns_to_replace = ['btn-green', 'btn-red']
        normalized_gui = ['btn-orange' if token in btns_to_replace else token for token in gui]
        normalized_gui = ['btn-active' if token == 'btn-inactive' else token for token in normalized_gui]
        return normalized_gui

    @classmethod
    def load_guis_from_folder(cls, original_guis_filepath, predicted_guis_filepath):
        actuals, predicted = list(), list()
        all_predicted_files = os.listdir(predicted_guis_filepath)
        all_predicted_guis = [f for f in all_predicted_files if f.find('.gui') != -1]
        all_predicted_guis.sort()
        for f in all_predicted_guis:
            generated_gui_filepath = "{}/{}".format(predicted_guis_filepath, f)
            actual_gui_filepath = "{}/{}".format(original_guis_filepath, f)
            if os.path.isfile(actual_gui_filepath):
                predicted_gui = Evaluator.load_gui_doc(generated_gui_filepath)
                actual_gui = Evaluator.load_gui_doc(actual_gui_filepath)

                predicted.append(predicted_gui[1:-1])
                actuals.append([actual_gui])
        return actuals, predicted

# Class: SamplerUtils
Used to generate random text for the bootstrap elements.

# Methods:
1. get_random_text

In [0]:
#SamplerUtils.py
from __future__ import print_function
from __future__ import absolute_import

import string
import random

class SamplerUtils:

    @staticmethod
    def get_random_text(length_text=10, space_number=1, with_upper_case=True):
        results = []
        while len(results) < length_text:
            char = random.choice(string.ascii_lowercase)
            results.append(char)
        if with_upper_case:
            results[0] = results[0].upper()

        current_spaces = []
        while len(current_spaces) < space_number:
            space_pos = random.randint(2, length_text - 3)
            if space_pos in current_spaces:
                break
            results[space_pos] = " "
            if with_upper_case:
                # Capitalize the first letter of the word that follows the space
                results[space_pos + 1] = results[space_pos + 1].upper()

            current_spaces.append(space_pos)

        return ''.join(results)

# Class: Node
Used to render the generated GUI code to HTML.

# Methods:
1. add_child
2. show
3. rendering_function
4. render


In [0]:
#Node.py
from __future__ import print_function
from __future__ import absolute_import

TEXT_PLACE_HOLDER = "[]"

class Node:

    def __init__(self, key, parent_node, content_holder):
        self.key = key
        self.parent = parent_node
        self.children = []
        self.content_holder = content_holder

    def add_child(self, child):
        self.children.append(child)

    def show(self):
        for child in self.children:
            child.show()

    def rendering_function(self, key, value):
        if key.find("btn") != -1:
            value = value.replace(TEXT_PLACE_HOLDER, SamplerUtils.get_random_text())
        elif key.find("title") != -1:
            value = value.replace(TEXT_PLACE_HOLDER, SamplerUtils.get_random_text(length_text=5, space_number=0))
        elif key.find("text") != -1:
            value = value.replace(TEXT_PLACE_HOLDER,
                                  SamplerUtils.get_random_text(length_text=56, space_number=7, with_upper_case=False))
        return value

    def render(self, mapping, rendering_function=None):
        content = ""
        for child in self.children:
            placeholder = child.render(mapping, self.rendering_function)
            if placeholder is None:
                # A descendant used a key that is missing from the mapping
                return None
            else:
                content += placeholder

        value = mapping.get(self.key, None)

        if value is None:
            # Unknown DSL token: let the caller report a parsing error
            return None

        if rendering_function is not None:
            value = self.rendering_function(self.key, value)

        if len(self.children) != 0:
            value = value.replace(self.content_holder, content)

        return value

# Class: Compiler
Used to compile the generated tokens into their respective HTML code, based on one of three styles: default, facebook, or airbnb.

# Methods:
1. get_stylesheet
2. compile


In [0]:
#Compiler.py
from __future__ import print_function
from __future__ import absolute_import

import os
import json


BASE_DIR_NAME = os.path.dirname('sketch-code-master/src/classes/inference/')
DEFAULT_DSL_MAPPING_FILEPATH = "{}/styles/default-dsl-mapping.json".format(BASE_DIR_NAME)
FACEBOOK_DSL_MAPPING_FILEPATH = "{}/styles/facebook_dsl_mapping.json".format(BASE_DIR_NAME)
AIRBNB_DSL_MAPPING_FILEPATH = "{}/styles/airbnb_dsl_mapping.json".format(BASE_DIR_NAME)


class Compiler:
    def __init__(self, style):
        style_json = self.get_stylesheet(style)
        with open(style_json) as data_file:
            self.dsl_mapping = json.load(data_file)

        self.opening_tag = self.dsl_mapping["opening-tag"]
        self.closing_tag = self.dsl_mapping["closing-tag"]
        self.content_holder = self.opening_tag + self.closing_tag

        self.root = Node("body", None, self.content_holder)

    def get_stylesheet(self, style):
        if style == 'default':
            return DEFAULT_DSL_MAPPING_FILEPATH
        elif style == 'facebook':
            return FACEBOOK_DSL_MAPPING_FILEPATH
        elif style == 'airbnb':
            return AIRBNB_DSL_MAPPING_FILEPATH
        else:
            raise ValueError("Unknown style: {}".format(style))

    def compile(self, generated_gui):
        dsl_file = generated_gui

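        # Parse fix: drop the <START>/<END> tokens, then use '8' as a temporary
        # separator so that each "name{" opener and each "}" closer becomes its own item
        dsl_file = dsl_file[1:-1]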
        dsl_file = ' '.join(dsl_file)
        dsl_file = dsl_file.replace('{', '{8').replace('}', '8}8')
        dsl_file = dsl_file.replace(' ', '')
        dsl_file = dsl_file.split('8')
        dsl_file = list(filter(None, dsl_file))

        current_parent = self.root
        for token in dsl_file:
            token = token.replace(" ", "").replace("\n", "")

            if token.find(self.opening_tag) != -1:
                token = token.replace(self.opening_tag, "")
                element = Node(token, current_parent, self.content_holder)
                current_parent.add_child(element)
                current_parent = element
            elif token.find(self.closing_tag) != -1:
                current_parent = current_parent.parent
            else:
                tokens = token.split(",")
                for t in tokens:
                    element = Node(t, current_parent, self.content_holder)
                    current_parent.add_child(element)

        output_html = self.root.render(self.dsl_mapping)
        if output_html is None: return "HTML Parsing Error"

        return output_html
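
The way a DSL mapping turns nested tokens into HTML can be sketched with a toy mapping and an explicit stack. This is not the Compiler above: the mapping entries are invented for illustration (the real style files are JSON documents with many more keys), the function name `render_tokens` is hypothetical, and random placeholder text is left out.

```python
# Hypothetical, heavily reduced mapping; "{}" marks where child content is inserted
MAPPING = {
    "opening-tag": "{",
    "closing-tag": "}",
    "body": "<body>{}</body>",
    "row": '<div class="row">{}</div>',
    "single": '<div class="col">{}</div>',
    "text": "<p>text</p>",
    "btn-orange": "<button>ok</button>",
}


def render_tokens(tokens, mapping):
    """Render a flat DSL token list (without <START>/<END>) to an HTML string."""
    holder = mapping["opening-tag"] + mapping["closing-tag"]
    stack = [("body", [])]  # one (key, rendered children) pair per open container
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if i + 1 < len(tokens) and tokens[i + 1] == mapping["opening-tag"]:
            stack.append((token, []))  # container: its children follow
            i += 2                     # skip the opening tag as well
            continue
        if token == mapping["closing-tag"]:
            # Close the innermost container and hand its HTML to the parent
            key, children = stack.pop()
            stack[-1][1].append(mapping[key].replace(holder, "".join(children)))
        elif token != ",":
            stack[-1][1].append(mapping[token])  # leaf element
        i += 1
    key, children = stack.pop()
    return mapping[key].replace(holder, "".join(children))


print(render_tokens("row { single { text , btn-orange } }".split(), MAPPING))
```

In this sketch an unknown token raises a `KeyError`, whereas the Compiler above returns the string "HTML Parsing Error" in that situation.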

# Class: Sampler
Used for generating the GUI code by predicting the sequence of tokens for a sample image.

# Methods:
1. convert_batch_of_images
2. convert_single_image
3. load_model
4. generate_gui
5. generate_html
6. word_for_id
7. write_gui_to_disk



In [0]:
#Sampler.py
from __future__ import absolute_import

import sys
import os
import shutil
import json
import numpy as np

from keras.models import model_from_json
from keras.preprocessing.sequence import pad_sequences

MAX_LENGTH = 48

class Sampler:

    def __init__(self, model_json_path=None, model_weights_path=None):
        self.tokenizer, self.vocab_size = Dataset.load_vocab()
        self.model = self.load_model(model_json_path, model_weights_path)

    def convert_batch_of_images(self, output_folder, pngs_path, get_corpus_bleu, original_guis_filepath, style):

        all_filenames = os.listdir(pngs_path)
        all_filenames.sort()
        generated_count = 0
        for filename in all_filenames:
            if filename.find('.png') != -1:
                png_path = "{}/{}".format(pngs_path, filename)
                try:
                    self.convert_single_image(output_folder, png_path, print_generated_output=0, get_sentence_bleu=0, original_gui_filepath=png_path, style=style)
                    generated_count += 1
                except Exception:
                    # Report the failure and carry on with the next image
                    print("Error with GUI / HTML generation:", sys.exc_info()[0])
                    print(sys.exc_info())
                    continue
        print("Generated code for {} images".format(generated_count))

        if (get_corpus_bleu == 1) and (original_guis_filepath is not None):
            print("BLEU score: {}".format(Evaluator.get_corpus_bleu(original_guis_filepath, output_folder)))

    def convert_single_image(self, output_folder, png_path, print_generated_output, get_sentence_bleu, original_gui_filepath, style):

        # Retrieve sample ID
        png_filename = os.path.basename(png_path)
        if png_filename.find('.png') == -1:
            raise ValueError("Image is not a png!")
        sample_id = png_filename[:png_filename.find('.png')]

        # Generate GUI
        print("Generating code for sample ID {}".format(sample_id))
        generated_gui, gui_output_filepath= self.generate_gui(png_path, print_generated_output=print_generated_output, output_folder=output_folder, sample_id=sample_id)

        # Generate HTML
        generated_html = self.generate_html(generated_gui, sample_id, print_generated_output=print_generated_output, output_folder=output_folder, style=style)

        # Get BLEU
        if get_sentence_bleu == 1 and (original_gui_filepath is not None):
            print("BLEU score: {}".format(Evaluator.get_sentence_bleu(original_gui_filepath, gui_output_filepath)))


    ##########################################
    ####### PRIVATE METHODS ##################
    ##########################################

    def load_model(self, model_json_path, model_weights_path):
        with open(model_json_path, 'r') as json_file:
            loaded_model_json = json_file.read()
        loaded_model = model_from_json(loaded_model_json)
        loaded_model.load_weights(model_weights_path)
        print("\nLoaded model from disk")
        return loaded_model

    def generate_gui(self, png_path, print_generated_output, sample_id, output_folder):
        test_img_preprocessor = ImagePreprocessor()
        img_features = test_img_preprocessor.get_img_features(png_path)

        in_text = '<START> '
        photo = np.array([img_features])
        # Greedy decoding: predict one token at a time, up to 150 tokens
        for _ in range(150):
            sequence = self.tokenizer.texts_to_sequences([in_text])[0]
            sequence = pad_sequences([sequence], maxlen=MAX_LENGTH)
            yhat = self.model.predict([photo, sequence], verbose=0)
            yhat = np.argmax(yhat)
            word = self.word_for_id(yhat)
            if word is None:
                break
            in_text += word + ' '
            if word == '<END>':
                break

        generated_gui = in_text.split()

        if print_generated_output == 1:
            print("\n=========\nGenerated GUI code:")
            print(generated_gui)

        gui_output_filepath = self.write_gui_to_disk(generated_gui, sample_id, output_folder)

        return generated_gui, gui_output_filepath

    def generate_html(self, gui_array, sample_id, print_generated_output, output_folder, style='default'):

        compiler = Compiler(style)
        compiled_website = compiler.compile(gui_array)

        if print_generated_output == 1:
            print("\nCompiled HTML:")
            print(compiled_website)

        if compiled_website != 'HTML Parsing Error':
            output_filepath = "{}/{}.html".format(output_folder, sample_id)
            with open(output_filepath, 'w') as output_file:
                output_file.write(compiled_website)
                print("Saved generated HTML to {}".format(output_filepath))

    def word_for_id(self, integer):
        for word, index in self.tokenizer.word_index.items():
            if index == integer:
                return word
        return None

    def write_gui_to_disk(self, gui_array, sample_id, output_folder):
        gui_output_filepath = "{}/{}.gui".format(output_folder, sample_id)
        with open(gui_output_filepath, 'w') as out_f:
            out_f.write(' '.join(gui_array))
        return gui_output_filepath
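
The decoding loop in `generate_gui` is a greedy search: at each step the most probable next token is appended and the extended sequence is fed back in. The sketch below reproduces that loop with a stand-in for the trained network so it runs without Keras. The toy vocabulary, `toy_model`, `pad_left` and `greedy_decode` are all hypothetical, and the image features are omitted.

```python
MAX_LENGTH = 48
# Hypothetical toy vocabulary; index 0 is kept for padding
VOCAB = ['<PAD>', '<START>', 'header', '{', 'btn-active', '}', '<END>']


def pad_left(sequence, maxlen):
    """Mimic the defaults of Keras pad_sequences: keep the last maxlen ids, left-pad with 0."""
    sequence = sequence[-maxlen:]
    return [0] * (maxlen - len(sequence)) + sequence


def toy_model(padded_ids):
    """Stand-in for model.predict: returns one score per vocabulary entry."""
    script = ['header', '{', 'btn-active', '}', '<END>']
    step = sum(1 for i in padded_ids if i != 0) - 1  # tokens generated so far
    scores = [0.0] * len(VOCAB)
    scores[VOCAB.index(script[min(step, len(script) - 1)])] = 1.0
    return scores


def greedy_decode(predict, max_steps=150):
    """Repeatedly append the highest-scoring token until <END> or max_steps."""
    ids = [VOCAB.index('<START>')]
    for _ in range(max_steps):
        scores = predict(pad_left(ids, MAX_LENGTH))
        next_id = max(range(len(scores)), key=scores.__getitem__)  # argmax
        ids.append(next_id)
        if VOCAB[next_id] == '<END>':
            break
    return [VOCAB[i] for i in ids]


print(greedy_decode(toy_model))
```

Since the input is padded or truncated to `MAX_LENGTH = 48` while the loop may run for up to 150 steps, long outputs are predicted from only the most recent 48 tokens.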


# Training the model
Initiate training of the model from scratch, or resume training from a pretrained model.

![](https://media2.giphy.com/media/26tPrcX6EfSj5N0HK/giphy.gif)




In [0]:
#train.py
#!/usr/bin/env python
from __future__ import print_function
from __future__ import absolute_import

import os

VAL_SPLIT = 0.2

def main():
    data_input_path = 'sketch-code-master/data'  ## directory containing images and guis
    validation_split = 0.2  ## portion of training data for validation set
    epochs = 10  ## number of epochs to train on
    model_output_path = 'sketch-code-master/model_output'  ## directory for saving model data 
    model_json_file = None ## pretrained model json file 
    model_weights_file = None ## pretrained model weights file
    augment_training_data = 1 ## use Keras image augmentation on training data

    # Load model
    model = SketchCodeModel(model_output_path, model_json_file, model_weights_file)

    # Create the model output path if it doesn't exist
    if not os.path.exists(model_output_path):
        os.makedirs(model_output_path)

    # Split the datasets and save down image arrays
    training_path, validation_path = ModelUtils.prepare_data_for_training(data_input_path, validation_split, augment_training_data)

    # Begin model training
    model.train(training_path=training_path,
                validation_path=validation_path,
                epochs=epochs)

if __name__ == "__main__":
    main()

Created new model, vocab size: 18
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_3 (InputLayer)            (None, 256, 256, 3)  0                                            
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 254, 256, 16) 160         input_3[0][0]                    
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 254, 254, 16) 784         conv2d_8[0][0]                   
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 127, 127, 16) 784         conv2d_9[0][0]                   
___________________________________________________________________________



Epoch 2/10

Epoch 3/10

Epoch 4/10
 148/1468 [==>...........................] - ETA: 16:26 - loss: 0.0788

Epoch 5/10
  46/1468 [..............................] - ETA: 17:49 - loss: 0.0778

Epoch 6/10
   6/1468 [..............................] - ETA: 18:18 - loss: 0.0695



Epoch 7/10

Epoch 8/10

Epoch 9/10
 148/1468 [==>...........................] - ETA: 16:18 - loss: 0.0799

Epoch 10/10
  46/1468 [..............................] - ETA: 17:26 - loss: 0.0711


### Finished model training ###



# Sample Input Image
<img src="https://github.com/karanpatel22/Winograd/blob/master/drawn_example2.png?raw=true">


# ConvertSingleImage

Used to validate your model by testing it on sample images.

In [0]:
#ConvertSingleImage.py
#!/usr/bin/env python
import os

def main():
    png_path = 'sketch-code-master/examples/drawn_example2.png' ## png filepath to convert into HTML
    output_folder = 'sketch-code-master/output/generated-2.html' ## dir to save generated gui and html
    model_json_file = 'sketch-code-master/model_output/model_json.json' ## trained model json file
    model_weights_file = 'sketch-code-master/model_output/weights.h5' ## trained model weights file
    print_generated_output = 1 ## see generated GUI output in terminal
    print_bleu_score = 0 ## see BLEU score for single example
    original_gui_filepath = None ## if getting BLEU score, provide original gui filepath
    style = 'default' ## style to use for generation - default,facebook,airbnb

    if not os.path.exists(output_folder):
        os.makedirs(output_folder)

    sampler = Sampler(model_json_path=model_json_file, model_weights_path=model_weights_file)
    sampler.convert_single_image(output_folder, png_path=png_path, print_generated_output=print_generated_output, get_sentence_bleu=print_bleu_score, original_gui_filepath=original_gui_filepath, style=style)

if __name__ == "__main__":
    main()


Loaded model from disk
Generating code for sample ID drawn_example2

Generated GUI code:
['<START>', 'header', '{', 'btn-inactive', ',', 'btn-inactive', ',', 'btn-inactive', ',', 'btn-inactive', ',', 'btn-inactive', '}', 'row', '{', 'quadruple', '{', 'small-title', ',', 'text', ',', 'btn-orange', '}', 'quadruple', '{', 'small-title', ',', 'text', ',', 'btn-orange', '}', 'quadruple', '{', 'small-title', ',', 'text', ',', 'btn-orange', '}', 'quadruple', '{', 'small-title', ',', 'text', ',', 'btn-orange', '}', '}', 'row', '{', 'single', '{', 'small-title', ',', 'text', ',', 'btn-orange', '}', '}', 'row', '{', 'single', '{', 'small-title', ',', 'text', ',', 'btn-orange', '}', '}', '<END>']

Compiled HTML:
<html>
  <header>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+Pm

# Output Image
<img src="https://github.com/karanpatel22/Winograd/blob/master/output.PNG?raw=true">