Models

Dataset

Our network takes as input a 3-channel black-and-white image (the single grayscale channel is repeated across all three channels) and outputs a 5-channel image, where each channel contains the mask for one of the website primitives (see the sketch after the steps below). To finish building the dataset, follow these steps AFTER running the crawler from the corpus directory:

  • cd corpus
  • python3 make_corpus.py --scraped-websites-dir /path/to/scraped/websites --class-dirs list-with-scraped-primitive-names

After this, the dataset is ready.
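As a quick sanity check of the input/output contract described above, here is a minimal PyTorch sketch. The placeholder layer stands in for the real model (only the 3-in / 5-out channel counts come from this README; everything else is an assumption for illustration):

```python
import torch
import torch.nn as nn

# Placeholder for the real network; substitute one of the models from this
# repo. Only the 3 -> 5 channel mapping reflects the README's description.
model = nn.Conv2d(3, 5, kernel_size=1)

gray = torch.rand(1, 1, 256, 256)   # one grayscale image (batch, channel, H, W)
x = gray.repeat(1, 3, 1, 1)         # repeat the channel -> (1, 3, 256, 256)

with torch.no_grad():
    masks = model(x)                # (1, 5, 256, 256): one mask per primitive

assert masks.shape[1] == 5
```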

Architectures

In our experiments, we tried models based on the encoder-decoder architecture, which contains a downsampling and an upsampling path. Currently, the UI code uses frednetv2, but you can change it to see the results we achieved with the other models.
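To make the downsampling/upsampling structure concrete, here is a minimal toy encoder-decoder (illustrative only, not one of the repo's models; the layer widths and depth are arbitrary):

```python
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    """Toy segmentation net: downsample to a bottleneck, then upsample back."""
    def __init__(self, in_ch=3, out_ch=5):
        super().__init__()
        # Downsampling path: strided convolutions halve the resolution twice.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Upsampling path: interpolation + convolution restores the resolution.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(16, out_ch, 3, padding=1),  # 5 output mask channels
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```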

In the table below, we explain the differences between the implementations of UNet, PSPNet, and FrednetV2; the other Frednet iterations are similar.

Name Details
UNet Based on the implementation of U-Net: Convolutional Networks for Biomedical Image Segmentation. It contains an encoder-decoder path with skip connections. It was initially used to segment biomedical images; we adapted it for our purpose.
PSPNet Based on Pyramid Scene Parsing Network. It is similar to the other encoder-decoder architectures, but the skip connections are achieved by concatenating layers of different sizes from the encoder and passing them through the decoder.
FrednetV2 This is an original architecture. We also base our model on encoder-decoder networks, but we empirically determined the depth needed for our dataset. Another improvement came from applying the fixes described in Deconvolution and Checkerboard Artifacts to remove the checkerboard artifacts produced by the learnable upsampling layers.
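The checkerboard fix referenced for FrednetV2 is commonly implemented by replacing transposed convolutions with a resize-then-convolve step, as recommended in the Deconvolution and Checkerboard Artifacts article. A minimal sketch of the swap (illustrative channel sizes, not the repo's exact layers):

```python
import torch.nn as nn

# Transposed convolution: learnable upsampling, but prone to checkerboard
# artifacts when the kernel size is not divisible by the stride.
up_deconv = nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2, padding=1)

# Resize-then-convolve: upsample with fixed interpolation, then apply a
# regular convolution. This avoids the uneven kernel overlap that causes
# the checkerboard pattern.
up_resize_conv = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="nearest"),
    nn.Conv2d(32, 16, kernel_size=3, padding=1),
)
```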