image2source

Convert an image to source code. In this project, the source code is HTML+CSS markup. The model is a Transformer, as introduced in the paper "Attention Is All You Need". The website image is fed to EfficientNet, which encodes it into a low-dimensional vector. This vector is passed to the encoder side of the Transformer, and the decoder then produces Simplified XML Notation (SXN). This notation is converted to HTML. An SXN sequence is in most cases shorter than the equivalent HTML, which is beneficial for the Natural Language Processing task (less computation time, etc.). Currently, the training dataset comes from pix2code.
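
To give a rough sense of why a shorter target sequence reduces computation, the sketch below counts the query-key score pairs that full (unmasked) self-attention computes, which grows with the square of the sequence length. The token counts are hypothetical numbers chosen for illustration, not measurements from this project.

```python
def self_attention_pairs(seq_len: int) -> int:
    """Number of query-key score pairs in full self-attention over seq_len tokens."""
    return seq_len * seq_len


def relative_cost(short_len: int, long_len: int) -> float:
    """Ratio of attention score pairs: shorter sequence vs. longer sequence."""
    return self_attention_pairs(short_len) / self_attention_pairs(long_len)


# Hypothetical token counts, chosen only for illustration.
sxn_tokens = 100
html_tokens = 400

print(relative_cost(sxn_tokens, html_tokens))  # 0.0625
```

Under these assumed lengths, a sequence four times shorter needs one sixteenth of the attention score pairs.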

Examples of generated HTML:

Example 1 (Image): Generated
Example 2 (Image): Generated

(This repository is still poorly documented. Planned improvements include documenting the steps to produce the datasets, the SXN parser, and the training file arguments.)

Why?

Customers and designers often have a website design ready in the form of images. However, turning that design into code takes time. For quick prototyping, it is often useful to have an "AI frontend designer".

This repository shows that converting an image to source code (HTML+CSS) can be done in seconds with decent code quality. The limitation is that it does not yet perform as well as a human on images that differ from the training dataset.

Training

Training and testing are done via main.py. Make sure beforehand that the required libraries are installed (see requirements.txt).

Process dataset

! This step is necessary before proceeding to training and/or prediction !

Get the dataset folder web from pix2code and move it to ../datasets/pix2code/
find ../datasets/pix2code/ -name "*.gui" | xargs -n1 ./compiler/web-compiler.py
mkdir datasets
python annotate.py
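
For anyone who prefers to drive the compile step from Python, the find/xargs line above can be sketched as follows. This is only an illustration: the helper names find_gui_files, build_commands and compile_all are hypothetical and not part of this repository, and the sketch assumes the compiler script is executable and takes a single .gui path as its argument, as the shell command suggests.

```python
import subprocess
from pathlib import Path


def find_gui_files(root):
    """Return all *.gui files under root, sorted for a stable order."""
    return sorted(Path(root).rglob("*.gui"))


def build_commands(gui_files, compiler="./compiler/web-compiler.py"):
    """Build one command per file, mirroring `xargs -n1`."""
    return [[compiler, str(path)] for path in gui_files]


def compile_all(root, compiler="./compiler/web-compiler.py"):
    """Run the compiler once per .gui file and stop on the first failure."""
    for cmd in build_commands(find_gui_files(root), compiler):
        subprocess.run(cmd, check=True)
```

Calling compile_all("../datasets/pix2code/") from the repository root would then play the role of the shell pipeline.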

Train with file

python3 main.py

Pretrained weight and tokenizer

These can be downloaded from: https://github.com/samuelmat19/image2source-tf2/releases/tag/1.0.0

Predict with file

python3 predict.py <target path of the image file>

The global parameters are all set in common_defintions.py.

Future improvements

Clean project and set up proper CI (prioritized)
Improve documentation
A window size is implemented but does not yet work properly. It was added to reduce computation time. Several papers address this computation time with various approaches.
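
The README does not say how the window is applied. As a generic illustration only, and not this repository's implementation, one common form of windowing restricts each position to attend to neighbours within a fixed distance, which lowers the number of attention scores to compute. The function names below are hypothetical.

```python
def window_mask(seq_len, window):
    """Build a local attention mask.

    mask[i][j] is True when position i may attend to position j,
    i.e. when the two positions are at most `window` steps apart.
    """
    return [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]


def allowed_pairs(mask):
    """Count the query-key pairs the mask permits."""
    return sum(sum(row) for row in mask)


mask = window_mask(seq_len=6, window=1)
print(allowed_pairs(mask))  # 16, versus 36 for full attention
```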

CONTRIBUTING

To contribute to the project, these steps can be followed. Anyone that contributes will surely be recognized and mentioned here!

Contributions to the project are made using the "Fork & Pull" model. The typical steps would be:

create an account on GitHub
fork this repository
make a local clone
make changes on the local copy
commit changes: git commit -m "my message"
push to your GitHub account: git push origin
create a Pull Request (PR) from your GitHub fork (go to your fork's webpage and click on "Pull Request." You can then add a message to describe your proposal.)

LICENSE

This open-source project is licensed under the MIT License. Huge thanks also go to the following work, which this project builds on:

https://github.com/tonybeltramelli/pix2code
