GitHub - steve-levesque/Portfolio-AI-DesignToCode: Website generation from a drawn wireframe. This is an attempt to show the usage of AI (OCR) in web development and how it can help reduce time spent in the workflow from drawing the idea to the actual code.

About

Introduction

DesignToCode is a project that aims to show, through a brief example, how AI can be used in web development to automate tedious and repetitive tasks such as converting an idea (Design) into a real website (Code). Here, the Design is a drawn wireframe and the Code is the generated HTML and CSS.

In summary, computer vision is used to analyse a wireframe and parse its useful components, from text to shapes, into computable data from which the website is generated.

The steps are explained below, separately and in order, to show the life cycle from the drawing to the generated HTML/CSS website.

No training is necessary for the given examples; only the required programs are needed (unless Tesseract has to work on a language other than English, or is not powerful enough).

Every step in the workflow could potentially be replaced with an alternative or refactored.

NB: This project is not meant to create semantically correct or production-ready websites.

Workflow

The lifecycle of this project is split into steps, each with its own responsibility, to make it easier to understand and follow.

Here is a brief list, with a picture below:

1. Wireframe Drawing
2. Optical Character Recognition (OCR)
3. Shape Recognition
4. Data Parsing
5. Data Mapping
6. Website Generation

Wireframe Drawing

This is the part with the most human interaction. The user draws a website using wireframe notation, following a few rules such as:

A picture is represented by a box containing an "X" that covers its full area.
An element's type is written at its top-left or bottom-right corner, either inside or outside the shape.
Shapes do not need to be drawn pixel-perfect, but the margin of error should not exceed 10 pixels on each side of a shape.
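The 10-pixel rule can be sketched as a simple check. This is only an illustration: the function names are hypothetical, and reading the rule as "each drawn corner lies within 10 pixels of a corner of the shape's bounding box" is an assumption, not necessarily how the notebooks apply it.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def bounding_box(corners: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Return (left, top, right, bottom) of the smallest axis-aligned box around the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def is_box_within_tolerance(corners: Sequence[Point], tol: float = 10.0) -> bool:
    """True if the 4 drawn corners each sit within `tol` pixels of a distinct bounding-box corner.

    Assumption: this is one possible reading of the "10 pixels per side" rule.
    """
    if len(corners) != 4:
        return False
    left, top, right, bottom = bounding_box(corners)
    ideal: List[Point] = [(left, top), (right, top), (right, bottom), (left, bottom)]
    remaining = list(corners)
    for ix, iy in ideal:
        match = next(
            (p for p in remaining if abs(p[0] - ix) <= tol and abs(p[1] - iy) <= tol),
            None,
        )
        if match is None:
            return False
        remaining.remove(match)  # each drawn corner may be used only once
    return True


# A slightly wobbly hand-drawn box passes; a clearly skewed one does not.
wobbly = [(12, 8), (205, 10), (200, 148), (10, 150)]
skewed = [(10, 10), (200, 40), (200, 150), (10, 150)]
```

With these sample points, `is_box_within_tolerance(wobbly)` is `True` and `is_box_within_tolerance(skewed)` is `False`.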

Optical Character Recognition (OCR)

Once the wireframe is drawn properly, OCR with Tesseract-OCR and Python finds the words written on it.

The result is then returned as a dictionary holding every parameter, such as the coordinates, the text and the percentage of accuracy (confidence).

The accepted percentage must be well balanced. Otherwise, words could be missing (threshold too high) or make little sense (threshold too low).
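The filtering described above might look like the following sketch. The input dictionary mirrors the column-oriented shape returned by `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)`. The function name `filter_words`, the sample values and the threshold of 60 are assumptions chosen for illustration, not values taken from the notebooks.

```python
from typing import Any, Dict, List


def filter_words(ocr: Dict[str, List[Any]], min_conf: float = 60.0) -> List[Dict[str, Any]]:
    """Keep only non-empty words whose confidence is at least `min_conf`."""
    words = []
    for i, text in enumerate(ocr["text"]):
        # Confidence may arrive as a string or a number depending on the version.
        conf = float(ocr["conf"][i])
        if not text.strip() or conf < min_conf:
            continue
        words.append(
            {
                "text": text.strip(),
                "left": ocr["left"][i],
                "top": ocr["top"][i],
                "width": ocr["width"][i],
                "height": ocr["height"][i],
                "conf": conf,
            }
        )
    return words


# Hand-written sample standing in for real OCR output (hypothetical values).
sample = {
    "text": ["", "Header", "lmage", "Footer"],
    "conf": ["-1", 91.5, 34.0, 78.0],
    "left": [0, 20, 40, 20],
    "top": [0, 10, 120, 400],
    "width": [640, 90, 70, 85],
    "height": [480, 25, 22, 24],
}

kept = filter_words(sample, min_conf=60)
```

Here `kept` contains only "Header" and "Footer": the misread "lmage" falls below the threshold. Raising `min_conf` to 80 would also drop "Footer", which illustrates the trade-off between missing words and nonsensical ones.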

Shape Recognition

This follows the same principle as step 2, but uses different code logic to extract the shapes instead of the words.

With more shape detection, fewer words are needed on the wireframe. For example, the picture shape does not need an annotation since it can be recognized without ambiguity.
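As an illustration of recognizing the picture shape without an annotation, the sketch below assumes that an earlier detection pass has already produced a box and a list of straight line segments. All names here are hypothetical and this is not the project's actual detection logic.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]
Box = Tuple[float, float, float, float]  # left, top, right, bottom


def _near(p: Point, q: Point, tol: float) -> bool:
    """True if p and q are within `tol` pixels on both axes."""
    return abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol


def _joins(segment: Segment, a: Point, b: Point, tol: float) -> bool:
    """True if the segment connects points a and b, in either direction."""
    p, q = segment
    return (_near(p, a, tol) and _near(q, b, tol)) or (_near(p, b, tol) and _near(q, a, tol))


def is_picture(box: Box, segments: Sequence[Segment], tol: float = 10.0) -> bool:
    """A box counts as a picture when both of its diagonals are drawn (an "X")."""
    left, top, right, bottom = box
    top_left, top_right = (left, top), (right, top)
    bottom_right, bottom_left = (right, bottom), (left, bottom)
    has_main = any(_joins(s, top_left, bottom_right, tol) for s in segments)
    has_anti = any(_joins(s, top_right, bottom_left, tol) for s in segments)
    return has_main and has_anti


box = (10, 10, 200, 150)
cross = [((12, 11), (198, 149)), ((199, 12), (11, 148))]
```

With these values, `is_picture(box, cross)` is `True`, while passing only one of the two diagonals gives `False`.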

Data Parsing

All dictionaries from steps 2 and 3 are parsed into formats that are easier to work with.
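One way such parsing could go is to normalize word and shape records into flat rows with the same columns and serialize them as CSV. The column names and the input record shapes below are assumptions made for this sketch, not the layout of the project's own files.

```python
import csv
import io
from typing import Any, Dict, List

# Hypothetical common schema for both words and shapes.
FIELDS = ["kind", "label", "left", "top", "right", "bottom"]


def word_to_row(word: Dict[str, Any]) -> Dict[str, Any]:
    """Convert an OCR word (left/top/width/height) into the common row format."""
    return {
        "kind": "word",
        "label": word["text"],
        "left": word["left"],
        "top": word["top"],
        "right": word["left"] + word["width"],
        "bottom": word["top"] + word["height"],
    }


def shape_to_row(shape: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a detected shape ({'type', 'box'}) into the common row format."""
    left, top, right, bottom = shape["box"]
    return {
        "kind": "shape",
        "label": shape["type"],
        "left": left,
        "top": top,
        "right": right,
        "bottom": bottom,
    }


def to_csv(rows: List[Dict[str, Any]]) -> str:
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    word_to_row({"text": "Header", "left": 20, "top": 10, "width": 90, "height": 25}),
    shape_to_row({"type": "rectangle", "box": (10, 5, 300, 60)}),
]
csv_text = to_csv(rows)
```

Writing `csv_text` to a file gives the next step a single, uniform table to read instead of two differently shaped dictionaries.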

Data Mapping

The data is mapped together to have all the information in one place.

The character and shape data are combined into distinct elements that can be turned into the website.
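A minimal sketch of such a mapping follows, assuming each word labels the shape whose top-left or bottom-right corner is closest to the word's center (in line with the drawing rules). The function name `map_labels`, the row format and the 60-pixel cutoff are assumptions for illustration only.

```python
import math
from typing import Any, Dict, List, Tuple


def _center(row: Dict[str, Any]) -> Tuple[float, float]:
    """Center point of a row that has left/top/right/bottom keys."""
    return (row["left"] + row["right"]) / 2, (row["top"] + row["bottom"]) / 2


def map_labels(
    shapes: List[Dict[str, Any]],
    words: List[Dict[str, Any]],
    max_dist: float = 60.0,
) -> List[Dict[str, Any]]:
    """Attach each word to the shape with the nearest top-left or bottom-right corner."""
    elements = [dict(shape, type=None) for shape in shapes]  # copies, type unknown yet
    for word in words:
        cx, cy = _center(word)
        best, best_dist = None, max_dist
        for element in elements:
            corners = (
                (element["left"], element["top"]),
                (element["right"], element["bottom"]),
            )
            for x, y in corners:
                dist = math.hypot(cx - x, cy - y)
                if dist <= best_dist:
                    best, best_dist = element, dist
        if best is not None:
            best["type"] = word["label"]
    return elements


shapes = [
    {"left": 10, "top": 10, "right": 300, "bottom": 60},
    {"left": 10, "top": 100, "right": 300, "bottom": 400},
]
words = [
    {"label": "header", "left": 14, "top": 14, "right": 74, "bottom": 30},
    {"label": "section", "left": 240, "top": 380, "right": 296, "bottom": 396},
]
elements = map_labels(shapes, words)
```

In this example the first shape becomes a "header" and the second a "section". Words farther than `max_dist` from every candidate corner are left unassigned.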

Website Generation

With the data parsed and mapped, the only step left is to generate the elements and create the website.

This part analyses the data to determine the HTML and CSS specifications of each element.

When it is completed, the elements are arranged together in their respective HTML and CSS sheets, and the index page can be opened in your favorite browser.
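The generation step could be sketched as below, assuming each mapped element carries a type and pixel coordinates and is placed with absolute positioning. The tag mapping, the `el-N` ids and the `style.css` file name are hypothetical choices for this example, not the project's actual output.

```python
from html import escape
from typing import Any, Dict, List, Tuple

# Hypothetical mapping from an element type to an HTML tag.
TAGS = {"header": "header", "footer": "footer", "picture": "img"}


def css_rule(index: int, element: Dict[str, Any]) -> str:
    """Build an absolute-positioning CSS rule from the element's coordinates."""
    width = element["right"] - element["left"]
    height = element["bottom"] - element["top"]
    return (
        f"#el-{index} {{ position: absolute; "
        f"left: {element['left']}px; top: {element['top']}px; "
        f"width: {width}px; height: {height}px; }}"
    )


def html_element(index: int, element: Dict[str, Any]) -> str:
    """Build the HTML tag for one element; unknown types fall back to a div."""
    tag = TAGS.get(element["type"], "div")
    if tag == "img":
        return f'<img id="el-{index}" alt="placeholder">'
    return f'<{tag} id="el-{index}">{escape(element["type"] or "")}</{tag}>'


def generate(elements: List[Dict[str, Any]]) -> Tuple[str, str]:
    """Return (html, css) text for the given elements."""
    body = "\n".join(html_element(i, el) for i, el in enumerate(elements))
    css = "\n".join(css_rule(i, el) for i, el in enumerate(elements))
    html = (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        '<link rel="stylesheet" href="style.css">\n'
        f"</head>\n<body>\n{body}\n</body>\n</html>\n"
    )
    return html, css


page_elements = [
    {"type": "header", "left": 10, "top": 10, "right": 300, "bottom": 60},
    {"type": "picture", "left": 10, "top": 100, "right": 300, "bottom": 400},
]
html, css = generate(page_elements)
```

Saving `html` as the index page and `css` as `style.css` in the same folder would reproduce the wireframe layout in a browser. Absolute positioning keeps the sketch short, but it is one reason the result would not be production-ready.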

Directories and Files

Project's Tree

  |  |- wireframes
  |- .gitignore             #
  |- LICENSE                #
  |- README.md              # This file
  |- requirements.txt       #

Installation

For this project to work, the following programs need to be installed, along with the required Python libraries:

Python 3.x
Jupyter Notebook (optional, but necessary for the notebooks)
Tesseract-OCR

pip install -r requirements.txt

All imports are in the notebooks. The notebooks must be executed in order (4-..., 5-..., 6-...; notebooks 1, 2 and 3 are optional) with read/write permission on the sub-directories and files of the project.

NB: The default path of Tesseract may not work in your environment. If so, find the executable and change the path to yours.

import pytesseract

# Default path
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
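If the default path does not match your system, a small helper can look for the executable before falling back to the default. This is a sketch using only the standard library. The function name `find_tesseract` is hypothetical and not part of the project or of pytesseract.

```python
import os
import shutil
from typing import Optional

DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(default: str = DEFAULT_WINDOWS_PATH) -> Optional[str]:
    """Return a usable Tesseract path, or None if none can be found.

    Looks on the system PATH first, then falls back to `default` if that file exists.
    """
    found = shutil.which("tesseract")
    if found:
        return found
    if os.path.isfile(default):
        return default
    return None


tesseract_path = find_tesseract()
# If a path was found, it can be assigned the same way as the default above:
# pytesseract.pytesseract.tesseract_cmd = tesseract_path
```

When `tesseract_path` is `None`, Tesseract-OCR is either not installed or installed somewhere unusual, and the path has to be set by hand.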

How to Execute

The simplest way to run the project is to open Jupyter Notebook and run all cells of each notebook in order. With the given examples, the .csv files should be generated almost instantly.

It is important to run the notebooks in order and wait for each .csv file to be created, because these files are reused as the workflow continues.

Contribution

Contributions are always welcome; thank you for your time. Here are the steps to contribute:

Fork the Project
Create your Feature Branch (git checkout -b feature/MyContribution)
Commit your Changes (git commit -m 'Add MyContribution')
Push to the Branch (git push origin feature/MyContribution)
Open a Pull Request

License

See the LICENSE file at the root of the project directory for more information.

Acknowledgements and Sources

Readings of articles and projects

Programs
