# Markdown-to-HTML Compiler

## Project Description

This project is a Markdown-to-HTML compiler written in Java.
The compiler reads a Markdown file as input, analyzes its structure, creates an Abstract Syntax Tree, and generates an HTML file as output.

This project demonstrates the main phases of compiler design:

- Lexical Analysis
- Syntax Analysis
- Abstract Syntax Tree Generation
- Code Generation

## Project Goal

The goal of this project is to design and implement a simple compiler that converts basic Markdown syntax into valid HTML.
Markdown is used as the source language, and HTML is used as the target language.
Example Markdown input:

```md
# My Page
This is **bold** text and *italic* text.
- Apple
- Banana
- Orange
```

Example HTML output:

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Markdown Output</title>
</head>
<body>
<h1>My Page</h1>
<p>This is <strong>bold</strong> text and <em>italic</em> text.</p>
<ul>
<li>Apple</li>
<li>Banana</li>
<li>Orange</li>
</ul>
</body>
</html>
```

The compiler currently supports the following Markdown features:
- Headings from `#` to `######`
- Paragraphs
- Unordered lists using `-`
- Bold text using `**text**`
- Italic text using `*text*`
- Inline code using `` `code` ``
- HTML escaping for special characters
The lexer reads the Markdown source file line by line and converts the input text into tokens.
For example, this Markdown line:

```md
# Hello
```

is converted into a token like this:

```txt
Token(HEADING, level=1, text=Hello)
```

The lexer identifies different token types such as:

```txt
HEADING
TEXT
LIST_ITEM
BLANK_LINE
EOF
```
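The line-by-line tokenization described above could look roughly like the following. This is a minimal sketch, not the project's actual `Lexer.java`: the class name `LexerSketch`, the `tokenize` and `headingLevel` methods, and the exact rules (a heading needs a space after the `#` characters, a list item starts with `- `) are assumptions made for illustration. It uses records, so it assumes Java 16 or newer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and rules are assumptions, not the project's real source.
class LexerSketch {
    enum TokenType { HEADING, TEXT, LIST_ITEM, BLANK_LINE, EOF }

    // level is only meaningful for HEADING tokens; it is 0 for every other type.
    record Token(TokenType type, int level, String text) {}

    // Converts each input line into exactly one token and appends a final EOF token.
    static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        for (String line : source.split("\n", -1)) {
            String trimmed = line.strip();
            if (trimmed.isEmpty()) {
                tokens.add(new Token(TokenType.BLANK_LINE, 0, ""));
            } else if (trimmed.startsWith("- ")) {
                tokens.add(new Token(TokenType.LIST_ITEM, 0, trimmed.substring(2).strip()));
            } else {
                int level = headingLevel(trimmed);
                if (level > 0) {
                    tokens.add(new Token(TokenType.HEADING, level, trimmed.substring(level).strip()));
                } else {
                    tokens.add(new Token(TokenType.TEXT, 0, trimmed));
                }
            }
        }
        tokens.add(new Token(TokenType.EOF, 0, ""));
        return tokens;
    }

    // Returns 1..6 when the line starts with that many '#' followed by a space, otherwise 0.
    static int headingLevel(String line) {
        int count = 0;
        while (count < line.length() && line.charAt(count) == '#') {
            count++;
        }
        boolean valid = count >= 1 && count <= 6
                && count < line.length() && line.charAt(count) == ' ';
        return valid ? count : 0;
    }

    public static void main(String[] args) {
        for (Token token : tokenize("# Hello\n\n- Apple\nPlain text")) {
            System.out.println(token);
        }
    }
}
```

Emitting one token per line keeps the lexer simple; grouping lines into larger blocks is left to the parser.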
The parser takes the list of tokens produced by the lexer and checks their structure.
It converts the tokens into an Abstract Syntax Tree.
For example:

```txt
Token(HEADING, level=1, text=Hello)
```

becomes:

```txt
DocumentNode
  HeadingNode(level=1, text=Hello)
```

The Abstract Syntax Tree represents the logical structure of the Markdown document.

Example AST:

```txt
DocumentNode
  HeadingNode
  ParagraphNode
  UnorderedListNode
```

The AST makes it easier for the compiler to generate the final HTML output.
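To illustrate how a token list might be turned into such a tree, here is a small sketch. It is not the project's `Parser.java` or `Node.java`: the class `ParserSketch`, the record-based node types, and the rule that consecutive `TEXT` tokens merge into one paragraph are all assumptions. It uses records and arrow-style `switch`, so it assumes Java 16 or newer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: type and method names are assumptions for illustration only.
class ParserSketch {
    enum TokenType { HEADING, TEXT, LIST_ITEM, BLANK_LINE, EOF }
    record Token(TokenType type, int level, String text) {}

    interface Node {}
    record HeadingNode(int level, String text) implements Node {}
    record ParagraphNode(String text) implements Node {}
    record UnorderedListNode(List<String> items) implements Node {}
    record DocumentNode(List<Node> children) implements Node {}

    static DocumentNode parse(List<Token> tokens) {
        List<Node> children = new ArrayList<>();
        int i = 0;
        while (i < tokens.size() && tokens.get(i).type() != TokenType.EOF) {
            Token token = tokens.get(i);
            switch (token.type()) {
                case HEADING -> {
                    children.add(new HeadingNode(token.level(), token.text()));
                    i++;
                }
                case LIST_ITEM -> {
                    // Group a run of adjacent list items into a single list node.
                    List<String> items = new ArrayList<>();
                    while (i < tokens.size() && tokens.get(i).type() == TokenType.LIST_ITEM) {
                        items.add(tokens.get(i).text());
                        i++;
                    }
                    children.add(new UnorderedListNode(items));
                }
                case TEXT -> {
                    // Assumption: adjacent text lines belong to the same paragraph.
                    StringBuilder text = new StringBuilder();
                    while (i < tokens.size() && tokens.get(i).type() == TokenType.TEXT) {
                        if (text.length() > 0) {
                            text.append(' ');
                        }
                        text.append(tokens.get(i).text());
                        i++;
                    }
                    children.add(new ParagraphNode(text.toString()));
                }
                default -> i++; // BLANK_LINE only separates blocks
            }
        }
        return new DocumentNode(children);
    }

    public static void main(String[] args) {
        List<Token> tokens = List.of(
                new Token(TokenType.HEADING, 1, "My Page"),
                new Token(TokenType.TEXT, 0, "Some text."),
                new Token(TokenType.LIST_ITEM, 0, "Apple"),
                new Token(TokenType.LIST_ITEM, 0, "Banana"),
                new Token(TokenType.EOF, 0, ""));
        System.out.println(parse(tokens));
    }
}
```

Because a blank line produces no node, it ends the current run of text or list items, which is what separates two paragraphs or two lists in this sketch.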
The HTML generator traverses the AST and creates valid HTML code.
For example, this AST node:

```txt
HeadingNode(level=1, text=Hello)
```

is converted into:

```html
<h1>Hello</h1>
```

Project structure:

```txt
markdown_compiler_java/
├── Main.java
├── Lexer.java
├── Parser.java
├── Token.java
├── TokenType.java
├── Node.java
├── HtmlGenerator.java
├── test1.md
├── test2.md
├── test3.md
├── test4.md
├── test5.md
└── README.md
```

Main.java is the main entry point of the program.
It reads the input Markdown file, runs the lexer, parser, and HTML generator, and writes the final HTML output file.
It also supports optional debug flags for displaying tokens and the AST.
Lexer.java performs lexical analysis.
It reads the Markdown text and converts each line into tokens.
For example:
```md
# Title
```

becomes:

```txt
Token(HEADING, level=1, text=Title)
```

Token.java represents a single token produced by the lexer.
Each token contains:
- token type
- text value
- heading level, if the token is a heading
TokenType.java defines the available token types used by the compiler.
The token types are:
```txt
HEADING
TEXT
LIST_ITEM
BLANK_LINE
EOF
```

Parser.java performs syntax analysis.
It reads the token list and creates an Abstract Syntax Tree.
It groups related tokens, such as multiple list items, into one AST node.
Node.java contains the classes used to represent the Abstract Syntax Tree.
The AST node classes are:
```txt
DocumentNode
HeadingNode
ParagraphNode
UnorderedListNode
```

HtmlGenerator.java performs code generation.
It converts the AST into valid HTML output.
It also handles inline Markdown formatting such as bold text, italic text, and inline code.
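One possible way to handle escaping and inline formatting is sketched below. It is not the project's `HtmlGenerator.java`: the class `InlineHtmlSketch`, its method names, and the regex-based approach are assumptions. The sketch escapes special characters first and then rewrites inline code, bold, and italic markers in that order. A known limitation of this simple approach is that asterisks inside an inline code span are still treated as emphasis markers.

```java
import java.util.List;

// Hypothetical sketch: a regex-based approach chosen for brevity, not the project's real code.
class InlineHtmlSketch {
    // Escapes characters that have a special meaning in HTML. '&' must be replaced first.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Escapes the text, then converts `code`, **bold**, and *italic* markers to HTML tags.
    static String renderInline(String text) {
        String html = escape(text);
        html = html.replaceAll("`([^`]+)`", "<code>$1</code>");
        // Bold must run before italic so that "**" is not consumed as two single asterisks.
        html = html.replaceAll("\\*\\*(.+?)\\*\\*", "<strong>$1</strong>");
        html = html.replaceAll("\\*(.+?)\\*", "<em>$1</em>");
        return html;
    }

    static String heading(int level, String text) {
        return "<h" + level + ">" + renderInline(text) + "</h" + level + ">";
    }

    static String paragraph(String text) {
        return "<p>" + renderInline(text) + "</p>";
    }

    static String unorderedList(List<String> items) {
        StringBuilder html = new StringBuilder("<ul>\n");
        for (String item : items) {
            html.append("<li>").append(renderInline(item)).append("</li>\n");
        }
        return html.append("</ul>").toString();
    }

    public static void main(String[] args) {
        System.out.println(heading(1, "My Page"));
        System.out.println(paragraph("This is **bold** text and *italic* text."));
        System.out.println(unorderedList(List.of("Apple", "Banana", "Orange")));
    }
}
```

Escaping before inserting tags matters: if the order were reversed, the generated `<strong>` and `<em>` tags would themselves be escaped.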
To run this project, the following are required:
- Java installed
- Terminal or command line
- Vim or any text editor
To check the Java version:

```bash
java -version
```

To check the Java compiler version:

```bash
javac -version
```

Inside the project folder, run:

```bash
javac *.java
```

This compiles all Java source files.

To run the compiler, use the following command:

```bash
java Main input.md output.html
```

Example:

```bash
java Main test1.md test1.html
```

This command reads test1.md and creates test1.html.
The program includes debug options to display the tokens and AST in the terminal.
To show tokens:

```bash
java Main test1.md test1.html --tokens
```

To show the AST:

```bash
java Main test1.md test1.html --ast
```

To show both tokens and AST:

```bash
java Main test1.md test1.html --tokens --ast
```

These options are useful for demonstrating the compiler design process.
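The command-line shape shown above (two file paths followed by optional flags) could be parsed along these lines. This is only a sketch of one way to do it, not the actual `Main.java`: the class `CliArgsSketch`, the `Options` record, and the usage message are assumptions.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: how Main might interpret its arguments; names are assumptions.
class CliArgsSketch {
    record Options(String input, String output, boolean showTokens, boolean showAst) {}

    // Expects: input path, output path, then any combination of --tokens and --ast.
    static Options parse(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException(
                    "Usage: java Main input.md output.html [--tokens] [--ast]");
        }
        List<String> flags = Arrays.asList(args).subList(2, args.length);
        return new Options(args[0], args[1],
                flags.contains("--tokens"), flags.contains("--ast"));
    }

    public static void main(String[] args) {
        Options options = parse(new String[] {"test1.md", "test1.html", "--tokens", "--ast"});
        System.out.println(options);
    }
}
```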
## Test Cases Included in the Project

The project contains 5 test cases.
Each test case checks a different part of the Markdown-to-HTML compiler.

### Test Case 1: Basic Markdown Document

File name:

```txt
test1.md
```

Content:

```md
# My Markdown Page
This is a simple paragraph.
## Features
This text has **bold** and *italic* words.
- Apple
- Banana
- Orange
```

Purpose:
This test case checks basic compiler functionality, including:
- Heading 1
- Heading 2
- Paragraphs
- Bold text
- Italic text
- Unordered lists
Command:
```bash
java Main test1.md test1.html
```

### Test Case 2

File name:

```txt
test2.md
```

Content:

```md
# Headings Test
## Section Two
### Section Three
This file tests multiple heading levels.
```

Purpose:
This test case checks whether the compiler can correctly process different heading levels.
Command:
```bash
java Main test2.md test2.html
```

### Test Case 3

File name:

```txt
test3.md
```

Content:

```md
# Text Formatting Test
This paragraph contains **bold text**.
This paragraph contains *italic text*.
This paragraph contains `inline code`.
This paragraph contains **bold**, *italic*, and `code` together.
```

Purpose:
This test case checks inline formatting features, including:
- Bold text
- Italic text
- Inline code
- Multiple formatting types in one paragraph
Command:
```bash
java Main test3.md test3.html
```

### Test Case 4

File name:

```txt
test4.md
```

Content:

```md
# List Test
Shopping list:
- Apples
- Bananas
- Oranges
- Bread
- Milk
End of list.
```

Purpose:
This test case checks unordered list compilation.
Command:
```bash
java Main test4.md test4.html
```

### Test Case 5

File name:

```txt
test5.md
```

Content:

```md
# Mixed Markdown Test
Welcome to my compiler demo.
## Features
This compiler supports **bold text**, *italic text*, and `inline code`.
- Lexer
- Parser
- AST
- HTML Generator
This is the final test file.
```

Purpose:
This test case checks several compiler features together in one file:
- Headings
- Paragraphs
- Bold text
- Italic text
- Inline code
- Unordered lists
Command:
```bash
java Main test5.md test5.html
```

To run all 5 test cases, use the following commands:

```bash
java Main test1.md test1.html
java Main test2.md test2.html
java Main test3.md test3.html
java Main test4.md test4.html
java Main test5.md test5.html
```

After running these commands, the following HTML files will be generated:

```txt
test1.html
test2.html
test3.html
test4.html
test5.html
```

To view the generated HTML code in the terminal:

```bash
cat test1.html
```

To open the generated HTML file in a browser on Linux:

```bash
xdg-open test1.html
```

On macOS:

```bash
open test1.html
```

On Windows:

```txt
start test1.html
```

A good project demonstration can be done using the following commands:

```bash
javac *.java
cat test5.md
java Main test5.md test5.html --tokens --ast
cat test5.html
xdg-open test5.html
```