Markdown-to-HTML Compiler

Project Description

This project is a Markdown-to-HTML compiler written in Java.

The compiler reads a Markdown file as input, analyzes its structure, creates an Abstract Syntax Tree, and generates an HTML file as output.

This project demonstrates the main phases of compiler design:

  1. Lexical Analysis
  2. Syntax Analysis
  3. Abstract Syntax Tree Generation
  4. Code Generation

Project Goal

The goal of this project is to design and implement a simple compiler that converts basic Markdown syntax into valid HTML.

Markdown is used as the source language, and HTML is used as the target language.

Example Markdown input:

# My Page

This is **bold** text and *italic* text.

- Apple
- Banana
- Orange

Example HTML output:

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Markdown Output</title>
</head>
<body>
<h1>My Page</h1>
<p>This is <strong>bold</strong> text and <em>italic</em> text.</p>
<ul>
  <li>Apple</li>
  <li>Banana</li>
  <li>Orange</li>
</ul>
</body>
</html>

Supported Markdown Features

The compiler currently supports the following Markdown features:

  • Headings from # to ######
  • Paragraphs
  • Unordered lists using -
  • Bold text using **text**
  • Italic text using *text*
  • Inline code using `code`
  • HTML escaping for special characters
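
The last item, HTML escaping, can be illustrated with a small sketch. The class name `HtmlEscapeSketch` and the exact set of escaped characters are assumptions made for illustration; the project's own generator may escape a different set.

```java
// Hypothetical helper, not taken from the project sources.
final class HtmlEscapeSketch {
    // Replaces characters that have special meaning in HTML with entities.
    // Working character by character avoids double-escaping the '&'
    // that the entities themselves contain.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this sketch, `HtmlEscapeSketch.escape("a < b & c")` returns `a &lt; b &amp; c`, so Markdown text that happens to contain `<` or `&` cannot break the generated page.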

Compiler Phases

1. Lexical Analysis

The lexer reads the Markdown source file line by line and converts the input text into tokens.

For example, this Markdown line:

# Hello

is converted into a token like this:

Token(HEADING, level=1, text=Hello)

The lexer identifies different token types such as:

  • HEADING
  • TEXT
  • LIST_ITEM
  • BLANK_LINE
  • EOF
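
A line-based lexer of this kind might be sketched as follows. The names `LexerSketch`, `Tok`, `classify`, and `tokenize` are hypothetical, and the rules (a heading is one to six `#` characters followed by a space, a list item starts with `- `) are assumptions based on the supported features listed earlier, not a copy of Lexer.java.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; the project's Lexer.java may be organized differently.
final class LexerSketch {
    enum Type { HEADING, TEXT, LIST_ITEM, BLANK_LINE, EOF }

    static final class Tok {
        final Type type;
        final int level;   // 1-6 for headings, 0 for every other token
        final String text;

        Tok(Type type, int level, String text) {
            this.type = type;
            this.level = level;
            this.text = text;
        }
    }

    // Decides which kind of token a single source line represents.
    static Tok classify(String line) {
        if (line.trim().isEmpty()) {
            return new Tok(Type.BLANK_LINE, 0, "");
        }
        int hashes = 0;
        while (hashes < line.length() && line.charAt(hashes) == '#') {
            hashes++;
        }
        // Assumed rule: 1-6 '#' characters followed by a space start a heading.
        if (hashes >= 1 && hashes <= 6
                && hashes < line.length() && line.charAt(hashes) == ' ') {
            return new Tok(Type.HEADING, hashes, line.substring(hashes + 1).trim());
        }
        if (line.startsWith("- ")) {
            return new Tok(Type.LIST_ITEM, 0, line.substring(2).trim());
        }
        return new Tok(Type.TEXT, 0, line.trim());
    }

    // Splits the source into lines, classifies each one, and appends EOF.
    static List<Tok> tokenize(String source) {
        List<Tok> tokens = new ArrayList<>();
        for (String line : source.split("\\R", -1)) {
            tokens.add(classify(line));
        }
        tokens.add(new Tok(Type.EOF, 0, ""));
        return tokens;
    }
}
```

Inline markers such as `**` are deliberately left inside the token text here; in this project they are handled later, by the HTML generator.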

2. Syntax Analysis

The parser takes the list of tokens produced by the lexer and checks their structure.

It converts the tokens into an Abstract Syntax Tree.

For example:

Token(HEADING, level=1, text=Hello)

becomes:

DocumentNode
  HeadingNode(level=1, text=Hello)

3. Abstract Syntax Tree

The Abstract Syntax Tree represents the logical structure of the Markdown document.

Example AST:

DocumentNode
  HeadingNode
  ParagraphNode
  UnorderedListNode

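Because the AST already groups related lines, such as consecutive list items under one UnorderedListNode, the generator can produce the final HTML by walking the tree instead of re-reading the Markdown text.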

4. Code Generation

The HTML generator traverses the AST and creates valid HTML code.

For example, this AST node:

HeadingNode(level=1, text=Hello)

is converted into:

<h1>Hello</h1>
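
The traversal can be sketched roughly as below. The nested classes `Heading`, `Paragraph`, and `UnorderedList` and the method `generateBody` are hypothetical stand-ins for the project's node classes and generator, and the sketch leaves out escaping, inline formatting, and the surrounding `<html>` wrapper.

```java
import java.util.List;

// Hypothetical node classes and generator; Node.java and HtmlGenerator.java may differ.
final class GeneratorSketch {
    interface Node {}

    static final class Heading implements Node {
        final int level;
        final String text;
        Heading(int level, String text) { this.level = level; this.text = text; }
    }

    static final class Paragraph implements Node {
        final String text;
        Paragraph(String text) { this.text = text; }
    }

    static final class UnorderedList implements Node {
        final List<String> items;
        UnorderedList(List<String> items) { this.items = items; }
    }

    // Emits one HTML element per top-level node, in document order.
    static String generateBody(List<Node> document) {
        StringBuilder html = new StringBuilder();
        for (Node node : document) {
            if (node instanceof Heading) {
                Heading h = (Heading) node;
                html.append("<h").append(h.level).append('>')
                    .append(h.text)
                    .append("</h").append(h.level).append(">\n");
            } else if (node instanceof Paragraph) {
                html.append("<p>").append(((Paragraph) node).text).append("</p>\n");
            } else if (node instanceof UnorderedList) {
                html.append("<ul>\n");
                for (String item : ((UnorderedList) node).items) {
                    html.append("  <li>").append(item).append("</li>\n");
                }
                html.append("</ul>\n");
            }
        }
        return html.toString();
    }
}
```

Each node type maps to exactly one HTML element, which is why the generator needs no knowledge of Markdown syntax.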

Project Structure

markdown_compiler_java/
├── Main.java
├── Lexer.java
├── Parser.java
├── Token.java
├── TokenType.java
├── Node.java
├── HtmlGenerator.java
├── test1.md
├── test2.md
├── test3.md
├── test4.md
├── test5.md
└── README.md

File Descriptions

Main.java

Main.java is the main entry point of the program.

It reads the input Markdown file, runs the lexer, parser, and HTML generator, and writes the final HTML output file.

It also supports optional debug flags for displaying tokens and the AST.

Lexer.java

Lexer.java performs lexical analysis.

It reads the Markdown text and converts each line into tokens.

For example:

# Title

becomes:

Token(HEADING, level=1, text=Title)

Token.java

Token.java represents a single token produced by the lexer.

Each token contains:

  • token type
  • text value
  • heading level, if the token is a heading
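
These three fields could be modeled as in the sketch below. It assumes Java 16 or later for records, and the names `TokenSketch` and `TokenTypeSketch`, the two factory methods, and the use of level 0 for non-heading tokens are illustrative choices rather than the contents of Token.java.

```java
// Hypothetical sketch; assumes Java 16+ for records.
enum TokenTypeSketch { HEADING, TEXT, LIST_ITEM, BLANK_LINE, EOF }

record TokenSketch(TokenTypeSketch type, String text, int level) {

    // Factory for tokens that are not headings; level 0 means "no level".
    static TokenSketch of(TokenTypeSketch type, String text) {
        return new TokenSketch(type, text, 0);
    }

    // Factory for heading tokens; rejects levels outside # to ######.
    static TokenSketch heading(int level, String text) {
        if (level < 1 || level > 6) {
            throw new IllegalArgumentException("heading level must be 1-6");
        }
        return new TokenSketch(TokenTypeSketch.HEADING, text, level);
    }

    // Prints headings with their level, other tokens without it.
    @Override
    public String toString() {
        return type == TokenTypeSketch.HEADING
            ? "Token(" + type + ", level=" + level + ", text=" + text + ")"
            : "Token(" + type + ", text=" + text + ")";
    }
}
```

Calling `TokenSketch.heading(1, "Title").toString()` yields `Token(HEADING, level=1, text=Title)`, matching the notation used in this README.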

TokenType.java

TokenType.java defines the available token types used by the compiler.

The token types are:

HEADING
TEXT
LIST_ITEM
BLANK_LINE
EOF

Parser.java

Parser.java performs syntax analysis.

It reads the token list and creates an Abstract Syntax Tree.

It groups related tokens, such as multiple list items, into one AST node.
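
The grouping step might look like the following sketch. All names are hypothetical, the tree uses a single generic `Node` class with a text label to keep the example short, and joining consecutive TEXT tokens into one paragraph is an assumption about how Parser.java behaves.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types; the real Token, Node, and Parser classes may differ.
final class ParserSketch {
    enum Type { HEADING, TEXT, LIST_ITEM, BLANK_LINE, EOF }

    static final class Tok {
        final Type type;
        final int level;
        final String text;
        Tok(Type type, int level, String text) {
            this.type = type;
            this.level = level;
            this.text = text;
        }
    }

    // A generic tree node: a label plus child nodes.
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
    }

    static Node parse(List<Tok> tokens) {
        Node document = new Node("DocumentNode");
        int i = 0;
        while (i < tokens.size() && tokens.get(i).type != Type.EOF) {
            Tok t = tokens.get(i);
            if (t.type == Type.HEADING) {
                document.children.add(
                    new Node("HeadingNode(level=" + t.level + ", text=" + t.text + ")"));
                i++;
            } else if (t.type == Type.LIST_ITEM) {
                // Consume the whole run of consecutive list items as one list node.
                Node list = new Node("UnorderedListNode");
                while (i < tokens.size() && tokens.get(i).type == Type.LIST_ITEM) {
                    list.children.add(new Node("ListItem(" + tokens.get(i).text + ")"));
                    i++;
                }
                document.children.add(list);
            } else if (t.type == Type.TEXT) {
                // Assumption: adjacent text lines belong to the same paragraph.
                StringBuilder text = new StringBuilder();
                while (i < tokens.size() && tokens.get(i).type == Type.TEXT) {
                    if (text.length() > 0) text.append(' ');
                    text.append(tokens.get(i).text);
                    i++;
                }
                document.children.add(new Node("ParagraphNode(text=" + text + ")"));
            } else {
                i++; // BLANK_LINE separates blocks and produces no node
            }
        }
        return document;
    }
}
```

Three LIST_ITEM tokens in a row therefore produce a single UnorderedListNode with three children, which is what lets the generator emit one `<ul>` element per list.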

Node.java

Node.java contains the classes used to represent the Abstract Syntax Tree.

The AST node classes are:

DocumentNode
HeadingNode
ParagraphNode
UnorderedListNode
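
One way to model these four classes, together with a printer similar in spirit to the `--ast` debug output, is sketched below. It assumes Java 16 or later (records and pattern matching for `instanceof`). The class names come from the list above, while the fields, the `NodeSketch` interface, and `AstPrinterSketch` are assumptions for illustration.

```java
import java.util.List;

// Hypothetical sketch; assumes Java 16+. Field choices are illustrative.
interface NodeSketch {}
record HeadingNode(int level, String text) implements NodeSketch {}
record ParagraphNode(String text) implements NodeSketch {}
record UnorderedListNode(List<String> items) implements NodeSketch {}
record DocumentNode(List<NodeSketch> children) implements NodeSketch {}

final class AstPrinterSketch {
    // Renders the tree as indented text, one node per line.
    static String print(NodeSketch node, int depth) {
        String indent = "  ".repeat(depth);
        if (node instanceof DocumentNode doc) {
            StringBuilder out = new StringBuilder(indent + "DocumentNode\n");
            for (NodeSketch child : doc.children()) {
                out.append(print(child, depth + 1));
            }
            return out.toString();
        }
        if (node instanceof HeadingNode h) {
            return indent + "HeadingNode(level=" + h.level() + ", text=" + h.text() + ")\n";
        }
        if (node instanceof ParagraphNode p) {
            return indent + "ParagraphNode(text=" + p.text() + ")\n";
        }
        if (node instanceof UnorderedListNode list) {
            return indent + "UnorderedListNode(items=" + list.items() + ")\n";
        }
        throw new IllegalArgumentException("unknown node: " + node);
    }
}
```

Only DocumentNode has child nodes in this sketch; the other three are leaves that carry their own text.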

HtmlGenerator.java

HtmlGenerator.java performs code generation.

It converts the AST into valid HTML output.

It also handles inline Markdown formatting such as bold text, italic text, and inline code.
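
Inline formatting of this kind can be approximated with regular expressions, as in the sketch below. `InlineSketch.format` is a hypothetical name, the regex approach is an assumption rather than the method HtmlGenerator.java is known to use, and the input is assumed to be HTML-escaped already so that the tags added here are not escaped afterwards.

```java
// Hypothetical sketch; the project's generator may use a different technique.
final class InlineSketch {
    // Converts `code`, **bold**, and *italic* markers into HTML tags.
    // Bold is handled before italic so that "**" is not read as two
    // separate italic markers.
    static String format(String escapedText) {
        String s = escapedText.replaceAll("`([^`]+)`", "<code>$1</code>");
        s = s.replaceAll("\\*\\*(.+?)\\*\\*", "<strong>$1</strong>");
        s = s.replaceAll("\\*(.+?)\\*", "<em>$1</em>");
        return s;
    }
}
```

A known limitation of this simple ordering is that asterisks inside an inline code span are still treated as formatting markers; a fuller implementation would scan the text once and skip code spans.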

Requirements

To run this project, the following are required:

  • A Java Development Kit (JDK) installed, which provides both the java and javac commands
  • Terminal or command line
  • Vim or any text editor

To check the Java version:

java -version

To check the Java compiler version:

javac -version

How to Compile the Project

Inside the project folder, run:

javac *.java

This compiles all Java source files.

How to Run the Compiler

Use the following command:

java Main input.md output.html

Example:

java Main test1.md test1.html

This command reads test1.md and creates test1.html.

Debug Options

The program includes debug options to display the tokens and AST in the terminal.

To show tokens:

java Main test1.md test1.html --tokens

To show the AST:

java Main test1.md test1.html --ast

To show both tokens and AST:

java Main test1.md test1.html --tokens --ast

These options are useful for demonstrating the compiler design process.
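
Handling these arguments could be as simple as the sketch below. The flags `--tokens` and `--ast` and the argument order come from the commands above; the class `CliSketch`, its field names, and the usage message are hypothetical and not taken from Main.java.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of command-line handling; the real Main.java may differ.
final class CliSketch {
    final String inputPath;
    final String outputPath;
    final boolean showTokens;
    final boolean showAst;

    CliSketch(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException(
                "Usage: java Main input.md output.html [--tokens] [--ast]");
        }
        inputPath = args[0];
        outputPath = args[1];
        // Everything after the two file names is treated as an optional flag.
        List<String> flags = Arrays.asList(args).subList(2, args.length);
        showTokens = flags.contains("--tokens");
        showAst = flags.contains("--ast");
    }
}
```

Because the flags are looked up by name, they can be given in either order after the two file names.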

Test Cases Included in the Project

The project contains 5 test cases.

Each test case checks a different part of the Markdown-to-HTML compiler.

Test Case 1: Basic Markdown Document

File name:

test1.md

Content:

# My Markdown Page

This is a simple paragraph.

## Features

This text has **bold** and *italic* words.

- Apple
- Banana
- Orange

Purpose:

This test case checks basic compiler functionality, including:

  • Heading 1
  • Heading 2
  • Paragraphs
  • Bold text
  • Italic text
  • Unordered lists

Command:

java Main test1.md test1.html

Test Case 2: Headings Test

File name:

test2.md

Content:

# Headings Test

## Section Two

### Section Three

This file tests multiple heading levels.

Purpose:

This test case checks whether the compiler can correctly process different heading levels.

Command:

java Main test2.md test2.html

Test Case 3: Text Formatting Test

File name:

test3.md

Content:

# Text Formatting Test

This paragraph contains **bold text**.

This paragraph contains *italic text*.

This paragraph contains `inline code`.

This paragraph contains **bold**, *italic*, and `code` together.

Purpose:

This test case checks inline formatting features, including:

  • Bold text
  • Italic text
  • Inline code
  • Multiple formatting types in one paragraph

Command:

java Main test3.md test3.html

Test Case 4: List Test

File name:

test4.md

Content:

# List Test

Shopping list:

- Apples
- Bananas
- Oranges
- Bread
- Milk

End of list.

Purpose:

This test case checks unordered list compilation.

Command:

java Main test4.md test4.html

Test Case 5: Mixed Markdown Test

File name:

test5.md

Content:

# Mixed Markdown Test

Welcome to my compiler demo.

## Features

This compiler supports **bold text**, *italic text*, and `inline code`.

- Lexer
- Parser
- AST
- HTML Generator

This is the final test file.

Purpose:

This test case checks several compiler features together in one file:

  • Headings
  • Paragraphs
  • Bold text
  • Italic text
  • Inline code
  • Unordered lists

Command:

java Main test5.md test5.html

Running All Test Cases

To run all 5 test cases, use the following commands:

java Main test1.md test1.html
java Main test2.md test2.html
java Main test3.md test3.html
java Main test4.md test4.html
java Main test5.md test5.html

After running these commands, the following HTML files will be generated:

test1.html
test2.html
test3.html
test4.html
test5.html

Viewing the Output

To view the generated HTML code in the terminal:

cat test1.html

To open the generated HTML file in a browser on Linux:

xdg-open test1.html

On macOS:

open test1.html

On Windows:

start test1.html

Example Terminal Demo

A good project demonstration can be done on Linux using the following commands (on macOS or Windows, replace xdg-open with open or start, as described above):

javac *.java
cat test5.md
java Main test5.md test5.html --tokens --ast
cat test5.html
xdg-open test5.html