Simple Markdown Parser that returns an Abstract Syntax Tree (AST) with location information.
npm install @snapp-notes/markdown-parser

- 📝 Parse markdown into a structured AST
- 📍 Location tracking for every node
- 🎯 Support for common markdown elements:
  - Headers (H1-H6)
  - Code blocks with language specification
  - Bold text (`**` and `__`)
  - Italic text (`*` and `_`)
  - Inline links
  - List items
  - Plain text
- 🚀 Built with PEG.js/Peggy for reliable parsing
- 📦 ES Module support
- 💪 TypeScript definitions included

import { parse } from '@snapp-notes/markdown-parser';
const markdown = '# Hello World\nThis is **bold** text.';
const ast = parse(markdown);
console.log(ast);

Output:
[
  {
    type: 'header',
    content: '# Hello World',
    level: 1,
    loc: { start: { offset: 0, line: 1, column: 1 }, end: { ... } }
  },
  {
    type: 'text',
    content: '\n',
    loc: { ... }
  },
  {
    type: 'text',
    content: 'This is '
  },
  {
    type: 'bold',
    content: '**bold**',
    loc: { ... }
  },
  {
    type: 'text',
    content: ' text.'
  }
]

import { parse } from '@snapp-notes/markdown-parser';
const ast = parse('# H1\n## H2\n### H3');
// Each header node contains:
// - type: 'header'
// - content: full header text including # symbols
// - level: number (1-6)
// - loc: location information

import { parse } from '@snapp-notes/markdown-parser';
const markdown = `\`\`\`javascript
const greeting = "Hello";
console.log(greeting);
\`\`\``;
const ast = parse(markdown);
// Code node contains:
// - type: 'code'
// - content: code content (includes leading newline)
// - language: 'javascript' (or empty string if not specified)
// - loc: location information

import { parse } from '@snapp-notes/markdown-parser';
// Bold text
parse('**bold text**'); // or '__bold text__'
// Italic text
parse('*italic text*'); // or '_italic text_'
// Mixed formatting
const ast = parse('This is **bold** and *italic* text');

import { parse } from '@snapp-notes/markdown-parser';
const ast = parse('[Google](https://google.com)');
// Link node contains:
// - type: 'link'
// - text: 'Google'
// - url: 'https://google.com'
// - content: '[Google](https://google.com)'
// - loc: location information

import { parse } from '@snapp-notes/markdown-parser';
const markdown = `* Item 1
* Item 2
* Item 3`;
const ast = parse(markdown);
// List nodes contain:
// - type: 'list'
// - content: '* Item text'
// - loc: location information

import { parse } from '@snapp-notes/markdown-parser';
const markdown = `# My Document
This is a paragraph with **bold** and *italic* text.
Visit [my website](https://example.com) for more info.
\`\`\`python
def hello():
    print("Hello, World!")
\`\`\`
* Feature 1
* Feature 2
`;
const ast = parse(markdown);
// The AST will contain a mix of different node types
ast.forEach(node => {
  console.log(`${node.type}: ${node.content?.substring(0, 30)}...`);
});

`parse(input, options)`: Parses a markdown string and returns an array of AST nodes.
Parameters:
- `input` (string): The markdown text to parse
- `options` (optional): Parser options
  - `startRule` (optional): The grammar rule to start parsing from (default: 'start')
Returns: An array of MarkdownNode objects
Throws: SyntaxError if the input cannot be parsed
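
Because `parse` throws a `SyntaxError` on input it cannot handle, callers may prefer a wrapper that returns a result object instead. The following is a minimal sketch, not part of the package: `safeParse` and `stubParse` are hypothetical names, and `stubParse` only stands in for the real `parse` so the snippet runs on its own.

```javascript
// Hypothetical helper (not part of the package): wraps any parse function
// and returns a result object instead of throwing on syntax errors.
function safeParse(parseFn, input) {
  try {
    return { ok: true, ast: parseFn(input) };
  } catch (err) {
    // The name check also covers parsers that define their own SyntaxError class.
    if (err instanceof SyntaxError || (err && err.name === 'SyntaxError')) {
      return { ok: false, error: err.message };
    }
    throw err; // unrelated errors are re-thrown unchanged
  }
}

// Stand-in for the real `parse`, used only to keep this example self-contained.
function stubParse(input) {
  if (typeof input !== 'string') {
    throw new SyntaxError('Expected a markdown string');
  }
  return [{ type: 'text', content: input }];
}

const good = safeParse(stubParse, 'plain text');
const bad = safeParse(stubParse, null);
console.log(good.ok, bad.ok, bad.error);
```

In real use, the `parse` function imported from the package would be passed in place of `stubParse`.
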
interface TextNode {
  type: 'text' | 'bold' | 'italic' | 'list';
  content: string;
  loc: Location;
}

Used for plain text, bold text, italic text, and list items.
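
Since `content` keeps the original markers (for example `**bold**` or `* Item 1`), producing plain text takes a small stripping step. The sketch below assumes the markers appear exactly as in the examples in this document; `innerText` is a hypothetical helper, and the sample nodes are hand-written with `loc` omitted for brevity.

```javascript
// Hypothetical helper: removes the markdown markers from a text-like node.
function innerText(node) {
  switch (node.type) {
    case 'bold':
      return node.content.slice(2, -2); // drop ** or __ on both sides
    case 'italic':
      return node.content.slice(1, -1); // drop * or _ on both sides
    case 'list':
      return node.content.replace(/^\*\s+/, ''); // drop the leading "* "
    default:
      return node.content; // plain text is returned unchanged
  }
}

// Hand-written nodes shaped like the documented TextNode (loc omitted).
const nodes = [
  { type: 'text', content: 'This is ' },
  { type: 'bold', content: '**bold**' },
  { type: 'text', content: ' and ' },
  { type: 'italic', content: '_italic_' },
];
console.log(nodes.map(innerText).join('')); // prints "This is bold and italic"
```
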
interface HeaderNode {
  type: 'header';
  content: string;
  level: number; // 1-6
  loc: Location;
}

interface CodeNode {
  type: 'code';
  content: string;
  language?: string;
  loc: Location;
}

Note: The content includes a leading newline character.
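
Given that leading newline, code that displays or executes the block contents will usually want to trim it first. A minimal sketch follows; `codeBody` is a hypothetical helper, and the sample node is hand-written to match the documented `CodeNode` shape (with `loc` omitted), not captured from the parser.

```javascript
// Hypothetical helper: returns the code without the documented leading newline.
// Content that does not start with a newline is returned unchanged.
function codeBody(node) {
  return node.content.replace(/^\r?\n/, '');
}

// Hand-written node shaped like the documented CodeNode (loc omitted).
const codeNode = {
  type: 'code',
  language: 'javascript',
  content: '\nconst greeting = "Hello";',
};
console.log(codeBody(codeNode)); // prints: const greeting = "Hello";
```
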
interface LinkNode {
  type: 'link';
  text: string;
  url: string;
  content: string;
  loc: Location;
}

interface Location {
  start: Position;
  end: Position;
}

interface Position {
  offset: number; // Character offset from start
  line: number; // Line number (1-based)
  column: number; // Column number (1-based)
}

| Element | Syntax | Example |
|---|---|---|
| Header | `#` to `######` | `# Title` |
| Bold | `**text**` or `__text__` | `**bold**` |
| Italic | `*text*` or `_text_` | `*italic*` |
| Link | `[text](url)` | `[Google](https://google.com)` |
| Code Block | ```` ```lang\ncode\n``` ```` | ```` ```js\ncode\n``` ```` |
| List Item | `* item` | `* Item 1` |

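
The `loc` offsets described by the `Location` and `Position` interfaces can be used to map any node back to the source text it came from. The sketch below assumes `loc.end.offset` is exclusive, as in Peggy's `location()` convention; `sourceOf` is a hypothetical helper, and the node is hand-written with offsets computed by hand for this input.

```javascript
// Hypothetical helper: returns the slice of the source that a node covers.
// Assumes loc.end.offset is exclusive (Peggy's location() convention).
function sourceOf(input, node) {
  return input.slice(node.loc.start.offset, node.loc.end.offset);
}

const input = '# Hello World\nThis is **bold** text.';

// Hand-written node; the offsets were computed by hand for the input above.
const boldNode = {
  type: 'bold',
  content: '**bold**',
  loc: {
    start: { offset: 22, line: 2, column: 9 },
    end: { offset: 30, line: 2, column: 17 },
  },
};
console.log(sourceOf(input, boldNode)); // prints "**bold**"
```
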
- Nested formatting (e.g., bold within italic) is not fully supported
- Only unordered lists with `*` are supported
- No support for:
  - Blockquotes
  - Tables
  - Images
  - Horizontal rules
  - Strikethrough
  - Task lists

Generate the parser from the grammar file:
npm run build

Run the test suite:
npm test

Watch mode for development:
npm run test:watch

The parser is built using Peggy (formerly PEG.js). The grammar file is located at src/grammar.peggy.
To modify the parser, edit the grammar file and rebuild:
npm run build

Contributions are welcome! Please ensure all tests pass before submitting a pull request.
npm run build
npm test

Copyright (c) 2025 Jakub T. Jankiewicz
Released under MIT license