[extract_toc] Add Pandoc support and improve documentation.

gw0 committed Jul 25, 2014
1 parent 98a83bc commit 7fbbe6906acc636542688e2891c8219f3ad7fd6d
Showing with 64 additions and 37 deletions.
  1. +55 −28 extract_toc/README.md
  2. +9 −9 extract_toc/extract_toc.py
extract_toc/README.md
@@ -2,7 +2,7 @@ Extract Table of Content
========================
A Pelican plugin to extract table of contents (ToC) from `article.content` and
place it in its own `article.toc` variable.
place it in its own `article.toc` variable for use in templates.
Copyright (c) Talha Mansoor
@@ -12,22 +12,28 @@ Author Email | talha131@gmail.com
Author Homepage | http://onCrashReboot.com
Github Account | https://github.com/talha131
Acknowledgement
---------------
Thanks to [Avaris](https://github.com/avaris) for going out of the way to help
me fix Unicode issues and doing a thorough code review.
Thanks to [gw0](http://gw.tnode.com/) for adding Pandoc reader support.
Why do you need it?
===================
Pelican can generate ToC of reST and Markdown files, using markup's respective
directive and extension. ToC is generated and placed at the beginning of
`article.content`. You cannot place the ToC in `<nav>` HTML5 tag, nor can you
place the ToC at the end of your article's content because ToC is part of
`article.content`.
directive and extension. Such a ToC is generated and placed at the beginning of
`article.content` as part of the same string. Consequently it cannot be placed
anywhere else on the page (e.g. in a `<nav>` HTML5 tag, in the header, or at the
end of your article's content).
To solve this problem, this plugin extracts ToC from `article.content` and
places it in its own `article.toc` variable for use in templates.
This plugin extracts ToC from `article.content` and places it in `article.toc`.
Requirements
============
@@ -38,26 +44,40 @@ Requirements
pip install beautifulsoup4
```
How to Use
==========
**Important!** This plugin only works with reST and Markdown files. reST files
should have `.rst` extension. Markdown files can have `.md`, `.mkd` or
`markdown`.
This plugin works by extracting the first occurrence of a ToC enclosed in:
- `<div class="toc">` for the default Markdown reader
- `<div class="contents topic">` for the default reStructuredText reader
- `<nav id="TOC">` for the Pandoc reader
If a ToC appears in more than one place in your article, `extract_toc` removes
only the first occurrence. You probably do not need multiple ToCs in an
article. If you need to display the ToC more than once, print it from your
template.
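The first-occurrence behavior can be illustrated with a small standalone sketch. It assumes only that `beautifulsoup4` is installed; the sample HTML and the variable names are made up for illustration and are not part of the plugin.

```python
from bs4 import BeautifulSoup

# Hypothetical rendered article containing the same Markdown ToC twice.
html = (
    '<div class="toc"><ul><li><a href="#intro">Intro</a></li></ul></div>'
    '<h2 id="intro">Intro</h2><p>Text.</p>'
    '<div class="toc"><ul><li><a href="#intro">Intro</a></li></ul></div>'
)

soup = BeautifulSoup(html, 'html.parser')

# find() returns only the first matching element (or None if there is none).
first_toc = soup.find('div', class_='toc')

# extract() detaches that element from the tree and returns it.
first_toc.extract()

extracted_toc = first_toc.decode()   # what would end up in article.toc
remaining_body = soup.decode()       # what would stay in article.content
remaining_count = len(soup.find_all('div', class_='toc'))
```

After this runs, the second ToC is still part of `remaining_body`, which is why a template is the better place to print the ToC more than once.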
ToC generated by Markdown is enclosed in `<div class="toc">`. On the other hand
ToC generated by reST is enclosed in `<div class="contents topic">`.
`extract_toc` relies on this behavior to work.
reStructuredText Example
------------------------
Template example
----------------
Add something like this to your Pelican templates if missing:
```jinja
{% if article.toc %}
<nav class="toc">
{{ article.toc }}
</nav>
{% endif %}
```
reStructuredText reader
-----------------------
To add a table of contents to your reStructuredText document you need to add a 'contents directive' at the place where you want the table of contents to appear. See the [documentation](http://docutils.sourceforge.net/docs/ref/rst/directives.html#table-of-contents) for more details.
To add a table of contents to your reStructuredText document (`.rst`) you need to add a `.. contents::` directive to its beginning. See the [docutils documentation](http://docutils.sourceforge.net/docs/ref/rst/directives.html#table-of-contents) for more details.
```rst
My super title
@@ -79,14 +99,15 @@ Heading 1
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa.
```
Markdown Example
----------------
To add a table of contents to your Markdown document you need to place the 'TOC marker' at the place where you would like the table of contents to appear. See the Python Markdown [documentation](http://pythonhosted.org/Markdown/extensions/toc.html) for more details.
Markdown reader
---------------
To enable table of contents generation for the Markdown reader you need to set `MD_EXTENSIONS = (['toc'])` in your Pelican configuration file.
Important! To enable table of contents generation for the markdown reader you need to set `MD_EXTENSIONS = (['toc'])` in your pelican configuration file.
To add a table of contents to your Markdown document (`.md`) you need to place the `[TOC]` marker at its beginning. See the [Python Markdown documentation](http://pythonhosted.org/Markdown/extensions/toc.html) for more details.
```Markdown
```markdown
title: My super title
date: 4-4-2013
tags: thats, awesome
@@ -98,13 +119,19 @@ tags: thats, awesome
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa.
```
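As a rough sketch, the Markdown setting described above might sit in a Pelican configuration file like this. The `PLUGINS` line is an assumption about how the plugin is made importable in your setup; it is not taken from this README.

```python
# pelicanconf.py (fragment)

# Setting named in the README. The outer parentheses in (['toc']) are
# redundant in Python, so a plain list is equivalent.
MD_EXTENSIONS = ['toc']

# Assumption: the plugin is importable under the name 'extract_toc'
# (how plugin search paths are configured depends on your Pelican version).
PLUGINS = ['extract_toc']
```

With the `toc` extension enabled, the `[TOC]` marker in the document is rendered as `<div class="toc">`, which is the element the plugin looks for.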
Template Example
================
```python
{% if article.toc %}
<nav class="affix">
{{ article.toc }}
</nav>
{% endif %}
Pandoc reader
-------------
To enable table of contents generation for the Pandoc reader you need to set `PANDOC_ARGS = (['--toc', '--template=pandoc-template-toc'])` in your Pelican configuration file.
Contents of the Pandoc template file `pandoc-template-toc.html5`:
```html
$if(toc)$
<nav id="TOC">
$toc$
</nav>
$endif$
$body$
```
extract_toc/extract_toc.py
@@ -1,9 +1,10 @@
# -*- coding: utf-8 -*-
"""
Extract Table of Content
========================
This plugin allows you to extract table of contents (ToC) from article.content
and place it in its own article.toc variable.
A Pelican plugin to extract table of contents (ToC) from `article.content` and
place it in its own `article.toc` variable for use in templates.
"""
from os import path
@@ -14,16 +15,15 @@
def extract_toc(content):
if isinstance(content, contents.Static):
return
soup = BeautifulSoup(content._content,'html.parser')
filename = content.source_path
extension = path.splitext(filename)[1][1:]
toc = ''
# if it is a Markdown file
if extension in readers.MarkdownReader.file_extensions:
toc = None
if not toc: # default Markdown reader
toc = soup.find('div', class_='toc')
# else if it is a reST file
elif extension in readers.RstReader.file_extensions:
if not toc: # default reStructuredText reader
toc = soup.find('div', class_='contents topic')
if not toc: # Pandoc reader
toc = soup.find('nav', id='TOC')
if toc:
toc.extract()
content._content = soup.decode()
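Because this view interleaves removed and added lines, the resulting lookup order can be hard to read. Below is a minimal standalone sketch of the fallback chain the commit appears to introduce: instead of branching on the file extension, it tries each known ToC container in turn. It assumes `beautifulsoup4` is installed; `split_toc` and `CANDIDATES` are hypothetical names used for illustration, not part of the plugin.

```python
from bs4 import BeautifulSoup

# Containers tried in order; mirrors the three soup.find() calls in the diff.
CANDIDATES = (
    ('div', {'class_': 'toc'}),             # default Markdown reader
    ('div', {'class_': 'contents topic'}),  # default reStructuredText reader
    ('nav', {'id': 'TOC'}),                 # Pandoc reader
)


def split_toc(html):
    """Return (toc_html, body_html); toc_html is None when no ToC is found."""
    soup = BeautifulSoup(html, 'html.parser')
    toc = None
    for name, attrs in CANDIDATES:
        toc = soup.find(name, **attrs)
        if toc is not None:
            break
    if toc is None:
        # Nothing to extract: leave the content untouched.
        return None, html
    toc.extract()  # detach the ToC element from the document tree
    return toc.decode(), soup.decode()


# Usage with made-up Pandoc-style output.
toc_html, body_html = split_toc(
    '<nav id="TOC"><ul><li>A</li></ul></nav><p>Body</p>'
)
```

One consequence of this design is that the check no longer depends on which reader produced the HTML, so any reader that emits one of these containers is handled.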
