# Creating a Static Site Generator with Flask
A static site has no server-side logic; it is simply a collection of one or more HTML documents that are linked together. While limited in functionality, a static site may be all that is necessary. This blog is itself a static site, and it was created by the very process I am about to outline.

Just a couple of weeks before writing this post, I decided that I wanted to create a blog that would serve as a creative outlet while also showcasing my knowledge and abilities. While searching for ideas on how to build it, I stumbled upon <a href="https://pages.github.com/">GitHub Pages</a>, which allows GitHub users to create static pages that are hosted from a special repository named <code>&lt;your_username&gt;.github.io</code>. This is great because it lets GitHub users serve up new content with a simple <code>git push</code> command without having to find a hosting service.

The next step was to figure out how to set up a process by which I could create new content and quickly integrate it with my existing blog.

<h3><b>Enter the static site generator.</b></h3>

Static site generators leverage the templating capability of a dynamic site's server to produce a set of static resources with uniform styling, without requiring you to copy and paste the same navigation bar, sidebar, footer, etc. over and over again. There is a multitude of options when it comes to selecting a static site generator. <a href="https://jekyllrb.com/">Jekyll</a>, built with Ruby, is one of the most popular and is the generator GitHub itself recommends for GitHub Pages. Another I considered was <a href="https://blog.getpelican.com/">Pelican</a>, which is built with Python. After an initial survey of my options, I decided that none of them fit my needs, which were to have a simple yet customizable generator that was not opinionated about the layout of the blog.

It was at this point that I realized I should just use Flask to render Jinja templates with HTML created by converting the Jupyter notebooks containing my blog content. I knew immediately that this was the best approach for me but wasn't quite sure how best to go about it. Thankfully, Nicolas Perriault has an <a href="https://nicolas.perriault.net/code/2012/dead-easy-yet-powerful-static-website-generator-with-flask/">excellent blog post</a> on how to easily achieve exactly what I wanted.

Here I'll summarize the key points of his blog post and detail the adaptations I made to suit my needs.

## My GitHub Pages User Page
First of all, I'll provide a link to my <a href="https://github.com/a-rich/a-rich.github.io">repository</a> which houses this blog. We'll be focusing on <a href="https://github.com/a-rich/a-rich.github.io/blob/master/convert.py">convert.py</a> for turning Jupyter notebooks into blog post content and <a href="https://github.com/a-rich/a-rich.github.io/blob/master/sitebuilder.py">sitebuilder.py</a> for the creation of the static resources.

Creating a new blog post starts with writing a Jupyter notebook, which is then converted to HTML and cleaned up with BeautifulSoup, which also inserts Bootstrap CSS classes. The resulting HTML document is stored in a particular directory from which <a href="https://pythonhosted.org/Flask-FlatPages/"><code>Flask-FlatPages</code></a> loads it. The Flask application in <code>sitebuilder.py</code> has an index route used to generate a link for every blog post, about page, and topic page captured by FlatPages. The application also has a page route that's hit by every link generated by the index route. The page route renders one of three templates depending on the <code>path</code> URL parameter. The final step is to take a snapshot of the page/link graph that makes up the blog site using <a href="https://pythonhosted.org/Frozen-Flask/"><code>Frozen-Flask</code></a>. All of this combined enables a one-command pipeline from Jupyter notebook to updated blog site.

## Step-by-Step


### Convert Jupyter notebook to HTML
Jupyter has a convenient CLI command for converting a notebook into HTML. The <code>--to</code> flag specifies the output file format, while the <code>--template</code> flag specifies whether we want a basic or full output file (if you're using something like <a href="https://github.com/dunovank/jupyter-themes">jupyterthemes</a> and you want to capture these themes for your blog, use the full template).

In [None]:
jupyter-nbconvert <path/to/notebook.ipynb> --to html --template basic

We include this at the beginning of the <code>convert.py</code> script as a subprocess call. 

In [None]:
subprocess.call(['jupyter-nbconvert', infile, '--to', 'html', '--template', 'basic'])
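
As a hedged sketch of that call, the helper below wraps the same command with `subprocess.run(..., check=True)` so that a failed conversion raises an error instead of passing silently. The names `nbconvert_command` and `convert_notebook` and the `runner` parameter are hypothetical, and the returned path assumes nbconvert writes the HTML next to the notebook under the same base name.

```python
import subprocess
from pathlib import Path


def nbconvert_command(notebook, template="basic"):
    """Build the same argument list as the subprocess call above."""
    return ["jupyter-nbconvert", str(notebook), "--to", "html",
            "--template", template]


def convert_notebook(notebook, runner=subprocess.run):
    """Run nbconvert and return the path where the HTML is expected.

    `runner` is injectable so the function can be exercised without
    Jupyter installed; by default it is subprocess.run.
    """
    # check=True raises CalledProcessError if nbconvert exits non-zero.
    runner(nbconvert_command(notebook), check=True)
    # Assumption: the output lands beside the notebook as <name>.html.
    return Path(notebook).with_suffix(".html")
```

Returning the expected HTML path gives the BeautifulSoup step that follows an explicit file to open.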

### Clean up notebook HTML using BeautifulSoup

In [None]:
# Parse the HTML produced by nbconvert (infile is assumed to point at it here)
with open(infile, 'r') as f:
    soup = BeautifulSoup(f, 'html.parser')

# Remove the anchor-link characters at the end of markdown headers
for a in soup.find_all('a', {'class': 'anchor-link'}):
    a.decompose()

# Remove prompt brackets before all executable cells
for i in soup.find_all('div', {'class': 'input'}):
    prompt = i.find('div', {'class': 'prompt input_prompt'})
    if prompt is not None:  # skip cells that have no prompt div
        prompt.decompose()

# Remove prompt brackets before all outputs
for o in soup.find_all('div', {'class': 'prompt output_prompt'}):
    o.decompose()

# Remove extra matplotlib.legend output line
for o in soup.find_all('div', {'class': 'output_text'}):
    pre = o.find('pre')
    if pre is not None and 'matplotlib' in pre.get_text():
        pre.decompose()

# Add card css class, padding/margins, color to all code
for c in soup.find_all('div', {'class': 'input_area'}):
    c['class'] = c.get('class', []) + ['card', 'px-2', 'pt-2', 'my-3']
    # style is a plain string attribute (unlike class), so concatenate strings
    c['style'] = 'background-color:#F7F7F9;' + c.get('style', '')

# Add card css class, padding/margins, color to all markdown
for m in [d for d in soup.find_all('div', {'class': 'inner_cell'})
        if 'input' not in d.parent['class']]:
    m['class'] = m.get('class', []) + ['bg-info', 'card', 'px-2', 'pt-2', 'my-3']

# Add card css class, padding/margins, color to all outputs
for o in soup.find_all('div', {'class': 'output'}):
    o['class'] = o.get('class', []) + ['card', 'px-2', 'pt-2', 'my-3']

# Change all "small" headers to the smallest
for size in ['4', '5']:
    for h in soup.find_all('h{}'.format(size)):
        h.name = 'h6'

# Change all "big" headers to smaller headers
for size in ['1', '2', '3']:
    for h in soup.find_all('h{}'.format(size)):
        h['class'] = h.get('class', []) + ['font-weight-bold']
        h.name = 'h{}'.format(int(size) + 3)
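
The order of the two header loops matters: if the big headers were demoted first, an `h1` would become an `h4` and then be collapsed to `h6` by the other loop. The net effect can be summarized by the small function below; `demoted_level` is a hypothetical helper for illustration and is not part of `convert.py`.

```python
def demoted_level(level):
    """Return the heading level a notebook heading ends up with."""
    if level in (4, 5):      # "small" headers collapse to the smallest
        return 6
    if level in (1, 2, 3):   # "big" headers shift down by three
        return level + 3
    return level             # h6 is left unchanged


for level in range(1, 7):
    print("h{} -> h{}".format(level, demoted_level(level)))
```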

The last part of <code>convert.py</code> prepends some optional YAML metadata to the notebook HTML and adds a <code>(title, date, path)</code> tuple to a manifest file that's used to update the recent blogs section in the sidebar of the blog.
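
That part of the script is not reproduced here, so the following is only a sketch of what such a step might look like. It assumes the layout Flask-FlatPages reads (YAML metadata, a blank line, then the body) and a JSON manifest holding `[title, date, path]` entries; the function names and the default manifest path are illustrative.

```python
import json
from pathlib import Path


def add_front_matter(html, title, date):
    """Prepend YAML metadata, separated from the body by a blank line."""
    # json.dumps yields a double-quoted string, which is also valid YAML,
    # so titles containing ':' or '#' are not misread.
    meta = "title: {}\ndate: {}\n".format(json.dumps(title), json.dumps(date))
    return meta + "\n" + html


def update_manifest(title, date, path, manifest_file="blog.manifest"):
    """Add or replace the entry for `path` in the JSON manifest."""
    manifest_file = Path(manifest_file)
    entries = []
    if manifest_file.exists():
        entries = json.loads(manifest_file.read_text())
    # Drop any earlier entry for the same post so re-runs don't duplicate it.
    entries = [e for e in entries if e[2] != path]
    entries.append([title, date, path])
    manifest_file.write_text(json.dumps(entries, indent=2))
    return entries
```

Keying the manifest on the post's path means converting the same notebook twice updates its entry instead of listing the post twice in the sidebar.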

## Use Flask-FlatPages to aggregate static resources and render a list of links to them 

### Imports, app configuration, index route

In [1]:
import os
import sys
import json
import shutil
from dateutil import parser
from flask import Flask, url_for, render_template
from flask_flatpages import FlatPages
from flask_frozen import Freezer

DEBUG = True
FLATPAGES_AUTO_RELOAD = DEBUG
FLATPAGES_EXTENSION = '.html'

app = Flask(__name__)
app.config.from_object(__name__)
pages = FlatPages(app)
freezer = Freezer(app)

@app.route('/')
def index():
    return render_template('index.html', pages=pages)

### index.html

In [None]:
<ul>
    {% for page in pages %}
        <li>
            <a href="{{ url_for("page", path=page.path) }}">{{ page.title }}</a>
        </li>
    {% else %}
        <li>No posts.</li>
    {% endfor %}
</ul>

### Page route
Each link rendered in the index.html template points to the application's other route, which renders one of three templates depending on the path of the FlatPage. There is an about.html template that renders the blog's landing page, a content.html template that renders all blog post pages including this one, and a blog_posts.html template that renders the list of all blog posts as well as the lists of blog posts for each individual topic.

In [None]:
@app.route('/<path:path>/')
def page(path):
    # Load the manifest of (title, date, path) entries, if one exists
    if os.path.exists('blog.manifest'):
        with open('blog.manifest', 'r') as f:
            manifest = json.load(f)
    else:
        manifest = []

    recent_posts = sorted(manifest, key=lambda x: parser.parse(x[1]), reverse=True)[:10]

    groups = {' '.join([s.capitalize() for s in group.split('-')]):
                  sorted([p for p in pages if p.path.split('/')[0] == group], key=lambda x: x.meta['title'])
              for group in sorted(set([p.path.split('/')[0] for p in pages if p.path.split('/')[0] != 'site']))
        }

    for k,v in {
        'about': ['about.html', [], ''],
        'all': ['blog_posts.html', [(k, [p for p in groups[k]]) for k in sorted(groups.keys())], 'Blog Posts'],
        'data-visualization': ['blog_posts.html', groups['Data Visualization'], 'Data Visualization'],
        'in-a-nutshell': ['blog_posts.html', groups['In A Nutshell'], 'In A Nutshell'],
        'machine-learning': ['blog_posts.html', groups['Machine Learning'], 'Machine Learning'],
        'python': ['blog_posts.html', groups['Python'], 'Python'],
        }.items():

        if k == path.split('/')[-1]:
            return render_template(v[0], pages=v[1], header=v[2], posts=recent_posts)

    page = pages.get_or_404(path).html
    return render_template('content.html', page=page, posts=recent_posts)
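
The `groups` comprehension is dense, so here is the same grouping idea unrolled over plain path strings. This is an illustrative sketch, not the code from `sitebuilder.py`: `group_label` and `group_paths` are hypothetical names, bare strings stand in for FlatPages page objects, and entries are sorted by path where the original sorts by title.

```python
from collections import defaultdict


def group_label(slug):
    """Turn a directory slug such as 'in-a-nutshell' into 'In A Nutshell'."""
    return " ".join(word.capitalize() for word in slug.split("-"))


def group_paths(paths, skip=("site",)):
    """Group page paths by their first segment, ignoring fixed pages."""
    groups = defaultdict(list)
    for path in paths:
        top = path.split("/")[0]
        if top not in skip:
            groups[group_label(top)].append(path)
    # Sort group names and the entries inside each group.
    return {label: sorted(groups[label]) for label in sorted(groups)}


paths = ["python/flask-site", "site/about", "machine-learning/knn",
         "python/decorators"]
print(group_paths(paths))
```

Because the labels are derived from directory names, the keys looked up in the route (for example `groups['Python']`) only exist once at least one page lives under the matching directory.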

## Freezing the app and cleaning up the directory
When Frozen-Flask freezes the app, it follows every link in index.html, hitting the page route and producing a fully rendered HTML page for each blog post and for the fixed pages (about and topics). Frozen-Flask places these static files into a directory called <code>build</code>. The final part of <code>sitebuilder.py</code> builds the static resources and then moves and removes files to tidy up the directory.

In [None]:
if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == 'build':
        freezer.freeze()
        for directory in os.listdir('pages'):
            if os.path.exists(directory):
                shutil.rmtree(directory)
        if os.path.exists('index.html'):
            os.remove('index.html')
        os.remove('build/index.html')
        shutil.rmtree('build/static')
        os.rename('build/site/about/index.html', 'index.html')
        shutil.rmtree('build/site/about')
        for directory in os.listdir('build'):
            os.rename('build/'+directory, directory)
        shutil.rmtree('build')
    else:
        app.run()
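
The cleanup sequence above can also be written as a function of a root directory, which makes it easy to try against a scratch directory before pointing it at the real repository. This is a sketch under that assumption: `promote_build` is a hypothetical name, it uses `shutil.move` in place of `os.rename`, and it clears colliding directories as it goes instead of consulting the `pages` directory.

```python
import shutil
from pathlib import Path


def promote_build(root):
    """Move Frozen-Flask's build/ output up into `root`."""
    root = Path(root)
    build = root / "build"

    # The frozen about page becomes the site's landing page.
    landing = root / "index.html"
    if landing.exists():
        landing.unlink()
    shutil.move(str(build / "site" / "about" / "index.html"), str(landing))
    shutil.rmtree(build / "site" / "about")

    # Drop the frozen link list and build/static, as the original script does.
    (build / "index.html").unlink()
    shutil.rmtree(build / "static")

    # Replace each previously published directory with the fresh one.
    for item in list(build.iterdir()):
        target = root / item.name
        if target.is_dir():
            shutil.rmtree(target)
        shutil.move(str(item), str(target))
    build.rmdir()
```

Calling `promote_build('.')` right after `freezer.freeze()` would correspond to the `build` branch above, with the differences already noted.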

## Final steps
We can add one more subprocess call to the end of <code>convert.py</code> that runs <code>sitebuilder.py</code> in build mode.

In [None]:
subprocess.call(['python3', 'sitebuilder.py', 'build'])

This way we can run one command to convert a Jupyter notebook into an HTML document and update the static files for the blog. Then we simply push the changes to the GitHub Pages User Page and we're done!