# Purpose

This notebook is a Python script that prepends a hyperlinked table of contents (TOC) to a generic Markdown file, built from that file's header hierarchy.

The workflow:

- Add TOC to Markdown file
- Create HTML from Markdown via pandoc
- Add stylesheet link and anchors to HTML file
- Push to GitHub


# Load libraries

In [1]:
from bs4 import BeautifulSoup
import fileinput
import string
import subprocess

# Add TOC to Markdown file

## Load Markdown file

In [2]:
#fname = input("Filename: ")
#fname = '/home/jtk/Site/REFS/Bash-basics.md'
#fname = '/home/jtk/Site/REFS/metadata.md'
fname = '/home/jtk/Site/REFS/R-basics.md'
fhand = open(fname, 'r')

## Save headers, excluding code chunks

In [3]:
headers = list()
code_flag = False

for row in fhand:
    if row.startswith("```"):
        # a fence line toggles whether we are inside a code chunk
        code_flag = not code_flag
    elif row.startswith('#') and not code_flag:
        # a '#' line outside a code chunk is a Markdown header
        headers.append(row)

fhand.close()

In [4]:
for h in headers:
    print(h)

# Environment

# Conventions

# Get more information

# Datatypes

## Vectors

## Lists

## Factors

## Matrices

## Dataframes

## Datetimes

# Control flow

# Functions & FP
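
The header scan above can also be packaged as a small function, which makes it easy to check against a made-up sample. This is a minimal sketch: the name `extract_headers`, the `FENCE` constant, and the sample lines are illustrative assumptions, not part of the notebook.

```python
FENCE = "`" * 3  # the three-backtick marker that opens or closes a code chunk

def extract_headers(lines):
    """Return Markdown header lines, skipping anything inside fenced code chunks."""
    headers = []
    in_code = False
    for row in lines:
        if row.startswith(FENCE):
            in_code = not in_code          # a fence line toggles the state
        elif row.startswith("#") and not in_code:
            headers.append(row.rstrip("\n"))
    return headers

# hypothetical sample: the '#' comment inside the chunk must not be picked up
sample = [
    "# Datatypes\n",
    FENCE + "r\n",
    "# a comment inside a code chunk\n",
    FENCE + "\n",
    "## Vectors\n",
]
print(extract_headers(sample))  # ['# Datatypes', '## Vectors']
```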



## Construct TOC from headers

In [5]:
TOC = list()

for h in headers:

    # split the header into its '#' marker and its words,
    # dropping the trailing newline first
    hsplit = h.rstrip().split(' ')

    # set the indentation: one tab per level below the top level
    hlevel = len(hsplit[0])
    space = "\t"*(hlevel-1)

    # set the anchor name and link text
    aname = "-".join(hsplit[1:]).lower()
    lname = " ".join(hsplit[1:])

    # construct (indented) bullet point
    TOC.append(space+'- ['+lname+'](#'+aname+')\n')

In [6]:
for item in TOC:
    print(item)

- [Environment](#environment)

- [Conventions](#conventions)

- [Get more information](#get-more-information)

- [Datatypes](#datatypes)

	- [Vectors](#vectors)

	- [Lists](#lists)

	- [Factors](#factors)

	- [Matrices](#matrices)

	- [Dataframes](#dataframes)

	- [Datetimes](#datetimes)

- [Control flow](#control-flow)

- [Functions & FP](#functions-&-fp)
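
The mapping from one header line to one TOC bullet can be sketched as a standalone function. The name `toc_entry` is hypothetical. The anchor scheme (lowercase, words joined by hyphens, punctuation left untouched) is the one this notebook uses, so `&` survives in the anchor, as in the last entry above.

```python
def toc_entry(header):
    """Turn one Markdown header line into an indented TOC bullet."""
    marker, _, title = header.strip().partition(" ")
    level = len(marker)                       # number of leading '#' characters
    indent = "\t" * (level - 1)               # one tab per level below the top
    anchor = "-".join(title.lower().split())  # same scheme as the notebook
    return "{}- [{}](#{})".format(indent, title, anchor)

for h in ["# Datatypes\n", "## Vectors\n", "# Control flow\n"]:
    print(toc_entry(h))
```

These anchors only resolve if the HTML headers later receive matching `name` attributes, which is what the anchor step further down does.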



## Write to file

### /home/jtk/Site/REFS/TOCS/fname_TOC.md

In [18]:
foname = fname[:-3]+"_TOC.md"
foname = foname.split('/') 
foname.insert(-1, "TOCS")
foname = '/'.join(foname)

'/home/jtk/Site/REFS/TOCS/R-basics_TOC.md'
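
The split/insert/join steps can alternatively be written with `pathlib`. This is a sketch under the assumption that the `TOCS` and `HTML` directories sit next to the source file, as in the paths shown in this notebook. The function name `derived_paths` is made up for illustration.

```python
from pathlib import PurePosixPath

def derived_paths(md_path):
    """Return (TOC Markdown path, HTML path) for a source Markdown file."""
    src = PurePosixPath(md_path)
    toc_path = src.parent / "TOCS" / (src.stem + "_TOC.md")
    html_path = src.parent / "HTML" / (src.stem + ".html")
    return str(toc_path), str(html_path)

toc_path, html_path = derived_paths("/home/jtk/Site/REFS/R-basics.md")
print(toc_path)   # /home/jtk/Site/REFS/TOCS/R-basics_TOC.md
print(html_path)  # /home/jtk/Site/REFS/HTML/R-basics.html
```

Working from `src.stem` avoids the hard-coded slice lengths (`[:-3]`, `[:-7]`), which silently break if the suffix changes.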

In [8]:
# open both files in 'with' blocks so they are closed (and flushed) on exit
with open(foname, "w") as fout, open(fname, 'r') as fhand:

    # write TOC as table
    fout.write('<table id="TOC"><tr><td>')
    for row in TOC:
        fout.write(row)
    fout.write("</td></tr></table>")
    fout.write("\n")

    # write content
    for row in fhand:
        fout.write(row)

# Create HTML from MD using pandoc

## /home/jtk/Site/REFS/HTML/fname.html

In [19]:
hname = foname[:-7]+'.html'
hname = hname.split('/')
hname.remove('TOCS')
hname.insert(-1, "HTML")
html_out = "/".join(hname)

In [21]:
# https://stackoverflow.com/questions/26236126/how-to-run-bash-command-inside-python-script
subprocess.run(['pandoc', foname, '-f', 'markdown', '-t', 'html', '-s', '-o', html_out])

CompletedProcess(args=['pandoc', '/home/jtk/Site/REFS/TOCS/R-basics_TOC.md', '-f', 'markdown', '-t', 'html', '-s', '-o', '/home/jtk/Site/REFS/HTML/R-basics.html'], returncode=0)
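
A slightly more defensive way to make the same pandoc call is sketched below. It uses the same flags as the cell above. The helper names `pandoc_command` and `run_pandoc` are assumptions for illustration. `check=True` makes `subprocess.run` raise `CalledProcessError` on a non-zero exit instead of returning quietly.

```python
import shutil
import subprocess

def pandoc_command(md_in, html_out):
    """Build the argument list for a standalone Markdown-to-HTML conversion."""
    return ["pandoc", md_in, "-f", "markdown", "-t", "html", "-s", "-o", html_out]

def run_pandoc(md_in, html_out):
    """Run pandoc, failing loudly if it is missing or exits with an error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_command(md_in, html_out), check=True)

print(pandoc_command("R-basics_TOC.md", "R-basics.html"))
```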

# Add anchors and stylesheet to HTML

## Create BeautifulSoup object

In [22]:
with open(html_out, 'r') as fhand:
    my_soup = BeautifulSoup(fhand, "html.parser")
#print(my_soup.prettify())

## Make every header an anchor

In [23]:
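headers = my_soup.find_all(["h1","h2","h3","h4","h5","h6"])
for h in headers:
    # wrap the header text in an <a> tag
    # (assumes plain-text headers: h.string is None if a header has nested tags)
    h.string.wrap(my_soup.new_tag("a"))
    # drop the id that pandoc generated; the TOC links target the name below
    del h['id'] 
    # same scheme as the TOC anchors: lowercase words joined by hyphens
    h.a['name'] = "-".join(h.get_text().lower().split(" "))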
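
The cell above relies on `h.string`, which BeautifulSoup sets to `None` when a header contains more than one child (for example inline code). A variant that moves all of the header's children into the anchor is sketched here. The function name `anchor_headers` and the sample HTML are illustrative assumptions. It uses `split()` rather than `split(" ")`, so runs of whitespace collapse to a single hyphen.

```python
from bs4 import BeautifulSoup

def anchor_headers(html):
    """Wrap the contents of every header in <a name="..."> and drop its id."""
    soup = BeautifulSoup(html, "html.parser")
    for h in soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"]):
        anchor = soup.new_tag("a")
        anchor["name"] = "-".join(h.get_text().lower().split())
        # move every child (text and nested tags) into the new anchor
        for child in list(h.contents):
            anchor.append(child)
        h.append(anchor)
        h.attrs.pop("id", None)  # no error if the header has no id
    return soup

sample_html = '<h1 id="x">Control flow</h1><h2 id="y">Using <code>apply</code></h2>'
print(anchor_headers(sample_html))
```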

## Add CSS stylesheet to `<head>`

In [24]:
link = my_soup.new_tag("link")
link["rel"] = "stylesheet"
link["type"] = "text/css"
link['href'] = "refs.css"
my_soup.head.style.replace_with(link)

<style type="text/css">code{white-space: pre;}</style>
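
The replacement above assumes pandoc emitted a `<style>` element in `<head>`. If it did not, `my_soup.head.style` is `None` and the call fails. A sketch that falls back to appending the link is shown below. The function name `add_stylesheet` and the sample documents are hypothetical.

```python
from bs4 import BeautifulSoup

def add_stylesheet(soup, href="refs.css"):
    """Swap the first <style> in <head> for a stylesheet link, or append the link."""
    link = soup.new_tag("link")
    link["rel"] = "stylesheet"
    link["type"] = "text/css"
    link["href"] = href
    if soup.head is None:
        return soup                      # nothing to attach the link to
    style = soup.head.find("style")
    if style is not None:
        style.replace_with(link)         # same behaviour as the cell above
    else:
        soup.head.append(link)           # fallback when no <style> exists
    return soup

doc = BeautifulSoup("<html><head><title>t</title></head><body></body></html>", "html.parser")
add_stylesheet(doc)
print(doc.head)
```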

# Write out

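In [25]:
# Write inside a 'with' block so the file is closed and its buffer flushed.
# The handle was previously left open, so part of the output could remain
# unwritten until the handle was rebound, which is likely why the write
# appeared to be needed twice.
with open(html_out, 'w') as fhand:
    fhand.write(my_soup.prettify())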

# Push to GitHub

In [33]:
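# A hard-coded access token was removed here: it was not used below, and
# credentials should not be stored in the notebook.
f = html_out.split("/")[-1]
subprocess.run(['git', 'add', '.'])
subprocess.run(['git', 'commit', '-m', 'Changes to {}'.format(f)])
# the empty string stood in for the remote URL; omit it to push all
# branches to the configured default remote
subprocess.run(['git', 'push', '--all'])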

git push https://username:password@myrepository.biz/file.git --all
#subprocess.run()

CompletedProcess(args=['git', 'push'], returncode=128)
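
A return code of 128 says that git failed but not why. One way to surface the reason is to run the commands in order, stop at the first failure, and keep its stderr. This is a sketch only: the helper `run_steps` is hypothetical, and the demonstration uses harmless Python subprocesses in place of the git commands.

```python
import subprocess
import sys

def run_steps(commands):
    """Run commands in order; return (cmd, returncode, stderr) for the first failure."""
    for cmd in commands:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return cmd, result.returncode, result.stderr.strip()
    return None  # every command succeeded

# stand-in commands: one that succeeds and one that exits with status 3
ok = [sys.executable, "-c", "pass"]
bad = [sys.executable, "-c", "import sys; sys.exit(3)"]

failure = run_steps([ok, bad, ok])
print(failure[1])  # 3
```

In the notebook, the list would hold the `git add`, `git commit`, and `git push` argument lists. Note that `git commit` also exits non-zero when there is nothing to commit, so that case may need separate handling.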