# Introduction

Pandoc (http://pandoc.org) is a document processing program that runs on multiple operating systems (Mac, Windows, Linux) and can read and write a wide variety of file formats. In many respects Pandoc can be thought of as a *universal translator* for documents. As shown in the following figure, Pandoc supports many input and output formats. In this workshop we will focus on a specific subset of input and output document types, while remembering that we are just scratching the surface of the transformations we can perform with Pandoc.

<a href="http://pandoc.org/index.html"><img src="diagram.jpg" style="width:50%" /></a>

In particular, we will focus on converting documents written in Pandoc's extended version of Markdown (originally developed by John Gruber as a simple ASCII text syntax for writing blog posts - https://daringfireball.net/projects/markdown/) into several useful output formats, including:

* HTML Pages
* HTML-based presentation slide decks
* PDF documents (including memos, letters, reports, manuscripts, presentation slides, and poster presentations)
* Word documents (DOCX)

The bottom line is that Pandoc is a useful tool that offers a number of significant benefits, including:

* A clear separation between *content development* and the *styling and presentation* of that content
* A simple text-based working file format that can be created in any text editor
* A source format that integrates very well into version control systems for collaborative development
* A toolchain that is supported on all major operating systems, providing platform-independent document generation
* A simple command syntax that can be used to automate document generation, making document workflows easy to reproduce
* A template model for output formats that allows for a high degree of customization and consistency
* The use of a powerful typesetting system (LaTeX - http://www.latex-project.org/about/) that is widely used to generate high-quality print documents

---
# Installing Pandoc on Your Computer

Pandoc is available for installation on Mac, Windows, and Linux. While basic functionality is provided by Pandoc itself, PDF generation relies on the LaTeX system. For a fully functional setup you will want to install both. The installation page for Pandoc provides links and guidance for installing both Pandoc and LaTeX on all supported operating systems:

http://pandoc.org/installing.html
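
Once both are installed, a quick check from a terminal can confirm that the tools are reachable. The following is a minimal sketch: `check_tool` is a hypothetical helper introduced here for illustration, and `pdflatex` is assumed to be the LaTeX engine that your LaTeX distribution provides.

```bash
# check_tool is a hypothetical helper (not part of Pandoc): it reports
# whether a command can be found on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

check_tool pandoc     # the converter itself
check_tool pdflatex   # assumed LaTeX engine, needed only for PDF output
```

If Pandoc is found, `pandoc --version` prints the installed version.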

---
# Some Sample Documents

## A Simple `hello world` Document

[Markdown document](/edit/PandocTraining/00-Instructor/01-HelloWorld/helloWorld.md)

```
% My title
% Author
% Some date
    
# Heading 1

Hello World - this is as simple as it gets ...
```
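
If you are working outside the workshop notebook, the same file can be created from a shell. This is a sketch that assumes you are in a scratch working directory where overwriting `helloWorld.md` is acceptable.

```bash
# Write the sample document to helloWorld.md in the current directory.
# The quoted 'EOF' delimiter stops the shell from expanding anything in the body.
cat > helloWorld.md <<'EOF'
% My title
% Author
% Some date

# Heading 1

Hello World - this is as simple as it gets ...
EOF

# Show the three lines starting with %, which form the title block
# (title, author, date).
head -n 3 helloWorld.md
```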

Commands to generate different representations of this document:

    pandoc -o helloWorld.pdf helloWorld.md
    pandoc -o helloWorld.docx helloWorld.md
    pandoc -o helloWorld.html helloWorld.md

Try it yourself by running (ctrl-enter) the following set of commands:

    %%bash
    pandoc -o helloWorld.pdf helloWorld.md
    pandoc -o helloWorld.docx helloWorld.md
    pandoc -o helloWorld.html helloWorld.md

Let's take a look at the generated documents:

* [PDF file](/files/PandocTraining/00-Instructor/01-HelloWorld/helloWorld.pdf)
* [Word Document](/files/PandocTraining/00-Instructor/01-HelloWorld/helloWorld.docx)
* [HTML File](/files/PandocTraining/00-Instructor/01-HelloWorld/helloWorld.html)

## Templated & Styled Content

[Markdown document](/edit/PandocTraining/00-Instructor/01-HelloWorld/templates.md)

```
% My title
% My name
% Today

---
recipientSalutation: Recipient Salutation
recipientName: Recipient Name
recipientTitle: Recipient Title
recipientAddress: Recipient Address
...

Biltong qui pancetta ball tip turkey eiusmod, tongue bresaola ham dolore. Tempor eiusmod ground round pork strip steak sirloin tongue. Magna cillum consequat, minim do tenderloin in porchetta ham officia qui. Picanha swine minim, ham hock boudin aliqua nisi ball tip aliquip deserunt ribeye in est burgdoggen voluptate. Cupim velit landjaeger nisi flank exercitation sunt laboris dolore.

```
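
The metadata fields in this document only affect the output when a template refers to them. In Pandoc's template syntax a variable is written between dollar signs, and `$body$` marks where the converted text is placed. The sketch below writes a small HTML template to illustrate the idea; the file names `miniLetter.html` and `templatesMini.html` and the layout are hypothetical, and this is not the workshop's `main.tex` or `ulPage.html`.

```bash
# Write a hypothetical minimal template. The quoted 'EOF' keeps the shell
# from treating $title$ and the other placeholders as shell variables.
cat > miniLetter.html <<'EOF'
<!DOCTYPE html>
<html>
<head><title>$title$</title></head>
<body>
<p>To: $recipientName$, $recipientTitle$<br />$recipientAddress$</p>
<p>$recipientSalutation$,</p>
$body$
</body>
</html>
EOF

# Convert only when Pandoc and the sample document are both available.
if command -v pandoc >/dev/null 2>&1 && [ -f templates.md ]; then
  pandoc -o templatesMini.html --template miniLetter.html templates.md
fi
```

Each placeholder is replaced by the value of the metadata field with the same name, so changing `recipientName` in the source changes the output without touching the template.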

PDF Generation Commands:

    pandoc -o templates.pdf templates.md
    pandoc -o templates.pdf --template "main.tex" templates.md

Word Document Generation Commands (the reference document only controls styling; other template characteristics can't currently be set):

    pandoc -o templates.docx templates.md
    pandoc -o templates.docx --reference-docx "docxTemplate.docx" templates.md
    
HTML File Generation Commands: 

    pandoc -o templates.html templates.md
    pandoc -o templatesTemplated.html --template "ulPage.html" templates.md
    pandoc -o templatesStyled.html --css=page.css -s templates.md

    %%bash
    pandoc -o templates.pdf templates.md
    pandoc -o templatesTemplated.pdf --template="formal_letter_4.tex" templates.md

Generated Files:

* [Untemplated PDF File](/files/PandocTraining/00-Instructor/01-HelloWorld/templates.pdf)
* [Templated PDF File](/files/PandocTraining/00-Instructor/01-HelloWorld/templatesTemplated.pdf)

    %%bash
    pandoc -o templates.docx templates.md
    pandoc -o templatesTemplated.docx --reference-docx "docxTemplate.docx" templates.md

Generated Files:

* [Untemplated DOCX File](/files/PandocTraining/00-Instructor/01-HelloWorld/templates.docx)
* [Templated DOCX File](/files/PandocTraining/00-Instructor/01-HelloWorld/templatesTemplated.docx)

    %%bash
    pandoc -o templates.html templates.md
    pandoc -o templatesTemplated.html --template "ulPage.html" templates.md
    pandoc -o templatesStyled.html --css=page.css -s templates.md

Generated Files:

* [Untemplated HTML file](/files/PandocTraining/00-Instructor/01-HelloWorld/templates.html)
* [Templated HTML file](/files/PandocTraining/00-Instructor/01-HelloWorld/templatesTemplated.html)
* [Styled HTML file](/files/PandocTraining/00-Instructor/01-HelloWorld/templatesStyled.html)

---
## Some Actual Documents

### A Class Syllabus

[The source markdown file](/edit/PandocTraining/00-Instructor/01-HelloWorld/OILS515_syllabus.md)

The commands to generate multiple representations of the syllabus:

    pandoc --standalone --toc --latex-engine=pdflatex -V geometry:margin=1in -V fontsize:11pt -o OILS515_syllabus.pdf OILS515_syllabus.md

    pandoc --toc --standalone --css=page2.css -o OILS515_syllabus.html OILS515_syllabus.md
    
    pandoc -s -o OILS515_syllabus.epub OILS515_syllabus.md

    %%bash
    pandoc --standalone --toc --latex-engine=pdflatex -V geometry:margin=1in -V fontsize:11pt -o OILS515_syllabus.pdf OILS515_syllabus.md
    pandoc --toc --standalone --css=page2.css -o OILS515_syllabus.html OILS515_syllabus.md
    pandoc -s -o OILS515_syllabus.epub OILS515_syllabus.md

The generated files:

* [The PDF file](/files/PandocTraining/00-Instructor/01-HelloWorld/OILS515_syllabus.pdf)
* [The HTML file](/files/PandocTraining/00-Instructor/01-HelloWorld/OILS515_syllabus.html)
* [The EPub file](/files/PandocTraining/00-Instructor/01-HelloWorld/OILS515_syllabus.epub)
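
The commands in this workshop use option names from Pandoc 1.x. Pandoc 2.0 renamed two of the options used so far: `--latex-engine` became `--pdf-engine`, and `--reference-docx` became `--reference-doc`. The sketch below picks the spelling that matches a given version; `flag_for` is a hypothetical helper, and the version parsing assumes the first line of `pandoc --version` looks like `pandoc 2.19.2`.

```bash
# flag_for is a hypothetical helper: given an option name as written in this
# workshop and a Pandoc version string, print the spelling that version accepts.
flag_for() {
  major="${2%%.*}"              # "2.19.2" -> "2"
  if [ "$major" -ge 2 ]; then
    case "$1" in
      --latex-engine)   echo "--pdf-engine" ;;
      --reference-docx) echo "--reference-doc" ;;
      *)                echo "$1" ;;
    esac
  else
    echo "$1"
  fi
}

# Build the syllabus PDF only when Pandoc and the source file are available.
if command -v pandoc >/dev/null 2>&1 && [ -f OILS515_syllabus.md ]; then
  # Assumes the first line of the version output is "pandoc X.Y.Z".
  version="$(pandoc --version | head -n 1 | awk '{print $2}')"
  pandoc --standalone --toc "$(flag_for --latex-engine "$version")=pdflatex" \
    -V geometry:margin=1in -V fontsize:11pt \
    -o OILS515_syllabus.pdf OILS515_syllabus.md
fi
```

For example, `flag_for --reference-docx 2.19.2` prints `--reference-doc`, while the same call with `1.19.2` returns the option unchanged.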

### A Recently Presented Conference Poster

[The source markdown file](/edit/PandocTraining/00-Instructor/01-HelloWorld/AgileCuration_2016AGUPoster/2016-12_AGUPoster.md)

```bash
cd AgileCuration_2016AGUPoster
pandoc -s -S \
--normalize \
--filter pandoc-citeproc \
--csl ./science.csl \
--template=poster.tex \
-f markdown+raw_tex \
-o 2016-12_AGUPoster.pdf \
2016-12_AGUPoster.md
```

    %%bash
    cd AgileCuration_2016AGUPoster  # change into the directory that has all the files
    pandoc -s -S \
    --normalize \
    --filter pandoc-citeproc \
    --csl ./science.csl \
    --template=poster.tex \
    -f markdown+raw_tex \
    -o 2016-12_AGUPoster.pdf \
    2016-12_AGUPoster.md

    pandoc -s -S \
    --normalize \
    --filter pandoc-citeproc \
    --csl ./science.csl \
    -o 2016-12_AGUPoster.html \
    2016-12_AGUPoster.md

The generated files:

* [The PDF file](AgileCuration_2016AGUPoster/2016-12_AGUPoster.pdf)
* [The HTML file](AgileCuration_2016AGUPoster/2016-12_AGUPoster.html)
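
Several options in the poster command changed in later Pandoc releases. Version 2.0 dropped `-S` and `--normalize` (smart punctuation became the `smart` extension, which is enabled by default for Pandoc's Markdown), and version 2.11 replaced the external `pandoc-citeproc` filter with the built-in `--citeproc` option. The following sketch selects the citation arguments for a given version string; `citation_args` is a hypothetical helper name.

```bash
# citation_args is a hypothetical helper: print the citation-processing
# arguments appropriate for a Pandoc version string such as "1.19.2" or "2.11".
citation_args() {
  major="${1%%.*}"              # text before the first dot
  rest="${1#*.}"                # text after the first dot
  minor="${rest%%.*}"           # second version component
  case "$1" in
    *.*) ;;                     # version has a minor component
    *)   minor=0 ;;             # e.g. "3" is treated as 3.0
  esac
  if [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 11 ]; }; then
    echo "--citeproc"
  else
    echo "--filter pandoc-citeproc"
  fi
}

citation_args 1.19.2   # prints: --filter pandoc-citeproc
citation_args 2.11     # prints: --citeproc
```

On Pandoc 2.11 or later, the PDF command would then read roughly `pandoc -s --citeproc --csl ./science.csl --template=poster.tex -f markdown+raw_tex -o 2016-12_AGUPoster.pdf 2016-12_AGUPoster.md`. Whether `poster.tex` works unchanged with a newer release is not something this sketch can guarantee.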

### A Collection of Slide Presentations

[`01_DataManagement.md`](/edit/PandocTraining/00-Instructor/01-HelloWorld/GMT200_DataManagement/01_DataManagement.md)

[`02_DataSecurity.md`](/edit/PandocTraining/00-Instructor/01-HelloWorld/GMT200_DataManagement/02_DataSecurity.md)

[`03_DataManagementPlanning.md`](/edit/PandocTraining/00-Instructor/01-HelloWorld/GMT200_DataManagement/03_DataManagementPlanning.md)

Commands to generate each of the slide shows:

```
pandoc --section-divs --slide-level 3 -c lobo_slides.css  --standalone -t dzslides -o 01_DataManagement.slides.html 01_DataManagement.md

pandoc --section-divs --slide-level 3 -c lobo_slides.css  --standalone -t dzslides -o 02_DataSecurity.slides.html 02_DataSecurity.md

pandoc --section-divs --slide-level 3 -c lobo_slides.css  --standalone -t dzslides -o 03_DataManagementPlanning.slides.html 03_DataManagementPlanning.md
```

Commands to generate the corresponding PDF files:

```
pandoc --template=default.latex --latex-engine=xelatex --self-contained --standalone -o 01_DataManagement.pdf 01_DataManagement.md

pandoc --template=default.latex --latex-engine=xelatex --self-contained --standalone -o 02_DataSecurity.pdf 02_DataSecurity.md

pandoc --template=default.latex --latex-engine=xelatex --self-contained --standalone -o 03_DataManagementPlanning.pdf 03_DataManagementPlanning.md
```
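
Because the three slide commands differ only in the file name, they can be collapsed into a loop. This is a sketch, assuming it is run from the `GMT200_DataManagement` directory and that the sources follow the two-digit prefix naming shown above; `slides_name` is a hypothetical helper.

```bash
# slides_name is a hypothetical helper: derive the slide-deck file name from a
# Markdown source name, e.g. 01_DataManagement.md -> 01_DataManagement.slides.html
slides_name() {
  echo "${1%.md}.slides.html"
}

for src in [0-9][0-9]_*.md; do
  # If nothing matches, the pattern is left unexpanded, so skip non-files.
  [ -f "$src" ] || continue
  if command -v pandoc >/dev/null 2>&1; then
    pandoc --section-divs --slide-level 3 -c lobo_slides.css --standalone \
      -t dzslides -o "$(slides_name "$src")" "$src"
  fi
done
```

Adding a fourth deck then only requires dropping a new `04_....md` file into the directory; the loop picks it up without any change to the commands.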

    %%bash
    cd GMT200_DataManagement
    pandoc --section-divs --slide-level 3 -c lobo_slides.css  --standalone -t dzslides -o 01_DataManagement.slides.html 01_DataManagement.md
    pandoc --section-divs --slide-level 3 -c lobo_slides.css  --standalone -t dzslides -o 02_DataSecurity.slides.html 02_DataSecurity.md
    pandoc --section-divs --slide-level 3 -c lobo_slides.css  --standalone -t dzslides -o 03_DataManagementPlanning.slides.html 03_DataManagementPlanning.md
    pandoc --template=default.latex --latex-engine=xelatex --self-contained --standalone -o 01_DataManagement.pdf 01_DataManagement.md
    pandoc --template=default.latex --latex-engine=xelatex --self-contained --standalone -o 02_DataSecurity.pdf 02_DataSecurity.md
    pandoc --template=default.latex --latex-engine=xelatex --self-contained --standalone -o 03_DataManagementPlanning.pdf 03_DataManagementPlanning.md

[All the files related to this example](/tree/PandocTraining/00-Instructor/01-HelloWorld/GMT200_DataManagement)

---------------------------------------
[NEXT - Pandoc Markdown Syntax](/notebooks/PandocTraining/00-Instructor/02-Syntax/02%20-%20Pandoc%20Mardown%20Syntax.ipynb)