Using Conda is a great way to create reproducible, cross-platform data products.  I used virtual environments before reading Jake VanderPlas' [article](http://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/) on conda and now I've completely made the switch.  Maintaining separate interpreters for projects is nice, and so is avoiding compiler issues on Windows, but the best reason to use conda is the `environment.yml` file.  

Building a conda environment from a file is very similar to using a Makefile - it will reliably recreate a Python interpreter with specific libraries installed.  This isn't useful at the start of a project, but after wrangling data and installing libraries of various utility you will ideally end up with a product to share.  This is the real power of the conda environment file: the ability to export the state of the environment, including dependencies, to a file and commit it to the project repository.  Now you can deploy code, model outputs and the execution environment to a server, share it with a friend or redo your own work two years from now on a new machine with no fuss!

This is as simple as:  
`$ conda env export > environment.yml`  

This command, along with the rest of the reference for managing conda environments, is documented [here](https://conda.io/docs/user-guide/tasks/manage-environments.html).  
  
What is next-level about conda is the ability to install non-Python tools.  Whether you are wrapping [MATLAB](https://anaconda.org/conda-forge/octave), calling [ggplot](https://anaconda.org/r/r-ggplot2) or just taking [Julia](https://anaconda.org/conda-forge/julia) for a spin, you can reproducibly include these in your project.  

Jupyter is a tool that needs little introduction [these days](https://www.nature.com/articles/d41586-018-07196-1).  One of its less discussed features is the ability to use [multiple kernels](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels).  So after installing a language kernel via conda, you can run R, Julia, MATLAB, Perl and other scripts right in the notebook.  

I decided a good Hello World for this blog would be its setup.  I initially followed Vik Paruchuri's post [Building a Data Science Portfolio: Setting Up a Blog](https://www.dataquest.io/blog/how-to-setup-a-data-science-blog/), which lays out the Pelican static site process and discusses some trade-offs.

### 1. Install Miniconda for your Platform
I prefer to use Miniconda.  Installation and management instructions can be found at:  

[https://conda.io/docs/user-guide/install/index.html](https://conda.io/docs/user-guide/install/index.html)

### 2. Create working directory and files
Once conda is installed, create a folder and change directory into it.  In this post we'll call it `blog-source`.  

Next, make a file called `environment.yml` in `blog-source` and put the following in it:  
```
name: pelican-blog

channels:
    - defaults
    - conda-forge
    - plotly

dependencies:
    - markdown
    - beautifulsoup4
    - pelican
    - numpy
    - scipy
    - pandas
    - scikit-learn
    - tensorflow
    - matplotlib
    - plotly
    - jupyter
    - nbconvert
    - ghp-import
    - spyder
    - git
    - pip
    - pip:
        - cufflinks
```  
This has a few of the libraries that are useful for data-driven projects.  Vik's post lists specific library versions which are now out of date.  I prefer to list the packages I need without versions and let the package manager sort out the dependencies during environment creation.  If in the future I need a specific build, I will export the environment state as described above to a new `environment.yml` file and document what it is for.  The export process will save the library version and build, e.g. `- numpy=1.13.1=py36_0`
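
To make the pinned format concrete, here is a minimal Python sketch that splits an exported conda line into its parts.  The names `CondaSpec` and `parse_conda_spec` are hypothetical helpers for illustration, not part of conda, and the sketch assumes the single-`=` `name=version=build` form shown above (it is not meant for the `==` pins in the `pip:` section).

```python
from typing import NamedTuple, Optional


class CondaSpec(NamedTuple):
    name: str
    version: Optional[str]
    build: Optional[str]


def parse_conda_spec(line: str) -> CondaSpec:
    """Split a line such as '- numpy=1.13.1=py36_0' into name, version and build."""
    # Drop the YAML list marker and surrounding whitespace.
    spec = line.strip().lstrip("-").strip()
    parts = spec.split("=")
    name = parts[0]
    # Unpinned entries (for example '- pandas') have no version or build.
    version = parts[1] if len(parts) > 1 else None
    build = parts[2] if len(parts) > 2 else None
    return CondaSpec(name, version, build)


if __name__ == "__main__":
    print(parse_conda_spec("- numpy=1.13.1=py36_0"))
    print(parse_conda_spec("- pandas"))
```

Seeing the version and build as separate fields makes it easier to decide how much of an exported pin you actually want to keep.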

### 3. Create Conda environment and activate it  
Inside `blog-source` run:  
```$ conda env create -f environment.yml```  
  
To activate the conda environment in Mac OSX or Linux:  
```$ source activate pelican-blog```  

and in Windows:  
```> activate pelican-blog```  

Check installed libraries work:  
```  
(pelican-blog) ~/blog-source$ python  
Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:51:32)  
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux  
Type "help", "copyright", "credits" or "license" for more information.  
>>> import markdown  
>>> import bs4  
>>> import pelican
>>> import numpy
>>> import scipy
>>> import pandas
>>> import sklearn
>>> import tensorflow
>>> import matplotlib
>>> import plotly
>>> import cufflinks
```
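
If you would rather not type the imports by hand, the same check can be sketched as a short script.  This is only an illustration: `REQUIRED_MODULES` and `find_missing` are names I made up, and `importlib.util.find_spec` only confirms that a module can be located, not that it imports without errors.

```python
import importlib.util

# Import names for the libraries listed in environment.yml
# (beautifulsoup4 imports as bs4, scikit-learn as sklearn).
REQUIRED_MODULES = [
    "markdown", "bs4", "pelican", "numpy", "scipy", "pandas",
    "sklearn", "tensorflow", "matplotlib", "plotly", "cufflinks",
]


def find_missing(modules):
    """Return the module names that cannot be located in this interpreter."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All libraries found.")
```

Run it with the `pelican-blog` environment active so that it checks the right interpreter.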

### 4. Build Pelican site  
Run the pelican-quickstart command in the `blog-source` folder to begin the interactive setup script:  

```(pelican-blog) ~/blog-source$ pelican-quickstart```  

Answer the setup questions, choosing the default if you don't know the answer.  After it completes, you should have `output` and `content` folders as well as several new files including `pelicanconf.py` and `publishconf.py`.  

If you want to use a favicon, generate one (I used this [site](https://favicon.io/favicon-generator/)) and put it in a folder called `extra` inside the `content` folder.

### 5. Install Git and Pelican Jupyter plugin  
We'll need git to publish our blog as well as to install the Jupyter Pelican plugin.  On Linux and Mac, Git may already be installed.  It is also included in the conda environment file, so if you're working in Windows it will be available.  

First confirm git is available:  
`$ git --version  
git version 2.7.4`  

Create a destination for the Jupyter plugin:  
`(pelican-blog) ~/blog-source $ mkdir plugins`  

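`git submodule add` only works inside a git repository, so if `blog-source` is not one yet, run `git init` there first.  Then install pelican-ipynb:  
```(pelican-blog) ~/blog-source $ git submodule add https://github.com/danielfrg/pelican-ipynb.git plugins/ipynb```  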

There should be a `.gitmodules` file and inside the `plugins` folder should be a folder called `ipynb`  

Activate and configure the plugin by adding the following to the end of `pelicanconf.py`.  Note that the values are all lists or tuples; older versions of Pelican allowed strings.  

```
MARKUP = ('md', 'ipynb')
PLUGIN_PATHS = ['./plugins']
PLUGINS = ['ipynb.markup']
IGNORE_FILES = ['.ipynb_checkpoints']
```
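
Because a bare string is an easy mistake to make here, a quick sanity check can be sketched in Python.  The function `non_list_settings` is a hypothetical helper, not part of Pelican; it simply takes a dictionary of settings and reports which of the four names above hold something other than a list or tuple.

```python
def non_list_settings(settings, names=("MARKUP", "PLUGIN_PATHS", "PLUGINS", "IGNORE_FILES")):
    """Return the names whose values are present but are not a list or tuple."""
    return [
        name for name in names
        if name in settings and not isinstance(settings[name], (list, tuple))
    ]


if __name__ == "__main__":
    example = {
        "MARKUP": ("md", "ipynb"),
        "PLUGIN_PATHS": "./plugins",  # a bare string, so this one is flagged
        "PLUGINS": ["ipynb.markup"],
        "IGNORE_FILES": [".ipynb_checkpoints"],
    }
    print(non_list_settings(example))
```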

### 6. Write first post  
Each post requires two files: a Jupyter notebook and a metadata file.  In older versions of the pelican-ipynb plugin, the metadata file extension was `.ipynb-meta`; it is now `.nbdata`.  The contents of the `.nbdata` file are:  
```
Title: First Post
Slug: conda-jupyter-pelican
Date: 2018-12-08 18:00
Category: posts
Tags: Python, Jupyter, Conda
Author: Colin Dietrich
Summary: Building a Python Blog using Conda and Pelican
```  

The fields are straightforward except for 'Slug', which Wikipedia [explains nicely](https://en.wikipedia.org/wiki/Clean_URL#Slug).  It is the part of the URL that identifies the post; here it matches the file name of the source notebook, and the companion `.nbdata` file should be named the same.
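
Since the notebook and its metadata file must share a name, creating the companion file can be sketched in a few lines of Python.  The helper `write_nbdata` is hypothetical (it is not provided by Pelican or the plugin) and assumes the plain `Key: value` layout shown above.

```python
from pathlib import Path


def write_nbdata(notebook_path, metadata):
    """Write 'Key: value' metadata lines next to a notebook and return the new path."""
    notebook = Path(notebook_path)
    # Same file name as the notebook, with .nbdata in place of .ipynb.
    nbdata = notebook.with_suffix(".nbdata")
    lines = [f"{key}: {value}" for key, value in metadata.items()]
    nbdata.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return nbdata
```

For example, calling `write_nbdata("content/conda-jupyter-pelican.ipynb", {"Title": "First Post", "Slug": "conda-jupyter-pelican"})` would write `content/conda-jupyter-pelican.nbdata`, provided the `content` folder already exists.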

### 7. Generate Static HTML  

If you want to use a custom theme, look at the [gallery](https://github.com/getpelican/pelican-themes) and download one into a local folder.  I use `themes` inside `blog-source` so I can commit it and keep it with the rest of the site source.  To tell Pelican to use the theme `your_custom_theme` inside `themes`, open `pelicanconf.py` and add the following line:  
`THEME = 'themes/your_custom_theme'`

From the root of `blog-source` again, generate the static HTML with:  
```(blog) ~/blog-source$ pelican content```  
  
Then, still in `blog-source`, start a local test server that serves the `output` folder:  
```(blog) ~/blog-source$ pelican -l```  
  
Open `localhost:8000` in a browser to view a local copy of your site.  

To quit the local test server, press `Ctrl+C` in the terminal.

### 8. Create Github Page and setup Pelican configuration

Within your github account, create a repository called `your_username.github.io` without any files (no README, .gitignore, etc).  Once it's created on Github, copy the SSH link for adding to your local repository.

Locally, edit the `SITEURL` variable inside `publishconf.py` to your site URL, including the scheme and without a trailing slash, for example `https://your_username.github.io`.
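
Pelican's settings documentation asks for `SITEURL` to include the `http://` or `https://` scheme and to have no trailing slash.  A minimal sketch of that check follows; `check_siteurl` is a hypothetical helper of my own, not a Pelican function.

```python
from urllib.parse import urlparse


def check_siteurl(siteurl):
    """Return a list of problems found in a SITEURL value (empty if none)."""
    problems = []
    parsed = urlparse(siteurl)
    if parsed.scheme not in ("http", "https"):
        problems.append("missing http:// or https:// scheme")
    if siteurl.endswith("/"):
        problems.append("has a trailing slash")
    return problems


if __name__ == "__main__":
    print(check_siteurl("https://your_username.github.io"))
    print(check_siteurl("your_username.github.io/"))
```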

### 9. Create Git branch and publish to Github  

At this point, we've successfully installed Python, Jupyter and Pelican, written content in Jupyter, converted it into static HTML and viewed it on a local server.  Our source files consist of `.py` files, various configuration files and folders, and Jupyter Notebooks, while the static HTML site we previewed locally is in the `output` folder.  

To publish the site currently stored in the `output` folder, its files must end up in the root directory of the `master` branch of a git repository, which we then push to Github.  To keep everything together in one organized repository (and completely backed up on Github), we'll use a separate branch to store the source files.  
  
##### A. Setup Github Repository  
Use the empty `your_username.github.io` repository created in step 8, and have its SSH link ready for adding to your local repository.

##### B. Setup Local Repository  
In the local blog directory `blog-source`, initialize a new repository and add the Github repository as its remote origin:  
  
```
(blog) ~/blog-source $ git init
(blog) ~/blog-source $ git remote add origin git@github.com:your_username/your_username.github.io.git
```

Next, create a new branch called `develop` to keep Python, Jupyter and Pelican source files in:  
``(blog) ~/blog-source $ git checkout -b develop
Switched to a new branch 'develop'
``

In the working directory `blog-source`, create a file called `.gitignore` and copy the contents of [this file](https://github.com/github/gitignore/blob/master/Python.gitignore) into it.  This was a great suggestion by Vik; it really keeps the repo clean later when we start working on projects with IDEs that litter the working directory.  

### 10. Use ghp-import to populate master branch  

Everything committed to the `master` branch of your repository will be what Github Pages serves as your site, while our source content is in the `develop` branch.  To create a `master` branch that contains our static HTML, we'll use the [ghp-import](https://github.com/davisp/ghp-import) tool to copy the contents of a directory onto a branch.  

NOTE: USING ghp-import IS DESTRUCTIVE.  Anything in your `master` or `gh-pages` branch will be automatically deleted without confirmation.  

The command we'll use is:  
`(blog) ~/blog-source $ ghp-import -b master output`  

Since this isn't immediately obvious, let's break down the usage:  
`-b master` = branch to write TO, in this case `master`  
`output` = directory in CURRENT branch to write FROM, in this case `output`  

After running `ghp-import`, we'll have two branches in `blog-source` that when active look like:  

##### develop branch  
```  
content
drafts
environment.yml
Makefile
output
pelicanconf.py
plugins
publishconf.py
__pycache__
tasks.py
themes  
```

##### master branch  
```  
TODO  
```

### 11. Publishing workflow  
  
Always stay on the `develop` Git branch unless you're cleaning up extra files like `__pycache__` or adding another `.gitignore` - `ghp-import` will do the work of updating the `master` branch.  If you do switch branches be aware Jupyter may lose reference to the Notebook file it has open when the repository checks out another file state - if this happens, you can lose unsaved work.  

##### A. Work Locally and check static HTML build
To preview your site locally, in `blog-source` run:  
`(blog) ~/blog-source $ pelican content`  
  
Then start a local test server with:  
`(blog) ~/blog-source $ pelican -l`
  
Check the build locally by opening:  
`http://localhost:8000/`  

Finally, keep a remote copy of your source files on Github:  
`$ git push origin develop`  

##### B. Rebuild for Online and push to Github  
To rebuild the site for Github Pages with the publish settings (including your `SITEURL`), run:  
`(blog) ~/blog-source $ pelican content -s publishconf.py`  
  
Use the `ghp-import` tool to import the output folder of the `develop` branch to the root directory of the `master` branch of your repository:  
`$ ghp-import -b master output`  

Push your files to Github:  
`$ git push origin master`  

Check your site is online and rejoice!  
`https://your_username.github.io`  
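
The three publish commands above can also be wrapped in a small script.  This is a rough sketch under my own assumptions: `PUBLISH_COMMANDS` and `publish` are hypothetical names, the script must be run from `blog-source` with the conda environment active, and it defaults to a dry run that only lists the commands, because `ghp-import` is destructive.

```python
import subprocess

# The three commands from this section, in the order they are run.
PUBLISH_COMMANDS = [
    ["pelican", "content", "-s", "publishconf.py"],
    ["ghp-import", "-b", "master", "output"],
    ["git", "push", "origin", "master"],
]


def publish(dry_run=True):
    """Run the publish commands in order; with dry_run=True only return them as strings."""
    planned = [" ".join(cmd) for cmd in PUBLISH_COMMANDS]
    if not dry_run:
        for cmd in PUBLISH_COMMANDS:
            # check=True stops the sequence at the first failing command.
            subprocess.run(cmd, check=True)
    return planned


if __name__ == "__main__":
    for line in publish(dry_run=True):
        print(line)
```

Call `publish(dry_run=False)` only once the dry run lists exactly what you expect.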

### 12. Next Steps  
Now that the site is online, add more Notebooks to `content`, and if you want any extended features, revisit `publishconf.py` to enable analytics, comments, and feeds.  You can use the `content` folder as a working directory for your projects, but I find that working in a separate directory and conda environment first, and only then moving the finished notebook into `blog-source`, is a good first publishing step.

### I. Reference  
The precise build for this site is in environment_deployed.yml:  
```
name: blog
channels:
- conda-forge
- defaults
dependencies:
- blinker=1.4=py_1
- feedgenerator=1.9=py_1
- ghp-import=0.5.5=py_1
- pelican=4.0.1=py_0
- alabaster=0.7.10=py36_0
- astroid=1.5.3=py36_0
- babel=2.5.0=py36_0
- backports=1.0=py36_0
- backports.weakref=1.0rc1=py36_0
- beautifulsoup4=4.6.0=py36_0
- blas=1.0=mkl
- bleach=1.5.0=py36_0
- certifi=2016.2.28=py36_0
- chardet=3.0.4=py36_0
- curl=7.54.1=0
- cycler=0.10.0=py36_0
- dbus=1.10.20=0
- decorator=4.1.2=py36_0
- docutils=0.14=py36_0
- entrypoints=0.2.3=py36_0
- expat=2.1.0=0
- fontconfig=2.12.1=3
- freetype=2.5.5=2
- git=2.11.1=0
- glib=2.50.2=1
- gst-plugins-base=1.8.0=0
- gstreamer=1.8.0=0
- html5lib=0.9999999=py36_0
- icu=54.1=0
- imagesize=0.7.1=py36_0
- ipykernel=4.6.1=py36_0
- ipython=6.1.0=py36_0
- ipython_genutils=0.2.0=py36_0
- ipywidgets=6.0.0=py36_0
- isort=4.2.15=py36_0
- jedi=0.10.2=py36_2
- jinja2=2.9.6=py36_0
- jpeg=9b=0
- jsonschema=2.6.0=py36_0
- jupyter=1.0.0=py36_3
- jupyter_client=5.1.0=py36_0
- jupyter_console=5.2.0=py36_0
- jupyter_core=4.3.0=py36_0
- krb5=1.13.2=0
- lazy-object-proxy=1.3.1=py36_0
- libffi=3.2.1=1
- libgcc=5.2.0=0
- libgfortran=3.0.0=1
- libiconv=1.14=0
- libpng=1.6.30=1
- libprotobuf=3.4.0=0
- libsodium=1.0.10=0
- libssh2=1.8.0=0
- libxcb=1.12=1
- libxml2=2.9.4=0
- markdown=2.6.9=py36_0
- markupsafe=1.0=py36_0
- matplotlib=2.0.2=np113py36_0
- mistune=0.7.4=py36_0
- mkl=2017.0.3=0
- nbconvert=5.2.1=py36_0
- nbformat=4.4.0=py36_0
- notebook=5.0.0=py36_0
- numpy=1.13.1=py36_0
- numpydoc=0.7.0=py36_0
- openssl=1.0.2l=0
- pandas=0.20.3=py36_0
- pandocfilters=1.4.2=py36_0
- path.py=10.3.1=py36_0
- pcre=8.39=1
- pexpect=4.2.1=py36_0
- pickleshare=0.7.4=py36_0
- pip=9.0.1=py36_1
- plotly=2.0.11=py36_0
- prompt_toolkit=1.0.15=py36_0
- protobuf=3.4.0=py36_0
- psutil=5.2.2=py36_0
- ptyprocess=0.5.2=py36_0
- pycodestyle=2.3.1=py36_0
- pyflakes=1.6.0=py36_0
- pygments=2.2.0=py36_0
- pylint=1.7.2=py36_0
- pyparsing=2.2.0=py36_0
- pyqt=5.6.0=py36_2
- python=3.6.2=0
- python-dateutil=2.6.1=py36_0
- pytz=2017.2=py36_0
- pyzmq=16.0.2=py36_0
- qt=5.6.2=5
- qtawesome=0.4.4=py36_0
- qtconsole=4.3.1=py36_0
- qtpy=1.3.1=py36_0
- readline=6.2=2
- requests=2.14.2=py36_0
- rope=0.9.4=py36_1
- scikit-learn=0.19.0=np113py36_0
- scipy=0.19.1=np113py36_0
- setuptools=36.4.0=py36_1
- simplegeneric=0.8.1=py36_1
- singledispatch=3.4.0.3=py36_0
- sip=4.18=py36_0
- six=1.10.0=py36_0
- snowballstemmer=1.2.1=py36_0
- sphinx=1.6.3=py36_0
- sphinxcontrib=1.0=py36_0
- sphinxcontrib-websupport=1.0.1=py36_0
- spyder=3.2.3=py36_0
- sqlite=3.13.0=0
- tensorflow=1.3.0=0
- tensorflow-base=1.3.0=py36h5293eaa_1
- tensorflow-tensorboard=0.1.5=py36_0
- terminado=0.6=py36_0
- testpath=0.3.1=py36_0
- tk=8.5.18=0
- tornado=4.5.2=py36_0
- traitlets=4.3.2=py36_0
- unidecode=0.04.21=py36_0
- wcwidth=0.1.7=py36_0
- werkzeug=0.12.2=py36_0
- wheel=0.29.0=py36_0
- widgetsnbextension=3.0.2=py36_0
- wrapt=1.10.11=py36_0
- xz=5.2.3=0
- zeromq=4.1.5=0
- zlib=1.2.11=0
- pip:
  - colorlover==0.2.1
  - cufflinks==0.14.6
  - ipython-genutils==0.2.0
  - jupyter-client==5.1.0
  - jupyter-console==5.2.0
  - jupyter-core==4.3.0
  - nose==1.3.7
  - prompt-toolkit==1.0.15
  - retrying==1.3.3
  - rope-py3k==0.9.4.post1
prefix: /home/colin/anaconda3/envs/blog-source
```