# Demonstrating Rule 5: Using Watermark to Record Dependencies

# Index

### Introduction
### Installation and Using Watermark
### Helpful References
___

# Introduction
The article ["Ten simple rules for writing and sharing computational analyses in Jupyter Notebooks"](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007007) makes clear how important it is to record the dependencies of your Jupyter Notebooks so that others can reproduce your results. Following **Rule 5** gives others a clear picture of your Jupyter environment at the time the notebook was created.

#### Rule 5: "Record dependencies"

> Rerunning your analysis in the future will require accessing not only your code but also any module or library that your code relied on. As is best practice across computational science, manage your dependencies using a package or environment manager like pip or Conda. These enable you to download modules and libraries, specify the version of each you want to use in your analysis, and even generate files such as Conda’s environment.yml or pip’s requirements.txt that concisely describe all of your dependencies. These files can be used by tools such as Binder or Docker to generate a “container” that other researchers can use to reproduce your analysis using the same versions of every module and library as you did. Always conduct your work in an environment created only from these dependencies to ensure you do not add undocumented dependencies.

> As an extra precaution in notebooks, you can explicitly print out your dependencies using a notebook extension such as [watermark](https://github.com/rasbt/watermark). Listing the versions of critical dependencies in the notebook itself (best done at the bottom) will ensure that, if used in isolation from its environment, the notebook still contains critical information to help readers run it.

#### Goal of this Notebook
In this notebook, I will demonstrate the following portion of **Rule 5**:
> **As an extra precaution in notebooks, you can explicitly print out your dependencies using a notebook extension such as [watermark](https://github.com/rasbt/watermark). Listing the versions of critical dependencies in the notebook itself (best done at the bottom) will ensure that, if used in isolation from its environment, the notebook still contains critical information to help readers run it.**

I will do this by:
1. installing watermark
2. creating multiple watermarks 
___

# Installation and Using Watermark

**The watermark line magic can be installed in two ways:**
1. pip install watermark (the latest release from PyPI)
2. pip install -e git+https://github.com/rasbt/watermark#egg=watermark (the development version from GitHub)
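
Before loading the extension, it can help to confirm that the package is visible to the Python environment the kernel is using. The sketch below is one way to check, using the standard library's importlib.metadata. It assumes Python 3.8 or later, and the helper name installed_version is hypothetical and not part of watermark.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report whether watermark can be found in the current environment.
version = installed_version("watermark")
if version is None:
    print("watermark is not installed in this environment")
else:
    print(f"watermark {version} is installed")
```
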

If installation is successful, then you can load the **watermark** magic extension into your notebook with the following code:

In [2]:
%load_ext watermark

To generate a basic watermark in your notebook, run the following code:

In [6]:
%watermark

2019-11-25T01:12:13-05:00

CPython 3.7.3
IPython 7.4.0

compiler   : MSC v.1915 64 bit (AMD64)
system     : Windows
release    : 10
machine    : AMD64
processor  : Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
CPU cores  : 8
interpreter: 64bit
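
If watermark is not available, similar information can be gathered with Python's standard platform and os modules. The following is a minimal sketch under that assumption. It is not how watermark itself is implemented, and the function name machine_report and the dictionary keys are hypothetical.

```python
import os
import platform


def machine_report():
    """Collect fields similar to those shown in the default %watermark output."""
    return {
        "implementation": platform.python_implementation(),
        "python": platform.python_version(),
        "compiler": platform.python_compiler(),
        "system": platform.system(),
        "release": platform.release(),
        "machine": platform.machine(),
        "processor": platform.processor(),
        "CPU cores": os.cpu_count(),
        "interpreter": platform.architecture()[0],
    }


# Print one aligned "field: value" line per entry.
for field, value in machine_report().items():
    print(f"{field:<14}: {value}")
```

The values depend on the machine that runs the cell, so the output will differ from the Windows example above.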


The watermark magic function has many optional arguments. They are listed below:

-a AUTHOR, --author AUTHOR 
                        prints author name

-d,  --date            prints current date as YYYY-mm-dd

-n,  --datename        prints date with abbrv. day and month names

-t,  --time            prints current time as HH-MM-SS

-i,  --iso8601         prints the combined date and time including the time
                       zone in the ISO 8601 standard with UTC offset

-z,  --timezone        appends the local time zone

-u,  --updated         appends a string "Last updated: "

-c CUSTOM_TIME, --custom_time CUSTOM_TIME
                       prints a valid strftime() string

-v,  --python          prints Python and IPython version

-p PACKAGES, --packages PACKAGES
                       prints versions of specified Python modules and packages

-h,  --hostname        prints the host name

-m,  --machine         prints system and machine info

-g,  --githash         prints current Git commit hash

-r,  --gitrepo         prints current Git remote address

-b,  --gitbranch       prints the current Git branch (new in v1.6)

-iv, --iversions       prints name and version of all imported packages

-w,  --watermark       prints the current version of watermark
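
The date and time options produce layouts that can be described with ordinary strftime() format strings, which is also the kind of string the -c option expects. The sketch below uses Python's datetime module with a fixed timestamp taken from the example output in this notebook. The format strings are my assumption about how to reproduce each layout, not a copy of watermark's internals.

```python
from datetime import datetime

# Fixed timestamp so the result does not depend on when the cell is run.
stamp = datetime(2019, 11, 25, 1, 26, 53)

date_only = stamp.strftime("%Y-%m-%d")      # YYYY-mm-dd layout described for -d
date_names = stamp.strftime("%a %b %d %Y")  # abbreviated names as for -n (locale dependent)
time_only = stamp.strftime("%H:%M:%S")      # time layout seen in the -t example output

print(date_only)   # 2019-11-25
print(date_names)
print(time_only)   # 01:26:53
```
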

### Examples

Print date (-n), time (-t), and timezone (-z)

In [25]:
%watermark -n -t -z

Mon Nov 25 2019 01:26:53 Eastern Standard Time


Print date (-n), time (-t), and author (-a) 

(note: watermark forces an output order of Author Date Time regardless of input order)

In [26]:
%watermark -t -n -a "Ten Rules"

Ten Rules Mon Nov 25 2019 01:27:12


Print Python and IPython versions (-v) and machine info (-m)

(note: watermark includes this information within the default %watermark command with no arguments)

In [28]:
%watermark -v -m

CPython 3.7.3
IPython 7.4.0

compiler   : MSC v.1915 64 bit (AMD64)
system     : Windows
release    : 10
machine    : AMD64
processor  : Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
CPU cores  : 8
interpreter: 64bit


Print package versions for pandas, numpy, matplotlib using (-p)

(note: after the -p argument you must pass in the names of the modules in a comma-separated format with no spaces)

In [31]:
%watermark -p pandas,numpy,matplotlib

pandas 0.24.2
numpy 1.16.2
matplotlib 3.0.3
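
To illustrate the comma-separated format, here is a rough sketch of a version lookup for a list of names using the standard library's importlib.metadata (Python 3.8 or later). This is an assumption-laden approximation, not watermark's own code, and package_versions is a hypothetical helper. It also looks names up as distribution names, which happen to match the import names for the three packages used here.

```python
from importlib import metadata


def package_versions(names):
    """Map each name in a comma-separated string to its installed version."""
    versions = {}
    for name in names.split(","):
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Keep the entry so that missing packages are visible in the report.
            versions[name] = "not installed"
    return versions


for name, version in package_versions("pandas,numpy,matplotlib").items():
    print(name, version)
```

In this sketch a space after a comma would become part of the package name, which is one way to see why the list is written without spaces.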


Print package versions for all imported packages using (-iv). The (-iv) option records the version of every module/package that has been imported into the environment

In [34]:
import pandas

In [35]:
%watermark -iv

matplotlib 3.0.3
pandas     0.24.2
numpy      1.16.2



In [36]:
import numpy
import matplotlib

In [37]:
%watermark -iv

matplotlib 3.0.3
pandas     0.24.2
numpy      1.16.2



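In the previous cell, you can see that the matplotlib, pandas, and numpy versions were all recorded, even though pandas was imported three cells earlier and numpy and matplotlib were imported one cell earlier. Note that the first %watermark -iv call (In [35]) already lists all three packages, which suggests that numpy and matplotlib had been imported earlier in the same kernel session: the option reports the imported modules it finds in the session, not only those imported in the preceding cell.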
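
One way to picture this behaviour is a scan of the notebook's namespace for module objects that expose a version string. The sketch below is only an approximation under that assumption and is not watermark's actual implementation. The helper name imported_versions is hypothetical, and a stand-in module is used so the example runs without third-party packages.

```python
import types


def imported_versions(namespace):
    """Return {module name: version} for modules in `namespace` that define __version__."""
    found = {}
    for value in namespace.values():
        if isinstance(value, types.ModuleType):
            version = getattr(value, "__version__", None)
            if version is not None:
                found[value.__name__] = str(version)
    return found


# Build a stand-in module instead of importing a real package.
fake = types.ModuleType("fakepkg")
fake.__version__ = "1.2.3"

print(imported_versions({"fakepkg": fake, "answer": 42}))  # {'fakepkg': '1.2.3'}
```

In a notebook, calling imported_versions(globals()) would report every module imported so far in the session, regardless of which cell imported it.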

___

# Helpful References
For more information on using the watermark magic function within Jupyter Notebooks, the following resource is **very helpful**:

https://github.com/rasbt/watermark