In [1]:
# !pip install papermill nbconvert jupytext pandoc

In [2]:
import pandas as pd

# Managing Notebooks and Scripts using Command-Line Tools in Jupyter

This notebook shows how to use command-line tools from inside Jupyter notebooks. It first covers running shell commands, such as managing files or installing software, directly from a notebook cell. It then shows how to execute entire notebooks from the command line, which is useful for automation. Next, it shows how to turn regular Python scripts into notebooks so they are easier to explore and share. Finally, it covers converting notebooks into other formats, such as HTML or PDF, for sharing with others.

## Running Command-Line Commands in Jupyter

A command line is a text-based interface that allows users to interact with their computer’s operating system by typing commands, rather than using graphical interfaces.
In this interface, users can navigate directories, manage files, run programs, and perform a wide range of tasks by typing specific commands.
Popular command-line environments include Bash (common in Linux and macOS) and the Windows Command Prompt or PowerShell.

As researchers, we may need the command line for file management (moving, renaming, deleting, or organizing datasets), for automating repetitive tasks that involve external tools, for installing software, and so on.

Incorporating command-line commands into our analysis notebooks allows us to integrate external tools, automate repeating tasks, and manage data all within the same environment. 
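
In a notebook, the `!` prefix hands a line to the system shell. Much the same can be done in plain Python with the standard-library `subprocess` module, which is handy when the output needs to be captured or the exit status checked. A minimal sketch: the helper name `run_command` is made up for illustration, and the Python interpreter itself is used as a stand-in command because it exists on every platform.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as a list of arguments; return (exit code, stdout)."""
    completed = subprocess.run(args, capture_output=True, text=True)
    return completed.returncode, completed.stdout.strip()


# Roughly what `!python -c "print('hello from the shell')"` does in a cell.
exit_code, output = run_command(
    [sys.executable, "-c", "print('hello from the shell')"]
)
print(exit_code, output)
```

An exit code of 0 conventionally means the command succeeded; anything else signals a failure.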

**Example** Install `pandas`

In [3]:
!pip install pandas



Install `numpy`

In [4]:
!pip install numpy



Install `seaborn`

In [5]:
!pip install seaborn



You can pass any option that the command supports.

**Example** Upgrade matplotlib

In [6]:
!pip install --upgrade matplotlib



Upgrade seaborn

In [7]:
!pip install --upgrade seaborn



Upgrade nbformat

In [8]:
!pip install --upgrade nbformat
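
One caveat with `!pip`: the `pip` found on the shell's path is not guaranteed to belong to the Python environment the notebook kernel runs in. A common workaround is to call pip through the kernel's own interpreter (`sys.executable -m pip`). The sketch below only builds the command so that it can be inspected; `pip_command` is a hypothetical helper, not part of pip.

```python
import subprocess
import sys


def pip_command(package, upgrade=False):
    """Build a pip call that targets the interpreter running this kernel."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    command.append(package)
    return command


command = pip_command("seaborn", upgrade=True)
print(" ".join(command))

# To actually install, uncomment the next line (needs network access):
# subprocess.run(command, check=True)
```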



**Example** Create a new directory called `data_1`

In [9]:
!mkdir data_1

Create a new directory `data_2`

In [10]:
!mkdir data_2

Create a new directory `data_1/data_1_sub`

(On Windows, write the path as `data_1\data_1_sub`, as in the cell below. On Linux and macOS, keep the forward slash: `data_1/data_1_sub`.)

In [11]:
!mkdir data_1\data_1_sub
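
Because the path separator differs between Windows and Linux/macOS, a shell command that works on one system can fail on the other. Python's `pathlib` builds paths portably, so the same cell runs on both. A small sketch; it works inside a temporary directory (an assumption made here so that running it leaves no folders behind).

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workspace:
    # The `/` operator joins path parts with the right separator for the OS.
    sub_dir = Path(workspace) / "data_1" / "data_1_sub"

    # parents=True also creates data_1; exist_ok=True makes re-running safe.
    sub_dir.mkdir(parents=True, exist_ok=True)

    created = sub_dir.is_dir()
    print(sub_dir.name, "created:", created)
```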

We can run Bash (Linux shell) commands within a cell by starting the cell with `%%bash`.

**Example** Copy `data/python_config.py` to the `data_1` directory

In [12]:
%%bash
cp data/python_config.py data_1/python_config.py

Copy `data/text_config.txt` to `data_1`

In [13]:
%%bash
cp data/text_config.txt data_1/text_config.txt

Copy `data/notebook_config.ipynb` to `data_1/data_1_sub` under the name `nb_config.ipynb`

In [14]:
%%bash
cp data/notebook_config.ipynb data_1/data_1_sub/nb_config.ipynb
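
`%%bash` needs a Bash installation, which a plain Windows setup may not have. For simple copies, the standard-library `shutil` module is a portable alternative. A sketch, assuming a throwaway directory and a stand-in `notebook_config.ipynb` whose content is invented (the real file is not shown in this notebook).

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workspace:
    root = Path(workspace)
    source_dir = root / "data"
    target_dir = root / "data_1" / "data_1_sub"
    source_dir.mkdir()
    target_dir.mkdir(parents=True)

    # Stand-in for data/notebook_config.ipynb; the content is a placeholder.
    source = source_dir / "notebook_config.ipynb"
    source.write_text("{}\n")

    # Like `cp data/notebook_config.ipynb data_1/data_1_sub/nb_config.ipynb`.
    copied = Path(shutil.copy(source, target_dir / "nb_config.ipynb"))

    copy_exists = copied.is_file()
    same_content = copied.read_text() == source.read_text()
    print(copied.name, copy_exists, same_content)
```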

Let's practice deleting files and folders. **Always be cautious when deleting any file**

**Example** Delete the file `data_1/text_config.txt` (a single file)

In [15]:
%%bash
rm data_1/text_config.txt

Delete `data_1/python_config.py` (a single file)

In [16]:
%%bash
rm data_1/python_config.py

Delete the `data_2` directory

In [17]:
%%bash
rmdir data_2

Delete `data_1`, including its sub-directories

In [18]:
%%bash
rm -r data_1
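
The three deletion commands map onto three Python calls: `Path.unlink` for a single file (`rm`), `Path.rmdir` for an empty directory (`rmdir`), and `shutil.rmtree` for a directory with everything inside it (`rm -r`). A sketch that builds a scratch tree first, so nothing of value is touched:

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workspace:
    root = Path(workspace)
    (root / "data_1" / "data_1_sub").mkdir(parents=True)
    (root / "data_2").mkdir()
    config = root / "data_1" / "text_config.txt"
    config.write_text("placeholder\n")

    config.unlink()                  # rm data_1/text_config.txt
    (root / "data_2").rmdir()        # rmdir data_2 (fails if not empty)
    shutil.rmtree(root / "data_1")   # rm -r data_1

    remaining = sorted(p.name for p in root.iterdir())
    print(remaining)
```

None of these calls uses a recycle bin, so the same caution applies as with the shell commands.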

## Executing Notebooks from Command Line

Running a notebook from the command line makes it possible to automate its execution as part of a workflow or pipeline.
It can be combined with task-scheduling tools to perform routine tasks without manually opening and running the notebook.
When dealing with multiple notebooks, it also allows batch processing: several notebooks can be executed one after another without interacting with each one.

Here we look at `papermill`, a tool that executes notebooks from the command line. We will also see how to execute notebooks sequentially and in parallel. For this, we use three notebooks:

1. `analysis_workflow/1_data_access.ipynb`: Prepares the dataset `steinmetz_active.csv`
2. `analysis_workflow/2_contrast_level.ipynb`: Uses `steinmetz_active.csv` for contrast level analysis
3. `analysis_workflow/3_mouse_analysis.ipynb`: Uses `steinmetz_active.csv` for mouse analysis


Notebooks 2 and 3 do not depend on each other.
Both use the output from notebook 1 for their analysis. 
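
This dependency structure is what makes parallel execution possible: notebook 1 has to finish first, after which notebooks 2 and 3 can run side by side. A sketch of one way to arrange that from Python, using the `papermill` command line through `subprocess` and a thread pool. The helpers `papermill_command` and `run_workflow` and the `output_N.ipynb` names are illustrative assumptions. The last lines are a dry run that records the commands instead of executing them, so the sketch runs even without `papermill` installed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def papermill_command(input_path, output_path):
    """Build the command line: papermill <input notebook> <output notebook>."""
    return ["papermill", input_path, output_path]


def run_workflow(run=lambda cmd: subprocess.run(cmd, check=True)):
    """Run notebook 1 first, then notebooks 2 and 3 side by side."""
    first = papermill_command(
        "analysis_workflow/1_data_access.ipynb", "data_analysis/output_1.ipynb"
    )
    independent = [
        papermill_command(
            "analysis_workflow/2_contrast_level.ipynb",
            "data_analysis/output_2.ipynb",
        ),
        papermill_command(
            "analysis_workflow/3_mouse_analysis.ipynb",
            "data_analysis/output_3.ipynb",
        ),
    ]

    run(first)  # must finish before the others start
    with ThreadPoolExecutor(max_workers=len(independent)) as pool:
        # list() waits for both workers and re-raises any error from them.
        list(pool.map(run, independent))


# Dry run: record the commands instead of executing them.
recorded = []
run_workflow(run=recorded.append)
print(recorded[0][1])
```

Threads are enough here because each worker only waits for an external process; calling `run_workflow()` without arguments would launch the real commands.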

Feel free to go through the content. 
**You do not have to know the code in each of the notebooks to follow the exercises.**

**Example** Execute `analysis_workflow/1_data_access.ipynb` as `output.ipynb` and examine the `data_analysis` directory.

In [19]:
!papermill analysis_workflow/1_data_access.ipynb data_analysis/output.ipynb

Input Notebook:  analysis_workflow/1_data_access.ipynb
Output Notebook: data_analysis/output.ipynb

Executing:   0%|          | 0/12 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   8%|8         | 1/12 [00:02<00:30,  2.75s/cell]
Executing:  17%|#6        | 2/12 [00:03<00:15,  1.57s/cell]
Executing:  33%|###3      | 4/12 [00:05<00:08,  1.07s/cell]
Executing:  92%|#########1| 11/12 [00:05<00:00,  3.70cell/s]
Executing: 100%|##########| 12/12 [00:05<00:00,  2.10cell/s]


It has created the `steinmetz_active.csv` file. `output.ipynb` has the same content as the input notebook, together with the outputs of this run.

Execute `analysis_workflow/2_contrast_level.ipynb` as `output.ipynb` and examine `data_analysis` directory.

In [20]:
!papermill analysis_workflow/2_contrast_level.ipynb data_analysis/output.ipynb

Input Notebook:  analysis_workflow/2_contrast_level.ipynb
Output Notebook: data_analysis/output.ipynb

Executing:   0%|          | 0/18 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   6%|5         | 1/18 [00:02<00:49,  2.91s/cell]
Executing:  11%|#1        | 2/18 [00:07<01:06,  4.16s/cell]
Executing:  33%|###3      | 6/18 [00:08<00:11,  1.03cell/s]
Executing:  67%|######6   | 12/18 [00:08<00:02,  2.64cell/s]
Executing:  89%|########8 | 16/18 [00:08<00:00,  4.03cell/s]
Executing: 100%|##########| 18/18 [00:09<00:00,  1.86cell/s]


Execute `analysis_workflow/3_mouse_analysis.ipynb` as `output.ipynb` and examine `data_analysis` directory.

In [21]:
!papermill analysis_workflow/3_mouse_analysis.ipynb data_analysis/output.ipynb

Input Notebook:  analysis_workflow/3_mouse_analysis.ipynb
Output Notebook: data_analysis/output.ipynb

Executing:   0%|          | 0/12 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   8%|8         | 1/12 [00:02<00:27,  2.49s/cell]
Executing:  17%|#6        | 2/12 [00:06<00:36,  3.63s/cell]
Executing:  50%|#####     | 6/12 [00:07<00:05,  1.17cell/s]
Executing: 100%|##########| 12/12 [00:07<00:00,  2.89cell/s]
Executing: 100%|##########| 12/12 [00:08<00:00,  1.44cell/s]


Delete `data_analysis/steinmetz_active.csv` file.

Execute `analysis_workflow/3_mouse_analysis.ipynb` as `output.ipynb` and examine `data_analysis` directory. What do you see?

In [22]:
# !papermill analysis_workflow/3_mouse_analysis.ipynb data_analysis/output.ipynb

The cell output reports an error. In `data_analysis/output.ipynb`, a large red error message appears at the top of the notebook, and a second red message marks the cell where execution failed.
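
A long traceback is a noisy way to learn that an input file is missing. One option is to check for the files a notebook depends on before launching it. A sketch; `missing_inputs` is a hypothetical helper, and a temporary file stands in for `data_analysis/steinmetz_active.csv`.

```python
import tempfile
from pathlib import Path


def missing_inputs(required_files):
    """Return the required files that do not exist, as strings."""
    return [str(path) for path in map(Path, required_files) if not path.exists()]


with tempfile.TemporaryDirectory() as workspace:
    csv_path = Path(workspace) / "steinmetz_active.csv"

    before = missing_inputs([csv_path])   # file not created yet
    csv_path.write_text("placeholder\n")  # stand-in for notebook 1's output
    after = missing_inputs([csv_path])

    print(len(before), len(after))
```

If the returned list is not empty, the run can be skipped with a short message naming the missing files.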

If you do not need a separate output file, pass the same path as both input and output. The notebook is then executed in place.

**Example** Execute `analysis_workflow/1_data_access.ipynb` in place

In [23]:
!papermill analysis_workflow/1_data_access.ipynb analysis_workflow/1_data_access.ipynb

Input Notebook:  analysis_workflow/1_data_access.ipynb
Output Notebook: analysis_workflow/1_data_access.ipynb

Executing:   0%|          | 0/12 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   8%|8         | 1/12 [00:02<00:28,  2.63s/cell]
Executing:  17%|#6        | 2/12 [00:03<00:14,  1.43s/cell]
Executing:  33%|###3      | 4/12 [00:04<00:07,  1.02cell/s]
Executing: 100%|##########| 12/12 [00:04<00:00,  4.31cell/s]
Executing: 100%|##########| 12/12 [00:05<00:00,  2.31cell/s]


Execute `analysis_workflow/2_contrast_level.ipynb` in place

In [24]:
!papermill analysis_workflow/2_contrast_level.ipynb analysis_workflow/2_contrast_level.ipynb

Input Notebook:  analysis_workflow/2_contrast_level.ipynb
Output Notebook: analysis_workflow/2_contrast_level.ipynb

Executing:   0%|          | 0/18 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   6%|5         | 1/18 [00:02<00:42,  2.48s/cell]
Executing:  11%|#1        | 2/18 [00:07<00:59,  3.69s/cell]
Executing:  39%|###8      | 7/18 [00:07<00:08,  1.37cell/s]
Executing:  78%|#######7  | 14/18 [00:07<00:01,  3.47cell/s]
Executing: 100%|##########| 18/18 [00:07<00:00,  4.75cell/s]
Executing: 100%|##########| 18/18 [00:08<00:00,  2.13cell/s]


Execute `analysis_workflow/3_mouse_analysis.ipynb` in place

In [25]:
!papermill analysis_workflow/3_mouse_analysis.ipynb analysis_workflow/3_mouse_analysis.ipynb

Input Notebook:  analysis_workflow/3_mouse_analysis.ipynb
Output Notebook: analysis_workflow/3_mouse_analysis.ipynb

Executing:   0%|          | 0/12 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   8%|8         | 1/12 [00:02<00:27,  2.50s/cell]
Executing:  17%|#6        | 2/12 [00:06<00:36,  3.65s/cell]
Executing:  67%|######6   | 8/12 [00:07<00:02,  1.61cell/s]
Executing: 100%|##########| 12/12 [00:07<00:00,  2.62cell/s]
Executing: 100%|##########| 12/12 [00:08<00:00,  1.43cell/s]


**Example** Execute `analysis_workflow/1_data_access.ipynb` and `analysis_workflow/2_contrast_level.ipynb` sequentially

In [26]:
!papermill analysis_workflow/1_data_access.ipynb data_analysis/output_1.ipynb
!papermill analysis_workflow/2_contrast_level.ipynb data_analysis/output_2.ipynb

Input Notebook:  analysis_workflow/1_data_access.ipynb
Output Notebook: data_analysis/output_1.ipynb

Executing:   0%|          | 0/12 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   8%|8         | 1/12 [00:02<00:27,  2.50s/cell]
Executing:  17%|#6        | 2/12 [00:03<00:13,  1.39s/cell]
Executing:  33%|###3      | 4/12 [00:04<00:07,  1.03cell/s]
Executing:  83%|########3 | 10/12 [00:04<00:00,  3.62cell/s]
Executing: 100%|##########| 12/12 [00:05<00:00,  2.34cell/s]
Input Notebook:  analysis_workflow/2_contrast_level.ipynb
Output Notebook: data_analysis/output_2.ipynb

Executing:   0%|          | 0/18 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   6%|5         | 1/18 [00:02<00:42,  2.49s/cell]
Executing:  11%|#1        | 2/18 [00:07<00:59,  3.70s/cell]
Executing:  39%|###8      | 7/18 [00:07<00:08,  1.37cell/s]
Executing:  83%|########3 | 15/18 [00:07<00:00,  3.77cell/s]
Executing: 100%|##########| 18/18 [00:08<00:00,  2.07cell/s]
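
Each `!` line in a cell is run on its own, so a later notebook still starts even if an earlier one failed. When the later steps depend on the earlier ones, it can be preferable to stop at the first failure. A sketch of that idea with `subprocess`; `run_in_order` is a hypothetical helper, and short Python one-liners stand in for the `papermill` calls so that the example runs anywhere.

```python
import subprocess
import sys


def run_in_order(commands):
    """Run commands one by one; stop at the first non-zero exit code.

    Returns the number of commands that completed successfully.
    """
    completed = 0
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            print("stopped at:", " ".join(command))
            break
        completed += 1
    return completed


# Stand-ins for three papermill calls; the second one fails on purpose.
ok = [sys.executable, "-c", "pass"]
fail = [sys.executable, "-c", "import sys; sys.exit(1)"]
finished = run_in_order([ok, fail, ok])
print(finished)
```

In real use, each entry would be a list such as `["papermill", "<input>.ipynb", "<output>.ipynb"]`.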


Execute `analysis_workflow/1_data_access.ipynb` and `analysis_workflow/3_mouse_analysis.ipynb` sequentially

In [27]:
!papermill analysis_workflow/1_data_access.ipynb data_analysis/output_1.ipynb
!papermill analysis_workflow/3_mouse_analysis.ipynb data_analysis/output_3.ipynb

Input Notebook:  analysis_workflow/1_data_access.ipynb
Output Notebook: data_analysis/output_1.ipynb

Executing:   0%|          | 0/12 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   8%|8         | 1/12 [00:02<00:28,  2.62s/cell]
Executing:  17%|#6        | 2/12 [00:03<00:14,  1.46s/cell]
Executing:  33%|###3      | 4/12 [00:04<00:08,  1.00s/cell]
Executing: 100%|##########| 12/12 [00:04<00:00,  4.22cell/s]
Executing: 100%|##########| 12/12 [00:05<00:00,  2.28cell/s]
Input Notebook:  analysis_workflow/3_mouse_analysis.ipynb
Output Notebook: data_analysis/output_3.ipynb

Executing:   0%|          | 0/12 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   8%|8         | 1/12 [00:02<00:27,  2.46s/cell]
Executing:  17%|#6        | 2/12 [00:06<00:36,  3.66s/cell]
Executing:  50%|#####     | 6/12 [00:07<00:05,  1.17cell/s]
Executing:  83%|########3 | 10/12 [00:07<00:00,  2.37cell/s]
Executing: 100%|##########| 12/12 [00:08<00:00,  1.45cell/s]


Execute all three notebooks one after the other

In [28]:
!papermill analysis_workflow/1_data_access.ipynb data_analysis/output_1.ipynb
!papermill analysis_workflow/2_contrast_level.ipynb data_analysis/output_2.ipynb
!papermill analysis_workflow/3_mouse_analysis.ipynb data_analysis/output_3.ipynb

Input Notebook:  analysis_workflow/1_data_access.ipynb
Output Notebook: data_analysis/output_1.ipynb

Executing:   0%|          | 0/12 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   8%|8         | 1/12 [00:02<00:27,  2.46s/cell]
Executing:  17%|#6        | 2/12 [00:03<00:13,  1.36s/cell]
Executing:  33%|###3      | 4/12 [00:04<00:07,  1.04cell/s]
Executing: 100%|##########| 12/12 [00:04<00:00,  4.38cell/s]
Executing: 100%|##########| 12/12 [00:05<00:00,  2.39cell/s]
Input Notebook:  analysis_workflow/2_contrast_level.ipynb
Output Notebook: data_analysis/output_2.ipynb

Executing:   0%|          | 0/18 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   6%|5         | 1/18 [00:02<00:42,  2.52s/cell]
Executing:  11%|#1        | 2/18 [00:07<00:59,  3.71s/cell]
Executing:  39%|###8      | 7/18 [00:07<00:08,  1.37cell/s]
Executing:  83%|########3 | 15/18 [00:07<00:00,  3.76cell/s]
Executing: 100%|##########| 18/18 [00:08<00:00,  2.00cell/s]

## Turning Scripts into Notebooks

Converting a script into a Jupyter notebook can be valuable for enhancing code readability, facilitating interactive analysis, and improving collaboration. 
Notebooks provide an environment where code, explanations, and results are combined in a clear, organized format. 
This allows users to document their thought process alongside the code, include visualizations directly within the workflow, and run individual code cells for step-by-step debugging or exploration.

**Example** Create `script.py` with the code below and convert it to a notebook. What does the resulting notebook look like?

```python
num_mouse = 10
num_contrast_left = 4
num_contrast_right = 4
```

In [46]:
!jupytext --to notebook script.py

[jupytext] Reading script.py in format py
[jupytext] Writing script.ipynb (destination file replaced [use --update to preserve cell outputs and ids])


Create `script.py` with the code below and convert it to a notebook. What does the resulting notebook look like?

```python
num_mouse = 10
num_contrast_left = 4
num_contrast_right = 4

print(num_mouse)
```

In [47]:
!jupytext --to notebook script.py

[jupytext] Reading script.py in format py
[jupytext] Writing script.ipynb (destination file replaced [use --update to preserve cell outputs and ids])


Create `script.py` with the code below and convert it to a notebook. What does the resulting notebook look like?

```python
num_mouse = 10
num_contrast_left = 4
num_contrast_right = 4

num_mouse
```

In [65]:
!jupytext --to notebook script.py

[jupytext] Reading script.py in format py
[jupytext] Writing script.ipynb (destination file replaced [use --update to preserve cell outputs and ids])


Let's practice with Markdown.

**Example** Create a Python file `script.py` containing the Markdown text "This is a markdown cell"

```python
# %% [markdown]
# This is a markdown cell
```

In [49]:
!jupytext --to notebook script.py

[jupytext] Reading script.py in format py
[jupytext] Writing script.ipynb (destination file replaced [use --update to preserve cell outputs and ids])


Create a Python file `script.py` with Markdown text containing a large title and a smaller subtitle. Convert it to a notebook and examine the result.

In [48]:
!jupytext --to notebook script.py

[jupytext] Reading script.py in format py
[jupytext] Writing script.ipynb (destination file replaced [use --update to preserve cell outputs and ids])


Create a Python file `script.py` with several lines of Markdown text that include a URL. Convert it to a notebook and examine the result.

In [50]:
!jupytext --to notebook script.py

[jupytext] Reading script.py in format py
[jupytext] Writing script.ipynb (destination file replaced [use --update to preserve cell outputs and ids])


**Example** Create `script.py` with a title "Data Analysis" and `a=10`. Convert it to a notebook. What does the resulting notebook look like?

```python
# %% [markdown]
# # Data Analysis

# %%
a = 10
```

In [51]:
!jupytext --to notebook script.py

[jupytext] Reading script.py in format py
[jupytext] Writing script.ipynb (destination file replaced [use --update to preserve cell outputs and ids])
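
The script does not have to be typed by hand in an editor; it can also be written from a notebook cell. A sketch, assuming jupytext's percent format as used above (`# %%` starts a code cell, `# %% [markdown]` starts a Markdown cell whose lines are written as comments). It writes to a temporary directory and only calls `jupytext` if the command is found on the path.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

SCRIPT = """\
# %% [markdown]
# # Data Analysis

# %%
a = 10
"""

with tempfile.TemporaryDirectory() as workspace:
    script_path = Path(workspace) / "script.py"
    script_path.write_text(SCRIPT)

    # Collect the cell markers: one Markdown cell and one code cell.
    markers = [
        line for line in script_path.read_text().splitlines()
        if line.startswith("# %%")
    ]
    print(markers)

    # Convert only when the jupytext command is installed.
    if shutil.which("jupytext"):
        subprocess.run(
            ["jupytext", "--to", "notebook", str(script_path)], check=True
        )
```

Inside a Markdown cell, the first `# ` is the Python comment marker and the second `#` is the Markdown heading marker, which is why the title line starts with `# #`.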


Create `script.py` with a title "Data Analysis" and `a=10`, `b=100`. Convert it to a notebook. What does the resulting notebook look like?

In [52]:
!jupytext --to notebook script.py

[jupytext] Reading script.py in format py
[jupytext] Writing script.ipynb (destination file replaced [use --update to preserve cell outputs and ids])


Create `script.py` with a title "Data Analysis", a subtitle "2024-10-24", and `a=10`, `b=100`, `c=a+b`, then display `c`. Convert it to a notebook. What does the resulting notebook look like?

In [53]:
!jupytext --to notebook script.py

[jupytext] Reading script.py in format py
[jupytext] Writing script.ipynb (destination file replaced [use --update to preserve cell outputs and ids])


(Optional) Add Markdown text to `data/analysis.py` and convert it to a Jupyter notebook.

## Turning Notebooks into Other Formats

Sometimes we want to convert Jupyter notebooks to other formats, most often Python scripts or HTML.
Python scripts are easier to put under version control and to reuse in larger code bases.

Converting to HTML makes it possible to embed the notebook in websites or presentations, which helps when communicating data and findings.

**Example** Convert `analysis_workflow/1_data_access.ipynb` to python script

In [62]:
!jupyter nbconvert --to script analysis_workflow/1_data_access.ipynb

[NbConvertApp] Converting notebook analysis_workflow/1_data_access.ipynb to script
[NbConvertApp] Writing 929 bytes to analysis_workflow\1_data_access.py


Convert `analysis_workflow/2_contrast_level.ipynb` to python script

In [63]:
!jupyter nbconvert --to script analysis_workflow/2_contrast_level.ipynb

[NbConvertApp] Converting notebook analysis_workflow/2_contrast_level.ipynb to script
[NbConvertApp] Writing 1296 bytes to analysis_workflow\2_contrast_level.py


Convert `analysis_workflow/3_mouse_analysis.ipynb` to python script

In [64]:
!jupyter nbconvert --to script analysis_workflow/3_mouse_analysis.ipynb

[NbConvertApp] Converting notebook analysis_workflow/3_mouse_analysis.ipynb to script
[NbConvertApp] Writing 759 bytes to analysis_workflow\3_mouse_analysis.py
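
A `.ipynb` file is plain JSON, which is what makes conversions like these possible. To make that concrete, the sketch below pulls the code cells out of a tiny hand-made notebook using only the standard library. It is a rough illustration of the idea, not how `nbconvert` is implemented; real notebooks carry more metadata, and `nbconvert` does considerably more work.

```python
import json
import tempfile
from pathlib import Path

# A tiny hand-written notebook in the version 4 JSON layout.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["a = 10\n", "b = 100"],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print(a + b)"],
        },
    ],
}


def code_cells(notebook_path):
    """Return the source of every code cell as a list of strings."""
    data = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    sources = []
    for cell in data["cells"]:
        if cell["cell_type"] == "code":
            # `source` may be stored as one string or as a list of lines.
            source = cell["source"]
            sources.append(source if isinstance(source, str) else "".join(source))
    return sources


with tempfile.TemporaryDirectory() as workspace:
    path = Path(workspace) / "tiny.ipynb"
    path.write_text(json.dumps(notebook), encoding="utf-8")
    script = "\n\n".join(code_cells(path))
    print(script)
```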


**Example** Convert `analysis_workflow/1_data_access.ipynb` to HTML and open the result in a browser to examine it.

In [54]:
!jupyter nbconvert --to html analysis_workflow/1_data_access.ipynb

[NbConvertApp] Converting notebook analysis_workflow/1_data_access.ipynb to html
[NbConvertApp] Writing 285237 bytes to analysis_workflow\1_data_access.html


Convert `analysis_workflow/2_contrast_level.ipynb` to HTML and open in new browser to examine.

In [55]:
!jupyter nbconvert --to html analysis_workflow/2_contrast_level.ipynb

[NbConvertApp] Converting notebook analysis_workflow/2_contrast_level.ipynb to html
  {%- elif type == 'text/vnd.mermaid' -%}
[NbConvertApp] Writing 332420 bytes to analysis_workflow\2_contrast_level.html


Convert `analysis_workflow/3_mouse_analysis.ipynb` to HTML and open in new browser to examine.

In [56]:
!jupyter nbconvert --to html analysis_workflow/3_mouse_analysis.ipynb

[NbConvertApp] Converting notebook analysis_workflow/3_mouse_analysis.ipynb to html
  {%- elif type == 'text/vnd.mermaid' -%}
[NbConvertApp] Writing 322392 bytes to analysis_workflow\3_mouse_analysis.html
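
Typing one command per notebook becomes tedious as the number of notebooks grows. A sketch of building the same `jupyter nbconvert --to html` command for every notebook in a folder; `html_commands` is a hypothetical helper, and empty placeholder files in a temporary directory stand in for the real notebooks, so nothing is converted here.

```python
import tempfile
from pathlib import Path


def html_commands(folder):
    """One `jupyter nbconvert --to html` command per notebook in `folder`."""
    return [
        ["jupyter", "nbconvert", "--to", "html", str(notebook)]
        for notebook in sorted(Path(folder).glob("*.ipynb"))
    ]


with tempfile.TemporaryDirectory() as workspace:
    # Empty placeholder files standing in for the three workflow notebooks.
    for name in ("1_data_access", "2_contrast_level", "3_mouse_analysis"):
        (Path(workspace) / f"{name}.ipynb").touch()

    commands = html_commands(workspace)
    names = [Path(command[-1]).name for command in commands]
    print(names)

# To run them for real: subprocess.run(command, check=True) for each command.
```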


**Example** Execute `analysis_workflow/1_data_access.ipynb` and convert to HTML

In [58]:
!papermill analysis_workflow/1_data_access.ipynb data_analysis/output_1.ipynb
!jupyter nbconvert --to html data_analysis/output_1.ipynb

Input Notebook:  analysis_workflow/1_data_access.ipynb
Output Notebook: data_analysis/output_1.ipynb

Executing:   0%|          | 0/12 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   8%|8         | 1/12 [00:02<00:27,  2.50s/cell]
Executing:  17%|#6        | 2/12 [00:03<00:13,  1.39s/cell]
Executing:  33%|###3      | 4/12 [00:04<00:08,  1.01s/cell]
Executing:  83%|########3 | 10/12 [00:04<00:00,  3.51cell/s]
Executing: 100%|##########| 12/12 [00:07<00:00,  1.54cell/s]
[NbConvertApp] Converting notebook data_analysis/output_1.ipynb to html
[NbConvertApp] Writing 285232 bytes to data_analysis\output_1.html


Execute `analysis_workflow/2_contrast_level.ipynb` and convert to HTML

In [60]:
!papermill analysis_workflow/2_contrast_level.ipynb data_analysis/output_2.ipynb
!jupyter nbconvert --to html data_analysis/output_2.ipynb

Input Notebook:  analysis_workflow/2_contrast_level.ipynb
Output Notebook: data_analysis/output_2.ipynb

Executing:   0%|          | 0/18 [00:00<?, ?cell/s]Executing notebook with kernel: python3

Executing:   6%|5         | 1/18 [00:02<00:43,  2.56s/cell]
Executing:  11%|#1        | 2/18 [00:07<01:03,  3.95s/cell]
Executing:  28%|##7       | 5/18 [00:07<00:14,  1.15s/cell]
Executing:  61%|######1   | 11/18 [00:07<00:02,  2.56cell/s]
Executing:  89%|########8 | 16/18 [00:07<00:00,  4.36cell/s]
Executing: 100%|##########| 18/18 [00:08<00:00,  2.07cell/s]
[NbConvertApp] Converting notebook data_analysis/output_2.ipynb to html
  {%- elif type == 'text/vnd.mermaid' -%}
[NbConvertApp] Writing 332412 bytes to data_analysis\output_2.html


(Optional) Use `nbconvert` to convert a notebook to PDF and Markdown.