# Reproducible Data Science in Python &mdash; Index

The goal of this tutorial is to explore ways of doing **reproducible data science** in Python. It begins with a brief discussion of the recent history and theory of reproducibility to motivate why it is important and what its benefits are. As reproducibility has gained importance, many tools have emerged to address this need. These tools are built on a shared set of technical building blocks, and the second section provides a cursory introduction to these building blocks for implementing reproducible data science. Finally, the theory section concludes with a short survey of the current landscape of tools.

The hands-on section of the tutorial focuses on doing reproducible data science using [RENKU](https://renkulab.io).

# Schedule

<table style="font-size: 14px; margin: 10px;">
    <tbody>
        <tr>
            <th>Introduction (1h 10m)</th>
            <td></td>
            <td></td>
        </tr>        
        <tr>
            <th>15 min</th>
            <td><a href="./00-Theory.ipynb">Background &amp; Theory</a></td>
            <td style="text-align: left">Terminology, history, and philosophy of reproducibility</td>
        </tr>
        <tr> 
            <th>30 min</th>
            <td><a href="./01-BuildingBlocks.ipynb">Building Blocks</a></td>
            <td style="text-align: left">Building blocks for achieving reproducibility</td>
        </tr>
        <tr>
            <th>15 min</th>
            <td><a href="./02-Tools.ipynb">Tools</a></td>
            <td style="text-align: left">Survey of the current tool landscape</td>
        </tr>
        <tr>
            <th>10 min</th>
            <td>Break</td>
            <td></td>
        </tr>
        <tr>
            <th>Hands-on (1h 30m)</th>
            <td></td>
            <td></td>
        </tr>
        <tr>
            <th>10 min</th>
            <td>Preview</td>
            <td style="text-align: left">What are we trying to accomplish?</td>
        </tr>
        <tr>
            <th>30 min</th>
            <td>Starting</td>
            <td style="text-align: left">Starting a project, inspecting + importing data</td>
        </tr>
        <tr>
            <th>30 min</th>
            <td>Iterating</td>
            <td style="text-align: left">Adding data, improving analysis</td>
        </tr>
        <tr>
            <th>20 min</th>
            <td>Reflection</td>
            <td style="text-align: left">What is the benefit? How much effort was it?</td>
        </tr>
     </tbody>
</table>

# Set-up


## Set-up Hosted

Please follow the set-up instructions in the [project README-renkulab.md](../README-renkulab.md).


## Set-up Local

Please follow the set-up instructions in the [project README.md](../README.md).

# Hands-on

There are four versions of the hands-on that work through the same task in different environments. The two **local** versions use renku installed on your computer; the two **hosted** versions use our public https://renkulab.io server as the execution environment. In each execution environment, the **plain** version implements the code in normal Python files, while the **notebook** version implements the code in Jupyter Notebooks. Pick one from the matrix below.


<table style="font-size: 14px; margin: 10px;">
    <thead>
        <tr>
            <th></th>
            <th>Plain</th>
            <th>Notebook</th> 
        </tr>
    </thead>
    <tbody>
        <tr> 
            <th>Local</th>
            <td><a href="./hands-on/local_plain/index.ipynb">Local/Plain</a></td>
            <td><a href="./hands-on/local_notebook/index.ipynb">Local/Notebook</a></td>
        </tr>
        <tr> 
            <th>Hosted</th>
            <td><a href="./hands-on/hosted_plain/index.ipynb">Hosted/Plain</a></td>
            <td><a href="./hands-on/hosted_notebook/index.ipynb">Hosted/Notebook</a></td>
        </tr>
     </tbody>
</table>