Data science workflow demonstration with MLflow, Papermill, and Jupyter.

genert/data-science-workflow-demonstration

Data Science Workflow Demonstration

This repository demonstrates a typical data science workflow. Keep in mind that this is just one approach among many; use whatever works best for your team(s).

Before you start

You should have already installed Anaconda.
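
As a quick sanity check before continuing, the following minimal sketch looks for a `conda` executable on your `PATH`. The helper name `find_conda` is hypothetical and not part of this repository; it only uses the Python standard library.

```python
import shutil


def find_conda():
    """Return the full path of the conda executable, or None if it is not on PATH."""
    return shutil.which("conda")


if __name__ == "__main__":
    path = find_conda()
    if path is None:
        print("conda was not found on PATH; install Anaconda first.")
    else:
        print("conda found at: " + path)
```

If the check reports that conda is missing, it may simply mean that the Anaconda installation directory has not been added to your `PATH` yet.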

Getting started

First, configure Conda:

conda config --add channels conda-forge
conda config --set channel_priority strict

Second, create an environment called data-science-workflow-demonstration:

conda create --name "data-science-workflow-demonstration" python=3.7.9 mlflow scikit-learn jupyterlab papermill matplotlib statsmodels ipywidgets

Conda checks which additional packages ("dependencies") are needed and asks whether you want to proceed:

Proceed ([y]/n)? y

Type "y" and press Enter to proceed.

After everything has been installed, activate the environment data-science-workflow-demonstration:

conda activate data-science-workflow-demonstration
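
To confirm that the activated environment contains the packages requested in the `conda create` command, a small check along these lines may help. It is a sketch, not part of the repository: the helper `missing_packages` is a hypothetical name, and the list uses import names (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the packages listed in the `conda create` command.
# Note: the scikit-learn package is imported as `sklearn`.
REQUIRED = [
    "mlflow",
    "sklearn",
    "jupyterlab",
    "papermill",
    "matplotlib",
    "statsmodels",
    "ipywidgets",
]


def missing_packages(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it with the environment active; if anything is reported missing, the environment may not have been activated in the current terminal.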

Start JupyterLab

cd notebooks && jupyter lab

Run the cells in runner.ipynb!

You should see the following:

[Screenshot: JupyterLab]
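
The contents of runner.ipynb are not reproduced in this README, so the following is only a sketch of the general pattern such a runner could follow: execute a parameterized notebook with Papermill and record the run with MLflow. The template name `experiment.ipynb`, the parameters `alpha` and `l1_ratio`, the `runs` output folder, and both helper functions are assumptions made for illustration.

```python
from pathlib import Path


def output_path_for(template, params, out_dir="runs"):
    """Build a distinct output notebook path for one parameter set."""
    suffix = "_".join("{}-{}".format(k, v) for k, v in sorted(params.items()))
    name = "{}_{}.ipynb".format(Path(template).stem, suffix)
    return str(Path(out_dir) / name)


def run_once(template, params, out_dir="runs"):
    """Execute the template with Papermill and record the run in MLflow."""
    # Imported lazily so output_path_for works without these packages installed.
    import mlflow
    import papermill as pm

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    output = output_path_for(template, params, out_dir)
    with mlflow.start_run():
        mlflow.log_params(params)  # record the inputs of this run
        pm.execute_notebook(template, output, parameters=params)
        mlflow.log_artifact(output)  # keep the executed notebook with the run
    return output


if __name__ == "__main__":
    # Hypothetical template and parameters; adjust to your own notebook.
    print(output_path_for("experiment.ipynb", {"alpha": 0.5, "l1_ratio": 0.1}))
```

Calling `run_once` once per parameter set would produce one executed notebook and one MLflow run per combination, which is what makes the runs comparable later in the MLflow UI.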

Start MLflow (in another terminal)

cd notebooks && mlflow ui

You should see the following:

[Screenshot: MLflow UI]
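
If the UI comes up empty, a likely cause is that `mlflow ui` was started from a different folder than the one where the runs were logged. Assuming the default file-based tracking store, runs are written to an `mlruns` folder in the working directory, which is why the command above changes into notebooks first. The sketch below lists the experiment folders it finds there; `list_experiment_dirs` is a hypothetical helper, not part of MLflow.

```python
from pathlib import Path


def list_experiment_dirs(base="."):
    """Return experiment folder names under <base>/mlruns, or [] if it is absent."""
    mlruns = Path(base) / "mlruns"
    if not mlruns.is_dir():
        return []
    # Skip hidden folders such as .trash, which MLflow uses for deleted items.
    return sorted(
        p.name for p in mlruns.iterdir()
        if p.is_dir() and not p.name.startswith(".")
    )


if __name__ == "__main__":
    found = list_experiment_dirs("notebooks")
    if found:
        print("Experiment folders: " + ", ".join(found))
    else:
        print("No mlruns folder with experiments found under notebooks/.")
```

By default the UI is served locally on port 5000, so once experiments are listed here they should also appear at http://127.0.0.1:5000.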

Profit!
