disdat-examples

Installation and Notebook examples

Install dependencies

pip install -e .

Initialize disdat

dsdt init

Start Jupyter Notebooks

jupyter notebook

Open example notebooks

Running from the CLI and dockerization

Run a simple example pipeline from the command line to create an output bundle named return_targets:

cd disdat-examples
dsdt apply pipelines.return_targets.ReturnTargets
dsdt ls -v return_targets
NAME                	PROC_NAME           	OWNER   	DATE              	COMMITTED	UUID                                    	TAGS
return_targets      	ReturnTargets____ca7a191361	kyocum  	06-04-19 20:13:34 	False   	bef67232-86b6-4847-a2db-bf55eadc674b

Now dockerize the pipeline (assuming you remain in the repo's top-level directory and Docker is installed on your system).

dsdt dockerize .

Now run the dockerized version of the pipeline.

dsdt run -f . pipelines.return_targets.ReturnTargets
dsdt ls -v return_targets
NAME                	PROC_NAME           	OWNER   	DATE              	COMMITTED	UUID                                    	TAGS
return_targets      	ReturnTargets____ca7a191361	root    	06-04-19 20:17:26 	False   	96abb085-bbdd-48b6-917d-d51d2c8ac744
return_targets      	ReturnTargets____ca7a191361	kyocum  	06-04-19 20:13:34 	False   	bef67232-86b6-4847-a2db-bf55eadc674b

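Notice that dsdt run, like dsdt dockerize, requires the directory that contains setup.py (here, the current directory). We also added -f to force the entire pipeline to re-run. The listing now shows two return_targets bundles: the new one produced by the containerized run (owner root) and the earlier one from the local run.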
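To compare the local and containerized runs in a script, the listing can be split into records. The following is a minimal sketch that assumes the columns printed by dsdt ls -v are separated by tabs, as in the output shown above; parse_ls is a hypothetical helper and not part of Disdat.

```python
from __future__ import annotations


def parse_ls(output: str) -> list[dict[str, str]]:
    """Split tab-separated `dsdt ls -v` text into one dict per bundle row."""
    lines = [line for line in output.splitlines() if line.strip()]
    header = [name.strip() for name in lines[0].split("\t")]
    rows = []
    for line in lines[1:]:
        cells = [cell.strip() for cell in line.split("\t")]
        # Rows may omit trailing empty columns (TAGS above), so pad them.
        cells += [""] * (len(header) - len(cells))
        rows.append(dict(zip(header, cells)))
    return rows


# Sample text copied from the listing above, with padding spaces dropped.
SAMPLE = (
    "NAME\tPROC_NAME\tOWNER\tDATE\tCOMMITTED\tUUID\tTAGS\n"
    "return_targets\tReturnTargets____ca7a191361\troot\t06-04-19 20:17:26\tFalse\t"
    "96abb085-bbdd-48b6-917d-d51d2c8ac744\n"
    "return_targets\tReturnTargets____ca7a191361\tkyocum\t06-04-19 20:13:34\tFalse\t"
    "bef67232-86b6-4847-a2db-bf55eadc674b\n"
)

rows = parse_ls(SAMPLE)
owners = [row["OWNER"] for row in rows]
print(owners)
```

Both rows share the same NAME and PROC_NAME; only OWNER, DATE, and UUID distinguish the two runs.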

Additional MNIST / Spacy Examples

The pipelines directory also contains the mnist.py and nlp_spacy.py pipelines.

Unlike the notebook examples above, we will run the MNIST and Spacy examples using the CLI.

Setup

Here we create a data context named example-context (the same one used in the examples above) into which we'll place our data.

$ dsdt context example-context
$ dsdt switch example-context

Example: MNIST

This example is adapted from the TensorFlow Keras example. We've broken it down into three steps in mnist.py (pipelines/mnist.py), which you will see as three classes:

GetTFDS: Downloads the MNIST TensorFlow dataset (tfds) and stores the files in a bundle named after the dataset, mnist.
Train: This PipeTask depends on the GetTFDS task and uses its output to train a simple Keras neural network. It stores the saved model in an output bundle called mnist-trained.
Evaluate: This PipeTask depends on both upstream tasks. It restores the model and evaluates it, returning a loss and an accuracy in its output bundle, mnist-evaluation.

To run all three steps, tell the Disdat CLI to execute the last step:

$ dsdt apply pipelines.mnist.Evaluate

[ . . . lots of output . . . ]

===== Luigi Execution Summary =====
Scheduled 4 tasks of which:
* 4 ran successfully:
    - 1 DriverTask(...)
    - 1 Evaluate(...)
    - 1 GetTFDS(...)
    - 1 Train(...)

This progress looks :) because there were no failed tasks or missing dependencies
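The summary reports all three pipeline classes even though only Evaluate was named on the command line. A plain-Python sketch of why: naming the last step pulls in everything it depends on, upstream first. The DEPENDS_ON table and execution_order function below are illustrative assumptions based on the class descriptions above. They are not how Disdat or Luigi schedule tasks internally, and the sketch omits the DriverTask that appears in the summary.

```python
# Dependencies as described for the three classes in mnist.py.
DEPENDS_ON = {
    "GetTFDS": [],
    "Train": ["GetTFDS"],
    "Evaluate": ["GetTFDS", "Train"],
}


def execution_order(target, depends_on):
    """Return the tasks needed for `target`, upstream tasks first.

    Assumes the dependency graph has no cycles.
    """
    order = []

    def visit(task):
        if task in order:
            return  # already scheduled through another path
        for upstream in depends_on[task]:
            visit(upstream)
        order.append(task)

    visit(target)
    return order


ORDER = execution_order("Evaluate", DEPENDS_ON)
print(ORDER)  # ['GetTFDS', 'Train', 'Evaluate']
```

Asking for "Train" instead would yield only GetTFDS and Train, which mirrors applying a task in the middle of the pipeline.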

You have now produced three bundles. Use dsdt ls to list them and dsdt cat to see what is inside each one, including all of the output files and values.

$ dsdt ls m.*
mnist-evaluation
mnist-trained
mnist
$ dsdt cat mnist-evaluation
[0.08208457 0.97430003]
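If you want the two values as numbers, the printed text can be unpacked in a couple of lines. This sketch assumes the order is loss followed by accuracy, as the description of Evaluate suggests; parse_evaluation is a hypothetical helper, not a Disdat function.

```python
def parse_evaluation(text):
    """Turn text like '[0.08208457 0.97430003]' into (loss, accuracy)."""
    # Assumption: the first value is the loss and the second the accuracy.
    loss, accuracy = (float(value) for value in text.strip().strip("[]").split())
    return loss, accuracy


loss, accuracy = parse_evaluation("[0.08208457 0.97430003]")
print(f"loss={loss:.4f} accuracy={accuracy:.2%}")  # loss=0.0821 accuracy=97.43%
```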

Example: Spacy

The Spacy example illustrates how you might include additional packages or data inside your Disdat container. In this case we have created a MANIFEST.in file which tells setuptools to include the data in pipelines/en_core_web.

This trivial example shows how to use pkg_resources (which ships with setuptools rather than the Python standard library) to locate the Spacy en_core_web data. You can run it with:

$ dsdt apply pipelines.nlp_spacy.SimpleNLP
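Looking up data that was packaged alongside the code can be sketched as follows. Note the swap: this sketch uses the standard-library importlib.resources (Python 3.9+) rather than pkg_resources, so it is not a copy of what nlp_spacy.py does. The function name packaged_data_dir is hypothetical, and the default arguments assume the data lives in pipelines/en_core_web as described above.

```python
from importlib.resources import files


def packaged_data_dir(package="pipelines", resource="en_core_web"):
    """Return the directory of data files shipped inside `package`.

    Raises FileNotFoundError if the directory was not included in the
    installed package (for example, if MANIFEST.in did not list it).
    """
    data_dir = files(package).joinpath(resource)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"{resource!r} not found inside package {package!r}")
    return data_dir


# Usage, assuming the pipelines package is installed with its data:
#     model_dir = packaged_data_dir()
#     print(model_dir)
```

Because the lookup goes through the installed package rather than a relative path, the same call should work both locally and inside the container, provided the data was included at install time.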
