
Introducing workflows #695

Merged

merged 38 commits into dipy:master on Jan 13, 2016

Conversation

matthieudumont
Contributor

Introducing the workflows structure with the median_otsu demonstration workflow


def median_otsu_bet(input_file, out_dir, save_masked=False, median_radius=4,
                    numpass=4, autocrop=False, vol_idx=None, dilate=None):

Contributor


docstring?

@Garyfallidis
Contributor

@chrisfilo, @oesteban and @arokem, with this workflow design I think it would be trivial to use nipype to call these workflows from dipy. Can you think about this and let me know your thoughts? @matthieudumont, I, and hopefully others will be working on making many of these available in dipy very soon. We want workflows that can be called very easily from other pipelines, but that also process multiple files and, hopefully, multiple parameters. The trick here is to have all workflows in dipy.workflows take string parameters. In this PR we give an example using median_otsu, an algorithm for scalp removal from EPI images. Let me know what you think. Thanks in advance.

@chrisgorgo
Contributor

Yes, it would be fairly easy to wrap such functions. I'm not sure "workflow" is the best name (since it suggests many processing steps and a long running time), but that's a minor point.

I know we have talked about this numerous times, but I would like to once again encourage you to write a nipype interface instead of using argparse. This way, by writing the code once, you get:

  • command line functionality
  • integration with Nipype
  • integration with CBrain (through boutiques)

@arokem
Contributor

arokem commented Aug 5, 2015

I'm with @chrisfilo on this one. It seems like a duplication of effort to do this here, and if we start down this path, we will eventually have to reinvent all the wheels that nipype has already invented.

For example, it's clear that there are already emerging patterns here that appear as boilerplate in this PR's code: checking whether output locations exist, whether the required files exist (and if not, what needs to be done), and so on.

It seems like a better use of time to write simplified APIs in dipy for processes that require many steps (e.g. registration with the SyN algorithm, see my attempts over on #673), and write the workflows in nipype.

@Garyfallidis
Contributor

Hi @chrisfilo. Thank you for your quick feedback. "Workflow" is used in different projects with different meanings. Nonetheless, I think it is a good name precisely because in dipy.workflows many functions/methods are called from the individual workflows. The workflow introduced here is one of the simplest, but there will be other, more complicated ones. There will also be some generic ones that let you run a complete dwi analysis with a single command.

About the nipype interfaces: as I said before, I am very happy to write nipype interfaces. But I also want to be as general as possible, so I would like to support both argparse and nipype, and to be able to call this from a web client, from another script, etc.

So what I would like you to tell me is the best way to write an interface for dipy.workflows.segment.median_otsu_flow.

Is this a good example to start from?
https://github.com/nipy/nipype/blob/master/nipype/interfaces/dipy/tracks.py

If I remember correctly, I need to write an InputSpec, an OutputSpec and a BaseInterface; is this correct?

So, assuming that all inputs and outputs are given as strings, can you give me a recommendation of how these classes would look?
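For concreteness, here is a rough sketch of what such an interface could look like. The import is guarded so the snippet degrades gracefully when nipype is absent; the class names, traits and output naming below are illustrative guesses, not dipy's or nipype's actual median_otsu interface.

```python
# Hypothetical sketch of a nipype interface wrapping a median_otsu-style
# workflow; all names here are illustrative, not dipy's real interface.
try:
    from nipype.interfaces.base import (BaseInterface, BaseInterfaceInputSpec,
                                        TraitedSpec, File, traits)
    HAVE_NIPYPE = True
except ImportError:
    HAVE_NIPYPE = False

if HAVE_NIPYPE:
    class MedianOtsuInputSpec(BaseInterfaceInputSpec):
        in_file = File(exists=True, mandatory=True, desc='input EPI volume')
        median_radius = traits.Int(4, usedefault=True,
                                   desc='median filter radius')
        numpass = traits.Int(4, usedefault=True,
                             desc='number of median filter passes')

    class MedianOtsuOutputSpec(TraitedSpec):
        out_file = File(desc='brain-extracted volume')

    class MedianOtsu(BaseInterface):
        input_spec = MedianOtsuInputSpec
        output_spec = MedianOtsuOutputSpec

        def _run_interface(self, runtime):
            # here one would call dipy.segment.mask.median_otsu and save
            # the masked volume next to the input
            return runtime

        def _list_outputs(self):
            outputs = self._outputs().get()
            outputs['out_file'] = self.inputs.in_file.replace(
                '.nii.gz', '_bet.nii.gz')
            return outputs
```

The `_run_interface`/`_list_outputs` split is nipype's standard BaseInterface pattern: the first does the work, the second declares where the results landed.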

@Garyfallidis
Contributor

Hi @arokem. I disagree that moving all the workflows into nipype would be a complete solution for dipy. I think we really need functions that handle I/O without extra dependencies. It will also be much easier to have our quality assurance workflows tested first by the dipy developers, and then submit the best ones to nipype. And these functions can be called from our visualization module right now, without waiting for a nipype release. I would like to play with a new technology before making radical decisions, so please do not rush to conclusions.

@Garyfallidis
Contributor

As for checking whether files exist: I think those checks are trivial to write. But our scripts also allow, in a beautiful way, running multiple subjects sitting in different directories (using * and other wildcards), and possibly even multiple parameters and multiple outputs, from a single command. I love this option. For example, you can write

dipy_median_otsu "Data/subj_1*0/t1.nii.gz" ...

and this would process all subjects starting with 1 and ending with 0.
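The quoted-wildcard pattern described here can be expanded inside Python with the standard glob module, which behaves the same on every platform. A minimal sketch (the subject directories are fabricated in a temporary directory just for the demo):

```python
# Minimal sketch: expand a quoted wildcard pattern the same way on any OS.
# The shell never sees the pattern; glob does the expansion.
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # fake a few subject directories, each with a t1 image
    for name in ('subj_100', 'subj_120', 'subj_199'):
        os.makedirs(os.path.join(base, name))
        open(os.path.join(base, name, 't1.nii.gz'), 'w').close()

    # "subj_1*0" matches subjects starting with 1 and ending with 0
    matches = sorted(glob.glob(os.path.join(base, 'subj_1*0', 't1.nii.gz')))
    found = [os.path.basename(os.path.dirname(m)) for m in matches]
    print(found)
    # → ['subj_100', 'subj_120']
```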

If I remember correctly from what @chrisfilo said when we met in OHBM this is still experimental or under development in nipype.

@chrisgorgo
Contributor

This is a good starting point:
http://nipy.org/nipype/devel/interface_specs.html

Python example:
http://nipy.org/nipype/devel/python_interface_devel.html

Real world example from nipy:
https://github.com/nipy/nipype/blob/master/nipype/interfaces/nipy/utils.py#L28

On the subject of command line inputs: your current syntax is equivalent to how bash deals with wildcards, but it will only work if users specify the parameters in parentheses. I would recommend leaving wildcard expansion to bash (it's more intuitive for any command line user) and writing your function so that it accepts lists of files instead of strings with wildcards.

In a nipype input specification, lists of files are described this way:

input_files = InputMultiPath(File(exists=True), desc='list of files')

This will accept "/path/file" and ["/path/file"] as well as ["/path/file1", "/path/file2"].

Let me know if you have any specific questions!

@Garyfallidis
Contributor

Hi @chrisfilo, thanks for the links! Cool, I will have something ready for you to see asap. Will do my best.

Aha, here you say:

Your current syntax is equivalent to how bash is dealing with wildcards, but will only work if users will specify parameters in parentheses.

I suppose you mean quotes, not parentheses. In my investigation I realized that if I leave the shell to parse for multiple files, the commands will not be multi-platform. Yes, the user will have to write those quotes, but doing so makes the dipy/bin commands run exactly the same across many platforms, with no assumption about bash or the operating system. That means they can also be run from Windows, which is an advantage for a library. I think it will be easy for users to learn to use quotes. And if they don't, they can always use the nipype version or just run the commands with single files.

@chrisgorgo
Contributor

A modified version of this solution: http://stackoverflow.com/a/12501893 would be a more robust way of dealing with the Windows problem. This way you can avoid quotes altogether and use nargs in argparse. We can include this in nipype_cmd if it will make your life easier.

Best,
Chris
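A sketch of the approach suggested here: collect `nargs='+'` arguments with argparse and let a custom action expand any wildcards itself, so Windows shells that don't glob behave like bash. The action name and CLI shape below are hypothetical, not the Stack Overflow answer's exact code:

```python
# Sketch: expand wildcards in argparse arguments ourselves, so the command
# works the same whether or not the shell globs. Names are illustrative.
import argparse
import glob

class GlobbingAction(argparse.Action):
    """Expand wildcard patterns in place; pass literal paths through."""
    def __call__(self, parser, namespace, values, option_string=None):
        expanded = []
        for value in values:
            matches = sorted(glob.glob(value))
            expanded.extend(matches if matches else [value])
        setattr(namespace, self.dest, expanded)

parser = argparse.ArgumentParser()
parser.add_argument('input_files', nargs='+', action=GlobbingAction)

# literal arguments with no matching files on disk pass through unchanged
args = parser.parse_args(['a.nii.gz', 'b.nii.gz'])
print(args.input_files)
```

Since unmatched patterns fall through as literals, the same script accepts both pre-expanded file lists (from bash) and raw patterns (from cmd.exe).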

@Garyfallidis
Contributor

Cool! I need to get hands on with your suggestions. Will get back to you asap! 👍

@Garyfallidis
Contributor

Hi, just a quick update here so that the others know how this is going. A few days ago I proposed a new design to @matthieudumont which will allow automatic generation of command line interfaces and nipype interfaces from the workflow functions, using introspection. Mr. Dumont is currently working on extending this idea, first for the command line interfaces; once we have a first version of that, I will work on the objects that will generate the nipype interfaces automatically too.
If you want a sneak peek right here and right now, here is an example of this idea implemented for the median_otsu interface:
https://github.com/Garyfallidis/dipy/blob/introducing_workflows/bin/dipy_median_otsu
https://github.com/Garyfallidis/dipy/blob/introducing_workflows/dipy/workflows/base.py
https://github.com/Garyfallidis/dipy/blob/introducing_workflows/dipy/workflows/segment.py
Or you can wait until this PR is updated.
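To make the introspection idea concrete, here is a minimal sketch (not dipy's actual base.py) of building an argparse parser directly from a workflow function's signature. The dummy workflow and the parser-builder below are illustrative stand-ins:

```python
# Rough sketch: derive a CLI from a workflow function's signature via
# introspection, so each workflow gets a command line interface for free.
import argparse
import inspect

def median_otsu_flow(input_file, out_dir='', save_masked=False,
                     median_radius=4):
    """Dummy workflow standing in for the real median_otsu_flow."""
    return input_file, out_dir, save_masked, median_radius

def build_parser(flow):
    spec = inspect.getfullargspec(flow)
    # map each keyword argument to its default value
    defaults = dict(zip(reversed(spec.args), reversed(spec.defaults or ())))
    parser = argparse.ArgumentParser(prog=flow.__name__)
    for arg in spec.args:
        if arg in defaults:
            default = defaults[arg]
            if isinstance(default, bool):
                parser.add_argument('--' + arg, action='store_true')
            else:
                parser.add_argument('--' + arg, type=type(default),
                                    default=default)
        else:
            parser.add_argument(arg)  # required positional
    return parser

args = build_parser(median_otsu_flow).parse_args(
    ['t1.nii.gz', '--median_radius', '6'])
print(args.input_file, args.median_radius, args.save_masked)
# → t1.nii.gz 6 False
```

The bool check must come before the generic `type(default)` branch, since flags want `store_true` rather than a typed value.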

@Garyfallidis
Contributor

@matthieudumont there is some weird failure with Python 3. Can you rebase and see if you keep getting the same error?

import inspect


try:
Contributor


You might want to use the optional_package machinery here. For an example, see: https://github.com/nipy/dipy/blob/master/dipy/viz/fvtk.py#L32
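The optional_package pattern referenced here imports a dependency if it is available and otherwise returns a flag so callers can degrade gracefully. A simplified stand-in (written with importlib just for illustration, not dipy's actual helper):

```python
# Sketch of the optional-dependency pattern: import a package if present,
# otherwise signal its absence instead of crashing at import time.
import importlib

def optional_package(name):
    """Return (module_or_None, have_package) for an optional dependency."""
    try:
        return importlib.import_module(name), True
    except ImportError:
        return None, False

json_mod, have_json = optional_package('json')         # stdlib: present
fake_mod, have_fake = optional_package('no_such_pkg')  # hypothetical: absent
print(have_json, have_fake)
# → True False
```

dipy's real helper additionally returns a trip-wire object that raises a helpful error only when the missing module is actually used.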

@arokem
Contributor

arokem commented Sep 18, 2015

Sorry - just a few things I noticed along the way. Looks like it's going in a good direction, with much less boilerplate code than before - that's good. Could you say a little bit more about how this also generates nipype workflows?

@Garyfallidis
Contributor

Hi @arokem. The nipype generator is not in yet. But the idea is that, now that we can get the exact input and output parameters from the introspective parser, we can generate Python files with the InputSpecs and OutputSpecs automatically written, without you writing a single piece of code. Clear? Please let me know if it is not.
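Speculatively, the same introspection data could be turned into spec classes with `type()`. The `Trait` stand-in below is illustrative only, not nipype's traits machinery:

```python
# Speculative sketch: build an InputSpec-like class from a workflow
# signature. Trait is a toy stand-in for a nipype trait.
import inspect

class Trait:
    """Minimal stand-in for a trait holding a default value."""
    def __init__(self, default):
        self.default = default

def make_input_spec(flow):
    spec = inspect.getfullargspec(flow)
    defaults = dict(zip(reversed(spec.args), reversed(spec.defaults or ())))
    attrs = {arg: Trait(defaults.get(arg)) for arg in spec.args}
    # create the class dynamically, named after the workflow
    return type(flow.__name__.title().replace('_', '') + 'InputSpec',
                (object,), attrs)

def median_otsu_flow(input_files, save_masked=False, median_radius=4):
    pass

InputSpec = make_input_spec(median_otsu_flow)
print(InputSpec.__name__, InputSpec.median_radius.default)
# → MedianOtsuFlowInputSpec 4
```

A real generator would emit these classes into a Python file with nipype's `File` and `traits` types instead of the toy `Trait`.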

@arokem
Contributor

arokem commented Sep 19, 2015

Yeah. More or less. So the plan is to have some script that goes through the workflows tree and generates the nipype code?


@Garyfallidis
Contributor

Yes @arokem. And then we can do the same with GUIs, web interfaces, etc. Let the flows generate the interfaces ;) It seems this will work pretty well. But let's not be over-enthusiastic and see if it works in practice. Still, things look very positive right now. It seems we can greatly speed up our development.

Hi @matthieudumont, this looks great. I would suggest adding a test function for the IntrospectiveArgParser, using a dummy workflow with a wide range of different parameters where different action strategies would be required.

Also, it would be nice to update the other bin/ commands to the workflow design, so some refactoring will be needed. The tensor_fit command should probably change too, because right now it is calling FSL's bet etc., which is not relevant. I know you have a better version.

Finally, we need to investigate the use of wildcards across different operating systems and how we will deal with that. There were some suggestions from Chris on this. GGT!

@arokem
Contributor

arokem commented Sep 20, 2015

OK - that's cool. It seems to me that you should be able to generate the command line interfaces as well, just from introspection of the dipy.workflows module. That is, bin/dipy_median_otsu should possibly also be auto-generated. Presumably they're all going to look the same?

@@ -8,7 +8,7 @@

 from scipy.ndimage.filters import median_filter
 try:
-    from skimage.filter import threshold_otsu as otsu
+    from skimage.filters import threshold_otsu as otsu
Contributor


Is this a change in the skimage API, or a BF in dipy?

Contributor Author


no, idk why I modified and committed that sorry!

Contributor


I made this change. There was a warning that skimage has deprecated skimage.filter and that we should now use skimage.filters.

Contributor


Cool. That answers my question.


@arokem
Contributor

arokem commented Sep 20, 2015

And yes: +1 on thorough testing of the IAP!

@matthieudumont
Contributor Author

(huge changelog because I rebased)

@Garyfallidis
Contributor

@arokem if you agree, I think I can merge this.

Next PR should be a cleanup of what is in dipy/bin to actually use the new design. And after that, or in parallel, we can start putting in the workflows that @matthieudumont and I have already written.

@arokem
Contributor

arokem commented Jan 12, 2016

It seems that utils and segment are mostly untested. I think we should have tests in place for these before we merge.


@Garyfallidis
Contributor

Okay, @matthieudumont added tests as requested, so I will go ahead and merge this.

Something very important to realize about the workflows, though, is that we should have some buildbots which will run the tests of the workflows' command lines with large and realistic datasets and create some reports. This goes in a proper quality assurance direction: we have Travis (and nipy.builders) checking that the internal functions (the API) work correctly, and then buildbots checking that a specific command line (workflow) or a series of command lines (workflows) generates the required results. This will be quite a task, because we will need to generate some kind of QA reports or automated end-to-end tests. I think with the new dipy.viz module we will be able to have some of these things nicely done.

But we do need to meet and discuss where we will set up the buildbots. I think that @matthieudumont, @jchoude, @matthew-brett, @arokem and I should be in that meeting and share ideas on how to get this done. To begin with, can we have a Linux machine somewhere which will be able to run the commands once a day and generate some reports? Would the people in Berkeley or Sherbrooke be willing to set up a machine just for that? Or should we look into a cloud service?

Garyfallidis added a commit that referenced this pull request Jan 13, 2016
@Garyfallidis Garyfallidis merged commit a5dc8d3 into dipy:master Jan 13, 2016
@arokem
Contributor

arokem commented Jan 13, 2016

Nice! Thanks for adding the tests -- will help to maintain this code in the long run.

Re: end-to-end testing, I think that a potential solution would be cloud machines running a docker container that we would maintain. For an example of what that might look like, see the docker that I have for building the docs:

https://github.com/arokem/dipy-docbuild

This gets automatically built into this dockerhub image:

https://hub.docker.com/r/arokem/dipy-docbuild/

which can then be pulled down onto a cloud machine (e.g. on AWS) and run. That's what I did for the 0.10 release. I am not sure how to set up an automated build for this (though I am confident it could be done), but we probably don't want to run this on every PR, do we? We could also store some data on S3 (or equivalent) and get the data from there, instead of fetching from an http URL. Might help reduce costs.

At least initially, I can put some funding that I have towards running this on AWS. In the short/medium time-scale, we might be able to get some more support for this from UW eScience, but we should also think up a way to maintain this in the longer run. A tip jar, or something :-)


@matthieudumont matthieudumont deleted the introducing_workflows branch June 22, 2016 18:41