Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Revising Contributor Guidelines on adding new features to fMRIPrep #1963

Closed
oesteban opened this issue Feb 4, 2020 · 11 comments
Closed

Comments

@oesteban
Copy link
Member

oesteban commented Feb 4, 2020

With the increasingly extended utilization of fMRIPrep, more and more users find the default options of the tool too limiting.

The addition of new features to fMRIPrep generates immediate tension with the principles of simplicity and standardization. In other words, making fMRIPrep too configurable will decrease the homogeneity of the expected outcomes, inviting for configurations coupled with the intended analysis.

The same way the MRI scanner does not offer an immense space of knobs to turn in the acquisition, fMRIPrep should not add many additional knobs to those for it to be considered a viable augmentation or extension of the scanner hw/sw.

Therefore, all these ideas should be stated clearly in the contributors' guidelines along with some language prescribing how is the decision making process to determine whether a new feature should be implemented or not. Such a strategy should also help keeping feature creep under control, especially if more people are involved in the decision-making.

We've run a long way without these agreements, and so I will not rush this issue. Just wanted to bring it up and pin it for a little while.

Please voice your opinions. Extremely interested in hearing from @poldrack @franklin-feingold @poldracklab/fmriprep-docusprint-leaders @jdkent .

@oesteban oesteban pinned this issue Feb 4, 2020
@adelavega
Copy link
Contributor

adelavega commented Feb 4, 2020

Not much to say, except that I'm in agreement. It seems that feature creep and maintainability burden are issues to watch out for.

Do you suggest having specific guidelines for when to adopt a feature? It also seems that some new features add more confusion than others. Especially when the CLI is affected, and yet another option is added, that makes the tool more complex to use. I also wonder if there are stylistic preferences here. E.g., prefer several boolean arguments, vs a single multi option string arguments.

@oesteban
Copy link
Member Author

oesteban commented Feb 5, 2020

Do you suggest having specific guidelines for when to adopt a feature?

That would be the central idea being proposed.

It also seems that some new features add more confusion than others. Especially when the CLI is affected, and yet another option is added, that makes the tool more complex to use.

+1000

I also wonder if there are stylistic preferences here. E.g., prefer several boolean arguments, vs a single multi option string arguments.

Agreed. This and also a related issue @effigies just touched upon not long ago - nipreps/smriprep#159 (comment)

@jdkent
Copy link
Collaborator

jdkent commented Feb 5, 2020

Here are my suggestions for some questions to ask about implementing a new feature.

  • is there a standard procedure/value for the proposed feature in the literature?
    • if so, could we just use that procedure/value?
  • if the feature is dependent on some attribute of the input data (e.g., TR, duration, etc.)
    • if so, can the procedure/value be determined algorithmically?
  • does the feature interact with other fmriprep settings?
  • What is the difficulty of implementing the procedure outside of fmriprep?
    • example: does fmriprep provide all the necessary outputs for a user to perform the non-standard analysis (or are the necessary outputs "hidden" in the working directory?)

One balance to consider is the estimated maintenance cost of the feature and the estimated number of people that would use the feature and would otherwise create their own ad-hoc solution for their lab (re-creating the wheel)

I do not know how to measure the estimated maintenance cost, but if there is debate surrounding a feature, there could be a poll on neurostars/github to ask if there are others that would have immediate use cases for the feature (with some threshold for accepting the feature?)

@emdupre
Copy link
Collaborator

emdupre commented Feb 5, 2020

I think something like this is what we tried to start sketching out with the development roadmap. The concern, as I remember it, was that we couldn't guarantee (or rule out) specific features when working with a small development team.

Whatever is decided, I think having a clear vision there for what is in- vs out- of scope for fMRIPrep would help to communicate the intention for the tool moving forward. And of course having specific breakdowns in the contributor guidelines also helps 😸

What is the difficulty of implementing the procedure outside of fmriprep?
example: does fmriprep provide all the necessary outputs for a user to perform the non-standard analysis (or are the necessary outputs "hidden" in the working directory?)

Personally, I see this as a key question. I'm not opposed to encouraging post-fMRIPrep tools to operate directly on the working directory -- especially if the use case is not common -- but the trouble we've run into doing that with tedana is that the internals seem to update reasonably regularly with changes to transforms, etc. If we expect a similar level of updates indefinitely, then maybe we could think of a way to more easily request specific files.

@adelavega
Copy link
Contributor

Good points all round.

I don't think I'm in favor of operating on the working directory, I think in some ways having to guarantee some stability there also adds a burden. I think if other tools use it, and most of the work to find the right inputs is done by the other tool, then that's fine.

@oesteban
Copy link
Member Author

oesteban commented Feb 5, 2020

I think something like this is what we tried to start sketching out with the development roadmap. The concern, as I remember it, was that we couldn't guarantee (or rule out) specific features when working with a small development team.

Yup

Personally, I see this as a key question

@satra has proposed a plugin model - these plugins should be capable of doing surgery on the execution graph while maintaining intact Nipype features to keep track of provenance.

If such a plugin model existed, then we could forbid tinkering with the work directory. At this moment we can only discourage to do so.

I'm pinning #1963 (comment) as a good starting point for the new documentation.

@franklin-feingold
Copy link
Contributor

Adding on to the great points brought up in this thread:
I agree with making more clear perhaps the core principles of fmriprep development and philosophy - a 30,000 ft view (this will partially address the decision making aspect). There are answers throughout the RTD but nothing that one can directly point to that captures this. This may help when questions come up and have a more concrete way of discussing new features and deciding how to proceed as a community. (the guidelines may follow @jdkent)

Another way of looking at it can be clearly documenting how fmriprep works into potential analysis pipelines and decisions made to help strive toward our core principle/vision. I think very much in line with @emdupre message of drawing clear "in vs out of scope". A few out of scope cases can build in a sense of the precedent

I like the plugin idea - it is another way of keeping core fmriprep clean while having researchers be able to build around fmriprep. This can be where the flexibility kicks in. I think with plugins it would be good to also provide a bit of additional method language like fmriprep does for reporting and reproducibility. If they do start to customize being able to accurately report this will be more important. (I like @adelavega point of potential decreased ease of use as features grow along with increased feature maintenance too)

That is one of my concerns is that if fmriprep gets too customizable the reproducibility (and slightly robustability to an unseen dataset) benefit starts to get washed out.

@satra
Copy link

satra commented Feb 11, 2020

As you think about this, I would ask you to consider why new features are being requested, and what can we do to: 1) still consolidate the processing and 2) figure out a way to validate the new feature request in the context of the workflow (not a specific algorithm). IMO, the onus of this validation should be on the person/group requesting the feature.

The plugin model I proposed was conceived to allow new instruments/sequences/data to take advantage of the generic fmriprep pipeline. Another way to look at this is to balance "we don't know how to process X" with "we know how to process Y and can expand it to include X".

The one bit that worries me is that fmriprep may become a swiss army knife. I think instead it should just be a paring knife (small, efficient, and works for many things).

So perhaps one goal is to ask how many things can I do without increasing the number of arguments to fmriprep. I feel fmriprep has done a good job of this over the years; let's not change this.

@oesteban
Copy link
Member Author

Hey @satra, I'm coming back to this issue. What did you mean by:

  1. still consolidate the processing

?

@satra
Copy link

satra commented May 31, 2020

still consolidate the processing

as we consider different usage scenarios, how much can we get out tuning a workflow vs adding new nodes. more generally though, fmriprep is complex and uses many if clauses. could we simplify that.

@oesteban
Copy link
Member Author

Moving the conversation to the NiPreps organization.

@oesteban oesteban unpinned this issue May 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants