Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Dynamic citation boilerplate #1024

Merged
merged 48 commits into from
Jul 30, 2018

Conversation

oesteban
Copy link
Member

@oesteban oesteban commented Mar 5, 2018

Compose a citation boilerplate in a per-workflow basis, with up-to-date software versions and generated based on the actual workflows being run.

This PR is more of a request for comments from @effigies, @chrisfilo.

Maybe this can be done already with duecredit (cc @yarikoptic), but the basic idea here is to place a mechanism to build literate descriptions of the workflow. My plan would be then collect all the references with duecredit.

How it works. Workflows in FMRIPREP will have a __predesc__ and a __postdesc__ attributes (in this version the pre- dunder is just __desc__ as I was just prototyping). When writing a new workflow, these fields can be used to add the corresponding literate description of what that workflow does (allowing to dynamically replace version strings, optional parameters, etc).

Let me know if you think this is worth exploring. The most challenging culprit happens with inhomogeneous datasets (e.g. fieldmap has IntendedFor only on 3 out of 5 runs). In that case the citation boilerplate will have two descriptions, as there's is not just one.

Any comments will be very welcome.

EDIT

Results

This is how it looks
boilerplate

Links to more results

In the context of the reports

Compose a citation boilerplate in a per-workflow basis, with up-to-date software versions and generated based on the actual workflows being run.
@chrisgorgo
Copy link
Contributor

I have a feeling this would be very challenging to do if the goal is to provide a snippet of text ready to paste into the manuscript. A solution based on conditional statements mirroring the one we did on the website might be easier to implement (even though harder to maintain).

@oesteban
Copy link
Member Author

oesteban commented Mar 5, 2018

IMHO implementation is fairly natural since each Workflow factory function can (this is not mandatory) set the __predesc__ and __postdesc__ attributes and they are automatically picked up the same way nipype generates the graph. Having them set right after the workflow instantiation feels pretty natural, just like adding another docstring.

For this PR I didn't get to the bottom and generate the exact citation boilerplate we have currently online just for laziness, but this prototype is able to do that already.

I think we could leverage duecredit to generate the lookup table of references (@yarikoptic can correct me if I'm wrong).

On the other hand, I see some positive side effects:

  • Trivial to append the boilerplate to the report (minimizing work for the user)
  • Trivial to maintain the citing.rst. Also, the citing.rst can be generated automatically, becoming an actual test for this feature.
  • Automatically include these new dunder fields when generating the documentation will make it very easy to highlight these excerpts with their corresponding citation in documentation proper.

I'll let this sit here while I focus on more urgent issues.

@oesteban
Copy link
Member Author

oesteban commented Jul 25, 2018

Hey @satra, what do you think about this extension to nipype?. Particularly the magic:

https://github.com/oesteban/fmriprep/blob/b349694455eac1fbbaa7c6ee1b6a133b71c980e4/fmriprep/engine/workflows.py#L14-L49

@satra
Copy link

satra commented Jul 26, 2018

@oesteban - i think this a reasonable starting point that could then be edited by a human. i agree with @chrisfilo that this can get complicated. but this is at least a simple mechanism by which certain workflows could be described. how about adding a flag to allow ignoring nested workflows.

@oesteban
Copy link
Member Author

how about adding a flag to allow ignoring nested workflows

Seems like a good idea.

@oesteban oesteban added this to To do in 1.3.0 via automation Jul 27, 2018
@oesteban oesteban moved this from To do to In progress in 1.3.0 Jul 27, 2018
@oesteban oesteban changed the title [ENH,WIP] Dynamic citation boilerplate [ENH] Dynamic citation boilerplate Jul 29, 2018
@oesteban
Copy link
Member Author

Ready for review!

Copy link
Member

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The display looks off for Markdown and LaTeX:

https://2366-53608443-gh.circle-artifacts.com/0/tmp/ds005/derivatives/fmriprep/sub-01.html#boilerplate

Also, the text indicates that CIFTI files were created for all of the test datasets, which I think we only do for one dataset. So we may not be handling some conditionals properly.

Once all the sub-workflows of a given workflow have
been visited, then the ``__postdesc__`` attribute is appended
and the execution pops out to higher level workflows.
The dunder attributes are written in Markup language, and may contain
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Markdown?

@@ -1,5 +1,7 @@
fmriprep
fmriprep/logs
fmriprep/logs/CITATION.html
fmriprep/logs/CITATION.md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add CITATION.tex to expected outputs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

workflow = Workflow(name=name)
workflow.__desc__ = """\
The BOLD time-series were resampled on {tpl} standard space,
generating a *preprocessed BOLD run on {tpl} space*.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/on/in/

workflow = pe.Workflow(name=name)
workflow = Workflow(name=name)
workflow.__desc__ = """\
The BOLD time-series were resampled on {tpl} standard space,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/on/to/

@oesteban
Copy link
Member Author

The display looks off for Markdown and LaTeX

Yep, I'm working this out right now.

@oesteban
Copy link
Member Author

Okay, with my last commit this should be ready to go.

  • Fixed visualization
  • Fixed @effigies' minor comments
  • Fixed @effigies' major comment about the grayordinates file - I have actually added the HCP pipelines reference and cited the paper in two contexts: the grayordinates files and the epi unwarp with phase difference fieldmap (since that workflow is inspired in HCP pipelines).

This is how it looks now (locally, awaiting for tests in circle):
new-boilerplate-0

and
new-boilerplate-1

@@ -92,6 +92,8 @@ def init_bold_surf_wf(mem_gb, output_spaces, medial_surface_nan, name='bold_surf
workflow.__desc__ = """\
The BOLD time-series, were resampled to surfaces on the following
spaces: {out_spaces}.
*Grayordinates* files [@hcppipelines], which combine surface-sampled
data and volume-sampled data, were also generated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to make this conditional on cifti_output? In which case, it might go better back in init_func_preproc_wf, which has access to that variable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I've just added it to init_func_preproc_wf 👍

Copy link
Member

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with this. You might want to update the GitHub Project, so that things done after 1.1.0 don't get accidentally labeled as such.

@oesteban
Copy link
Member Author

oesteban commented Jul 30, 2018

I'm merging this.

@chrisfilo
We need to figure out how to cite the right version here. Especially considering that the zenodo handle is created after the release is made.

I'm assuming this comment is not meant to block merging the PR. I've opened #1229 to follow up on this problem.

@oesteban oesteban merged commit 38d6e04 into nipreps:master Jul 30, 2018
1.3.0 automation moved this from In progress to Done Jul 30, 2018
@oesteban oesteban deleted the enh/dynamic-boilerplate branch July 30, 2018 19:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
1.3.0
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

4 participants