Possibility to specify plugins within test profiles or non-top scopes #1964

Closed

skrakau opened this issue Mar 11, 2021 · 21 comments

@skrakau (Contributor) commented Mar 11, 2021

New feature

Hi again, I need to come back to the question already mentioned in #1963 about specifying plugins within test.conf config files. Currently, with the Nextflow 21.03.0-edge release, when adding the nf-amazon plugin definition within a test.conf file (or I guess in any non-top scope) I get:

N E X T F L O W  ~  version 21.03.0-edge
Launching `/home-link/qeakr01/development/mag/main.nf` [mad_waddington] - revision: f377b362c1
Plugins definition is only allowed in config top-most scope
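
(For context, a minimal sketch of the failing setup; the profile and file names are assumptions, following the usual nf-core layout where the test profile includes a separate config file:)

// nextflow.config
profiles {
    test { includeConfig 'conf/test.config' }
}

// conf/test.config: a plugins block here ends up inside the profile
// scope, which is what triggers the error above
plugins {
    id 'nf-amazon'
}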

Usage scenario

For some nf-core pipelines the full-size test data is stored on the Amazon S3 filesystem, and it would be very helpful if the corresponding test profile could still be run on non-AWS instances in the future, without the user needing to supply extra custom config files for this. I do this very often for testing purposes, and I think others from the nf-core team would also need this functionality.

@skrakau (Contributor, Author) commented Mar 12, 2021

Maybe it's worth an explanation: the reason why I wanted to load the nf-amazon plugin within test.conf and not within nextflow.config is that I assumed the latter might cause problems if the user additionally provides data from a different S3 filesystem (by specifying another plugin within a custom config file). But maybe that wouldn't be a problem.

A related question would be whether it's possible to specify nf-amazon for igenomes.config or nextflow.config, and another S3 within the custom config file for other input files.
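
(For what it's worth, a minimal sketch of how a second S3-compatible service could be pointed at from a custom config file; the aws.client.endpoint option is used here under the assumption that all S3 paths in the run share that endpoint, and the URL is hypothetical:)

// custom.config: route S3 access to another S3-compatible endpoint
aws {
    client {
        endpoint = 'https://s3.example.org'   // hypothetical endpoint URL
    }
}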

@ewels (Member) commented Mar 26, 2021

To add to this - I was just caught out by the same thing when adding the following to a minimal example main.nf instead of nextflow.config:

plugins {
  id 'nf-amazon'
}

It triggered the following error:

N E X T F L O W  ~  version 21.03.0-edge
Launching `./main.nf` [jovial_kalam] - revision: b9692aedd4
No signature of method: Script_35bd8815.plugins() is applicable for argument types: (Script_35bd8815$_runScript_closure1) values: [Script_35bd8815$_runScript_closure1@4fcc0416]
Possible solutions: print(java.lang.Object), print(java.lang.Object), print(java.lang.Object), print(java.io.PrintWriter)
 -- Check script 'main.nf' at line: 1 or see '.nextflow.log' file for more details
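
(For contrast, a minimal sketch of the placement that does work: the same block at the top-most scope of nextflow.config, rather than in main.nf or a profile:)

// nextflow.config: plugin declarations are only valid at this top level
plugins {
    id 'nf-amazon'
}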

@pditommaso (Member) commented

The plugins cannot go in the pipeline script. One workaround would be to add it on the CLI when specifying the genome profile, e.g.

nextflow run <pipeline> -plugins nf-amazon -profile igenomes

I know, a bit boring.

@ewels (Member) commented Mar 26, 2021

Or even better if it doesn't have to be defined at all 🙄 😉

I'm pretty worried that these are going to cause chaos for all @nf-core pipelines if I'm honest, with our heavy usage of reference genomes on AWS...

@pditommaso (Member) commented

Yeah, but then the problem is that, when offline, the plugin cannot be downloaded :/

@ewels (Member) commented Mar 26, 2021

When offline, the S3 paths can't be downloaded either...

@pditommaso (Member) commented Mar 26, 2021

I think we can agree on this 😄

@ewels (Member) commented Mar 26, 2021

So is the idea that we define the plugin name in all nf-core pipelines in order to be able to use the AWS-iGenomes references in them? Does that break stuff for anyone wanting to use a different object storage system for their data? (Apologies if this is the wrong place to have this discussion..)

@pditommaso (Member) commented

Putting the plugin name in all nf-core pipelines would break the execution when running in an offline environment, because NF would try to download the plugin.

What I'm not understanding: are the AWS-iGenomes references used by all pipelines?

@ewels (Member) commented Mar 28, 2021

Most @nf-core pipelines have the igenomes config (it comes with the pipeline), yeah. Users can configure the base path to use a local directory if all of iGenomes is downloaded, and of course use their own references. But a lot of people just use the AWS-iGenomes directly (based on the download stats anyway).

@pditommaso (Member) commented

Need to check if it's possible to have the plugins nested within a profile

@ewels (Member) commented Mar 29, 2021

What I don't really understand is why it needs to be in the pipeline code at all. Ignoring the AWS-iGenomes thing for a minute, surely most of the time this will be something that a user needs to manage rather than a pipeline developer?

If it's possible to have all of these plugins installed at once and Nextflow knows how to deal with the s3 paths, can it not just be part of the -self-update command, so that Nextflow fetches and updates all core plugins? That way you can still do regular small updates of the plugins, but we don't have to worry about them at pipeline level...

@pditommaso (Member) commented

> What I don't really understand is why it needs to be in the pipeline code at all. Ignoring the AWS-iGenomes thing for a minute, surely most of the time this will be something that a user needs to manage rather than a pipeline developer?

Because the plan is to have application plugins, for example to handle a SQL db or to access a dataset that requires some special library. This is why the requirement is that the pipeline should declare them in the pipeline config.

> If it's possible to have all of these plugins installed at once and Nextflow

Actually, there's a nextflow plugins install command for that (tho hidden, because I still need to refine it), which copies the plugin files into the $HOME/.nextflow/plugins path. However, it's still necessary to declare the plugin either in a config file (pipeline or $HOME/.nextflow/config), via the cli option -plugins, or via the env variable NXF_PLUGINS_DEFAULT.
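
(In practice this means a user can opt in once without touching the pipeline; a minimal sketch, assuming the user-level config file mentioned above:)

// $HOME/.nextflow/config: requests nf-amazon for every run by this user,
// independently of the pipeline's own nextflow.config
plugins {
    id 'nf-amazon'
}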

@ewels (Member) commented Mar 30, 2021

Right, I'm not against plugins per se - quite the opposite. I think that it can and will be a super powerful feature. Your examples of pipelines which are fixed to specific data sources are super nice.

My objection is for things where the pipeline developer can't know about data sources, primarily file access. Almost all Nextflow pipelines take files as inputs and the beauty of Nextflow being so portable is that they can come from anywhere - local, https, ftp, buckets etc etc. But by mandating that the pipeline developer needs to add nf-amazon, nf-google and nf-azure to access files on those systems, you lose that portability.

ok, so questions:

  1. Is the only way forward with cross-cloud data access for pipeline developers to declare all of these plugins in every pipeline, just in case a user wants to access files on those systems?

  2. Does declaring all at once in one pipeline work? e.g.:

    plugins {
        id 'nf-amazon'
        id 'nf-azure'
        id 'nf-google'
    }
    ch_from_amazon = Channel.fromPath( 's3://aws-bucket/data/sequences.fa' )
    ch_from_azure  = Channel.fromPath( 'az://azure-bucket/data/sequences.fa' )
    ch_from_google = Channel.fromPath( 'gs://google-bucket/data/sequences.fa' )

  3. If we need to do that for all @nf-core pipelines, do you not agree that it makes more sense to have this functionality as a core Nextflow feature? Or at least not need to define it at pipeline level?

@pditommaso (Member) commented

> But by mandating that the pipeline developer needs to add nf-amazon, nf-google and nf-azure to access files on those systems, you lose that portability

Kind of disagree, because the pipeline is portable irrespective of the platform; but yes, the user pulling the data from a cloud should add the required plugin.

> 1. Is the only way forward with cross-cloud data access for pipeline

Well, either command-line option, config file, or env variable.

> 2. Does declaring all at once in one pipeline work?

Yup.

> 3. If we need to do that for all @nf-core pipelines, do you not agree that it makes more sense to have this functionality as a core Nextflow feature

Not really, because we already have Amazon, Azure, and Google Cloud. Today we added DNAnexus, and surely more will come. Putting all this stuff in as a core dependency would result in a huge bloated runtime, which is especially bad when pulling this stuff in the cloud.

However, I understand your concern that the user should not have to care about configuring the needed plugin when launching an nf-core pipeline. This is why I've also added a check that automatically adds the required plugins when a cloud executor is specified, e.g. when the executor is awsbatch the nf-amazon plugin is automatically added.

I think the only problem that remains to address is when a pipeline launched locally needs to access cloud-stored files.

@ewels (Member) commented Mar 31, 2021

> This is why I've also added a check that automatically adds the required plugins when a cloud executor is specified, e.g. when the executor is awsbatch the nf-amazon plugin is automatically added.

Ok great, this puts my mind at ease quite a lot.. 😄

> Well, either command-line option, config file, or env variable.

Could you clarify this a bit? My (limited) understanding was that they had to be declared in the pipeline nextflow.config file and that was the only way to do it. But you're saying that if the user installs the plugin via the command line (nextflow plugins install), then accessing e.g. AWS S3 paths will work without any mention of the plugin in the pipeline code?

@pditommaso (Member) commented

> Ok great, this puts my mind at ease quite a lot

Actually, it also checks the pipeline work dir:

// infer from app config
final plugins = new ArrayList<PluginSpec>()
final workDir = config.workDir as String
final executor = Bolts.navigate(config, 'process.executor')
if( executor == 'awsbatch' || workDir?.startsWith('s3://') )
    plugins << defaultPlugins.getPlugin('nf-amazon')
if( executor == 'google-lifesciences' || workDir?.startsWith('gs://') )
    plugins << defaultPlugins.getPlugin('nf-google')
if( executor == 'azurebatch' || workDir?.startsWith('az://') )
    plugins << defaultPlugins.getPlugin('nf-azure')
if( executor == 'ignite' || System.getProperty('nxf.node.daemon')=='true' ) {
    plugins << defaultPlugins.getPlugin('nf-ignite')
    plugins << defaultPlugins.getPlugin('nf-amazon')
}

> Could you clarify this a bit?

Installation != plugin requirement/activation. Installation means only downloading, unzipping, and copying the plugin content into the $HOME/.nextflow/plugins folder.

The installation is done via nextflow plugins install (undocumented) or automatically on NF startup when one or more plugins have been specified via:

  1. the -plugins cli option
  2. a nextflow config file
  3. the NXF_PLUGINS_DEFAULT env variable
  4. implicit plugins inferred from the pipeline config (i.e. the snippet above)

Worth mentioning, they are listed in order of priority, i.e. if 1 is provided then 2 is ignored, and so on.
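
(A minimal sketch of what that priority means in practice; both mechanisms are set below, and only the higher-priority one takes effect:)

# the -plugins option (1) wins over NXF_PLUGINS_DEFAULT (3):
# only nf-amazon is requested, the env variable is ignored
NXF_PLUGINS_DEFAULT=nf-google nextflow run main.nf -plugins nf-amazon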

> AWS S3 paths will work without any mention of the plugin in the pipeline code?

S3 paths work only if the nf-amazon plugin is requested via one of the above mechanisms.

Possible solutions:

  1. Add all plugins in the config. Cons: this will break the pipeline when running offline, because NF will try to download the plugins.
  2. Make a failed plugin download report a warning instead of an error. Cons: it could be difficult to debug and could result in faulty behavior.
  3. Allow the definition of plugins in nested profiles, so that if you add igenomes, the plugin is also added. Cons: it can be tricky to resolve the plugin version if different profiles require different versions.
  4. Lazily configure the plugin requirement, i.e. if at runtime NF detects the use of an S3 file, install and activate the plugin (see the sketch after this list). Technically it should be possible, but it can be tricky to implement in practice.
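
(A rough Groovy sketch of the idea behind point 4; the names and structure here are illustrative, not Nextflow's actual internals:)

// returns the plugin id that handles the given path's URI scheme,
// or null when no plugin is needed (e.g. a plain local path)
String pluginFor(String path) {
    def schemes = [ s3: 'nf-amazon', gs: 'nf-google', az: 'nf-azure' ]
    if( !path.contains('://') ) return null   // local path, nothing to do
    return schemes[ path.tokenize(':')[0] ]   // unknown scheme => null
}

assert pluginFor('s3://bucket/data/sequences.fa') == 'nf-amazon'
assert pluginFor('/local/data/sequences.fa') == null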

pditommaso added a commit that referenced this issue Apr 6, 2021
@pditommaso (Member) commented

I've managed to implement the solution at point 4. Therefore cloud plugins are inferred and started automatically in all cases.

You may want to give it a try using

NXF_VER=21.04.0-SNAPSHOT nextflow run .. etc

Make sure you are using this version:

» NXF_VER=21.04.0-SNAPSHOT nextflow info
  Version: 21.04.0-SNAPSHOT build 5537
  Created: 06-04-2021 15:30 UTC (17:30 CEST)

@ewels (Member) commented Apr 6, 2021

Amazing! And to uninstall plugins I can just do rm -rf $HOME/.nextflow/plugins?

@pditommaso (Member) commented Apr 6, 2021

yep

pditommaso added this to the v21.04.0 milestone Apr 8, 2021
@pditommaso (Member) commented

This is available as of version 21.04.0-edge. Closing this issue; feel free to comment/reopen if needed.
