Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

revised LsfExecutor.groovy in order to parse $LSF_ENVDIR/lsf.conf #1035

Closed
wants to merge 11 commits into from
Closed

revised LsfExecutor.groovy in order to parse $LSF_ENVDIR/lsf.conf #1035

wants to merge 11 commits into from

Conversation

evanbiederstedt
Copy link

This needs a bit of work....

As discussed on gitter 15 Feb 2019, Nextflow currently assumes that users have the environment variable LSF_UNIT_FOR_LIMITS=MB set in the $LSF_ENVDIR/lsf.conf. In principle, LSF users should have this exposed on their head and worker nodes.

At the moment, this more of a proposal than a PR.

  • added an init() method to override than in AbstractGridExecutor.groovy
  • added basic function to parse $LSF_ENVDIR/lsf.conf
  • added conditionals to use this value and automatically set the memory unit for LSF submissions

The major problem with this solution is that it doesn't address the problem of a discrepancy between the units the user sets in Nextflow, and the variable LSF_UNIT_FOR_LIMITS. If the user sets process.memory="5.MB" and the variable in $LSF_ENVDIR/lsf.conf is set as LSF_UNIT_FOR_LIMITS=GB, then the above will override what the user sets.

@evanbiederstedt evanbiederstedt changed the title revised LsfExecutor.groovy in oorder to parse $LSF_ENVDIR/lsf.conf revised LsfExecutor.groovy in order to parse $LSF_ENVDIR/lsf.conf Feb 18, 2019
Copy link
Member

@pditommaso pditommaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Punctual for supper time, congrats! :D

I've added a few comments, also please do not forget to add relevant test units, see for example and to sign the DCO

@pditommaso
Copy link
Member

I think NF uses the same units as LSF, this snippet can be useful

int p = UNITS.indexOf(unit)
if ( p == -1 ) {
// try adding a 'B' specified
p = UNITS.indexOf(unit+'B')
if( p == -1 ) {
throw new IllegalArgumentException("Not a valid file size unit: ${str}")
}
}
size = Math.round( Double.parseDouble(value) * Math.pow(1024, p) )

@evanbiederstedt
Copy link
Author

Thank you for the comments; I should be able to fix the above soon.

I think NF uses the same units as LSF, this snippet can be useful

nextflow/modules/nf-commons/src/main/nextflow/util/MemoryUnit.groovy

Lines 84 to 93 in 73977a9

int p = UNITS.indexOf(unit)
if ( p == -1 ) {
// try adding a 'B' specified
p = UNITS.indexOf(unit+'B')
if( p == -1 ) {
throw new IllegalArgumentException("Not a valid file size unit: ${str}")
}
}

size = Math.round( Double.parseDouble(value) * Math.pow(1024, p) )

I'm a bit confused what your plans are for the above---that would mean the user-provided units would be the "correct" units?

@pditommaso
Copy link
Member

It could be possible to add a toUnit(String) method to MemoryUnit class doing basically the apposite of the above e.g.

long toUnit(String unit) {
   int p = UNITS.indexOf(unit) 
  return Math.round( size/ Math.pow(1024, p) )
} 

So that

MemoryUnit.of('10GB').toUnit('GB') == 10

@evanbiederstedt
Copy link
Author

I've yet to add unit tests---this can be done once there's agreement this solution seems reasonable.

I've now installed DCO---future commits should have the sign off

Signed-off-by: evanbiederstedt <evan.biederstedt@gmail.com>
…d toUnit() function to parse this in MemoryUnit.groovy.

Signed-off-by: Evan Biederstedt: evan.biederstedt@gmail.com
Signed-off-by: evanbiederstedt <evan.biederstedt@gmail.com>
… form *B bytes

Signed-off-by: evanbiederstedt <evan.biederstedt@gmail.com>
…t, not KB

Signed-off-by: evanbiederstedt <evan.biederstedt@gmail.com>
Signed-off-by: evanbiederstedt <evan.biederstedt@gmail.com>
Signed-off-by: evanbiederstedt <evan.biederstedt@gmail.com>
Signed-off-by: evanbiederstedt <evan.biederstedt@gmail.com>
Signed-off-by: evanbiederstedt <evan.biederstedt@gmail.com>
Signed-off-by: evanbiederstedt <evan.biederstedt@gmail.com>
Signed-off-by: evanbiederstedt <evan.biederstedt@gmail.com>
Copy link
Author

@evanbiederstedt evanbiederstedt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @pditommaso

So, I've started to add unit tests both in nextflow/modules/nf-commons/src/test/nextflow/util/MemoryUnitTest.groovy and nextflow/modules/nf-commons/src/test/nextflow/util/LsfExecutorTest.groovy

My lack of familiarity with Groovy + Spock is causing this to be a bit (unnecessarily) clumsy:

The CI tests fail on Travis at https://github.com/nextflow-io/nextflow/blob/master/modules/nextflow/src/test/groovy/nextflow/executor/LsfExecutorTest.groovy#L71-L84

However, it's not clear to me:

(A) why it's failing there, as the default behavior in MB should still exist

(B) how Groovy developers use Spock to quickly see what the unit tests "expect" to quickly correct the source code and/or tests. If I simply try to compile the NF sourcecode via Sublime, I run into compilation errors which could be related to which version of Java I'm using, e.g. error: package javax.annotation does not exist

Do you have advice how to best do this?

@@ -63,7 +64,7 @@ class LsfExecutor extends AbstractGridExecutor {
super.init()
queueInterval = session.getQueueStatInterval(name)
log.debug "Creating executor '$name' > queue-stat-interval: ${queueInterval}"
def memoryUnitLSF = getLSFmemoryUnits() ? : 'MB'
def memoryUnitLSF = LsfExecutor.getLSFmemoryUnits() ? : 'MB'
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible this isn't what you meant by the init() function, so I'm unsure about this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LsfExecutor.getLSFmemoryUnits() is not correct because that implies a static method definition and getLSFmemoryUnits should not br static.

The operator is ?: not ? : as + = is not the same same += .. and I guess that's why it's failing.

Last, I was wrong to suggest you to implement this logic in the init method, it should be register instead (always a template method in the superclass).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the feedback. These comments make a good deal of sense.

Last, I was wrong to suggest you to implement this logic in the init method, it should be register instead (always a template method in the superclass).

This I don't understand. What do you mean by "it should be register instead"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Executor class define either init and register method. The latter is meant to be overridden.

@pditommaso
Copy link
Member

Well, if you are pretending to implement this without even compiling and testing it locally I think you are very low chances to get it working. To compile and test the code is enough running make test in the project root. Use the following command to run only the lsf executor test instead

make test class=nextflow.executor.LsfExecutorTest module=nextflow

If you want to use a integrated UI use IntelliJ as explained the project README file.

how Groovy developers use Spock to quickly see what the unit tests "expect" to quickly correct the source code and/or tests.

The test reports the expected output and the actual output. You can find in the report generated by make test. It even shows a diff between the twos (provided the code can be compiled and does not fail for other reasons).

Signed-off-by: evanbiederstedt <evan.biederstedt@gmail.com>
@evanbiederstedt
Copy link
Author

evanbiederstedt commented Feb 27, 2019

If you want to use a integrated UI use IntelliJ as explained the project README file.

This is exactly what I was looking for, thank you! Excellent, this works well---it's far easier than with a standard text editor

Apologies, I had missed this in the README. I appreciate the help

@ckandoth
Copy link

In theory, Nextflow can simply specify -M 8MB instead of -M 8 when it runs bsub here. This will override whatever default memory unit is used in a local config. This means Nextflow will not need to parse $LSF_ENVDIR/lsf.conf, and does not need to do any unit conversion. Let me know what you think.

@pditommaso
Copy link
Member

Interesting could. Are you aware if since this version is supported this syntax?

@ckandoth
Copy link

ckandoth commented Feb 28, 2019

Yes, this syntax is detailed in the documentation of LSF v9 cited in this PR. And also in the documentation for LSF v10 that we use on our cluster. I also just successfully tested that both bsub -M 8000MB sleep 1 and bsub -M 8 sleep 1 will request 8GB from our local cluster, which is configured to default to GB if a unit is not specified. Let me know if you need more info.

Update: I just realized the documentation above is for cluster admins, not for job submitters. And though bsub -M 8MB will work on LSF v10, it will break for LSF v9 users - which many people still use.

Work on our LSF v10 cluster:

$ bsub -M 8000MB sleep 1
Job <5766188> is submitted to default queue <general>.

$ bsub -M 8 sleep 1
Job <5766190> is submitted to default queue <general>.

Fails on our older LSF v9 cluster:

$ bsub -M 8000MB sleep 1
8000MB: MEMLIMT value should be a positive integer. Job not submitted.

$ bsub -M 8 sleep 1
Job <33950661> is submitted to default queue <sol>.

@pditommaso
Copy link
Member

Fails on our older LSF v9 cluster

Therefore is not a valid solution because there are NF users using even older LSF versions

@ckandoth
Copy link

ckandoth commented Mar 1, 2019

@pditommaso unfortunately yes. On our LSF v9 cluster, I could not find any other way to override the LSF_UNIT_FOR_LIMITS as a non-root user. Doing unit conversion on the fly is likely the only backward compatible solution.

@pditommaso
Copy link
Member

Can any of you attach here an example lsf.conf file?

@ckandoth
Copy link

ckandoth commented Mar 5, 2019

@pditommaso here are $LSF_ENVDIR/lsf.conf files from our v9/v10 LSF clusters:
lsf_v9.conf.txt
lsf_v10.conf.txt

I had to remove personal information, but what remains should be sufficient for NextFlow.

@pditommaso
Copy link
Member

Tx

@evanbiederstedt
Copy link
Author

I think quite a few files have changed since this PR was made. It may be best to start again to isolate why the tests are failing.

@pditommaso
Copy link
Member

I've already an implementation. I'll include in the next month edge release. thanks.

@evanbiederstedt
Copy link
Author

Thanks @pditommaso

Sincere apologies---other projects came up

@pditommaso
Copy link
Member

No pb, all of us have priorities.

pditommaso added a commit that referenced this pull request Apr 22, 2019
The unit used by LSF batch scheduler for memory limits depends
on the setting of the attribute LSF_UNIT_FOR_LIMITS defined in
the lsf.config file.

This commit adds to the LSF executor the ability to fetch such
setting and properly apply when a job is submitted for execution.

Moreover if fixes the default mem unit to KB (instead of MB)
when the above setting is not defined.

Solves #1124. See also PR #1035.
sivkovic pushed a commit to sivkovic/nextflow that referenced this pull request Jun 6, 2019
The unit used by LSF batch scheduler for memory limits depends
on the setting of the attribute LSF_UNIT_FOR_LIMITS defined in
the lsf.config file.

This commit adds to the LSF executor the ability to fetch such
setting and properly apply when a job is submitted for execution.

Moreover if fixes the default mem unit to KB (instead of MB)
when the above setting is not defined.

Solves nextflow-io#1124. See also PR nextflow-io#1035.
sivkovic pushed a commit to sivkovic/nextflow that referenced this pull request Jun 6, 2019
The unit used by LSF batch scheduler for memory limits depends
on the setting of the attribute LSF_UNIT_FOR_LIMITS defined in
the lsf.config file.

This commit adds to the LSF executor the ability to fetch such
setting and properly apply when a job is submitted for execution.

Moreover if fixes the default mem unit to KB (instead of MB)
when the above setting is not defined.

Solves nextflow-io#1124. See also PR nextflow-io#1035.

Signed-off-by: Ivkovic <sinisa.ivkovic@gmail.com>
pditommaso added a commit that referenced this pull request Jul 27, 2019
The unit used by LSF batch scheduler for memory limits depends
on the setting of the attribute LSF_UNIT_FOR_LIMITS defined in
the lsf.config file.

This commit adds to the LSF executor the ability to fetch such
setting and properly apply when a job is submitted for execution.

Moreover if fixes the default mem unit to KB (instead of MB)
when the above setting is not defined.

Solves #1124. See also PR #1035.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants