Issue with modellauncher #2262

Closed
serbinsh opened this issue Jan 28, 2019 · 7 comments

@serbinsh
Member

2019-01-21 23:32:33 INFO   [PEcAn.workflow::run.write.configs] :
   parameter values for runs in
   /data/sserbin/Modeling/sipnet/US-WCr/ILAMB_CRUNCEP.TempBDF.2//samples.RData
2019-01-21 23:32:34 INFO   [start.model.runs] :
   -------------------------------------------------------------------
2019-01-21 23:32:34 INFO   [start.model.runs] :
   Starting model runs SIPNET
2019-01-21 23:32:34 INFO   [start.model.runs] :
   -------------------------------------------------------------------
  |                                                                      |   0%Warning in file.remove(file.path(run_id_dir, "joblist.txt")) :
  cannot remove file '/data/sserbin/Modeling/sipnet/US-WCr/ILAMB_CRUNCEP.TempBDF.2//run/2000098933/joblist.txt', reason 'No such file or directory'
Error in writeLines(c(file.path(settings$host$rundir, run_id_string)),  :
  'con' is not a connection
Calls: <Anonymous> -> start.model.runs -> writeLines

It seems to be looking for a "joblist.txt" file to remove, and it then fails with a complaint about a connection ('con' is not a connection). I built the modellauncher executable and set up my XML like this:

    <qsub>qsub -l walltime=36:00:00,nodes=2:ppn=10 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash</qsub>
    <qsub.jobid>([[:digit:]]+)\.modex\.bnl\.gov</qsub.jobid>
    <qstat>qstat @JOBID@ || echo DONE</qstat>
    <modellauncher>
        <binary>/data/sserbin/Modeling/pecan/contrib/modellauncher/modellauncher</binary>
        <qsub.extra>-l ncpus=10</qsub.extra>
    </modellauncher>
  </host>
@serbinsh
Member Author

Are there any folks in PEcAn-land successfully using model launcher in pecan develop?

@tonygardella
Contributor

@femeunier

@robkooper
Member

The problem looks to be in https://github.com/PecanProject/pecan/blob/develop/base/remote/R/start.model.runs.R#L90; the jobfile used to be opened for writing here.
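
For anyone following along, here is a minimal sketch of that failure mode in plain R. It is not the actual start.model.runs code: `run_dir`, `jobfile_path`, and `jobfile` are hypothetical stand-ins, and the assumption is simply that `writeLines()` is handed something that was never opened as a connection.

```r
# Hypothetical stand-ins for settings$host$rundir and a run id;
# none of these names are taken from the PEcAn source.
run_dir <- file.path(tempdir(), "modellauncher_demo")
dir.create(run_dir, showWarnings = FALSE, recursive = TRUE)
jobfile_path <- file.path(run_dir, "joblist.txt")
run_id_string <- "2000098933"

# Failure mode: the jobfile variable never holds an open connection,
# so writeLines() rejects it. The error message is captured, not assumed.
jobfile <- NULL
err <- tryCatch(
  {
    writeLines(file.path(run_dir, run_id_string), con = jobfile)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(err)

# Fix: open the file for writing before the first writeLines() call,
# and close it afterwards so the contents are flushed to disk.
jobfile <- file(jobfile_path, open = "w")
writeLines(file.path(run_dir, run_id_string), con = jobfile)
close(jobfile)

written <- readLines(jobfile_path)
print(written)
```

If the jobfile is opened this way before the loop over runs, each run directory can be appended with another `writeLines()` call on the same connection.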

@serbinsh
Member Author

serbinsh commented Feb 1, 2019

Ahhh... OK, that makes sense, I think. So does it need to be opened for writing again?

robkooper added a commit to robkooper/pecan that referenced this issue Feb 1, 2019
This fixes issue PecanProject#2262 preventing modellauncher from working.

Updated documentation for modellauncher
Added an example xml file in tests for modellauncher.
@serbinsh
Member Author

serbinsh commented Feb 1, 2019

Blame is pointing to these changes: d66e5a3, but I am not sure yet if that is it. @ashiklom any thoughts on this PR causing issues with modellauncher (ML)? Do you use ML?

@serbinsh
Member Author

serbinsh commented Feb 1, 2019

LOL. OK @robkooper is already on it

@serbinsh
Member Author

Fixed
