R interface to the Leiden Grid Infrastructure
R Shell
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
R
inst
man
resource
test
DESCRIPTION
ENVIRONMENTS
QUICKSTART
README
TODO

README

README
------

The Rlgi package is an integration of the R Programming languages that allows R users to programatically integrate their R models with a Leiden Grid Infrastructure.

This implementation of Rlgi is an adaptation of the Rsge package (which is based on the Rlsf package, etc..). Thanks to all the authors of these package to make this possible.

The program functions as follows:

1. call lgi.par(L|s|C|R)Apply(X, fun, ..., join.method=cbind, njobs) or lgi.apply 
2. The data object is split as specified by njobs or batch.size
3. Each data segment along with the function, arguemnt list (optionally global variables and library names) are saved to a shared network location using the save call.
4. Each worker process loads the saved file, creates the new environment, executes the saved function call (FUN), and saves the results to the shared network location. 
5. The master script uses join.method to merge the results together and returns the result.

(lgi.submit is similar, but does not split data and is asynchronous)


IMPORTANT NOTE: This is no more than a prototype version that works with simple cases. Someone with some R knowledge should probably make it more robust (large datasets, for example).


LICENSE
------
This code is licensed under GPL, use it, modify it, change it, but most of all, enjoy it.

INSTALLATION
------
To install this package, first ensure that snow and RCurl are installed, then run the following. 

R CMD INSTALL Rlgi

This package requires that the LGI user configuration is setup in ~/LGI.cfg or ~/.LGI. Please refer to QUICKSTART and the documentation for lgi.readConfig() for more information.

  Nodes where this package is installed are referred to as submit nodes.
  Nodes where this package will distribute data for computation are referred to as compute nodes.

  SubmitNode:
    - LGI should be configured
      - certificate and project settings should be configured in ~/LGI.cfg or ~/.LGI,
        or the LGI-related Rlgi options should be set
    - R (tested with 2.6.1/2.7, may work with other versions)
    - R packages snow, RCurl and Rlgi installed.
  
  ComputeNode
    - Should be setup to run an LGI resource daemon with R (see resource/)
    - R, Rlgi and snow should be installed
    - Other packages may be required if you use the "packages" option

Usage
------

To get more information about how to use this package, either see:
  - R docs,
  - check the test directory to see examples that I used for testing. 
       - ignore the framework, its a little too much.
  - Check out the source code, there are only a few hundred lines :)

Configuration:
-------
For more on global configuration, see help(lgi.options)

Example session
-------
  > library(Rlgi)
  Loading required package: snow
  Loading required package: XML
  Welcome to Rlgi
      Version: 0.0.7

  > # set defaults, when needed
  > options(lgi.application='R-2.12')

  > # non-cluster-aware call
  > sapply(c(10:20), function(x){x^3})
   [1] 1000 1331 1728 2197 2744 3375 4096 4913 5832 6859 8000

  > # use Rlgi's function to do sapply, but run locally for testing
  > lgi.parSapply(c(10:20), function(x){x^3}, cluster=FALSE)
  [Rlgi] Running locally
   [1] 1000 1331 1728 2197 2744 3375 4096 4913 5832 6859 8000

  > # now run remotely, blocking until all computations are finished
  > lgi.parSapply(c(10:20), function(x){x^3})
  [Rlgi] Submitting 1 jobs...
  [Rlgi] All jobs submitted, waiting for completion.
  [Rlgi] Waiting for jobs:   0 queued;    0 running;   1 other      
   [1] 1000 1331 1728 2197 2744 3375 4096 4913 5832 6859 8000