R <-> Common Lisp gateway (written by Rif, maintained by Tony Rossini)
Latest commit f37e5a9 Apr 17, 2009 @blindglobe add *r-interactive* variable -- set to 1 if working interactively in …
…Common Lisp

Signed-off-by: AJ Rossini <blindglobe@gmail.com>


 -*- mode: text -*-

(we are really in muse mode; but most people don't have it readily
available and we aren't forcing the use of Emacs/muse.   M-x muse-mode
makes reading easier IMHO).

* License 

See the file COPYING in the top level directory.

* Interfacing Common Lisp and R

This library considers multiple approaches for connecting Common Lisp
and R, particularly for embedding R within CL.  CLSR also attempts to
do the reverse.  RCL is a third approach, which provides more
porcelain around the plumbing.  RCLG also tries to use R with the
threaded SBCL capability.  All of this will be commented on later (in
time, in this file).

* Quick start

** Requirements

Common Lisp Implementation:

A. (WORKS!) SBCL 1.0 and later (known, might work on earlier versions)

B. (Goal, but not working yet):  CLISP is a target, but there are a
   few configuration issues to resolve.

You will need the following libraries available:

1. ASDF   (system definition facility, for loading packages)
2. CFFI   (Common foreign function interface: later than CFFI-060526)
                                                        (May 26 2006)
3. RCLG   (this library)

Once you have these, then the simple way to get started is to:

1. add rclg.asd to the ASDF systems path.
2. See  rclg-demo.lisp  for getting started.  It has incantations for:
   a. compiling and loading cffi
   b. compiling and loading rclg
   c. basic R functions
   d. basic data conversion.

* Past and possibly present "Issues"

1. Need to get it working ("again") with CLISP.

** From the file formerly known as NOTES/06032006.rif (date contained)

1. In the current version of cffi, the variable names get wrapped in
   asterisks, so

(defcvar "R_CStackLimit"  :unsigned-long)  ;; :unsigned long
(defcvar "R_SignalHandlers" :unsigned-long) ;; :unsigned long

   make variables named *R-CSTACKLIMIT* and *R-SIGNALHANDLERS*

2. In "rclg-demo.lisp", I wanted to mention that rnb doesn't mean "no
   blocking", it means "no backconverting".  It's not that it doesn't
   protect --- it does all the normal protection, it just doesn't
   convert the result back from R to Lisp.  It's to avoid pushing big
   objects back and forth through the pipe needlessly.  In some sense,
   rnb is what you'd use to do an R assignment, but you get a CL name
   for whatever it is rather than an R name.

3. At this point, it seems like setting *R-SIGNALHANDLERS* to 0 is
   preventing an immediate seg fault.  So that's good.

4. We are still getting errors related to the stack, and I can't
   actually *find* anything.  Whenever I try to find a function, I'm
   getting back "Unbound value".  I don't know for sure that this is
   related to the stack problems.  The equivalent C program does find
   things.  I've put the C and Lisp programs in subdirectory
   simple-test.  I'm not sure whether want to be using Rf_findVar or

5. I'm not sure 100% I'm handling the stack correctly.  On the mailing
   list, it suggests setting R_CStackLimit = -1.  However, CFFI
   believes that *R-CSTACKLIMIT* is an unsigned long, and won't let me
   set it to -1.  I'm seting it to the two's complement value, which
   is 429476295.  I think this is OK, but tell me if it isn't.

6. It is worth noting that Rf_initEmbeddedR *will* change the value of
   R_CStackLimit.  So it is important to set signalhandlers to 0
   BEFORE starting R, and then probably set StackLimit afterwards.
   I've played a bit with going into the R source and turning off the
   stuff in the initialization that sets it to some value, but that
   produced an infinite loop of stack checks.

7. I had to make a bunch of changes so that all the built-in file
   constants match my directory structure.  I think it's set up so you
   only have to change the path in one place now (the defvar of
   *R-HOME-STR* in rclg-load).

** From the file formerly known as src/NOTES


  "Gets the sexptype of an robj.  WARNING: ASSUMES THAT THE TYPE

2. NAs

(defvar *r-NA-internal* -2147483648) ;;  PLATFORM SPECIFIC HACK!!!

3. SBCL-specific hacks

rclg-convert:sequence-to-robj is sbcl-specific!  Should think about
removing rclg-helpers for more portability, if it's fast enough.

Consolidate with-gensyms somewhere?

We only get r-names and r-dims back at the "toplevel" call to r.
Should rnb protect?  Don't think we're using poss-sexp for anything...

with-r-traps is SBCL specific.  
Multiprocessing stuff (with mutex) is SBCL specific.

In R, memory.c contains allocVector.  Looks like it *should* be doable
to directly pass vectors around (non-portably, of course).

* Comparison of Implementation and Design of RCLG, CLSR.

** Tools to initialize R

1. rclg-init : start-rclg update-R
               start-rclg-update-thread stop-rclg-update-thread 
	       with-R-traps with-r-mutex

   initialize and maintain the R evaluator process. 

2. rclg-load : load-r-libraries  (FFI initializer, not FFI <-> CL specifier.

   initialize environment and load libraries.

3. clsr-loader: uffi-load-r-library (package: clsr)

4. clsr : instantiates an R process.

** Tools to handle R data structures and SEXPs in CL

1. rclg-access  : r-setcar, r-car, r-cdr
                  (R SEXP content <-> CL data)

   low-level defuns for data transport between R SEXPs and CL. 

2. rclg-convert : convert-to-r, convert-from-r, *r-na*, r-nil, r-bound
                  (data type and data conversion)

3. rclg-types : SEXP data structure information.  exports types, print method.

   structures and objects for R internal types

4. clsr-rref : High level R SEXP data structures and mappings

5. clsr-sxp : SEXP data structures and mappings

** Mappings:  Name (Function/variable), object registry

1. rclg-control : r (internal: rname-to-robj, rname-to-rfun

   primary interface to R evaluator

2. clsr-objects : tracks created R objects  in CL (reference counter).

** R evaluation

1. rclg-control : r (converts results),
                  rnb (uneval'd R object, unprotected)

1. rclg-parse-objects : tools to handle string commands.

** internal FFI

1. rclg-foreigns : maps libR to CL

** CL tools (some R, some general)

1. rclg-utils : with-gensyms, over-column-major-indices, to-list, to-vector