Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
branch: master
Fetching contributors…

Cannot retrieve contributors at this time

577 lines (451 sloc) 19.341 kb

Time-stamp: <2011-01-24 20:10:44 tony> Creation: <2008-09-08 08:06:30 tony>

Intro and Metadata

File: TODO.lisp Author: AJ Rossini <blindglobe@gmail.com> Copyright: (c) 2007-2010, AJ Rossini <blindglobe@gmail.com>. BSD. Purpose: Stuff that needs to be made working sits inside the Task sections.

This file contains the current challenges to solve, including a description of the setup and the work to solve. Solutions welcome.

What is this talk of ‘release’? Klingons do not make software ‘releases’. Our software ‘escapes’, leaving a bloody trail of designers and quality assurance people in its wake.

Design

(Internal) Package and (External) System Hierarchy

Singletons (primary building blocks)

These are packages as well as

asdf common system loader
xarray common access structure to array-like
(matrix, vector) structures.
cls-config initialization of Lisp state, variables, etc,
localization to the particular lisp.
lift unit-testing
cffi foriegn function library
alexandria
babel
iterate
metabang-bind

Dependency structure

lisp-matrix general purpose matrix package, linking to lapack
for numerics. Depends on:
ffa cffi
fnv cffi
cl-blapack cffi
xarray
cls-dataframe in the same spirit as lisp-matrix, a means to
create tables. Perhaps better called datatables?
cls-probability depends on gsll, cl-variates, cl-? initially,

Need to integrate

cl-randist

rsm-string

?? cl-2d : cl-cairo2 : cffi

?? cl-plplot : cffi

Tasks to Do [4/25]

Usually, we need to load it before going on.

(asdf:oos 'asdf:load-op :cls)

[#B] SET UP

  • State “DONE” from “CURR” [2010-10-12 Tue 13:48]
    setup is mostly complete
  • State “CURR” from “TODO” [2010-10-12 Tue 13:47]
  • State “TODO” from “” [2010-10-12 Tue 13:47]

This is an example of a custom setup, not really interesting at this point except to remind Tony how to program.

(in-package :cl-user)
(progn 
  (defun init-CLS (&key (compile 'nil))
    (let ((packagesToLoad (list ;; core system
                                :lift :lisp-matrix :cls

                                ;; visualization
                                ;; :cl-cairo2-x11 :iterate
                                :cl-2d
                                ;; doc reporting
                                :cl-pdf :cl-typesetting
                                ;;INFRA
                                :asdf-system-connections :xarray
                                ;;DOCS
                                :metatilities-base :anaphora :tinaa
                                :cl-ppcre :cl-markdown :docudown
                                ;; version and validate CLOS objects
                                ;; :versioned-objects :validations
                                ;;VIZ
                                ;; :cl-opengl
                                ;; :cl-glu :cl-glut :cl-glut-examples

                                :bordeaux-threads
                                ;; :cells :cells-gtk
                                )))
      (mapcar #'(lambda (x)
                  (if compile
                      (asdf:oos 'asdf:compile-op x :force T)
                      (asdf:oos 'asdf:load-op x)))
              packagesToLoad)))

  (init-CLS)) ;; vs (init-CLS :compile T)

[#A] Integrate with quicklist support.

  • State “TODO” from “” [2010-11-30 Tue 18:00]

important to merge with quicklisp system loader support.

CURR [#A] Testing: unit, regression, examples. [0/3]

  • State “CURR” from “TODO” [2010-10-12 Tue 13:51]
  • State “TODO” from “” [2010-10-12 Tue 13:51]

Testing consists of unit tests, which internally verify subsets of code, regression tests, and functional tests (in increasing order of scale).

CURR [#B] Unit tests

  • State “CURR” from “TODO” [2010-11-04 Thu 18:33]
  • State “CURR” from “TODO” [2010-10-12 Tue 13:48]
  • State “TODO” from “” [2010-10-12 Tue 13:48]

Unit tests have been started using LIFT. Need to consider some of the other systems that provide testing, when people add them to the mix of libraries that we need, along with examples of how to use.

(in-package :lisp-stat-unittests)
(run-tests :suite 'lisp-stat-ut)
;; => tests = 78, failures = 7, errors = 20
(asdf:oos 'asdf:test-op 'cls)
;; which runs (describe (run-tests :suite 'lisp-stat-ut))

and check documentation to see if it is useful.

(in-package :lisp-stat-unittests)

(describe 'lisp-stat-ut)
(documentation 'lisp-stat-ut 'type)

;; FIXME: Example: currently not relevant, yet
;;   (describe (lift::run-test :test-case  'lisp-stat-unittests::create-proto
;;                             :suite 'lisp-stat-unittests::lisp-stat-ut-proto))

(describe (lift::run-tests :suite 'lisp-stat-ut-dataframe))
(lift::run-tests :suite 'lisp-stat-ut-dataframe)

(describe (lift::run-test
  	       :test-case  'lisp-stat-unittests::create-proto
  	       :suite 'lisp-stat-unittests::lisp-stat-ut-proto))

[#B] Regression Tests

  • State “TODO” from “” [2010-10-12 Tue 13:54]

[#B] Functional Tests

  • State “TODO” from “” [2010-10-12 Tue 13:54]

CURR [#B] Functional Examples that need to work [1/2]

  • State “CURR” from “TODO” [2010-11-30 Tue 17:57]
  • State “TODO” from “” [2010-10-12 Tue 13:55]

These examples should be functional forms within CLS, describing working functionality which is needed for work.

[#B] Scoping with datasets

  • State “TODO” from “” [2010-11-04 Thu 18:46]

The following needs to work, and a related syntax for resampling and similar synthetic data approaches (bootstrapping, imputation) ought to use similar syntax as well.

(in-package :ls-user)
(progn
  ;; Syntax examples using lexical scope, closures, and bindings to
  ;; ensure a clean communication of results
  ;; This is actually a bit tricky, since we need to clarify whether
  ;; it is line-at-a-time that we are considering or if there is
  ;; another mapping strategy.  In particular, one could imagine a
  ;; looping-over-observations function, or a
  ;; looping-over-independent-observations function which leverages a
  ;; grouping variable which provides guidance for what is considered
  ;; independent from the sampling frame being considered. The frame
  ;; itself (definable via some form of metadata to clarify scope?)
  ;; could clearly provide a bit of relativity for clarifying what
  ;; statistical independence means.
  
  (with-data dataset ((dsvarname1 [usevarname1])
                      (dsvarname2 [usevarname2]))
      @body)
  
  (looping-over-observations
     dataset ((dsvarname1 [usevarname1])
              (dsvarname2 [usevarname2]))
       @body)

  (looping-over-independent-observations
     dataset independence-defining-variable
       ((dsvarname1 [usevarname1])
        (dsvarname2 [usevarname2]))
       @body)
  )

[#B] Dataframe variable typing

  • State “DONE” from “CURR” [2010-11-30 Tue 17:56]
    check-type approach works, we would just have to throw a catchable error if we want to use it in a reliable fashion.
  • State “CURR” from “TODO” [2010-11-30 Tue 17:56]
  • State “TODO” from “” [2010-11-04 Thu 18:48]

Seems to generally work, need to ensure that we use this for appropriate typing.

(in-package :ls-user)
(defparameter *df-test*
  (make-instance 'dataframe-array
                 :storage #2A (('a "test0" 0 0d0)
                               ('b "test1" 1 1d0)
                               ('c "test2" 2 2d0)
                               ('d "test3" 3 3d0)
                               ('e "test4" 4 4d0))
                 :doc "test reality"
                 :case-labels (list "0" "1" 2 "3" "4")
                 :var-labels (list "symbol" "string" "integer" "double-float")
                 :var-types (list 'symbol 'string 'integer 'double-float)))

;; with SBCL, ints become floats?  Need to adjust output
;; representation appropriately..
*df-test* 

(defun check-var (df colnum)
  (let ((nobs (xdim (dataset df) 0)))
    (dotimes (i nobs)
      (check-type (xref df i colnum) (elt (var-types df) i)))))

(xdim (dataset *df-test*) 1)
(xdim (dataset *df-test*) 0)

(check-var *df-test* 0)

(class-of
  (xref *df-test* 1 1))

(check-type (xref *df-test* 1 1)
            string) ;; => nil, so good.
(check-type (xref *df-test* 1 1)
            vector) ;; => nil, so good.
(check-type (xref *df-test* 1 1)
            real) ;; => simple-error type thrown, so good.

;; How to nest errors within errors?
(check-type (check-type (xref *df-test* 1 1) real) ;; => error thrown, so good.
            simple-error)
(xref *df-test* 1 2)

(check-type *df-test*
            dataframe-array) ; nil is good.

(integerp (xref *df-test* 1 2))
(floatp (xref *df-test* 1 2))
(integerp (xref *df-test* 1 3))
(type-of (xref *df-test* 1 3))
(floatp (xref *df-test* 1 3))

(type-of (vector 1 1d0))
(type-of *df-test*)

(xref *df-test* 2 1)
(xref *df-test* 0 0)
(xref *df-test* 1 0)
(xref *df-test* 1 '*)

CURR [#A] Random Numbers [2/6]

  • State “CURR” from “TODO” [2010-11-05 Fri 15:41]
  • State “TODO” from “” [2010-10-14 Thu 00:12]

Need to select and choose a probability system (probability functions, random numbers). Goal is to have a general framework for representing probability functions, functionals on probabilities, and reproducible random streams based on such numbers.

CURR [#B] CL-VARIATES system evaluation [2/3]

  • State “CURR” from “TODO” [2010-11-05 Fri 15:40]
  • State “TODO” from “” [2010-10-12 Tue 14:16]

CL-VARIATES is a system developed by Gary W King. It uses streams with seeds, and is hence reproducible. (Random comment: why do CL programmers as a class ignore computational reproducibility?)

[#B] load and verify

  • State “DONE” from “CURR” [2010-11-04 Thu 18:59]
    load, init, and verify performance.
  • State “CURR” from “TODO” [2010-11-04 Thu 18:58]
  • State “TODO” from “” [2010-11-04 Thu 18:58]

<2010-11-30 Tue> : just modified cls.asd to ensure that we load as appropriate the correct random variate package.

(in-package :cl-user)
(asdf:oos 'asdf:load-op 'cl-variates)
(asdf:oos 'asdf:load-op 'cl-variates-test)

(in-package :cl-variates-test)
;; check tests
(run-tests :suite 'cl-variates-test)
(describe (run-tests :suite 'cl-variates-test))

[#B] Examples of use

  • State “DONE” from “CURR” [2010-11-05 Fri 15:39]
    basic example of reproducible draws from the uniform and normal random number streams.
  • State “CURR” from “TODO” [2010-11-05 Fri 15:39]
  • State “TODO” from “” [2010-11-04 Thu 19:01]

(in-package :cl-variates-user)

(defparameter state (make-random-number-generator))
(setf (random-seed state) 44)

(random-seed state)
(loop for i from 1 to 10 collect
                  (random-range state 0 10))
;; => (1 5 1 0 7 1 2 2 8 10)
(setf (random-seed state) 44)
(loop for i from 1 to 10 collect
                  (random-range state 0 10))
;; => (1 5 1 0 7 1 2 2 8 10)

(setf (random-seed state) 44)
(random-seed state)
(loop for i from 1 to 10 collect
                  (normal-random state 0 1))
;; => 
;; (-1.2968656102820426 0.40746363934173213 -0.8594712469518473 0.8795681301148328
;;  1.0731526250004264 -0.8161629082481728 0.7001813608754809 0.1078045427044097
;;  0.20750134211656893 -0.14501914108452274)

(setf (random-seed state) 44)
(loop for i from 1 to 10 collect
                  (normal-random state 0 1))
;; => 
;; (-1.2968656102820426 0.40746363934173213 -0.8594712469518473 0.8795681301148328
;;  1.0731526250004264 -0.8161629082481728 0.7001813608754809 0.1078045427044097
;;  0.20750134211656893 -0.14501914108452274)

CURR [#B] Full example of general usage

  • State “CURR” from “TODO” [2010-11-05 Fri 15:40]
  • State “TODO” from “” [2010-11-05 Fri 15:40]

What we want to do here is describe the basic available API that is present. So while the previous work describes what the basic reproducibility approach would be in terms of generating lists of reproducible pRNG streams, we need the full range of possible probability laws that are present.

One of the good things about cl-variates is that it provides for reproducibility. One of the bad things is that it has a mixed bag for an API.

[#B] CL-RANDOM system evaluation

  • State “TODO” from “” [2010-11-05 Fri 15:40]

Problems:

  1. no seed setting for random numbers
  2. contamination of a probability support with optimization and linear algebra.

Positives:

  1. good code
  2. nice design for generics.

[#B] Native CLS (from XLS)

  • State “TODO” from “” [2010-11-05 Fri 15:40]

[#B] Numerical Linear Algebra

  • State “TODO” from “” [2010-10-14 Thu 00:12]

[#B] LLA evaluation

  • State “TODO” from “” [2010-10-12 Tue 14:13]

;;; experiments with LLA (in-package :cl-user) (asdf:oos ‘asdf:load-op ‘lla) (in-package :lla-user)

CURR [#B] Lisp-Matrix system evaluation

  • State “CURR” from “TODO” [2010-10-12 Tue 14:13]
  • State “TODO” from “” [2010-10-12 Tue 14:13]

[#B] LispLab system evaluation

  • State “TODO” from “” [2010-10-12 Tue 14:13]

[#B] Statistical Procedures to implement

  • State “TODO” from “” [2010-10-14 Thu 00:12]

PFIM

(in-package :cls-user) ;;;; PFIM notes

;; PFIM 3.2

;; population design eval and opt #| issues:

  • # individuals
  • # sampling times
  • sampling times?

constraints: number of samples/cost of lab analysis and collection expt constraints

#

(defun pfim (&key model ( constraints ( summary-function )

(list num-subjects num-times list-times))))

#| N individuals i Each individal has a deisgn psi_i nubmer of samples n_i and sampling times ti{1} ti{n_1} individuals can differ

Model:

individual-level model

#

(=model y_i (+ (f θ_i ψ_i) epsilion_i )) (=var \epsilion_i σ_between σ_within )

;; Information Matrix for pop deisgn

(defparameter IM (sum (i 1 N) (MF ψ_i φ_i)))

#| For nonlinear structureal models, expand around RE=0

Cramer-Rao : MF-1 is lower bound for estimation variance.

Design comparisons:

  • smallest SE, but is a matrix, so
  • criteria for matrix comparison

– D-opt, (power (determinant MF) (/ 1 P))

find design maxing D opt, (power (determinant MF) (/ 1 P)) Design varialables – contin vars for smapling times within interval or set – number of groups for cat vars

Stat in Med 2009, expansion around post-hoc RE est, not necessarily zero.

Example binary covariate C

#

(if (= i reference-class) (setf (aref C i) 0) (setf (aref C i) 1))

;; Exponential RE, (=model (log θ) ( ))

;; extensions

;; outputs

#| PFIM provides for a given design and values of β: compute extended FIM SE/RSE for β of each class of each covar eval influence of design on SE(β)

inter-occassion variability (IOV)

  • patients sampled more than once, H occassions
  • RE for IOV
  • additional vars to estimate
#

;;; comparison criteria

functional of conc/time curve which is used for comparison, i.e. (AUC conc/time-curve) (Cmax conc/time-curve) (Tmax conc/time-curve)

where

(defun conc/time-curve (t) ;; computation #| (let ((conc (exp (* t β1)))) conc)

|# )

;;See (url-get “www.pfim.biostat.fr”)

;;; Thinking of generics… (information-matrix model parameters) (information-matrix variance-matrix) (information-matrix model data) (information-matrix list-of-individual-IMs)

(defun IM (loglikelihood parameters times) “Does double work. Sum up the resulting IMs to form a full IM.” (let ((IM (make-matrix (length parameters) (length parameters) :initial-value 0.0d0))) (dolist (parameterI parameters) (dolist (parameterJ parameters) (setf (aref IM I J) (differentiate (differentiate loglikelihood parameterI) parameterJ))))))

difference between empirical, fisherian, and …? information.

Example of Integration with CL-GENOMIC

  • State “TODO” from “” [2010-10-12 Tue 14:03]

CL-GENOMIC is a very interesting data-structure strategy for manipulating sequence data.

(in-package :cl-user)
(asdf:oos 'asdf:compile-op :ironclad)
(asdf:oos 'asdf:load-op :cl-genomic)

(in-package :bio-sequence)
(make-dna "agccg") ;; fine
(make-aa "agccg")  ;; fine
(make-aa "agc9zz") ;; error expected

[#B] Documentation and Examples [0/3]

  • State “TODO” from “” [2010-10-14 Thu 00:12]

[#B] Docudown

  • State “TODO” from “” [2010-11-05 Fri 15:34]

[#A] CLDOC

  • State “TODO” from “” [2010-11-05 Fri 15:34]

[#B] CLPDF, and literate data analysis

  • State “TODO” from “” [2010-11-05 Fri 15:34]

Proposals

Jump to Line
Something went wrong with that request. Please try again.