Time-stamp: <2012-10-10 05:00:41 tony>
Current Status: SOMEWHAT BROKEN
but we are rebuilding it.
“Languages shape how we …” Need to get and insert this quote that Duncan Temple-Lang found.
The API should distinguish between the realization and the statistical interpretation. The goal is to teach statisticians how to think “systems-computationally”, and programmers, comp-sci types, informaticists and other “data scientists” how to think “statistically”, in order to get a jump on the competition.
The goal of this system is to promote a change in thinking, to move the data analysis approach, currently stuck in a mix of 70s-early 90s approaches, into a new generation/level.
The approach we are taking is to provide a general method and some fundamental building blocks, without forcing users into particular approaches, so as to allow for experimentation.
DSLs should be built on top of the core packages, as needed or wanted.
(TonyR:) The DSL I want to build is a verbose, statistically precise computing language, but we need quality code underneath (which others could use for specialized terse DSLs).
DSL: domain specific language.
Version 2 (David)
(This held for the version before we removed liblispstat and plplot and some other “crutches” which had a bit too much bitrot).
We assume that you have a lisp installed and that you have a passing acquaintance with the unix shell.
- The first point to note is that these instructions are written with the assumption of the availability of quicklisp.
If you do not have quicklisp, please go to www.quicklisp.org and install it now.
- The second point to note is that you will need the “git” utility
installed on your machine.
For Mac OS X:
sudo port install git
For Linux (e.g. Debian):
sudo apt-get install git
- Once that is done, execute the following shell commands:
cd ~/quicklisp/local-projects
git clone git://github.com/blindglobe/common-lisp-stat.git
cd common-lisp-stat
git submodule init
These commands copy the source from the repository and register all the associated libraries. It will live as a quicklisp project in the local-projects directory. I find it convenient to symbolically link the quicklisp directory to ~/lisp for easy access:
ln -s ~/quicklisp/local-projects ~/lisp
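The first two points above (quicklisp and git being available) can be checked before cloning. The following is a minimal sketch, not part of CLS: the function name check_cls_prereqs and the QUICKLISP_HOME variable are hypothetical, and the default of ~/quicklisp assumes quicklisp was installed to its usual location.

```shell
#!/bin/sh
# Assumption: quicklisp lives in ~/quicklisp unless QUICKLISP_HOME says otherwise.
QUICKLISP_HOME="${QUICKLISP_HOME:-$HOME/quicklisp}"

# Print one line per missing prerequisite.
# Returns 0 when everything is present, 1 otherwise.
check_cls_prereqs() {
    missing=0
    # git is needed to clone the repository
    if ! command -v git >/dev/null 2>&1; then
        echo "missing: git"
        missing=1
    fi
    # quicklisp picks up projects from its local-projects directory
    if [ ! -d "$QUICKLISP_HOME/local-projects" ]; then
        echo "missing: $QUICKLISP_HOME/local-projects"
        missing=1
    fi
    return "$missing"
}
```

Running check_cls_prereqs before the clone commands gives an early, readable failure instead of a confusing error halfway through the install.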
- Configure the locations of the BLAS and LAPACK libraries
Currently this is a manual operation, which will change in a later version.
Edit the file external/cl-blapack/load-blapack-libs.lisp
Search for the following three parameters: gfortran-lib, blas-lib, lapack-lib.
For OS X: change the parameters as suggested in the file. Both BLAS and LAPACK are preinstalled on Mac OS X.
For Linux: make sure you have the necessary libraries installed, through apt, yum or otherwise:
sudo apt-get install libblas
sudo apt-get install liblapack
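To find the paths to put into those three parameters, it can help to search the usual library directories. This is a minimal sketch under assumptions: find_shared_lib is a hypothetical helper, and the directories in the usage note are only common defaults, which may not match your system.

```shell
#!/bin/sh
# find_shared_lib NAME DIR...
# Print the first file named libNAME.so* or libNAME*.dylib found in the
# given directories. Returns 1 when nothing matches.
find_shared_lib() {
    name="$1"
    shift
    for dir in "$@"; do
        for candidate in "$dir"/lib"$name".so* "$dir"/lib"$name"*.dylib; do
            # An unmatched glob stays literal, so test that the file exists.
            if [ -e "$candidate" ]; then
                echo "$candidate"
                return 0
            fi
        done
    done
    return 1
}
```

For example, find_shared_lib blas /usr/lib /usr/local/lib /opt/local/lib prints a candidate path for blas-lib if one exists in those (assumed) directories.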
- For visualization we are currently using plplot and the
cl-plplot interface. This requires the installation of the
For Mac OS X you can use MacPorts or Homebrew:
5.1 sudo port install xquartz (or download from the XQuartz home site)
5.2 sudo port install plplot
On Linux, use your favourite package manager.
For Windows, we recommend you use Cygwin to get straightforward access. I’ll document the steps if there is a demand.
- You need to check that your dynamic library path has been
properly set up in the shell. In your .bashrc (or equivalent
shell init file)
For Mac OSX set
For Linux set
If you get this wrong the load process will not be able to find the libraries and will prompt you.
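As a rough illustration of this step, the sketch below picks the loader search-path variable by platform: DYLD_LIBRARY_PATH is the variable the macOS dynamic loader reads, and LD_LIBRARY_PATH is the one read on Linux. The helper names lib_path_var and prepend_path are hypothetical, and the directory you add is an assumption that depends on where your package manager put the libraries.

```shell
#!/bin/sh
# lib_path_var OSNAME
# Print the name of the dynamic-loader search-path variable for the
# platform name reported by `uname -s`.
lib_path_var() {
    case "$1" in
        Darwin) echo "DYLD_LIBRARY_PATH" ;;
        *)      echo "LD_LIBRARY_PATH" ;;
    esac
}

# prepend_path DIR CURRENT
# Print DIR prepended to the colon-separated list CURRENT,
# without leaving a trailing colon when CURRENT is empty.
prepend_path() {
    if [ -n "$2" ]; then
        echo "$1:$2"
    else
        echo "$1"
    fi
}
```

On Linux, a line such as export LD_LIBRARY_PATH="$(prepend_path /usr/local/lib "$LD_LIBRARY_PATH")" in your shell init file would then do the job, with /usr/local/lib standing in for your actual library directory.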
- Once the prerequisites are in place, start your favourite lisp and enter
(ql:register-local-projects)
(ql:quickload :cls)
Retire for a well-earned coffee and upon your return you should find the package completely installed. Obviously, errors can creep in when spelling the filenames, so be careful.
Version 1 (Tony)
You probably did (preferred)
git clone git://github.com/blindglobe/common-lisp-stat.git
(or maybe using the repo.or.cz git repository archive), or (coming soon!) from within a Lisp instance:
At one point, I planned a pure git-delivery via cloning and submodules, but this proved to be a bit more complex than needed, thanks to the creation of quicklisp. It’s also a stupid idea if one plans to have users who are not hackers or developers, and eventually we want users.
Despite quicklisp, we will still need a way to deliver a development-oriented CLS environment. This will consist of git repositories, possibly through submodules, but the use of submodules is still up for discussion.
There are quite a few libraries that are needed, and right now we are working on simplifying the whole thing. Once you get past this step, then you should:
- Run a Common Lisp (SBCL, CMUCL, CLISP, Clozure CL), starting in the current directory. You will need ASDF at a minimum; quicklisp is preferred, and you should have it.
- On Debian or similar systems, you can use the CLC (Common Lisp Controller) or SBCL approaches, i.e. ~/.clc/systems or ~/.sbcl/systems should contain softlinks to the cls and other required ASDF files (i.e. cls.asd, cffi.asd, and lift.asd).
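The softlink setup in the second point can be scripted. This is a minimal sketch, assuming the .asd files sit directly in the source directory you pass in; link_asd_files is a hypothetical helper, not something shipped with CLS.

```shell
#!/bin/sh
# link_asd_files SRC DEST
# Create a symbolic link in DEST for every .asd file found directly in SRC.
# SRC should be an absolute path so the links resolve from anywhere.
link_asd_files() {
    src="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    for asd in "$src"/*.asd; do
        # Skip the literal pattern when SRC contains no .asd files.
        [ -e "$asd" ] || continue
        ln -sf "$asd" "$dest/"
    done
}
```

For the SBCL approach this would be called as link_asd_files /path/to/common-lisp-stat ~/.sbcl/systems, and again for each required library checkout.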
There are example sessions and scripts for data analysis, some real, some proposed, in the examples/ directory. Also see TODO.org for snippets of code that work or fail to work.
Example Usage steps [2/7]
Start and Load
- start your lisp
- load CLS
Set up a place to work
In Common Lisp, you need to select and set up a namespace to store data and functions. There is a scratch user-package, or sandbox, for CLS, cls-user, which you can select via:
and this has some basic modules from CLS instantiated (dataframes, probability calculus, numerical linear algebra, basic summaries (numerical and visual displays)).
However, it can be better to create a package to work in, which pulls in only the desired functionality:
(in-package cl-user)

(defpackage :my-package-user
  (:documentation "Demo of how to set up a user package. Serious work should be placed in a similar package elsewhere for reproducibility. This hints as to what needs to be done for a user- or analysis-package.")
  (:nicknames :my-clswork-user)
  (:use :common-lisp ; always needed for user playgrounds!
        :lisp-matrix ; we only need the packages that we need...
        :common-lisp-statistics
        :cl-variates
        :lisp-stat-data-examples) ;; this ensures access to a data package
  (:shadowing-import-from :lisp-stat
    ;; This is needed temporarily until we resolve the dependency and call structure.
    call-method call-next-method
    expt + - * / ** mod rem abs 1+ 1- log exp sqrt sin cos tan
    asin acos atan sinh cosh tanh asinh acosh atanh float random
    truncate floor ceiling round minusp zerop plusp evenp oddp
    < <= = /= >= > >
    ;; complex conjugate realpart imagpart phase
    min max logand logior logxor lognot ffloor fceiling
    ftruncate fround signum cis
    <= float imagpart)
  (:export summarize-data summarize-results this-data this-report))

(in-package :my-clswork-user) ;; or :my-package-user

(setf my-data
      (let ((var1 ))
        ))
We need to pull in the packages with data or functions that we need; just because the data or function is pulled in by another package, in that package’s namespace, does NOT mean it is available in this namespace. However, the common-lisp-statistics package will ensure that fundamental objects and functions are always available.
Get to work [0/3]
Pull in or create data
Save work and results for knowledge building and reuse
One can build a package, or save an image (CL implementation dependent) or…
Inform moi of problems or successes
NEED TO SET UP A MAILING LIST!!
Email email@example.com if there is anything wrong, or even if something happens to work.
- SBCL is the target platform. CCL and CMUCL should be similar.
- CLISP is finicky regarding the problems that we have with CFFI conversion. In particular, we cannot really do the typing that we need to take care of. I think this is my (Tony’s) problem, not someone else’s, and specifically, not CLISP’s.
- Need to test ECL.
See files in Doc/ for history, design considerations, and random, sometimes false and misleading, musings.
Local modifications, Development, Contributions
Since this project is
# git clone git://repo.or.cz/CommonLispStat.git
git clone git://github.com/blindglobe/common-lisp-stat.git
cd common-lisp-stat
# git submodule init
# git submodule update
will pull the whole repository, and create a “master” branch to work on. If you are making edits, which I’d like, you don’t want to work on the master branch directly but on a topic-centric branch, so you might:
git checkout -b myTopicBranch
and then work on myTopicBranch, pulling back to the master branch when needed by
git checkout master
git pull . myTopicBranch
git rebase myTopicBranch
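The topic-branch cycle can be tried out safely in a throwaway repository first. The sketch below is only an illustration: demo_topic_branch, the file name notes.txt, and the placeholder identity are all made up, and it uses git merge, which for a local branch does the same job as the git pull . myTopicBranch shown above.

```shell
#!/bin/sh
# demo_topic_branch DIR
# Create a scratch repository in DIR and run one topic-branch cycle.
# The body runs in a subshell so the caller's working directory is untouched.
demo_topic_branch() (
    dir="$1"
    mkdir -p "$dir" && cd "$dir" || exit 1
    git init -q .
    # Force the branch name used in the text, whatever the local default is.
    git symbolic-ref HEAD refs/heads/master
    # Placeholder identity, local to the scratch repository only.
    git config user.name "Demo User"
    git config user.email "demo@example.invalid"

    echo "first" > notes.txt
    git add notes.txt
    git commit -q -m "initial commit" || exit 1

    # Do the work on a topic branch instead of master.
    git checkout -q -b myTopicBranch
    echo "second" >> notes.txt
    git add notes.txt
    git commit -q -m "work on the topic branch" || exit 1

    # Bring the finished work back into master.
    git checkout -q master
    git merge -q myTopicBranch
)
```

After demo_topic_branch /tmp/cls-scratch, running git log in that directory shows both commits on master.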
BETTER DOCUMENTATION EXAMPLES EXIST ON-LINE!! PLEASE READ THEM, THE ABOVE IS SPARSE AND MIGHT BE OUTDATED!
Contributing through GitHub
Alternatively, one can work on the github repositories as well. They are a bit differently organized, and require one to get a github account and work from there.
Basically, clone the repository on github through the WWW interface, then make a branch (as above), push the branch back to github, and notify the main repository that there is something to be pulled. And we’ll pull it back in.
Committing with the MOB on repo.or.cz
Of course, perhaps you want to contribute to the mob branch. For that, after cloning the repository as above, you would:
git checkout -b mob remotes/origin/mob
Then work, work, work… through a cycle of
<edit>
git add <files just edited>
git commit -m "what I just did"
ad nauseam. When ready to push, then just:
git push git+ssh://firstname.lastname@example.org/srv/git/CommonLispStat.git mob:mob
and it’ll be put on the mob branch, as a proposal for merging.
Another approach would be to pull from the topic branch into the mob branch before uploading. Will work on a formal example soon.
The basic principle is that, instead of the edit cycle on mob, you do something like:
git checkout mob
git pull . myTopicBranch
git push git+ssh://email@example.com/srv/git/CommonLispStat.git mob:mob
Licensing will be important. Next decade. But do think through what you intend with your contributions. Should we become famous (Ha!) make sure that you’ve communicated your expectations…
[fn:1] I'm not including instructions for Emacs or git, as the former is dealt with elsewhere and the latter was required for you to get this. Since disk space is cheap, I'm intentionally forcing git to be part of this system. Sorry if you hate it. Org-mode, org-babel, and org-babel-lisp, and hypo are useful for making this file a literate and interactively executable piece of work.