Nanocubes: an in-memory data structure for spatiotemporal data cubes

Nanocubes are a fast datastructure for in-memory data cubes developed at the Information Visualization department at AT&T Labs – Research. Visualizations powered by nanocubes can be used to explore datasets with billions of elements at interactive rates in a web browser, and in some cases nanocubes uses sufficiently little memory that you can run a nanocube in a modern-day laptop.

Releases

Number	Description
2.1	Javascript front-end, CSV Loading, Bug Fixes
2.0	New feature-rich querying API
1.0	Original release with a simple querying API

Compiling latest release

Prerequisites The nanocubes server is written in C++ 11 (gcc >=4.7.2). You'll need a recent version of boost (>=1.48) and the GNU build system.

Run the following commands to compile nanocubes on your linux/mac system.

$ wget https://github.com/laurolins/nanocube/archive/2.1.zip
$ unzip 2.1.zip
$ cd nanocube-2.1
$ ./bootstrap
$ ./configure
$ make

If a recent version of gcc is not the default, you can run configure with the specfic recent version of gcc in your system. For example

$ CXX=g++-4.8 ./configure

If your system has tcmalloc installed, we suggest linking nanocubes with it. For example

$ CXX=g++-4.8 ./configure LIBS=<path-to-tcmalloc>/libtcmalloc_minimal.a

Loading a CSV file into a nanocube

This procedure assumes your machine is not running anything on ports 8000 and 29512

Installing Pandas in a separate python env (Optional)

 $ wget http://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.11.4.tar.gz
 $ tar xfz virtualenv-1.11.4.tar.gz
 $ python virtualenv-1.11.4/virtualenv.py  myPy
 
 # activate the virtualenv, type "deactivate" to disable the env
 $ source myPy/bin/activate
 $ pip install argparse numpy pandas

Start a web server in the "web" directory and send it to background
```
 $ cd web
 $ python -m SimpleHTTPServer &
```

Run the script and pipe it to the nanocubes server using the included example dataset (Chicago Crime)

 $ cd ../scripts
 $ python csv2Nanocube.py --catcol='Primary Type' crime50k.csv | NANOCUBE_BIN=../src  ../src/ncserve --rf=100000 --threads=100

This should be it. Point your browser to http://localhost:8000 for the viewer

    Parameters for csv2Nanocube.py
    --catcol='Column names of categorical variable'
    --latcol='Column names of latitude'
    --loncol='Column names of longitude'
    --countcol='Column names of the count'
    --timecol='Column names of the time variable'
    --timebinsize='time bin size in seconds(s) minutes(m) hours(h) days(D)
    --sep='Delimiter of Columns'
    e.g. 1D/30m/60s'

Further Details

For a better understanding on how to ingest data into nanocubes and how to query nanocubes follow this link. For larger datasets or if you want more flexibility on ingesting/querying data using nanocubes the CSV loading method illustrated above might not be the most efficient way to go.

Asking for help

IRC: #nanocubes channel on irc.freenode.net

mailing list: http://mailman.nanocubes.net/mailman/listinfo/nanocubes-discuss_mailman.nanocubes.net

Name		Name	Last commit message	Last commit date
Latest commit History 146 Commits
doc		doc
examples		examples
js		js
m4		m4
ncdmp		ncdmp
scripts		scripts
simple		simple
src		src
tmp		tmp
web		web
.gitignore		.gitignore
.gitnotify		.gitnotify
COPYRIGHT		COPYRIGHT
LICENSE		LICENSE
Makefile.am		Makefile.am
README.md		README.md
VERSION		VERSION
bootstrap		bootstrap
configure.ac		configure.ac

License

finid/nanocube

Folders and files

Latest commit

History

Repository files navigation

Nanocubes: an in-memory data structure for spatiotemporal data cubes

Releases

Compiling latest release

Loading a CSV file into a nanocube

Further Details

Asking for help

About

Resources

License

Stars

Watchers

Forks