Experimental work for using IPython.parallel with scikit-learn
PyCon 2012 - PyData sprint

Sandbox for collaborating during the PyCon sprint on issues related to distributed data analytics.

Stuff to investigate

  • Distributed grid search
  • Efficient broadcasting of data to the local stores of the nodes, using a memoization pattern à la joblib.Memory
  • Distributed random forests
  • Leverage data locality from Disco's DFS in IPython.parallel engines
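
As a starting point for the distributed grid search item, here is a minimal sketch of the task-splitting idea: expand the parameter grid into independent candidates, score them in parallel, and keep the best. Everything in it is an assumption made for illustration: expand_grid, toy_score and grid_search are hypothetical names, toy_score stands in for fitting and cross-validating a real scikit-learn estimator, and a standard-library thread pool stands in for the IPython.parallel engines. On a real cluster the map call would presumably go through a view on the engines instead.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def expand_grid(param_grid):
    """Yield one dict per combination of the values in param_grid."""
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


def toy_score(params):
    """Hypothetical stand-in for fitting and scoring a model.

    The score is highest at C=1.0, gamma=0.1. A real task would
    cross-validate an estimator built from these parameters.
    """
    return -((params["C"] - 1.0) ** 2) - (params["gamma"] - 0.1) ** 2


def grid_search(param_grid, score_func, n_workers=4):
    """Score every combination in parallel and return the best one."""
    candidates = list(expand_grid(param_grid))
    # The thread pool is a local stand-in for remote engines: each
    # candidate is an independent task, so any parallel map would do.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        scores = list(pool.map(score_func, candidates))
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]


best_params, best_score = grid_search(
    {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]}, toy_score
)
```

Because the candidates are independent, the only shared input is the training data, which is why the broadcasting and data locality items above matter for making this efficient.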
