Past sprints

SciPy 2011 sprinting: July 15-16

Location: at the SciPy conference (Austin)

People and tasks

Gael Varoquaux: review code, merge
Marcel Caraciolo: review code, easyfix issues.
David Warde-Farley: review

1st April 2011

Places

In Paris: at Logilab's (104 boulevard Blanqui, Paris) - Metro 6 - Glacière

In Boston at MIT (36-537: 5th floor of building 36)

On IRC (#scikit-learn on irc.freenode.net)

People present

Please add your skills, interests or planned task to make it easier to organize the sprint and pair people on tasks. To share as much knowledge as possible, each task would ideally be handled by a pair of people with different skills.

At Logilab, Paris (from 9H to 19H):

Gaël Varoquaux: task: code review, pair programming on specific task where needed.
Julien Miotte
Feth Arezki: could help with coding (w/ the logger?), LaTeX. Interested in learning about scikit.
Nelle Varoquaux: task: minibatch k-means
Fabian Pedregosa
Vincent Michel: task: code review, pair programming. features: ward's clustering.
Luis Belmar-Letelier
Thouis Jones: task: BallTree cython wrapper, documentation, whatever.

At MIT, Boston:

Alexandre Gramfort: task: code review and pair programming
Demian Wassermann: task: Gaussian Processes with sparse data
Satra Ghosh: task: Ensemble Learning, random forests
Nico Pinto
Pietro Berkes

On IRC (from around 9am Brasília time, GMT-3):

Alexandre Passos: task: Dirichlet process mixture of Gaussian models (In progress)
Vlad Niculae: task: matrix factorization (In progress)
Marcel Caraciolo: task: help in docs and bug fixes (beginner in the project).

Paris coding Sprint, 8-9 Sept. 2010

Place:

INRIA research center in Saclay-Ile de France, also in channel #scikit-learn, on irc.freenode.org. Room to be determined.

Some ideas:

extend the tutorial with feature selection, cross-validation, etc.

design a Sphinx template for the main web page. A tentative design (http://www.flickr.com/photos/fseoane/4573612893/) exists but has not been translated into a Sphinx template.

Group lasso with coordinate descent in GLM module

Covariance estimators (Ledoit-Wolf) -> Regularized LDA

Add transform in LDA

PCA with fit + transform

preprocessing routines (center, standardize) with fit transform

K-means with Pybrain heuristic

Make Pipeline object work for real

FastICA

Anything you can think of, such as:

Spectral Clustering + manifold learning (MDS/PCA, Isomap, Diffusion maps, t-SNE)

Canonical Correlation Analysis

Kernel PCA

Gaussian Process regression
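
The "fit + transform" items in the list above (PCA, preprocessing) refer to a two-step pattern: fit learns parameters from data, and transform applies them. As a rough illustration only, here is a minimal pure-Python sketch of centering and standardizing in that style. The class name Standardizer and the attributes mean_ and std_ are hypothetical and are not taken from the scikits.learn code base.

```python
import math


class Standardizer:
    """Hypothetical preprocessing step: center each column, then scale it."""

    def fit(self, rows):
        # Learn the per-column mean and standard deviation from the data.
        n = len(rows)
        columns = list(zip(*rows))
        self.mean_ = [sum(col) / n for col in columns]
        # A constant column has zero deviation; fall back to 1.0 so that
        # transform never divides by zero.
        self.std_ = [
            math.sqrt(sum((x - m) ** 2 for x in col) / n) or 1.0
            for col, m in zip(columns, self.mean_)
        ]
        return self

    def transform(self, rows):
        # Apply the statistics learned in fit; the input is left untouched.
        return [
            [(x - m) / s for x, m, s in zip(row, self.mean_, self.std_)]
            for row in rows
        ]


train = [[1.0, 10.0], [3.0, 10.0]]
scaler = Standardizer().fit(train)
scaled = scaler.transform(train)      # [[-1.0, 0.0], [1.0, 0.0]]
unseen = scaler.transform([[2.0, 11.0]])  # [[0.0, 1.0]]
```

Keeping fit and transform separate means the statistics learned on one dataset can be reused on new data, which is also what would let such a step be chained with others in a pipeline.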

0.4 Coding Sprint, 16 & 17 June 2010

Place:

channel #scikit-learn, on irc.freenode.org. If you do not have an IRC client or are behind a firewall, check out http://webchat.freenode.net/

Some ideas:

adapt the plotting features from the em module into gmm module.

incorporate more datasets: the diabetes dataset from the lars R package, featured datasets from http://archive.ics.uci.edu/ml/datasets.html, etc.

anything from the issue tracker.

extend the tutorial with feature selection, cross-validation, etc.

profile and improve the performance of the gmm module.

submit a new classifier

refactor the ann module (artificial neural networks) to conform to the API in the rest of the modules, or submit a new ann module.

make it compatible with Python 3 (shouldn't be hard now that there's a NumPy release for Python 3)

design a Sphinx template for the main web page. A tentative design (http://www.flickr.com/photos/fseoane/4573612893/) exists but has not been translated into a Sphinx template.

anything you can think of.

Documentation Week, 14-18 March 2010

Place:

channel #learn, on irc.freenode.org. If you do not have an IRC client or are behind a firewall, check out http://webchat.freenode.net/

Possible Tasks:

Document our design choices (methods in each class, convention for estimated parameters, etc.). Most of this is in ApiDiscussion.

Documentation for neural networks (nonexistent)

Examples. We currently only have a few of them. Expand and integrate them into the web page.

Write a Tutorial.

Write a FAQ.

Documentation and Examples for Support Vector Machines. What's on the web is totally outdated. Integrate the documentation from gumpy, see ticket:27 (assigned: Fabian Pedregosa)

Review documentation.

Customize the Sphinx-generated HTML.

Create some cool images/logos for the web page.

Create some benchmark plots.

Code sprint in Paris, 3 March 2010

Finished; see http://fseoane.net/blog/2010/scikitslearn-coding-spring-in-paris/

Participants

Alexandre Gramfort

Olivier Grisel

Vincent Michel

Fabian Pedregosa

Bertrand Thirion

Gaël Varoquaux

Goals

Implement a few targeted functionalities for penalized regressions.

Target functionalities

GLMnet

Bayesian Regression (Ridge, ARD)

Univariate feature selection function

Edouard: Most of the things we need are already in datamind; the main issue is to cut the dependency on FFF (nipy).

Extras, if time permits:

LARS

Proposed workflow

Pair programming:

GLMNet (AG, OG)

Bayesian regression (FP, VM)

Feature selection (BT, GV)

LARS: Whoever is finished first.

Place in the repository

I think GLMNet goes well in scikits.learn.glm.

Edouard: The term GLM is confusing. In GLMNet the G stands for "generalized", but in neuroimaging people read it as "general", which in fact denotes a plain linear model.

Bayesian regression: scikits.learn.bayes. It's short and explicit.

Edouard: Again the term Bayes might not lead to a clear organization of algorithms.

Feature selection: featsel? selection? I'm not sure about this one.

AG: maybe univ?

Edouard: Maybe it is too early to decide the structure of the repository during your coding sprint. I think this organization should follow the discussions we had with Fabian, Gael and Bertrand. Next, I tried to synthesize those discussions; however, it is just a proposal and many things are missing:

If there's code that we want to share and it does not fit into any of these schemes, it's ok to put it into sandbox/ (it does not yet exist)
