Past sprints

= Paris coding Sprint, 8-9 Sept. 2010 =

Place:

INRIA research center in Saclay-Ile de France, also in channel #scikit-learn, on irc.freenode.org. Room to be determined.

Some ideas:

extend the tutorial with feature selection, cross-validation, etc.

design a sphinx template for the main web page. [here http://www.flickr.com/photos/fseoane/4573612893/] is a tentative design, but it has not been translated into a sphinx template.

Group lasso with coordinate descent in GLM module

Covariance estimators (Ledoit-Wolf) -> Regularized LDA

Add transform in LDA

PCA with fit + transform

preprocessing routines (center, standardize) with fit transform

K-means with Pybrain heuristic

Make Pipeline object work for real

FastICA

Anything you can think of, such as:

Spectral Clustering + manifold learning (MDS/PCA, Isomap, Diffusion maps, tSNE)

Canonical Correlation Analysis

Kernel PCA

Gaussian Process regression
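Several of the ideas above (PCA, the preprocessing routines, transform in LDA) ask for a fit + transform interface. As a rough illustration of that convention, here is a minimal sketch of a centering and standardizing step. The class name `Standardizer` and the attributes `mean_` and `scale_` are hypothetical, chosen only for this example; it uses plain Python lists to stay self-contained, whereas real code in the package would presumably work on NumPy arrays.

```python
from math import sqrt


class Standardizer:
    """Hypothetical preprocessing step: center and scale each column."""

    def fit(self, X):
        # Learn per-column statistics from the training data only.
        n = len(X)
        columns = list(zip(*X))
        self.mean_ = [sum(col) / n for col in columns]
        self.scale_ = []
        for col, m in zip(columns, self.mean_):
            # Population standard deviation; constant columns get scale 1.0
            # so that transform never divides by zero.
            s = sqrt(sum((v - m) ** 2 for v in col) / n)
            self.scale_.append(s if s > 0 else 1.0)
        return self

    def transform(self, X):
        # Apply the statistics learned in fit to any data, seen or unseen.
        return [
            [(v - m) / s for v, m, s in zip(row, self.mean_, self.scale_)]
            for row in X
        ]


X_train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
X_new = [[2.0, 40.0]]

scaler = Standardizer().fit(X_train)
X_train_std = scaler.transform(X_train)
X_new_std = scaler.transform(X_new)
```

Keeping fit separate from transform means the statistics come from the training data once and are then reused unchanged on new data, which is also what a Pipeline object needs in order to chain steps.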

= 0.4 Coding Sprint, 16 & 17 June 2010 =

Place:

channel #scikit-learn, on irc.freenode.org. If you do not have an IRC client or are behind a firewall, check out http://webchat.freenode.net/

Some ideas:

adapt the plotting features from the em module into gmm module.

incorporate more datasets: the diabetes dataset from the lars R package, featured datasets from http://archive.ics.uci.edu/ml/datasets.html , etc.

anything from the issue tracker.

extend the tutorial with feature selection, cross-validation, etc.

profile and improve the performance of the gmm module.

submit a new classifier

refactor the ann module (artificial neural networks) to conform to the API in the rest of the modules, or submit a new ann module.

make it compatible with python3 (shouldn't be hard now that there's a numpy python3 release)

design a sphinx template for the main web page. [here http://www.flickr.com/photos/fseoane/4573612893/] is a tentative design, but it has not been translated into a sphinx template.

anything you can think of.

= Documentation Week, 14-18 March 2010 =

Place:

channel #learn, on irc.freenode.org. If you do not have an IRC client or are behind a firewall, check out http://webchat.freenode.net/

Possible Tasks:

Document our design choices (methods in each class, convention for estimated parameters, etc.). Most of this is in ApiDiscussion.

Documentation for neural networks (nonexistent)

Examples. We currently only have a few of them. Expand and integrate them into the web page.

Write a Tutorial.

Write a FAQ.

Documentation and Examples for Support Vector Machines. What's on the web is totally outdated. Integrate the documentation from gumpy, see ticket:27 (assigned: Fabian Pedregosa)

Review documentation.

Customize the sphinx generated html.

Create some cool images/logos for the web page.

Create some benchmark plots.

= Code sprint in Paris, 3 March 2010 =

Terminated, see http://fseoane.net/blog/2010/scikitslearn-coding-spring-in-paris/

== Participants ==

Alexandre Gramfort

Olivier Grisel

Vincent Michel

Fabian Pedregosa

Bertrand Thirion

Gaël Varoquaux

== Goals ==

Implement a few targeted functionalities for penalized regressions.

== Target functionalities ==

GLMnet

Bayesian Regression (Ridge, ARD)

Univariate feature selection function

Edouard: Most of the things we need are already in datamind; the main issue is to cut the dependence on FFF (nipy).

Extras, if time permits:

LARS

== Proposed workflow ==

Pair programming:

GLMNet (AG, OG)

Bayesian regression (FP, VM)

Feature selection (BT, GV)

LARS: Whoever is finished first.

== Place in the repository ==

I think GLMNet goes well in scikits.learn.glm.

Edouard: The GLM term is confusing. In GLMNet the G means "generalized", but in neuroimaging people read it as "general", which is in fact a linear model.

Bayesian regression: scikits.learn.bayes. It's short and explicit.

Edouard: Again the term Bayes might not lead to a clear organization of algorithms.

Feature selection: featsel? selection? I'm not sure about this one.

AG : maybe univ?

Edouard: Maybe it is too early to decide the structure of the repository during your coding sprint. I think this organization should follow the discussions we had with Fabian, Gael and Bertrand. Below I have tried to synthesize those discussions; however, it's just a proposal and many things are missing:

If there's code that we want to share and it does not fit into any of these schemes, it's ok to put it into sandbox/ (it does not yet exist)
