
R GSoC 2015: Project plan

Henrik Bengtsson edited this page Jul 19, 2015 · 17 revisions

Project plan

The project consists of two parts:

  • Part A: Subsetted computations
  • Part B: Multi-core processing

Milestones

Tools

  1. Milestone T1: All Milestones on GitHub.
  2. Milestone T2: Continuous Integration using Travis CI.
  3. Milestone T3: Code coverage using covr and Coveralls.

Part A: Subsetted computations

  1. Milestone A1: Requirement specifications (subsetted computations)
  2. Milestone A2: Tests and R mockup implementation (subsetted computations)
  3. Milestone A3: Benchmark reports (subsetted computations)
  4. Milestone A4: C implementation and help pages (subsetted computations)
  5. Milestone A5: Package release (subsetted computations)

Part B: Multi-core processing

  1. Milestone B1: Requirement specifications (multi-core processing)
  2. Milestone B2: Tests and R mockup implementation (multi-core processing)
  3. Milestone B3: Benchmark reports (multi-core processing)
  4. Milestone B4: C implementation and help pages (multi-core processing)
  5. Milestone B5: Package release (multi-core processing)

Timeline

  • 2015-05-25: GSoC: Development begins.
  • 2015-05-27: Milestone T1: Add all Milestones to GitHub.
  • 2015-05-28: Milestone T2: Continuous Integration using Travis CI.
  • 2015-05-29: Milestone T3: Code coverage using covr and Coveralls.

Part A: Subsetted computations

  • 2015-05-10: Milestone A1: Requirement specification (subsetted computations).
  • 2015-06-01: Milestone A2: Tests and R mockup implementation (subsetted computations).
  • 2015-06-05: Milestone A3: Benchmark reports (subsetted computations)
  • 2015-06-18: Milestone A4: C implementation and help pages (subsetted computations).
  • 2015-06-22 -- 2015-07-05: DJ has exams.
  • 2015-06-29: Midterm evaluation code freeze.
  • 2015-07-03: GSoC: Midterm evaluation deadline.
  • 2015-07-05: Milestone A5: Package release (subsetted computations).

Part B: Multi-core processing

  • 2015-07-06: Part B of project starts.
  • 2015-07-10: Milestone B1: Requirement specifications (multi-core processing).
  • 2015-07-17: Milestone B2: Tests and R mockup implementation (multi-core processing).
  • 2015-07-24: Milestone B3: Benchmark reports (multi-core processing).
  • 2015-08-14: Milestone B4: C implementation and help pages (multi-core processing).
  • 2015-08-17: A week of cleanup, tweaks, bug fixes and documentation.
  • 2015-08-24: Final evaluation code freeze.
  • 2015-08-28: GSoC: Final evaluation deadline.
  • 2015-08-30: Milestone B5: Package release (multi-core processing).

Footnote: Entries labelled 'GSoC' above are formal Google Summer of Code deadlines.

Deliverables

Each milestone has a corresponding set of deliverables.

Part T: Tools

Deliverable T1: All milestones on GitHub

  1. Set up GitHub milestones based on the milestones in Part A and Part B.
  2. Add @HenrikBengtsson and @hcorrada as GitHub collaborators (so they can edit milestones and labels).

Comment: This will make the project more efficient and easier to manage.

Deliverable T2: Continuous Integration using Travis CI

  1. Set up a personal Travis CI account and connect it to your personal GitHub account.
  2. Push to GitHub to build and check package on Travis CI.

Comment: See .travis.yml; it should work out of the box. Example output at https://travis-ci.org/HenrikBengtsson/matrixStats.

Comment: This will automate testing, saving lots of time.

Deliverable T3: Code coverage using covr and Coveralls

  1. Run code coverage analysis using covr via make covr.
  2. Set up a personal Coveralls account and connect it to your personal GitHub account.
  3. Push to GitHub to build and check the package on Travis CI and, finally, submit the coverage report to Coveralls. Example output at https://coveralls.io/r/HenrikBengtsson/matrixStats.

Comment: This will help us identify pieces of code that are not tested.

Part A: Subsetted computations

Please submit all code deliverables as pull-requests to 'feature/subsetting' branch of GitHub repository 'HenrikBengtsson/matrixStats'.

Deliverable A1: Requirement specification (subsetted computations)

  1. Google Document 'matrixStats: Subsetted Indexing'.

Comment: This will help identify what tests to write.

Deliverable A2: Tests and R mockup implementation (subsetted computations)

  1. Fully functional mockup R/*.R implementation of subsetting for all functions.
  2. Test scripts in tests/*_subset.R for above subsetting functions.
  3. All tests should pass. Failed cases should be commented out and explained in comments.
  4. Package should build and pass R CMD check --as-cran with all OK.

Comment: This will allow us to run tests as soon as possible. With CI (T2 + T3) we will have a continuous overview of what works, and whenever a bug is introduced it will be detected automatically and immediately.

Deliverable A3: Benchmark reports (subsetted computations)

  1. RSP benchmark reports (inst/benchmarking/*_subset.md.rsp) for all subsetted functions, e.g. fcn(x, rows=rows) versus "manual approach" fcn(x[rows,,drop=FALSE]).

Comment: This will help us quantify the improvement relative to the "manual" approach. At the beginning the mockup (A2) and the manual approach should give similar benchmark results, but with the native implementation (A4) we should see improvements.

Comment: The style of benchmark report already proposed is good enough. Later we might extend it with a wider range of subsetting, e.g. 5%, 25%, 50%, 75%, 95%.

Deliverable A4: C implementation and help pages (subsetted computations)

  1. Fully functional src/*.h and src/*.c implementation of subsetting for all functions.
  2. All tests should pass. Failed cases should be commented out and explained in comments.
  3. All functions should have subsetting documented.
  4. Package should build and pass R CMD check --as-cran with all OK.
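
Comment: To illustrate what the native code has to do, here is a minimal C sketch, not the actual matrixStats source. It assumes a plain column-major double array (the layout R uses for matrices) and 0-based row indices; the function names row_sums_subset and row_sums_manual are hypothetical. The first function indexes the selected rows directly; the second mimics the "manual approach" by copying the submatrix first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Sum each selected row of a column-major nrow x ncol matrix.
 * rows[] holds 0-based row indices; ans[] must have room for nrows values.
 * Hypothetical helper for illustration, not part of the matrixStats API. */
void row_sums_subset(const double *x, size_t nrow, size_t ncol,
                     const size_t *rows, size_t nrows, double *ans) {
  assert(x != NULL && rows != NULL && ans != NULL);
  for (size_t i = 0; i < nrows; i++) ans[i] = 0.0;
  for (size_t j = 0; j < ncol; j++) {
    const double *col = x + j * nrow;   /* start of column j */
    for (size_t i = 0; i < nrows; i++) {
      assert(rows[i] < nrow);           /* index must be in range */
      ans[i] += col[rows[i]];           /* read the selected row in place */
    }
  }
}

/* "Manual approach": copy the selected rows into a temporary matrix,
 * as x[rows,,drop=FALSE] would, then sum the rows of the copy.
 * Returns 0 on success and -1 if the temporary cannot be allocated. */
int row_sums_manual(const double *x, size_t nrow, size_t ncol,
                    const size_t *rows, size_t nrows, double *ans) {
  assert(x != NULL && rows != NULL && ans != NULL);
  for (size_t i = 0; i < nrows; i++) ans[i] = 0.0;
  if (nrows == 0 || ncol == 0) return 0;

  double *sub = malloc(nrows * ncol * sizeof *sub);
  if (sub == NULL) return -1;

  /* Step 1: materialize the nrows x ncol submatrix. */
  for (size_t j = 0; j < ncol; j++)
    for (size_t i = 0; i < nrows; i++)
      sub[j * nrows + i] = x[j * nrow + rows[i]];

  /* Step 2: ordinary row sums on the copy. */
  for (size_t j = 0; j < ncol; j++)
    for (size_t i = 0; i < nrows; i++)
      ans[i] += sub[j * nrows + i];

  free(sub);
  return 0;
}
```

Both functions return the same sums; the subsetted version avoids allocating and filling the temporary copy, which is the saving the benchmark reports (A3) are meant to quantify.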

Deliverable A5: Package release (subsetted computations)

HB will verify/do the following:

  1. Package should build and pass R CMD check --as-cran with all OK.
  2. Submit package to CRAN.

Comment: Releasing the package early has the advantage of getting community feedback sooner.

Part B: Multi-core processing

Please submit all code deliverables as pull-requests to 'feature/multicore' branch of GitHub repository 'HenrikBengtsson/matrixStats'.

Deliverable B1: Requirement specifications (multi-core processing)

  1. Google Document 'matrixStats: Multi-core processing'.

Deliverable B2: Tests and R mockup implementation (multi-core processing)

  1. Fully functional mockup R/*.R implementation of multicore processing for all functions.
  2. Test scripts in tests/*_parallel.R for above multicore processing functions.
  3. All tests should pass. Failed cases should be commented out and explained in comments.
  4. Package should build and pass R CMD check --as-cran with all OK.

Deliverable B3: Benchmark reports (multi-core processing)

  1. RSP benchmark reports (inst/benchmarking/*_parallel.md.rsp) for all functions (whole matrices only; no subsetting needed)

Deliverable B4: C implementation and help pages (multi-core processing)

  1. Fully functional src/*.h and src/*.c implementation of multicore processing for all (relevant) functions.
  2. All tests should pass. Failed cases should be commented out and explained in comments.
  3. All functions should have multicore processing documented.
  4. Package should build and pass R CMD check --as-cran with all OK.
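
Comment: A minimal C sketch of one way to spread the work over cores, not the actual matrixStats source. It assumes OpenMP as the threading mechanism, which this plan does not prescribe, and a plain column-major double array; the function names col_sums_parallel and col_sums_serial are hypothetical. If OpenMP is not enabled at compile time (e.g. -fopenmp for GCC), the pragma is ignored and the loop runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Column sums of a column-major nrow x ncol matrix, computed serially.
 * Serves as the reference result for the parallel version below. */
void col_sums_serial(const double *x, size_t nrow, size_t ncol, double *ans) {
  assert(x != NULL && ans != NULL);
  for (size_t j = 0; j < ncol; j++) {
    const double *col = x + j * nrow;
    double sum = 0.0;
    for (size_t i = 0; i < nrow; i++) sum += col[i];
    ans[j] = sum;
  }
}

/* Same computation with the column loop split across threads.
 * Columns are independent and each iteration writes a distinct ans[j],
 * so no locking or reduction clause is needed. */
void col_sums_parallel(const double *x, size_t nrow, size_t ncol, double *ans) {
  assert(x != NULL && ans != NULL);
  long j;  /* signed loop index, accepted by older OpenMP versions too */
  #pragma omp parallel for
  for (j = 0; j < (long) ncol; j++) {
    const double *col = x + (size_t) j * nrow;
    double sum = 0.0;                 /* private to each iteration */
    for (size_t i = 0; i < nrow; i++) sum += col[i];
    ans[j] = sum;
  }
}
```

Because each column is summed in the same order in both versions, the results should match exactly, which is the kind of check the tests/*_parallel.R scripts (B2) can make against the single-core code.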

Deliverable B5: Package release (multi-core processing)

HB will verify/do the following:

  1. Package should build and pass R CMD check --as-cran with all OK.
  2. Submit package to CRAN.