Vijay Barve edited this page Feb 5, 2019 · 3 revisions

Application

Why does your org want to participate in Google Summer of Code?

R is a large and complex software ecosystem involving a base system, several thousand add-on packages and a number of tools and information channels, mostly web-based. We expect to develop some R packages and enhance R’s web presence, as we have done in previous years with GSOC.

How many potential mentors have agreed to mentor this year?

20+

How will you keep mentors engaged with their students?

The number of mentors will probably be the same as in previous years, between 20 and 40 depending on how many project proposals are submitted.

We expect that mentors will be self-motivated to stay engaged with their students throughout GSOC. The main reason is that mentors volunteer their time to create project proposals on our wiki page, and they are usually looking for students to help write code, tests, and documentation for new or existing packages. One example from 2018 is Alex Drouin, who volunteered to mentor the successful Max Margin Interval Tree project, which implemented a machine learning model proposed in 2017 and originally coded in Python.

How will you help your students stay on schedule to complete their projects?

We require that students provide a detailed timeline in their project proposals. Furthermore, we suggest weekly calls between mentors and students, so that students can ask for and get help with their projects.

How will you get your students involved in your community during GSoC?

Often our GSOC students are already involved via R User Groups, college or university courses that involve R, and the UseR! conferences. We will recommend that new students blog about their project on R-bloggers, and get involved with some of R’s many mailing lists.

How will you keep students involved with your community after GSoC?

R has many packages, and volunteer developers move among these from time to time. We would be happy to have students stay with the overall R family rather than insist they stick with the particular package that they develop for GSOC.

In the past, we have had many GSOC students stay involved in the R community. For example, several former GSOC students (e.g. Ian Fellows, Susan VanderPlas, Carson Sievert) have returned in subsequent years to become GSOC mentors. Also, Yixuan Qiu was a student in GSOC2011 and has set up an R user group at his home institution in Beijing. Another example is Qin Wenfeng, who coded re2r in GSOC2016 and is now the primary maintainer of https://rweekly.org/. Another example is Marlin Na, who coded the TnT interactive genome browser R package in GSOC2017 and proposed a new project about symbolic computation for GSOC2018.

Has your org been accepted as a mentoring org in Google Summer of Code before?

Yes

Which years did you participate?

2008-2018

For each year you have participated, provide counts of successful and total students.

Historically, we have had very few failures. For example, in 2011 we failed 1/14 students and in 2012 we failed 1/16 students. However, since 2013 we have instituted a policy of at least two mentors per student, and we have seen the failure rate drop to zero, even though there are more students than ever (24 in 2015, 22 in 2016). In 2017 and 2018 we had to fail students who decided to take full-time jobs or summer course loads instead of GSOC.

What year was your project started?

1993

Where does your source code live?

Most packages are developed on GitHub (under their authors’ accounts) and then submitted to CRAN (https://cran.r-project.org/) for releases and official distribution. The base R source code is also distributed on CRAN.

Anything else we should know?

R is an official part of the Free Software Foundation’s GNU project, and the R Foundation is a not-for-profit organization working in the public interest. It was founded by the members of the R Development Core Team in order to

  1. Provide support for the R project and other innovations in statistical computing.
  2. Provide a reference point for interacting with the R development community.
  3. Hold and administer the copyright of R software and documentation.

Profile

URL

https://www.r-project.org/

Tagline

R is a free software environment for statistical computing and graphics

Logo

https://www.r-project.org/logo/Rlogo.png

Primary open-source license

GPL-2

Org Category

Programming languages and development tools

Technology tags

r-project, c, c++, fortran, javascript

Topic tags

data science, visualization, statistics, graphics, machine learning

Ideas list

https://github.com/rstats-gsoc/gsoc2019/wiki/table-of-proposed-coding-projects

Short description

R provides a wide variety of statistical and graphical techniques, and is highly extensible. R is often the tool of choice for research in statistical methodology.

Long description

R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes

  • an effective data handling and storage facility,
  • a suite of operators for calculations on arrays, in particular matrices,
  • a large, coherent, integrated collection of intermediate tools for data analysis,
  • graphical facilities for data analysis and display either on-screen or on hardcopy, and
  • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.

The term “environment” is intended to characterize it as a fully planned and coherent system, rather than an incremental accretion of very specific and inflexible tools, as is frequently the case with other data analysis software.

R, like S, is designed around a true computer language, and it allows users to add additional functionality by defining new functions. Much of the system is itself written in the R dialect of S, which makes it easy for users to follow the algorithmic choices made. For computationally-intensive tasks, C, C++ and Fortran code can be linked and called at run time. Advanced users can write C code to manipulate R objects directly.

Many users think of R as a statistics system. We prefer to think of it as an environment within which statistical techniques are implemented. R can be extended (easily) via packages. There are about eight packages supplied with the R distribution, and many more are available through the CRAN family of Internet sites, covering a very wide range of modern statistics.

R has its own LaTeX-like documentation format, which is used to supply comprehensive documentation, both on-line in a number of formats and in hardcopy.

Application instructions

  1. Look for a project that needs a student on https://github.com/rstats-gsoc/gsoc2019/wiki/table-of-proposed-coding-projects
  2. Each project should have “tests” that students can complete to demonstrate relevant skills. After completing at least one test, please post your test results to a GitHub repo, and add a link to your test results on the wiki.
  3. Send an email to the mentors of the project. Include a link to your test results, and explain why you are interested in the project.
  4. If the mentors judge that you are capable of completing the project, then they will respond and help you to write a proposal to submit to Google. It should include most of the details from the project proposal wiki page, and additionally a detailed timeline that explains your plan for writing code, documentation, and tests.
  5. Once your mentors have proofread your proposal, submit it to Google at https://summerofcode.withgoogle.com/

Proposal tags

new package, existing package, visualization, machine learning, data cleaning, statistics, finance, optimization, reproducible research, bioinformatics.

Mailing list

https://github.com/rstats-gsoc/gsoc2019/wiki

General email

gsoc-r@googlegroups.com

Blog URL

http://www.r-bloggers.com/
