
Google Summer of Code 2019 Ideas

Shekhar Prasad Rajak edited this page Feb 27, 2019 · 23 revisions

Ideas for Google Summer of Code 2019.


Contact

Feel free to reach us on #sciruby at chat.freenode.net or via our mailing list.

IMPORTANT NOTICE: SciRuby encourages diversity. Scientific progress in general benefits from diversity and software development for science is no exception. We are really happy that the number of people from Asia, Africa and South America applying for GSoC projects is increasing. Our org admin this year is from India, our previous org admin was from Brazil. We have had students from Japan, India, Sri Lanka, Russia, etc. We have women software developers in our programme. We are happy to hear from you all!

Instructions for students

We strongly recommend that you pick one of the ideas listed below. We value contributions in advance of GSoC, even if they're just little ones. Go pick out something in one of our trackers and work on it, talk to folks on the listserv, and get an idea for what features are needed.

You don't need to know a lot about Ruby to work on a project: depending on how much you already know, it'll be pretty easy to learn enough to be able to contribute. However, you may need some familiarity with scientific computation. If you don't have any, take a look at "Numerical Recipes in C", which you'll probably find in your university's library.

In any case, if you feel your skills aren't enough for some project, please ask us on our IRC channel (see contact section above) or our Google Group (see sciruby.com to sign up) and we can help you.

See also:

Read this before you commit your first patches

SciRuby's main landing page on GitHub mostly holds the stable versions of SciRuby gems, but developers and contributors should work on the very latest (bleeding-edge) repositories so that changes can be committed without conflicts.

If you would like a brief introduction to finding the latest development gems on GitHub, read Finding The SciRuby Development Repositories on Github. Also go through the coding guidelines before sending your first patch.

How to submit a patch ("pull request")

Here's a great tutorial: http://www.thinkful.com/learn/github-pull-request-tutorial/

Have a look and feel free to ask if you have any questions.

Instructions for mentors

Guidelines for mentors to submit projects:

  • Specify the name of your project as a heading.
  • Write a paragraph or two with further details.
  • Write a small 'Skills' section detailing the skills that the student must possess to complete the project.
  • Write down your own GitHub handle and contact details in a 'Mentor Details' section so that the student can contact you.
  • If you want to co-mentor someone else's project, please add your details alongside the mentor's details.

Project Ideas

NMatrix projects

NMatrix is SciRuby's numerical matrix core, implementing dense matrices as well as two types of sparse matrices (linked-list-based and Yale/CSR). NMatrix is a fairly well-established project which has received Summer-of-Code-like grants from both Brighter Planet and the Ruby Association (in other words, from Matz, who created Ruby). Those who contribute to NMatrix will likely eventually become authors of a jointly published, peer-reviewed science article on the library. Additionally, NMatrix is a good place to gain practical C and C++ experience while also working to improve Ruby.

NMatrix currently relies on ATLAS/CBLAS/CLAPACK and standard LAPACK for several of its linear algebra operations. In some cases, native versions of the functions are implemented, so that the libraries are not required. There are quite a number of areas for growth in terms of the capabilities of NMatrix here.

Improving NMatrix

  • NMatrix reloaded is a reimplementation of NMatrix. It is faster than the existing NMatrix (see link).
  • The student needs to work on implementing multiple dtypes and stypes in Ruby. Implementing Yale notation is a priority.
  • Implement indexers for NMatrix.
  • Mentors: Prasun Anand (@prasunanand), Co-mentor: Shekhar (@Shekharrajak)
  • Recommended skills: Some C/C++ would be beneficial, as you'll need to be working under the hood on NMatrix.
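
To give a feel for what a Yale/CSR-style stype involves, here is a minimal sketch of compressed sparse row storage in plain Ruby. The class name `CsrSketch` and its methods are hypothetical and exist only for illustration; they are not NMatrix's API, and NMatrix's actual Yale storage differs in its details.

```ruby
# Hypothetical illustration of CSR storage; not part of NMatrix.
class CsrSketch
  attr_reader :values, :col_idx, :row_ptr, :shape

  # Build from a dense array of row arrays, storing only non-zero entries.
  def initialize(rows)
    @shape   = [rows.size, rows.empty? ? 0 : rows.first.size]
    @values  = []   # non-zero values, row by row
    @col_idx = []   # column index of each stored value
    @row_ptr = [0]  # row i occupies values[row_ptr[i]...row_ptr[i + 1]]
    rows.each do |row|
      row.each_with_index do |v, j|
        next if v == 0
        @values << v
        @col_idx << j
      end
      @row_ptr << @values.size
    end
  end

  # Element lookup: scan only the stored entries of row i.
  def [](i, j)
    (@row_ptr[i]...@row_ptr[i + 1]).each do |k|
      return @values[k] if @col_idx[k] == j
    end
    0
  end

  # Sparse matrix times dense vector, touching only stored entries.
  def dot(vec)
    Array.new(@shape[0]) do |i|
      (@row_ptr[i]...@row_ptr[i + 1]).sum { |k| @values[k] * vec[@col_idx[k]] }
    end
  end
end

m = CsrSketch.new([[1, 0, 2], [0, 0, 3], [4, 0, 0]])
puts m.dot([1, 1, 1]).inspect
```

Three flat arrays replace the dense grid, so memory and the cost of a matrix-vector product scale with the number of non-zero entries rather than with rows times columns.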

Making daru-view independent

Learn the basics of daru-view from sciruby/blog or daru-view/wiki.

Daru (Data Analysis in RUby) is a library for analysis, manipulation and visualization of data. daru-view is a plugin gem for daru that provides easy, interactive plotting in web applications and IRuby notebooks. It works in frameworks like Rails, Sinatra and Nanoc, and hopefully in others too.

Currently daru-view depends on lazy_high_charts and googlevisualr, over which SciRuby has no control. With them we have mainly solved the following problems:

  • a plotting gem that is compatible with daru dataframes and vectors.
  • a gem that works smoothly in any Ruby web application framework, in IRuby notebooks and in the terminal.

Now it is time for daru-view to become independent, because:

  • we don't have much control over these gems, and we will keep adding new features directly from the official HighCharts and Google Charts sites.

  • we have extended (overloaded and overridden) most of the methods from lazy_high_charts and googlevisualr, either to make them compatible with IRuby notebooks and all Ruby frameworks or to add chart features already present in HighCharts and Google Charts.

  • daru-view should be able to handle future chart types as well, with little or no modification to the codebase.

You can find more details in the wiki page 'Making daru-view independent'. Along with this, we also want to consider the new ideas written in the Idea wiki page.

Related links

About project

  • Skills: Basic knowledge of Ruby, design patterns and design principles, JavaScript and Ruby web application frameworks.
  • Mentors: Shekhar (@Shekharrajak), Sameer (@v0dro), Athitya Kumar (@athityakumar)
  • Difficulty: Moderate.

Rubyplot projects

Visualization is one of the most important parts of any non-trivial scientific stack, and Ruby seriously lacks a comprehensive plotting solution. Rubyplot aims to fill that gap.

Rubyplot aims to be the best visualization framework in Ruby for plotting anything, anywhere.

It started off as a GSoC 2018 project and has since undergone a complete rewrite. Visualization is a very important focus area for the Ruby community at this point in time, and students with successful proposals can expect their work to be accepted at major Ruby conferences worldwide (which usually includes free travel ;) ).

Working on this project will mean working closely with both the international and Japanese Ruby communities and identifying use cases, usage patterns and how best to reflect them in your work. Expect lots of collaboration and networking from this project.

Read the README and CONTRIBUTING guides in the rubyplot repo to get a brief overview of the current state of the library. Some of the projects listed might not amount to 3 full months of work, so you should think about combining two or more projects.

The following are the GSoC 2019 projects for rubyplot:

Merge Scatter and Line plots into a single 'plot' interface

In the current state, Line and Scatter plots exist as two different kinds of plots. However, the crux of plotting both is exactly the same; the only difference is that in one kind of plot the co-ordinates are connected by straight lines, while in the other the co-ordinates are simply 'decorated' with markers. Thus there can be many combinations of these plots, which can be brought together under a single 'plot' interface. This task will involve the following:

  • Write a 'plot' function for Rubyplot::Artist::Axes that is similar in function to matplotlib's plot function.
  • Support all the properties of the plot function as per matplotlib using a Ruby-like interface.

This project will greatly enhance the plotting interface of Rubyplot and pave the way for much greater expansion.
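
As a rough illustration of the idea, the sketch below shows how a single `plot` call could cover line, scatter and combined plots by treating the marker and the line style as independent options. `AxesSketch`, `Series` and the `marker:` and `line_style:` keywords are hypothetical names chosen for this example; they are not Rubyplot's existing API, and nothing is rendered here.

```ruby
# Hypothetical sketch of a unified plot interface; not Rubyplot code.
Series = Struct.new(:x, :y, :marker, :line_style, keyword_init: true) do
  # Markers only: what is currently a Scatter plot.
  def scatter_only?
    !marker.nil? && line_style.nil?
  end

  # Connected lines only: what is currently a Line plot.
  def line_only?
    marker.nil? && !line_style.nil?
  end
end

class AxesSketch
  attr_reader :series

  def initialize
    @series = []
  end

  # One entry point: the caller picks a marker, a line style, or both.
  # With neither given, a solid line is assumed (an arbitrary default).
  def plot(x, y, marker: nil, line_style: nil)
    raise ArgumentError, "x and y must have the same length" unless x.size == y.size

    line_style = :solid if marker.nil? && line_style.nil?
    @series << Series.new(x: x, y: y, marker: marker, line_style: line_style)
    self
  end
end

axes = AxesSketch.new
axes.plot([1, 2, 3], [2, 4, 6])                                       # line
axes.plot([1, 2, 3], [1, 3, 2], marker: :circle)                      # scatter
axes.plot([1, 2, 3], [3, 1, 2], marker: :square, line_style: :dashed) # both
```

Because each series just records its own marker and line style, a renderer would only need one drawing path: draw the connecting line if a style is set, then draw markers if a marker is set.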

Add support for subplots and various other kinds of plots

Currently the support for various types of plots is very limited. This project will involve two things:

  • Adding support for multiple Axes in the same Figure.
  • Adding support for various kinds of plots as listed here.

The above list is not exhaustive. The student should feel free to propose their own plots.

  • Difficulty: Moderate
  • Skills: Ruby, geometry, familiarity with visualization.
  • Mentors: Sameer Deshmukh (@v0dro), Co-mentor: Shekhar (@Shekharrajak)

Speed up Daru

daru (Data Analysis in RUby) is a library for storage, analysis, manipulation and visualization of data in Ruby. It has various features, such as:

  • Flexible and intuitive API for manipulation and analysis of data.
  • Easy plotting, statistics and arithmetic.
  • Easy splitting, aggregation and grouping of data.
  • Quick data reduction with pivot tables for data summaries, and so on.

You can find most of the examples here.

While it has many methods for data wrangling, it is slow for a lot of use cases (check out these benchmarks). This task will involve identifying the slow areas of daru and porting them either to Rubex, a language for writing C extensions for Ruby, or to a plain Ruby C extension.

The student needs to benchmark various daru methods and show that porting them to Rubex will significantly improve performance.
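
A benchmark for a proposal could start from something like the sketch below, which uses Ruby's standard `benchmark` library to compare two implementations of the same operation. To stay self-contained it times plain-Ruby stand-ins rather than real daru methods; `group_sum_naive` and `group_sum_hash` are hypothetical helpers, and the data sizes are arbitrary.

```ruby
require "benchmark"

# Hypothetical stand-in for a slow grouping routine:
# rescans the whole key array once per distinct key.
def group_sum_naive(keys, values)
  keys.uniq.map do |k|
    [k, keys.each_index.select { |i| keys[i] == k }.sum { |i| values[i] }]
  end.to_h
end

# Candidate replacement: a single pass with a hash accumulator.
def group_sum_hash(keys, values)
  sums = Hash.new(0)
  keys.each_with_index { |k, i| sums[k] += values[i] }
  sums
end

keys   = Array.new(20_000) { |i| i % 50 }
values = Array.new(20_000) { |i| i }

# Always check that both versions agree before comparing their speed.
raise "implementations disagree" unless group_sum_naive(keys, values) == group_sum_hash(keys, values)

Benchmark.bm(12) do |bm|
  bm.report("naive")       { group_sum_naive(keys, values) }
  bm.report("single pass") { group_sum_hash(keys, values) }
end
```

The same pattern applies when the second implementation is a Rubex or C extension: keep the input data fixed, verify that the results match, and report timings over realistic data sizes.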

Why this project is important:

  • SciRuby is planning a powerful and fast machine learning gem that will be completely compatible with the daru and nmatrix gems. So we have to make daru faster and more powerful accordingly. We need to find a solution that uses nmatrix as well.

Other tasks

  • Better error handling. Refer to #479
  • Follow-up of GSoC'17: remove obsolete parts from main gem #405

Related links

More about daru

Skills: Experience in data analysis | Experience in Ruby and C | General understanding of how compilers work | Understanding of good benchmarking practices

Difficulty: Advanced

Mentor: @v0dro, Shekhar (@Shekharrajak), Co-mentor: Athitya Kumar (@athityakumar)


Binding SciRuby against HPC D libraries for artificial intelligence, linear algebra etc.

Usually C extensions are written for speed. D is a language with C-like syntax that compiles to similar runtime speeds. D is safer than C and provides high-level OOP and FP support to boot. Here we propose to bind Ruby against D extensions. For this we will take some existing D projects, such as MIR, a high-performance math library, and make good use of them in Ruby. For Python, automatic wrappers exist, and we may need to replicate those for Ruby.

The student will bind a number of pre-agreed functionalities, optimize them, document them and provide a path for similar exercises that can be done by others. Software deployment of mixed languages often proves difficult. For software deployment we will use Bioconda and/or GNU Guix to make sure others can use the setup.

Skills: Interest in multi-languages, high performance computing, C, D etc.

Difficulty: Advanced (indeed)

Mentor: @pjotrp, @george-githinji, members of @biod


Submitting Your Own Idea

If you have a completely different idea in mind, first start a discussion thread about it on the mailing list. The SciRuby team will look into it, and the idea may be refined during the discussion until it is ready to be selected for the GSoC period.

The best project for you is one you are interested in and are knowledgeable about. That way, you will be the most successful and productive in your project and have the most fun doing it, while we will be the most confident in your commitment and your ability to complete it.

Please use the idea template below to describe your idea:

Title

Idea

(project idea, how it will help Ruby community and future of the project)

Current status of the idea

(Describe the work that has been done and timeline)

Involved Software and technology

Difficulty

(Advanced, Intermediate, or Beginner and any specific comments on the difficulty)

Skills and Knowledge required

(Any prerequisite knowledge or approach needed)
