Frédéric Bastien edited this page Apr 9, 2015 · 30 revisions

This is a list of ideas for the Google Summer of Code 2015. You can propose other good ideas. In all cases, discuss them on the theano-dev mailing list to make yourself known to our community and to understand them better. This is important for your application; it should demonstrate that you understand what needs to be done.

For more information on how to apply, you should read this Python SoC2015 page because we participate via the Python Software Foundation. They require that each candidate student have one PR merged. You can look at the tickets marked as easy fix.

The current mentors are Frédéric Bastien, Pierre Luc Carrier and Arnaud Bergeron.

Theano Organization

Theano is a software library written in Python, C and CUDA (we also have a start of an OpenCL back-end). It is a domain specific compiler. This means that a user writes Python code to build a computation graph and asks Theano to compile it. Theano optimizes the graph and generates mainly C and/or CUDA code to perform the computation efficiently. It mainly supports computation on numpy.ndarray, but Theano also supports sparse matrices.

Theano is mostly developed by the LISA lab, which works in machine learning, but Theano isn't restricted to machine learning; its capacity for optimizing computation makes it useful for many applications that rely on large amounts of numerical computation. There are also many Theano contributors outside of LISA.

As you probably know, deep learning is changing the world and Theano is one of the main libraries that support this field!

Contacting us

The main communication channels are our mailing lists: theano-dev and theano-user. For GSoC related questions or discussion of projects, please use the theano-dev mailing list.

We mainly reside in the Eastern Standard Time zone so you will usually receive replies faster during our work day. Some of us, however, frequently work outside normal work hours or reside in other time zones.

For students in GSoC, we prefer that discussions stay public as much as possible. The mailing list or GitHub is great for that. But for more interactive discussion, other means can be used. Last year, we used g-chat. This needs to be discussed with mentors.

Highlighted ideas:

  • Add more linear algebra operations here, here and here

    • Difficulty: easy
    • Skill needed: Python
    • Problem: There are still many operations in numpy.* that we do not have under theano.tensor.*. We get requests for some of them from time to time. We should provide those operations. We should also implement the infer_shape and grad methods when possible.
    • Mentors: Frédéric, Arnaud and Pierre Luc.
  • Lower Theano function call overhead

    • Difficulty: easy
    • Skill needed: Python and C code.
    • Problem: Each call to a function compiled by Theano has some overhead. If the graph does not contain much computation (for example, if it works on scalars), this overhead can be significant.
    • Create a Filter Op and reuse it to move the logic that validates/converts the inputs into the Theano graph instead of the wrapping Python code.
      • Create C code for this Op to remove Python overhead.
    • Split the Python calling code into 2 layers: one that takes a fixed number of inputs, without keyword arguments or default values, and one that handles those.
    • Move elemwise computation on numpy 0d arrays to Theano scalar ops (which represent the C type, so no object allocation).
    • Disable the garbage collector for numpy 0d arrays?
    • Mentors: Frédéric, Arnaud
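The 2-layer split described in this idea can be sketched in plain Python. This is only an illustration of the pattern, not Theano's calling code: `make_layers`, `fast_call` and `full_call` are hypothetical names, and `compute` stands in for the compiled function.

```python
def make_layers(compute, input_names, defaults=None):
    """Build a fast and a full calling layer around `compute` (hypothetical helper)."""
    defaults = dict(defaults or {})
    n_inputs = len(input_names)

    def fast_call(*args):
        # Fast layer: fixed number of positional inputs, no keywords, no defaults.
        if len(args) != n_inputs:
            raise TypeError("expected %d inputs, got %d" % (n_inputs, len(args)))
        return compute(*args)

    def full_call(*args, **kwargs):
        # Full layer: resolve keyword arguments and default values,
        # then delegate to the fast layer.
        if len(args) > n_inputs:
            raise TypeError("too many positional inputs")
        values = list(args)
        for name in input_names[len(args):]:
            if name in kwargs:
                values.append(kwargs.pop(name))
            elif name in defaults:
                values.append(defaults[name])
            else:
                raise TypeError("missing input %r" % name)
        if kwargs:
            raise TypeError("unexpected or duplicate inputs: %s" % sorted(kwargs))
        return fast_call(*values)

    return fast_call, full_call


# Usage: a caller in a tight loop can use `fast` and skip the keyword handling.
fast, full = make_layers(lambda x, y: x * y + 1, ["x", "y"], {"y": 2})
```

With this split, only callers that need keywords or defaults pay for the argument resolution in the full layer.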
  • Interactive visualization of computational graphs

    • Difficulty: Easy/medium
    • Skill needed: Python. Knowledge of Python libraries for graph visualization and/or graphical displays is a plus.
    • Problem: Theano provides a feature to output a visual representation of a computation graph to an image file. This is very useful for debugging and it works well for small and simple graphs. However, for large and more complicated graphs, the image becomes very large, dense, and hard to navigate which makes it harder to analyse such graphs.
    • To allow debugging of large graphs, the interactive tool would need to allow a user to reposition nodes in the display, offer the option to display or hide functions inside of Scan nodes in the graph, and allow the user to highlight nodes based on certain criteria.
    • For consistency, it should attempt to use the same color scheme for different types of nodes and edges as the current visualization tool.
    • Mentors: Frédéric and Pierre Luc
  • Faster optimization phase during compilation

    • Difficulty: Easy/medium
    • Skill needed: only Python code. Understanding of algorithm complexity (O(1) vs O(n)) useful. Some experience with Theano useful to start faster.
    • Problem: The optimization phase of Theano compilation is slow for big graphs. This makes Theano hard to use with big graphs, especially while users are developing their models.
    • (ticket) Ask users on the mailing list for slow cases. Then profile them and find/fix the bottlenecks.
    • This can work because all the problems found so far in slow cases are due to how the optimizations are implemented, their order, the algorithm, or how they are applied. Only real use cases can reveal the real bottlenecks.
    • (ticket) Another complementary approach is to finish OpFromGraph. This would allow splitting the optimization into separate sections and so work around the super-linear growth of optimization time with graph size.
      • For this, we need to make it work on GPU, make sure it works with C code and document it. Make sure infer_shape and grad work with it. Ideally, make Rop work too.
    • Mentors: Frédéric and Arnaud
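As a rough illustration of the "profile them and find/fix the bottlenecks" step, the sketch below uses the standard library `cProfile` module on a stand-in function. `slow_rewrite` and `profile_call` are hypothetical; a real investigation would profile the compilation of a user-provided slow case instead.

```python
import cProfile
import io
import pstats


def slow_rewrite(n):
    # Stand-in for an optimization pass: the membership test on a list is O(n),
    # so the whole loop is O(n^2). Using a set would make each test O(1).
    seen = []
    for i in range(n):
        if i not in seen:
            seen.append(i)
    return len(seen)


def profile_call(func, *args):
    """Run `func` under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # show the 5 costliest entries
    return result, stream.getvalue()


result, report = profile_call(slow_rewrite, 2000)
```

The report lists the most expensive calls first, which is usually enough to spot an O(n) operation hidden inside a loop.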
  • Faster linker phase during compilation

    • Difficulty: medium
    • Skill needed: Python and C code.
    • Problem: The first time we compile a Theano function, we compile many C shared libraries. This is time consuming. As we cache them, it is less of a problem for later calls, but as it can still take ~1h in some cases, an improvement there would be very useful.
    • The only way is to compile fewer shared modules (currently we compile about 5k Python modules for the Theano tests).
      • Check the content of the Theano cache for the Theano tests and slow cases. Then find ways to combine many cases together. For example, the indexing operations could be more generic without losing execution speed, which would generate fewer distinct cases.
      • (Started) Make the elemwise C code generation produce code for many dtypes at the same time. Elemwise is the op with the most generated C code.
    • Mentors: Frédéric and Pierre Luc
  • Fewer Theano function compilations by reusing compiled functions

    • Difficulty: medium
    • Skill needed: Python.
    • Problem: Compilation takes time, and we frequently want to compile the same function with only the shared variables changed, which makes compilation time matter even more. For example, we want to test a model on a validation dataset and a test dataset, where the datasets are stored in shared variables. We currently need to compile 2 functions. We also need to compile separate train and test functions even though, most of the time, the test function is just a subset of the train function. Finally, with multi-threading/multi-processing in Python, each Theano function must be used by only 1 thread/process at a time, so parallelism requires compiling the same function multiple times.
    • First step: allow regenerating a function from a compiled one.
    • Then allow swapping shared variables for the new function.
    • Then allow removing outputs/updates at the same time (just remove computation from the optimized graph).
    • Mentors: Frédéric, Arnaud and Pierre Luc.
  • Lower peak memory usage

    • Difficulty: medium
    • Skill needed: Python, algorithmic understanding (O(1) vs O(n))
    • Problem: We currently compare our actual peak memory usage against the minimum theoretical peak. But computing the minimum theoretical peak is too slow for many basic cases, so it should be sped up. We have seen cases where we use more than the minimum theoretical peak, so it would be useful to know whether this also happens in more typical cases.
      • As that algorithm is too slow, we can try approximations during profiling: random search, and a fast algorithm that gives the right result when the graph is a tree (this would be an approximation, as Theano graphs are DAGs).
      • After comparing those 2 algorithms during profiling, make Theano functions use them. This will require that the user pass the expected shapes of the inputs.
    • Mentors: Frédéric, Pierre Luc and Arnaud
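To make the notion of "minimum theoretical peak" concrete, here is a toy model in plain Python. It is not Theano's implementation. It assumes each node allocates one output of a known size and that an output is freed as soon as all of its consumers have run; it ignores in-place operations and views. The brute-force search tries every execution order, which shows why the exact computation quickly becomes too slow. All names are hypothetical.

```python
from itertools import permutations


def peak_memory(order, inputs, size):
    """Peak live memory when the nodes are executed in `order`."""
    # Number of consumers of each node that have not run yet.
    remaining = {node: 0 for node in inputs}
    for deps in inputs.values():
        for dep in deps:
            remaining[dep] += 1
    live = 0
    peak = 0
    for node in order:
        live += size[node]  # allocate the output of this node
        peak = max(peak, live)
        for dep in inputs[node]:
            remaining[dep] -= 1
            if remaining[dep] == 0:
                live -= size[dep]  # last consumer ran: free that output
    return peak


def is_topological(order, inputs):
    """True if every node appears after all of its dependencies."""
    seen = set()
    for node in order:
        if any(dep not in seen for dep in inputs[node]):
            return False
        seen.add(node)
    return True


def min_peak_bruteforce(inputs, size):
    """Exact minimum peak over all valid orders (factorial time)."""
    return min(
        peak_memory(order, inputs, size)
        for order in permutations(inputs)
        if is_topological(order, inputs)
    )


# Two independent branches (a -> b and c -> d) joined by e.
inputs = {"a": [], "b": ["a"], "c": [], "d": ["c"], "e": ["b", "d"]}
size = {"a": 10, "b": 1, "c": 10, "d": 1, "e": 1}

interleaved_peak = peak_memory(["a", "c", "b", "d", "e"], inputs, size)
best_peak = min_peak_bruteforce(inputs, size)
```

In this example, finishing one branch before starting the other keeps only one large buffer alive at a time, so the best order has a much lower peak than the interleaved one.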
  • Bridge Theano with other compilers and libraries (Numba, Cython, Parakeet, ...)

    • Difficulty: medium
    • Skill needed: Python mostly, but knowing C would help some of them.
    • Problem: There are many other systems that have very optimized code for some cases, or that allow generating code faster than Python code more easily than by writing C code (like Numba, Cython). Making it easier to use them with Theano would be very valuable.
    • Update the compilation system to compile other libraries more easily by reusing the Theano compilation cache mechanisms.
    • Make an easy to use interface to use Cython with Theano. We currently do it manually for the Scan op.
    • Make an easy to use interface to reuse Numba (we provide just an example for now)
    • Make Theano use the C interface of a Numba function
    • Mentors: Frédéric, Arnaud and Pierre Luc.
  • Generate a shared library (a proof of concept is available as a starting point)

    • Difficulty: medium
    • Skill needed: Python and C.
    • Problem: It would be very useful to generate a shared library from a Theano function. This would make it easier to reuse it in other programs and on embedded systems.
    • Bring the prototype to a working version without adding new features.
    • Document it.
    • Add support for scalar constant value in the graph.
    • Make a configuration option to enable/disable GC of intermediate results.
    • Make an interface to support shared variables.
    • (If time permits) To make it work on Windows, we need to back-port some C code that uses C99 features.
    • Mentors: Frédéric Bastien, Arnaud and Pierre Luc
  • An example for Android

    • Difficulty: medium
    • Skill needed: Python, C. Knowing Android would help.
    • 2 possible cases:
      • Full Theano with dynamic compilation
      • Only the shared library (DLL) from the idea above. This may need only the first part of that idea.
    • Mentors: Frédéric and Pierre Luc
  • OpenCL

    • Difficulty: medium.
    • Skill needed: Python and C. Understanding parallel computation a must. Knowing CUDA and/or OpenCL a plus.
    • Continue ongoing work in development branch to build OpenCL support
    • Port current CUDA implementations to OpenCL
    • Add OpenCL implementations for unsupported expression types
    • Tune existing OpenCL kernels for various operations
    • Mentors: Frédéric and Arnaud
  • Partial evaluation of a Theano function

    • Difficulty: medium/hard
    • Skill needed: Python and C code.
    • This would help to make compilation faster; in particular, it would allow compiling the eval function on the train set much faster (only 1 compilation + 1 recompilation/partial evaluation).
    • This would allow unrolling a graph and computing only part of the unrolled graph (partial execution).
    • Mentors: Frédéric and Arnaud
  • Improve pickling of Theano objects

    • Difficulty: very hard.
    • Skill needed: Python.
    • Theano Shared variable pickling with and without GPU.
    • Cache the compilation step in the compiledir (started, but needs to be finished gh-)
    • Mentors: Frédéric and Arnaud

Other ideas not sorted and not developed:

  • C code for IfElse (lazy evaluation), which can be inplace on two inputs
  • Faster optimization phase (use a SAT Solver?)
  • Allow memory profiling in the CVM (it currently uses the VM)
  • Re-write DebugMode to reuse the CVM and simplify it
  • Less opaque theano.function()
  • Track user usage of Theano with their permission
    • This would also allow finding bugs that affected you in the past.