Google Summer of Code: 2020
For a general overview of Google Summer of Code (GSoC) and Boost's related activities, please read GSoC Overview
This year's GSoC will be housed at https://github.com/BoostGSoC20. Students may find examining past GSoC source code and commit histories of use.
To any potential mentors, please add your projects here
To any potential students, in addition to the list of project ideas, we encourage original ideas proposed by students. If you have got one, please post your idea to the mailing list and ask mentors for comments. See Finding the Right Project in the Google Summer of Code Guides.
Boost.uBlas: Matrix and Tensor Computations
Potential mentor: David Bellot, Cem Bassoy
- hands on experience with modern C++ template metaprogramming techniques
- basic knowledge in matrix and tensor computations
- strong mathematical background
Boost.uBlas was initially designed to support users with matrix computations for linear algebra applications. The design and implementation unify mathematical notation via operator overloading and efficient code generation using expression templates. Starting with the GSoC 2018 project, uBlas has been extended by a flexible tensor data type and basic tensor operations supporting general tensor contractions and the Einstein notation. The goal of the new tensor extension is to support the implementation of algorithms for e.g. machine learning and quantum computing applications. A very good introduction is given by Tamara Kolda on Youtube. GSoC students generalized and extended the tensor type in GSoC 2019. Last year, GSoC students added Github Actions and Clang analyzers and sanitizers as part of GSoC 2020.
We propose two projects from the project list that will further increase Boost.uBlas capabilities:
Project 1: Finalize and improve the subtensor type
Potential mentor: Cem Bassoy, David Bellot, (Amit Singh)
Subtensors, i.e. views of tensors, are used in many numerical algorithms in which specified array sections are accessed. Subtensors, however, do not own the tensor data but only refer to it. An extension for subtensors is already available in the subtensor feature branch, see for example subtensor.hpp. A simple example of tensor and subtensor creation and usage is:
using tensor = boost::numeric::ublas::dynamic_tensor<float>;
using slice = boost::numeric::ublas::sliced_span;
using end = boost::numeric::ublas::span::end;
// Create tensor t of order 3 and dimensions 4,3 and 5
auto t = tensor{4,3,5};
// Create subtensor ts where ts references the domain specified by the slices
// Subtensor ts has the same order 3 but the dimensions 3,2,1
// Tensor elements are not copied
auto ts = t ( slice(1,3), slice(1,end), end-1 );
// Create a copy tss of subtensor ts
// Both instances have the same order and dimensions
// Tensor elements are not copied
auto tss = ts;
The student shall improve the existing subtensor implementation and create multiple pull requests with well-tested code during GSoC. Possible improvements could be the usage of a separate tag such as ublas::subtensor_tag in order to simplify the overall design and usage. The student should also experiment with the auxiliary types and functions to enhance maintainability. This is an important and challenging project. We expect the student to be motivated and to have a strong background in mathematics and algorithm design using generic programming techniques with C++17.
Project 2: Create new matrix and vector types
Potential mentor: David Bellot, Cem Bassoy, (Amit Singh)
uBlas was written many years ago using the C++03 standard and older template metaprogramming techniques. With the new uBlas tensor extension, we want to simplify the existing ublas::matrix and ublas::vector templates by using the tensor extension as a base implementation.
We expect the student to generate experimental ublas::experimental::matrix and ublas::experimental::vector templates by specializing the ublas::tensor_core declared in tensor_core.hpp and to implement the existing matrix and vector operations using the C++17 standard, see operation overview. Depending on the progress of the first project, we want the student to also implement submatrix and subvector types and provide a simple QR decomposition. We expect the student to be motivated and to have a strong background in mathematics and algorithm design using generic programming techniques with C++17.
Potential mentor: David Bellot
In the field of data science, R, Julia and Python are presumably the most used languages to perform data analysis. They all come with many desirable features. Last year, a successful GSoC'19 student implemented data frames for uBlas. We want to take his project to another level and:
- finalize and productionize his implementation of data.frames for an immediate release in the next version of Boost
- add a collection of free form functions, and modern data analysis procedures (as found in the R tidyverse, in Python pandas/numpy, and in many other packages) based on the use of vectors and matrices
The goal of this project is to lay the foundation of uBlas as a new data science library and to see production-ready code being published by the end of the summer of code.
The student will have to study the code base of uBlas first and build on top of the existing API. We are in favor of promoting a functional programming style in order to get all the benefits of lazy evaluation to optimize the code as much as possible. Documentation must be written too for the project to be successful and production-ready by the end of the summer.
- Fork uBlas and use the develop branch for your implementation
- Create a matrix folder in
  - include/boost/numeric/ublas for your implementation
  - test for the unit-tests
  - examples for the QR decomposition example
- Provide a new matrix and vector C++17 implementation with a functionality similar to:
  - A = zeros(3,2);
  - C = A+B, C = 2*A, etc. for elementwise matrix operations
  - C = A*B for matrix multiplication
  - C = A' for matrix transposition
  - A==B for elementwise comparison
  - c = A[3] for accessing elements with a single zero-based index
  - c = A(3,2) or A(3,5) = c for accessing elements with two zero-based indices
  - C = A(1:2,1:3) to generate a matrix instance that contains data of A referenced by the ranges 1:2 and 1:3
- Use the C++ standard library for your matrix implementation wherever possible (e.g. use std::tuple and std::tie)
- Implement auxiliary types such as range/span and helper functions such as zeros, size, norm to offer maximum readability
- Modify the README.md for Github Actions usage
- Implement a C++ qr function QR-decomposing a matrix with a minimal number of code lines based on the following Matlab code
- Implement a C++ main function showing with A == Q*R that your qr decomposition works correctly
- Once you are finished, place the link of your repository inside your GSoC proposal. DO NOT POST/SEND your repo link
function [Q,R] = qr(A)
[m, n] = size(A);
Q = zeros(m,n);
R = zeros(n,n);
for k = 1:n
R(1:k-1,k) = Q(:,1:k-1)' * A(:,k);
v = A(:,k) - Q(:,1:k-1) * R(1:k-1,k);
R(k,k) = norm(v);
Q(:,k) = v / R(k,k);
end
end
Boost.Multiprecision: Multiprecision Big Number Types
Potential mentor: Christopher Kormanyos
Project requires knowledge of modern C++ template metaprogramming techniques.
Boost.Multiprecision http://www.boost.org/libs/multiprecision is a mathematical library that offers high-performance multiple precision integer, rational and floating-point types with precision vastly exceeding those of built-in float, double and long double.
We are looking for a student to assist with extending and optimizing Boost.Multiprecision to higher precision of many thousands of bits or more.
This is a challenging project with relatively large visibility. It will strengthen mathematical and algorithmic programming skills and is suited for students whose studies and interests include modern C++, algorithms, mathematical programming, and advanced template and generic programming. High-precision is essential in numerous areas, for instance to identify and correct numerical instability. It is also an important tool used extensively in climate and oceanic simulations, transportation, navigation, signal and image processing and in AI research.
Extend and optimize Boost.Multiprecision to higher precision of thousands of bits or more.
Tasks include:
- We will begin by getting the test system ready for higher precision.
- Adapt Boost.Multiprecision Arithmetic Tests to higher precision of thousands of bits.
- Adapt Boost.Math.Constants to higher precision of 1,000 to 10,000 decimal digits.
- Adapt some elementary transcendental function tests to higher precision.
- Test an existing Karatsuba multiplication prototype in this environment.
- Optionally optimize Karatsuba multiplication.
If time permits:
- Extend special function calculations for high-precision.
- Implement successive iterative and AGM methods.
- Begin with elementary functions such as log, exp and progress to selected special functions.
- Optionally implement and test the PSLQ algorithm to search for mathematical constants.
Using latest Boost.Multiprecision, calculate and print the real-valued, positive integral square root of an unsigned integer having 1024 bits or more. Print the result to std::cout in both decimal as well as hexadecimal representation.
Use the existing Boost.Multiprecision cpp_bin_float type to perform an extended-precision (say 50 decimal digits of precision) floating-point calculation of any function you like. This could be a linear combination of elementary transcendental functions, a numerical derivative, a numerical integration or similar. Show your mastery of looping and handling iterative calculations. In this sense, it would be good to select a computation that has enough depth to require an iterative loop such as a for-loop. A Taylor series approximation or another kind of asymptotic approximation of your choice will be a good choice.
Boost.GIL: Generic Image Library
Potential mentor: Mateusz Loskot, Pranam Lashkari
All projects with Boost.GIL require knowledge of C++11 and at least a basic understanding of image processing algorithms.
This proposal is a continuation of the Image Processing Algorithms project submitted and developed by Miral Shah during GSoC 2019.
Boost.GIL provides core features for images and, thanks to Miral Shah's work, a collection of basic image processing algorithms. However, there are still more algorithms to be added to the collection.
Check GIL's wiki page Image Processing Algorithms for a table presenting the implementation status of the algorithms and the wish-list.
- Dilation
- Erosion
- Blurring / Smoothing
- Sharpening
- Wiener filter
- Average filter
- Median filter
- The GSoC 2019 project implemented lots of improvements and new features in the kernels and convolutions, but this area may still benefit from further development, optimisations and documenting.
5. Your favourite image processing algorithms
6. Image Processing Documentation
- We need a beautifully written and presented documentation of the image processing features in GIL
- For example, https://github.com/boostorg/gil/issues/396, describes an idea of the docs structure based on the "Principles of Digital Image Processing" book
The histogram is one of the essential tools in image processing. Although it belongs to the image processing algorithms, we propose this as a separate self-contained project.
The Boost.GIL documentation shows how to compute a histogram in Tutorial: Histogram, and some existing algorithms, such as Otsu's thresholding, already compute a histogram internally. However, GIL offers no public interface for histogram computation and operations.
This project proposal is about making histogram a first-class feature in GIL.
List of suggested topics [1]:
- Computing histograms
- 1D histogram of single image (8-bit grayscale image)
- Histograms of images with more than 8 bits (Binning technique)
- Histogram of color images
- Intensity histogram (Luminance)
- Individual color channel histogram
- Combined color histogram
- 1D histogram of individual components of any color space of single image (HSV, RGB, XYZ)
- 2D histogram [2] projections [3] of single image (e.g. H and S of HSV, R and B of RGB, etc.)
- 3D histogram of single image [4]
- Cumulative histogram
- Point operations - histogram transformations or histogram-based image enhancements
- Histogram Normalization
- Histogram Equalization
- Histogram Specification - adjusting an image to a given reference histogram
- Segmentation
- Thresholding - use of new histogram computation in the GIL thresholding algorithms
- Extras - optional ideas for when the histogram core features are implemented
- Histogram Backprojection [5] - Where in the image are the colors that belong to the object being looked for? [6]
- 2D histogram scheme for colour image segmentation [7]
- Integration with Boost.Histogram
- Histogram visualization - for testing and debugging purposes
- If there is support for Boost.Histogram, we can rely on its ASCII 1D plots
- GIL extension for basic plotting of histogram SVG (e.g. see SVG in Boost.Geometry)
- GIL extension based on a third-party plotting library (e.g. minimal integration with https://plot.ly API)
References:
- [1] Principles of Digital Image Processing - Fundamental Techniques by Wilhelm Burger, Mark J. Burge
- [2] https://docs.opencv.org/master/dd/d0d/tutorial_py_2d_histogram.html
- [3] Digital Image Processing: An Algorithmic Introduction Using Java by Wilhelm Burger, Mark J. Burge
- [4] 3D Color Inspector plug-in for ImageJ
- [5] https://docs.opencv.org/master/dc/df6/tutorial_py_histogram_backprojection.html
- [6] http://www.inf.ed.ac.uk/teaching/courses/av/LECTURE_NOTES/swainballard91.pdf
- [7] https://www.researchgate.net/publication/233658046_A_new_2D_histogram_scheme_for_colour_image_Segmentation
Using latest Boost.GIL, implement a simple convolution filter and show its use in a test application to detect edges in a grayscale input image.
Alternatively, choose a simple algorithm from any topic of image processing, preferably related to your interests regarding the GSoC project, and propose its implementation using latest Boost.GIL.
Alternatively, a bonus project that allows you to show your C++ skills beyond usage of Boost.GIL: using C++11 type traits and Boost.MP11, define a promote metafunction that tries to promote (find) a fundamental integral type T (e.g. char, short, etc.) to another integral type with size roughly twice the bit size of T. For example, using P = promote<char>::type should satisfy static_assert(std::is_same<P, short>::value) or possibly static_assert(std::is_same<P, int>::value).
If you get stuck, don't hesitate to submit even a partial solution.
Potential mentor: Damian Vicino, Laouen Belloli
In 1936, Alan Turing introduced computable real numbers in his foundational paper ‘On Computable Numbers, with an Application to the Entscheidungsproblem’ [1]. He defines: ‘The computable numbers may be described briefly as the real numbers whose expressions as a decimal are calculable by finite means’. Different authors provided several equivalent definitions, for example, an alternative equivalent definition is 'a number is computable if there exists an algorithm to produce each digit of its decimal expansion and for any digit requested it finishes the execution successfully' [2].
Around these ideas, a new area of applied mathematics was developed based on the theory of computable real numbers, called Computable Calculus [2]. In Computable Calculus, it is common to represent numbers using algorithms that can generate their digits and to define arithmetic functions that operate on them lazily [2].
Using this approach, it is possible to calculate solutions for arithmetic problems that require accurate results, for example in the construction of generic Discrete Event Simulation timelines [3].
During GSoC 2018 and 2019, Real was partially implemented while exploring different designs. Now, with a stable design, we want to focus on finishing up the details to start the review process. Some topics being discussed are the integration with Boost.Math, the implementation of good usage examples, and improvements to the division and multiplication algorithms.
Current status is described here:
- https://medium.com/@laobelloli/boost-real-9e2dfbfbed5b
- https://universenox.github.io/gsoc19_Final_Eval
- https://sagnikdey92.github.io/GSoC
- Describe an optimal representation for Real number "digits".
- Explain how could you provide a "Small String Optimization"-like approach for Real numbers representation at compile time.
[1] Alan Mathison Turing. On computable numbers, with an application to the Entscheidungsproblem. J. of Math, 58(345-363):5, 1936.
[2] Oliver Aberth. Computable Calculus. Academic Press, 2001.
[3] Vicino, Damián, Olivier Dalle, and Gabriel Wainer. An advanced data type with irrational numbers to implement time in DEVS simulators. Proceedings of the Symposium on Theory of Modeling & Simulation. Society for Computer Simulation International, 2016.
Potential mentor: Pranam Lashkari, Sarthak Singhal
This project was first introduced in GSoC 2018 and then continued in 2019. This library is not yet released with Boost and is still in the development phase. All the projects listed below aim to bring this library to a review-ready state. This library tries to mitigate the problems with astronomical coordinate systems and FITS files, the most commonly used file type for storing astronomical data.
Currently, the skeleton for storing data of the different coordinate systems has been developed, covering ICRS, CIRS, Galactic, Supergalactic, Heliocentric, Geocentric and Alt-Az. This project involves the development of a generic system to convert these coordinate systems from one to another.
Develop a class for affine transformation which should take an affine matrix (3x3) as a parameter. Using an object of this class, we must be able to transform a cartesian_representation vector. You can use any Boost library if required. (Plus points if all representations can be transformed)
Note: A good fit for this project will be a student who has prior knowledge of astronomy, geometry and algebra in addition to programming in C++.
This project involves developing user APIs. A basic parser [1] to read FITS [2] files has already been developed, but there are no high-level APIs available that make this FITS module easy to use (i.e. by calling a single function, the user should be able to read the file; currently, the parser is divided into several parts and each part reads a particular type of data in the file).
Develop a small class/function which will be able to read the primary header [3] of the FITS file. Also, provide a way to extract the desired header value from the primary header. You can use any Boost library if required. You can find a sample FITS file here. (Plus points for developing ways to enter a header value in the primary header and write it to the file)
This project does not require any knowledge of astronomy or math, but it does require good knowledge of file storage and file handling with C++.
[1] https://github.com/BoostGSoC19/astronomy/tree/develop/include/boost/astronomy/io
[2] http://archive.stsci.edu/fits/users_guide/
[3] http://archive.stsci.edu/fits/users_guide/node19.html#SECTION00511000000000000000
Boost.Multiprecision: New Big-Double Backend Types
Potential mentor: Christopher Kormanyos
Project requires knowledge of modern C++ template metaprogramming techniques.
Boost.Multiprecision http://www.boost.org/libs/multiprecision is a mathematical library that offers high-performance multiple precision integer, rational and floating-point types with precision vastly exceeding those of built-in float, double and long double.
We are looking for a student to assist with a feasibility study for realizing big-double backend types. There are several existing interesting backend options, including the so-called double-double and quad-double, libbf, or extending libquadmath to a portable form. This GSoC project will include implementing, testing, benchmarking and checking compatibility with Boost.Math.
This is a challenging project with relatively large visibility. It will strengthen mathematical and algorithmic programming skills and is suited for students whose studies and interests include modern C++, high-performance mathematical programming, and advanced template and generic programming. The implementations of double-double and quad-double use existing hardware floating-point processor operations. Compared with traditional extended precision methods such as those of GMP, this can potentially offer improved performance in low-to-medium digit ranges of less than 100 decimal digits. Other potential backends such as libbf offer other advantages in particular domains. The resulting work can be of essential use for high-performance calculations such as many-body simulations, lattice calculations and other state-of-the-art areas branching into physics, climatology, communication, transportation, and others.
Implement a feasible realization of double-double and quad-double or libbf. This project is based on existing, well-researched, modern user requests, such as this one for quad-double and this one for libbf.
Tasks include:
- Wrap an existing implementation, such as QD or libbf.
- Test the performance of this new type compared to Boost's existing wrapped version of MPFR.
- Get a test system ready for strongly exercising these types in the relevant precision range.
- Make specific math tests for this type and verify numerical correctness and proper C++ behavior.
If time permits:
- Extend double-double and quad-double to a generic multiple-quad-double of even higher precision.
See PROJECT 1 above.
Boost.Geometry: Generic Geometry Library
Potential mentors: Vissarion Fysikopoulos, Adam Wulkiewicz
All projects require knowledge of C++ template metaprogramming techniques.
Boost.Geometry, part of the collection of Boost C++ Libraries, defines concepts, primitives and algorithms for solving geometry problems.
See http://www.boost.org/libs/geometry
Implement algorithms for the concave hull problem (e.g. from [1][2] etc.). The related PostGIS function is ST_ConcaveHull (https://postgis.net/docs/manual-2.5/ST_ConcaveHull.html). See also: http://boost-geometry.203548.n3.nabble.com/concave-hull-td4026717.html
Implement support for non-cartesian geometries in convex_hull(), either by adapting the existing algorithm and strategies (preferable) or by implementing it as a different algorithm. See: https://en.wikipedia.org/wiki/Convex_hull_algorithms
As mentioned above, you may choose either to provide links to an existing library you have developed or to take the competency test. In the latter case, the requirements are listed below.
PROJECT 1, 2. Implement and test “Gift wrapping” convex hull algorithm for types adapted to Boost.Geometry MultiPoint concept.
[1] Moreira, Adriano J. C. and Maribel Yasmina Santos. “Concave hull: A k-nearest neighbours approach for the computation of the region occupied by a set of points.” GRAPP (2007).
[2] https://www.iis.sinica.edu.tw/page/jise/2012/201205_10.pdf