GSoC 2013 Application Benjamin Fishbein: Span Class

INTRODUCTION

My name is Benjamin Fishbein. I am an undergraduate student at the University of Illinois at Champaign-Urbana where I am double majoring in Computer Science and Mathematics. I initially realized my passion for both Computer Science and Mathematics while in high school in New Jersey. My passion for both of these fields has only intensified in college. I can be contacted at fishbeinb@gmail.com or at fishben2@illinois.edu. My GitHub username, IRC Handle, and Google Code username is fishbeinb.

BACKGROUND AND PROGRAMMING EXPERIENCE

I have experience coding in C++, Objective-C, Python, and Java. I code on a laptop running Windows. My favorite editor is Eclipse because of its ease of use and its strong debugging tools. I taught myself Python while in high school and it is my language of choice. Throughout my career as a programer I have created apps for both iOS and Android. Most recently I have created Tic-Tac-Ception which is a multi dimensional tic-tac-toe game which implements customizable graphics, online multiplayer, and a fully functional secure login system. My favorite feature of Python that is lacking in most other languages is its simple yet powerful looping mechanism. For example, looping through a set of elements in a list (for example “a”) is simply “for x in a:” In other languages this is a more difficult task. In my opinion this is also the most powerful feature I have used in Python as it allows for more fluid code. My favorite feature of SymPy is its ‘Solve’ function, a simple, yet essential function which is lacking in nearly all other programming languages. I frequently use this function. I used the ‘Solve’ function of SymPy to help me easily complete the Bullseye problem in Google Code Jam:

x = Symbol(‘x’)

answer = solve(x2r + 4x(x - 1)/2 + x - t, x) #r and t are integer variables.

I have used git to send a patch to SymPy. My patch added the ‘rank’ function to SymPy’s Matrices class. Here is a link to the patch https://github.com/sympy/sympy/pull/1870

PROJECT

I am proposing a class which will improve SymPy’s matrices modulus making it a more dynamic and versatile tool. Specifically, the project involves converting Matrices to Spans, being able to find out useful information about the Spans, and being able to manipulate the Spans. Also I will implement more advanced features of Spans such as the basis of a Span, and the image, basis image, kernel, and basis kernel of a matrix. Additionally, I will implement a feature that converts a Span back into a matrix.

Converting a matrix to a Span would function as follows:

x = Span([1,2,3],[4,5,6])

x

/[1] [4]\

|[2] [5] |

[3],[6]/

x.dim()

2

Manipulating Spans:

x

/[1] [4] [7]\

|[2] [5] [8] |

[3],[6], [9]/

x.dim()

3

Finding the basis of a Span:

y = x.basis()

y

/[1] [4]\

|[2] [5] |

[3],[6]/

Finding the basis image of a matrix:

x = Matrix([1,2,3],[4,5,6],[7,8,9])

y = x.basis_image()

y #Span([1,2,3],[4,5,6])

/[1] [4]\

|[2] [5] |

[3],[6]/

As needed, further features will be implemented, and as suggested by David Joyner everything will work over finite fields^1.

IMPLEMENTATION

I plan on using many resources to help me understand, solve, and implement functions currently lacking in SymPy. One of these resources “Linear Algebra with Applications Fourth Edition” by Otto Bretscher outlines algorithmic methods for functions such as finding the Kernel or Basis image of a matrix.

Here is Bretscher’s method for finding the kernel of a matrix (slightly paraphrased for formatting purposes).

Given a n X m matrix A, for example:

[1, 2, 2, -5, 6]

[-1,-2,-1, 1, -1]

[4, 8, 5, -8, 9]

[3, 6, 1, 5,- 7]

We have to solve the linear system:

Ax = 0

First, find the reduced row echelon form of A (rref(A)):

[1, 2, 0, 3, -4]

[0, 0, 1, -4, 5]

[0, 0, 0, 0, 0]

[0, 0, 0, 0, 0]

Next set up a system of equations involving all rows in rref(A) with a leading coefficient of 1:

[1x1 +2x2 + 0x3 + 3x4 + -4x5 = 0] or [x1 = -2x2 - 3x4 + 4x5]

[0x1 +0x2 + 1x3 + -4x4 + 5x5 = 0] [x3 = 4x4 - 5x5 ]

Expand the system to include all variables:

[x1 = -2x2 - 3x4 + 4x5]

[x2 = x2]

[x3 = 4x4 - 5x5 ]

[x4 = x4]

[x5 = x5]

The kernel of A consists of the solutions of the system

let x2, x4, x5 be arbitrary constants s, t, r respectively

......[x1] [-2s -3t 4r ] [-2] [-3] [4]

......[x2] [s ] [1] [0] [0]

x = [x3] = [ 4t -5r ] =s[0] +t [4] + r[-5]

......[x4] [ t ] [0] [1] [0]

......[x5] [ r ] [0] [0] [1]

Using the concept of the Span

......................./[-2] [-3] [4] \

......................./ [1] [0] [0] \

Ker(A) = Span | [0] [4] [-5] |

.......................\ [0] [1] [0] /

.......................\[0], [0], [1 ]/

Quick re-cap:

1. Find rref(A)

2. Set up a system based of rref(A)

3. Interpret the system by combining like terms

To find the basis image of a matrix Bretscher recommends transforming the matrix into rref form and taking the original pivot columns.

For example:

Given a Span

.........../[1] [2] [2] [-5] [6] \

x=span| [-1] [-2] [-1] [1] [-1] |

............| [4] [8] [5] [-8] [9] |

............\ [3], [6], [1], [5], [7]/

Convert to a matrix ‘A’:

[1, 2, 2, -5, 6]

[-1,-2,-1, 1 ,-1]

[4, 8, 5, -8, 9]

[3, 6, 1, 5 ,-7]

Find the reduced row echelon form of A (rref(A)):

[1, 2, 0, 3, -4]

[0, 0, 1, -4, 5]

[0, 0, 0, 0, 0]

[0, 0, 0, 0, 0]

There are pivots in the first and third column, so the basis is the original first and third column:

.................../[1] [2] \

basis=span| [-1] [-1]|

...................| [4] [5]|

...................\ [3], [1] /

Unfortunately, SymPy’s current matrices system is limited in certain regards. Specifically, it is unable to compute characteristics of matrices over finite fields. This means that if one wants to find the rank of a matrix one can do so over all rational numbers (QQ). However, SymPy has no implementation for this when the matrix is restricted to finite field of numbers.

An example of this discrepancy can be seen when trying to find the Rank (the number of linear independent columns/rows) of a matrix.

Take, for example, the matrix B:

[1, 0, 1]

[0, 1, 1]

[1, 1, 0]

When unrestricted using all rational numbers each row in this matrix is linearly independent. However, when restricted to only using numbers in the finite field GF(2) only two of the rows are linearly independent.

The finite field GF(2) only has two numbers {0, 1} These numbers satisfy the addition and multiplication rules depicted in the tables:

-/+ 0 1

0|0|1

1|1|0

X 0 1

0|0|0

1|0|1

Following addition rules set out above we can see that row 3 is actually row 1 + row 2.

My current linear algebra book only mentions matrices over finite fields in passing and does not include exact algorithmic methods for finding their characteristics. However, after a very preliminary search it is apparent that there are a plethora of resources available for finding such methods. Books such as “Finite Fields, Linear Algebra, and Number Theory” by Aiden A. Bruen and Mario A. Forcinito, papers such as “Computational linear algebra over finite fields” by Jean-Guillaume Dumas and Clement Pernety http://arxiv.org/pdf/1204.3735v1.pdf, and to a lesser extend general knowledge websites such as wikipedia’s many articles on finite fields and Finite-rank operator can all be referenced.

PROJECT RELEVANCE

I’m excited about this project because the inclusion of this functionality will allow for the more extensive use of SymPy. The inclusion of my code will allow physicists, chemists, engineers and many other professionals to incorporate SymPy into their fields of study.

I am confident that I can accomplish this task because I have over two years of experience using Python. Additionally, I have a firm knowledge of Spans, matrices, and linear algebra, and have taken a course in Applied Linear Algebra.

This idea has been implemented to some degree by sites such as WolframAlpha and http://www.math.odu.edu/~bogacki/cgi-bin/lat.cgi. I will be consulting books that discuss algorithmic methods for implementing these topics^2. I am confident that I can emulate this effectively in SymPy.

Over the summer I will be investing at least 40 hours a week working on SymPy. I will be available for the entire duration of the project without any interruptions.

As I continue to take math classes I intend to implement the material I learn into SymPy.

PAST PULL REQUESTS

I have already written a patch which has been merged into SymPy’s current master. This patch adds the Rank function to SymPy’s matrices modulus. https://github.com/sympy/sympy/pull/1870

TIMETABLE

My weekly plan follows Google’s suggested timeline:

Community Bonding Period - May 27 - June 17. I will become versed in SymPy’s code, specifically, the requirements to create new classes and objects.

Work Period:

Weeks 1-4 : Over these weeks I will build the Span class which will be filled with functions in the weeks to follow.

The first pull request will occur during week 4.

Weeks 5 - 6: Implement functions such to the Span class. span - Takes a matrix and returns its Span basis - takes a Span and returns its basis add_column/remove_column - Adds/removes a column to an existing Span add_row/remove_row - Adds/removes a row to an existing Span Other standard Span manipulations i.e. changing specific values

The second pull request will occur towards the end of week 6.

Mid-term evaluations

Weeks 7 - 9: Continue to implement more advanced functions image - Returns the image of a Span basis_image - Returns the basis image of a Span kernel- Returns the image of a Span basis_kernel - Returns the basis image of a Span

The third pull request will occur towards the end of week 9.

Weeks 10 - 11: Transforming a span to a matrix Optimization of functions Implementing ‘Span’ into SymPy’s PrettyPrint modulus As suggested by David Joyner make sure everything works over finite fields Catch up period if necessary

Weeks 12 - end: Documentation and fixing bugs

The final pull request

References: