GSoC 2016 Application Ashutosh Saboo : SymPy Live and SymPy Gamma (on Google App Engine)

Ashutosh Saboo edited this page Mar 25, 2016 · 40 revisions

SymPy Live and SymPy Gamma (on Google App Engine)

Basic Information-:

Open Source Contributions-:

I have been contributing to Open Source for around 2 months now. When I started, I was looking for a challenging organization that works on interesting problems relevant to my interests. Being a Mathematics + Computer Science undergrad, I am really interested in Abstract as well as Numerical Algebra, and hence SymPy was the organization that I chose to contribute to.

Ever since I started, I have been learning and trying my best to contribute patches to SymPy. It's been a very good experience, and I look forward to contributing further. The details of my Pull Requests can be found below-:

All in all, I have submitted 20 PRs as a part of my total Open Source contributions, out of which 14 PRs are closed as of now.

Contributions to SymPy-:

  1. #10478 : Ensured that factorial2 doesn't go into an infinite loop for fractions and negative integers.

  2. #10576 : Improved the xThreaded Decorator. Ensured that Integral(vt,t).doit() and integrate(vt,t) return the same answer.

  3. #10585 : Support for h.primitive() Added. Removed Redundant Terms from Integral Value.

  4. #10591 : Removed raise StopIteration from relevant places.

  5. #10703 : Matrix.rank() now gives the correct result.

I am also listed as a contributor to SymPy 1.0 - SymPy 1.0 Authors.

  • I also try to review some of the less complicated PRs and follow the discussion on other PRs, which has helped me learn a lot about SymPy in general.

Contributions to SymPy Gamma-:

  1. #73 : Fixed Plot Error.

  2. #76 : Corrected Integration Steps for integral evaluation.

  3. #74 : Made sure that the constant multiplies the integral.

  4. #80 : Improved PhantomJS tests, Upgraded SymPy to v0.7.6.

  5. #75 : Ensured that == results in Actual Mathematical comparison, rather than structural comparison.

Apart from this, I have been pretty active on SymPy Gamma. I have studied the code-base closely and opened some important issues for SymPy Gamma - #78, #70, #77.

I have also tried to understand other PRs that are currently open, like #32.

I feel I have managed to understand the code-base of SymPy Gamma quite well.

Contributions to FossAsia-:

Solved 8 Issues in Total for FossAsia.

  1. #33 : Added Instagram links at relevant locations.

  2. #130 : This PR solves 4 issues in total.

  3. #135 : This PR solves 3 issues in total.

  • In FossAsia as well, I have tried to review several PRs and follow up on some others, which has helped me learn more about the code-base of FossAsia.

Contributions to OiWorld (CarbonFootprintGoogleMaps)-:

  1. #38 : Added French Language Support for CarbonFootprintGoogleMaps.

I may have solved only a small number of issues, but it has definitely been a wonderful learning experience for me since I started contributing to Open Source.

The Project-:

What I Want to Achieve -:

Although SymPy Live is SymPy's answer to Wolfram Alpha, it is very different from the end user's point of view. SymPy Gamma, which was created by Ondřej, is a little closer to Wolfram Alpha, but still can't handle input as intelligently as Wolfram Alpha does. I would like to pick up the work started by Ondřej and further improve SymPy Gamma, so that it understands the user's input intelligently and produces the corresponding output. With the growing popularity of SymPy as a CAS (Computer Algebra System), I feel it is really important to improve SymPy Gamma, because that will make things much easier for end users and will make it a strong alternative to Wolfram Alpha in the coming years.

Here's the link to the topic on the mailing list

What Excites Me -:

As I stated above, being a Maths and Computer Science undergrad student, I have always had a huge interest in Algebra, be it abstract or non-abstract. I have also been working on Web Development for quite a long time now, and I have experimented with different Web Frameworks like Django, Flask etc. Moreover, I also work on Python development and Image Processing.

The main part of this project, the Natural Language Processing part, i.e., understanding the user's input and converting it to the closest and most relevant SymPy code, is the one that seems the most challenging to me. In all my Web Development experience, I haven't solved such a problem so far, and hence it excites me the most.

Right from the time I started, I have found contributing to Open Source immensely interesting. I also prefer to complete things, and it feels good when my work is used by people. Hence, I will definitely try my best to continue improving SymPy Gamma even after GSoC, so that it keeps getting better for its users.

Most importantly, participating in GSoC will expose me to talented open source contributors, which will help me a lot in developing new skills and learning from them about problem solving in general.

Project Details and Execution-:

The objective of this project is to improve the functionality of SymPy Live or SymPy Gamma. I would personally prefer to work only on SymPy Gamma as a part of GSoC. If we add certain necessary features to SymPy Gamma, such as an improved parser and Natural Language Processing, so that it understands the user's input much more easily and produces the relevant output, SymPy Gamma can become much more user-friendly for general users. It already offers features like displaying integration steps, which Wolfram Alpha offers only to its paid users, and SymPy as a CAS is being developed and improved at a good rate. Users may even start to prefer SymPy Gamma over Wolfram Alpha in the coming years, because SymPy Gamma offers completely free of cost many features that Wolfram Alpha reserves for its paid customers. That is why I want to focus solely on SymPy Gamma as a part of this GSoC and improve it as much as possible.

Here are some ideas which I feel need to be implemented as a part of this project-:

1. Improving the parser of SymPy Gamma : For instance, typing plot sinx results in an error, whereas plot sin(x) produces the required result. As a part of this project, I want to fix parsing issues of this kind in SymPy Gamma.

We can do this by adding rules that do not yet exist in the SymPy Parser, using the re module in the parse function (which has already been implemented in SymPy here - SymPy Parser) to search for common function names in the user's input string, and then calling the relevant function to display the output. In addition, the string-similarity functions of difflib can be used to compare the input against a pre-defined set of math function names like integrate, differentiate etc., and then evaluate the result using the closest matching function.
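The idea above can be sketched roughly as follows. This is only a minimal illustration using re and difflib from the standard library, not SymPy Gamma's actual parser: the names KNOWN_FUNCTIONS, KNOWN_COMMANDS, fix_command and normalize_query are hypothetical, and the sketch assumes that a glued argument is a single variable x, y or z.

```python
import difflib
import re

# Hypothetical lists; a real implementation would cover many more names.
KNOWN_FUNCTIONS = ["sin", "cos", "tan", "log", "exp", "sqrt"]
KNOWN_COMMANDS = ["plot", "integrate", "differentiate", "limit", "factor", "solve"]

# A known function name glued to a single variable (x, y or z), e.g. "sinx".
# The lookahead skips names that are already followed by "(".
_GLUED_CALL = re.compile(r"\b(%s)([xyz])\b(?!\()" % "|".join(KNOWN_FUNCTIONS))


def fix_command(word, cutoff=0.7):
    """Map a possibly misspelled command word to the closest known command."""
    matches = difflib.get_close_matches(word.lower(), KNOWN_COMMANDS, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def normalize_query(query):
    """Clean up a raw query before it is handed to the real parser."""
    words = query.strip().split()
    if not words:
        return ""
    # Only the first word is treated as a command in this sketch.
    words[0] = fix_command(words[0])
    text = " ".join(words)
    # Insert the missing parentheses: "sinx" becomes "sin(x)".
    return _GLUED_CALL.sub(r"\1(\2)", text)
```

With these assumptions, normalize_query("plot sinx") gives "plot sin(x)", and a misspelled command such as "plto cosx" is corrected to "plot cos(x)". The cutoff value is a tuning knob: a lower cutoff catches more typos but also increases the risk of rewriting a word that was never meant to be a command.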

2. Allowing LaTeX queries on SymPy Gamma : Most mathematicians are quite familiar with LaTeX, hence I would like to add support for LaTeX queries. To implement this, I will use Latex2SymPy to parse the LaTeX input given by the user and convert it to SymPy code, which can then be evaluated.
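To make the LaTeX-to-SymPy step concrete, here is a small stand-in sketch. It does not use Latex2SymPy (which relies on a proper grammar); instead it rewrites a tiny LaTeX subset with regular expressions, purely to illustrate what kind of string the conversion is expected to produce. The function name latex_to_sympy_string and the rule lists are hypothetical.

```python
import re

# Rules applied repeatedly, innermost braces first, so that nested
# constructs such as \sqrt{\frac{x}{2}} are resolved from the inside out.
_NESTING_RULES = [
    (re.compile(r"\\left|\\right"), ""),  # sizing hints carry no meaning
    (re.compile(r"\\frac\{([^{}]*)\}\{([^{}]*)\}"), r"((\1)/(\2))"),
    (re.compile(r"\\sqrt\{([^{}]*)\}"), r"sqrt(\1)"),
    (re.compile(r"\^\{([^{}]*)\}"), r"**(\1)"),
]

# Rules applied once at the end, after all braces are gone.
_FINAL_RULES = [
    (re.compile(r"\^"), "**"),  # bare exponent such as x^2
    (re.compile(r"\\cdot"), "*"),
    (re.compile(r"\\(sin|cos|tan|log|exp|pi)\b"), r"\1"),
]


def latex_to_sympy_string(latex):
    """Convert a small LaTeX subset to a SymPy-style expression string."""
    text = latex.strip()
    previous = None
    # Repeat until no rule changes the text any more.
    while text != previous:
        previous = text
        for pattern, replacement in _NESTING_RULES:
            text = pattern.sub(replacement, text)
    for pattern, replacement in _FINAL_RULES:
        text = pattern.sub(replacement, text)
    return text
```

For example, latex_to_sympy_string(r"x^2 + \sin\left(x\right)") gives "x**2 + sin(x)", a string that could then be handed to SymPy for evaluation. Anything outside this subset is left untouched, which is exactly why a grammar-based converter is the better choice for the real feature.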

3. Addition of a Math Input Keyboard : This is another addition that will make the site more user-friendly. Wolfram Alpha offers this feature only to its paid users. There will be a Math Input keyboard with all the Math symbols, like those of differentiation, integration, limit etc. Users can use those symbols to provide input, and the symbols will then be parsed to SymPy code, which will produce the corresponding output. This will be in addition to the text-only input that users can already give.

I will use MathQuill to get the LaTeX code for the Math Input provided by the user, and then use Latex2SymPy to convert the LaTeX code to SymPy-readable code, which can then be evaluated by SymPy Gamma. For this, I will first set up a fixed number of buttons (like Wolfram Alpha does) for popular Math inputs such as integration, differentiation, limit, exponent and so on. Clicking a button will use JavaScript to enter the corresponding MathQuill input in the input text field of SymPy Gamma. The MathQuill API will then return the LaTeX code for the user's input, which Latex2SymPy will use to generate the corresponding SymPy code. For instance, if the user clicks on the exponent button, ^ gets added to the input box, since ^ corresponds to the power function in MathQuill, and MathQuill then converts it to LaTeX.

4. Improving the User Interface : The User Interface (UI) of SymPy Gamma can be improved drastically. For instance, when the graph produced by SymPy's plot function is displayed, we can add an interactive feature that asks the user for the range of values over which the graph should be displayed. Once the values are entered and submitted, the plot() function can be called with the constrained initial and final values of x to display the new graph.

UI can be improved in the following ways-:

  • Ensure that the results can easily be copied as plain text by clicking a button.
  • Add an iPython Notebook for the results of each query made on SymPy Gamma, which can be viewed online (a link to the notebook will be added on the SymPy Gamma results page) as well as downloaded. The iPython Notebook will contain all the inputs, headings and outputs of all the cards, as well as the plots/graphs.
  • Add an options drop-down to the plot card, where the user can enter initial and final values of x and y, after which the graph is redrawn in those ranges of x and y.
  • Ability to download the results of a particular query.
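As a rough sketch of the notebook export, the result cards could be turned into a notebook file along the following lines. This uses only the json module and the version 4 notebook JSON layout; the function names and the shape of the cards argument (a list of dicts with "title" and "code" keys) are assumptions made for illustration, not SymPy Gamma's actual data structures.

```python
import json


def cards_to_notebook(query, cards):
    """Build a version 4 notebook dict from a query and its result cards.

    `cards` is a hypothetical list of dicts with "title" and "code" keys.
    """
    cells = [{
        "cell_type": "markdown",
        "metadata": {},
        "source": "# SymPy Gamma results for `%s`" % query,
    }]
    for card in cards:
        # One heading cell per card, followed by the code that produced it.
        cells.append({
            "cell_type": "markdown",
            "metadata": {},
            "source": "## %s" % card["title"],
        })
        cells.append({
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],  # left empty here; filled in when the cell is run
            "source": card["code"],
        })
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 0}


def notebook_json(query, cards):
    """Serialize the notebook so it can be offered as an .ipynb download."""
    return json.dumps(cards_to_notebook(query, cards), indent=1)
```

In this sketch the outputs are left empty, so the reader would re-run the cells to see the results. Storing the rendered outputs and plots in the notebook, as planned above, would mean filling the "outputs" list for each code cell.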

I have already started working on some of these features.

  • I have implemented a part of the copy-able plain-text results (mentioned above). The results can be seen at this link - Link. It allows the user to copy the text of the output. I am still working to ensure that the outputs of all the cards can also be copied as plain text.

  • I have also started working on integrating the iPython Notebook with SymPy Gamma, which can be viewed through the link at the bottom of this page - Link. The results can be seen here - iPython Notebook, after consulting #29. I have also started working on adding graphs/plots to the iPython notebook.

5. Adding a Feedback System : Currently, SymPy Gamma already has a Login feature (in the website's top-right corner), and it also records search queries. The objective is to design a separate page that displays the user's search queries as well as a Feedback form. Submitting the form will save the data to the Google App Engine database, where it can be accessed by the administrators. A user can fill in the feedback form irrespective of whether they have logged in or not, and Google reCAPTCHA can be used to keep the responses spam-free. Users can give feedback about SymPy Gamma and suggest any new feature that they feel should be added to it. I feel that once SymPy Gamma becomes popular, users will provide constructive feedback, which will also help SymPy in general to add any necessary features. In addition, the feedback system can be integrated with all queries that result in an error, i.e., as soon as an error pops up in SymPy, the user will have the option of clicking a button to directly submit feedback regarding the issue, which will be saved to the database for the administrators to access.

At present, SymPy Gamma already saves the search queries and displays them. To implement the Feedback System, form-based input can be taken from the user and added to a new database on GAE (Google App Engine), with the feedback in one column and the user's name and email ID in other columns. Along with this, a button can be added to each card to report an issue with that card's output, so that the user can give feedback for that particular card as well. In addition, whenever SymPy Gamma returns an error, there can be a link to post feedback, which will also include the entire traceback of the error in the feedback posted by the user.
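The record described above could look roughly like this. The sketch uses a plain Python dataclass (Python 3.7+) as a stand-in for the actual Google App Engine datastore model, and the class, field and function names are hypothetical.

```python
import traceback
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Feedback:
    """One feedback entry; stands in for a datastore row in this sketch."""
    message: str
    query: str
    card: Optional[str] = None             # title of the card, if card-specific
    user_name: Optional[str] = None        # None for anonymous feedback
    user_email: Optional[str] = None
    error_traceback: Optional[str] = None  # attached for failed queries
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def build_feedback(message, query, card=None, user=None, error=None):
    """Validate the form input and assemble a Feedback record.

    `user` is an optional (name, email) tuple; `error` is the exception
    raised while evaluating the query, if there was one.
    """
    message = message.strip()
    if not message:
        raise ValueError("feedback message must not be empty")
    trace = None
    if error is not None:
        # Keep the full traceback so that administrators can reproduce the issue.
        trace = "".join(
            traceback.format_exception(type(error), error, error.__traceback__)
        )
    name, email = user if user else (None, None)
    return Feedback(message, query, card, name, email, trace)
```

Keeping the user fields optional reflects the plan to accept feedback from users who are not logged in, and the optional card field covers both the general form and the per-card report button.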

  • I have already started working on this feature. I have created a separate page - Support - which displays the user's search queries when they are logged in, and also displays a Feedback form. I am working on attaching the Send Feedback option to each card as well (as explained above). I will also change it so that the user can send feedback even when not logged in.

6. Displaying proper Documentation on SymPy Gamma : Inspired by Mathics, I liked the ability to view the documentation of supported commands along with their usage (syntax rules). It is very convenient for users if they can view the precise documentation on the web page itself, rather than search for it online.

A View Documentation button will be added to every card. Clicking that button will open a full-screen window showing the documentation of the function that was called in that particular card. The same can be viewed on Link - Documentation Link on Series Representation Card.
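One simple way to fetch the text for such a window is to read the function's docstring, sketched below with the standard inspect module. The helper name lookup_documentation is hypothetical, and the usage example passes the math module only as a stand-in; in SymPy Gamma the namespace would be sympy.

```python
import inspect
import math


def lookup_documentation(name, namespace):
    """Return the cleaned docstring of a public callable, or None.

    `namespace` is any module or object that holds the supported functions.
    """
    if name.startswith("_"):
        return None  # never expose private or special attributes
    obj = getattr(namespace, name, None)
    if obj is None or not callable(obj):
        return None
    # inspect.getdoc strips the indentation that raw __doc__ strings carry.
    return inspect.getdoc(obj)


# Stand-in usage with the math module.
sine_doc = lookup_documentation("sin", math)
missing_doc = lookup_documentation("no_such_function", math)  # None
```

Returning None for unknown or private names lets the card hide the View Documentation button when there is nothing to show.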

Timeline-:

From my past experience, things can change in a very short period of time. For example, a bug may come up that has to be fixed before anything more can be done. However, I am still proposing a tentative timeline for this project, which I will refine after discussing the plan of execution with my mentor. This will at least help me keep track of how far behind I am.

Community Bonding Period-:

Since I am already a SymPy as well as a SymPy Gamma developer, I won't need to spend much time learning the basics of SymPy. Instead, I'll use this time to get started on my project. I'll also have exams during this period, from 1st May to 14th May, so my progress might be slow. In this period, I'll start working on the Feedback System feature as well as the Documentation feature (mentioned above in [5] and [6] respectively).

Week 1,2,3-:

In this period, I'll start working on the addition of the Math Input Keyboard and on supporting LaTeX user-input queries (mentioned above in [3] and [2] respectively).

Week 4,5,6,7-:

In this period, I will start working on all the sub-points mentioned in Improving the UI part (mentioned in [4] above). In addition to this, I will try to make progress on whatever has been left till now.

Week 8,9,10-:

In this period, I'll start working on Improving the Parser (mentioned above in [1]).

Week 11,12-:

Time to catch up here.

  • I'll spend time getting my unmerged work merged.
  • I'll document everything that needs to be documented.

Background and Programming Skills-:

I am a second year undergraduate student from BITS Pilani, India, currently pursuing M.Sc (Hons.) Mathematics + B.Tech Computer Science. I started getting familiar with the code-base of several Open Source organizations about 3 months ago, and started contributing to SymPy in particular about 2.5 months ago.

I generally prefer to use Ubuntu for development purposes. I use Sublime Text and Atom for development.

Being a Mathematics + Computer Science undergrad, I have done several courses which may be of relevance to SymPy in general-:

  1. Discrete Mathematics
  2. Abstract Algebra 1 - Deals with Group Theory and Ring Theory
  3. Abstract Algebra 2 - A very advanced successor of Abstract Algebra 1. Deals with Galois Theory, Euler's Construction Lemmas, and explains Factorization in general in every field/ring
  4. Graph Theory
  5. Cryptography Project Course - Worked under the guidance of a professor for studying on the details of several security mechanisms
  6. Image Processing Project Course - Working under the guidance of a professor for implementing better and efficient image processing algorithms, and creating a product out of it

Other than these courses, I like to work on and study different kinds of projects out of my own interest. I have currently been working a lot on Image Processing as well as Algebra. I have around a year of experience in Web Development, including different web frameworks like Django and Flask. Apart from this, I like to develop and implement new things using Python. I started coding in C when I was in 11th, and I have good experience in coding with Python, C and C++. I have also worked with Java and Android.

Here are some of my Open Source Projects-:

  • Droplet - A local sharing Web Based client (similar in concept to Direct Connect clients like Apex DC++), dedicated solely to movie sharing in our campus, with the advantage that movie details and reviews are displayed for each movie. Users can share their movies and download movies at very high speeds. Hashes are used to implement file sharing. I implemented this along with 3 of my friends. We developed this locally, and after this project I started to use GitHub, having understood its importance. The code on the GitHub repository contains the fully completed back-end for Droplet, which was implemented in Python.

  • CodeScan - Implemented an Open Source Bar-Code / QR Code Scanning Android App, which scans the Bar-Code or QR Code, then automatically searches on Google and displays the search results to the user.

  • Flaskr - Implemented a Form Based Web App based in Flask, which displays the relevant results, upon submission of inputs.

  • DjangoMail - A newsletter registration page implemented using Django, which sends a mail to the user upon successful registration for the newsletter.

  • Speech Analyzer - Analyzes the live recording of a person's speech, and finds the verbs, adverbs, nouns, adjectives, proper nouns etc. in it. Implemented using Natural Language Processing (NLP) and Python.

  • GenSubs - Generates the subtitles of any video, and highlights the verbs, adverbs, nouns, adjectives, proper nouns etc. in every sentence spoken in the video. Implemented using Natural Language Processing (NLP) and Python.

Apart from these, I am currently working on an Image Processing project under a professor at my campus. What I have implemented for it so far can be viewed in my repositories - Repositories.

I generally prefer to use C/C++ for learning purposes. However, since Python is an interpreted language with a lot of libraries available, as well as a lot of support online (on help forums like StackOverflow), I prefer it for implementing something new. For one of the products that I am planning to create under my Image Processing project, I am using Python along with OpenCV for the prototype, and we then plan to use C for the final implementation.

I've been using Git for about 1.5 years (since we completed the back-end of Droplet, as mentioned above), and I feel I have mastered the basics. I am completely adjusted to the Git Workflow now.

Notes-:

  • I have my final exams from 1st May to 14th May. After that, I have absolutely no other commitments, hence I'll devote at least 35 hours/week to working on my project.

  • Some points in this proposal were suggested by Aaron Meurer and David Li.

  • I'll send Pull Requests as soon as possible, so that reviewing the code becomes easier, and hence there are more chances for my code to get merged into the master.

  • As I mentioned above, I like it when my work gets used by people. Hence, if my work doesn't get merged by the end of the summer, I will definitely keep working towards getting it merged beyond the summer as well.

References-:

[2]: Documentation - MathQuill

[3]: My modified version of SymPy Gamma

[4]: This is the iPython notebook that I have been working on

[7]: Wolfram Alpha

[8]: Latex2SymPy integration with SymPy Discussion

[9]: Documentation - Jupyter Notebook

[10]: iPython Widgets

[11]: Topic Discussion on Mailing List