
Papers on Dimensionality Reduction #279

Merged: 3 commits, Jun 17, 2015

Conversation

tchitra

@tchitra tchitra commented Feb 3, 2015

I've added a few papers on the theory of dimensionality reduction. A lot of this work is used to theoretically justify the performance of a variety of machine learning techniques and heuristics such as random forests, approximate least-squares and compressed sensing. The papers included are:

  1. "The Fast Johnson-Lindenstrauss Transform" (N. Ailon, B. Chazelle) — Only paper not from arXiv, was in a SIAM Journal and taken directly from B. Chazelle's personal website [0]
  2. "A Sparse Johnson-Lindenstrauss Transform" (A. Dasgupta, R. Kumar, T. Sarlós) — arXiv:1004.4240; STOC 2010 proceedings
  3. "Towards a unified theory of sparse dimensionality reduction in Euclidean space" (J. Bourgain, S. Dirksen, J. Nelson) — arXiv:1311.2542; Accepted in an AMS journal but unpublished at the moment

I'd be willing to give a talk on one of these papers (presumably not the last one, though, as it is quite deep and hard to cover in a short talk!)

[0] https://www.cs.princeton.edu/~chazelle/pubs/FJLT-sicomp09.pdf

Tarun Chitra added 2 commits February 3, 2015 00:25
@DarrenN
Contributor

DarrenN commented Feb 3, 2015

Hi @tchitra - what chapter is closest to you?

@tchitra
Author

tchitra commented Feb 3, 2015

New York!


@tcfuji
Contributor

tcfuji commented Feb 12, 2015

@tchitra Can you add a small description for each paper? I'm interested in this subject too.

@tchitra
Author

tchitra commented Feb 12, 2015

@tcfuji Here are short summaries:

  1. "The Fast Johnson-Lindenstrauss Transform" — The Johnson-Lindenstrauss transform (JLT) says that there is a distribution over k x d matrices, with k = O(1/eps^2 log d), such that a matrix A drawn from it preserves pairwise distances up to epsilon with high probability (i.e. (1-eps) ||x-y|| < ||Ax - Ay|| < (1+eps) ||x-y||). This was the first paper to show that you can actually apply the JLT in fewer than O(kd) operations (i.e. you don't need to do the full matrix multiplication). The authors used their faster algorithm to construct one of the fastest known approximate nearest-neighbor algorithms.
  2. "A Sparse Johnson-Lindenstrauss Transform" — The JLT is still computationally expensive for a lot of applications, so a natural goal is to minimize the number of operations needed for the aforementioned matrix multiplication. This paper showed that an O(k log d) (i.e. roughly (log d)^2) algorithm may be attainable, by proving that very sparse, structured random matrices can still provide the JL guarantee on pairwise distances.
  3. "Towards a unified theory of sparse dimensionality reduction in Euclidean space" — This paper attempts to lay out a generic mathematical framework (in terms of convex analysis and functional analysis) for sparse dimensionality reduction. The first author is a Fields Medalist who is interested in taking techniques for Banach spaces and applying them to this problem. It is a very technical paper that attempts to answer the question, "when does a sparse embedding exist deterministically?" (i.e. without drawing random matrices).
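For readers skimming the thread, the JL guarantee from summaries 1 and 2 can be sanity-checked numerically. Below is a minimal NumPy sketch of both a dense Gaussian projection and a CountSketch-style sparse projection; the dimensions, the constant 8, and the sparsity level s are illustrative choices of mine, not values from the papers:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 50, 1000, 0.5                  # illustrative sizes, not tuned
k = int(np.ceil(8 * np.log(n) / eps**2))   # target dimension; constant 8 is a common textbook choice

X = rng.normal(size=(n, d))                # n arbitrary points in R^d

# Dense JL: i.i.d. Gaussian entries scaled by 1/sqrt(k)
A_dense = rng.normal(size=(k, d)) / np.sqrt(k)

# Sparse JL (CountSketch-style): s nonzero +-1/sqrt(s) entries per column
s = 8
A_sparse = np.zeros((k, d))
for col in range(d):
    rows = rng.choice(k, size=s, replace=False)
    A_sparse[rows, col] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)

def max_distortion(X, Y):
    """Largest relative error over all pairwise Euclidean distances."""
    worst = 0.0
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            orig = np.linalg.norm(X[i] - X[j])
            proj = np.linalg.norm(Y[i] - Y[j])
            worst = max(worst, abs(proj - orig) / orig)
    return worst

print(max_distortion(X, X @ A_dense.T))    # typically well below eps
print(max_distortion(X, X @ A_sparse.T))   # comparable, with only s*d nonzeros in A
```

Note the sparse matrix here has s nonzeros per column versus k for the dense one, which is the computational saving the second paper formalizes.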

@tcfuji
Contributor

tcfuji commented Feb 12, 2015

@tchitra Cool! Good luck explaining Banach spaces (treading into 2nd-year graduate-level math)!

👍

@tchitra
Author

tchitra commented Feb 12, 2015

@tcfuji and @DarrenN : I'm also going to add some papers on sublinear algorithms (focusing on the ones that use dimensionality reduction methods). Let me know if you ever need a speaker!

@zeeshanlakhani
Member

@tchitra Thank you. We'll definitely keep you in the loop on speaking.

In terms of this PR, we're all for adding these papers, but the copyrights/perms on them may cause issues. Do you mind adding their links to the README of each of the categories... w/ your short summaries? Here's one example: https://github.com/papers-we-love/papers-we-love/tree/master/experimental_algorithmics#included-papers.

Thanks!

@zeeshanlakhani
Member

@tchitra just checking in to see if you got my last message regarding the permissions issues.

@NewAlexandria
Contributor

Dimensionality Reduction is interesting to me, too. What needs to be done here?

I see the link to the FJLT at Princeton, and the other two can be links to arXiv (in addition to the backup in-repo). These are all under the Dimensionality Reduction section.

Then there are 2 more papers, in Sublinear Algorithms. What's intended to happen with them?


  • Does anyone know if MathJax or another web-LaTeX renderer has been implemented for GitHub? (for the equations in the summaries)

@zeeshanlakhani
Member

@NewAlexandria sounds good! The Sublinear ones can stay as pdfs... but we've also been putting links to the files in the READMEs for each category, e.g. https://github.com/papers-we-love/papers-we-love/blob/master/cryptography/README.md... the ones w/ the 📜 mean link & pdf.

I'd love to include @tchitra's summaries, #279 (comment), in the readme as well... along w/ the papers.

From reading around, support for something like MathJax has been floated for GitHub, but nothing has landed yet.

@zeeshanlakhani
Member

@NewAlexandria were you still planning to finish adding what's needed here?

@tchitra
Author

tchitra commented May 28, 2015

@zeeshanlakhani Sorry I haven't gotten back to you guys on the legality/rights stuff. I'll update that stuff over the weekend. All of the papers that I have uploaded are available for free with proper licenses.

@NewAlexandria: The sublinear papers include count-min sketch and a simple proof of the Johnson-Lindenstrauss theorem. I've found these papers to be really useful to both programming and my research, so I figured that they would fit within the scope of Papers We Love.
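Since count-min sketch came up, a minimal illustrative implementation may help readers place it; the hash construction and table sizes below are my own assumptions for the sketch, not values from the paper:

```python
import hashlib

class CountMinSketch:
    """Toy count-min sketch: `depth` hash rows of `width` counters each.
    query() never underestimates; overestimation shrinks as width grows."""

    def __init__(self, width=272, depth=5):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, item, row):
        # One hash per row, derived by prefixing the row index into a blake2b digest
        digest = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._bucket(item, row)] += count

    def query(self, item):
        # Min over rows: each row can only overcount (collisions), never undercount
        return min(self.table[row][self._bucket(item, row)]
                   for row in range(self.depth))

cms = CountMinSketch()
for word in "the quick the lazy the dog".split():
    cms.add(word)
print(cms.query("the"))   # at least 3, the true count, barring collisions
```

The min-over-rows is why estimates are one-sided: a collision can only inflate a counter, so taking the smallest row estimate bounds the error from above.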


@tchitra
Author

tchitra commented May 28, 2015

@zeeshanlakhani Let me know if there is anything else you would like me to contribute and/or if there is a certain subset of the aforementioned topics that you would prefer be covered well within the repo.

@zeeshanlakhani
Member

@tchitra thanks. On the licenses, though, the Fast Johnson-Lindenstrauss Transform paper, for example, has an explicit copyright at the bottom: Copyright © by SIAM. Unauthorized reproduction of this article is prohibited. So, we'd have to go w/ a link for something like that.

@NewAlexandria mentioned this pull request May 31, 2015
@NewAlexandria
Contributor

I have added the README changes in #308. Merge this branch first, as that branch is rebased. Hope this works across forks, but I'll rebase again if not.

@hakutsuru
Member

@zeeshanlakhani requested I get this merged. The SIAM paper was not removed by @NewAlexandria, so my plan is to merge this, then merge #308, and then remove the paper in another pull request... Not ideal, but it gets this cleaned up.

hakutsuru added a commit that referenced this pull request Jun 17, 2015
Papers on Dimensionality Reduction
@hakutsuru merged commit 7fbbfe9 into papers-we-love:master Jun 17, 2015