NOTE: The project page is currently under development and may be subject to change.


The project can be theory-oriented or application-oriented. Group work and discussion are encouraged. Note that the contribution of each team member should be clearly documented in the project report. For projects of the same quality, a smaller team will get a higher score.

For PhD/Master students: it is ok if you want to align the class project with your current research projects. HOWEVER, it cannot be results you have already published, and you should have additional novelty and contribution. Please come and talk to me if this is the case.

It is strongly recommended to use LaTeX to format the project report, with the ACM SIG template. The LaTeX source for the reports, the algorithm code, and the experimental scripts (not data!) should be maintained on GitHub throughout development. Once the group is decided, the group leader should send me the names of the group members and their GitHub IDs. A project repository will be created in the dedicated GitHub space.

Theory Thrust

  • Maximum group size: 2
  • Focus: study statistical properties of new machine learning algorithms.
  • Pick a paper from 2015 ICML or 2015 NIPS.
  • Complete a term paper that surveys the area and shows every detail of the proof.
  • Implement the algorithm and verify the theoretical properties (synthetic datasets can be used)
  • Bonus: use the algorithm to solve a real-world problem (apply on a real-world dataset and get empirical results).
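
As a rough illustration of what "verify the theoretical properties" on synthetic data can look like, the sketch below checks a textbook property rather than a result from any particular paper: the root-mean-square error of the sample mean of n i.i.d. Gaussian draws should be close to sigma / sqrt(n). The function names, seed, sample sizes, and trial count are all illustrative assumptions.

```python
import math
import random


def empirical_rmse(n, trials, rng, mu=0.0, sigma=1.0):
    """Estimate the RMSE of the sample mean over `trials` synthetic datasets of size n."""
    total_sq_err = 0.0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        total_sq_err += (sample_mean - mu) ** 2
    return math.sqrt(total_sq_err / trials)


def rate_table(sizes=(25, 100, 400), trials=2000, seed=0):
    """Return (n, observed RMSE, predicted sigma/sqrt(n)) rows for sigma = 1."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    rows = []
    for n in sizes:
        observed = empirical_rmse(n, trials, rng)
        predicted = 1.0 / math.sqrt(n)
        rows.append((n, observed, predicted))
    return rows


if __name__ == "__main__":
    for n, observed, predicted in rate_table():
        print(f"n={n:4d}  observed RMSE={observed:.4f}  predicted={predicted:.4f}")
```

For a real project, the same pattern applies with the paper's algorithm in place of the sample mean and the paper's bound in place of sigma / sqrt(n): generate data that satisfies the theorem's assumptions, vary the quantity the bound depends on, and compare the observed and predicted values.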

Application Thrust

  • Maximum group size: 3
  • Focus: Implement and compare different algorithms to solve a real-world problem.
  • Data can be from SIGKDD Cup 2016 (talk to me if you are interested!) or Kaggle competitions.
  • Start with reading some KDD/ICDM papers.
  • Bonus: derive theoretical properties of the algorithms.
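
One common way to compare algorithms fairly is to score each of them on the same cross-validation folds. The sketch below is only a minimal illustration under assumed names and synthetic data: it compares a mean-only baseline with a one-variable least-squares fit using k-fold mean squared error. In an actual project, the data would come from the chosen competition and the two `fit_*` functions would be replaced by the algorithms under study.

```python
import random


def fit_mean(xs, ys):
    """Baseline: always predict the training mean of y."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least squares for a single feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def kfold_mse(fit, xs, ys, k=5):
    """Average held-out mean squared error over k interleaved folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        err = sum((model(xs[i]) - ys[i]) ** 2 for i in test_idx) / len(test_idx)
        fold_errors.append(err)
    return sum(fold_errors) / k


def make_data(n=200, seed=0):
    """Synthetic data (an assumption for this sketch): y = 2x + 1 + unit Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_data()
    for name, fit in [("mean baseline", fit_mean), ("least squares", fit_line)]:
        print(f"{name:14s} 5-fold MSE = {kfold_mse(fit, xs, ys):.3f}")
```

Using identical folds for every method keeps the comparison paired, so differences in the scores reflect the methods and not the split.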

Important Dates

  • Proposal due: Feb 19, 2016
  • Intermediate project report due: Mar 25, 2016 (previously Mar 18, 2016)
  • Final project presentation: Apr 26 and 28, 2016
  • Final project report due: Apr 28, 2016

All project deadlines are 11:59PM EST.



The code (proposal, report, and program) should be maintained in GitHub, with commits reflecting the efforts from each team member.

Project Proposal

Project proposal (1-2 pages) should cover:

  • Project title
  • Team members
  • Description of the problem.
  • A brief survey of what has been done and how the proposed work is different.
  • Preliminary plan (milestones)
  • References (a list of papers)

Intermediate Project Report

The intermediate project report (3-5 pages) should cover:

  • A high-quality introduction and problem description
  • Description of the data used in the project
  • What you have done so far
  • What remains to be done

Final Project Report

The final project report (10-15 pages) should cover:

  • Introduction, including a summary of the problem, previous work, methods, and results
  • Problem description, including a detailed description of the problem you try to address
  • Methodology
    • Theory: details of technical proof.
    • Application: detailed description of methods used
  • Results, including a detailed description of your observations from the experiments
  • Conclusions and future work, including a brief summary of the main contributions of the project, the lessons you learned from it, and a list of potential future work.

Final Project Presentation


  • 10 minutes
  • Describe the motivation and the problem
  • Briefly present the intuition behind the technical details (methodology)
    • Theory: Algorithm and proof sketch of major properties
    • Application: Algorithm and results (you can use a demo)
  • Schedule