Zhe's milestone (undetermined) #17

Open
2 of 7 tasks
azhe825 opened this issue Jun 21, 2016 · 3 comments
@azhe825

azhe825 commented Jun 21, 2016

Learning-based Systematic Literature Review (for Ph.D.)

Current Baseline Approach:
Searching + filtering, then linear review.

Challenges:

  1. Reduce review cost
  2. Imbalance (target class is always minority)
  3. Utilize knowledge from users

To Do:

  • Work bench
    • get data (linear review, time recorded --- baseline approach)
    • construct citation matrix
  • Basic method
    • searching (Elasticsearch) [1, 2, 3]
    • learning (lexical analysis with term frequency, L2 normalization, SVM...)
    • active learning (uncertainty sampling, certainty sampling) [1, 2]
    • data balancing? need to test. [2]
  • Obtain a better initial training set for active learning with fewer reviews [1, 2]
    • baseline: random sampling
    • clustering based on lexical analysis [1, 2]
    • spectral clustering based on citation matrix (citemap?) (Possible expansion: utilize citation matrix on learning) [1, 2]
  • Get the user involved [3]
    • show important features and let the user mask them [3]
    • allow the user to re-review important documents (support vectors) [3]
    • let the user explore the clusters [3]
  • Visualization of Learning Result (not sure how this can help right now) [1, 3]
    • pretty graphs (d3.js, kibana)
    • present results in each cluster
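
A minimal sketch of the "learning" and "active learning" items above, using only the Python standard library. It shows term-frequency features with L2 normalization and the two query strategies named in the list. The function names, the whitespace tokenizer, and the example scores are assumptions made for illustration; in practice the scores would come from a trained classifier such as an SVM, which is not included here.

```python
import math
from collections import Counter


def tf_l2(text):
    """Term-frequency vector for one document, scaled to unit L2 norm."""
    counts = Counter(text.lower().split())  # naive whitespace tokenizer (assumption)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()} if norm else {}


def uncertainty_sampling(scores, k=1):
    """Pick the k unlabeled documents whose relevance score is closest to 0.5."""
    return sorted(scores, key=lambda doc: abs(scores[doc] - 0.5))[:k]


def certainty_sampling(scores, k=1):
    """Pick the k unlabeled documents the model scores as most likely relevant."""
    return sorted(scores, key=lambda doc: -scores[doc])[:k]


# Hypothetical relevance scores in [0, 1] for three unlabeled papers.
scores = {"paper_a": 0.92, "paper_b": 0.48, "paper_c": 0.10}
vec = tf_l2("active learning for active review")

next_uncertain = uncertainty_sampling(scores)  # candidate the model is least sure about
next_certain = certainty_sampling(scores)      # candidate the model rates most relevant
```

Uncertainty sampling spends reviews on borderline documents to improve the model, while certainty sampling spends them on likely targets to find relevant papers sooner; which one reduces review cost more is the kind of question the to-do list leaves to testing.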
@azhe825
Author

azhe825 commented Oct 6, 2016

Implementations of machine-assisted reading

  1. (Multi-objective) optimization: reduce the number of evaluations by ranking candidates. Ongoing.
  2. Defect prediction:
    • standard machine-assisted reading on projects with no labeled data (test->train->rank->test->train->rank->...)
    • updating with machine-assisted reading on new versions of the code
    • reuse with machine-assisted reading on similar projects
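
The test->train->rank cycle above can be sketched roughly as follows. This is only an illustration under stated assumptions: `review_loop`, `overlap_score`, the word-overlap scorer, and the toy module descriptions are all hypothetical, and the "train" step is folded into the scorer, which simply reuses the labels collected so far rather than fitting a real model.

```python
def overlap_score(doc, labeled):
    """Score a candidate by word overlap with labeled items (+ for targets, - otherwise)."""
    words = set(doc.split())
    score = 0
    for other, is_target in labeled.items():
        shared = len(words & set(other.split()))
        score += shared if is_target else -shared
    return score


def review_loop(docs, oracle, score_fn, budget):
    """Repeat rank -> test -> (re)train until the review budget is spent."""
    labeled = {}
    unlabeled = list(docs)
    for _ in range(min(budget, len(unlabeled))):
        # rank: best-scoring candidate first, given the labels collected so far
        unlabeled.sort(key=lambda d: score_fn(d, labeled), reverse=True)
        pick = unlabeled.pop(0)
        # test: ask the oracle (a human reviewer or a test run) for the true label
        labeled[pick] = oracle(pick)
        # train: implicit here, because score_fn reads the updated labels next round
    return labeled


# Toy candidates; the oracle marks anything mentioning "crash" as a target.
docs = [
    "null pointer crash in parser",
    "crash in network parser",
    "update readme typo",
    "fix typo in docs",
    "parser crash on empty input",
]
labeled = review_loop(docs, lambda d: "crash" in d, overlap_score, budget=3)
found = sum(labeled.values())  # number of targets found within the budget
```

With no labeled data the first pick is arbitrary (here, simply the first item), which is why the choice of initial training set matters for the total number of reviews.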

@timm timm assigned timm and azhe825 Oct 12, 2016
@timm
Contributor

timm commented Oct 12, 2016

Let's talk about this next time we meet.

@azhe825
Author

azhe825 commented Oct 12, 2016

Sure
