Home
-
GUI
- Access ACM DL data
- Operate over any subset of conferences or journals selected by the user
- Allow users to name topics
- Generate a Noam-style visualization for each data set
- Explore additional methods for visualization
- Allow users with logins to save their analyses for other users to view
-
TOPIC: NAME GENERATION
- Generate names for topics automatically (in GUI, allow user override)
-
TOPIC: SEMI-AUTOMATED TOPIC CONSTRUCTION
- Allow users to generate topics using a GUI (split, merge; see ITM)
-
TOPIC: CHOOSING THE RIGHT NUMBER OF TOPICS
- See Noam-style AIC fitting
- For a casual user, can we generate a heuristic or rules of thumb for the right number of topics? Possible inputs: number of conferences, number of papers, different priors for different areas (e.g., Networks vs. PL). Will this work?
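A minimal sketch of AIC-based selection of the topic count, to make the idea above concrete. These notes do not spell out the exact fitting procedure, so everything here is an assumption: the function names are hypothetical, the parameter count uses only the topic-word distributions, and the log-likelihoods in the usage example are made-up placeholders rather than results from any data set.

```python
from typing import Dict


def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * num_params - 2 * log_likelihood


def lda_param_count(num_topics: int, vocab_size: int) -> int:
    """Assumed parameter count: each topic is a distribution over the
    vocabulary, which has vocab_size - 1 free parameters."""
    return num_topics * (vocab_size - 1)


def best_num_topics(log_likelihoods: Dict[int, float], vocab_size: int) -> int:
    """Return the topic count with the lowest AIC.

    log_likelihoods maps a candidate topic count to the log-likelihood of
    a model fitted with that many topics (fitting happens elsewhere).
    """
    scores = {
        k: aic(ll, lda_param_count(k, vocab_size))
        for k, ll in log_likelihoods.items()
    }
    return min(scores, key=scores.get)


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    candidates = {10: -500000.0, 20: -480000.0, 40: -475000.0}
    print(best_num_topics(candidates, vocab_size=1000))
```

A rule of thumb for casual users could then be checked against this: run the selection on several subsets and see whether the chosen count tracks the number of papers or conferences.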
-
TOPIC: ALLOW USERS TO SELECT MORE ADVANCED MODELS
- Dynamic topic models
- How do we visualize the results?
- Can we compare the results to non-dynamic models? How do we analyze the impact?
- Track the influence of papers forward to other papers
-
TOPIC: COMBINE CORRELATED TOPIC MODELS AND DYNAMIC TOPIC MODELS
-
TOPIC: REPLACE THE ACM CLASSIFICATION SYSTEM.
- Generate a topic model for all of the ACM with names. Justify your decisions.
-
TOPIC: REPLACE THE ACM SEARCH AND/OR RELATED WORK SEARCH
- Combine topic models and citation and co-author data to improve related work search
-
TOPIC: SUBMIT A PAPER OR COLLECTION OF PAPERS
- Return a person
-
TOPIC: SUBMIT A SET OF PAPERS, ONE PER PERSON; DETERMINE THE OVERLAP WITH A CONFERENCE
- Useful for seeing if your PC covers the topics of your conference
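One possible way to sketch the PC coverage check, assuming a topic model has already produced a topic distribution for the conference and one for each PC member's submitted paper. The function name, the threshold, and the "covered if any member exceeds the threshold" rule are all assumptions made for illustration, not a design taken from these notes.

```python
from typing import List, Sequence


def uncovered_topics(
    conference: Sequence[float],
    pc_members: Sequence[Sequence[float]],
    threshold: float = 0.1,
) -> List[int]:
    """Return indices of topics that matter to the conference but that no
    PC member's paper covers.

    A topic "matters" if its conference weight is at least `threshold`; it
    is "covered" if at least one PC member's weight on it reaches
    `threshold`. Both rules are illustrative assumptions.
    """
    gaps = []
    for topic, weight in enumerate(conference):
        if weight < threshold:
            continue  # topic is marginal for this conference
        best = max((member[topic] for member in pc_members), default=0.0)
        if best < threshold:
            gaps.append(topic)
    return gaps


if __name__ == "__main__":
    # Toy distributions over three topics; the values are made up.
    conference = [0.5, 0.3, 0.2]
    pc = [[0.9, 0.1, 0.0], [0.8, 0.15, 0.05]]
    print(uncovered_topics(conference, pc, threshold=0.2))
```

With these toy inputs, both PC members concentrate on topic 0, so topics 1 and 2 are reported as gaps.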
-
NOTES:
- Have students sign a form declaring they will not release ACM DL data and will take appropriate measures to protect its privacy
- How will we get and pay for the cycles?
-
Code
- Error handling in run_lda.sh. More parameters in run_lda.sh (e.g., select different data sets).
- Parameterize data sources
- current directory layout
- ACM data
- http://adsabs.harvard.edu/
- SIGCOMM
-
LDAvis integration
-
Check out ITM.
- User interface for introducing new topics/killing old topics
-
Check out DTM Dynamic Topic Models.
- Explore topics being born, topics dying, evolving topic levels.
- DIM: Document Influence Model. Tracks which papers are influential
-
Check out CTM Correlated topic models.
-
Projects/extensions
- ACM Classifier inference
- Researcher models
- including citation graph information in citation graph
- doing it for networking research or another domain (use other data sources!)
- a general scraper API for acquiring data/getting info in to a database
Do something with the session data that Michael collected.
We need to migrate the server away from Michael's hosting and to something more permanent at Princeton.
Request project and webspace at Princeton.
- https://csguide.cs.princeton.edu/cs_request_forms_project
- https://csguide.cs.princeton.edu/cs_request_forms_webspace
Request Princeton accounts