Super naive conference keyword trend analysis tool. It:
- scrapes DBLP conference pages for paper titles by year
- does simple stemming and computes word counts by conference year
- clusters trends for each keyword using k-means with a configurable number of clusters
- generates an HTML page that visualizes the clusters and the top K keywords in each cluster
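
To make the stemming and counting step above concrete, here is a minimal sketch. It is not the code in parse.py: the stop-word list, the suffix-stripping rule, and the names `naive_stem` and `count_words_by_year` are all assumptions made for illustration, and the titles are made up.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; the real tool may use a different one or none.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def naive_stem(word):
    """Crude suffix stripping; a stand-in for whatever stemmer the tool uses."""
    for suffix in ("ing", "es", "s"):
        # Only strip when at least three characters of stem remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def count_words_by_year(titles_by_year):
    """Map each year to a Counter of stemmed, non-stop-word title words."""
    counts = defaultdict(Counter)
    for year, titles in titles_by_year.items():
        for title in titles:
            for word in re.findall(r"[a-z]+", title.lower()):
                if word not in STOPWORDS:
                    counts[year][naive_stem(word)] += 1
    return dict(counts)


# Made-up titles, for illustration only.
example = {
    2010: ["Indexing Graphs", "Graph Queries in the Cloud"],
    2011: ["Scalable Graph Indexes"],
}
counts = count_words_by_year(example)
```

With this rule, "Graphs" and "Graph" fall into the same bucket, as do "Indexing" and "Indexes". A real stemmer would handle many more cases.
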

Install the dependencies:

    pip install click pygg wuutils

There are three steps:

    # scrape titles from DBLP pages
    python scrapedblp.py --help

    # turn titles into word counts
    python parse.py --help

    # turn word counts into HTML pages with pictures
    python cluster.py --help
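
For a rough idea of what clustering keyword trends means, here is a small pure-Python k-means sketch. It does not reproduce cluster.py: the normalization, the deterministic farthest-first initialization, and all names below are assumptions chosen so the example runs without extra dependencies, and the counts are made up.

```python
def normalize(series):
    """Scale a count series to sum to 1 so clusters reflect shape, not volume."""
    total = sum(series)
    return [x / total for x in series] if total else list(series)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with farthest-first initialization.

    Returns one cluster index per point.
    """
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far (an assumption, not the tool's).
    centers = [list(points[0])]
    while len(centers) < k:
        centers.append(list(max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centers),
        )))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up counts per conference year for four keywords.
trends = {
    "graph": [1, 2, 4, 8],
    "cloud": [2, 4, 8, 16],
    "xml": [8, 4, 2, 1],
    "olap": [16, 8, 4, 2],
}
keywords = list(trends)
labels = kmeans([normalize(trends[w]) for w in keywords], k=2)
clusters = dict(zip(keywords, labels))
```

Here the two rising keywords end up in one cluster and the two falling ones in the other. Normalizing first means a keyword is grouped by the shape of its trend, not by how often it appears overall.
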