"Hot" issues #42

Closed
brson opened this Issue Jun 6, 2016 · 5 comments

brson commented Jun 6, 2016

Develop a metric that quantifies the current activity level on all issues/PRs across the orgs and lists them in order on one page.

Then, as an extension, take those same issues and extract a word map that shows the hot topics as English words.

brson commented Jun 6, 2016

anp commented Jun 6, 2016

I'm sure there are many ways to define "hot". Without looking at any prior work in that area, I can think of a few items, ranging from super dumb to overkill.

  1. Pick a window (last 30 days, last 7 days, etc.) and rank each issue/PR by its number of comments. (measure of raw activity)
  2. Same as 1, but also rank by the number of commenters and the number of comments per commenter. (measure of how much of the community may be affected)
  3. Measure the time intervals between all comments in the given period. (measure of heatedness of discussion)
  4. Count the characters in comments and add weight for total characters, average comment length, etc. (measure of ~weight of issues)
  5. And so on and so forth.

I suspect that a level of complexity between 2 & 3 above will be appropriate, but we'd need to try an implementation to be sure. Other ideas are of course welcome.
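To make options 1 and 2 concrete, here is a rough sketch using only the standard library. Everything in it is an assumption for illustration: the `Comment` and `Activity` types, the field names, and the choice of Unix-second timestamps are hypothetical, not the scraper's actual schema.

```rust
use std::collections::{HashMap, HashSet};

/// One comment on an issue or PR. Hypothetical shape, not the real schema.
#[derive(Debug, Clone)]
pub struct Comment {
    pub issue: u32,
    pub author: String,
    /// Unix timestamp in seconds.
    pub created_at: i64,
}

/// Per-issue activity inside the chosen window.
#[derive(Debug, Clone, PartialEq)]
pub struct Activity {
    pub issue: u32,
    pub comments: usize,
    pub commenters: usize,
}

impl Activity {
    /// Option 2's "comments per commenter" figure.
    pub fn comments_per_commenter(&self) -> f64 {
        if self.commenters == 0 {
            0.0
        } else {
            self.comments as f64 / self.commenters as f64
        }
    }
}

/// Rank issues by comment count within `window_secs` before `now`.
/// Ties are broken by distinct commenters, then by issue number.
pub fn rank_by_activity(comments: &[Comment], now: i64, window_secs: i64) -> Vec<Activity> {
    let cutoff = now - window_secs;
    let mut per_issue: HashMap<u32, (usize, HashSet<&str>)> = HashMap::new();

    for c in comments
        .iter()
        .filter(|c| c.created_at >= cutoff && c.created_at <= now)
    {
        let entry = per_issue
            .entry(c.issue)
            .or_insert_with(|| (0, HashSet::new()));
        entry.0 += 1;
        entry.1.insert(c.author.as_str());
    }

    let mut ranked: Vec<Activity> = per_issue
        .into_iter()
        .map(|(issue, (count, authors))| Activity {
            issue,
            comments: count,
            commenters: authors.len(),
        })
        .collect();

    ranked.sort_by(|a, b| {
        b.comments
            .cmp(&a.comments)
            .then(b.commenters.cmp(&a.commenters))
            .then(a.issue.cmp(&b.issue))
    });
    ranked
}
```

Issues with no comments inside the window simply don't appear in the result. Option 3 (intervals between comments) could be layered on by keeping the timestamps per issue instead of only a count.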

As for a word map, that shouldn't be too difficult. There are existing options like Wordle and various JS/HTML tools. I think I can port the NLP portions (discarding noise words, word counting) to Rust and handle presentation in the JavaScript frontend.
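As a rough idea of what the Rust side (discarding noise words, then counting) might look like, here is a minimal standard-library sketch. The `word_counts` function and the tiny `STOP_WORDS` list are made up for illustration; a real noise-word list would be much longer, and the naive tokenizer splits contractions like "don't" into two pieces.

```rust
use std::collections::{HashMap, HashSet};

/// A deliberately tiny, illustrative noise-word list.
const STOP_WORDS: &[&str] = &[
    "a", "an", "and", "the", "is", "it", "of", "to", "in", "on", "for", "this", "that",
];

/// Count words across all texts, case-insensitively, skipping noise words.
/// Returns (word, count) pairs, most frequent first; ties are alphabetical.
pub fn word_counts(texts: &[&str]) -> Vec<(String, usize)> {
    let stop: HashSet<&str> = STOP_WORDS.iter().copied().collect();
    let mut counts: HashMap<String, usize> = HashMap::new();

    for text in texts {
        // Split on anything that is not a letter or digit.
        for raw in text.split(|ch: char| !ch.is_alphanumeric()) {
            if raw.is_empty() {
                continue;
            }
            let word = raw.to_lowercase();
            if stop.contains(word.as_str()) {
                continue;
            }
            *counts.entry(word).or_insert(0) += 1;
        }
    }

    let mut sorted: Vec<(String, usize)> = counts.into_iter().collect();
    sorted.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    sorted
}
```

The resulting pairs could be serialized and handed to whatever frontend tool draws the word map, with the count driving the font size.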

Any work on this will be blocked on me updating the scraper to handle multiple repos, which I'm probably going to get to tomorrow night.

brson commented Jun 7, 2016

I'd say a 30-day window counting comments + tags + those GitHub link notifications is probably sufficient.
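A sketch of that proposal, assuming the three signals have already been scraped into a flat list of events. The `Event`, `EventKind`, and `Hotness` types are hypothetical names, timestamps are assumed to be Unix seconds, and "tags" are modeled as label events.

```rust
use std::collections::HashMap;

/// The three signals from the proposal. Names are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventKind {
    Comment,
    /// A tag (label) being added or removed.
    Label,
    /// A cross-reference ("link notification") from another issue or PR.
    Mention,
}

#[derive(Debug, Clone, Copy)]
pub struct Event {
    pub issue: u32,
    pub kind: EventKind,
    /// Unix timestamp in seconds.
    pub at: i64,
}

/// Per-issue counts, kept separate so each can be shown as its own column.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct Hotness {
    pub comments: usize,
    pub labels: usize,
    pub mentions: usize,
}

impl Hotness {
    /// Unweighted sum: every event counts once.
    pub fn total(&self) -> usize {
        self.comments + self.labels + self.mentions
    }
}

pub const THIRTY_DAYS_SECS: i64 = 30 * 24 * 60 * 60;

/// Issues with any activity in the 30 days before `now`, hottest first.
/// Ties are broken by issue number so the order is deterministic.
pub fn hot_issues(events: &[Event], now: i64) -> Vec<(u32, Hotness)> {
    let cutoff = now - THIRTY_DAYS_SECS;
    let mut by_issue: HashMap<u32, Hotness> = HashMap::new();

    for e in events.iter().filter(|e| e.at >= cutoff && e.at <= now) {
        let h = by_issue.entry(e.issue).or_default();
        match e.kind {
            EventKind::Comment => h.comments += 1,
            EventKind::Label => h.labels += 1,
            EventKind::Mention => h.mentions += 1,
        }
    }

    let mut ranked: Vec<(u32, Hotness)> = by_issue.into_iter().collect();
    ranked.sort_by(|a, b| b.1.total().cmp(&a.1.total()).then(a.0.cmp(&b.0)));
    ranked
}
```

Keeping the three counts separate rather than storing only the sum makes it cheap to experiment with a different `total` later without rescanning the events.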

anp commented Jun 26, 2016

I've added some super basic functionality in 778acbe. Still trying to find a way to efficiently query the database for issue mentions -- I think it might require a change to scraping to process those on the way in, rather than at query time.

That said, looking at the numbers, I don't think that tag activity and mentions will significantly alter the ranking unless they're weighted differently.

anp commented Jul 1, 2016

Note to self: add column(s) for each metric to make sorting obvious-er.

anp closed this in 7b20084 on Jul 3, 2016

anoadragon453 pushed a commit to matrix-org/mscbot that referenced this issue Jul 6, 2018
