
Refine stars algorithm (was "hidden gems") #560

Closed
cgome opened this issue May 10, 2012 · 44 comments
Labels
3: Mostly done · 5: Enhancement (build on existing function, adding or improving functionality significantly) · Area: Logbooks (ticking & tick lists) · Area: Statistics (climber or node statistics such as qty of x per node, profile hardest sends, ...) · Social

Comments

@cgome (Member) commented May 10, 2012

Ability to find high quality, low popularity routes at a crag (could be part of search, crag guru, crag stats, other?)

@brendanheywood (Member)

This would just be another facet to search on.

I don't think it has enough value to be a new list, especially given we may merge icons and classics.


@ghost ghost assigned scd Mar 5, 2013
@willmonks

In case this is still an open issue: I often want to sort and/or search routes by quality rating. For my recent Moonarie trip I ended up checking dozens of routes' quality ratings by laboriously navigating to each route page and eyeballing the quality histogram, to help decide which routes to prioritise once I got there. Searching or sorting by quality could have saved a lot of time! (Am I missing something?)

I find a quality check really useful because it nearly always throws up surprises compared to relying only on the guidebook/website stars (under-starred classics, over-starred junk). I do check the "nearby icons" list, but if you're only after a particular grade bracket it gives only one or two very predictable suggestions.

I found the old quality score, between 1 (crap) and 7 (megaclassic), very useful until it got dropped maybe 5 years ago (?). I do remember it required a minimum of 3 user quality ratings before a route got any score at all, which I didn't like: even if only one person has done the route, if they thought it was megaclassic that is still valuable to know, especially if it's a person who you know has good taste in routes.
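The gating behaviour Will describes could be sketched like this. This is a hypothetical illustration only, with invented function names, not theCrag's actual code:

```python
# Hypothetical sketch of the old 1 (crap) .. 7 (megaclassic) quality score,
# computed as the mean of user ratings, with the minimum-ratings gate Will
# objects to. Names and signatures are illustrative assumptions.

def quality_score(ratings, min_ratings=3):
    """Mean rating, or None if there are too few ratings to count."""
    if len(ratings) < min_ratings:
        return None  # the behaviour Will dislikes: one rating yields no score
    return sum(ratings) / len(ratings)

def quality_score_ungated(ratings):
    """Will's suggestion: even a single rating produces a score."""
    return sum(ratings) / len(ratings) if ratings else None
```

Under the ungated version, a single "megaclassic" vote would surface immediately instead of the route showing no score at all.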

It seems a bit incongruous that you can search by so many factors (rock type, sun/shade, walk-in, stars, etc.) but such a key factor as quality is not searchable. Cheers!

@scd (Member) commented Jun 26, 2014

Will's process seems like a massive time waster, so with 15 minutes' work I have added a sort by quality (live fix) to the faceted search URL. Unfortunately I do not have time to fix the UI, but Will can use it now if he knows how to write the URL. Here is an example:

http://www.thecrag.com/climbing/australia/moonarie/routes?sortby=quality,desc

Use the UI to get to the point where you sort by popularity, then change the word 'popularity' to 'quality' in the URL.

@brendanheywood (Member)

Linking this one to #1179 to get cleaned up properly and add the UI for sorting / filtering

@brendanheywood (Member)

I found the old quality score, between 1 (crap) and 7 (megaclassic), to be very useful, until it got dropped maybe 5 years ago (?).

Will, I'm a bit confused by this comment: the quality score is still in use and alive and kicking. Is it not appearing somewhere you would expect it to?

Also, Simon, this new sort method seems to sort on the consensus rating (i.e. the average crap .. megaclassic, which is still in use), not on the star rating, which is why some 2-star routes are mingled with the 3-star routes when you visit that URL (good to see they are mostly aligned, though). Is there also value in allowing sort by stars (i.e. the author's rating) versus consensus?

@scd (Member) commented Jun 26, 2014

Actually stars is already in there, so it probably looks more consistent. Stars takes into account publisher stars and requires a minimum number of ratings:

http://www.thecrag.com/climbing/australia/moonarie/routes?sortby=stars,desc

so it is probably better. But I still think there is value in also being able to sort purely on consensus.

Will, we value your feedback here!

@brendanheywood (Member)

Cool. Isn't it the case that for routes with lots of ratings the consensus will overwhelm the original star rating, so for an old popular crag, stars === consensus anyway? If so, then when I get around to #1179 I'd say keep it simple and just offer the option to sort by stars, to save confusion.

I'd also be happy to drop the minimum number of ratings.

@willmonks

Simon: awesome!! thanks I will use that regularly. Takes 2 seconds, brilliant!

The Moonarie list illustrates my point about the minimum number of quality ratings required. For example, Endless Love (http://www.thecrag.com/climbing/australia/moonarie/route/13597471) doesn't appear until page 2 of that list, because it's only had one user rating. However, that was a "megaclassic" rating (and boy does it look mega, and JT knows his classics), so IMO even a single rating should be given some weight in the quality rankings...

Brendan: the route page previously gave a single number reflecting the quality consensus, say 7.0 if everyone thinks it's mega, 5.0 for "very good", or 5.7 if there were a few votes spread around, etc. That doesn't appear any more (anywhere I can see); I think it disappeared around 2007-9? Maybe the quality rating histogram replaced it?

Knowing the number helps with benchmarking. E.g. the #1 quality route at Moonarie is not equally good as the #1 quality route at a crap crag, but the only way to work this out (if, say, you are choosing which crag/sector to go to) is the consensus quality number. (Sorting by stars can be very inaccurate given that guidebook authors often do "stars by crag", where 3 stars means "best route here even if it's junk"; e.g. 3 stars at Mt Keira is rarely 1 star anywhere else.) Or it might be good to know that even the 10th-quality route at Moonarie sector A is still a 6.0 (i.e. that sector has heaps of classics), whereas at sector B the 10th-quality route might be a 4.0, which helps you decide how much time to spend at each sector. Sorting routes into a list by quality is a great start but doesn't permit these comparisons.

The number probably doesn't need to appear on the route page (now that the histogram is there, which is much more illustrative of the diversity of opinion), but perhaps the consensus quality number could appear in the route listing returned for any search?

Also, yes, the guidebook author stars correspond OK with the consensus rating at Moonarie, but that list definitely brings out some hidden gems: e.g. Buzzard Arete & Reality Factor have 0 stars but are in the top 50 based on consensus quality. And conversely it weeds out the over-starred routes: e.g. Orion falls below dozens of lesser-starred routes. The results would probably be even more interesting at busier crags (80% of Moonarie routes haven't had enough ascents logged yet to sort by consensus quality)... you can be sure I will be checking this for other crags as I plan future trips! ;)

@willmonks

Simon: "But I still think there is value in also being able to sort purely on consensus" - definitely agree with this. Sorting by stars rates "2-3 star" routes highly even when they have had NO ascents (e.g. Barren of Emotion). As a first-time or rare visitor to a crag, if none of the locals do a route, that's a bit of a giveaway that it shouldn't be high on your list. Sorting by consensus filters those ones out (plus finds the hidden gems etc. as above).

@brendanheywood (Member)

Just a thought: it seems that the consensus rating has consistently more value than the 'registered' stars, which are after all simply the opinion of whoever added the route, and quite often a route has no stars by omission rather than because it isn't worth any stars.

If we tweak the stars algorithm so it uses whatever is available, whether published book stars, editing stars, or ascent ratings (even if only a single one), then we'd remove some of the complexity around this and only need one sort-by-quality method instead of two.

Looking at assessRouteQuality, if we just comment out line 3132, where it checks for 20+ ascent ratings, then I think we'll get exactly this. An ascent rating would be counted as one vote, an editor's stars as two votes, and a guidebook stars citation as five votes, regardless of ascent count. In the case above, where there is only a single ascent rating, that will be it, rather than being blank as it is now. And in a case where the original guidebook author gave a route 1 star but it has since had 30 ticks saying it's a classic, the guidebook will get outvoted.
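Brendan's weighting scheme could be sketched as follows. This is a hypothetical illustration, not theCrag's actual assessRouteQuality code: the names, signatures, and the assumption that all inputs are already expressed on the 0-3 star scale are mine:

```python
# Sketch of the proposed vote weighting: each ascent rating counts as 1 vote,
# an editor's star rating as 2 votes, and a guidebook star citation as 5
# votes, with no minimum-ascent gate. All inputs assumed on a 0-3 star scale.

ASCENT_WEIGHT, EDITOR_WEIGHT, GUIDEBOOK_WEIGHT = 1, 2, 5

def consensus_stars(ascent_stars, editor_stars=None, guidebook_stars=None):
    """Weighted average of whatever star opinions are available, or None."""
    votes = [(s, ASCENT_WEIGHT) for s in ascent_stars]
    if editor_stars is not None:
        votes.append((editor_stars, EDITOR_WEIGHT))
    if guidebook_stars is not None:
        votes.append((guidebook_stars, GUIDEBOOK_WEIGHT))
    if not votes:
        return None
    total_weight = sum(w for _, w in votes)
    return sum(s * w for s, w in votes) / total_weight
```

In Brendan's example, a 1-star guidebook citation (5 votes) against 30 ticks calling the route a classic (30 votes at 3 stars) yields (5×1 + 30×3) / 35 ≈ 2.7 stars: the guidebook is outvoted.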

@brendanheywood brendanheywood modified the milestones: Release 38 - Topo rewrite, Unscheduled issues Jul 7, 2014
@scd (Member) commented Jul 7, 2014

I am ok with making this change and seeing how it goes. I guess I was predicting problems at edge cases by putting in limits. Maybe it is just ok.

From memory there is an issue with stars faceted search, where the algorithms did not exactly match so you would get two stars appearing above three stars. Part of this work should be to align the algorithms.

Also we should implement the one person one vote for quality.

In summary:

  • remove limits for consensus stars
  • align the star faceted-search algorithm with the general star assessment algorithm
  • one person one vote
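The third point, one person one vote, could be sketched like this (a hypothetical illustration; it assumes each quality rating carries a climber id and a timestamp, and that the most recent rating per climber is the one that should count):

```python
# "One person one vote": if a climber has rated the same route on several
# ascents, keep only their most recent rating before computing consensus.
# Data shape (climber_id, timestamp, score) is an assumption for illustration.

def one_vote_per_person(ratings):
    """Collapse (climber_id, timestamp, score) tuples to one score per climber."""
    latest = {}
    for climber, ts, score in ratings:
        if climber not in latest or ts > latest[climber][0]:
            latest[climber] = (ts, score)
    return [score for _, score in latest.values()]
```

The deduplicated list would then feed into whatever averaging the consensus algorithm uses.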

@brendanheywood brendanheywood changed the title hidden gems Refine stars algorithm (was "hidden gems") Jul 7, 2014
@scd scd modified the milestones: Release 40 - profile update, Release 38 - Stats and annotation titles Jul 28, 2014
@brendanheywood (Member)

As part of #1179 the sort-by is now exposed in the UI; see the work in progress here:

http://dev.thecrag.com/climbing/australia/moonarie/routes/?sortby=quality,desc

Not sure how we should expose the quality, as this is a distribution, not a discrete value. Have we done something on this front before?

@scd (Member) commented Oct 12, 2014

Awesome work. Shall we standardise all facets to full page width?

I am happy for the discrete value vs continuous quality discussion to continue. What you have done is great.

@scd (Member) commented Oct 12, 2014

Loving the photos in faceted search.

@brendanheywood (Member)

Yeah, re 'discrete value vs continuous quality': I don't want to get bogged down in the discussion about changing the algorithm (although that is important and I think worth doing; it even came up today while I was out bouldering). I just want to focus on making the 'quality' field visible in some way. You can now sort by it, and it looks roughly right because all the 3-star routes are fairly correlated and are at the top, but sometimes they diverge, so it looks funny. We just need an extra column with some value that represents the 'quality' data.

How about we take the same data that is used to build this quality chart:

http://dev.thecrag.com/climbing/australia/moonarie/route/13601851
[quality rating distribution chart]

and then stack them together horizontally, similar to the grade bar and style bar charts, or maybe as a little sparkline, and add that as a new 'quality' column?

@brendanheywood brendanheywood assigned brendanheywood and unassigned scd Oct 12, 2014
@scd (Member) commented Oct 12, 2014

The quality indicator is really fine-grained feedback on the stars. In other words, if you wanted to compare three-star climbs, then going to the quality field has real value. Maybe we should just go with a number for now and see what the community thinks?

@brendanheywood (Member)

Not really; that only holds if there are lots of ticks, in which case the stars algorithm kicks in (when there are > 20 ticks) and the two end up in sync, as the stars are derived from the quality. But before that they can be quite divergent. So ultimately we should just fix the algorithm so they are always consistent and then merge the two concepts; there isn't value in having two quality ratings.

Also there isn't any reason why, under the hood, stars has to be an integer even though it displays as one, so sort by stars should cover the scenario you mentioned above.

So I guess instead of trying to avoid getting 'bogged down' we should just bite the bullet, fix this properly, and do the 3 points you came up with on 7 Jul, and then this will all go away.

@willmonks

Hi guys,
Thanks for getting on to this; I for one will appreciate this functionality!

There seem to be a few subtleties which are lost on me... so I'll just say that Brendan's horizontally stacked quality chart sounds cool. I'll also repeat my vote for disclosing the "average" quality number, to a single decimal place. E.g. looking at the above chart, Robbing Hood gets 7x "mega", 4x "classic", 1x "very good", and 1x "good", so it's a 6.3. Having this number presented beside the horizontally stacked quality chart would be very useful IMO.
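Will's 6.3 checks out as a plain weighted mean over the histogram, assuming the thread's 1 (crap) .. 7 (megaclassic) scale, so good = 4, very good = 5, classic = 6, mega = 7:

```python
# Verifying Will's arithmetic for Robbing Hood's quality histogram:
# 7x mega(7), 4x classic(6), 1x very good(5), 1x good(4).
histogram = {7: 7, 6: 4, 5: 1, 4: 1}  # rating -> vote count

total_votes = sum(histogram.values())                      # 13 votes
mean = sum(r * n for r, n in histogram.items()) / total_votes  # 82 / 13
print(round(mean, 1))  # 6.3, matching Will's figure
```

So the single decimal Will asks for is just this mean, computed from the same data that already drives the histogram chart.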

Then again, Adam Demmert thinks I am completely OCD about quality, so maybe I'm not a representative user and you should ignore me ;)

Cheers
Will


@brendanheywood (Member)

@willmonks what is your perspective on the 'stars' vs 'quality' from a conceptual level? Do you think there is any value at all in having two concepts for the same thing? Or should we just fix the stars so that they are based more on consensus feedback?

@brendanheywood (Member)

Simon, it should be fairly easy to wrangle up a list of ascents of routes where the same person has also added a star rating, and directly compare these. I just had a quick go but am missing the join to get the star rating from each citation; any hints?

@willmonks

Hi guys, just bumping my concerns from Dec 2014 about the thresholds. LOTS of "good" (42%-54% quality score) routes are still languishing with no stars on theCrag.

This is also being noted by other users, who agree that there is a direct correlation: "good" = 1 star, "very good" = 2 stars, "classic" = 3 stars. E.g. see the comments on these routes:
http://www.thecrag.com/climbing/australia/grampians/the-tower/route/15183319
http://www.thecrag.com/climbing/australia/blue-mountains/western-tier-shady-side-lower/route/722745879

So I'd still really like to see the 55/62/72 thresholds for converting quality ratings to consensus stars changed to 41.6/58.3/75. But I will go away again now ;)
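The change Will proposes could be sketched as a simple cut-off lookup. The threshold values come from this thread; the function name, and the assumption that a route earns a star when its quality percentage meets or exceeds a cut-off, are mine:

```python
# Converting a consensus quality percentage into 0-3 stars, comparing the
# current thresholds with Will's proposed ones (values from this thread).
from bisect import bisect_right

CURRENT_THRESHOLDS = [55, 62, 72]        # % cut-offs for 1, 2, 3 stars
PROPOSED_THRESHOLDS = [41.6, 58.3, 75]   # aligns good/very good/classic

def stars_from_quality(pct, thresholds):
    """Number of stars (0-3) awarded for a quality percentage."""
    return bisect_right(thresholds, pct)
```

Under the current cut-offs a "good" route at 50% gets no stars at all, whereas under the proposed cut-offs it gets the 1 star Will argues it deserves.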

@scd (Member) commented Mar 30, 2016

I think we agree. It is now just a matter of finding the time to fix it. This fix is fairly high priority, but there are some other things that keep jumping in the way.

@willmonks

Excellent, thanks!

@scd (Member) commented May 12, 2016

@willmonks I have made some updates to the star algorithm on our development machine dev.thecrag.com

See issue #2171 for discussion on the specifics of what has been done.

@brendanheywood please pull from repo

@scd scd modified the milestones: Release 49 - Update stars algorithm, Release 50 - Clean urls May 23, 2016
@scd (Member) commented Jul 19, 2016

I think this is done

@scd scd closed this as completed Jul 19, 2016
@scd scd modified the milestones: Release 50 - Clean urls + I18N, Release 50 - initial translations release Jul 19, 2016
@willmonks

Thanks heaps guys. I think the algorithm is working exactly as it should.

I saw Simey complaining on Chockstone about theCrag "throwing stars around like confetti" at Araps... but when you look at the climbers' ratings of the routes, I think the algorithm is actually working perfectly. If Simey or any one person thinks a route is junk, but everyone else calls it "very good", well, that's the exact definition of a 2-star route.

Keep up the good work!

cheers,
Will


@scd (Member) commented Jul 20, 2016

Lucky I don't really read the forums, otherwise I might engage in some of these comments.

The old system probably aligned with guidebooks more; however, I think we can move on. Generally people hate change, and we end up getting lots of negative comments simply because it is different.

This is the way it will be done for the next couple of years. If somebody else wants to put in a compelling case to change it again after we have had sufficient time for the community to digest all the issues then we may change it again. But for now I think everything that was implemented was transparent, simple to explain and logical. @willmonks thanks for your investigation and input on this issue :)

@brendanheywood (Member)

Thanks @willmonks, yeah I think it is working well too. However, the 'nearby icons' logic, which leveraged this, now needs some more thought and tweaking accordingly; see also #1456.

No branches or pull requests

4 participants