Changing the rank of an approved kata #157
Comments
Yeah this is a problem that has been concerning me for a bit now. The scale has definitely shifted from its original intentions. I've approved kata on occasion that I thought deserved to be ranked as more difficult than the public consensus, even if both power and non-power users averaged to the same thing. The hardest ones to rank are the puzzles that are more about finding the solution and not about the code needed to implement the solution. Math ones tend to be very tricky as well because for those who know math incredibly well (like many of our power users) - these tasks are trivial. For now I have upgraded it to a 6 kyu. I'm open to making it a 5 kyu if you can find some additional support for doing so. I would personally like to see this discussion turn into a thought process for how we can better come up with heuristics on grading difficulty. I think the ones listed in the rank voting section on CW are not thorough enough and at this point out-of-date. That discussion could happen here or on the kata discuss thread. |
Hi Jake, and thanks for the quick reply. For the moment, given that notifications on a kata do not always work properly and that this GitHub area has already hosted interesting discussions, I would stay here and link this discussion from the kata itself; all this, of course, in the hope that the discussion will evolve in a more general direction, not just focusing on the current kata. Concerning the difficulty of this one... Well, as I mentioned in the kata itself, the basic task comes from the last 2 digit problem on Euler (the rest of the problem, parsing a given file, is a trivial two-line matter in Python or most other languages). Now, at least in my book, Euler is not where one picks the easiest coding problems that can be found around. Regrettably Euler seems to be partially down ATM, but IIRC that problem was at least 15% difficulty, i.e. not something most users could do in the blink of an eye. What's more, as mentioned above, adding some edge tests (like comparing powers with base 1) complicated things a bit more. Finally, the current stats are that, of about 2100 submissions, fewer than 100 were successful; and a success rate lower than 5% tells a lot (assuming, of course, that the kata is properly explained, has working tests and so on, but I think this was the case, as I also had the support of other users, including native speakers, to quickly fix any problem in that regard). As far as the idea of better heuristics goes, maybe a mix of letting users with at least X honour re-assess a rank and looking at the ratio of successful to total submissions may help a bit more, possibly also warning admins of dubious situations (like, say, an 8 kyu kata with less than 10% successful submissions). |
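To make the "warn admins of dubious situations" idea above concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `KataStats` record, the `is_dubious` helper, and the cut-offs (`easy_kyu=7`, `min_rate=0.10`) are hypothetical and do not correspond to any existing Codewars API or data model.

```python
from dataclasses import dataclass


@dataclass
class KataStats:
    """Hypothetical per-kata counters; not a real Codewars structure."""
    name: str
    kyu: int          # 8 = easiest, 1 = hardest
    submissions: int  # total submitted attempts
    successes: int    # attempts that passed all tests


def success_rate(stats: KataStats) -> float:
    """Fraction of submissions that passed; 0.0 when nothing was submitted."""
    if stats.submissions == 0:
        return 0.0
    return stats.successes / stats.submissions


def is_dubious(stats: KataStats, easy_kyu: int = 7, min_rate: float = 0.10) -> bool:
    """Flag a kata ranked as easy (kyu >= easy_kyu) whose success rate is below min_rate."""
    return stats.kyu >= easy_kyu and success_rate(stats) < min_rate


# Rough figures quoted in the comment above: about 2100 submissions,
# at most 100 of them successful, on a kata that was ranked 7 kyu.
kata = KataStats("example kata", kyu=7, submissions=2100, successes=100)
print(round(success_rate(kata), 3))  # 0.048, i.e. below 5%
print(is_dubious(kata))              # True
```

Such a flag would only point an admin at a kata worth a second look; it says nothing about whether the low rate comes from difficulty, a bug, or a poor description.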
[Separate, partially related thoughts on one's perception of difficulty] I was wondering how often you still go and grind problems on CW, as I assume one's perception can also be biased in that regard: if you get into the habit of solving problems, it is somehow physiological to shift your perception of a given challenge, as if it were easier than it actually is; you get used to it, your mind has some short-cuts to problem-solving that a less frequent solver lacks, and you tend to be impressed only by truly hard problems, thus seeing the others as easier by comparison. And the reverse may also be true, so, in a few words, CW shifting toward lower rankings and harder problems may as well be a good sign, like a sign of maturity and loyalty building around the brand. It could be a double-edged sword (the risk of becoming too elitist is on that road too), but that also brings into consideration what your vision and your hopes for CW are: a totally mass-marketed product and a small niche product usually don't sell that well if put on the same shelf. But that does not mean it can't be done, with due precaution. |
Regarding optimal way to modernize kata ranking - I posted an idea here: |
@MMMAAANNN well, that's going to be interesting. Let's say you have three things, However, it's rather likely that the graph of katas and difficulty gradings of a user will have many circles. In the worst case, (s)he will rate a rather difficult kata easier than one of the easy ones, and all the ratings are useless.
Isn't the given honor based on the difficulty? That would lead to some interesting honor changes. All in all, rating the "difficulty" of something is rather hard. Also, the kata's difficulty might fluctuate between languages (something that should be avoided). Last but not least, the personal assessment of a kata will most-likely change over time, as one gets better. That's why I like the kyu checklist. Maybe that should get reworked. |
The kyu checklist certainly needs to be reworked now, but while the concern about "circular" voting surely makes sense, imho it is almost completely lost once you have not a single user, but a user base numbering in the hundreds ranking them. What would concern me more is the irrational side of the human mind, like the weight of an anchoring bias that could stain the purity of any given poll of answers. Not to mention that the question itself should be as bias-proof as possible (for example, depending on whether you ask "was X easier than Y" or "was X harder than Y", you get different numbers), which is far from trivial. This is why I am personally more in favour of collecting data mostly from user behaviour, not from users' answers: given a basic standard of understandability and working tests, if a kata has a certain ratio of successes to attempts, then you can more effectively infer its perceived difficulty. Because, you know, working on perceived difficulty is what should matter here, not trying to reach some kind of abstract, perfect ideal of difficulty: if a user succeeds in solving a problem 95% of the users couldn't, he has to be rewarded; conversely, a user unable to solve a problem which is ranked as very easy (while very few managed to solve it) only gets frustrated. In this perspective, I would not bother changing the honour score: you still keep it if your solution does not work anymore, so I see little point in running after an ever-changing landscape of katas. Now, that would open a way, WAY bigger point on what honour could possibly mean (I am close to the top, yet I feel quite far from being among the most gifted or skilled on CW, for example) and how one could improve CW's gamification aspects in order to move users toward better behaviours (like, say, focusing on solving problems in a given area, courteously ranking or voting for other coders and so on). But I would leave it for another day. |
I would rather discuss my idea on the ideas page, as it may be off-topic here and may go unnoticed by other people wishing to discuss it. For your example with assessments A > B, B > C, C > A, the three katas will be assessed as having equal difficulty, AFAIK. And I do not see why that is bad. |
Well, my "algorithm" deduced the same. It's not bad per-se, but if you get a big circular graph, everything will have the same difficulty. @GiacomoSorbi: I thought that the graph/decision mentioned by MMMAAANNN was on a per-user base, not per-community base. A complete community base would be rather big and lead to a lot of circular votes. That being said, if the rank of a kata would really change over time, one could start every kata at 1 kyu and simply increase its rank the more people it solved. Only 1% has solved the kata? 1kyu. More than 90% have solved the kata? 8kyu. |
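The "rank follows the solve ratio" idea above can be sketched as a simple threshold lookup. Only the two end points come from the comment (about 1% solved maps to 1 kyu, more than 90% solved maps to 8 kyu); the intermediate cut-offs in `THRESHOLDS` and the function name `kyu_from_ratio` are made up for illustration.

```python
import bisect

# Inclusive upper bounds of the solve ratio for 1 kyu .. 7 kyu.
# Anything above the last bound is 8 kyu. Intermediate values are assumptions.
THRESHOLDS = [0.02, 0.05, 0.10, 0.20, 0.40, 0.65, 0.90]


def kyu_from_ratio(solved: int, attempted: int) -> int:
    """Map the share of users who solved a kata to a kyu (1 = hardest, 8 = easiest)."""
    if attempted <= 0:
        raise ValueError("need at least one attempt to derive a rank")
    ratio = solved / attempted
    # bisect_left counts how many bounds are strictly below the ratio.
    return bisect.bisect_left(THRESHOLDS, ratio) + 1


print(kyu_from_ratio(1, 100))   # 1  (only 1% solved it)
print(kyu_from_ratio(95, 100))  # 8  (more than 90% solved it)
print(kyu_from_ratio(30, 100))  # 5  (with the made-up cut-offs above)
```

Whether `attempted` should count users or raw submissions is exactly the point debated in the following comments; the sketch works with either, but the resulting ranks would differ.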
Your suggestion for ranking kata considers the ratio of users who completed it to total users, if I understand correctly, which would not make much sense: a success/attempts ratio is more significant, particularly considering that before getting a rank each kata gets at least something like 20 successful submissions. And that is also what I would do to avoid circular votes (or to avoid asking for votes altogether). |
A success/attempts ratio can be a sign of a bug which persisted for a long time. It is not a specific measure of kata difficulty, in my opinion. I do not understand why you guys are so concerned about "circular votes". The Schulze method handles them adequately. |
Once a kata has been greenlit, I expect very few bugs to come up; giving greater powers to power users may also make fixing them even quicker. I am not concerned about circular votes: I am more concerned about biased votes, as humans tend to be biased. A lot. |
There's at least one kata that has been broken since I started Codewars. Greater power would be definitely helpful in such cases, but to be honest it's not really clear what you can do at the moment with 3k+ or 5k+ honour. Additionally, as long as translations don't have some kind of review/beta, they can always introduce bugs that weren't in there. And, of course, future improvements of katas can also create new bugs, especially if we tackle the 500-lock rule, see #123.
Do you mean "attempts" as "times the solution has been submitted", or "amount of users who tried this kata"? The latter seems fine, but the first can lead to abuse: write a script that submits a wrong solution like every 15 to 30 seconds. |
Sorry, somehow missed your link. Schulze seems fine, since it needs only a strict weak ordering. However, given its O(n^3) runtime, the kata ranking process would need to be an asynchronous process. Overall, 👍 on Schulze. |
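For readers unfamiliar with it, here is a minimal sketch of the strongest-path computation at the heart of the Schulze method, applied to the cyclic example discussed above. The input format (a matrix `d` where `d[i][j]` counts voters who rated kata `i` harder than kata `j`) and the function names are assumptions for illustration, not the scheme proposed in the linked idea.

```python
def strongest_paths(d):
    """Return p where p[i][j] is the strength of the strongest path from i to j."""
    n = len(d)
    p = [[0] * n for _ in range(n)]
    # Direct links: only majorities count.
    for i in range(n):
        for j in range(n):
            if i != j and d[i][j] > d[j][i]:
                p[i][j] = d[i][j]
    # Floyd-Warshall style widening: this triple loop is the O(n^3) part.
    for k in range(n):
        for i in range(n):
            if i == k:
                continue
            for j in range(n):
                if j == i or j == k:
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))
    return p


def schulze_wins(d):
    """For each kata, count how many others it beats on strongest paths."""
    p = strongest_paths(d)
    n = len(d)
    return [sum(1 for j in range(n) if j != i and p[i][j] > p[j][i])
            for i in range(n)]


# One voter with cyclic assessments: A > B, B > C, C > A (0 = A, 1 = B, 2 = C).
cycle = [[0, 1, 0],
         [0, 0, 1],
         [1, 0, 0]]
print(schulze_wins(cycle))   # [0, 0, 0]: a three-way tie, i.e. equal difficulty

# Three voters who all agree that A > B > C.
chain = [[0, 3, 3],
         [0, 0, 3],
         [0, 0, 0]]
print(schulze_wins(chain))   # [2, 1, 0]
```

The cycle resolves to a tie, matching the "equal difficulty" outcome described earlier, while consistent votes produce a strict order.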
Good point about broken translations. Actually I meant the total number of submissions: I don't expect many users to cheat, and you can easily filter out the 1% who make brazillions of attempts on each kata. [I have written about the 500-lock rule (which I am starting to hate, I have to admit) in the specific topic.] |
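One way to read "filter out the 1%" is to drop the heaviest submitters before computing the ratio. The sketch below does that under stated assumptions: the per-user `(attempts, successes)` tuples, the function name, and the choice to round the dropped share up (so at least one user is dropped whenever `drop_percent` is positive) are all hypothetical.

```python
def filtered_success_ratio(per_user, drop_percent=1):
    """Success/attempts ratio after dropping the users with the most attempts.

    per_user: list of (attempts, successes) tuples, one per user.
    drop_percent: integer share of users to discard, rounded up.
    """
    if not per_user:
        return 0.0
    ranked = sorted(per_user, key=lambda u: u[0])
    n_drop = (len(ranked) * drop_percent + 99) // 100  # integer ceiling
    kept = ranked[:len(ranked) - n_drop]
    attempts = sum(a for a, _ in kept)
    successes = sum(s for _, s in kept)
    return successes / attempts if attempts else 0.0


# 99 ordinary users (2 attempts, 1 success each) plus one script
# that submitted 10000 failing solutions.
users = [(2, 1)] * 99 + [(10000, 0)]
print(round(filtered_success_ratio(users, drop_percent=0), 4))  # 0.0097 unfiltered
print(filtered_success_ratio(users))                            # 0.5 after filtering
```

The difference between the two numbers shows how much a single abusive script could distort an unfiltered ratio.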
What about the katas that don't have public test cases? Sure, they should get improved instead, but sometimes that's not possible, for example if it's a puzzle:
The poem above would lead to many submissions by every single user who doesn't know the answer. |
With this automated submissions/completions criterion that you suggest, how will you differentiate difficulty from a poorly written description or other bugs and glitches? Also, one can write a kata that intentionally requires a lot of submit attempts because of many hidden tricks not mentioned in the description. |
Interesting points, and I hope it is clear that my PoV is more about choosing a lesser evil than some optimal solution. A puzzle which is doable should still have a difficulty somehow proportional to the number of attempts; I have encountered a couple of katas which were much more along the lines of "guess what the author may possibly be thinking", but I guess they will stay in beta forever or so. As far as poor descriptions go, we could simply flag (besides not approving) a badly explained kata; once it is out of beta, you start counting attempts. I must admit I fail to think of a good yet relatively simple kata which also requires tons of trial-and-error submissions despite a properly specified problem: any specific kata that you had in mind? |
Err… do you remember the "Broken greetings" kata? It's the second one you'll encounter if you sign up on Codewars:
While it's somewhat clear what you have to do, it's not clear what "the expected value is". If I was a new member, I would simply submit, read the error message, and carry on. If I then accidentally introduce a typo (
Only about 30% of all submissions actually solved that kata, and only about 70% of the users who tried it. Note that the number of completions is less than the number of warriors, so some gave up. However, let's take another 8kyu kata:
Almost 66% of all solutions solved the kata, and as many as 90% of all users who tried it solved it. And it's a really trivial one. But if you solely compare the numbers (30% vs 66%; 70% vs 90%) the first one would need a (much) higher kyu rating; although adding
While this isn't a good example, since it's also a rather hard kata, there are some katas which depend on random numbers. Only 1% of all submissions solved Binary Genetic Algorithms, but (probably) 20% of all warriors who tried it solved it. Remark: While it's interesting to discuss how to rank katas into the eight difficulties, one should also ask why we have exactly eight ranks. Why not 4? Why not 1, and have the user decide whether he thinks he can manage to solve the kata (see Project Euler). But that's a fundamental design decision and shouldn't be changed at this point. It's something to keep in mind for version 4 or higher. |
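The two measures compared in this comment (share of submissions that passed versus share of users who eventually solved the kata) can be computed from the same log, as in the sketch below. The log format, a sequence of `(user_id, passed)` pairs, and the function name are assumptions; the sample data is invented and does not reproduce the figures quoted above.

```python
def completion_ratios(log):
    """Return (passed submissions / all submissions, solvers / users who tried)."""
    total = 0
    passed_total = 0
    users = set()
    solvers = set()
    for user, passed in log:
        total += 1
        users.add(user)
        if passed:
            passed_total += 1
            solvers.add(user)
    if total == 0:
        return 0.0, 0.0
    return passed_total / total, len(solvers) / len(users)


# Invented log: "a" and "c" solve it on the third try, "b" gives up after four.
log = [("a", False), ("a", False), ("a", True),
       ("b", False), ("b", False), ("b", False), ("b", False),
       ("c", False), ("c", False), ("c", True)]
by_submission, by_user = completion_ratios(log)
print(by_submission)      # 0.2
print(round(by_user, 2))  # 0.67
```

The gap between the two numbers grows with the number of retries per user, which is why kata with hidden expectations or random tests look much harder by the per-submission measure.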
I think that kata Printing Array elements with Comma delimiters with arrays of arrays is harder than 8 kyu. |
I think that kata Rock Paper Scissors! is harder than 8 kyu too. |
@SlyBeetle really?? :) |
@yankouskia I think that they are harder than other 8 kyu kata. It is my opinion. |
I don't know how things have changed since the above discussion, but I still come across - mostly old - katas that are seriously overrated. For example this one: rated 4kyu, but as the solution is only a few lines and pretty straightforward, I would rate it 6kyu maximum. When people complain about such katas in the comments, the usual reply is that "old kata = easy kata". Now this is where my dilemma starts: when I click on "edit kata" I see that I could change the rating. Even though I would find it totally justified, I hesitate, because that would cause people to lose honor. In this case 2000+ users would lose 24 honor. So, should I do it? Or should I not care so much? 😄 On the other hand: do people really lose honor if I do so? Please advise (preferably @jhoffner) |
@kazk Any views on this? (I mean the original question: re-ranking katas...) |
@Voileexperiments so no problem for the moment, because of an unfixed bug... But is there any "official" point of view or advice? Or should I just change whenever I feel it's not correct? What if I just go nuts, and change everything to 8 kyu? 😄 |
@anter69 That wouldn't work for any katas that aren't as old as those because the ranking votes are actually kept, and you can't rank a kata below the average voted rank. |
I just re-ranked https://www.codewars.com/kata/5268acac0d3f019add000203 to 5 kyu (from 4). Will probably change some more old katas to be a bit more realistic. |
I dunno, but I'd say it's more of a Well, that's why we need some kind of discussions (or voting) before re-ranking these... |
My first guess was the same, so be it: now it's 6 kyu 😄 |
next up: https://www.codewars.com/kata/ip-validation (4 kyu) --> 6 kyu? |
Yup, looks like a low 6. It also needs a complete rewrite and random tests added. |
changed to 6 kyu -- we continue from here later |
I personally think one-time changes for old kata are fine. I don't know much about how Codewars works, so I'm not sure if this affects anything else. (I didn't even know it was possible to change the ranks of published kata.) This is also discussed on #1247 /cc @jhoffner |
@Voileexperiments @kazk ...but almost everything here could be downgraded at least 1 kyu (from the older ones): |
I just tried to change the rank of this kata (5 -> 7): the option is still there (see screenshot below) but after clicking "re-publish", the process is stuck with "Please wait while publishing". I remember that there was a discussion about removing the possibility to change ranking, but if it was done, then the option should be removed from the UI as well: |
Because the response that was returned was not a success, but rather something along "access denied". |
Sure, but it would be nice to get an error message instead. Or even better, the option to change it should be completely removed. On the other hand, is there any issue for reviewing katas that should be re-ranked? I sometimes drop by to update my list above, but I see no changes... |
Decent chance there's an error message in console? This was turned on for a bit, but then turned off because it choked the servers recalculating points for everyone. |
Just wondering and absolutely not criticising who approved this specific case (Abbe-senpai, I think), but I wonder if one can get the chance to object about a kata ranking once it is greenlit, as some user suggested here on a kata of my own.
Unfortunately I can no longer see past votes and proposed rank once approved, but I have to admit that 7 kyu may seem a bit too low; it is true that the base problem itself could be rather trivial if you figure out the right trick, but that same trick is not so trivial in itself. Plus MMMAAANNN and I worked to make it even a bit more tricky with edge test cases, so I would personally rank it at least as a 5 kyu kata.
One of the "problems" with ranking is also that nowadays it is done mostly by the same group of power users (plus some amateur like me, ok) sweeping through each beta and completing them in a row, not by your average coding Joe out there, so it is rather physiological that the ranks tend to shift on a lower scale, at least compared to earlier times.
Dynamically ranking a kata as users complete it and then rank it again would imho be the best solution, so that we can also get to re-rank earlier katas and stay more in touch with the user base (for example, I saw some frustration in the aforementioned kata, as some users thought that a 7 kyu level kata should be more within their reach).
Power/veteran users could then just weight more than an ordinary user, just to balance the scale a bit better.
Ideas or thoughts about it?
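As a starting point for discussion, the "power/veteran users weigh more" idea could look like the weighted average below. The honour cut-off (`veteran_honour=3000`), the weight (`veteran_weight=2.0`), and the `(kyu_vote, voter_honour)` input format are all assumptions picked for illustration.

```python
def weighted_rank(votes, veteran_honour=3000, veteran_weight=2.0):
    """Weighted mean of kyu votes; voters at or above veteran_honour count more.

    votes: list of (kyu_vote, voter_honour) tuples.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for kyu, honour in votes:
        weight = veteran_weight if honour >= veteran_honour else 1.0
        weighted_sum += weight * kyu
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no votes to average")
    return weighted_sum / total_weight


# Two veterans vote 7 kyu, four ordinary users vote 5 kyu.
votes = [(7, 5000), (7, 4000), (5, 200), (5, 100), (5, 50), (5, 10)]
print(weighted_rank(votes))                                # 6.0
print(round(weighted_rank(votes, veteran_weight=1.0), 2))  # 5.67 with equal weights
```

In this example the weights pull the result toward the veterans' vote; choosing the opposite weighting would instead favour the average user's perception.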