Command to remove bad aliases. #3
Comments
Changed to complete-linkage clustering instead of single-linkage. This should stop the above from happening, but a more general solution is still needed. https://en.wikipedia.org/wiki/Single-linkage_clustering
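For anyone following along, here is a rough sketch of why the linkage choice matters. It is not the bot's actual code: `edit_distance`, `cluster`, the threshold of 1, and the made-up nick chain are all assumptions for illustration. Single linkage merges two clusters when their closest members are near each other, so a chain of nicks that each differ by one character collapses into one group. Complete linkage looks at the farthest pair instead.

```python
from itertools import product


def edit_distance(a, b):
    """Levenshtein distance, computed one DP row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def cluster(nicks, threshold, linkage):
    """Naive agglomerative clustering (hypothetical helper).

    Repeatedly merges the closest pair of clusters until the closest pair
    is farther apart than `threshold`. `linkage` is `min` for single
    linkage or `max` for complete linkage.
    """
    clusters = [[n] for n in nicks]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(edit_distance(x, y)
                            for x, y in product(clusters[i], clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Made-up chain: each nick is one edit away from its neighbour.
chain = ["aaaa", "aaab", "aabb", "abbb", "bbbb"]

print(cluster(chain, 1, min))
# [['aaaa', 'aaab', 'aabb', 'abbb', 'bbbb']]
print(cluster(chain, 1, max))
# [['aaaa', 'aaab'], ['aabb', 'abbb'], ['bbbb']]
```

With `min`, "aaaa" and "bbbb" end up together even though they are four edits apart. With `max`, a merge needs every cross pair to be within the threshold, so the chain stays split.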
With complete-linkage we can still have two pre-existing clusters joined together, right? It just requires a stricter match between the two?
I don't think we will see something like the above. If we have the two clusters: a, a_, a_c And then the nick c gets added, I think we would be more likely to see the two existing clusters broken apart like so: a, a_
Seems like that fixes this issue, but it opens up other problems with our clustering algorithm. Since each nick cluster represents a single identity, it doesn't really make sense for a new cluster to be formed from pieces of others. Assuming the current clusters are correct (every nick sits in the cluster representing the correct identity), every new alias should either be grouped into a previously existing cluster or form a group of its own. My suggestion is that any new clustering algorithm obey that invariant. Since any clustering algorithm has some likelihood of clustering incorrectly, I think this would be the easiest to work with, as fixing a mistake would only require moving a single nick. What we've got now could cause any number of distortions to the clusters per nick introduction, depending upon similarities between groups.
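To make the proposal concrete, here is a minimal sketch of an incremental step where a new nick either joins one existing cluster or starts its own, and existing clusters are never merged or split. Everything in it is an assumption for illustration: `add_nick`, the complete-linkage style distance (farthest member), the threshold of 2, and the use of plain edit distance.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed one DP row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def add_nick(clusters, nick, threshold):
    """Return a new cluster list with `nick` placed (hypothetical helper).

    The nick joins the closest existing cluster if its distance to the
    farthest member of that cluster is within `threshold`; otherwise it
    becomes a new singleton cluster. Existing clusters are only copied,
    never merged or split.
    """
    best_index, best_distance = None, None
    for index, members in enumerate(clusters):
        distance = max(edit_distance(nick, m) for m in members)
        if best_distance is None or distance < best_distance:
            best_index, best_distance = index, distance

    updated = [list(members) for members in clusters]
    if best_index is not None and best_distance <= threshold:
        updated[best_index].append(nick)
    else:
        updated.append([nick])
    return updated


clusters = [["Zhenya", "Zhenya_home", "Zhenya_work"], ["will", "will_"]]
print(add_nick(clusters, "z", 2))       # "z" is far from both, so it stands alone
print(add_nick(clusters, "will__", 2))  # close to every "will" nick, so it joins them
```

If the placement is wrong, the damage is limited to one nick, which is the point of the invariant.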
Closing this issue because the original premise is solved. Opening a new issue for improving the algorithm: #22.
It's possible to troll the nick clustering service into grouping all names together by creating a sequence of nicks, each a small edit distance from the last. As of now, the only recourse is to manually edit the persistent nick cluster file.
Example:
22:54 -!- Zhenya is now known as z
22:55 < z> nvm
22:55 <@jesse> almost
22:55 <@jesse> !alias z
22:55 < zhenya_bot> Zhenya, Zhenya_home, Zhenya_work, Zhenya, will, will_, z, zbot_will
22:55 <@jesse> oh weird
22:55 <@jesse> !alias Zhenya
22:55 < zhenya_bot> Zhenya, Zhenya_home, Zhenya_work, Zhenya, will, will_, z, zbot_will
22:55 <@jesse> lol
22:55 <@jesse> you made a z name
22:55 <@jesse> and it combined your group with wills
22:55 <@jesse> you ruined all of name clustering
22:55 <@jesse> hahahaha