New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DNM] Limit type-squatting on top gems when creating new rubygems #1809
[DNM] Limit type-squatting on top gems when creating new rubygems #1809
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it! It's nice that it's so straightforward, too. r+ from me with a test.
app/models/pusher.rb
Outdated
unless @rubygem.new_record? | ||
if @rubygem.new_record? | ||
Rubygem.downloaded(50).each do |rubygem| | ||
if Gem::Text.levenstein_distance(name.downcase, rubygem.name.downcase) < 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty sure it's levenshtein_distance
? Unrelated, if it's fast enough I'd like to do the top 100 (or top 1000?). Maybe Rubygem.downloded(1000).pluck(:name).first{|n| Gem::Text.levenshtein_distance(name.downcase, n.downcase) < 3 }
?
app/models/pusher.rb
Outdated
Rubygem.downloaded(50).each do |rubygem| | ||
if Gem::Text.levenstein_distance(name.downcase, rubygem.name.downcase) < 3 | ||
return notify("The name #{name.inspect} is too close to #{rubygem.name.inspect}.\n" \ | ||
"Please send an email to xxx@rubygems.org if you believe this is an error.", 409) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe help@rubygems.org
is the Tender support email queue.
app/models/pusher.rb
Outdated
unless @rubygem.new_record? | ||
if @rubygem.new_record? | ||
Rubygem.downloaded(50).each do |rubygem| | ||
if Gem::Text.levenstein_distance(name.downcase, rubygem.name.downcase) < 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand this well, this checks for ld
distance for top x (now 50) gems and if distance is less than 3, it fails.
In top 50 gems, there is 3 letter gem and 4 letter gems now (according to https://rubygems.org/stats) and it is really easy to came up with a proper name being blocked by this condition.
Is this expected limitation?
list of gems having short (< 5 characters) name in top 50 rubygems:
ffi
rack
rake
json
i18n
thor
tilt
sass
arel
mail
PS: Is it possible to use this method directly? I had to do Class.new.extend(Gem::Text).levenshtein_distance('rack', 'rado')
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point -- maybe allow a distance of 1 for 3 letters, and 2 for four letters?
40459e3
to
17f3284
Compare
Isn't this "superseeded" by #2037 ❓ |
See #2037 |
DNM since I couldn't get tests running locally. Needs new tests added as well, in addition to some performance profiling