Handle two more optional fragments in CC licenses
While experimenting with SPDX XML <optional> markup, I noticed a couple of such annotations that reflect texts seen in the wild for CC-BY-SA-4.0 (including 2 "uses" examples cataloged in choosealicense.com). Since licensee's matching was tightened up, those texts no longer match. This commit:

- hardcodes stripping of those optional fragments (as is already done for some others) and adds corresponding tests
- moves stripping of markdown links up in priority; otherwise other normalization isn't properly applied
- adds apostrophe normalization
- updates the (c) year
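
The steps listed above can be illustrated with a minimal, self-contained sketch. It is not licensee's actual implementation: `normalize`, `LINK_MARKUP`, `WITH_FRAGMENTS`, and `PLAIN` are hypothetical names, the sample texts are made up for illustration, and the two CC patterns are simplified variants of the `cc_dedication` and `cc_wiki` regexes in the diff (with the dots escaped).

```ruby
# Hypothetical stand-ins for the patterns this commit touches (not licensee's API).
CC_DEDICATION = /The\s+text\s+of\s+the\s+Creative\s+Commons.*?Public\s+Domain\s+Dedication\./im
CC_WIKI       = /wiki\.creativecommons\.org/i
LINK_MARKUP   = /\[(.+?)\]\(.+?\)/ # [text](url) becomes text
APOSTROPHE    = '’'

# Made-up sample: a curly apostrophe, the optional dedication sentence with a
# markdown link inside it, and the optional wiki reference.
WITH_FRAGMENTS = 'The licensor’s rights. The text of the Creative Commons ' \
                 'public licenses is dedicated to the public domain under the ' \
                 '[CC0 Public Domain Dedication]' \
                 '(https://creativecommons.org/publicdomain/zero/1.0/legalcode). ' \
                 'See wiki.creativecommons.org for more.'

# The same text with the optional fragments already absent.
PLAIN = "The licensor's rights. See for more."

# Unwrap markdown links, strip the optional CC fragments, normalize
# apostrophes, then collapse whitespace. With links_first: false the links are
# unwrapped only after the fragment stripping has already been attempted.
def normalize(text, links_first: true)
  text = text.gsub(LINK_MARKUP, '\1') if links_first
  text = text.gsub(CC_DEDICATION, ' ').gsub(CC_WIKI, ' ')
  text = text.gsub(LINK_MARKUP, '\1') unless links_first
  text.gsub(APOSTROPHE, "'").downcase.gsub(/\s+/, ' ').strip
end

puts normalize(WITH_FRAGMENTS) == normalize(PLAIN)
puts normalize(WITH_FRAGMENTS, links_first: false) == normalize(PLAIN)
```

In this sketch the first comparison holds and the second does not: while the link markup is still in place, the dedication pattern cannot find its closing "Dedication." and the optional sentence survives. That is the ordering problem the second bullet describes.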

My intent is to remove all of the hardcoded stripping of optional fragments by leveraging <optional> markup, but I probably won't get to that immediately, so I'm offering this as a short-term improvement.
mlinksva committed Feb 13, 2021
1 parent 9cd4ddd commit d30ad55
Showing 11 changed files with 655 additions and 23 deletions.
12 changes: 6 additions & 6 deletions .rubocop_todo.yml
@@ -1,6 +1,6 @@
# This configuration was generated by
# `rubocop --auto-gen-config`
-# on 2021-01-25 17:42:52 UTC using RuboCop version 1.8.1.
+# on 2021-02-13 03:27:28 UTC using RuboCop version 1.9.0.
# The point is for the user to remove these configuration records
# one by one as the offenses are removed from the code base.
# Note that changes in the inspected code, or installation of new
@@ -62,7 +62,7 @@ Metrics/MethodLength:
# Offense count: 1
# Configuration parameters: CountComments, CountAsOne.
Metrics/ModuleLength:
-Max: 265
+Max: 271

# Offense count: 2
# Configuration parameters: IgnoredMethods.
@@ -90,7 +90,7 @@ Performance/UnfreezeString:
- 'spec/licensee/project_files/license_file_spec.rb'
- 'spec/vendored_license_spec.rb'

-# Offense count: 161
+# Offense count: 163
# Configuration parameters: Prefixes.
# Prefixes: when, with, without
RSpec/ContextWording:
@@ -131,17 +131,17 @@ RSpec/FilePath:
RSpec/MultipleExpectations:
Max: 6

-# Offense count: 134
+# Offense count: 136
# Configuration parameters: AllowSubject.
RSpec/MultipleMemoizedHelpers:
Max: 16

-# Offense count: 283
+# Offense count: 285
# Configuration parameters: IgnoreSharedExamples.
RSpec/NamedSubject:
Enabled: false

-# Offense count: 63
+# Offense count: 65
RSpec/NestedGroups:
Max: 6

2 changes: 1 addition & 1 deletion LICENSE.md
@@ -1,6 +1,6 @@
MIT License

-Copyright (c) 2014-2020 Ben Balter and Licensee contributors
+Copyright (c) 2014-2021 Ben Balter and Licensee contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
2 changes: 1 addition & 1 deletion docs/command-line-usage.md
@@ -34,7 +34,7 @@ License: MIT License
Matched files: LICENSE.md, licensee.gemspec
LICENSE.md:
Content hash: 46cdc03462b9af57968df67b450cc4372ac41f53
-Attribution: Copyright (c) 2014-2020 Ben Balter and Licensee contributors
+Attribution: Copyright (c) 2014-2021 Ben Balter and Licensee contributors
Confidence: 100.00%
Matcher: Licensee::Matchers::Exact
License: MIT License
18 changes: 13 additions & 5 deletions lib/licensee/content_helper.rb
@@ -25,6 +25,8 @@ module ContentHelper
developed_by: /#{START_REGEX}developed by:.*?\n\n/im,
quote_begin: /[`'"‘“]/,
quote_end: /[`'"’”]/,
+cc_dedication: /The\s+text\s+of\s+the\s+Creative\s+Commons.*?Public\s+Domain\s+Dedication./im,
+cc_wiki: /wiki.creativecommons.org/i,
cc_legal_code: /^\s*Creative Commons Legal Code\s*$/i,
cc0_info: /For more information, please see\s*\S+zero\S+/im,
cc0_disclaimer: /CREATIVE COMMONS CORPORATION.*?\n\n/im,
@@ -39,7 +41,8 @@ module ContentHelper
quotes: {
from: /#{REGEXES[:quote_begin]}+([\w -]*?\w)#{REGEXES[:quote_end]}+/,
to: '"\1"'
-}
+},
+apostrophe: { from: '’', to: "'" }
}.freeze

# Legally equivalent words that schould be ignored for comparison
@@ -90,18 +93,16 @@ module ContentHelper
}.freeze
STRIP_METHODS = %i[
bom
+cc_optional
cc0_optional
unlicense_optional
-hrs
-markdown_headings
borders
title
version
url
copyright
title
block_markup
-link_markup
developed_by
end_of_terms
whitespace
@@ -148,7 +149,7 @@ def content_hash
def content_without_title_and_version
@content_without_title_and_version ||= begin
@_content = nil
-ops = %i[html hrs comments markdown_headings title version]
+ops = %i[html hrs comments markdown_headings link_markup title version]
ops.each { |op| strip(op) }
_content
end
@@ -272,6 +273,13 @@ def strip_cc0_optional
strip(REGEXES[:cc0_disclaimer])
end

+def strip_cc_optional
+return unless _content.include? 'creative commons'
+
+strip(REGEXES[:cc_dedication])
+strip(REGEXES[:cc_wiki])
+end
+
def strip_unlicense_optional
return unless _content.include? 'unlicense'

173 changes: 173 additions & 0 deletions spec/fixtures/cc-by-sa-mdlinks/License.md

Large diffs are not rendered by default.