Add fingerprint plugin #1651

Merged
merged 2 commits into from
Jul 18, 2019

arnav-mandal1234 (Collaborator) commented Jul 16, 2019

  • Addition of the new fingerprint plugin
  • Implementation and integration of the fingerprint generation algorithm
  • Modification of old unit tests
  • Addition of new unit tests

Original PR: #1576 (closed because something went wrong during the final rebase)

Signed-off-by: arnav-mandal1234 <arnav.mandal1234@gmail.com>

codecov bot commented Jul 16, 2019

Codecov Report

Merging #1651 into develop will decrease coverage by 68.07%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##           develop    #1651       +/-   ##
============================================
- Coverage    81.16%   13.08%   -68.08%     
============================================
  Files          125      125               
  Lines        15478    15478               
============================================
- Hits         12563     2026    -10537     
- Misses        2915    13452    +10537
Impacted Files Coverage Δ
src/summarycode/utils.py 0% <0%> (-100%) ⬇️
src/cluecode/copyrights_hint.py 0% <0%> (-100%) ⬇️
src/cluecode/plugin_copyright.py 0% <0%> (-100%) ⬇️
src/cluecode/plugin_email.py 0% <0%> (-100%) ⬇️
src/plugincode/output_filter.py 0% <0%> (-100%) ⬇️
src/scancode/plugin_only_findings.py 0% <0%> (-100%) ⬇️
src/commoncode/version.py 0% <0%> (-100%) ⬇️
src/scancode/plugin_info.py 0% <0%> (-100%) ⬇️
src/formattedcode/output_jsonlines.py 0% <0%> (-100%) ⬇️
src/licensedcode/tracing.py 0% <0%> (-100%) ⬇️
... and 105 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d7fee0b...2ac893c.


codecov bot commented Jul 16, 2019

Codecov Report

Merging #1651 into develop will increase coverage by 0.01%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           develop    #1651      +/-   ##
===========================================
+ Coverage    81.16%   81.18%   +0.01%     
===========================================
  Files          125      125              
  Lines        15478    15478              
===========================================
+ Hits         12563    12566       +3     
+ Misses        2915     2912       -3
Impacted Files Coverage Δ
src/typecode/pygments_lexers.py 52.25% <0%> (-0.65%) ⬇️
src/licensedcode/index.py 73.99% <0%> (+0.22%) ⬆️
src/licensedcode/query.py 75.8% <0%> (+0.26%) ⬆️
src/scancode/api.py 94.4% <0%> (+0.69%) ⬆️
src/textcode/markup.py 95.65% <0%> (+2.17%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d7fee0b...6ef2042.

steven-esser (Contributor) left a comment

See some minor comments. Others (@JonoYang and @pombredanne) may have more feedback for you as well.

Software license
================

Copyright (c) 2017 nexB Inc. and others. All rights reserved.

You should bump this copyright year to the current year, or simply remove the date portion entirely.

@@ -0,0 +1 @@
A ScanCode scan plugin to generate fingerprints using Simhash algorithm

Can you add a bit more detail here? Maybe some short background on the algorithm or some links AND how to go about running it from the command line?
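
For background, the general Simhash idea can be sketched as below. This is a minimal illustration and not the code in this PR: the function name `simhash`, the choice of MD5 as the per-token hash, and the 64-bit width are all assumptions made for the example. Each token is hashed, every bit position casts a +1 or -1 vote, and the sign of each total becomes one bit of the fingerprint.

```python
import hashlib


def simhash(tokens, bits=64):
    """Return a `bits`-wide Simhash fingerprint (as an int) for string tokens.

    Hypothetical helper for illustration only; not the plugin's API.
    """
    # One signed vote counter per bit position.
    weights = [0] * bits
    mask = (1 << bits) - 1
    for token in tokens:
        # Hash the token to a stable integer (MD5 is an assumption here,
        # chosen only because it is stable across runs).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest, "big") & mask
        for i in range(bits):
            # A set bit votes +1 for that position, a clear bit votes -1.
            if (token_hash >> i) & 1:
                weights[i] += 1
            else:
                weights[i] -= 1
    # Positions with a positive total become 1 bits in the fingerprint.
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint
```

Because every token contributes to every bit, changing a few tokens tends to flip only a few bits, which is what makes the fingerprints comparable by Hamming distance.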


    def hamming_distance(self, fingerprint1, fingerprint2):
        """
        Return hamming distance between two given fingerprints

Can you add a bit more detail to the docstring on how to interpret hamming distance values? For example, what distance would one expect for similar files, and what for completely different files? You can get a rough idea from the test cases, but it would be nice to have that information in the source code for future reference.
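
As a rough sketch of what such a docstring could cover, here is a standalone version of the computation. It assumes fingerprints are non-negative integers of the same bit width, which may not match the representation used in this PR, and the function below is a hypothetical stand-in rather than the plugin's method.

```python
def hamming_distance(fingerprint1, fingerprint2):
    """Return the number of bit positions at which two fingerprints differ.

    Assumes both fingerprints are non-negative ints of the same bit width.
    A distance of 0 means the fingerprints are identical. Small distances
    suggest similar inputs. For two unrelated, effectively random n-bit
    fingerprints, about n / 2 bits are expected to differ.
    """
    # XOR leaves a 1 bit wherever the two values disagree; count those bits.
    return bin(fingerprint1 ^ fingerprint2).count("1")
```

For example, `hamming_distance(0b1011, 0b1001)` is 1, since only one bit differs. Any concrete threshold for "similar" would have to come from the test data and is not implied here.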

    """
    def __init__(self):
        self.tokens = []


Minor formatting issue: there should be one blank line between methods and two blank lines between top-level functions and classes.
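
The spacing convention above, as described in PEP 8, looks like this in practice. The names `tokenize` and `Fingerprint` are hypothetical and exist only to show the layout; they are not taken from the PR.

```python
def tokenize(text):
    """Top-level function: separated from its neighbors by two blank lines."""
    return text.split()


class Fingerprint:
    """Hypothetical class used only to illustrate blank-line spacing."""

    def __init__(self):
        self.tokens = []

    def add(self, token):
        # Methods inside a class are separated by a single blank line.
        self.tokens.append(token)
```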

Signed-off-by: arnav-mandal1234 <arnav.mandal1234@gmail.com>
JonoYang changed the title from "Adds fingerprint plugin" to "Add fingerprint plugin" on Jul 17, 2019
JonoYang merged commit 2594d0f into aboutcode-org:develop on Jul 18, 2019