A text scanner that finds identical or similar source code


SameCodeFinder is a static code text scanner that finds similar or identical code files in a large directory.


SameCodeFinder can also detect identical functions within source code files and report the Hamming distance between two functions.

  • Find duplicated code that should be extracted for reuse
  • Show the Hamming distance between source code files (supports all source file types)
  • Show the Hamming distance between source code functions (currently supports Java and Objective-C)
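
To illustrate what a Hamming distance between two pieces of code means here, the following is a minimal, standard-library sketch of a SimHash-style fingerprint. It is not SameCodeFinder's own code and not the `simhash` package: the word-level tokenization, the use of MD5, and the function names `simhash` and `hamming_distance` are all assumptions made for the example.

```python
import hashlib
import re


def simhash(text, bits=64):
    """Return a SimHash-style fingerprint of `text` as an integer (illustrative only)."""
    # Assumption: word-like tokens are the features; real tools may use n-grams.
    tokens = re.findall(r"\w+", text.lower())
    weights = [0] * bits
    mask = (1 << bits) - 1
    for token in tokens:
        # Hash each token to a stable integer and keep the low `bits` bits.
        h = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) & mask
        for i in range(bits):
            # Each token votes +1 or -1 on every bit position.
            weights[i] += 1 if (h >> i) & 1 else -1
    # A bit is set in the fingerprint when its accumulated vote is positive.
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Count the bit positions at which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Identical inputs always produce the same fingerprint and therefore a distance of 0. Fingerprints of texts that share most of their tokens tend to differ in only a few bits, which is why a small distance is read as "similar code".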

The screenshot below shows the scan result for MWPhotoBrowser.

[Image: Scan result of MWPhotoBrowser]

The result comes from the command:

python ~/Projects/opensource/MWPhotoBrowser/ .m  --max-distance=10 --min-linecount=3 --functions --detail


Install the Python implementation of SimHash:

pip install simhash

See A Python Implementation of Simhash Algorithm to learn more about the module.

python [arg0] [arg1] 


  • [arg0]
    • Target directory containing the files to scan
  • [arg1]
    • Suffix of the files to scan, e.g.
      • .m - Objective-C file
      • .swift - Swift file
      • .java - Java file
  • --detail
    • Show detailed progress of the scan
  • --functions
    • Compare functions instead of whole files
  • --max-distance=[input]
    • Maximum Hamming distance to keep; default is 20
  • --min-linecount=[input]
    • For function scans, ignore any function with fewer lines than min-linecount
  • --output=[input]
    • Output file; default is "out.txt"
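
The option list above can be mirrored with a small `argparse` parser. This is only a sketch of how such a command line might be read, not the tool's actual parsing code. The function name `build_parser`, the positional names `directory` and `suffix`, and the `None` default for `--min-linecount` are assumptions; only the defaults of 20 and "out.txt" come from the list. Note that `argparse` ships with Python 2.7 and later.

```python
import argparse


def build_parser():
    """Build a hypothetical parser that mirrors the documented options."""
    parser = argparse.ArgumentParser(
        description="Illustrative parser for SameCodeFinder-style options")
    # [arg0] and [arg1] from the list above; the names are assumptions.
    parser.add_argument("directory", help="target directory to scan")
    parser.add_argument("suffix", help="file suffix to scan, e.g. .m, .swift, .java")
    parser.add_argument("--detail", action="store_true",
                        help="show detailed progress of the scan")
    parser.add_argument("--functions", action="store_true",
                        help="compare functions instead of whole files")
    parser.add_argument("--max-distance", type=int, default=20,
                        help="maximum Hamming distance to keep")
    # The README states no default here, so None means "no filter" in this sketch.
    parser.add_argument("--min-linecount", type=int, default=None,
                        help="ignore functions with fewer lines than this")
    parser.add_argument("--output", default="out.txt",
                        help="output file")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Both `--max-distance=10` and `--max-distance 10` are accepted by `argparse`, so the `=` form used in this README parses without extra handling.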


Python 2.6+, Pip 9.0+, simhash


SameCodeFinder is available under the MIT license. See the LICENSE file for more info.
