-
Notifications
You must be signed in to change notification settings - Fork 1
License
robertlayton/tlsh
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
TLSH ======================================= TLSH is a fuzzy matching library. Given a binary object, it generates a hash value. The hash values can be used for similarity comparison. Similary objects have similar hash values. Similar hash values signal similar objects. The output hash is 35 bytes long. The first 3 bytes are used to capture the global similarity. The last 32 bytes are used to capture the local similarity. Here are the design choices. * To improve comparison accuracy, TLSH tracks counting bucket height distribution in quartiles. Bigger quartile difference results in higher difference score. * Use specially 6 trigrams to give equal representation of the bytes in the 5 byte sliding window which produces improved results. * Pearson hash is used to distribute the trigram counts to the counting buckets. * The global similarity score distances objects with significant size difference. Global similarity can be disabled. It also distances objects with different quartile distributions. * TLSH can be compiled to generate 70 or 134 characters hash strings. The longer version is more accurate. TLSH similarity is expressed as a difference score. * A score of 0 means the objects are almost identical. * For the 70 characters hash, a score of 200 or higher means the objects are very different. For the 134 characters hash, a score of 400 or higher means the objects are very different. LICENSE ======================================= Developed by Trend Micro Release under the Apache License (see LICENSE) Please refer to section 4(d) for conditions about the NOTICE.txt file Getting Started ======================================= Obtain TLSH wget https://github.com/trendmicro/tlsh/tarball/3.0.4 -O tlsh.tar.gz tar xfvz tlsh.tar.gz cd trendmicro-tlsh-xxxxxxx (x... is the last git commit ID) OR git clone git://github.com/trendmicro/tlsh.git cd tlsh git checkout 3.0.4 Extract the files from the tarball or clone from the git repo. Run CMake to create the Makefile and then make the project. A static library will be created under the "lib" directory. Look under the "test" directory for example code. Edit TLSH_SHORT in CMakeLists.txt to build the shorter or the longer TLSH hash. Linux, Cygwin tar xvzf tlsh.tar.gz cd trendmicro-tlsh-xxxxxxx mkdir -p build/release cd build/release cmake ../.. make make test Visual Studio on Windows extract tlsh.tar.gz cd trendmicro-tlsh-xxxxxxx mkdir build cd build "c:\program files (x86)\cmake 2.8\bin\cmake" .. Open TLSH.sln in Visual Studio Build Solution Note: tlsh_unittest is not build because it has not been updated for Windows Python extension Prerequisite: Install python 2.7 or above. For Windows, install 32 bits version. If you have Visual Studio 2010 installed, execute SET VS90COMNTOOLS=%VS100COMNTOOLS% or with Visual Studio 2012 installed SET VS90COMNTOOLS=%VS110COMNTOOLS% cd py_ext python setup.py build python setup.py install (sudo, run as root or administrator) import tlsh tlsh.hash(data) tlsh.diffscore(h1, h2) Create source tarball make package_source Changes ======================================= 3.0.0 - Implemented TLSH. - Updated to build with CMake. 3.0.1 - Enabled C++ optimization. Runs 4x faster. 3.0.2 - Supports Windows and Visual Studio. 3.0.3 - Added python extension library. TLSH is callable in Python. - Stop generating hash if the input is less than 512 bytes. - Cleaned up. 3.0.4 - Length difference consideration can be disabled in this version. See totalDiff in tlsh.h. - TLSH can be compiled to generate the 70 or 134 character hashes. The longer version is more accurate.
About
No description, website, or topics provided.
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published