Description
To whom it may concern,
This is Tsukasa OI, a maintainer of ssdeep.
Sorry for not maintaining the project for a long time; I was busy with my job. I'm now reviewing the original C source code again and looking for improvements. However, there is one major issue: preserving portability in C is hard. Per-OS code is spread everywhere. Some tools and code fragments are old, and we don't even know which platforms and tools we should support.
(Even if we don't rewrite it in Rust, we definitely need some cleanup.)
Then, a Rust guy recommended that I try rewriting it in Rust. Well... (about 2 weeks later) the result looks... promising.
I ported libfuzzy and a part of ssdeep (CLI) to Rust and... it performs faster than libfuzzy when comparing fuzzy hashes, even without any unsafe blocks (for fuzzy hash generation, the safe Rust version was about 15% slower). With unsafe Rust, it's definitely faster than libfuzzy (in both comparison and hash generation) and, surprisingly, when I enabled an LTO build it got faster than ffuzzy++, my C++ port of libfuzzy (which is generally faster than libfuzzy and has a specialized API for large-scale clustering). I haven't implemented all features of ssdeep (CLI) yet, but the code seems more readable.
In the process, I found a bug in fuzzy.c (I am struggling to find a failing test case because it seems very hard to reproduce) and will fix it later (probably next week).
Anyway, back to Rust. It looks promising, but I'm not sure whether this is the direction we (as a project) should take. At the very least, we should discuss it.
In a few weeks, I will release a Rust port of the original ssdeep (at least, most features) and libfuzzy on my GitHub (not in ssdeep-project), and I would like to hear your thoughts.
Request for Comments
- Which platforms should ssdeep support?
- What do you think about moving to Rust?