Is your feature request related to a problem? Please describe.
Currently, Raspirus is highly dependent on MD5 signatures. If we don't have the signature of a virus, Raspirus has no way to know it's a virus. Even if we create a massive database with every known MD5 signature and always keep it up to date, an attacker could still just add a single whitespace character to the file and completely change its MD5 hash.
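To make the problem concrete, here is a minimal Python sketch (not Raspirus code). The payload bytes, the `signature_db` set, and the `is_known` helper are made up for illustration; only `hashlib.md5` is a real API.

```python
import hashlib

# Hypothetical stand-in for the bytes of a known-malicious file.
original = b"example payload standing in for a malicious file"
# The same payload with a single trailing space appended.
modified = original + b" "

md5_original = hashlib.md5(original).hexdigest()
md5_modified = hashlib.md5(modified).hexdigest()

# Hypothetical signature database containing only the original hash.
signature_db = {md5_original}


def is_known(md5_hex: str, db: set[str]) -> bool:
    """Exact-match lookup, as done with plain MD5 signatures."""
    return md5_hex in db


print(is_known(md5_original, signature_db))  # True
print(is_known(md5_modified, signature_db))  # False: one extra byte defeats the lookup
```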
Describe the solution you'd like
It would be great to have a system that tells us how likely a file is to be malware. Ideally, it should be lightweight and fast. That's where fuzzy hashing comes into play: it creates a hash of a given file, just like MD5, but with the added benefit that two hashes can be compared for similarity. Implementing this would let us compare a given file against a database of known-malware fuzzy hashes and return a percentage indicating how similar the file is to a known sample. We then set a threshold: everything above it is considered malware, everything below it is considered safe.
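A rough sketch of the compare-and-threshold idea, in Python. This is not ssdeep, TLSH, or any other real fuzzy-hashing algorithm: it swaps in a toy bottom-k sketch over byte shingles so the example stays self-contained. The names `toy_fuzzy_hash`, `similarity`, `classify`, the sample data, and the constants (including the threshold of 70) are all assumptions for illustration.

```python
import hashlib

SHINGLE = 8        # bytes per sliding window (assumed value)
SKETCH_SIZE = 64   # number of smallest shingle hashes kept (assumed value)
THRESHOLD = 70     # percent; hypothetical cut-off, would need tuning


def toy_fuzzy_hash(data: bytes) -> frozenset[int]:
    """Toy stand-in for a fuzzy hash: the SKETCH_SIZE smallest shingle hashes."""
    hashes = {
        int.from_bytes(
            hashlib.blake2b(data[i:i + SHINGLE], digest_size=8).digest(), "big"
        )
        for i in range(max(1, len(data) - SHINGLE + 1))
    }
    return frozenset(sorted(hashes)[:SKETCH_SIZE])


def similarity(a: frozenset[int], b: frozenset[int]) -> int:
    """Estimate the overlap of two sketches as a percentage (0 to 100)."""
    union_sketch = sorted(a | b)[:SKETCH_SIZE]
    if not union_sketch:
        return 0
    shared = sum(1 for h in union_sketch if h in a and h in b)
    return round(100 * shared / len(union_sketch))


def classify(data: bytes, known: dict[str, frozenset[int]]) -> tuple[str, int, bool]:
    """Return (closest known sample, score, flagged) for the given bytes."""
    sketch = toy_fuzzy_hash(data)
    name, score = max(
        ((n, similarity(sketch, k)) for n, k in known.items()),
        key=lambda pair: pair[1],
    )
    return name, score, score >= THRESHOLD


# Made-up sample data: a "known" file, a near-copy, and an unrelated file.
sample = bytes(range(256)) * 4
variant = sample + b" "
unrelated = bytes(reversed(range(256))) * 4

known_malware = {"sample-a": toy_fuzzy_hash(sample)}
print(classify(variant, known_malware))    # high score, flagged
print(classify(unrelated, known_malware))  # low score, not flagged
```

Unlike the exact MD5 lookup, the near-copy with one appended byte still scores close to the known sample. Where to put the threshold is a trade-off between false positives and misses and would have to be tuned on real data.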
Describe alternatives you've considered
A machine learning algorithm - Too slow and unpredictable. Also hard to implement with the current setup.
YARA signatures - Resource intensive; would mean dropping support for single-board computers and lower-end PCs.
File analysis - Too slow; would require opening each file and inspecting its contents.
Additional context
The main obstacle right now is gathering the fuzzy hashes, which might take a while. Even then, we would still need to keep the database up to date and rework the backend. We might let the user choose between MD5 signatures (fast, higher coverage, higher miss rate) and fuzzy hashing (lower coverage due to missing samples, but a lower miss rate and more accurate analysis).
Sounds about right. This will presumably increase scan time considerably, so we might have to come up with something to address that (threading? Making fuzzy hashing optional if you just want a quick scan?)
Threading might be a good idea, but we would need to scale it to the user's available resources. Adding a switch on the frontend to choose between signature scanning and fuzzy scanning might also be useful.
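A possible shape for this, sketched in Python rather than taken from the Raspirus backend. The functions `pick_worker_count`, `md5_of_file`, and `scan`, the `cpu_share` parameter, and the `mode` strings are hypothetical; the fuzzy branch is left as a placeholder.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def pick_worker_count(cpu_share: float = 0.5) -> int:
    """Use a fraction of the available cores, but always at least one worker."""
    cores = os.cpu_count() or 1
    return max(1, int(cores * cpu_share))


def md5_of_file(path: Path) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(paths, signatures, mode="signature", workers=None):
    """Return the paths whose MD5 is in `signatures`.

    `mode` mimics the proposed frontend switch; only "signature" is
    implemented in this sketch.
    """
    if mode != "signature":
        raise NotImplementedError("fuzzy mode is a placeholder in this sketch")
    workers = workers or pick_worker_count()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        digests = list(pool.map(md5_of_file, paths))
    return [p for p, d in zip(paths, digests) if d in signatures]
```

On a single-board computer with four cores, the default `cpu_share` of 0.5 would give two workers, leaving some headroom for the rest of the system. The right fraction is a guess and would need measuring on real hardware.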