DJB2 Algorithm #1925

waleedyaseen · 2019-02-20T11:46:20Z

Description

I would like the DJB2 algothim implemented.

Usage

This algorithm is not widely used, it is rarely seen however, the Java programming language uses
it for it's String.hashCode() function, and it is used to check if strings are equal without exposing the
String content which is usually short names.

Implementation

Example implementation in the C language:

int hash(char *text) {
    int name32 = 0;
    int count = strlen(text);
    for (int index = 0; index < count; index++) {
        name32 = (name32 << 5) - name32 + text[index];
    }
    return name32;
}

Or the JVM implementation of it:

public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

or another simpler implementation in Java:

public static int hash(CharSequence text) {
    int count = text.length();
    int name32 = 0;
    for (int index = 0; index < count; index++) {
        name32 = (name32 << 5) - name32 + text.charAt(index);
    }
    return name32;
}

Example

Input	Output (Decimal)	Output (Hexadecimal)
`iceberg`	`1629187779`	`611b6ec3`
`cheese`	`-1361525545`	`aed8c4d7`
`hard`	`3195115`	`30c0eb`
`chip`	`3052620`	`2e944c`

The text was updated successfully, but these errors were encountered:

jsteube · 2019-02-20T15:30:53Z

I think it's also used in perl, not just Java. However the results are not matching. For example with the C code I do get 2163000881 (Decimal) as result for iceberg. Something is wrong, please verify everything again.

jsteube · 2019-02-20T15:39:39Z

I found the reason. Your C code does not match the Java code. For example, the Java code does not have that initialization value 5381. Also the multiplier is 33, not 31.

waleedyaseen · 2019-02-20T18:01:11Z

I apologize, I've updated the C implementation to match the one in the Java function, they both output identical results now.

jsteube · 2019-02-20T19:39:23Z

What's the reason to have this in hashcat? It should be pretty easy to collide the hash. Please explain.

waleedyaseen · 2019-02-20T19:54:58Z

This algorithm is widely used to hide the string content in a obfuscated code, and when reverse engineering, it could take huge amount of time to reverse a single string on the CPU, and by implementing it in hashcat, it makes the process go quite faster to retrieve the string content in that code.

The obfuscated code usually comes with many strings (could be in thousands), so using the CPU to crack all of those is really painful and time consuming.

jsteube · 2019-02-20T20:11:14Z

Wait, there's no way to reverse back the original string.

For example the strings:

BA

and

A`

they both produce the same hash 0x0000083f. There's no way to tell which is the correct string.

waleedyaseen · 2019-02-20T20:21:37Z

That's correct, you could collide them pretty easily, but it's unlikely when dealing with file names or other names, which they use to hide in the most of the cases.

However, you could tell which one is the correct/wanted one by telling which one makes more sense, let's say debug.txt and OwinxTU collide for the sake of giving an example, it would be obviously for the person who's trying to crack it that the first result is the one we are looking for.

jsteube · 2019-02-20T20:36:08Z

OK, but then there are many collisions for the same hash possible. Do you have any plans on controlling the keyspace low enough so that the number of collisions is low enough to work with the result?

Can you please give me a real life example where this obfuscation was used?

waleedyaseen · 2019-02-20T21:03:05Z

The collision can be minimised to very little or none by using a related words list for a dictionary or a hybird attack/cracking.

Regarding the examples, there is plenty of Java applications that do this to hide their file names, such as the RuneScape file storage system, you can click here to see an example of reversing some of these names.

jsteube · 2019-02-21T11:29:22Z

How about 64 bit version? How important would that be?

jsteube · 2019-02-21T11:40:59Z

There's also a large number of possible optimizations. For example the last character can be detected automatically. How to handle that?

waleedyaseen · 2019-02-21T12:48:16Z

As for the 64-bit version of this algorithm, I would say it is really unnecessary to implement as it is very uncommon to come across, don't see a need for it as of now.

However, as for the optimisations, I am quite unaware on how to handle that properly.

jsteube · 2019-02-21T12:57:39Z

So I've pushed support for DJB (32 bit output) with commit 51eb9eb

If you do not want to build from source you can use the beta on https://hashcat.net/beta/

Simply because it was so easy to add. Anyway, this implementation does not include any optimizations such as the last letter can be reversed etc. I hope it does what you want to do with it.

You maybe want to use the --keep-guessing flag to see all collisions if there are any.

If you want some fun, try: ./hashcat -m 18700 -O -w 3 ffffffff -a 3 ?l?l?l?l?l?l?l?l --keep-guessing

jsteube · 2019-02-22T08:24:47Z

Please note that I'll rename this algorithm to "Java Object hashCode()". The original DJB algorithm uses * 33 and not *31 plus it has an initialization value. Just to avoid confusion.

Plazmaz · 2020-06-17T00:14:40Z

This naming convention is pretty unclear. Java's Object.hashCode function varies wildly based on JVM, object, and implementation. Many objects explicitly override it. Also, the example hash given is super confusing, as Object.hashCode only ever returns an int value, so hex encoded is probably not how you're going to see it in the wild. I am a little confused here, because the default Object.hashCode guarantees no uniqueness. I would imagine there will be a ton of collisions here, and most of those would not be the value you're looking for.

immibis · 2020-12-19T16:09:33Z

This naming convention is pretty unclear. Java's Object.hashCode function varies wildly based on JVM, object, and implementation. Many objects explicitly override it. Also, the example hash given is super confusing, as Object.hashCode only ever returns an int value, so hex encoded is probably not how you're going to see it in the wild. I am a little confused here, because the default Object.hashCode guarantees no uniqueness. I would imagine there will be a ton of collisions here, and most of those would not be the value you're looking for.

FYI: the Java String hashCode algorithm is documented: https://docs.oracle.com/javase/7/docs/api/java/lang/String.html - any JVM which doesn't implement this is a bug.

Object.hashCode is the implementation specific one, but default object hashcodes are pretty useless, and not something you can crack, because they are basically the address of the object. I agree it should be called "Java String hashCode" instead of "Java Object hashCode"

jsteube closed this as completed Feb 21, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DJB2 Algorithm #1925

DJB2 Algorithm #1925

waleedyaseen commented Feb 20, 2019 •

edited

jsteube commented Feb 20, 2019

jsteube commented Feb 20, 2019

waleedyaseen commented Feb 20, 2019 •

edited

jsteube commented Feb 20, 2019

waleedyaseen commented Feb 20, 2019 •

edited

jsteube commented Feb 20, 2019

waleedyaseen commented Feb 20, 2019 •

edited

jsteube commented Feb 20, 2019

waleedyaseen commented Feb 20, 2019 •

edited

jsteube commented Feb 21, 2019

jsteube commented Feb 21, 2019

waleedyaseen commented Feb 21, 2019

jsteube commented Feb 21, 2019

jsteube commented Feb 22, 2019

Plazmaz commented Jun 17, 2020 •

edited

immibis commented Dec 19, 2020

DJB2 Algorithm #1925

DJB2 Algorithm #1925

Comments

waleedyaseen commented Feb 20, 2019 • edited

Description

Usage

Implementation

Example

jsteube commented Feb 20, 2019

jsteube commented Feb 20, 2019

waleedyaseen commented Feb 20, 2019 • edited

jsteube commented Feb 20, 2019

waleedyaseen commented Feb 20, 2019 • edited

jsteube commented Feb 20, 2019

waleedyaseen commented Feb 20, 2019 • edited

jsteube commented Feb 20, 2019

waleedyaseen commented Feb 20, 2019 • edited

jsteube commented Feb 21, 2019

jsteube commented Feb 21, 2019

waleedyaseen commented Feb 21, 2019

jsteube commented Feb 21, 2019

jsteube commented Feb 22, 2019

Plazmaz commented Jun 17, 2020 • edited

immibis commented Dec 19, 2020

waleedyaseen commented Feb 20, 2019 •

edited

waleedyaseen commented Feb 20, 2019 •

edited

waleedyaseen commented Feb 20, 2019 •

edited

waleedyaseen commented Feb 20, 2019 •

edited

waleedyaseen commented Feb 20, 2019 •

edited

Plazmaz commented Jun 17, 2020 •

edited