Initial-One/Java-humanify

Java Humanify

Deobfuscate Java code using LLMs (ChatGPT, Ollama, DeepSeek, etc.)

Java Humanify uses large language models (OpenAI, DeepSeek, Ollama, etc.) to suggest better names for classes, methods, fields, and locals.
All actual code changes happen on the AST (JavaParser + Symbol Solver), so the output stays semantically 1:1 with the input.

It also supports automatic Javadoc generation for classes, constructors, and methods via the annotate command (LLM or offline heuristics).

Inspired by HumanifyJS, but aimed at Java sources decompiled from bytecode.


Why this exists

Decompiled / minified / obfuscated Java is painful to read:

package demo.mix;public final class a{private static final int[] O={0,1,1,2};private a(){}public static int h(String s){long x=0x811c9dc5L;if(s==null)return 0;int i=0,n=s.length(),j=O[2];while(i<n){char c=s.charAt(i++);x^=c;x*=0x01000193L;x&=0xffffffffL;j^=(c<<1);j^=j>>>7;if((i&3)==0)x^=(j&0xff);}return (int)x;}}

Java Humanify renames identifiers and (optionally) adds Javadoc:

package demo.mix;

/**
 * Computes a 32-bit hash for the input string using FNV-1a with additional state mixing.
 */
public final class HashCalculator {

    private static final int[] O = { 0, 1, 1, 2 };

    /**
     * Private constructor to prevent instantiation of this utility class.
     */
    private HashCalculator() {}

    /**
     * Calculates a 32-bit hash value for the input string using FNV-1a with additional state mixing.
     *
     * @param inputString parameter
     * @return return value
     */
    public static int calculateHash(String inputString) {
        long storedValue = 0x811c9dc5L;
        if (inputString == null) return 0;
        int index = 0, stringLength = inputString.length(), hashState = O[2];
        while (index < stringLength) {
            char currentChar = inputString.charAt(index++);
            storedValue ^= currentChar;
            storedValue *= 0x01000193L;
            storedValue &= 0xffffffffL;
            hashState ^= (currentChar << 1);
            hashState ^= hashState >>> 7;
            if ((index & 3) == 0) storedValue ^= (hashState & 0xff);
        }
        return (int) storedValue;
    }
}

LLMs do not touch your code structure.
They only propose names and comments. Renaming is applied on the AST with symbol resolution, and constructors, imports, and file names are kept in sync.
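One way to sanity-check the 1:1 claim on the example above is to run the obfuscated method and its renamed version side by side. The sketch below copies both method bodies from the example; the wrapper class `RenameEquivalenceCheck`, the helper `agreeOn`, and the sample inputs are assumptions made for this illustration and are not part of Java Humanify.

```java
// Sketch: checks that the obfuscated method and its renamed version agree.
// Both bodies come from the example above; the wrapper class, the agreeOn
// helper, and the sample inputs are hypothetical.
final class RenameEquivalenceCheck {

    private static final int[] O = {0, 1, 1, 2};

    // Original obfuscated form (a.h), reformatted only for readability.
    static int h(String s) {
        long x = 0x811c9dc5L;
        if (s == null) return 0;
        int i = 0, n = s.length(), j = O[2];
        while (i < n) {
            char c = s.charAt(i++);
            x ^= c;
            x *= 0x01000193L;
            x &= 0xffffffffL;
            j ^= (c << 1);
            j ^= j >>> 7;
            if ((i & 3) == 0) x ^= (j & 0xff);
        }
        return (int) x;
    }

    // Renamed form (HashCalculator.calculateHash).
    static int calculateHash(String inputString) {
        long storedValue = 0x811c9dc5L;
        if (inputString == null) return 0;
        int index = 0, stringLength = inputString.length(), hashState = O[2];
        while (index < stringLength) {
            char currentChar = inputString.charAt(index++);
            storedValue ^= currentChar;
            storedValue *= 0x01000193L;
            storedValue &= 0xffffffffL;
            hashState ^= (currentChar << 1);
            hashState ^= hashState >>> 7;
            if ((index & 3) == 0) storedValue ^= (hashState & 0xff);
        }
        return (int) storedValue;
    }

    // True when both versions return the same value for every input.
    static boolean agreeOn(String... inputs) {
        for (String input : inputs) {
            if (h(input) != calculateHash(input)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOn(null, "", "a", "hello", "java-humanify"));
    }
}
```

A spot check like this is not a proof of equivalence, but it is a cheap regression test to keep next to a renamed utility class.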


Highlights

  • Pluggable LLMs for rename & docs: OpenAI / DeepSeek / Local (Ollama, OpenAI-compatible)
  • Auto-Javadoc via annotate:
    • Targets classes/enums/records/constructors/methods
    • Short, safe summaries (no wild guesses)
    • Auto @param/@return/@throws based on signatures
    • Offline heuristics available (no API key needed)
  • Signature-accurate renames (classFqn / methodSig / fieldFqn with fallbacks)
  • Robust AST transforms keep code compiling
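To illustrate what "signature-accurate renames with fallbacks" could look like, here is a minimal sketch of a lookup that tries the most specific key first and then looser ones. The key formats (`classFqn.name(paramTypes)`, `classFqn.name`, bare `name`), the class `RenameLookup`, and the sample mapping are all assumptions for illustration; the actual mapping.json layout is not described here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Sketch of a rename lookup with fallbacks. The key formats are hypothetical
// and do not claim to match Java Humanify's real mapping file.
final class RenameLookup {

    // Tries the full signature first, then class-qualified name, then bare name.
    static Optional<String> resolve(Map<String, String> mapping,
                                    String classFqn, String name, String paramTypes) {
        String[] keys = {
            classFqn + "." + name + "(" + paramTypes + ")",
            classFqn + "." + name,
            name
        };
        for (String key : keys) {
            String renamed = mapping.get(key);
            if (renamed != null) return Optional.of(renamed);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("demo.mix.a.h(java.lang.String)", "calculateHash");
        mapping.put("demo.mix.a.g", "reset");

        // Exact signature match.
        System.out.println(resolve(mapping, "demo.mix.a", "h", "java.lang.String").orElse("h"));
        // Falls back to the class-qualified name.
        System.out.println(resolve(mapping, "demo.mix.a", "g", "int").orElse("g"));
        // No suggestion: the original name is kept.
        System.out.println(resolve(mapping, "demo.mix.a", "z", "").orElse("z"));
    }
}
```

Preferring the full signature keeps overloaded methods from being renamed to the same suggestion by accident; the looser keys only apply when nothing more specific exists.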

Quick start

One-shot pipeline (analyze → suggest → apply → annotate)

# OpenAI example
export OPENAI_API_KEY=sk-xxxx
java -jar target/java-humanify-*.jar humanify \
--provider openai \
--model gpt-4o-mini \
samples/src samples/out
# DeepSeek example
export DEEPSEEK_API_KEY=sk-xxxx
java -jar target/java-humanify-*.jar humanify \
--provider deepseek \
--model deepseek-chat \
samples/src samples/out
# Local (Ollama) example
# make sure the model is pulled first: ollama pull llama3.1:8b
java -jar target/java-humanify-*.jar humanify \
--provider local \
--local-api ollama \
--endpoint http://localhost:11434 \
--model llama3.1:8b \
samples/src samples/out

humanify runs four steps in order:

  1. analyze
  2. suggest
  3. apply
  4. annotate

Use --lang/--style/--overwrite to control the annotate step.

CLI commands

You can also run steps individually.

1) Analyze

java -jar java-humanify.jar analyze \
<srcDir> <snippets.json> \
[--maxBodyLen 1600] \
[--includeStrings true] \
[--exclude "glob/**"]

Outputs snippets.json with per-method code & metadata (package, FQN, signature, strings).


2) Suggest

java -jar java-humanify.jar suggest \
<snippets.json> <mapping.json> \
[--provider dummy|openai|deepseek|local] \
[--model gpt-4o-mini|deepseek-chat|<local-model>] \
[--batch 12] [--endpoint http://localhost:11434] \
[--local-api ollama|openai] \
[--timeout-sec 180] 

Auth options: OpenAI uses OPENAI_API_KEY, DeepSeek uses DEEPSEEK_API_KEY, and local providers need --endpoint and --model.
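The --batch option suggests that snippets are sent to the model in groups. A minimal sketch of that grouping, assuming a simple fixed-size split, is shown below; the class `BatchSplitter` and its `partition` method are hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of fixed-size batching, as a --batch style option might do it.
// This helper is hypothetical, not Java Humanify's implementation.
final class BatchSplitter {

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> snippets = new ArrayList<>();
        for (int i = 0; i < 30; i++) snippets.add("snippet-" + i);

        // 30 snippets with a batch size of 12 give batches of 12, 12, and 6.
        for (List<String> batch : partition(snippets, 12)) {
            System.out.println(batch.size());
        }
    }
}
```

Larger batches mean fewer requests but longer prompts, so the batch size is a trade-off between request overhead and the model's context limit.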


3) Apply

java -jar java-humanify.jar apply \
<srcDir> <mapping.json> <outDir> \
[--classpath jarOrDir[:morePaths]]

Renames classes, constructors, imports, new expressions, methods, fields, and locals with conflict checks, and updates file names.
--classpath helps the Symbol Solver resolve types that live outside <srcDir>.
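As a rough illustration of what a conflict check can involve, the sketch below flags proposed renames that would collide with an identifier already in scope, or with a new name claimed by an earlier rename. The class `RenameConflictCheck`, its rules, and the sample names are assumptions; the tool's real checks work on the AST and may differ.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Sketch of a rename conflict check within one scope. Hypothetical helper;
// Java Humanify's actual checks are not shown in this document.
final class RenameConflictCheck {

    // Returns the old names whose proposed new name cannot be applied safely.
    static Set<String> findConflicts(Map<String, String> proposed, Set<String> existingNames) {
        Set<String> conflicts = new TreeSet<>();
        Map<String, String> claimedBy = new HashMap<>(); // new name -> first old name
        for (Map.Entry<String, String> entry : proposed.entrySet()) {
            String oldName = entry.getKey();
            String newName = entry.getValue();
            // The new name is already used by a different identifier in scope.
            boolean clashesWithExisting =
                    existingNames.contains(newName) && !newName.equals(oldName);
            // Another rename processed earlier already took this new name.
            String earlier = claimedBy.putIfAbsent(newName, oldName);
            if (clashesWithExisting || earlier != null) {
                conflicts.add(oldName);
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        Set<String> existing = new TreeSet<>(Set.of("x", "n", "j", "count"));
        Map<String, String> proposed = new LinkedHashMap<>();
        proposed.put("x", "index");
        proposed.put("n", "count"); // clashes with the existing "count"
        proposed.put("j", "index"); // "index" was already claimed by x

        System.out.println(findConflicts(proposed, existing)); // prints [j, n]
    }
}
```

A rename flagged this way can be skipped or given a suffix; either choice keeps the output compiling, which matters more than any single name.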


4) Annotate (NEW)

java -jar java-humanify.jar annotate \
--src <dir[,dir2,...]> \
[--lang en|zh] \
[--style concise|detailed] \
[--overwrite] \
[--provider dummy|openai|deepseek|local] \
[--model <model-name>] \
[--local-api ollama|openai] \
[--endpoint http://localhost:11434] \
[--timeout-sec 180] \
[--batch 12]
  • --provider dummy → offline heuristics (no network)
  • OpenAI: OPENAI_API_KEY
  • DeepSeek: DEEPSEEK_API_KEY
  • Local:
    • --local-api ollama with --endpoint http://localhost:11434
    • --local-api openai with an OpenAI-compatible endpoint (e.g. http://localhost:1234/v1)
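To give a feel for what offline heuristics can produce without any model, here is a minimal sketch that derives a Javadoc block from a method name and signature. The class `HeuristicDoc`, the camelCase splitting rule, and the placeholder tag texts are assumptions for illustration (the tag wording mirrors the sample output earlier in this README); the dummy provider's real rules are not documented here.

```java
import java.util.Locale;

// Sketch of signature-based Javadoc generation with no LLM involved.
// Hypothetical helper; not the dummy provider's actual implementation.
final class HeuristicDoc {

    // Turns a camelCase method name into a one-line summary,
    // for example "calculateHash" becomes "Calculate hash."
    static String summarize(String methodName) {
        String[] words = methodName.split("(?<=[a-z0-9])(?=[A-Z])");
        StringBuilder sb = new StringBuilder();
        for (String word : words) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(word.toLowerCase(Locale.ROOT));
        }
        if (sb.length() == 0) return "";
        sb.setCharAt(0, Character.toUpperCase(sb.charAt(0)));
        return sb.append('.').toString();
    }

    // Builds a Javadoc comment with @param and @return tags from the signature.
    static String javadoc(String methodName, String returnType, String... paramNames) {
        StringBuilder sb = new StringBuilder("/**\n * ")
                .append(summarize(methodName)).append('\n');
        boolean returnsValue = !"void".equals(returnType);
        if (paramNames.length > 0 || returnsValue) sb.append(" *\n");
        for (String param : paramNames) {
            sb.append(" * @param ").append(param).append(" parameter\n");
        }
        if (returnsValue) sb.append(" * @return return value\n");
        return sb.append(" */").toString();
    }

    public static void main(String[] args) {
        System.out.println(javadoc("calculateHash", "int", "inputString"));
    }
}
```

Because everything is derived from the signature, the output is predictable and free, but it cannot describe intent the way an LLM-written summary can.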

5) Humanify (pipeline)

java -jar java-humanify.jar humanify \
[provider/model/annotate opts...] \
<srcDir> <outDir>

Creates an intermediate _pass1 output (after class renames), then writes the final output to <outDir>, and finally annotates it.

Common annotate flags via humanify:

  • --lang en|zh
  • --style concise|detailed
  • --overwrite

Performance & Costs

  • The cost of the rename "suggest" step depends on how much source text is sent to the LLM; small projects are cheap.
  • Local models are free but typically slower and less accurate.
  • annotate --provider dummy is offline & free; LLM doc quality is better but costs tokens.

Contributing

Issues and PRs are welcome! Please:

  • Use feature branches
  • Keep changes small & tested
  • Follow existing code style

License

This project is licensed under the Apache-2.0 License.
See LICENSE for details.


Appendix: CLI help (short)

java -jar java-humanify.jar analyze  <srcDir> <snippets.json> [opts]
java -jar java-humanify.jar suggest  <snippets.json> <mapping.json> [opts]
java -jar java-humanify.jar apply    <srcDir> <mapping.json> <outDir> [--classpath ...]
java -jar java-humanify.jar annotate --src <dir[,dir2,...]> [--lang/--style/--overwrite ...]
java -jar java-humanify.jar humanify <srcDir> <outDir> [provider/model/annotate opts...]
