
Humanify

Deobfuscate JavaScript code using LLMs ("AI")

This tool uses large language models (like ChatGPT & llama2) and other tools to deobfuscate, unminify, transpile, decompile and unpack JavaScript code. Note that LLMs don't perform any structural changes – they only provide hints to rename variables and functions. The heavy lifting is done by Babel at the AST level to ensure the code stays 1:1 equivalent.
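
To make that division of labor concrete, here is a minimal sketch of a rename-only Babel pass. The parse, traverse, generate and scope.rename calls are standard Babel APIs, but suggestName and renameBindings are made-up names and the LLM call is stubbed out – this is an illustration of the idea, not Humanify's actual internals:

import { parse } from "@babel/parser";
import traverse from "@babel/traverse";
import generate from "@babel/generator";

async function suggestName(currentName) {
  // Stub: a real implementation would send the surrounding code to ChatGPT or
  // llama2 and return its suggestion, e.g. "a" -> "splitString".
  const examples = { a: "splitString", e: "inputString", t: "chunkSize" };
  return examples[currentName] ?? currentName;
}

async function renameBindings(source) {
  const ast = parse(source);
  const bindings = [];

  // Walk every scope and collect the bindings it declares.
  traverse(ast, {
    Scopable(path) {
      for (const name of Object.keys(path.scope.bindings)) {
        bindings.push({ scope: path.scope, name });
      }
    },
  });

  // scope.rename updates the declaration and every reference, so only the
  // names change – the program's behavior stays identical.
  for (const { scope, name } of bindings) {
    const newName = await suggestName(name);
    if (newName !== name) scope.rename(name, newName);
  }

  return generate(ast).code;
}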

➡️ Check out the introduction blog post for an in-depth explanation!

Example

Given the following minified code:

function a(e,t){var n=[];var r=e.length;var i=0;for(;i<r;i+=t){if(i+t<r){n.push(e.substring(i,i+t))}else{n.push(e.substring(i,r))}}return n}

The tool will output a human-readable version:

function splitString(inputString, chunkSize) {
  var chunks = [];
  var stringLength = inputString.length;
  var startIndex = 0;
  for (; startIndex < stringLength; startIndex += chunkSize) {
    if (startIndex + chunkSize < stringLength) {
      chunks.push(inputString.substring(startIndex, startIndex + chunkSize));
    } else {
      chunks.push(inputString.substring(startIndex, stringLength));
    }
  }
  return chunks;
}

🚨 NOTE: 🚨

Large files may take some time to process and can use a lot of tokens if you use ChatGPT. For a rough estimate, the tool uses about 2 tokens per character of the file:

echo "$((2 * $(wc -c < yourscript.min.js)))"

For reference: a minified bootstrap.min.js would cost about $0.50 to un-minify using ChatGPT.
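
If you want a rough dollar figure for your own file before running the tool, the same two-tokens-per-character estimate can be applied in a few lines of Node. This is only a back-of-the-envelope sketch: estimateCost is a hypothetical helper (not part of Humanify), and the price constant is a placeholder – check OpenAI's current rates for the model you actually use:

import { statSync } from "node:fs";

// Rough estimate: ~2 tokens per character (same as the shell one-liner above).
const TOKENS_PER_CHAR = 2;

// Placeholder price; substitute the current rate for your model.
const ASSUMED_USD_PER_1K_TOKENS = 0.002;

function estimateCost(path) {
  const chars = statSync(path).size;
  const tokens = chars * TOKENS_PER_CHAR;
  return { tokens, usd: (tokens / 1000) * ASSUMED_USD_PER_1K_TOKENS };
}

console.log(estimateCost("yourscript.min.js"));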

Using the --local flag is of course free, but it may take more time, be less accurate, and may not be possible with your existing hardware.

Getting started

First install the dependencies:

npm install

Next you'll need to decide whether to use ChatGPT or llama2. In a nutshell:

  • ChatGPT
    • Runs on someone else's computer that's specifically optimized for this kind of thing
    • Costs money depending on the length of your code
    • Is more accurate
    • Is (probably) faster
  • llama2
    • Runs locally
    • Is free
    • Is less accurate
    • Needs a local GPU with ~60 GB of RAM (an M1 Mac works just fine)
    • Runs as fast as your GPU does

See instructions below for each option:

ChatGPT

You'll need an OpenAI API key. You can get one by signing up at https://openai.com/.

There are several ways to provide the API key to the tool:

echo "OPENAI_TOKEN=your-token" > .env && npm start --  -o deobfuscated.js obfuscated-file.js
export OPENAI_TOKEN="your-token" && npm start --  -o deobfuscated.js obfuscated-file.js
OPENAI_TOKEN=your-token npm start --  -o deobfuscated.js obfuscated-file.js
npm start -- --key="your-token"  -o deobfuscated.js obfuscated-file.js

Use your preferred way to provide the API key. Use npm start -- --help to see all available options.

llama2

Prerequisites:

  • You'll need to have a Python 3 environment with conda installed.
  • You need a Hugging Face account with access to the llama-2-7b-chat-hf model. Make sure to read the instructions on the model page about how to access the model.

Run the following command to install the required Python packages and activate the environment:

conda env create -f environment.yaml
conda activate base

You can now run the tool with:

npm start -- --local -o deobfuscated.js obfuscated-file.js

Note: this downloads ~13 GB of model data to your computer on the first run.

Features

The main features of the tool are listed below; a conceptual sketch of how they fit together follows the list:

  • Uses ChatGPT functions/llama2 to get smart suggestions for renaming variables and functions
  • Uses custom and off-the-shelf Babel plugins to perform AST-level unmangling
  • Uses Webcrack to unbundle Webpack bundles
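
Conceptually these pieces form a pipeline: unbundle, clean up the AST, then rename using the LLM's suggestions. The sketch below only illustrates that ordering – every helper is a hypothetical stub, not Humanify's actual module layout:

// Illustrative pipeline only; all helpers below are hypothetical stubs.
async function unbundleWithWebcrack(source) {
  // A real implementation would call Webcrack to split a Webpack bundle
  // back into its individual modules.
  return [{ code: source }];
}

async function applyBabelUnmangling(code) {
  // A real implementation would run custom and off-the-shelf Babel plugins
  // that simplify the AST without changing behavior.
  return code;
}

async function renameWithLlm(code) {
  // A real implementation would ask ChatGPT or llama2 for better names and
  // apply them with Babel's scope-aware rename.
  return code;
}

async function humanifyFile(source) {
  const modules = await unbundleWithWebcrack(source);     // 1. unbundle
  const output = [];
  for (const mod of modules) {
    const cleaned = await applyBabelUnmangling(mod.code); // 2. AST clean-up
    output.push(await renameWithLlm(cleaned));            // 3. rename only
  }
  return output;
}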

Contributing

If you'd like to contribute, please fork the repository and use a feature branch. Pull requests are warmly welcome.

Licensing

The code in this project is licensed under the MIT license.
