Proudly Made in Nebraska. Go Big Red! 🌽 https://xkcd.com/2347/
Canonical home is Codeberg. The repo is also mirrored on GitHub for CI (Actions), Sigstore OIDC signing, and the GHSA security-advisory CNA path. Both forges carry the same commits.
Previously "Ghidra: Top 42 Principal-Architect Recommendations" ("GT42PAR" didn't have the panache "GayHydra" does! 🏳️‍🌈)
Audit date: 2026-05-21. Repo: NationalSecurityAgency/ghidra @ master (94164bd6e9).
Findings drawn from: 1,553 open issues, 335 open PRs, ~15.5k Java files, 187k LOC C++ decompiler, 39 processor specs, and the actual CI workflows.
Each rec is filed as a GitHub issue (#1–#42). PRs that address a rec close the issue with `closes #N` and tick the checkbox below.
1. Fix the PR graveyard. 77% of open PRs (257/335) have never received a single maintainer comment; 67% are still labeled `Status: Triage`, including 4+ year old PRs. This is the single biggest pathology in the project. Nothing else on this list matters as much. → PR_QUEUE_POLICY.md
2. Adopt an explicit triage SLA. Commit publicly to first-response-within-N-days. "First response" can be "rejected, here's why"; that is infinitely better than silence. The current implicit policy ("we'll get to it") is corroding contributor trust. → TRIAGE_SLA.md
3. Close the door on dead PRs. ~76 PRs are >4yr old. Bulk-close with a respectful template ("we appreciate this work but cannot review it; please reopen if you'd like to rebase against current master and re-submit under [new lane X]"). A clean queue is honest; a 335-deep queue is theater. → STALE_POLICY.md, stale.yml
4. Create a "processor submission" fast-lane. 54% of open PRs (180) are processor/Sleigh additions. This is a structurally distinct kind of review (correctness against ISA docs, not architectural fit). It needs a dedicated reviewer rotation, an explicit checklist, and a separate queue. Mixing it with framework work is why both queues stall. → PROCESSOR_LANE.md, PR template
5. Create a separate lane for decompiler correctness fixes. PRs like #6718 (shifted struct-offset loop bug) age the same as 13k-line processor submissions. Correctness regressions in the crown jewel should have an expedited path. → DECOMPILER_CORRECTNESS_LANE.md
6. Adopt an RFC process for mega-PRs. #4103 (WebAssembly, 4.2yr, +15,387 LOC) and #5778 (RISC-V vector/crypto, 2.7yr, +6,676) cannot be reviewed line-by-line against a closed-development model. Require a design RFC first, then merge in small landings. Today these PRs are humanitarian disasters for the contributors who wrote them. → RFC_PROCESS.md, template
7. Publish the bus factor. Comment analysis shows maintainer engagement concentrates on ~9 logins (`ryanmkurtz`, `ghidra1`, `emteere`, `dragonmacher`, `pjsoberoi`, `mumbel`, `GhidorahRex`, `Sleigh-InSPECtor`, `nsadeveloper789`). Make that explicit. Community contributors (`jobermayr`, `astrelsky`, `LukeSerne`, `nneonneo`) are doing unpaid triage with no recognition or commit bit; formalize this. → MAINTAINERS.md
8. Reform the "Status: Triage" label. 184 issues and 226 PRs sit in Triage. The label has become a synonym for "we have looked at this exactly zero times." Either remove it or wire it to an SLA. → LABEL_POLICY.md
9. Resolve #4103 WebAssembly definitively. 46 comments, 4.2 years, the single most-asked-for feature in the queue. Land, fork, or close. Limbo is the worst outcome, and it has been the outcome for four years. → decision 0001
10. Resolve #5778 RISC-V vector/bitmanip/crypto. Labeled `Waiting on customer` for years. Convert "waiting" into a yes or a no. RISC-V is no longer a research ISA; analyzing modern firmware without these extensions is an increasingly visible gap. → decision 0002
11. Publish SECURITY.md. None exists. For a tool that parses adversary-controlled binaries and ships a network server, this is below baseline. Document private disclosure address, embargo policy, and CVE assignment path. LLVM and binutils do this; Ghidra should too. → SECURITY.md
12. Issue public CVE IDs. Internal `GP-*` tracker IDs (GP-6832, GP-6719, GP-258) hide security fixes from NVD. Downstream packagers and enterprise security teams cannot patch what they cannot see. Recent server-side hardening (path-traversal in username validation, race in writeUserList, RMI deserialization filter) deserved CVEs and didn't get them. → CVE_POLICY.md
13. Land an OSS-Fuzz integration for the C++ decompiler. Zero public fuzz harness. 187k LOC of C++ parsing untrusted p-code/Sleigh input with 2,740 raw `new`/`delete` sites. This is the highest-EV security investment in the codebase. Google OSS-Fuzz is free. → OSS_FUZZ.md, harnesses, oss-fuzz project
14. Land fuzz harnesses for file-format loaders. ELF, PE, Mach-O, DEX, PDB, DWARF, COFF: every loader parses attacker-controlled input. No public fuzzing exists for any of them. Start with the most-used three. → LOADER_FUZZING.md, harnesses
15. Build ASAN/UBSAN CI variants of the decompiler. The Makefile has the flags commented out (`decompile/cpp/Makefile:256`). Enable them in a nightly CI job. With ~2,700 raw allocations, this will surface real bugs immediately. → Makefile target `test_san`, workflow, doc
16. Ship an explicit script sandbox option. `GhidraScript` (Java) and `PyGhidraScriptProvider` (Python) run with full JVM privileges including reflection and arbitrary classloading. There is no opt-in sandbox mode. Headless mode + a malicious script in a shared directory = full code execution. At minimum, add a "trusted scripts only" mode that refuses to run scripts outside a signed/configured allowlist. → SCRIPT_SANDBOX.md
17. Sign released decompiler binaries and update the doc. The decompiler ships as a native executable; binary distribution integrity is a real supply-chain question. Document the verification path. → BINARY_SIGNING.md
18. Investigate #1481 (data-type archive deserialization amplification). It is the only CWE-tagged open security issue. The archive/extension format deserialization path deserves a principal-level review across the whole pipeline, not a point fix. → review doc
19. Audit Java deserialization sites end-to-end. 20+ files use raw `ObjectInputStream.readObject()` (FileSystem item storage, RepositoryItem, Version, etc.). Recent GP-6719 added an RMI filter; extend the same allowlist discipline to the non-RMI sites. There should be exactly one approved deserialization helper. → JAVA_DESERIALIZATION_AUDIT.md
20. Document and ship a `serial.filter` for the desktop client by default. Recent work added it (`Framework/FileSystem/data/client.rmi.serial.filter`); make sure it's on by default and that the allowlist is regression-tested when classes are renamed. → launch.properties, GhidraSerialFilterDefaultTest.java
21. Add SBOM generation to release. Gradle dependency locking is in place, which is good. But there's no published SBOM. Generate CycloneDX or SPDX as part of `buildGhidra` and attach it to releases. Enterprise consumers increasingly require this. → SBOM.md, sbom.gradle
22. Run tests in CI. The single CI workflow (`build-ghidra.yml`) calls `./gradlew buildGhidra` and exits. No `test`, no `integrationTest`, no decompiler unit tests. For a project this size and reach, this is the most surprising finding in the audit. JaCoCo is wired up locally, but the coverage data never reaches a dashboard. → build-ghidra.yml
23. Multi-OS CI matrix. The decompiler builds on Linux, macOS, and Windows; CI only tests Linux. Native bugs on Windows/macOS are caught by users, not maintainers. Run at minimum a build+smoke matrix on all three. → build-ghidra.yml
24. Build and run the C++ decompiler unit tests in CI. Seven `unittests/*.cc` files plus 84 XML data-driven tests exist and don't run in CI. Wire them in. → decompiler-cpp-tests.yml
25. Re-enable `-Xlint`. `gradle/javaProject.gradle` currently passes `-Xlint:none`, suppressing every javac warning across the codebase. This is hiding genuine bugs. Re-enable incrementally: start with `-Xlint:deprecation,unchecked` and ratchet. → javaProject.gradle, XLINT_RATCHET.md
26. Add static analysis. No SpotBugs, ErrorProne, Checkstyle, or Sonar configuration anywhere in the tree. ErrorProne is the cheapest win: it integrates as a javac plugin and catches a known class of mistakes (mutable-collection-as-key, missing override, etc.). → errorprone.gradle, ERRORPRONE.md
27. Adopt Mockito (or an equivalent). Tests rely entirely on JUnit 4 + Hamcrest with no mocking framework. Result: test setup is heavyweight (real Swing, real programs), and modules with hard dependencies (Project: 404 src files / 21 tests, SoftwareModeling: 1631 / 128) go untested because mocking is too painful. → javaProject.gradle, MOCKITO_ADOPTION.md
28. Triage `@Ignore` debt. Ignored tests in `x86AssemblyTest`, `dsPIC30FAssemblyTest`, `ARMAssemblyTest`, `x64AssemblyTest`, `SymbolPathParserTest`, `CharsetInfoManagerTest`. Each ignored test is a frozen bug report. Either fix or delete. → IGNORE_TEST_POLICY.md
29. Migrate to JUnit 5. JUnit 4.13.2 is fine, but JUnit 5 unlocks parameterized tests, conditional execution, and parallel run modes that would meaningfully accelerate the integration suite. → JUNIT5_MIGRATION.md
30. Decouple Swing from integration tests. Driving JFrame/FieldPanel directly (see `AbstractDecompilerTest`) makes tests slow, flaky on CI, and impossible on headless containers. Introduce a headless view layer for tests. → HEADLESS_TEST_LAYER.md
31. Begin a phased migration from raw `new`/`delete` to RAII/smart pointers. ~2,740 raw allocation sites across 187k LOC; one (1) `unique_ptr` import in the entire codebase. C++11 has been available since 2011. This isn't about style; it's about exception-safety and use-after-free risk on malformed input (and "use-after-free in Sleigh decompiler backend" is literally in the recent commit history, GP-37838c180a). → RAII_MIGRATION.md
32. Adopt C++20. The codebase is on `-std=c++11`. The decompiler community has moved on. `std::span`, `std::format`, ranges, and concepts are all directly applicable (`std::expected` would additionally need C++23). Bump in two steps (first C++14, then C++20) with CI on three platforms. → CPP20_ADOPTION.md
33. Version the Java-to-native IPC protocol. Custom byte-framing (`{0,0,1,X}` magic markers, `DecompileProcess.java:54-61`) has no schema version, no CRC, no graceful resync. One byte of corruption kills the decompiler process. Add a version handshake and a length-prefix-with-CRC frame. → IPC_VERSIONING.md
34. Replace the IPC protocol with a schema (FlatBuffers/Cap'n Proto). Bigger investment than #33, but solves the recurring "decompiler crashed" UX papercut at the root, enables differential testing across versions, and makes a non-Java host viable (PyGhidra, Rust frontend, etc.). → IPC_SCHEMA.md
35. Bound decompilation time and memory. Issues #5730 (huge-function UX) and #8429 (decompiler perf), and PR #9179 (bounded parallel decompiler), all converge on this. The decompiler should never hang the UI. Enforce a hard wall-clock + RSS budget per function with partial-result return. → DECOMPILER_BUDGETS.md
36. Stop flushing the decompiler cache on trivial edits (#1871). 26 reactions, sitting open. The fix is plausibly small relative to its UX impact. → CACHE_FLUSH_1871.md
37. Improve C++ / vtable handling as a coordinated roadmap, not point fixes. #516 (49 👍), #992, and related issues show the community loudly asking for first-class C++ analysis. IDA/Hex-Rays has it. This is a strategic, not tactical, gap. Spec a C++ frontend RFC. → RFC 0001
38. Tackle variable-naming-across-scopes (#975, 53 👍). The single most-upvoted open issue. Touches the symbol/scope model end to end; needs design, not a patch. Worth principal-architect time. → RFC 0002
39. Detect typical `for` loops and inline functions (#644, #2376, #4461). Decompiler output that hides the loop induction variable behind a `while + counter` is the single most common "this looks worse than IDA" complaint. Inline-aware analysis (#2376, #4461) is the same complaint one level deeper. → FOR_LOOP_INLINE_DETECTION.md
40. Formalize Sleigh semantics; add a Sleigh fuzzer. Sleigh has an HTML reference manual, no formal grammar, no semantic model, no fuzz harness. With 39 processor specs and 21k lines of `.slaspec`, silent codegen bugs in stale processors (PowerPC unchanged since 2019, 8051 since 2019, M8C since 2019) are inevitable and undetectable. Differential fuzzing against canonical ISA test vectors is the right answer. → SLEIGH_FORMAL_AND_FUZZ.md
41. Establish a processor-maintenance policy. Each `Ghidra/Processors/<arch>/` should have a named maintainer (community is fine), a test corpus, and an "orphaned after N years inactive" marker so users know what to trust. Right now everything looks equally maintained from the README, and that's not true. → Processors/MAINTAINERS.md
42. Modernize the build's Python story. Jython is deprecated in favor of PyGhidra but still ships in `Extensions/Jython`. Pick a date, announce, remove. Two Python paths are worse than either one alone, and the test suite for PyGhidra is correspondingly thin (the recurring Debugger PR cluster, #8978 etc., is partly Python-stack churn). → decision 0003
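To make recommendation 33 concrete, here is a minimal sketch of a versioned, length-prefixed frame with a CRC32 trailer. Everything in it is an assumption for illustration: the `MAGIC` bytes, `PROTOCOL_VERSION`, the field widths, and the function names are hypothetical and do not describe the existing decompiler protocol, which the audit characterizes only as `{0,0,1,X}` markers with no version or checksum.

```python
import struct
import zlib

# Hypothetical frame layout (NOT the existing Ghidra wire format):
#   magic (4 bytes) | version (u16) | payload length (u32) | payload | CRC32 (u32)
MAGIC = b"GHIP"            # assumed marker, chosen for illustration only
PROTOCOL_VERSION = 1       # assumed version number
_HEADER = struct.Struct(">4sHI")
_TRAILER = struct.Struct(">I")


class FrameError(Exception):
    """Raised when a frame is truncated, corrupted, or from an unknown version."""


def encode_frame(payload: bytes) -> bytes:
    """Wrap payload in a versioned, length-prefixed frame with a CRC32 trailer."""
    header = _HEADER.pack(MAGIC, PROTOCOL_VERSION, len(payload))
    # The CRC covers header and payload, so a damaged length or version
    # field is detected as well as a damaged payload.
    crc = zlib.crc32(header + payload) & 0xFFFFFFFF
    return header + payload + _TRAILER.pack(crc)


def decode_frame(frame: bytes) -> bytes:
    """Validate a complete frame and return its payload, or raise FrameError."""
    if len(frame) < _HEADER.size + _TRAILER.size:
        raise FrameError("frame too short")
    magic, version, length = _HEADER.unpack_from(frame, 0)
    if magic != MAGIC:
        raise FrameError("bad magic")
    if version != PROTOCOL_VERSION:
        raise FrameError(f"unsupported protocol version {version}")
    end = _HEADER.size + length
    if len(frame) != end + _TRAILER.size:
        raise FrameError("length prefix does not match frame size")
    (expected_crc,) = _TRAILER.unpack_from(frame, end)
    if zlib.crc32(frame[:end]) & 0xFFFFFFFF != expected_crc:
        raise FrameError("CRC mismatch")
    return frame[_HEADER.size:end]
```

With a frame like this, a corrupted byte surfaces as a `FrameError` that the host can report or retry, rather than as a dead decompiler process. The sketch does not cover resynchronization after a bad frame or the version handshake itself; those would need their own design.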
If you do nothing else from this list, do #1, #2, #11, #13, and #22:
- #1/#2 because the project is bleeding contributor trust faster than it can earn stars,
- #11 because there is no responsible-disclosure process for a tool that ships into thousands of enterprise SOCs,
- #13 because the C++ decompiler is the highest-EV fuzz target in open-source security tooling and nobody is fuzzing it,
- #22 because a project this consequential should not be shipping without running its own tests in CI.
Everything else is downstream of those five.
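For #1 and #2, the check a triage SLA implies is small enough to sketch. The snippet below flags PRs whose first maintainer response is missing or late. The `PullRequest` record, its field names, and the 14-day default are hypothetical; the recommendation deliberately leaves N open, and real data would come from the forge's API rather than hand-built records.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class PullRequest:
    """Hypothetical record; a real one would be filled from the forge API."""
    number: int
    opened_at: datetime
    first_maintainer_response_at: Optional[datetime]  # None = never answered


def sla_breaches(prs: List[PullRequest], now: datetime, sla_days: int = 14) -> List[int]:
    """Return the numbers of PRs whose first response missed the SLA."""
    limit = timedelta(days=sla_days)
    breaches = []
    for pr in prs:
        # An unanswered PR is measured against "now"; an answered one
        # against the time of its first maintainer response.
        responded = pr.first_maintainer_response_at or now
        if responded - pr.opened_at > limit:
            breaches.append(pr.number)
    return breaches


if __name__ == "__main__":
    utc = timezone.utc
    now = datetime(2026, 5, 21, tzinfo=utc)
    sample = [
        PullRequest(101, datetime(2026, 5, 15, tzinfo=utc), None),
        PullRequest(102, datetime(2026, 1, 1, tzinfo=utc), None),
        PullRequest(103, datetime(2026, 1, 1, tzinfo=utc), datetime(2026, 1, 5, tzinfo=utc)),
    ]
    print(sla_breaches(sample, now))
```

A "rejected, here's why" comment counts as a response here, which matches the intent of the SLA: the measure is silence, not acceptance.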
Issue tracking: filed as #1–#42 at https://codeberg.org/CryptoJones/GayHydra/issues.
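The "trusted scripts only" mode from recommendation 16 could start as a content-hash allowlist. The sketch below is one possible shape, not the design in SCRIPT_SANDBOX.md: the function names are hypothetical, and it uses plain SHA-256 digests rather than signatures, which is an assumption made to keep the example self-contained.

```python
import hashlib
from pathlib import Path
from typing import Set


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def is_script_trusted(path: Path, allowlist: Set[str]) -> bool:
    """True only if the script exists and its digest is on the allowlist."""
    return path.is_file() and sha256_of(path) in allowlist


def require_trusted(path: Path, allowlist: Set[str]) -> Path:
    """Return the path if trusted; otherwise refuse with PermissionError."""
    if not is_script_trusted(path, allowlist):
        raise PermissionError(f"script not on allowlist: {path}")
    return path
```

Hashing the content rather than trusting the file name means a script dropped into a shared directory, or an allowed script that was later modified, is refused. A signed allowlist would additionally need key distribution and revocation, which this sketch leaves out.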
GayHydra is a fork of Ghidra, the software reverse engineering (SRE) framework created and maintained by the National Security Agency Research Directorate. The upstream framework includes a suite of full-featured software-analysis tools that enable users to analyze compiled code on a variety of platforms including Windows, macOS, and Linux. Capabilities include disassembly, assembly, decompilation, graphing, and scripting, along with hundreds of other features, across a wide variety of processor instruction sets and executable formats, in both interactive and automated modes. Users may also develop their own extension components and scripts using Java or Python.
This fork tracks upstream NSA Ghidra and adds the security-hardening, governance, CI, and testing
work documented in the 42 principal-architect recommendations at the top of this README; see the
per-rec links to design docs, policy files, and PRs that land each piece. Java packages, class
names, and the Ghidra/ source tree are deliberately preserved to keep upstream merges clean.
WARNING: There are known security vulnerabilities within certain versions of Ghidra and the forks (including GayHydra) that derive from it. Before proceeding, please read through GayHydra's SECURITY.md for a better understanding of how you might be impacted.
To install a pre-built multi-platform GayHydra release:
- Install JDK 21 64-bit
- Download a GayHydra release file
- NOTE: The multi-platform release file is named `ghidra_<version>_<release>_<date>.zip` (the build artifact retains the upstream `ghidra_` filename prefix so existing tooling keeps working) and is under the "Assets" drop-down. Downloading either of the files named "Source Code" is not correct for this step.
- Extract the GayHydra release file
- NOTE: Do not extract on top of an existing installation
- Launch GayHydra: `./ghidraRun` (`ghidraRun.bat` for Windows)
- or launch PyGhidra: `./support/pyghidraRun` (`support\pyghidraRun.bat` for Windows)
- (the launcher script name stays `ghidraRun` to match upstream)
For additional information and troubleshooting tips about installing and running a GayHydra release, please refer to the Getting Started document which can be found at the root of a GayHydra installation directory.
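If you script the download step, a quick pattern check helps avoid grabbing a "Source Code" archive by mistake. This is a sketch under assumptions: the README only gives the shape `ghidra_<version>_<release>_<date>.zip`, so the character classes below (dotted numeric version, alphanumeric release tag, 8-digit date) are guesses, and the file name in the example is made up.

```python
import re
from typing import Dict, Optional

# Assumed placeholder formats; only the overall
# ghidra_<version>_<release>_<date>.zip shape comes from the README.
RELEASE_ZIP = re.compile(
    r"^ghidra_(?P<version>\d+(?:\.\d+)*)"
    r"_(?P<release>[A-Za-z0-9]+)"
    r"_(?P<date>\d{8})\.zip$"
)


def parse_release_filename(name: str) -> Optional[Dict[str, str]]:
    """Return the version/release/date fields, or None if the name does not match."""
    match = RELEASE_ZIP.match(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(parse_release_filename("ghidra_1.2.3_EXAMPLE_20260101.zip"))
    print(parse_release_filename("Source code (zip)"))
```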
To create the latest development build for your platform from this source repository:
- JDK 21 64-bit
- Gradle 8.5+ (or provided Gradle wrapper if Internet connection is available)
- Python3 (version 3.9 to 3.14) with bundled pip
- GCC or Clang, and make (Linux/macOS-only)
- Microsoft Visual Studio 2017+ or Microsoft C++ Build Tools with the
following components installed (Windows-only):
- MSVC
- Windows SDK
- C++ ATL
unzip master.zip
cd gayhydra
NOTE: Instead of downloading the compressed source, you may want to clone the
canonical Codeberg repository: `git clone https://codeberg.org/CryptoJones/GayHydra.git`
NOTE: If an Internet connection is available and you did not install Gradle, the
`./gradlew` (or `gradlew.bat`) command may be used in place of the `gradle` command in the following
instructions.
gradle -I gradle/support/fetchDependencies.gradle
gradle buildGhidra
The compressed development build will be located at build/dist/.
For more detailed information on building GayHydra, please read the Developer's Guide.
For issues building, please check the Known Issues section for possible solutions.
GayHydra installations support users writing custom scripts and extensions via the GhidraDev
plugin for Eclipse (the plugin keeps the upstream name). The plugin and its corresponding
instructions can be found within a GayHydra release at Extensions/Eclipse/GhidraDev/ or at
this link. Alternatively, Visual Studio Code may be used to edit scripts by
clicking the Visual Studio Code icon in the Script Manager. Fully-featured Visual Studio Code
projects can be created from a GayHydra CodeBrowser window at
Tools -> Create VSCode Module project.
NOTE: Both the GhidraDev plugin for Eclipse and Visual Studio Code integrations only support developing against fully built GayHydra installations which can be downloaded from the Releases page.
To develop GayHydra itself, it is highly recommended to use Eclipse, which the upstream Ghidra development process is highly customized for.
- Follow the above build instructions so the build completes without errors
- Install Eclipse IDE for Java Developers
gradle prepdev eclipse buildNatives
- File -> Import...
- General | Existing Projects into Workspace
- Select root directory to be your downloaded or cloned GayHydra source repository
- Check Search for nested projects
- Click Finish
When Eclipse finishes building the projects, GayHydra can be launched and debugged with the provided Ghidra Eclipse run configuration (the run-configuration name stays upstream).
For more detailed information on developing GayHydra, please read the Developer's Guide.
If you would like to contribute bug fixes, improvements, and new features back to GayHydra, please take a look at our Contributor's Guide to see how you can participate. For upstream-applicable fixes, also consider opening the same PR against NationalSecurityAgency/ghidra so upstream benefits too.
Q. Why "GayHydra"? A. I'm queer, I thought it was clever, the dual-license-license-pun thing is funny to me, and the mascot energy of "many heads, all gay" matches how a reverse-engineering tool actually feels in practice. If the name keeps a homophobic contributor away, that's a positive selection filter, not a cost.
Q. How do you pronounce "Ghidra"? A. I have no idea. My Ghidra book is signed by Chris Eagle and I still don't know. (I would like Kara Nance's signature too, please.) Pick the pronunciation that sounds least embarrassing out loud, commit to it, and never let anyone correct you.
-- AKC
