OpenCL GPU Miner for x16rs by Iv-84 · Pull Request #36 · hacash/fullnode

Iv-84 · 2025-12-03T13:31:07Z

Overview

Dear maintainers,

My name is Ivan, and I have been working on this OpenCL GPU miner for approximately 2–3 months. On a GeForce RTX 5090, the current implementation achieves around 100 Mh/s.
Although I am an experienced programmer, I am not a specialist in mining software. This implementation started from scratch, based on the legacy sources uploaded by jojoin in the hacash/x16rs repository.
This PR introduces a working GPU kernel for x16rs, along with optimizations and fixes that enable practical performance on modern hardware.
I kindly ask for your review, feedback, and guidance on further improvements.

Thank you,
Ivan

Challenges in x16rs GPU Mining

The x16rs algorithm sequence is unpredictable and varies with each nonce.

It is not possible to simply chain kernels in a fixed order.
Each round requires determining the next algorithm, which complicates parallelism.
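
To illustrate why the kernel order cannot be fixed in advance, here is a minimal Python sketch of data-dependent algorithm selection. It is not the real x16rs: `stand_in_hash` is a hypothetical placeholder built on SHA3-256 instead of the sixteen actual hash functions, and seeding the state from a 4-byte nonce is an assumption made only for this illustration. The selection rule (last byte of the hash, modulo 16) follows the description in this PR.

```python
import hashlib

NUM_ALGOS = 16  # x16rs picks among 16 hash functions each round


def stand_in_hash(algo: int, data: bytes) -> bytes:
    """Hypothetical placeholder for the algo-th hash function (NOT real x16rs)."""
    return hashlib.sha3_256(bytes([algo]) + data).digest()


def algo_sequence(nonce: int, rounds: int) -> list[int]:
    """Return the algorithm index chosen in each round for one nonce."""
    # Assumption: the starting state is derived from the nonce alone.
    state = hashlib.sha3_256(nonce.to_bytes(4, "big")).digest()
    sequence = []
    for _ in range(rounds):
        algo = state[-1] % NUM_ALGOS  # next algorithm depends on the current hash
        sequence.append(algo)
        state = stand_in_hash(algo, state)
    return sequence


# Each nonce generally walks through a different sequence of algorithms.
for nonce in range(4):
    print(nonce, algo_sequence(nonce, rounds=8))
```

Because each round's choice depends on the previous round's output, the sequence for a nonce is only known while it is being hashed.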

My Approach

Instead of processing one nonce per work-item, each work-item processes multiple (unit_size) nonces (e.g., 16, 32, 64, or 128).
By keeping many nonces in-flight, the likelihood of executing the same algorithm across multiple nonces increases, which enhances parallelism.
Within each work-group, hashes are shared and reordered by their next algorithm (using the last byte of the hash). This enables even stronger parallel execution.
All algorithms are executed within the same kernel, reducing kernel launch overhead.
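
As a rough sketch of what this decomposition implies, the number of nonces covered by one kernel launch is the product of the three sizes. The `nonce_for` layout below is hypothetical (the kernel may map work-items to nonces differently); it only shows one mapping in which every nonce is visited exactly once.

```python
def nonces_per_launch(work_groups: int, local_size: int, unit_size: int) -> int:
    """Total nonces hashed by one kernel launch."""
    return work_groups * local_size * unit_size


def nonce_for(base_nonce: int, group_id: int, local_id: int, k: int,
              local_size: int, unit_size: int) -> int:
    """Hypothetical layout: each work-item owns unit_size consecutive nonces."""
    global_id = group_id * local_size + local_id
    return base_nonce + global_id * unit_size + k


# Default values from the [gpu] config section of this PR.
print(nonces_per_launch(work_groups=1024, local_size=256, unit_size=128))  # 33554432
```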

Implementation Details

Work-groups, work-items and... unit-items!
- Each work-group runs with local_size work-items.
- Each work-item processes unit_size nonces.
Reordering per round
- After each hashing step, nonces are bucket-sorted by their next algorithm (last byte of the hash % 16).
- Implemented with histogram, starting_index, and offset arrays in __local memory for faster sorting.
Local memory usage
- Tables for Blake, AES, LT, and mixtab are preloaded into __local memory by cooperative threads.
- Synchronization is enforced with barrier(CLK_LOCAL_MEM_FENCE).
Best hash selection
- After all rounds (x16rs_repeat), each work-item finds its best hash.
- A reduction across the work-group selects the best hash and nonce globally.
Optimizations applied
- #pragma unroll in critical loops.
- Avoiding unnecessary hash copies (using pointers directly).
- Integration of optimized sources by Wolf.
- Removal of legacy 80-block hashing logic, limiting processing to 32 chars.
- Selective loop roll/unroll based on performance.
- __attribute__((work_group_size_hint(256,1,1))) for compiler guidance.
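
The reordering and best-hash steps above can be sketched serially in Python. This is a host-side illustration under stated assumptions, not the kernel code: the array names histogram, starting_index, and offset come from the description above, while the function names are hypothetical, and "best" is assumed to mean the numerically smallest hash when compared as big-endian bytes. In the kernel these steps run cooperatively across work-items with barriers between phases.

```python
NUM_ALGOS = 16


def bucket_sort_by_algo(hashes):
    """Stable counting sort of hash indices by next algorithm (last byte % 16)."""
    keys = [h[-1] % NUM_ALGOS for h in hashes]

    # 1) histogram: how many hashes need each algorithm next
    histogram = [0] * NUM_ALGOS
    for key in keys:
        histogram[key] += 1

    # 2) starting_index: exclusive prefix sum of the histogram
    starting_index = [0] * NUM_ALGOS
    for algo in range(1, NUM_ALGOS):
        starting_index[algo] = starting_index[algo - 1] + histogram[algo - 1]

    # 3) offset: next free slot inside each bucket while scattering
    offset = [0] * NUM_ALGOS
    order = [0] * len(hashes)
    for i, key in enumerate(keys):
        order[starting_index[key] + offset[key]] = i
        offset[key] += 1
    return order, starting_index, histogram


def best_hash_reduction(candidates):
    """Pairwise tree reduction over non-empty (hash, nonce) pairs; smallest wins."""
    items = list(candidates)
    stride = 1
    while stride < len(items):
        for i in range(0, len(items) - stride, 2 * stride):
            if items[i + stride] < items[i]:
                items[i] = items[i + stride]
        stride *= 2
    return items[0]


hashes = [bytes([b]) * 32 for b in (0x1F, 0x03, 0x2F, 0x13, 0x00)]
order, starting_index, histogram = bucket_sort_by_algo(hashes)
print(order)  # indices grouped so equal algorithms sit next to each other
print(best_hash_reduction((h, nonce) for nonce, h in enumerate(hashes)))
```

After sorting, hashes that need the same algorithm occupy a contiguous range starting at starting_index[algo], so neighbouring work-items can execute the same code path.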

Benchmarks

Tested on Windows and Ubuntu with similar results.
Hashrate was measured at block heights between 650,000 and 700,000.

GPU	work_groups	local_size	unit_size	Hashrate
RTX 4070	256	256	128	~38Mh/s
RTX 4070	2048	256	256	~40Mh/s
RTX 4090	2048	256	256	~80Mh/s
RTX 5090	2048	256	256	~100Mh/s
RTX 5090	2048	256	512	~100Mh/s
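
One way to read the table is to estimate how long a single kernel launch would take at the reported hashrate. The sketch below is back-of-the-envelope arithmetic under three assumptions that the PR does not state: 1 Mh/s means 10^6 hashes per second, one launch covers work_groups × local_size × unit_size nonces, and host-side overhead is ignored. The function name is hypothetical.

```python
def seconds_per_launch(work_groups: int, local_size: int, unit_size: int,
                       hashrate_mhs: float) -> float:
    """Estimated duration of one launch at the given hashrate (in Mh/s)."""
    nonces = work_groups * local_size * unit_size
    return nonces / (hashrate_mhs * 1_000_000)


# Rows copied from the benchmark table: (GPU, work_groups, local_size, unit_size, Mh/s)
rows = [
    ("RTX 4070", 256, 256, 128, 38),
    ("RTX 4070", 2048, 256, 256, 40),
    ("RTX 4090", 2048, 256, 256, 80),
    ("RTX 5090", 2048, 256, 256, 100),
    ("RTX 5090", 2048, 256, 512, 100),
]
for gpu, wg, ls, us, mhs in rows:
    print(f"{gpu}: ~{seconds_per_launch(wg, ls, us, mhs):.2f} s per launch")
```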

Known Limitations / Future Work

Frequent barriers (barrier(CLK_LOCAL_MEM_FENCE)) are needed for x16rs parallelism but may become a bottleneck.
OpenCL 2.0+ features (e.g., subgroups) and vector types (uint4, ulong4) could be leveraged for further optimization.
Currently tested only on NVIDIA hardware; AMD testing is pending.

Code Included

The full kernel is in this PR (x16rs_main), along with other supporting headers (util.cl, x16rs.cl, sha3_256.cl). The full "opencl" folder is required in order to launch the poworker with GPU mining enabled.

As a suggestion, the "x16rs/opencl" folder could be added as an asset on the Releases page.

zip -r hacash_x16rs_opencl.zip x16rs/opencl

Config file

A [gpu] section is required in the config file in order to mine with a GPU. I will open a new PR in hacash/doc to edit https://github.com/hacash/doc/blob/main/build/config_description.md

[gpu]
use_opencl = false
work_groups = 1024
local_size = 256
unit_size = 128
opencl_dir = opencl/
platform_id = 0
device_id = 0

*️⃣ These are the default values; set "use_opencl" to true to start mining with the GPU.
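
For readers who want to sanity-check a config file before starting the miner, here is a small Python sketch that parses the [gpu] section with the standard-library configparser. It is only an illustration: the miner itself reads the file with its own code, and `load_gpu_config` is a hypothetical helper name.

```python
import configparser

GPU_INI = """
[gpu]
use_opencl = false
work_groups = 1024
local_size = 256
unit_size = 128
opencl_dir = opencl/
platform_id = 0
device_id = 0
"""


def load_gpu_config(text: str) -> dict:
    """Parse the [gpu] section into typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    gpu = parser["gpu"]
    return {
        "use_opencl": gpu.getboolean("use_opencl"),
        "work_groups": gpu.getint("work_groups"),
        "local_size": gpu.getint("local_size"),
        "unit_size": gpu.getint("unit_size"),
        "opencl_dir": gpu.get("opencl_dir"),
        "platform_id": gpu.getint("platform_id"),
        "device_id": gpu.getint("device_id"),
    }


config = load_gpu_config(GPU_INI)
print(config)
```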

jojoin · 2025-12-04T03:19:16Z

Thank you very much for your contribution! I am reviewing this part of the code, and once I ensure it doesn't affect the existing CPU mining section, this PR will be merged.

jojoin · 2025-12-04T03:24:02Z

It would be great if you could ensure that the GPU mining tool is compatible with different platforms and operating systems, allowing as many GPUs as possible to participate in mining.

Iv-84 · 2025-12-04T13:46:32Z

@jojoin

> Thank you very much for your contribution! I am reviewing this part of the code, and once I ensure it doesn't affect the existing CPU mining section, this PR will be merged.

Thanks. Let me know if something needs to be changed.

> It would be great if you could ensure that the GPU mining tool is compatible with different platforms and operating systems, allowing as many GPUs as possible to participate in mining.

I tested this on Windows 10, Windows 11, Ubuntu 22.04, and Ubuntu 24.04 using NVIDIA GPUs. Since only the default NVIDIA drivers are required, any NVIDIA GPU worked fine.
I will check with the community Telegram group to see if anyone can assist with testing on AMD hardware and macOS systems.

Iv-84 · 2025-12-05T09:48:09Z

I have already contacted several beta testers. They are currently testing the miner and have provided valuable feedback.
I will continue working on the PR to address a few issues.

jojoin · 2025-12-11T05:14:23Z

OpenCL compilation is now behind the 'ocl' feature switch. For details, please pull the latest code.
@Iv-84

YouKenTrust · 2025-12-12T08:28:45Z

@Iv-84 Hi Ivan, thank you for pushing Hacash forward with your GPU miner work. Could you please share your ERC20 or BEP20 address? The community would like to donate 500 USDT as a small token of appreciation and support.

Iv-84 · 2025-12-12T14:42:16Z

@YouKenTrust I really appreciate it. I'll take it as an incentive to continue developing Hacash.
My BEP20 address is 0xd407652d2b64c8e2ac9fb219da20484fab593ec3

YouKenTrust · 2025-12-13T07:03:39Z

https://bscscan.com/tx/0xa5674d501f7a574e24d9c629eeac7e826b087ffd4c8b85f58321e08e730a4833

TaKKiD · 2025-12-13T23:37:28Z

Windows 11 with an AMD 7900XT is working (the CPU is a 7950X3D), and the integrated graphics are also working. It does not work with the AMD Software: Adrenalin Edition driver, but it does work with AMD Software: PRO Edition. The config.ini settings should be:
use_opencl = true
work_groups = 128
local_size = 128
unit_size = 28
opencl_dir = -opencl/
platform_id = 0
device_ids = 0, 1 (0 = AMD 7900XT: min 60 Mh/s, max 80 Mh/s; 1 = integrated GPU of the AMD 7950X3D: min 1.20 Mh/s, max 4 Mh/s)

Iv-84 and others added 10 commits November 27, 2024 15:13

OpenCL support

bd16374

OpenCL x16rs implementation

d16e1c4

debug

371e690

Tweaking for x16rs

0f9d0aa

Merge branch 'main' into opencl_miner

7c090e1

Modify start message

0f48937

Single thread

8829682

add gpu section to ini file

bbf232a

Merge branch 'main' into opencl_miner

f1d6576

Check OpenCL dir

c46d1dd

Platform selection

8430afd

jojoin merged commit a98a70a into hacash:main Dec 7, 2025

jojoin merged 11 commits into hacash:main from Iv-84:opencl_miner
