davidjaenson/equihash

Oct 20, 2016: The CPU solver reaches close to 3.4 sol/s per thread on an i7-2640M CPU @ 2.80GHz.

Oct 28, 2016: The OpenCL solver achieves ~15 sol/s on a GTX 970.
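For anyone reproducing these figures: sol/s is the number of solutions found divided by the elapsed time. How the numbers above were measured is not documented here, so the helper below is only a minimal sketch; the function names are made up for illustration, and the per-thread variant assumes the work was spread evenly across threads.

```c
#include <stddef.h>

/* Throughput in solutions per second.
 * Returns 0 for a non-positive duration instead of dividing by zero. */
double sol_per_sec(size_t solutions, double elapsed_seconds)
{
    if (elapsed_seconds <= 0.0) {
        return 0.0;
    }
    return (double)solutions / elapsed_seconds;
}

/* Per-thread throughput, assuming (hypothetically) that every thread
 * contributed an equal share of the solutions. */
double sol_per_sec_per_thread(size_t solutions, double elapsed_seconds,
                              unsigned threads)
{
    if (threads == 0) {
        return 0.0;
    }
    return sol_per_sec(solutions, elapsed_seconds) / (double)threads;
}
```

For example, 30 solutions found in 2 seconds is 15 sol/s.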

Acknowledgements: Oct 20, 2016: The CPU version uses Tromp's modified blake2b code and his method for initializing the blake2b state.

CPU equihash API

Initialize the buckets and the index tree structure by calling "equihash_init_buckets".

void equihash_init_buckets(bucket_t** src, bucket_t** dst, element_indice_t*** indices);

Then call "equihash", which returns the number of solutions found and writes them, uncompressed, to dst_solutions (512 indices of 32 bits each per solution). The blake2b state must be initialized with the header data before calling this function. The src buckets, dst buckets and indices must be the ones set up by "equihash_init_buckets".

size_t equihash(uint32_t* dst_solutions, const blake2b_state* state, bucket_t* src, bucket_t* dst, element_indice_t** indices);
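The layout of dst_solutions can be sketched as follows. The figure of 512 indices of 32 bits each comes from the description above. That several solutions are stored back to back is an assumption, and the helper names are hypothetical, not part of this repository.

```c
#include <stddef.h>
#include <stdint.h>

/* From the description above: one solution is 512 indices of 32 bits each. */
#define INDICES_PER_SOLUTION 512u

/* Bytes occupied by one uncompressed solution: 512 * 4 = 2048. */
size_t solution_size_bytes(void)
{
    return INDICES_PER_SOLUTION * sizeof(uint32_t);
}

/* Pointer to the first index of solution number i.
 * Assumption: solutions are laid out contiguously in dst_solutions. */
const uint32_t *solution_at(const uint32_t *dst_solutions, size_t i)
{
    return dst_solutions + i * INDICES_PER_SOLUTION;
}
```

Under that assumption, a buffer meant to hold up to N solutions needs N * 2048 bytes.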

When you're done finding equihash solutions, call "equihash_cleanup_buckets". Pass it the src buckets, dst buckets and indices that "equihash_init_buckets" set up.

void equihash_cleanup_buckets(bucket_t* src, bucket_t* dst, element_indice_t** indices);
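Put together, the three CPU calls might be used as sketched below. The three signatures are the ones listed above. Everything in the "stand-ins" part is a placeholder so the sketch compiles on its own: the struct bodies and stub behaviour are not the library's definitions and should be replaced by the repository's real headers. MAX_SOLUTIONS and solve_once are hypothetical names; the required capacity of dst_solutions is not stated here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* --- Stand-ins (NOT the real library): delete these and include the
 *     repository's headers when building against the actual solver. --- */
typedef struct { int placeholder; } bucket_t;
typedef struct { int placeholder; } element_indice_t;
typedef struct { int placeholder; } blake2b_state;

void equihash_init_buckets(bucket_t **src, bucket_t **dst,
                           element_indice_t ***indices)
{
    *src = malloc(sizeof **src);
    *dst = malloc(sizeof **dst);
    *indices = malloc(sizeof **indices);
}

size_t equihash(uint32_t *dst_solutions, const blake2b_state *state,
                bucket_t *src, bucket_t *dst, element_indice_t **indices)
{
    (void)dst_solutions; (void)state; (void)src; (void)dst; (void)indices;
    return 0; /* stub: the real solver returns the number of solutions */
}

void equihash_cleanup_buckets(bucket_t *src, bucket_t *dst,
                              element_indice_t **indices)
{
    free(src);
    free(dst);
    free(indices);
}
/* --- End of stand-ins --- */

#define INDICES_PER_SOLUTION 512u /* from the API description */
#define MAX_SOLUTIONS 16u         /* hypothetical buffer capacity */

/* One init / solve / cleanup cycle. `state` must already be initialized
 * with the header data; `out` must hold MAX_SOLUTIONS solutions. */
size_t solve_once(const blake2b_state *state, uint32_t *out)
{
    bucket_t *src = NULL;
    bucket_t *dst = NULL;
    element_indice_t **indices = NULL;

    /* init takes the addresses of the three pointers ... */
    equihash_init_buckets(&src, &dst, &indices);
    /* ... while equihash and cleanup take the pointers themselves. */
    size_t found = equihash(out, state, src, dst, indices);
    equihash_cleanup_buckets(src, dst, indices);
    return found;
}
```

When solving for many headers, it is probably cheaper to call "equihash_init_buckets" once, call "equihash" repeatedly, and clean up at the end, though this README does not state whether the buckets can be reused that way.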

OpenCL equihash API

Initialize the GPU configuration data and the related variables by calling "equihash_init" with a pointer to a gpu_config_t struct.

void equihash_init(gpu_config_t* config);

Then call "equihash", which returns the number of solutions found and writes them, uncompressed, to dst_solutions (512 indices of 32 bits each per solution). The blake2b state must be initialized with the header data before calling this function.

size_t equihash(uint32_t* dst_solutions, crypto_generichash_blake2b_state* state, gpu_config_t* base_config);

When you're done finding equihash solutions, call "equihash_cleanup" with the same gpu_config_t that was passed to "equihash_init".

void equihash_cleanup(gpu_config_t* config);

Future improvements for the GPU-implementation

  • Many micro-optimizations.
  • The OpenCL code currently runs the produce_solutions step, which derives solutions from the candidate solutions, very inefficiently. Properly parallelizing this kernel could increase performance substantially.
  • Each slot "element_t" in a bucket uses 128 bits of memory. This can at the very least be reduced to 64 bits.
  • Tune the number of buckets and the number of slots per bucket. Currently 1024 buckets are used.
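To get a feel for what the slot-size reduction would save, the memory taken by one bucket array can be estimated as buckets x slots per bucket x bytes per slot. The sketch below uses the 1024 buckets and the 128-bit and 64-bit slot sizes from the list above. The slot count of 4096 per bucket in the usage note is a hypothetical value chosen only for the arithmetic, and the function name is made up.

```c
#include <stddef.h>

#define NUM_BUCKETS 1024u /* from the list above */

/* Bytes needed for one array of NUM_BUCKETS buckets.
 * bits_per_slot is expected to be a multiple of 8. */
size_t bucket_array_bytes(size_t slots_per_bucket, size_t bits_per_slot)
{
    return (size_t)NUM_BUCKETS * slots_per_bucket * (bits_per_slot / 8u);
}
```

With the hypothetical 4096 slots per bucket, bucket_array_bytes(4096, 128) is 67108864 bytes (64 MiB), while bucket_array_bytes(4096, 64) is 33554432 bytes (32 MiB), so halving the slot size halves the footprint of each bucket array.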
