More docs, and add '--ryzen' as a hack for Ryzen folks to get nice perf.
fireworm71 committed Nov 28, 2017
1 parent 2b955ff commit 34297a6
Showing 4 changed files with 53 additions and 7 deletions.
28 changes: 22 additions & 6 deletions README.md
@@ -42,7 +42,7 @@ _OR_
* make

#### Note for Debian/Ubuntu users:
* apt-get install automake autoconf pkg-config libcurl4-openssl-dev libjansson-dev libssl-dev libgmp-dev zlib1g-dev
* `apt-get install automake autoconf pkg-config libcurl4-openssl-dev libjansson-dev libssl-dev libgmp-dev zlib1g-dev`

#### Notes for AIX users:
* To build a 64-bit binary, export OBJECT_MODE=64
@@ -119,6 +119,23 @@ If for some reason you want to remove HugePages (or adjust the size):

`sudo nano /etc/sysctl.conf`, scroll to the bottom, and remove / edit the line `vm.nr_hugepages=size`, `Ctrl+O`, `[Enter]`, `Ctrl+X`. Then, like before, `sudo sysctl -p`. Note that you can also reboot and this will cause HugePages to allocate / deallocate.

### Oneways and CPU affinity

You can now specify a number of 'oneway' threads to accompany your 'default way' threads. The default way is determined by your CPU's instruction set.

`-1 n` specifies the number of oneway threads to spawn. You can also use `--oneways n`. Some folks (on ARM especially) see perf gains due to the implementation. Additionally, there are new options to help control affinity for these threads and for 'default' way threads too: `--cpu-affinity-stride N`, `--cpu-affinity-default-index N`, `--cpu-affinity-oneway-index N`, and `--cpu-priority-oneway 0-5`.

Affinity stride sets the step between the CPUs that consecutive threads are assigned to. So, if you have 8 CPUs and set 'stride' to 3, you will start with CPU (0 + 3 * 0), then CPU (0 + 3 * 1), then CPU (0 + 3 * 2), aka CPUs 0, 3, 6. 'default-index' is the starting index ('0' in the example) for the `-t` threads. 'oneway-index' is the starting index for the oneway `-1` threads.

Play around with them, and pass `-D` to get some debug output.

e.g. 1, bind oneways to odd-numbered CPUs and 'defaults' to even-numbered CPUs: `./cpuminer ... -t 4 -1 4 --cpu-affinity-stride 2 --cpu-affinity-default-index 0 --cpu-affinity-oneway-index 1`
e.g. 2, bind oneways to the last 4 CPUs, after the defaults: `./cpuminer ... -t 4 -1 4 --cpu-affinity-stride 1 --cpu-affinity-default-index 0 --cpu-affinity-oneway-index 4`
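
The stride arithmetic described above can be sketched in C. This is a minimal illustration, not the miner's code: `affinity_cpu_for_thread` is a hypothetical helper, and wrapping around with a modulo once the index passes the last CPU is an assumption that the README does not confirm.

```c
#include <assert.h>

/* Hypothetical helper (not part of cpuminer): returns the CPU that the i-th
 * thread of a group would be bound to, using the README's formula
 * start_index + stride * i. The modulo wrap-around is an assumption. */
static int affinity_cpu_for_thread(int start_index, int stride, int i, int num_cpus)
{
    /* Guard against nonsensical inputs before doing the arithmetic. */
    assert(num_cpus > 0 && stride > 0 && start_index >= 0 && i >= 0);
    return (start_index + stride * i) % num_cpus;
}
```

With 8 CPUs, `affinity_cpu_for_thread(0, 3, i, 8)` for `i = 0, 1, 2` gives CPUs 0, 3, 6, as in the stride example. With stride 2 and start index 1, four threads land on CPUs 1, 3, 5, 7.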

### Ryzen users: pass `--ryzen`

Ryzen's implementation of AVX2 is ... subpar. Please pass `--ryzen` on the command line to use the AVX implementation instead. Users have reported gains of about 25%.
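
Judging by the `algo/scrypt.c` change in this commit, the flag appears to work by overriding the detected throughput with 3 before the scratchpad buffer is sized. The sketch below mirrors that logic under stated assumptions: `effective_throughput` and `scratchpad_bytes` are hypothetical names, and the size is computed in `size_t` rather than the `uint32_t` used in the diff, purely to keep the illustration safe for large `N`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: with --ryzen set, ignore the detected throughput
 * and use 3, the value the diff associates with the AVX path. */
static int effective_throughput(int detected, bool ryzen)
{
    return ryzen ? 3 : detected;
}

/* Buffer size formula taken from the diff:
 * throughput * 32 * (N + 1) * sizeof(uint32_t). */
static size_t scratchpad_bytes(int throughput, uint32_t n)
{
    assert(throughput > 0);
    return (size_t)throughput * 32u * ((size_t)n + 1u) * sizeof(uint32_t);
}
```

For instance, `scratchpad_bytes(effective_throughput(6, true), 1024)` evaluates the formula with a throughput of 3; the detected value 6 is only a placeholder, since the commit does not state what `scrypt_best_throughput()` returns on AVX2 hardware.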

### Connecting through a proxy

Use the --proxy option.
@@ -131,17 +148,16 @@ When the --proxy option is not used, the program honors the http_proxy and all_p

GCC
=======
Some people have reported increases by using GCC 7.
Some people have reported performance increases by using GCC 7.2. Please note that this will replace your existing GCC installation, which may become unrecoverable if any errors occur during the `make` process.

To build and install GCC 7.2 on Ubuntu do the following:
* `apt-get -y install unzip flex`
* `wget https://github.com/gcc-mirror/gcc/archive/gcc-7_2_0-release.zip`
* `unzip gcc-7_2_0-release.zip`
* `cd gcc-7_2_0-release`
* `cd gcc-gcc-7_2_0-release`
* `sudo ./contrib/download_prerequisites`
* `mkdir build`
* `cd build`
* `../configure `
* `mkdir build && cd build`
* `../configure --enable-languages=c,c++ --disable-multilib`
* `make -j 8`
* `make install`

13 changes: 12 additions & 1 deletion algo/scrypt.c
@@ -1771,7 +1771,13 @@ static inline void scrypt_core(uint32_t *X, uint32_t *V, int N)
bool printed = false;
unsigned char *scrypt_buffer_alloc(int N, int forceThroughput)
{
    uint32_t size = (forceThroughput == -1 ? scrypt_best_throughput() : forceThroughput) * 32 * (N + 1) * sizeof(uint32_t);
    uint32_t throughput = (forceThroughput == -1 ? scrypt_best_throughput() : forceThroughput);
    if (opt_ryzen_1x) {
        // force throughput to be 3 (aka AVX) instead of AVX2.
        throughput = 3;
    }

    uint32_t size = throughput * 32 * (N + 1) * sizeof(uint32_t);

#ifdef __linux__
    unsigned char* m_memory = (unsigned char*)(mmap(0, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_POPULATE, 0, 0));
@@ -1987,6 +1993,11 @@ extern int scanhash_scrypt(int thr_id, struct work *work, uint32_t max_nonce, ui
    uint32_t n = pdata[19] - 1;
    const uint32_t Htarg = ptarget[7];
    int throughput = scrypt_best_throughput();
    if (opt_ryzen_1x) {
        // force throughput to be 3 (aka AVX) instead of AVX2.
        throughput = 3;
    }

    int i;

#ifdef HAVE_SHA256_4WAY
18 changes: 18 additions & 0 deletions cpu-miner.c
@@ -125,6 +125,7 @@ int opt_affinity_stride = 1;
int opt_affinity_default_index = 0;
int opt_affinity_oneway_index = 0;
int opt_oneway_priority = 0;
bool opt_ryzen_1x = false;
int* thread_affinty_array = NULL;
int num_cpus;
char *rpc_url;
@@ -200,6 +201,7 @@ Options:\n\
--cert=FILE certificate for mining server using SSL\n\
-x, --proxy=[PROTOCOL://]HOST[:PORT] connect through a proxy\n\
-t, --threads=N number of miner threads (default: number of processors)\n\
-1, --oneways=N number of miner threads that are forced to 'oneway' (default: 0)\n\
-r, --retries=N number of times to retry if a network call fails\n\
(default: retry indefinitely)\n\
-R, --retry-pause=N time to pause between retries, in seconds (default: 30)\n\
@@ -234,11 +236,23 @@ Options:\n\
--cputest debug hashes from cpu algorithms\n\
--cpu-affinity set process affinity to cpu core(s), mask 0x3 for cores 0 and 1\n\
--cpu-priority set process priority (default: 0 idle, 2 normal to 5 highest)\n\
\n\
--cpu-affinity-stride N \n\
how many processors to skip when assigning affinity based on indices\n\
cannot be used with '--cpu-affinity' (default: 1) See README.md for more details.\n\
--cpu-affinity-default-index N \n\
which cpu to start affinity for 'default' way threads (0-based). (default: 0) See README.md for more details.\n\
--cpu-affinity-oneway-index N \n\
which cpu to start affinity for 'oneway' threads (0-based). (default: [After default threads]) See README.md for more details.\n\
--cpu-priority-oneway 0-5\n\
what priority oneway threads have (0 lowest, 5 highest) (default: 0)\n\
\n\
-b, --api-bind IP/Port for the miner API (default: 127.0.0.1:4048)\n\
--api-remote Allow remote control\n\
--max-temp=N Only mine if cpu temp is less than specified value (linux)\n\
--max-rate=N[KMG] Only mine if net hashrate is less than specified value\n\
--max-diff=N Only mine if net difficulty is less than specified value\n\
--ryzen Force AVX, and disable AVX2. Ryzen 1*** is much faster.\n\
-c, --config=FILE load a JSON-format configuration file\n\
-V, --version display version information and exit\n\
-h, --help display this help text and exit\n\
@@ -269,6 +283,7 @@ static struct option const options[] = {
{ "cpu-affinity-stride", 1, NULL, 1050 },
{ "cpu-affinity-default-index", 1, NULL, 1051 },
{ "cpu-affinity-oneway-index", 1, NULL, 1052 },
{ "ryzen", 0, NULL, 2000 },
{ "no-color", 0, NULL, 1002 },
{ "debug", 0, NULL, 'D' },
{ "diff-factor", 1, NULL, 'f' },
@@ -2583,6 +2598,9 @@ void parse_arg(int key, char *arg)
        opt_affinity_oneway_index = v;
        use_affinity_mask = -1;
        break;
    case 2000: // "ryzen"
        opt_ryzen_1x = true;
        break;
    case 'V':
        show_version_and_exit();
    case 'h':
1 change: 1 addition & 0 deletions miner.h
@@ -297,6 +297,7 @@ extern bool opt_stratum_stats;
extern char *opt_cert;
extern char *opt_proxy;
extern long opt_proxy_type;
extern bool opt_ryzen_1x;
extern bool use_syslog;
extern bool use_colors;
extern pthread_mutex_t applog_lock;