| 1 | -Known bugs in this release: |
| 1 | + Known issues with using this release. |
2 | 2 |
| 3 | -Not working on Big Endian (fails self-test): |
| 3 | +Not working on big-endian CPU architectures (these formats fail |
| 4 | +self-test on big-endian CPUs): |
4 | 5 | * mssql05
5 | 6 | * office
6 | 7 | * rar
| 8 | +(x86 and x86-64 are little-endian, so they are not affected.) |
| 9 | + |
| 10 | +Not working on HD 4000 series and older ATI GPUs (these formats need |
| 11 | +byte-addressable store, which is only present in HD 5000 series and |
| 12 | +newer ATI/AMD GPUs): |
| 13 | +* sha512crypt-opencl |
| 14 | +* wpapsk-opencl |
| 15 | + |
| 16 | +Many OpenCL formats fail at runtime on Mac OS X (whereas CUDA ones work |
| 17 | +fine). We've seen these fail on Mac OS X 10.8.1: bf-opencl, |
| 18 | +mscash2-opencl, nt-opencl, rar, raw-sha512-opencl, sha512crypt-opencl, |
| 19 | +wpapsk-opencl, and xsha512-opencl. We suspect that this may be caused |
| 20 | +by driver bugs. The same formats work fine on Linux. |
| 21 | + |
| 22 | +In GPU-enabled builds, running "john --test" (with no --format |
| 23 | +restriction) will eventually fail (before it has a chance to test all |
| 24 | +formats). This is because GPU resources allocated by one format are |
| 25 | +currently not freed before proceeding to test another format (they're |
| 26 | +only freed when John exits). We're going to correct this in a future |
| 27 | +release. Meanwhile, please test GPU-enabled formats one by one, e.g. |
| 28 | +with "john --test --format=mscash2-opencl", etc. |
| 29 | + |
| 30 | +Some OpenCL-enabled formats (for "slow" hashes and non-hashes) may |
| 31 | +sometimes trigger "ASIC hang" errors as reported by AMD/ATI GPU drivers, |
| 32 | +requiring system reboot to regain access to the GPU. For example, on |
| 33 | +HD 7970 this problem is known to occur with sha512crypt-opencl, but is |
| 34 | +known not to occur with mscash2-opencl. Our current understanding is |
| 35 | +that this has to do with OpenCL kernel running time and watchdog timers. |
| 36 | +We're working on reducing kernel run times to avoid such occurrences in |
| 37 | +the future. |
| 38 | + |
| 39 | +All CUDA formats substantially benefit from compile-time tuning. |
| 40 | +README-CUDA includes some info on this. In short, on GTX 400 series and |
| 41 | +newer NVIDIA cards, you'll likely want to change "-arch sm_10" to "-arch |
| 42 | +sm_20" or greater (as appropriate for your GPU) on the NVCC_FLAGS line |
| 43 | +in Makefile. You'll also want to tune BLOCKS and THREADS for the |
| 44 | +specific format you're interested in. These are typically specified in |
| 45 | +cuda_*.h files. README-CUDA includes a handful of pre-tuned settings. |
| 46 | +It is not unusual to obtain e.g. a 3x speedup (compared to the generic |
| 47 | +defaults) with this sort of tuning. |
| 48 | + |
| 49 | +Some OpenCL formats benefit from compile-time tuning, too. For example, |
| 50 | +bf-opencl is pre-tuned for HD 7970 cards, and will need to be re-tuned |
| 51 | +for other cards (adjust WORK_GROUP_SIZE in opencl_bf_std.h and |
| 52 | +opencl/bf_kernel.cl; you may also adjust MULTIPLIER). In fact, on |
| 53 | +smaller GPUs this specific format might not work at all until |
| 54 | +WORK_GROUP_SIZE is reduced. Most OpenCL formats may benefit from tuning |
| 55 | +of KEYS_PER_CRYPT, although higher values, while generally increasing |
| 56 | +the c/s rate, may create usability issues (more work lost on |
| 57 | +interrupted/restored sessions, less optimal order of candidate passwords |
| 58 | +being tested). |
| 59 | + |
| 60 | +Even though wpapsk-cuda and wpapsk-opencl primarily use the GPU, they |
| 61 | +also do a (small, but not negligible) portion of the computation on CPU |
| 62 | +and thus they substantially benefit from OpenMP-enabled builds. We |
| 63 | +intend to reduce their use of CPU in a future version. |
| 64 | + |
| 65 | +Interrupting a cracking session that uses an ATI/AMD GPU with Ctrl-C |
| 66 | +often results in: |
| 67 | + ../../../thread/semaphore.cpp:87: sem_wait() failed |
| 68 | + Aborted |
| 69 | +When this happens, the john.pot and .log files are not updated with |
| 70 | +latest cracked passwords. To mitigate this, reduce the Save setting in |
| 71 | +john.conf from the default of 600 seconds to a lower value (e.g., 60). |
| 72 | + |
| 73 | +With GPU-enabled formats (and sometimes with OpenMP on CPU as well), the |
| 74 | +number of candidate passwords being tested concurrently can be very |
| 75 | +large (thousands). When the format is of a "slow" type (such as an |
| 76 | +iterated hash) and the number of different salts is large, interrupting |
| 77 | +and restoring a session may result in a lot of work being re-done (many |
| 78 | +minutes or even hours). It is easy to see if a given session is going |
| 79 | +to be affected by this or not: watch the range of candidate passwords |
| 80 | +being tested as included in the status line printed on a keypress. If |
| 81 | +this range does not change for a long while, the session is going to be |
| 82 | +affected since interrupting and restoring it will retry the entire |
| 83 | +range, for all salts, including for salts that already had the range |
| 84 | +tested against them. |
| 85 | + |
| 86 | +"Single crack" mode is relatively inefficient with GPU-enabled formats |
| 87 | +(and sometimes with OpenMP on CPU as well), because it might not be able |
| 88 | +to produce enough candidate passwords per target salt to fully utilize a |
| 89 | +GPU, as well as because its ordering of candidate passwords from most |
| 90 | +likely to least likely is lost when the format is only able to test a |
| 91 | +large number of passwords concurrently (before proceeding to doing the |
| 92 | +same for another salt). You may reasonably start with quick "single |
| 93 | +crack" mode runs on CPU (possibly without much use of OpenMP) and only |
| 94 | +after that proceed to using GPU-enabled formats (or with heavier use of |
| 95 | +OpenMP, beyond a few CPU cores), locking those runs to specific cracking |
| 96 | +modes other than "single crack". |
| 97 | + |
| 98 | +Some formats lack proper binary_hash() functions, resulting in duplicate |
| 99 | +hashes (if any) not being eliminated at loading and sometimes also in |
| 100 | +slower cracking (when the number of hashes per salt is large). When |
| 101 | +this happens, the following message is printed: |
| 102 | + Warning: excessive partial hash collisions detected |
| 103 | + (cause: the "format" lacks proper binary_hash() function definitions) |
| 104 | +Known to be affected are: bfegg, dominosec, md5crypt-cuda, phpass-cuda. |
| 105 | +Also theoretically present, but less likely to be triggered in practice, |
| 106 | +are similar issues in non-hash formats. |
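
The added text recommends testing GPU-enabled formats one by one until resources are freed between formats. A minimal sketch of that workaround follows. The helper name `test_formats` and the `JOHN` variable are hypothetical and not part of John the Ripper. Only the `--test` and `--format=` options come from the text above. The failure detection assumes that john exits with a non-zero status when a self-test fails.

```shell
#!/bin/sh
# Hypothetical helper, not part of John the Ripper.
# JOHN is an assumed variable naming the john binary; it defaults to ./john.
JOHN="${JOHN:-./john}"

# Run "john --test" once per format, so that each format is tested in its
# own process and its GPU resources are released when that process exits.
test_formats() {
    failed=""
    for fmt in "$@"; do
        echo "== $fmt =="
        # Assumption: john exits with a non-zero status on self-test failure.
        if ! "$JOHN" --test --format="$fmt"; then
            failed="$failed $fmt"
        fi
    done
    if [ -n "$failed" ]; then
        echo "Failed:$failed"
        return 1
    fi
    echo "All formats passed"
}
```

After sourcing the file, it could be called as `test_formats mscash2-opencl sha512crypt-opencl wpapsk-opencl`, with the format list adjusted to whatever the build supports.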
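
For the Ctrl-C issue on ATI/AMD GPUs, the added text suggests lowering the Save setting in john.conf from 600 to something like 60. The sketch below shows one way to script that change. The helper name `set_save_interval` is hypothetical, and it assumes the setting sits on its own line in the form `Save = <number>`. Lines that do not match that form are left unchanged, so check the file afterwards.

```shell
# Hypothetical helper, not part of John the Ripper.
# Usage: set_save_interval FILE SECONDS
set_save_interval() {
    conf=$1
    secs=$2
    # Assumption: the setting appears on its own line as "Save = <number>".
    # Write to a temporary file first, then replace the original on success.
    sed "s/^Save[[:space:]]*=[[:space:]]*[0-9][0-9]*[[:space:]]*\$/Save = $secs/" \
        "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
}
```

For example, `set_save_interval john.conf 60` would apply the value suggested above. Keeping a backup copy of john.conf before running it is a reasonable precaution.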