# SHA-256 in Python using libtomcrypt and ctypes

Note: this is a slightly updated version of the [`demo_dynamic.py` example script](https://github.com/libtom/libtomcrypt/blob/develop/demos/demo_dynamic.py) provided with the official `libtomcrypt` distribution. It has been updated to Python 3 and converted to a Jupyter Notebook, with minor updates to the ctypes calls.  Fernando Perez, September 2017.

## demo_dynamic.py v1

This program demonstrates Python's use of the dynamic
language support additions to LTC, namely access to LTC
constants, struct and union sizes, and the binding of a
math package to LTC.  Also provided are simple code
fragments to illustrate how one might write a Python
wrapper for LTC and how an app might call the wrapper.
This or a similar model should work for Ruby and other
dynamic languages.

This instance uses Python's ctypes and requires a single
.dylib linking together LTC and a math library.  Building
a single .dylib is needed because LTC wants a fairly tight
relationship between itself and the mathlib.  (ctypes can
load multiple .dylibs, but it does not support this level
of tight coupling between otherwise independent libraries.)

My .dylib was created on OSX with the following steps:

1- compile LTC to a .a static lib:

       CFLAGS="-DLTM_DESC -DUSE_LTM" make

2- link LTC and LTM into a single .dylib:

       ar2dylib_with  tomcrypt  tommath

where ar2dylib_with is a shell script that combines
the LTC .a with the LTM .dylib

Reminder: you don't need to bind in a math library unless
          you are going to use LTC functions that depend
          on a mathlib.  For example, public key crypto
          needs a mathlib; hashing and symmetric encryption
          do not.

This code was written for Python 2.7.

Larry Bugbee
March 2014

In [1]:
from ctypes import *
from ctypes.util import find_library

In [2]:
#---------------------------------------------------------------
# load the .dylib

libname = 'tomcrypt'
libpath = f'/opt/libtom/lib/lib{libname}.dylib'

print('  demo_dynamic.py')
print('  path to library %s: %s' % (libname, libpath))

LTC = cdll.LoadLibrary(libpath)
print('  loaded: %s' % LTC)

  demo_dynamic.py
  path to library tomcrypt: /opt/libtom/lib/libtomcrypt.dylib
  loaded: <CDLL '/opt/libtom/lib/libtomcrypt.dylib', handle 7faa75f2a150 at 0x1106f7a20>
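
The cell above hard-codes the library path, while `find_library` is imported but never used. A minimal sketch of one way to combine the two follows. The helper name `locate_library` is hypothetical (it is not part of the original script), and the fallback path is simply the one used above. Note that what `find_library` returns is platform dependent (a full path on some systems, a bare library name on others).

```python
from ctypes.util import find_library

def locate_library(name, fallback_path):
    """Hypothetical helper: prefer the system search, else use an explicit path."""
    found = find_library(name)  # returns None when nothing is found
    return found if found is not None else fallback_path

# Assumption: the fallback is the install location used in this notebook.
libpath = locate_library('tomcrypt', '/opt/libtom/lib/libtomcrypt.dylib')
print('  would load: %s' % libpath)
```

The resulting string can then be passed to `cdll.LoadLibrary` exactly as before.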


In [56]:
#---------------------------------------------------------------
# get list of all supported constants followed by a list of all
# supported sizes.  One alternative: these lists may be parsed
# and used as needed.
print('  all supported constants and their values:')

# get size to allocate for constants output list
str_len = c_int(0)
ret = LTC.crypt_list_all_constants(None, byref(str_len))
print('    need to allocate %d bytes \n' % str_len.value)

# allocate that size and get (name, value) pairs, each pair
# separated by a newline char.
names_sizes = create_string_buffer(str_len.value)
ret = LTC.crypt_list_all_constants(names_sizes, byref(str_len))
print(names_sizes.value.decode())

  all supported constants and their values:
    need to allocate 627 bytes 

PK_PUBLIC,0
PK_PRIVATE,1
PKA_RSA,0
PKA_DSA,1
LTC_PKCS_1,1
LTC_PKCS_1_EMSA,1
LTC_PKCS_1_EME,2
LTC_PKCS_1_V1_5,1
LTC_PKCS_1_OAEP,2
LTC_PKCS_1_PSS,3
LTC_MRSA,1
MIN_RSA_SIZE,1024
MAX_RSA_SIZE,4096
LTC_MKAT,0
LTC_MECC,1
ECC_BUF_SIZE,256
ECC_MAXSIZE,66
LTC_MDSA,1
LTC_MDSA_DELTA,512
LTC_MDSA_MAX_GROUP,512
LTC_DER_MAX_PUBKEY_SIZE,4096
LTC_MILLER_RABIN_REPS,35
LTC_CTR_MODE,1
CTR_COUNTER_LITTLE_ENDIAN,0
CTR_COUNTER_BIG_ENDIAN,4096
LTC_CTR_RFC3686,8192
MAXBLOCKSIZE,128
TAB_SIZE,32
ARGTYPE,0
LTM_DESC,1
TFM_DESC,0
GMP_DESC,0
LTC_FAST,1
LTC_NO_FILE,0
ENDIAN_LITTLE,1
ENDIAN_BIG,0
ENDIAN_32BITWORD,0
ENDIAN_64BITWORD,1
ENDIAN_NEUTRAL,0
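
The comment in the cell above notes that these lists may be parsed and used as needed. As a rough sketch of what that could look like, the hypothetical helper below (`parse_name_value_list` is not part of LTC or the original script) turns the newline-separated `NAME,value` text into a dictionary. It is shown on a short sample string rather than on the live library output, and it skips any line that does not look like a complete pair.

```python
def parse_name_value_list(text):
    """Hypothetical helper: parse 'NAME,value' lines into a dict of ints."""
    result = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(',')
        value = value.strip()
        # skip blank, truncated, or otherwise malformed lines
        if not sep or not value.lstrip('-').isdigit():
            continue
        result[name] = int(value)
    return result

# Sample text in the same format as the output above.
sample = "PK_PUBLIC,0\nPK_PRIVATE,1\nMAX_RSA_SIZE,4096\n"
constants = parse_name_value_list(sample)
print(constants['MAX_RSA_SIZE'])  # 4096
```

With the real library loaded, the same function could be applied to `names_sizes.value.decode()`.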


In [57]:
print('  all supported sizes:')

# get size to allocate for sizes output list
str_len = c_int(0)
ret = LTC.crypt_list_all_sizes(None, byref(str_len))
print('    need to allocate %d bytes \n' % str_len.value)

# allocate that size and get (name, size) pairs, each pair
# separated by a newline char.
names_sizes = create_string_buffer(str_len.value)
ret = LTC.crypt_list_all_sizes(names_sizes, byref(str_len))
print(names_sizes.value.decode())

  all supported sizes:
    need to allocate 1151 bytes 

ltc_hash_descriptor,208
hash_state,416
sha256_state,112
sha3_state,416
sha512_state,208
whirlpool_state,144
md2_state,88
md4_state,96
md5_state,96
rmd128_state,96
rmd160_state,96
rmd256_state,112
rmd320_state,120
sha1_state,96
tiger_state,104
blake2s_state,136
blake2b_state,248
chc_state,272
ltc_cipher_descriptor,192
symmetric_key,4256
anubis_key,616
camellia_key,280
blowfish_key,4168
cast5_key,132
des_key,256
des3_key,768
kasumi_key,256
khazad_key,144
kseed_key,256
multi2_key,36
noekeon_key,32
rc2_key,256
rc5_key,204
rc6_key,176
skipjack_key,10
xtea_key,512
rijndael_key,484
safer_key,217
saferp_key,536
twofish_key,4256
symmetric_CBC,4392
symmetric_CFB,4528
symmetric_CTR,4536
symmetric_ECB,4264
symmetric_F8,4528
symmetric_LRW,69848
symmetric_OFB,4400
f9_state,4656
hmac_state,848
omac_state,4784
pelican_state,4280
pmac_state,8888
xcbc_state,4784
ocb_state,9008
ocb3_state,9536
gcm_state,69904
eax_state,14232
rsa_key,72
dsa_key,48
d

In [60]:
#---------------------------------------------------------------
# get individually named constants and sizes

# print selected constants
print('\n  selected constants:')

names = [
    b'ENDIAN_LITTLE',
    b'ENDIAN_64BITWORD',
    b'PK_PUBLIC',
    b'MAX_RSA_SIZE',
    b'CTR_COUNTER_BIG_ENDIAN',
]
for name in names:
    const_value = c_int(0)
    rc = LTC.crypt_get_constant(name, byref(const_value))
    value = const_value.value
    print('    %-25s  %d' % (name.decode(), value))


  selected constants:
    ENDIAN_LITTLE              1
    ENDIAN_64BITWORD           1
    PK_PUBLIC                  0
    MAX_RSA_SIZE               4096
    CTR_COUNTER_BIG_ENDIAN     4096


In [61]:
# print selected sizes
print('\n  selected sizes:')

names = [
    b'rijndael_key',
    b'rsa_key',
    b'symmetric_CTR',
    b'twofish_key',
    b'ecc_point',
    b'gcm_state',
    b'sha256_state',
    b'sha512_state',
]
for name in names:
    size_value = c_int(0)
    rc = LTC.crypt_get_size(name, byref(size_value))
    value = size_value.value
    print('    %-25s  %d' % (name.decode(), value))


  selected sizes:
    rijndael_key               484
    rsa_key                    72
    symmetric_CTR              4536
    twofish_key                4256
    ecc_point                  24
    gcm_state                  69904
    sha256_state               112
    sha512_state               208


In [62]:
#---------------------------------------------------------------
#---------------------------------------------------------------
# ctypes getting a list of this build's supported algorithms
# and compiler switches

def get_named_string(lib, name):
    return c_char_p.in_dll(lib, name).value.decode()

print('\n%s' % ('-'*60))
print('This is a string compiled into LTC showing compile ')
print('options and algorithms supported by this build \n')
print(get_named_string(LTC, 'crypt_build_settings'))


------------------------------------------------------------
This is a string compiled into LTC showing compile 
options and algorithms supported by this build 

LibTomCrypt 1.17 (www.libtom.net)
LibTomCrypt is public domain software.


Endianness: little (64-bit words)
Clean stack: disabled
Ciphers built-in:
   Blowfish
   RC2
   RC5
   RC6
   Safer+
   Safer
   Rijndael
   XTEA
   Twofish (tables)
   DES
   CAST5
   Noekeon
   Skipjack
   Khazad
   Anubis  (tweaked)
   KSEED
   KASUMI
   MULTI2
   Camellia
Stream ciphers built-in:
   ChaCha
   RC4
   SOBER128

Hashes built-in:
   SHA3
   SHA-512
   SHA-384
   SHA-512/256
   SHA-256
   SHA-512/224
   SHA-224
   TIGER
   SHA1
   MD5
   MD4
   MD2
   RIPEMD128
   RIPEMD160
   RIPEMD256
   RIPEMD320
   WHIRLPOOL
   BLAKE2S
   BLAKE2B
   CHC_HASH

Block Chaining Modes:
   CFB
   OFB
   ECB
   CBC
   CTR
   LRW (tables) 
   F8
   XTS

MACs:
   HMAC
   OMAC
   PMAC
   PELICAN
   XCBC
   F9
   POLY1305
   BLAKE2S MAC
   BLAKE2B MAC

ENC + A

In [67]:
#---------------------------------------------------------------
# here is an example of how a wrapper can make Python access
# more Pythonic

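def _get_size(name):
    size = c_int(0)
    rc = LTC.crypt_get_size(name, byref(size))
    if rc != 0:
        # a non-zero return code means the lookup failed
        raise ValueError('LTC.crypt_get_size(%r) rc = %d' % (name, rc))
    return size.value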

sha256_state_struct_size = _get_size(b'sha256_state')
sha512_state_struct_size = _get_size(b'sha512_state')

class SHA256(object):
    def __init__(self):
        self.state = create_string_buffer(sha256_state_struct_size)
        LTC.sha256_init(byref(self.state))
        
    def update(self, data):
        if not isinstance(data, bytes):
            raise TypeError("Unicode-objects must be encoded before hashing - tomcrypt")
        LTC.sha256_process(byref(self.state), data, len(data))
        
    def digest(self):
        md = create_string_buffer(32)
        LTC.sha256_done(byref(self.state), byref(md))
        return md.raw

    def hexdigest(self):
        return self.digest().hex()

In [68]:
data = b'hello world'

sha256 = SHA256()
sha256.update(data)
md = sha256.hexdigest()

print(f'the SHA256 digest for {data} is {md}')

the SHA256 digest for b'hello world' is b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9


In [69]:
import hashlib

sha256py = hashlib.sha256()
sha256py.update(data)
mdpy = sha256py.hexdigest()

print(f'the SHA256 digest for {data} is {mdpy}')

the SHA256 digest for b'hello world' is b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9


## WARNING: this SHA256 object is broken!

The digest call uses the `libtomcrypt.sha256_done` function, which modifies the state of the object. So **`digest` isn't idempotent**:

In [78]:
data = b'hello world'

sha256 = SHA256()
sha256.update(data)

print(sha256.hexdigest())
print(sha256.hexdigest())
print(sha256.hexdigest())

b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
87bbcdbd44a8e483f44e718c885e9031212de7d658ecf1b84e6617f4b649918f
739d6dbde991fa20e69a3fed954160712a86d3a4f6e669352e1674fbc119e95b


In comparison, the hashlib hash objects don't have this problem:

In [79]:
sha256py = hashlib.sha256()
sha256py.update(data)

print(sha256py.hexdigest())
print(sha256py.hexdigest())
print(sha256py.hexdigest())

b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
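
One practical consequence, sketched here with the standard library only: because `hashlib` leaves the object untouched when a digest is requested, an intermediate digest can be taken and the same object can keep accepting data afterwards.

```python
import hashlib

h = hashlib.sha256()
h.update(b'hello')
partial = h.hexdigest()   # digest of b'hello'; h itself is not finalized
h.update(b' world')
full = h.hexdigest()      # digest of b'hello world'

# The running object gives the same result as hashing everything at once.
print(full == hashlib.sha256(b'hello world').hexdigest())  # True
```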


Looking at the underlying C code in the Python stdlib shows why. The function that implements the digest method looks like this:

```C
static PyObject *
SHA256Type_digest_impl(SHAobject *self)
/*[clinic end generated code: output=46616a5e909fbc3d input=1fb752e58954157d]*/
{
    unsigned char digest[SHA_DIGESTSIZE];
    SHAobject temp;

    SHAcopy(self, &temp);
    sha_final(digest, &temp);
    return PyBytes_FromStringAndSize((const char *)digest, self->digestsize);
}
```

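It makes a *copy* of the hash object before calling `sha_final` (whose code is taken from `libtomcrypt`), so the finalization step only modifies the temporary copy and the original state can be digested again or updated further.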
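
The same copy-then-finalize idea could be applied to the ctypes wrapper. What follows is a minimal sketch rather than a tested fix. The names `copy_buffer` and `CopyingSHA256` are hypothetical, the library handle and state size are passed in explicitly (for example `LTC` and the value returned by `_get_size(b'sha256_state')`), and it assumes that the `sha256_state` struct contains no internal pointers, so that a byte-for-byte copy is a valid independent state.

```python
import ctypes

def copy_buffer(buf):
    """Hypothetical helper: return an independent byte-for-byte copy of a ctypes buffer."""
    dup = ctypes.create_string_buffer(ctypes.sizeof(buf))
    ctypes.memmove(dup, buf, ctypes.sizeof(buf))
    return dup

class CopyingSHA256:
    """Sketch of a wrapper whose digest() does not disturb the running state."""

    def __init__(self, lib, state_size):
        # lib: the loaded libtomcrypt handle; state_size: size of sha256_state
        self._lib = lib
        self._state = ctypes.create_string_buffer(state_size)
        lib.sha256_init(ctypes.byref(self._state))

    def update(self, data):
        if not isinstance(data, bytes):
            raise TypeError("data must be bytes")
        # the length is passed as an unsigned long to match the C prototype
        self._lib.sha256_process(ctypes.byref(self._state), data,
                                 ctypes.c_ulong(len(data)))

    def digest(self):
        # Assumption: the state is a flat struct, so a raw copy is sufficient.
        tmp = copy_buffer(self._state)
        md = ctypes.create_string_buffer(32)
        self._lib.sha256_done(ctypes.byref(tmp), md)  # finalizes only the copy
        return md.raw

    def hexdigest(self):
        return self.digest().hex()
```

With the library loaded as above, `CopyingSHA256(LTC, sha256_state_struct_size)` would be used like the original `SHA256` class; under the stated assumption, repeated `hexdigest()` calls should then return the same value.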