<table>
<tr><td><img style="height: 150px;" src="images/geo_hydro1.jpg"></td>
<td bgcolor="#FFFFFF">
    <p style="font-size: xx-large; font-weight: 900; line-height: 100%">AG Dynamics of the Earth</p>
    <p style="font-size: large; color: rgba(0,0,0,0.5);">Jupyter notebooks</p>
    <p style="font-size: large; color: rgba(0,0,0,0.5);">Georg Kaufmann</p>
    </td>
</tr>
</table>

# Bytes
----

In this notebook, we discuss **bytes**: how data are stored in them and how they can be manipulated.

As an introduction, see [working with binary data](https://www.devdungeon.com/content/working-binary-data-python).

In [43]:
from itertools import zip_longest
import binascii

----
## From string to byte and back

We first consider a string `message`...

In [11]:
message = "Allwissend bin ich nicht; doch viel ist mir bewusst!"
print(type(message),len(message),message)

<class 'str'> 52 Allwissend bin ich nicht; doch viel ist mir bewusst!


... which we encode, using either the `.encode()` method of the string or the `bytes()` constructor.
Both produce the same result, a `bytes` object.

In [15]:
message_encoded1 = message.encode('utf8')
message_encoded2 = bytes(message,'utf8')
print(type(message_encoded1),len(message_encoded1),message_encoded1)
print(type(message_encoded2),len(message_encoded2),message_encoded2)

<class 'bytes'> 52 b'Allwissend bin ich nicht; doch viel ist mir bewusst!'
<class 'bytes'> 52 b'Allwissend bin ich nicht; doch viel ist mir bewusst!'
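For this message the string and the bytes object have the same length, because every character is plain ASCII and takes exactly one byte in UTF-8. This does not hold in general. A minimal sketch, using a made-up example word `word` that is not part of the notebook:

```python
# Hypothetical example string (not from the notebook) with two non-ASCII characters
word = "Grüße"
word_encoded = word.encode('utf8')

print(len(word), word)                  # 5 characters
print(len(word_encoded), word_encoded)  # 7 bytes: 'ü' and 'ß' take two bytes each
```

The length of a `str` counts characters, the length of a `bytes` object counts bytes.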


Going back from bytes to a string is done with either `.decode()` or `str()`.

In [17]:
message_decoded1 = message_encoded1.decode('utf8')
message_decoded2 = str(message_encoded1,'utf8')
print(type(message_decoded1),len(message_decoded1),message_decoded1)
print(type(message_decoded2),len(message_decoded2),message_decoded2)

<class 'str'> 52 Allwissend bin ich nicht; doch viel ist mir bewusst!
<class 'str'> 52 Allwissend bin ich nicht; doch viel ist mir bewusst!
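The codec used for decoding has to match the one used for encoding. The following sketch, again with a made-up example word `word`, shows what happens with a mismatched codec and how the `errors` argument of `.decode()` can be used to avoid the exception:

```python
word = "Grüße"                 # hypothetical example, not from the notebook
data = word.encode('utf8')

print(data.decode('utf8'))     # round trip with the matching codec

try:
    data.decode('ascii')       # wrong codec: byte values above 127 are not ASCII
except UnicodeDecodeError as err:
    print('decoding failed:', err)

# errors='replace' substitutes U+FFFD for every byte that cannot be decoded
lossy = data.decode('ascii', errors='replace')
print(lossy)
```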


The `bytes()` constructor can also be called with an integer number $n$. It then creates $n$ bytes, each set to zero ...

In [87]:
n=4
print(bytes(n),type(bytes(n)))
print(bytes(n).decode(),type(bytes(n).decode()))

b'\x00\x00\x00\x00' <class 'bytes'>
     <class 'str'>


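Each byte is the `00` byte in hexadecimal notation, which in the usual character table (ASCII) is the null character, a non-printing control character. This is why the decoded string looks blank when printed.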
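Printing the decoded string hides what it actually contains. As a small sketch, `repr()` and `ord()` reveal the character codes, and a comparison with the space character (code 32, or `20` in hexadecimal) shows that the two are different:

```python
zeros = bytes(4)
text = zeros.decode()

print(repr(text))               # '\x00\x00\x00\x00'
print([ord(c) for c in text])   # [0, 0, 0, 0]
print(ord(' '), hex(ord(' ')))  # 32 0x20
print(text == '    ')           # False: null characters are not spaces
```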

----
## 'bytearray'
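While a `bytes` object cannot be changed (it is immutable), a mutable sequence of bytes can be created with
`bytearray`. Iterating over either type yields the individual bytes as integers between 0 and 255: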

In [86]:
message_encoded1 = message.encode('utf8')
print(type(message_encoded1),len(message_encoded1),message_encoded1)

message_bytearray = bytearray(message_encoded1)
print(type(message_bytearray),len(message_bytearray),message_bytearray)
for byte in message_bytearray:
    print(byte,' ',end='')

<class 'bytes'> 52 b'Allwissend bin ich nicht; doch viel ist mir bewusst!'
<class 'bytearray'> 52 bytearray(b'Allwissend bin ich nicht; doch viel ist mir bewusst!')
65  108  108  119  105  115  115  101  110  100  32  98  105  110  32  105  99  104  32  110  105  99  104  116  59  32  100  111  99  104  32  118  105  101  108  32  105  115  116  32  109  105  114  32  98  101  119  117  115  115  116  33  
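The difference between the two types shows up as soon as we try to modify an element. A minimal sketch with made-up variable names `data` and `buffer`:

```python
data = b'abc'
try:
    data[0] = 65              # bytes objects do not support item assignment
except TypeError as err:
    print('bytes:', err)

buffer = bytearray(data)
buffer[0] = 65                # replace 'a' (97) by 'A' (65)
buffer.extend(b'!')           # a bytearray can also grow
print('bytearray:', buffer)   # bytearray(b'Abc!')
```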

----
## Decimal, hexadecimal, and binary representation

We can represent a number in
- the decimal,
- the hexadecimal (`hex()`), and
- the binary (`bin()`) system,

that is, in base `10`, base `16`, and base `2`.

A single byte has 8 bits and can therefore store $2^8=256$ different numbers (0 to 255). For the first two of them, we find in the three systems:

In [89]:
for a in range(2):
    print(a,':',bin(a),':',hex(a))

0 : 0b0 : 0x0
1 : 0b1 : 0x1


Note that ...
- in the **binary** system a prefix `0b` is added,
- in the **hexadecimal** system a prefix `0x` is added.

We strip the prefixes off and pad the binary representation with leading zeros to 8 digits:

In [90]:
for a in range(256):
    print(a,':',bin(a)[2:].zfill(8),':',hex(a)[2:])

0 : 00000000 : 0
1 : 00000001 : 1
2 : 00000010 : 2
3 : 00000011 : 3
4 : 00000100 : 4
5 : 00000101 : 5
6 : 00000110 : 6
7 : 00000111 : 7
8 : 00001000 : 8
9 : 00001001 : 9
10 : 00001010 : a
11 : 00001011 : b
12 : 00001100 : c
13 : 00001101 : d
14 : 00001110 : e
15 : 00001111 : f
16 : 00010000 : 10
17 : 00010001 : 11
18 : 00010010 : 12
19 : 00010011 : 13
20 : 00010100 : 14
21 : 00010101 : 15
22 : 00010110 : 16
23 : 00010111 : 17
24 : 00011000 : 18
25 : 00011001 : 19
26 : 00011010 : 1a
27 : 00011011 : 1b
28 : 00011100 : 1c
29 : 00011101 : 1d
30 : 00011110 : 1e
31 : 00011111 : 1f
32 : 00100000 : 20
33 : 00100001 : 21
34 : 00100010 : 22
35 : 00100011 : 23
36 : 00100100 : 24
37 : 00100101 : 25
38 : 00100110 : 26
39 : 00100111 : 27
40 : 00101000 : 28
41 : 00101001 : 29
42 : 00101010 : 2a
43 : 00101011 : 2b
44 : 00101100 : 2c
45 : 00101101 : 2d
46 : 00101110 : 2e
47 : 00101111 : 2f
48 : 00110000 : 30
49 : 00110001 : 31
50 : 00110010 : 32
51 : 00110011 : 33
52 : 00110100 : 34
53 : 00110101 : 35
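An alternative to slicing off the prefixes is the built-in `format()` with the format specifications `08b` and `02x`, which also pad to a fixed width, while `int()` with an explicit base converts back. For a whole bytes object, `.hex()`, `bytes.fromhex()`, and `binascii.hexlify()` do the same job in one call. A brief sketch, where the value `a = 53` and the short byte string `data` are simply picked for illustration:

```python
import binascii

a = 53
as_bin = format(a, '08b')                   # '00110101'
as_hex = format(a, '02x')                   # '35'
print(a, ':', as_bin, ':', as_hex)
print(int(as_bin, 2), int(as_hex, 16))      # 53 53
print(int('0b110101', 2), int('0x35', 16))  # prefixes are accepted if the base matches

data = "bin ich".encode('utf8')
print(data.hex())                           # 62696e20696368
print(binascii.hexlify(data))               # b'62696e20696368'
print(bytes.fromhex(data.hex()))            # b'bin ich'
```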


----
## Blocks

In [None]:
message = 'ciphertext'
ciphertext = message.encode('utf-8')
keylength = 3

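# split the bytes into consecutive blocks of keylength bytes (the last block may be shorter)
blocks = [ciphertext[i:i+keylength] for i in range(0, len(ciphertext), keylength)]
# transpose: entry k collects the k-th byte of every block, short blocks are padded with 0
transposed = [bytes(t) for t in zip_longest(*blocks, fillvalue=0)]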

print(message)
print(ciphertext)
print(blocks)
print(transposed)
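The cell above splits the bytes into blocks of `keylength` bytes and then transposes them, so that the k-th transposed entry collects the k-th byte of every block. Judging from the variable names, this grouping is meant for analysing a repeating-key cipher, where all bytes in one transposed entry would belong to the same key position; that use case is an assumption, the notebook does not state it. A self-contained sketch with a hypothetical helper `split_and_transpose`:

```python
from itertools import zip_longest

def split_and_transpose(data, keylength):
    """Hypothetical helper: cut data into blocks of keylength bytes and transpose them."""
    blocks = [data[i:i+keylength] for i in range(0, len(data), keylength)]
    # zip_longest pads the short last block with the integer 0, i.e. the byte \x00
    transposed = [bytes(t) for t in zip_longest(*blocks, fillvalue=0)]
    return blocks, transposed

blocks, transposed = split_and_transpose(b'ciphertext', 3)
print(blocks)      # [b'cip', b'her', b'tex', b't']
print(transposed)  # [b'chtt', b'iee\x00', b'prx\x00']

# the same grouping without padding, using a slice with step keylength
unpadded = [b'ciphertext'[k::3] for k in range(3)]
print(unpadded)    # [b'chtt', b'iee', b'prx']
```

The padded variant keeps all transposed entries at the same length, at the price of `\x00` bytes that were not in the original data; the sliced variant avoids the padding.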

In [None]:
# each tuple t holds the k-th byte of every block; 0 is the fill value for the short last block
for t in zip_longest(*blocks, fillvalue=0):
    for i in t:
        print(chr(i))

In [None]:
message = "Allwissend bin ich nicht; doch viel ist mir bewusst!"
ciphertext = message.encode('utf-8')
keylength = 7

blocks = [ciphertext[i:i+keylength] for i in range(0, len(ciphertext), keylength)]
transposed = [bytes(t) for t in zip_longest(*blocks, fillvalue=0)]

print(message)
print(ciphertext)
print(blocks)
print(transposed)

In [None]:
# the loop variable is named `byte`, because `bytes` would shadow the built-in type
for byte in ciphertext:
    print(byte,chr(byte),' ',end='')

In [None]:
for i in blocks:
    print(i)

In [None]:
for i in transposed:
    print(i)

In [3]:
# Create four zero-valued bytes (the object is not empty, it has length 4)
empty_bytes = bytes(4)
print(type(empty_bytes))
print(empty_bytes)

<class 'bytes'>
b'\x00\x00\x00\x00'


In [9]:
# Cast bytes to bytearray
mutable_bytes = bytearray(b'\x00\x0F')
print(mutable_bytes,type(mutable_bytes),len(mutable_bytes))
# Bytearray allows modification
mutable_bytes[0] = 255
mutable_bytes.append(255)
print(mutable_bytes)

# Cast bytearray back to bytes
immutable_bytes = bytes(mutable_bytes)
print(immutable_bytes)

bytearray(b'\x00\x0f') <class 'bytearray'> 2
bytearray(b'\xff\x0f\xff')
b'\xff\x0f\xff'


----