# ByteArray

In the previous notebooks the **immutable** ```str``` and ```bytes``` classes were examined. Immutable means that once an instance is instantiated, it cannot be modified. 

The ```bytes``` and ```bytearray``` classes are both sequences where each unit in the sequence is a byte. However a ```bytes``` instance is **immutable** and a ```bytearray``` instance is **mutable**.

The ```bytearray``` can be conceptualised as the spreadsheet on the left, where each field can be entered and mutated to a new value or deleted. The ```bytes``` instance can be conceptualised as the PDF on the right, where the data can be read but not modified. 

<img src='./images/img_001.png' alt='img_001' width='400'/>

## Initialisation Signature

The docstring of the initialisation signature can be examined:

In [1]:
bytearray?

Init signature: bytearray(self, /, *args, **kwargs)
Docstring:     
bytearray(iterable_of_ints) -> bytearray
bytearray(string, encoding[, errors]) -> bytearray
bytearray(bytes_or_buffer) -> mutable copy of bytes_or_buffer
bytearray(int) -> bytes array of size given by the parameter initialized with null bytes
bytearray() -> empty bytes array

Construct a mutable bytearray object from:
  - an iterable yielding integers in range(256)
  - a text string encoded using the specified encoding
  - a bytes or a buffer object
  - any object implementing the buffer API.
  - an integer
Type:           type
Subclasses:     

It is similar to that of the ```bytes``` class. Construction from a ```tuple``` iterable of ```int``` values between ```0:256``` gives:

In [2]:
bytearray((20, 104, 101, 108, 108, 111, 129))

bytearray(b'\x14hello\x81')
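The remaining forms listed in the docstring can be sketched in the same way. This is a minimal illustration rather than an exhaustive tour, and the variable names are chosen only for this example:

```python
# Each call mirrors one line of the docstring above.
from_text = bytearray('hello', encoding='utf-8')  # text string plus an encoding
from_bytes = bytearray(b'hello')                  # mutable copy of a bytes instance
from_int = bytearray(3)                           # three null bytes
empty = bytearray()                               # empty bytearray

print(from_text)   # bytearray(b'hello')
print(from_bytes)  # bytearray(b'hello')
print(from_int)    # bytearray(b'\x00\x00\x00')
print(empty)       # bytearray(b'')
```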

Recall each of these integers (displayed in decimal) can be viewed in binary and hexadecimal form. Some of them also map to a printable ASCII character; the ones that don't are displayed using a hexadecimal escape sequence:

In [4]:
nums = (20, 104, 101, 108, 108, 111, 129)

import string

print('dec', end='\t')
print('bin'.ljust(10), end='\t')
print('hex'.center(4), end='\t')
print('chr')

for num in nums:
    print(num, end='\t')
    print('0b' + bin(num).removeprefix('0b').zfill(8), end='\t')
    print('0x' + hex(num).removeprefix('0x').zfill(2), end = '\t')
    if chr(num) in string.printable:
        print(chr(num), end = '\t')
    else:
        print('\\x' + hex(num).removeprefix('0x').zfill(2), end='\t')
    print()

dec	bin       	hex 	chr
20	0b00010100	0x14	\x14	
104	0b01101000	0x68	h	
101	0b01100101	0x65	e	
108	0b01101100	0x6c	l	
108	0b01101100	0x6c	l	
111	0b01101111	0x6f	o	
129	0b10000001	0x81	\x81	


Notice that the formal representation of the ```bytearray```, which shows the recommended way to construct an instance, constructs it from a ```bytes``` instance. The ```bytes``` class is considered a fundamental datatype:

In [None]:
bytearray((20, 104, 101, 108, 108, 111, 129))

## Object Design Pattern

In [None]:
bytearray.mro()

Recall that the ```bytes``` class is consistent with the design pattern of the ```str``` class. These have the following abstract base classes:

* object
* Container
* Hashable
* Collection
  * Sized
  * Iterable
* Sequence


The ```bytearray``` class is a ```MutableSequence```, which means it follows all of the design patterns above with the exception of ```Hashable``` and additionally follows the ```MutableSequence``` design pattern:

* object
* Container
* ~~Hashable~~
* Collection
  * Sized
  * Iterable
* Sequence
* MutableSequence
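These relationships can be checked against the abstract base classes in ```collections.abc```. The following is a small sketch; the ```results``` dictionary is only an illustrative container:

```python
from collections.abc import Hashable, MutableSequence, Sequence

# Record which abstract base classes each class satisfies.
results = {}
for cls in (bytes, bytearray):
    results[cls.__name__] = {
        'Sequence': issubclass(cls, Sequence),
        'MutableSequence': issubclass(cls, MutableSequence),
        'Hashable': issubclass(cls, Hashable),
    }
    print(cls.__name__, results[cls.__name__])
```

Both classes should report ```Sequence``` as ```True```, while only ```bytearray``` satisfies ```MutableSequence``` and only ```bytes``` satisfies ```Hashable```.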

The identifiers of the class can therefore be viewed in full by inputting:

In [None]:
help(bytearray)

The ```print_identifier_group``` function from the custom ```helper_module``` can be used to examine the identifiers:

In [5]:
from helper_module import print_identifier_group

The ```bytes``` and ```bytearray``` classes do not have any attributes:

In [13]:
print_identifier_group(bytearray, kind='attribute')

[]


In [14]:
print_identifier_group(bytes, kind='attribute')

[]


Both classes have the ```__doc__``` (*dunder doc*) datamodel attribute. However the ```bytearray``` class also has the ```__hash__``` (*dunder hash*) class attribute, which is set to ```None``` because a ```bytearray``` is mutable and therefore not hashable:

In [15]:
print_identifier_group(bytearray, kind='datamodel_attribute')

['__doc__', '__hash__']


In [17]:
print_identifier_group(bytes, kind='datamodel_attribute')

['__doc__']


In [19]:
bytearray.__hash__ is None

True

The **mutable** ```bytearray``` class has all the methods of the **immutable** ```bytes``` class:

In [6]:
print_identifier_group(bytes, kind='function', second=bytearray, show_unique_identifiers=True)

[]


In [8]:
print_identifier_group(bytes, kind='function', second=bytearray, show_only_intersection_identifiers=True)

['capitalize', 'center', 'count', 'decode', 'endswith', 'expandtabs', 'find', 'fromhex', 'hex', 'index', 'isalnum', 'isalpha', 'isascii', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'removeprefix', 'removesuffix', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']


It also has the supplementary ```MutableSequence``` methods:

In [7]:
print_identifier_group(bytearray, kind='function', second=bytes, show_unique_identifiers=True)

['append', 'clear', 'copy', 'extend', 'insert', 'pop', 'remove', 'reverse']


The **mutable** ```bytearray``` class has most of the datamodel methods of the **immutable** ```bytes``` class. 

The **immutable** ```bytes``` class has a small number of immutable datamodel methods that are not present in the ```bytearray``` class. Likewise the **mutable** ```bytearray``` class has mutable datamodel methods, including those that define the behaviour of inplace operators:

In [20]:
print_identifier_group(bytes, kind='datamodel_method', second=bytearray, show_unique_identifiers=True)

['__bytes__', '__getnewargs__']


In [22]:
print_identifier_group(bytes, kind='datamodel_method', second=bytearray, show_only_intersection_identifiers=True)

['__add__', '__buffer__', '__class__', '__contains__', '__delattr__', '__dir__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']


In [21]:
print_identifier_group(bytearray, kind='datamodel_method', second=bytes, show_unique_identifiers=True)

['__alloc__', '__delitem__', '__iadd__', '__imul__', '__release_buffer__', '__setitem__']


## Hashable

Note that ```__hash__``` (*dunder hash*) is a datamodel method for the ```bytes``` class, giving each ```bytes``` instance a hash value derived from its contents, whereas it is a class attribute for the ```bytearray``` class equal to ```None``` because a **mutable** datatype is not hashable:

In [23]:
callable(getattr(bytearray, '__hash__'))

False

In [24]:
callable(getattr(bytes, '__hash__'))

True

In [25]:
bytearray.__hash__ is None

True

Because the ```bytes``` instance is immutable, it has a value that cannot be changed and a corresponding hash value. The ```bytearray``` is mutable and has no hash value:

In [26]:
hash(b'\x14hello\x81')

-4887952282617764662

Attempting to do the same with the ```bytearray``` will result in a ```TypeError``` because a ```bytearray``` is not hashable:
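A minimal sketch of this failure, wrapped in ```try``` and ```except``` so that the cell runs to completion. The flag name ```unhashable``` is only illustrative:

```python
greeting_ba = bytearray(b'\x14hello\x81')

unhashable = False
try:
    hash(greeting_ba)       # a bytearray has __hash__ set to None
except TypeError as error:
    unhashable = True
    print(type(error).__name__)  # TypeError

# An immutable snapshot of the same bytes can be hashed.
snapshot_hash = hash(bytes(greeting_ba))
print(snapshot_hash == hash(b'\x14hello\x81'))  # True
```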

A consequence of a ```bytearray``` being unhashable is that a ```bytes``` instance can be used as a key in a mapping and a ```bytearray``` instance cannot:

In [27]:
mapping = {b'r': 'red',
           b'g': 'green',
           b'b': 'blue'}

In [28]:
mapping[b'r']

'red'

Think of the mapping as a collection of unique lockers where each locker contains a value. Each unique lock has a unique key that is designed to fit in the unique lock. Distorting the key i.e. mutating it will prevent the key from working. Attempting to use a mutable collection as a key will therefore result in a ```TypeError```:
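The failure can be sketched as follows, assuming a small mapping like the one above. Converting the ```bytearray``` to an immutable ```bytes``` instance first gives a usable key; the names ```key_ba``` and ```key_rejected``` are hypothetical:

```python
mapping = {b'r': 'red'}
key_ba = bytearray(b'g')

key_rejected = False
try:
    mapping[key_ba] = 'green'    # mutable key, so this raises TypeError
except TypeError:
    key_rejected = True

# Cast to bytes to obtain an immutable, hashable key.
mapping[bytes(key_ba)] = 'green'
print(mapping)  # {b'r': 'red', b'g': 'green'}
```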

## Mutable Methods

If the following immutable ```bytes``` and mutable ```bytearray``` instances are created:

In [67]:
greeting_b = b'hello'
greeting_ba = bytearray(b'hello')

Each has a length of 5 bytes:

In [68]:
len(greeting_b)

5

In [69]:
len(greeting_ba)

5

The datamodel identifier ```__alloc__``` (*dunder alloc*) returns the number of bytes allocated to the ```bytearray```:

In [70]:
bytearray.__alloc__?

Docstring:
B.__alloc__() -> int

Return the number of bytes actually allocated.
Type:      method_descriptor

The number of bytes allocated is larger than the number of bytes occupied. This over-allocation is done for memory management in order to speed up the operation of some of the mutable identifiers such as ```insert``` and ```append```:

In [71]:
greeting_ba.__alloc__()

6
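The relationship between the length and the allocation can be observed while appending. The exact allocation sizes are an implementation detail of CPython and may differ between versions, so no output is shown here; the names ```buffer``` and ```sizes``` are only illustrative:

```python
buffer = bytearray(b'hello')

# Record (length, allocated) after each append.
sizes = []
for value in b' world':
    buffer.append(value)
    sizes.append((len(buffer), buffer.__alloc__()))

for length, allocated in sizes:
    print(length, allocated)
```

The allocation is expected to grow in steps rather than on every append, which is what makes repeated appends inexpensive.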

Both the ```bytes``` and ```bytearray``` classes have the immutable datamodel method ```__getitem__``` (*dunder getitem*):

In [72]:
greeting_b.__getitem__?

Signature:      greeting_b.__getitem__(key, /)
Call signature: greeting_b.__getitem__(*args, **kwargs)
Type:           method-wrapper
String form:    <method-wrapper '__getitem__' of bytes object at 0x00000229B4880330>
Docstring:      Return self[key].

In [73]:
greeting_ba.__getitem__?

Signature:      greeting_ba.__getitem__(key, /)
Call signature: greeting_ba.__getitem__(*args, **kwargs)
Type:           method-wrapper
String form:    <method-wrapper '__getitem__' of bytearray object at 0x00000229B4D5D570>
Docstring:      Return self[key].

However only the ```bytearray``` has the mutable datamodel method counterparts:

In [74]:
greeting_ba.__setitem__?

Signature:      greeting_ba.__setitem__(key, value, /)
Call signature: greeting_ba.__setitem__(*args, **kwargs)
Type:           method-wrapper
String form:    <method-wrapper '__setitem__' of bytearray object at 0x00000229B4D5D570>
Docstring:      Set self[key] to value.

In [75]:
greeting_ba.__delitem__?

Signature:      greeting_ba.__delitem__(key, /)
Call signature: greeting_ba.__delitem__(*args, **kwargs)
Type:           method-wrapper
String form:    <method-wrapper '__delitem__' of bytearray object at 0x00000229B4D5D570>
Docstring:      Delete self[key].

The ```bytes``` class, in contrast, only has the immutable identifier ```__getitem__```. This means an individual byte can be indexed from either a ```bytes``` or a ```bytearray``` instance:

In [76]:
greeting_b[0]

104

In [77]:
greeting_ba[0]

104

Recall this letter is:

In [78]:
chr(104)

'h'

In [79]:
'0b' + bin(104).removeprefix('0b').zfill(8)

'0b01101000'

In [80]:
'0x' + hex(104).removeprefix('0x').zfill(2)

'0x68'

If the letter ```'H'``` is now examined:

In [81]:
ord('H')

72

In [82]:
'0b' + bin(ord('H')).removeprefix('0b').zfill(8)

'0b01001000'

In [83]:
'0x' + hex(ord('H')).removeprefix('0x').zfill(2)

'0x48'

Because a ```bytearray``` is mutable, the byte at index ```0``` can be reassigned to a new value:

In [84]:
greeting_ba[0] = 72

Notice there is no cell output as a reassignment was carried out. This modifies the ```bytearray``` instance ```greeting_ba``` in place and now the first letter is capitalised:

In [51]:
greeting_ba

bytearray(b'Hello')

If the same operation is tried with the immutable ```bytes``` instance there is a ```TypeError``` because a ```bytes``` instance does not support item reassignment:
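A short sketch of the failure, caught with ```try``` and ```except``` so the cell completes. The usual workaround with an immutable instance is to build a new instance; the names ```assignment_failed``` and ```capitalised``` are illustrative only:

```python
greeting_b = b'hello'

assignment_failed = False
try:
    greeting_b[0] = 72           # bytes has no __setitem__
except TypeError:
    assignment_failed = True

# Build a new bytes instance instead of mutating the old one.
capitalised = b'H' + greeting_b[1:]
print(capitalised)  # b'Hello'
print(greeting_b)   # b'hello'
```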

The first letter can also be deleted:

In [102]:
del greeting_ba[0]

In [103]:
greeting_ba

bytearray(b'ello')

Slicing can also be used:

In [104]:
greeting_ba[0:2]

bytearray(b'el')

A slice can be reassigned using another ```bytearray``` instance or a ```bytes``` instance. The replacement does not need to have the same length as the slice:

In [105]:
greeting_ba[0:2] = bytearray(b'HEL')

Once again, the ```bytearray``` is mutated:

In [106]:
greeting_ba

bytearray(b'HELlo')
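Slice reassignment and slice deletion can both change the length of the instance. The following sketch uses a fresh instance called ```word```, a name chosen only for this illustration:

```python
word = bytearray(b'hello')

word[1:3] = b'EEEE'   # two bytes replaced by four
print(word)           # bytearray(b'hEEEElo')

del word[1:5]         # delete the four inserted bytes
print(word)           # bytearray(b'hlo')

word[0:0] = b'>>'     # an empty slice inserts without replacing
print(word)           # bytearray(b'>>hlo')
```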

There is a subtle difference between the inplace operators ```+=``` and ```*=``` in the **immutable** ```bytes``` and the **mutable** ```bytearray``` classes:

* In the bytes class, the ```__iadd__``` (*dunder inplace add*) and ```__imul__``` (*dunder inplace mul*) datamodel identifiers are not defined, so the behaviour from ```__add__``` (*dunder add*) and ```__mul__``` (*dunder mul*) is used alongside reassignment. In other words a new ```bytes``` instance is created and the instance name is moved from the old instance to the new instance.
* In the bytearray class, the ```__iadd__``` (*dunder inplace add*) and ```__imul__``` (*dunder inplace mul*) datamodel identifiers are defined and the original instance is mutated.
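The practical consequence shows up when a second name refers to the same instance. In this sketch the ```alias_b``` and ```alias_ba``` names are hypothetical and exist only to observe the difference:

```python
data_b = b'hello'
alias_b = data_b
data_b += b'!'        # new bytes instance, alias_b is left behind
print(alias_b)        # b'hello'
print(data_b)         # b'hello!'

data_ba = bytearray(b'hello')
alias_ba = data_ba
data_ba += b'!'       # mutated in place, alias_ba sees the change
print(alias_ba)       # bytearray(b'hello!')
print(alias_ba is data_ba)  # True
```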

In Python each unique object has its own identification. 

In [125]:
greeting_b = b'hello'
greeting_ba = bytearray(b'hello')

For the ```bytes``` instance the identification can be checked before and after the use of an inplace operator:

In [126]:
id1 = id(greeting_b)
id1

2378145766624

In [127]:
greeting_b += b'HELLO'

In [128]:
id2 = id(greeting_b)
id2

2378145759328

Notice for the ```bytes``` instance the ids differ. This is because the ```bytes``` instance is **immutable**, meaning another instance was created and the label ```greeting_b``` was moved from the old instance to the new instance:

In [129]:
id1 == id2

False

For the ```bytearray``` instance the identification can be checked before and after the use of an inplace operator:

In [130]:
id1 = id(greeting_ba)
id1

2378151532592

In [131]:
greeting_ba += bytearray(b'HELLO')

In [132]:
id2 = id(greeting_ba)
id2

2378151532592

Notice for the ```bytearray``` instance the ids remain the same. This is because the ```bytearray``` instance is **mutable** and the inplace operator carries out the operation in place:

In [133]:
id1 == id2

True

This can be seen with the inplace multiplication operator also:

In [134]:
greeting_b = b'hello'
greeting_ba = bytearray(b'hello')

In [135]:
id1 = id(greeting_b)
greeting_b *= 5
id2 = id(greeting_b)
id1 == id2

False

In [136]:
greeting_b # new instance

b'hellohellohellohellohello'

In [138]:
id1 = id(greeting_ba)
greeting_ba *= 5
id2 = id(greeting_ba)
id1 == id2

True

In [139]:
greeting_ba # mutated original instance

bytearray(b'hellohellohellohellohello')

Most methods are set up to be either immutable (return a new instance without modifying the original instance) or mutable (modify the existing instance and return no value).

The mutable methods below all mutate the ```bytearray``` directly and have no return value:

* append
* extend
* insert
* remove
* reverse
* clear

The immutable method ```copy``` is usually a companion for mutable methods and returns a copy without modifying the original instance:

* copy

The exception to the rule is ```pop``` which both mutates the original instance and returns the value popped:

* pop
  

```bytearray.append``` as the name suggests appends (adds to the back) a single byte:

In [142]:
greeting_ba = bytearray(b'hello')

In [141]:
greeting_ba.append?

Signature: greeting_ba.append(item, /)
Docstring:
Append a single item to the end of the bytearray.

item
  The item to be appended.
Type:      builtin_function_or_method

This is normally supplied in the form of an ordinal integer. Recall each character has an ordinal integer:

In [144]:
ord('!')

33

In [145]:
returnval = greeting_ba.append(33)

The instance is mutated inplace:

In [146]:
greeting_ba

bytearray(b'hello!')

The returnval is therefore ```None```:

In [147]:
returnval

In [148]:
returnval is None

True

A common mistake beginners make when first encountering a mutable method is to use it with reassignment:

In [149]:
greeting_ba = greeting_ba.append(33)

The operation of the mutable method is carried out and returns a ```None``` instance which is the only instance of the ```NoneType``` class. 

The instance name ```greeting_ba``` is peeled off from the original ```bytearray``` instance, which had its value updated from ```bytearray(b'hello!')``` to ```bytearray(b'hello!!')``` by the second ```append```, and placed on the returned ```None``` instance of the ```NoneType``` class. Because the ```bytearray``` instance now has no references it is orphaned and cleaned up by Python's garbage collection:

In [150]:
greeting_ba

In [None]:
greeting_ba is None
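The two working patterns can be sketched side by side: call the mutable method on its own line, or use an immutable operator together with assignment to a new name. The name ```longer``` is only illustrative:

```python
greeting_ba = bytearray(b'hello')

# Pattern 1: mutate in place, with no reassignment.
greeting_ba.append(33)
print(greeting_ba)    # bytearray(b'hello!')

# Pattern 2: an immutable operation returns a new instance to assign.
longer = greeting_ba + b'!'
print(longer)         # bytearray(b'hello!!')
print(greeting_ba)    # bytearray(b'hello!')
```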

```bytearray.extend``` as the name suggests extends the end of the ```bytearray``` instance using another ```bytearray```, which usually spans multiple bytes:

In [151]:
greeting_ba = bytearray(b'hello')

In [152]:
greeting_ba.extend(bytearray(b'Hello')) # No return value

In [153]:
greeting_ba # Original instance mutated

bytearray(b'helloHello')

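The method ```extend``` and inplace addition ```+=``` are very similar. Both accept a bytes-like object such as a ```bytes``` or ```bytearray``` instance. However ```extend``` also accepts any iterable of ```int``` values in ```range(256)```, whereas ```+=``` does not. The following cells compare ```+```, ```+=``` and ```extend```: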

In [154]:
greeting_ba = bytearray(b'hello')

In [155]:
greeting_ba + bytearray(b'Hello') # returns a new instance

bytearray(b'helloHello')

In [156]:
greeting_ba # original instance unchanged

bytearray(b'hello')

In [159]:
greeting_ba += bytearray(b'Hello') # mutated inplace, no return value

In [160]:
greeting_ba

bytearray(b'helloHello')

In [161]:
greeting_ba.extend(bytearray(b'Hi')) # mutated inplace, no return value

In [162]:
greeting_ba.extend(b'Bye') # mutated inplace, no return value

In [163]:
greeting_ba

bytearray(b'helloHelloHiBye')
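One difference between ```+=``` and ```extend``` is the kind of argument each accepts: ```+=``` requires a bytes-like object, while ```extend``` also takes an iterable of integers. A small sketch, where ```buffer``` and ```concat_failed``` are illustrative names:

```python
buffer = bytearray(b'hi')

buffer += b'!'              # bytes-like object: accepted by +=
buffer.extend([72, 105])    # iterable of ints: accepted by extend
print(buffer)               # bytearray(b'hi!Hi')

concat_failed = False
try:
    buffer += [33]          # a list is not bytes-like
except TypeError:
    concat_failed = True

print(concat_failed)        # True
print(buffer)               # bytearray(b'hi!Hi')
```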

```bytearray.insert``` as the name suggests can be used to insert a byte in a ```bytearray``` instance at a given index:

In [164]:
greeting_ba.insert?

Signature: greeting_ba.insert(index, item, /)
Docstring:
Insert a single item into the bytearray before the given index.

index
  The index where the value is to be inserted.
item
  The item to be inserted.
Type:      builtin_function_or_method

In [165]:
greeting_ba = bytearray(b'hello')

The second ```l``` is at index ```3```:

In [166]:
greeting_ba[3]

108

Recall:

In [167]:
chr(108)

'l'

And:

In [168]:
ord('L')

76

In [169]:
greeting_ba.insert(3, 76)  # mutated inplace, no return value

In [170]:
greeting_ba 

bytearray(b'helLlo')

The mutable method ```insert``` inserts a new byte so that the original byte at that index and all bytes at subsequent indexes are shifted along by one. This behaviour is different from the mutable datamodel method ```__setitem__``` (*dunder setitem*) which replaces an original byte or slice with a new byte or slice. 
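The difference can be seen by applying each operation to its own copy of the same data. The names ```inserted``` and ```replaced``` are used only for this comparison:

```python
inserted = bytearray(b'hello')
inserted.insert(3, 76)     # bytes from index 3 onwards shift along
print(inserted)            # bytearray(b'helLlo')
print(len(inserted))       # 6

replaced = bytearray(b'hello')
replaced[3] = 76           # the byte at index 3 is overwritten
print(replaced)            # bytearray(b'helLo')
print(len(replaced))       # 5
```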

```bytearray.remove``` as the name suggests can be used to remove the first occurrence of a byte in a ```bytearray``` instance:

In [None]:
greeting_ba.remove?

In [None]:
greeting_ba = bytearray(b'hello')

For example the first occurrence of ```l``` can be removed:

In [None]:
ord('l')

In [None]:
greeting_ba.remove(108) # no return value

In [None]:
greeting_ba # original instance mutated
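As a sketch of the expected behaviour, only the first matching byte is removed, and asking for a byte that is not present raises a ```ValueError```. The flag ```missing``` is an illustrative name:

```python
greeting_ba = bytearray(b'hello')

greeting_ba.remove(ord('l'))   # removes the first 'l' only
print(greeting_ba)             # bytearray(b'helo')

missing = False
try:
    greeting_ba.remove(ord('z'))   # 'z' does not occur
except ValueError:
    missing = True

print(missing)                 # True
```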

```bytearray.reverse``` as the name suggests reverses the order of the bytes in the ```bytearray``` instance:

In [None]:
greeting_ba.reverse?

In [None]:
greeting_ba = bytearray(b'hello') 

In [None]:
greeting_ba.reverse() # No return value

In [None]:
greeting_ba # Original instance mutated

This is different from the ```reversed``` builtin function, which returns a reverse iterator and leaves the original instance unmodified:

In [None]:
backward = reversed(greeting_ba)
backward

In [None]:
next(backward)
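The contrast can be sketched with a fresh instance. Consuming the iterator into a new ```bytearray``` leaves the original untouched, whereas the method mutates it; the names ```backward_copy``` and ```returnval``` are illustrative:

```python
greeting_ba = bytearray(b'hello')

# reversed gives an iterator; build a new instance from it.
backward_copy = bytearray(reversed(greeting_ba))
print(backward_copy)   # bytearray(b'olleh')
print(greeting_ba)     # bytearray(b'hello')

# The reverse method mutates in place and returns None.
returnval = greeting_ba.reverse()
print(returnval)       # None
print(greeting_ba)     # bytearray(b'olleh')
```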

```bytearray.clear``` as the name suggests clears all bytes in the ```bytearray``` instance, leaving it an empty ```bytearray``` instance:

In [None]:
greeting_ba.clear?

In [None]:
greeting_ba = bytearray(b'hello')

In [None]:
greeting_ba.clear() # No return value, original instance mutated

```bytearray.copy``` as the name suggests can be used to return a copy of a ```bytearray```:

In [None]:
bytearray.copy?

In [None]:
greeting_ba = bytearray(b'hello') 

In [None]:
greeting_ba.copy() # Return Value

In [None]:
greeting_bac = greeting_ba.copy()

If the ids of each instance are examined:

In [None]:
id1 = id(greeting_ba)
id1

In [None]:
id2 = id(greeting_bac)
id2

In [None]:
id1 == id2

The ids differ and therefore these are two separate instances and not aliases of the same instance:

In [None]:
greeting_ba is greeting_bac

They do however have the same value:

In [None]:
greeting_ba

In [None]:
greeting_bac

Therefore they are equal:

In [None]:
greeting_ba == greeting_bac

One instance can be modified without influencing the other:

In [None]:
greeting_ba.clear() # No return value

In [None]:
greeting_bac # Independent instance unchanged
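For comparison, plain assignment creates an alias rather than a copy. In this sketch the names ```original```, ```alias``` and ```duplicate``` are hypothetical:

```python
original = bytearray(b'hello')
alias = original              # second name for the same instance
duplicate = original.copy()   # separate instance with equal value

original.append(33)

print(alias)       # bytearray(b'hello!')
print(duplicate)   # bytearray(b'hello')
print(alias is original)      # True
print(duplicate is original)  # False
```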

```bytearray.pop``` as the name suggests pops a value off a ```bytearray``` instance. By default the value is popped off the end, but this can be changed by supplying an index. Note that the popped value is returned:

In [None]:
greeting_ba.pop?

In [None]:
greeting_ba = bytearray(b'hello') 

In [None]:
greeting_ba.pop() # Return value

In [None]:
greeting_ba # Instance mutated

In [None]:
chr(111)

In [None]:
popped_value = greeting_ba.pop(1)

In [None]:
popped_value # Returned value

In [None]:
greeting_ba # Instance mutated

In [None]:
chr(popped_value)
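Because each ```pop``` shortens the instance, popping from an empty ```bytearray``` raises an ```IndexError```. A minimal sketch, with ```stack``` and ```empty_pop_failed``` as illustrative names:

```python
stack = bytearray(b'hi')

last = stack.pop()     # default: pop from the end
first = stack.pop(0)   # pop from a supplied index
print(last, first)     # 105 104
print(stack)           # bytearray(b'')

empty_pop_failed = False
try:
    stack.pop()        # nothing left to pop
except IndexError:
    empty_pop_failed = True

print(empty_pop_failed)  # True
```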