<a href="https://colab.research.google.com/github/koad7/core-python/blob/main/Core_Python_3_Byte_Oriented_Programming.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Core Python 3: Byte Oriented Programming


## Course Overview
This course introduces byte-oriented programming in Python. It covers bitwise operators, the binary representation of integers, the bytes and bytearray types, packing and unpacking binary data with the struct module, the memoryview type, and efficient access to very large data volumes with memory-mapped files. It requires prior knowledge of Python basics, including functions, classes, and modules.


## Bit- and Byte-wise Operations
This chapter introduces bitwise operators and explains bit and byte-wise operations in Python. Bitwise operations are necessary when dealing with binary data or when interoperating with systems not written in Python. They include bitwise AND, OR, exclusive OR (XOR), bitwise complement (NOT), and shift operators.

   Code example: bitwise AND operation
   
   ```python
   # bitwise AND operation
   left = 0b1100
   right = 0b1010
   result = left & right
   print(bin(result))  # Outputs: 0b1000
   ```

## Bitwise Operators
This chapter explores bitwise operators in-depth. Bitwise operators compare corresponding bits in two operands and perform operations based on them. This chapter focuses on bitwise AND, bitwise OR, bitwise XOR, and bitwise NOT, along with left shift and right shift operators.

   Code example: bitwise XOR operation
   
   ```python
   # bitwise XOR operation
   left = 0b1100
   right = 0b1010
   result = left ^ right
   print(bin(result))  # Outputs: 0b110
   ```
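
The shift operators named in this chapter can be illustrated with a minimal sketch. The value `0b0011` is chosen here purely for illustration:

```python
# Shift operators move a bit pattern left or right by a number of positions.
value = 0b0011

shifted_left = value << 2    # equivalent to multiplying by 2**2
shifted_right = value >> 1   # equivalent to floor-dividing by 2**1

print(bin(shifted_left))   # Outputs: 0b1100
print(bin(shifted_right))  # Outputs: 0b1
```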

## Masking with Bitwise-and
This chapter delves into the concept of masking, which uses the bitwise AND operator. Masking can be used to control which bits are selected and helps extract bytes or nibbles or any positional subset of bits from a larger bit sequence. The chapter also introduces the concept of bitwise shift operators.

   Code example: masking and shifting
   
   ```python
   # masking and shifting
   data = 0b110111000001010000111100
   mask = 0b111111110000000000000000
   result = (data & mask) >> 16
   print(bin(result))  # Outputs: 0b11011100
   ```
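
The same idea extends to any byte or nibble of the value. The sketch below reuses the 24-bit `data` value and shifts before masking, which is one common ordering; the variable names are illustrative:

```python
data = 0b110111000001010000111100  # same 24-bit value as in the example above

# Shift the byte of interest down to the low 8 bits, then mask with 0xFF.
high = (data >> 16) & 0xFF
middle = (data >> 8) & 0xFF
low = data & 0xFF

# A nibble is 4 bits wide, so its mask is 0xF.
low_nibble = data & 0xF

print(f"{high:08b} {middle:08b} {low:08b}")  # Outputs: 11011100 00010100 00111100
print(f"{low_nibble:04b}")                   # Outputs: 1100
```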

## Flags with Bitwise-and

This chapter discusses the use of the bitwise AND operator for testing flags. Flags are boolean values often used to represent the state of a function or program. Storing large numbers of boolean values can be inefficient in terms of memory, and this chapter illustrates the benefits of storing flags as binary within integers.

Example: Testing a flag within a bitfield
```python
bitfield = 0b1010  # binary representation of bitfield
mask = 0b10  # mask for testing the second bit (1-indexed)
result = bitfield & mask
print(result)  # non-zero if bit is set, 0 if it's not set
```
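
In practice each flag usually gets a named mask. The following is a minimal sketch; the flag names and the `is_set` helper are hypothetical and not part of the course material:

```python
# Hypothetical flag names, one bit each.
FLAG_VISIBLE = 1 << 0   # 0b0001
FLAG_ENABLED = 1 << 1   # 0b0010
FLAG_DIRTY = 1 << 2     # 0b0100
FLAG_LOCKED = 1 << 3    # 0b1000

bitfield = 0b1010  # ENABLED and LOCKED are set


def is_set(field, flag):
    """Return True if every bit of `flag` is also set in `field`."""
    return (field & flag) == flag


print(is_set(bitfield, FLAG_ENABLED))  # Outputs: True
print(is_set(bitfield, FLAG_DIRTY))    # Outputs: False
```

Four flags stored this way occupy four bits of a single integer rather than four separate boolean objects.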

## Clearing Flags with Bitwise-and

This chapter discusses clearing flags using the bitwise AND operator. Using an inverted mask, the specific bit you wish to clear can be set to 0.

Example: Clearing a specific bit in a bitfield
```python
bitfield = 0b11001100
mask = ~0b01000000  # mask to clear the 7th bit (1-indexed), i.e. bit 6 counting from 0
bitfield &= mask
print(bin(bitfield))  # binary representation of updated bitfield
```
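
The inverted-mask pattern can be wrapped in a small helper. This is a sketch only; `clear_bits` is a hypothetical name introduced here:

```python
def clear_bits(field, mask):
    """Return `field` with every bit that is 1 in `mask` forced to 0."""
    return field & ~mask


bitfield = 0b11001100

print(bin(clear_bits(bitfield, 0b01000000)))  # Outputs: 0b10001100
print(bin(clear_bits(bitfield, 0b00001100)))  # Outputs: 0b11000000
print(bin(clear_bits(bitfield, 0b00000001)))  # Outputs: 0b11001100 (bit was already clear)
```

Note that `~mask` is a negative number in Python, but AND-ing it with a non-negative bitfield still gives a non-negative result.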

## Bitwise-or

This chapter introduces the bitwise OR operator. A bit in the result is set to 1 if either or both of the operand bits are set to 1.

Example: Bitwise OR operation
```python
binary1 = 0b1100
binary2 = 0b1010
result = binary1 | binary2
print(bin(result))
```

## Packing Values with Bitwise-or

This chapter uses the bitwise OR operator to combine different color channels into a 24-bit color number.

Example: Combining red, green, and blue channels
```python
r = 0b11110101  # red channel
g = 0b00110011  # green channel
b = 0b10101010  # blue channel
color = (r << 16) | (g << 8) | b
print(bin(color))  # binary representation of 24-bit color
```
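
Packing can be reversed with the shift-and-mask technique from the masking chapter. A minimal sketch, using the same channel values as above:

```python
# Rebuild the packed 24-bit color from the example above.
color = (0b11110101 << 16) | (0b00110011 << 8) | 0b10101010

# Shift each channel down to the low byte and mask off everything else.
r = (color >> 16) & 0xFF
g = (color >> 8) & 0xFF
b = color & 0xFF

print(f"#{r:02x}{g:02x}{b:02x}")  # Outputs: #f533aa
```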

## Setting and Combining Flags with Bitwise-or

This chapter shows how to set flags with the bitwise OR operator. It's used to set any bit by bitwise OR'ing with a mask which has a 1 at the desired position.

Example: Setting multiple flags simultaneously
```python
bitfield = 0b000000000
# permissions
USR_READ = 0b100000000
USR_WRITE = 0b010000000
GRP_READ = 0b000100000
OTH_READ = 0b000000100
# setting flags
bitfield |= USR_READ | USR_WRITE | GRP_READ | OTH_READ
print(bin(bitfield))
```

This chapter also demonstrates how to use the augmented assignment version of bitwise OR for updating the permissions in-place.

Example: Update permissions in-place
```python
GRP_WRITE = 0b000010000
bitfield |= GRP_WRITE  # bitwise OR assignment
print(bin(bitfield))
```

## Bitwise Exclusive-or

This chapter explains the bitwise exclusive OR operator (^). As the name implies, each bit of the result is 1 only when exactly one of the two corresponding input bits is 1. This can be thought of as a "bitwise not equal". It can also be seen as a "programmable bit flipper", where bits are selectively toggled from 0 to 1 or 1 to 0 wherever the mask operand has a 1.

A useful property of bitwise XOR is that applying it twice with the same value will return you to the original value. Here's a Python code example of this:

```python
bits = 0b1010  # A 4-bit value
mask = 0b0111  # A mask to flip right-most three bits

print(bin(bits))  # before flipping
bits ^= mask  # flipping in place
print(bin(bits))  # after first flip
bits ^= mask  # flipping again
print(bin(bits))  # back to original value
```
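
The "bitwise not equal" reading can be illustrated with a short sketch, where the values are chosen only for illustration. The result has a 1 exactly where the two operands differ:

```python
a = 0b1100
b = 0b1010

diff = a ^ b  # 1 wherever the bits of a and b disagree

print(f"{diff:04b}")         # Outputs: 0110
print(bin(diff).count("1"))  # Outputs: 2 (number of differing bit positions)
print(a ^ a)                 # Outputs: 0 (a value never differs from itself)
```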

## Bitwise Complement

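This chapter explains the bitwise complement or NOT operator (~), which flips all bits of its operand. Unlike the other bitwise operators, bitwise complement is unary, taking only a single operand. The output can be confusing because Python integers have no fixed width and behave as if stored in two's complement with an unlimited number of sign bits: `~x` always equals `-x - 1`, so complementing a positive value yields a negative one. Masking the result with a value such as `0xFF` recovers the inverted pattern at a fixed width.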

Here is an example of using the bitwise NOT operator:

```python
v = 0b11110000  # An 8-bit value
print(bin(~v & 0xFF))  # Applying bitwise NOT and masking to get inverted 8-bit value
```
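
The unmasked result is what tends to surprise. A minimal sketch of what Python prints with and without the mask:

```python
v = 0b11110000  # 240

print(~v)                  # Outputs: -241
print(bin(~v))             # Outputs: -0b11110001 (sign and magnitude, not raw bits)
print(~v == -v - 1)        # Outputs: True
print(f"{~v & 0xFF:08b}")  # Outputs: 00001111 (inverted pattern at 8 bits)
```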

## Byte-wise Operations with Integers

This chapter introduces byte-wise operations with integers, which are useful when dealing with computer memory. Python provides the `to_bytes` method for converting integers to byte objects, and the `from_bytes` method for converting byte objects back to integers.

Here is an example of using byte-wise operations:

```python
import sys

x = 240  # An integer
b = x.to_bytes(2, byteorder=sys.byteorder)  # Convert to bytes
print(b)

x_new = int.from_bytes(b, byteorder=sys.byteorder)  # Convert bytes back to integer
print(x_new)
```
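
The `byteorder` argument matters as soon as a value spans more than one byte. A minimal sketch with an illustrative 32-bit value:

```python
x = 0x12345678

big = x.to_bytes(4, byteorder="big")        # most significant byte first
little = x.to_bytes(4, byteorder="little")  # least significant byte first

print(big.hex())     # Outputs: 12345678
print(little.hex())  # Outputs: 78563412

# Reading back with the matching byte order recovers the original value.
print(int.from_bytes(big, byteorder="big") == x)        # Outputs: True
print(int.from_bytes(little, byteorder="little") == x)  # Outputs: True
```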

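The complement of a 1-byte value is a negative integer (`~0b11110000` is `-241`), which does not fit in one signed byte. One way to recover the inverted 8-bit pattern is to call `to_bytes` with `signed=True` and two bytes of room, then take the low byte. Here is an example: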

```python
x = 0b11110000  # An 8-bit value
x = ~x  # Bitwise NOT
print(bin(x.to_bytes(2, byteorder='big', signed=True)[1] & 0xFF))  # Get the inverted 8-bit pattern
```


## Bit Sets
This chapter discusses how integers can be used as bit sets for different purposes, like keeping track of certain features in a collection. It gives a detailed example of how to manage data on vehicle ownership among eight programmers.

Key code examples:

1. Creating masks for each programmer:
    ```python
    albert = 1 << 0
    betty = 1 << 1
    charles = 1 << 2
    # ...continuing till hannah
    ```

2. Creating bit sets for car and bicycle owners:
    ```python
    car_owners = albert | charles | hannah
    bicycle_owners = daphne | edward | hannah
    ```

3. Determining who owns both a car and a bicycle:
    ```python
    car_and_bike_owners = car_owners & bicycle_owners
    ```

4. The example also illustrates using the bitwise operators to perform set operations in the context of Python's set type.
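
The pieces above can be combined into one runnable sketch. The names `frances` and `george` are hypothetical placeholders for the programmers not named in these notes, and the `members` helper is an illustration rather than part of the course code:

```python
# Eight programmers, one bit each. "frances" and "george" are placeholders.
names = ["albert", "betty", "charles", "daphne",
         "edward", "frances", "george", "hannah"]
masks = {name: 1 << position for position, name in enumerate(names)}

car_owners = masks["albert"] | masks["charles"] | masks["hannah"]
bicycle_owners = masks["daphne"] | masks["edward"] | masks["hannah"]


def members(bitset):
    """Translate an integer bit set back into a Python set of names."""
    return {name for name, mask in masks.items() if bitset & mask}


both = car_owners & bicycle_owners      # intersection
either = car_owners | bicycle_owners    # union
only_one = car_owners ^ bicycle_owners  # symmetric difference

print(members(both))             # Outputs: {'hannah'}
print(sorted(members(either)))   # Outputs: ['albert', 'charles', 'daphne', 'edward', 'hannah']
print(sorted(members(only_one))) # Outputs: ['albert', 'charles', 'daphne', 'edward']

# Python's set type uses the same operators with the same meaning.
print({"albert", "charles", "hannah"} & {"daphne", "edward", "hannah"})  # Outputs: {'hannah'}
```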

## Byte-oriented Types
This chapter focuses on understanding Python's bytes and bytearray types, as well as their differences.

Key code examples:

1. Create a bytes object:
    ```python
    b = b'hello world'
    ```

2. Decode bytes into a string:
    ```python
    s = b.decode('utf-8')
    ```

3. Various forms of the bytes constructor:
    ```python
    bytes()  # Empty byte sequence
    bytes(5)  # Sequence of 5 null bytes
    bytes([1, 2, 3, 4, 5])  # From an iterable
    bytes("hello", "utf-8")  # From a string with encoding
    ```

4. Create a bytearray object:
    ```python
    ba = bytearray(b'hello world')
    ```

5. Modify a bytearray object:
    ```python
    ba[0] = 72  # Change 'h' to 'H'
    ```
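
The main difference between the two types is mutability. A minimal sketch contrasting them:

```python
b = b"hello world"

print(b[0])   # Outputs: 104 (indexing yields an int, not a length-1 bytes object)
print(b[:5])  # Outputs: b'hello' (slicing yields bytes)

# bytes is immutable, so item assignment raises TypeError.
try:
    b[0] = 72
except TypeError:
    print("bytes objects cannot be modified in place")

# bytearray is the mutable counterpart.
ba = bytearray(b)
ba[0] = 72         # change 'h' to 'H'
ba.extend(b"!")    # grow in place
print(ba)          # Outputs: bytearray(b'Hello world!')
print(bytes(ba))   # Outputs: b'Hello world!'
```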

## Interpreting Binary Structures
This chapter describes Python's struct module, which converts between C structures stored as bytes and Python values. The chapter provides an example of writing a binary file in C and then reading it in Python.

Key code examples:

1. Writing a binary file in C (not in Python):

```c
#include <stdio.h>

typedef struct {
    float x;
    float y;
    float z;
} Vector;

typedef struct {
    unsigned short int red;
    unsigned short int green;
    unsigned short int blue;
} Color;

typedef struct {
    Vector position;
    Color color;
} Vertex;

int main() {
    FILE *file = fopen("colors.bin", "wb");
    Vertex vertices[4];
    // initialize vertices
    fwrite(vertices, sizeof(vertices), 1, file);
    fclose(file);
    return 0;
}
```

2. Reading the binary file in Python using the struct module:

```python
import struct

with open('colors.bin', 'rb') as f:
    data = f.read()

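# The file holds several vertices, so read only the first one here;
# struct.unpack would require the buffer to be exactly 18 bytes long.
vertex_data = struct.unpack_from('<3f 3H', data)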
```
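
The format string can be tried out without the C program by packing a vertex in Python and unpacking it again. This is a sketch with made-up vertex values; `VERTEX_FORMAT` is a name introduced here:

```python
import struct

# '<' selects little-endian with no padding: 3 floats, then 3 unsigned shorts.
VERTEX_FORMAT = "<3f3H"

packed = struct.pack(VERTEX_FORMAT, 1.0, 2.0, 3.0, 255, 128, 0)

print(struct.calcsize(VERTEX_FORMAT))  # Outputs: 18
print(len(packed))                     # Outputs: 18

x, y, z, red, green, blue = struct.unpack(VERTEX_FORMAT, packed)
print((x, y, z), (red, green, blue))   # Outputs: (1.0, 2.0, 3.0) (255, 128, 0)
```

`struct.calcsize` is a quick way to check that a format string matches the record size the C side actually wrote.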

## Using the struct Module to Read Binary Structures
This chapter focuses on how to use Python's struct module to read binary structures. It starts with a simple program that reads a vertex from a file. The struct.unpack_from function is used to convert the raw byte sequence into more manageable types, with the format string defining the expected data types. A number of modifications are made to the program, including using tuple unpacking to assign the read values to variables, using repeat counts to shorten the format string, and reading all the vertex structures from the file.

An interesting challenge encountered in this chapter is reading struct data that was written by a program built with a different C compiler. The Python program failed to read the data correctly because that compiler had added padding bytes to each structure. To solve this, the `x` (pad byte) code was added to the format string to skip the padding.

The chapter ends with a detailed discussion about the importance of careful handling of binary data to ensure program portability across systems and compilers.

```python
import struct
from dataclasses import dataclass

@dataclass
class Vector:
    x: float
    y: float
    z: float

@dataclass
class Color:
    red: int
    green: int
    blue: int

@dataclass
class Vertex:
    vector: Vector
    color: Color

def make_colored_vertex(vector, color):
    return Vertex(Vector(*vector), Color(*color))

def main():
    with open("file.bin", "rb") as file:
        buffer = file.read()

    vertices = []
    for i in range(0, len(buffer), 20):  # 20 bytes per vertex structure
        fields = struct.unpack_from("@3f3Hxx", buffer, i)  # Using the format string
        vertex = make_colored_vertex(fields[:3], fields[3:])
        vertices.append(vertex)

    for vertex in vertices:
        print(vertex)

if __name__ == "__main__":
    main()
```
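
The effect of the pad bytes can be checked in isolation. The sketch below assumes a little-endian layout with two trailing pad bytes per record and builds its own buffer instead of reading a file; the vertex values are invented for illustration. It uses `struct.Struct.iter_unpack` as an alternative to the manual offset loop:

```python
import struct

# 3 floats (12 bytes) + 3 unsigned shorts (6 bytes) + 2 pad bytes = 20 bytes.
RECORD = struct.Struct("<3f3Hxx")

# Two made-up vertices packed back to back, as they would appear in a file.
buffer = (RECORD.pack(0.0, 0.5, 1.0, 65535, 0, 0)
          + RECORD.pack(1.0, 1.5, 2.0, 0, 65535, 0))

print(RECORD.size)  # Outputs: 20

# iter_unpack walks the buffer one record at a time; pad bytes are skipped.
vertices = [(fields[:3], fields[3:]) for fields in RECORD.iter_unpack(buffer)]
for position, color in vertices:
    print(position, color)
# Outputs:
# (0.0, 0.5, 1.0) (65535, 0, 0)
# (1.0, 1.5, 2.0) (0, 65535, 0)
```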

## The memoryview Type
This chapter introduces the built-in Python type `memoryview`, which wraps any existing underlying collection of bytes that supports the buffer protocol. This type allows us to view an underlying byte buffer as a sequence of Python objects without duplicating the data. Using the `memoryview` type, the code is reworked to avoid data duplication, and a deeper understanding of the `memoryview` type's functionality is provided.

One of the key advantages of `memoryview` objects is that they support slicing and can be cast to interpret binary data as native types without copying data. This functionality is leveraged to modify vector and color classes to use bytes buffer as the underlying storage. This chapter concludes by reworking the main loop in the program to yield successive `memoryview` objects.

```python
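def make_colored_vertex(memv: memoryview):
    # Vector and Color here are the reworked classes that wrap a memoryview
    # rather than holding copies of the values.
    vector_memv = memv[:12].cast('f')    # 3 floats, 4 bytes each
    color_memv = memv[12:18].cast('H')   # 3 unsigned shorts, 2 bytes each
    return Vertex(Vector(vector_memv), Color(color_memv))

def main():
    with open("file.bin", "rb") as file:
        buffer = file.read()

    mem = memoryview(buffer)
    vertex_data_size = 18  # bytes of actual data per vertex
    vertex_struct_padding = (4 - vertex_data_size % 4) % 4  # pad up to a multiple of 4
    vertex_struct_size = vertex_data_size + vertex_struct_padding  # 20 bytes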

    vertices = [make_colored_vertex(mem[i:i+vertex_struct_size])
                for i in range(0, len(mem), vertex_struct_size)]
    
    for vertex in vertices:
        print(vertex)
```
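
Slicing and casting without copying can be seen in a small standalone sketch. It assumes a single 20-byte vertex record built in memory with native byte order, and the values are invented for illustration. Writing through the cast view changes the underlying buffer, which shows that no copy was made:

```python
import struct

# One vertex in native byte order: 3 floats, 3 unsigned shorts, 2 pad bytes.
raw = bytearray(struct.pack("=3f3Hxx", 1.0, 2.0, 3.0, 255, 128, 0))

view = memoryview(raw)
position = view[:12].cast("f")  # reinterpret bytes 0..11 as 3 native floats
color = view[12:18].cast("H")   # reinterpret bytes 12..17 as 3 unsigned shorts

print(position.tolist())  # Outputs: [1.0, 2.0, 3.0]
print(color.tolist())     # Outputs: [255, 128, 0]

# Assigning through the view writes straight into `raw`.
color[0] = 0
print(struct.unpack_from("=3H", raw, 12))  # Outputs: (0, 128, 0)
```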

## Memory-mapped Files

This chapter introduces the concept of memory-mapped files and how to implement them using the mmap module in Python. The key idea behind memory-mapped files is to use the operating system's virtual memory system to make large files appear as if they're in memory, thereby minimizing the amount of data transferred from disk into memory. The mmap module creates an object that behaves like a byte array and a file-like object.

Code Example:

```python
import mmap

with open('file.txt', 'rb') as f:  # binary mode: the mapping exposes raw bytes
    with mmap.mmap(f.fileno(), length=0, access=mmap.ACCESS_READ) as m:
        # Use m object as per the requirement
        print(m[:10])  # reads the first 10 bytes
```

In this example, the file 'file.txt' is opened and its file descriptor is obtained with the fileno() method. The file descriptor is then passed to the mmap constructor to create the memory mapping; a length of 0 maps the whole file. The m object can then be used much like a byte array: indexing and slicing read directly from the mapped file.
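
Both the sequence-like and the file-like side of an mmap object can be tried in a self-contained sketch. It writes a small temporary file first so that it does not depend on an existing 'file.txt'; the file contents are invented for illustration:

```python
import mmap
import os
import tempfile

# Create a small throwaway file so the sketch can run on its own.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789abcdef")
    path = tmp.name

try:
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), length=0, access=mmap.ACCESS_READ) as m:
            size = len(m)             # sequence-like: length
            sliced = m[4:8]           # sequence-like: slicing returns bytes
            position = m.find(b"abc") # bytes-like searching
            m.seek(10)                # file-like: move the current position
            tail = m.read(3)          # file-like: read from that position
finally:
    os.remove(path)

print(size, sliced, position, tail)  # Outputs: 16 b'4567' 10 b'abc'
```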

The chapter also discusses the importance of properly handling the lifetimes of objects dependent on the memory-mapped file, as these must be shorter than the lifetime of the memory mapping itself. The use of try-finally blocks is recommended to ensure resources are released even in the case of an exception.
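
One way to keep the lifetimes correctly nested is to use context managers for every layer, with a try-finally around the whole thing for cleanup. The following sketch assumes a throwaway temporary file. The `memoryview` is released before the mapping is closed, and the mapping before the file:

```python
import mmap
import os
import tempfile

# Create a small throwaway file so the sketch can run on its own.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello memory-mapped world")
    path = tmp.name

try:
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), length=0, access=mmap.ACCESS_READ) as m:
            # The view depends on the mapping, so its block is innermost
            # and it is released first when the blocks unwind.
            with memoryview(m) as view:
                first_word = bytes(view[:5])  # copy out what is needed
finally:
    os.remove(path)  # runs even if an exception was raised above

print(first_word)  # Outputs: b'hello'
```

Copying the needed bytes out with `bytes(...)` before the blocks exit means nothing that outlives the mapping still refers to it.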

## The mmap Module

The mmap module provides a way to create, access, and manipulate memory-mapped files. It's an efficient way to read binary data, particularly for very large files. The chapter discusses how an mmap instance supports the C API buffer protocol and hence can be passed as an argument to the `memoryview` constructor.

This chapter ends with a summary of the entire course, reviewing Python's bitwise operators, binary sequence types (bytes and bytearray), using the struct module to interpret bytes as C language types, and memoryview for obtaining copy-free views. The course concludes by emphasizing the usefulness of memory-mapped files for efficient data handling, encouraging further exploration into more advanced topics like shared memory with mmap and interfacing with native C and C++ code. The material covered in this course is part of the book, "The Python Master".