In [1]:
[ord(character) for character in "€uro"]

[8364, 117, 114, 111]

In [7]:
for character in "€uro":
    decimal_code_pt = ord(character)
    #binary_code_pt = bin(decimal_code_pt)[2:]
    print(f"{character} {decimal_code_pt:10d} {decimal_code_pt:20b}")

€       8364       10000010101100
u        117              1110101
r        114              1110010
o        111              1101111


In [12]:
for character in "€uro":
    decimal_code_pt = ord(character)
    print(f"{character}  {decimal_code_pt:4d}  {decimal_code_pt:016b}")

€  8364  0010000010101100
u   117  0000000001110101
r   114  0000000001110010
o   111  0000000001101111


In [13]:
"€uro".encode("utf-8")

b'\xe2\x82\xacuro'

In [14]:
type("€uro".encode("utf-8"))

bytes

In UTF-8, the characters `"€"`, `"u"`, `"r"`, `"o"` of the string `"€uro"` take `3, 1, 1, 1` bytes, respectively. The next cell verifies this:

In [15]:
for character in "€uro":
    print(character, len(character.encode("utf-8")))

€ 3
u 1
r 1
o 1
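
The three bytes for `"€"` follow the UTF-8 layout for code points from U+0800 to U+FFFF, namely `1110xxxx 10xxxxxx 10xxxxxx`, where the `x` positions carry the 16 bits of the code point. Here is a minimal sketch that makes this visible; the variable names are only illustrative.

```python
euro = "€"

# 16-bit binary form of the code point (8364)
code_point_bits = f"{ord(euro):016b}"

# Each UTF-8 byte rendered as 8 binary digits
utf8_bits = [f"{byte:08b}" for byte in euro.encode("utf-8")]
print(code_point_bits)
print(utf8_bits)

# Drop the marker prefixes (1110, 10, 10) and join the payload bits
payload = utf8_bits[0][4:] + utf8_bits[1][2:] + utf8_bits[2][2:]
print(payload == code_point_bits)
```

The last line should print `True`, since the payload bits are exactly the code point.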


Note that `"€"` cannot be encoded in `"ascii"`, because its code point `8364` lies outside the ASCII range `0` to `127`.
```python
for character in "€uro":
    print(character, len(character.encode("ascii")))
```
<br>

```
UnicodeEncodeError: 'ascii' codec can't encode character '\u20ac' in position 0: ordinal not in range(128)
```
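
If a lossy ASCII result is acceptable, the `errors` parameter of `str.encode` can replace or drop unencodable characters instead of raising. A small sketch, not a recommendation for real text handling:

```python
text = "€uro"

# Substitute a question mark for every character ASCII cannot represent
replaced = text.encode("ascii", errors="replace")

# Silently drop every character ASCII cannot represent
ignored = text.encode("ascii", errors="ignore")

print(replaced, ignored)
```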

## Left Shift `<<`
`a << n` is like `a * (2**n)` for all non-negative integers `n`.
- Don't use the bit shift operators as a means of premature optimization in Python. You are unlikely to see a meaningful difference in execution speed, but you will make your code less readable. In CPython, interpreter overhead typically outweighs any difference between a shift and a multiplication.

In [18]:
import random

**(?)** What about left shift for negative integers?

In [59]:
[-0b1101 << n for n in range(10)]

[-13, -26, -52, -104, -208, -416, -832, -1664, -3328, -6656]
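
The output above suggests that `a << n` equals `a * (2**n)` for negative `a` as well. Here is a quick randomized check; the seed and the ranges are arbitrary choices.

```python
import random

random.seed(0)
samples = [random.randint(-2**16, 2**16) for _ in range(1000)]

# Collect every (a, n) pair where shift and multiplication disagree
mismatches = [
    (a, n)
    for a in samples
    for n in range(16)
    if a << n != a * 2**n
]
print(mismatches)  # an empty list means no counterexample was found
```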

**Note**.
Literals like `0b1101` are not in two's complement. They can only express non-negative integers; to get a negative value, put a minus sign in front.

In [1]:
0b11111111

255

In [2]:
-0b1111111

-127

In [7]:
-0b10000000

-128
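
To see the two's complement pattern that a fixed-width machine integer would hold, one option is to mask the value to the desired width before formatting. Here is a sketch for 8 bits, where `to_bits8` is a hypothetical helper name.

```python
def to_bits8(x):
    """Return the low 8 bits of x as a binary string (two's complement view)."""
    return format(x & 0xFF, "08b")

for value in (255, -127, -128, -1):
    print(f"{value:5d} -> {to_bits8(value)}")
```

Note that `255` and `-1` share the same 8-bit pattern; which value it denotes depends on whether the byte is read as unsigned or signed.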

## Right Shift `>>`
`a >> n` is like `a // (2**n)` for all non-negative integers `n`.

In [57]:
random.sample(range(2**8), 2**8)

[55,
 200,
 56,
 242,
 219,
 201,
 63,
 1,
 156,
 231,
 137,
 62,
 25,
 67,
 94,
 41,
 249,
 125,
 136,
 235,
 196,
 203,
 3,
 53,
 79,
 98,
 255,
 204,
 50,
 188,
 225,
 46,
 131,
 113,
 252,
 162,
 39,
 205,
 93,
 106,
 81,
 103,
 141,
 100,
 105,
 194,
 237,
 8,
 191,
 168,
 85,
 213,
 10,
 208,
 12,
 183,
 92,
 142,
 147,
 206,
 89,
 179,
 192,
 229,
 139,
 29,
 83,
 120,
 95,
 13,
 23,
 80,
 69,
 253,
 175,
 207,
 176,
 218,
 116,
 138,
 27,
 143,
 220,
 250,
 123,
 177,
 158,
 70,
 169,
 181,
 72,
 135,
 133,
 190,
 88,
 173,
 75,
 254,
 109,
 104,
 185,
 152,
 154,
 87,
 64,
 28,
 170,
 155,
 9,
 0,
 247,
 217,
 216,
 36,
 101,
 212,
 157,
 245,
 199,
 74,
 184,
 178,
 60,
 102,
 182,
 127,
 130,
 108,
 110,
 224,
 77,
 246,
 222,
 45,
 124,
 76,
 5,
 21,
 84,
 160,
 35,
 198,
 99,
 132,
 16,
 149,
 112,
 71,
 186,
 174,
 54,
 148,
 145,
 48,
 211,
 34,
 65,
 251,
 78,
 82,
 107,
 221,
 96,
 227,
 172,
 230,
 49,
 20,
 236,
 66,
 129,
 51,
 166,
 42,
 68,
 7,
 47,
 26,
 243,
 11

In [43]:
random.seed(42)
for a in random.sample(range(2**8), 2**8):
    for n in random.sample(range(2**8), 2**8):
        right_shift = a >> n
        divide_by_power = a // (2**n)
        if right_shift != divide_by_power:
            print(f"{a} >> {n} Not equal to {a} // (2**{n})")

In [44]:
type((i for i in range(10)))

generator

In [31]:
for n in range(17):
    print(f"2 >> {n:2d} equals {2 >> n}")

2 >>  0 equals 2
2 >>  1 equals 1
2 >>  2 equals 0
2 >>  3 equals 0
2 >>  4 equals 0
2 >>  5 equals 0
2 >>  6 equals 0
2 >>  7 equals 0
2 >>  8 equals 0
2 >>  9 equals 0
2 >> 10 equals 0
2 >> 11 equals 0
2 >> 12 equals 0
2 >> 13 equals 0
2 >> 14 equals 0
2 >> 15 equals 0
2 >> 16 equals 0


In [36]:
for n in range(33):
    print(f"-2 >> {n:2d} equals {-2 >> n}")

-2 >>  0 equals -2
-2 >>  1 equals -1
-2 >>  2 equals -1
-2 >>  3 equals -1
-2 >>  4 equals -1
-2 >>  5 equals -1
-2 >>  6 equals -1
-2 >>  7 equals -1
-2 >>  8 equals -1
-2 >>  9 equals -1
-2 >> 10 equals -1
-2 >> 11 equals -1
-2 >> 12 equals -1
-2 >> 13 equals -1
-2 >> 14 equals -1
-2 >> 15 equals -1
-2 >> 16 equals -1
-2 >> 17 equals -1
-2 >> 18 equals -1
-2 >> 19 equals -1
-2 >> 20 equals -1
-2 >> 21 equals -1
-2 >> 22 equals -1
-2 >> 23 equals -1
-2 >> 24 equals -1
-2 >> 25 equals -1
-2 >> 26 equals -1
-2 >> 27 equals -1
-2 >> 28 equals -1
-2 >> 29 equals -1
-2 >> 30 equals -1
-2 >> 31 equals -1
-2 >> 32 equals -1


**(?)** Why are the previous cell's results always `-1`?
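
One way to answer: `a >> n` behaves like floor division by `2**n`, and floor division rounds toward negative infinity. A negative `a` therefore settles at `-1`, while a non-negative `a` settles at `0`. A quick check:

```python
for n in (1, 2, 8, 32):
    # For n >= 2 the true quotient lies strictly between -1 and 0,
    # so flooring it yields -1
    print(n, -2 >> n, -2 // 2**n, -2 / 2**n)

all_match = all(-2 >> n == -2 // 2**n for n in range(33))
print(all_match)
```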

### Arithmetic and Logical Shift
**Definition**.
- **logical shift**, aka _unsigned right shift_ or _zero-fill right shift_, moves the entire binary sequence, including the sign bit, to the right and fills the vacated leftmost positions with zeros ![](figs/rshift_logical.5ee25943b1a4.gif)<br>
  - Since the leftmost bit is always replaced by a zero in a logical shift of at least one position, the result is always **non-negative** <br><br>
- **arithmetic shift**, aka _signed right shift_, shifts all bits to the right and fills the vacated leftmost positions with copies of the sign bit, so the sign is preserved. ![](figs/rshift_arithmetic.990b7e40923a.gif)

| decimal | binary |
| ------- | ------ |
| `-100` | `10011100` |
| `28`  |  `00011100` |
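
The two definitions can be mimicked on 8-bit strings in Python. This is only a sketch for a shift by one position; `bits8` and `signed8` are hypothetical helper names.

```python
def bits8(x):
    """Low 8 bits of x as a binary string."""
    return format(x & 0xFF, "08b")

def signed8(bits):
    """Read an 8-character bit string as a signed two's complement value."""
    value = int(bits, 2)
    return value - 256 if bits[0] == "1" else value

pattern = bits8(-100)

# Logical shift: drop the last bit, fill the vacancy with zero
logical = "0" + pattern[:-1]

# Arithmetic shift: drop the last bit, fill the vacancy with the sign bit
arithmetic = pattern[0] + pattern[:-1]

print(pattern, logical, arithmetic)
print(signed8(logical), signed8(arithmetic))
```

The arithmetic result agrees with Python's own `-100 >> 1`, while the logical result is non-negative.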

**(?)** Quick quiz: Which integer's bitwise NOT equals `-100`?


<br>
<br>

Which right shifts does each language provide?

01. only arithmetic shift
  - Python
02. both logical and arithmetic shift
  - Java
  - JavaScript
  - Julia
  
Because I hardly use Java or JavaScript, let's use Julia to compare with Python.

In [15]:
-100 >> 1

-50

In [16]:
!julia -e "println(-100 >>> 1)"

9223372036854775758


In [20]:
!julia -e "println(typeof(-100))"

Int64


In [19]:
!julia -e "println(Int32(-100) >>> 1)"

2147483598


Let's explore a little and try to perform the logical shift manually in Julia via string operations.

In [22]:
!julia -e "println(typemax(Int32), \"\n\", typemin(Int32))"

2147483647
-2147483648


In [23]:
!julia -e "println(string(Int32(-100), base=2))"

-1100100


In [24]:
!julia -e "println(bitstring(Int32(-100)))"

11111111111111111111111110011100


In [25]:
!julia -e "println(length(bitstring(Int32(-100))))"

32


In [30]:
with open("logical_shift_manully.jl", "w") as f:
    f.write("""
a = Int32(-100)
a_bstr = bitstring(a)
shifted_bstr = "0" * a_bstr[1:end-1]
shifted = parse(Int32, shifted_bstr; base=2)
println("shifted           = $shifted")
println("Int32(-100) >>> 1 = $(Int32(-100) >>> 1)")
    """)

In [31]:
!cat logical_shift_manully.jl


a = Int32(-100)
a_bstr = bitstring(a)
shifted_bstr = "0" * a_bstr[1:end-1]
shifted = parse(Int32, shifted_bstr; base=2)
println("shifted           = $shifted")
println("Int32(-100) >>> 1 = $(Int32(-100) >>> 1)")
    

In [32]:
!julia logical_shift_manully.jl

shifted           = 2147483598
Int32(-100) >>> 1 = 2147483598


Great. We have in some sense verified logical right shift.

In [None]:
from ctypes import c_uint32

In [None]:
c_uint32(-100)

In [None]:
c_uint32(-100).value

In [18]:
# A c_uint32 object does not support >> directly, so shift its .value
c_uint32(-100).value >> 1

2147483598
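
The same result can be obtained without `ctypes` by masking to a fixed width before shifting. Here is a sketch in which `logical_rshift` is a hypothetical helper and `width` is the assumed machine word size.

```python
def logical_rshift(x, n, width=32):
    """Emulate a zero-fill right shift on a fixed-width integer."""
    mask = (1 << width) - 1      # e.g. 0xFFFFFFFF for width=32
    return (x & mask) >> n       # reinterpret as unsigned, then shift

print(logical_rshift(-100, 1, width=32))
print(logical_rshift(-100, 1, width=64))
```

The two printed values should match the Julia results for `Int32(-100) >>> 1` and `-100 >>> 1` shown earlier.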

In [13]:
2**32 >> 7

33554432

In [12]:
!julia -e "println(2^32 >>> 7)"

33554432


In [14]:
!julia -e "println(2^32 >> 7)"

33554432


## Bitwise NOT
### Why `~156` Equals `-157` Rather Than `99`?

| decimal | binary |
| ------- | ------ |
| `156` | `0b10011100` |
| `99`  | `0b01100011` |

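Flipping the 8 bits of `156` gives the pattern of `99`, but Python integers are signed and not limited to 8 bits, so `~x` evaluates to `-x - 1`. The cells below explore this: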

In [8]:
~156

-157

In [1]:
~-100

99

In [7]:
len(bin(~-100)), len(bin(-100))

(9, 10)

In [6]:
print(f"{bin(~-100):>10}")
print(bin(-100))

 0b1100011
-0b1100100


In [17]:
0b01100011

99
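
Put compactly, `~x` equals `-x - 1` for Python's arbitrary-precision signed integers, and `99` only appears once the result is confined to 8 bits. A small sketch:

```python
x = 156

print(~x, -x - 1)                  # both give the same signed value
print(~x & 0xFF)                   # keep only the low 8 bits
print(format(~x & 0xFF, "08b"))    # the flipped 8-bit pattern

# The identity holds across a range of positive and negative values
identity_holds = all(~v == -v - 1 for v in range(-1000, 1000))
print(identity_holds)
```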

In [14]:
ls_result = !ls
ls_result

['bitwise_ops.ipynb', 'figs', 'README.md', 'trash.py']

In [13]:
julia_result = !(julia -e "print(Int32(-100) >>> 1)")
julia_result

['2147483598']

### Unsigned Integer (in C)
- In an unsigned integer, the leading bit that would be the sign bit of a signed integer counts as an ordinary value bit. As a result, an unsigned type can represent a larger maximum value than the signed type of the same width.
- Python's integers have arbitrary precision, so overflow is not a concern.
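
A brief illustration of both points, using `ctypes` to stand in for C's fixed-width types. The value `200` is an arbitrary choice that fits in 8 unsigned bits but not in 8 signed bits.

```python
from ctypes import c_int8, c_uint8

# Same 8-bit pattern, read as signed and as unsigned
signed_value = c_int8(200).value
unsigned_value = c_uint8(200).value
print(signed_value, unsigned_value)

# A Python int simply grows as needed
big = 2**100
print(big.bit_length())
```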

In [1]:
from ctypes import c_uint8

In [3]:
c_uint8(-42).value

214

In [4]:
bin(42)

'0b101010'

So in 8-bit two's complement, `-42` is `~(0b00101010) + 0b1`, i.e. the bit pattern `0b11010110`. Interpreted as a `c_uint8`, this pattern is `2**7 + 2**6 + 2**4 + 2**2 + 2**1`, which equals `128 + 64 + 16 + 4 + 2`,
i.e. `214`.
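
The same arithmetic can be checked in plain Python by masking to 8 bits, since Python's own `~` is not limited to a fixed width. A sketch:

```python
from ctypes import c_uint8

# Invert, add one, then keep only the low 8 bits
pattern = (~0b00101010 + 1) & 0xFF

print(bin(pattern), pattern)
print(pattern == c_uint8(-42).value)
```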

**(?)** What happens with overflow in `c_uint8`?<br>
**(R)** Cf. below.

In [5]:
c_uint8(0)

c_ubyte(0)

In [6]:
c_uint8(2**8)

c_ubyte(0)

In [7]:
c_uint8(2**8 + 1)

c_ubyte(1)

In [8]:
c_uint8(-1)

c_ubyte(255)
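
These outputs are consistent with wrap-around modulo `2**8`: the stored value appears to be `x % 256`. Here is a quick check over a few inputs; the extra value `1000` is an arbitrary addition.

```python
from ctypes import c_uint8

inputs = (0, 2**8, 2**8 + 1, -1, 1000)

# Pair each input with the value c_uint8 stores and with x mod 256
results = [(x, c_uint8(x).value, x % 2**8) for x in inputs]
for row in results:
    print(row)
```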