## Appendix B: Internal Representation of Data Types

### Integers

![](https://github.com/dvgodoy/FineTuningLLMs/blob/main/images/appendixB/uint8.png?raw=True)

<center>Figure B.1 - Internal representation of an unsigned 8-bit integer (UINT8)</center>

$$
\Large
x = \sum_{j=1}^{n_B}{b_j2^{j-1}}
\\
\Large
\begin{aligned}
&n_B = \text{number of bits}
\\
&b_j = j^{\text{th}}\text{ bit from right to left}
\end{aligned}
$$

<center>Equation B.1 - Computing the value of an unsigned integer from its bits</center>

In [1]:
def get_unsigned_number(bits):
    number = sum([int(bit)*2**(j-1) for j, bit in enumerate(bits[::-1], start=1)])
    return number

In [2]:
get_unsigned_number('10000011')

131
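
As a quick sanity check on Equation B.1, Python's built-in `int` with base 2 parses a bit string in the same way. The sketch below repeats the helper from cell `In [1]` so that it is self-contained; the sample bit strings are arbitrary choices, not taken from the text.

```python
def get_unsigned_number(bits):
    # Equation B.1: the j-th bit from the right contributes b_j * 2**(j-1)
    return sum(int(bit) * 2**(j - 1) for j, bit in enumerate(bits[::-1], start=1))

# int(bits, 2) interprets the string as a base-2 number
for bits in ['00000000', '10000011', '11111111']:
    assert get_unsigned_number(bits) == int(bits, 2)

print(int('10000011', 2))  # 131
```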

In [3]:
import torch
torch.iinfo(torch.uint8)

iinfo(min=0, max=255, dtype=uint8)

![](https://github.com/dvgodoy/FineTuningLLMs/blob/main/images/appendixB/int8.png?raw=True)
<center>Figure B.3 - Internal representation of a signed 8-bit integer (INT8)</center>

![](https://github.com/dvgodoy/FineTuningLLMs/blob/main/images/appendixB/int8_example.png?raw=True)
<center>Figure B.4 - Signed integer values and their bits</center>

$$
\Large
x = -b_{n_B}2^{n_B-1}+\sum_{j=1}^{n_B-1}{b_j2^{j-1}}
\\
\Large
\begin{aligned}
&n_B = \text{number of bits}
\\
&b_j = j^{\text{th}}\text{ bit from right to left}
\\
&b_{n_B} = \text{left-most bit}
\end{aligned}
$$

<center>Equation B.2 - Computing the value of a signed integer from its bits</center>

In [None]:
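def get_number(bits, signed=True):
    nb = len(bits)
    # In a signed integer the left-most bit carries a weight of -2**(nb-1),
    # so it contributes only when that bit is set (Equation B.2)
    sign = -int(bits[0])*signed*2**(nb-1)
    # The remaining bits are summed as in the unsigned case (Equation B.1)
    number = sum([int(bit)*2**(j-1) for j, bit in enumerate(bits[signed:][::-1], start=1)])
    return sign + number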

In [None]:
get_number('10000011', signed=True)

In [None]:
import numpy as np
np.binary_repr(-125, width=8)
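
The same conversion can be sketched in plain Python, in both directions. The helper names `to_twos_complement` and `from_twos_complement` are hypothetical, and the sketch assumes the value fits in the given width.

```python
def to_twos_complement(value, width=8):
    # Masking with 2**width - 1 wraps negative values into [0, 2**width)
    mask = (1 << width) - 1
    return format(value & mask, '0{}b'.format(width))

def from_twos_complement(bits):
    # Read the bits as unsigned, then subtract 2**width if the sign bit is set
    number = int(bits, 2)
    if bits[0] == '1':
        number -= 1 << len(bits)
    return number

print(to_twos_complement(-125))          # 10000011
print(from_twos_complement('10000011'))  # -125
```

Subtracting `2**width` when the left-most bit is set is equivalent to giving that bit a weight of `-2**(width-1)`, as in Equation B.2.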

In [None]:
torch.iinfo(torch.int8)
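
The limits reported by `torch.iinfo` follow directly from Equations B.1 and B.2: the smallest and largest values are obtained by setting the bits that minimize or maximize each sum. A minimal sketch, using a hypothetical helper named `int_range`:

```python
def int_range(n_bits, signed=True):
    if signed:
        # Minimum: only the left-most bit set; maximum: every other bit set
        return -2**(n_bits - 1), 2**(n_bits - 1) - 1
    # Unsigned: all bits clear up to all bits set
    return 0, 2**n_bits - 1

print(int_range(8, signed=False))  # (0, 255)
print(int_range(8, signed=True))   # (-128, 127)
```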

### Floating-Point Numbers

$$
\Large
\text{FP} = \underbrace{(-1)^S}_{\text{sign}}\underbrace{2^x}_{\text{exponent}}\underbrace{(1.0 + f)}_{\text{mantissa}}
$$
<center>Equation B.3 - Computing a floating-point number from its sign, exponent, and mantissa</center>

In [None]:
def to_fp(s, x, f):
    return (-1)**s * 2**x * (1 + f)

In [None]:
f_cte = 0
print(to_fp(s=0, x=-1, f=f_cte)) # = 1*(2**-1)*(1+0)
print(to_fp(s=0, x=-8, f=f_cte)) # = 1*(2**-8)*(1+0)

In [None]:
x_cte = -1
print(to_fp(s=0, x=x_cte, f=0)) # = 1*(2**-1)*(1+0)
print(to_fp(s=0, x=x_cte, f=.5))# = 1*(2**-1)*(1+.5)
print(to_fp(s=0, x=x_cte, f=.9999))# = 1*(2**-1)*(1+.9999)

In [None]:
new_x_cte = x_cte + 1
print(to_fp(s=0, x=new_x_cte, f=0))# = 1*(2**0)*(1+0)

In [None]:
print(to_fp(s=0, x=2, f=.55))# = 1*(2**2)*(1+.55)
print(to_fp(s=0, x=5, f=.15))# = 1*(2**5)*(1+.15)
print(to_fp(s=0, x=8, f=.5))# = 1*(2**8)*(1+.5)
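
Going the other way, a value can be split back into the three components of Equation B.3. The sketch below is one possible approach, built on `math.frexp` from the standard library; the name `from_fp` is hypothetical, and the function assumes a finite, nonzero input.

```python
import math

def from_fp(value):
    # Assumes a finite, nonzero value
    s = 1 if value < 0 else 0
    # frexp returns (m, e) such that abs(value) == m * 2**e with 0.5 <= m < 1
    m, e = math.frexp(abs(value))
    # Rescale so that the significand lies in [1, 2), matching (1 + f)
    x = e - 1
    f = 2 * m - 1
    return s, x, f

print(from_fp(0.75))   # (0, -1, 0.5)
print(from_fp(384.0))  # (0, 8, 0.5)
```

Both inputs correspond to calls shown above: `to_fp(s=0, x=-1, f=.5)` and `to_fp(s=0, x=8, f=.5)`.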

![](https://github.com/dvgodoy/FineTuningLLMs/blob/main/images/appendixB/bf16_example.png?raw=True)
<center>Figure B.5 - Example of sign, exponent, and mantissa with their corresponding bits</center>

$$
\Large
f = \sum_{i=1}^{n_M}{m_i2^{-i}}
\\
\Large
x = \left(\sum_{j=1}^{n_E}{e_j2^{j-1}}\right) - b
\\
\Large
b = 2^{n_E-1}-1
\\
\Large
\begin{aligned}
&n_M = \text{number of bits in the mantissa}
\\
&m_i = i^{\text{th}}\text{ bit from left to right in the mantissa}
\\
&n_E = \text{number of bits in the exponent}
\\
&e_j = j^{\text{th}}\text{ bit from right to left in the exponent}
\end{aligned}
$$
<center>Equation B.4 - Computing f and x from the bits of the mantissa and the exponent</center>

In [None]:
def get_x(exponent):
    bias = 2**(len(exponent)-1)-1
    return sum([int(bit)*2**(j-1) for j, bit in enumerate(exponent[::-1], start=1)]) - bias

def get_f(mantissa):
    return sum([int(bit)*2**(-i) for i, bit in enumerate(mantissa, start=1)])

In [None]:
exponent = '10000011'
x = get_x(exponent)
x

In [None]:
mantissa = '0111011'
f = get_f(mantissa)
f

In [None]:
mantissa = '011101100000'
get_f(mantissa)

In [None]:
to_fp(0, x, f)

![](https://github.com/dvgodoy/FineTuningLLMs/blob/main/images/appendixB/bf16_diagram.png?raw=True)
<center>Figure B.6 - Internal representation of the BF16 data type</center>

$$
\Large
\text{BF16} = (-1)^S \left(1.0 + \sum_{i=1}^{7}{m_i2^{-i}}\right) 2^{\left(\sum_{j=1}^{8}{e_j2^{j-1}}\right)-127}
$$
<center>Equation B.5 - Computing a BF16 value from its bits</center>
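
Equation B.5 can be written as a single function that takes all 16 bits at once. This is only a sketch: the name `decode_bf16` is hypothetical, and it applies the formula as written, so it does not treat the all-zeros and all-ones exponent patterns as special cases.

```python
def decode_bf16(bits):
    # Layout assumed here: 1 sign bit, 8 exponent bits, 7 mantissa bits
    if len(bits) != 16:
        raise ValueError('expected a string of 16 bits')
    s = int(bits[0])
    exponent, mantissa = bits[1:9], bits[9:]
    # Exponent field read as an unsigned integer, minus the bias of 127
    x = int(exponent, 2) - 127
    # The i-th mantissa bit from the left contributes m_i * 2**(-i)
    f = sum(int(m) * 2**(-i) for i, m in enumerate(mantissa, start=1))
    return (-1)**s * (1.0 + f) * 2**x

print(decode_bf16('0' + '10000011' + '0111011'))  # 23.375
```

The exponent and mantissa strings are the ones used in the cells above.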

![](https://github.com/dvgodoy/FineTuningLLMs/blob/main/images/appendixB/types_comparison.png?raw=True)
<center>Figure B.7 - Comparing the internal representations of FP32, BF16, and FP16</center>
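
To put numbers on the comparison, the sketch below derives the bias from Equation B.4 for each format, along with the largest value each can hold. The bit counts are the standard layouts of the three types. The maximum assumes the IEEE 754 convention that the all-ones exponent pattern is reserved, which is not covered in this appendix; the helper name `describe` is hypothetical.

```python
# (exponent bits, mantissa bits); each format also has one sign bit
formats = {'FP32': (8, 23), 'BF16': (8, 7), 'FP16': (5, 10)}

def describe(n_e, n_m):
    bias = 2**(n_e - 1) - 1
    # Assumption: the all-ones exponent is reserved, so the largest usable
    # exponent field is 2**n_e - 2; the mantissa is all ones
    max_x = (2**n_e - 2) - bias
    max_value = 2.0**max_x * (2 - 2.0**(-n_m))
    return {'bits': 1 + n_e + n_m, 'bias': bias, 'max': max_value}

for name, (n_e, n_m) in formats.items():
    print(name, describe(n_e, n_m))
```

BF16 keeps the 8 exponent bits of FP32, and therefore roughly the same range, while FP16 trades range for three extra mantissa bits.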

In [None]:
# Adapted from https://stackoverflow.com/questions/16444726/binary-representation-of-float-in-python-bits-not-hex  
import struct

def binary_fp32(num):
    bits = ''.join('{:0>8b}'.format(c) for c in struct.pack('!f', num))
    sign = bits[0]
    exponent = bits[1:9]
    mantissa = bits[9:]
    return {'sign': sign, 'exponent': exponent, 'mantissa': mantissa}

bits = binary_fp32(23.375)
bits

In [None]:
s = int(bits['sign'])
f = get_f(bits['mantissa'])
x = get_x(bits['exponent'])
to_fp(s, x, f)
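
Because FP32 and BF16 share the same sign and exponent layout, keeping only the 16 most significant bits of an FP32 value yields a BF16 bit pattern. The sketch below illustrates this with a hypothetical helper, `fp32_to_bf16_bits`. It truncates rather than rounds, which is a simplification: actual conversions typically round to the nearest representable value. For 23.375 the two agree, since the discarded bits are all zero.

```python
import struct

def fp32_to_bf16_bits(num):
    # Pack as big-endian FP32, then keep only the 16 most significant bits
    bits32 = ''.join('{:08b}'.format(c) for c in struct.pack('!f', num))
    return bits32[:16]

bf16 = fp32_to_bf16_bits(23.375)
# sign, exponent, mantissa
print(bf16[0], bf16[1:9], bf16[9:])  # 0 10000011 0111011
```

The result matches the exponent `'10000011'` and mantissa `'0111011'` used in the BF16 example earlier.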