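> ### Why data compression is needed
> - Large data is slow to transfer, and problems are more likely to occur during the transfer
> - Types of data compression
>   - `Lossless compression` : compression with no data loss at all; the original can be restored exactly
>   - `Lossy compression` : compression that discards only information people are unlikely to notice
> - Compression ratio : original data size / compressed data size
> - Compression performance and time depend on the compression algorithm used
>   - Compression : encoding (Encoding)
>   - Decompression : decoding (Decoding)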
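
The compression ratio defined above is a plain division, so it can be sketched in a few lines. The helper names `compression_ratio` and `space_saving` and the sample sizes are hypothetical, chosen only for illustration; space saving (the fraction of the original that was removed) is a related measure that the notes above do not mention.

```python
def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Return original size divided by compressed size."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return original_size / compressed_size


def space_saving(original_size: int, compressed_size: int) -> float:
    """Return the fraction of the original size that compression removed."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 1 - compressed_size / original_size


# Hypothetical sizes, chosen only for illustration
ratio = compression_ratio(1000, 250)
saving = space_saving(1000, 250)
print(ratio)   # 4.0
print(saving)  # 0.75
```

A ratio above 1 means the compressed form is smaller than the original; a ratio of 4.0 corresponds to a 75% space saving.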

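> ### zlib
> - `zlib` is a module for compressing and decompressing data
> - The `compress()` and `decompress()` functions work on bytes, so a string must be encoded before compression and decoded after decompression
> - Useful when data needs to be made smaller before it is transferred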
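
As a minimal sketch of the points above, the snippet below shows that `zlib.compress()` rejects a `str`, and that the optional second argument (the compression level, from 1 for fastest to 9 for smallest) can be varied. The variable names are illustrative, and the exact compressed sizes depend on the zlib build, so none are asserted here.

```python
import zlib

text = "Life is too short, You need python." * 1000
payload = text.encode("utf-8")  # zlib works on bytes, not str

try:
    zlib.compress(text)  # passing a str raises TypeError
except TypeError as exc:
    print("str input rejected:", exc)

fast = zlib.compress(payload, 1)   # level 1: favors speed
small = zlib.compress(payload, 9)  # level 9: favors size
print(len(payload), len(fast), len(small))

# Both levels are lossless: decompression restores the exact bytes
assert zlib.decompress(fast) == payload
assert zlib.decompress(small) == payload
```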

### Compressing string data

In [2]:
import zlib

In [3]:
# Large string data (35,000 bytes)
data = "Life is too short, You need python." * 1000

In [5]:
data

'Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, 

In [6]:
print(len(data))

35000


### Compressing with zlib

In [7]:
# Encode the string as UTF-8, then compress
compressed_data = zlib.compress(data.encode(encoding='utf-8'))
print(len(compressed_data))

161


In [8]:
compressed_data

b'x\x9c\xed\xca\xb1\x11\x80 \x10\x00\xb0U~\x00\xcfIX\xc0^<lxO\xb0p{\xd7\xb0H\xea\x94\xf3\xa8q\x8e\x98\x991Z\xdes\x89-\x9f\xe8\xb5\xeeq\xbd\xb3e_\x8b\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\xff(\x1f1~\xa4\t'

In [19]:
# Compression ratio
print(f'zlib : {round(len(data) / len(compressed_data), 2)}')

zlib : 217.39


### Decompressing with zlib

In [17]:
org_data = zlib.decompress(compressed_data).decode('utf-8')
print(len(org_data))

35000


In [18]:
org_data

'Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, 

> ### gzip
> - `gzip` is a module for compressing and decompressing files
> - Internally it uses `zlib` (the DEFLATE algorithm)
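
Besides the file interface, `gzip` also offers in-memory `gzip.compress()` and `gzip.decompress()`. The sketch below, with illustrative variable names, round-trips a byte string and checks the two-byte gzip signature (`1f 8b`) that marks the start of gzip output.

```python
import gzip

payload = ("Life is too short, You need python." * 10000).encode("utf-8")

packed = gzip.compress(payload)     # in-memory compression, no file needed
restored = gzip.decompress(packed)  # in-memory decompression

print(len(payload), len(packed))
print(packed[:2] == b"\x1f\x8b")    # gzip output starts with this signature

assert restored == payload          # lossless round trip
```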

In [24]:
import gzip

In [22]:
# Large string data (350,000 bytes)
data = "Life is too short, You need python." * 10000

In [23]:
# Save the original data
with open('org_data.txt', 'w') as f:
    f.write(data)

### Compressing with gzip

In [30]:
with gzip.open('compressed.txt.gz', 'wb') as f:
    f.write(data.encode('utf-8'))

### Decompressing with gzip

In [31]:
with gzip.open('compressed.txt.gz', 'rb') as f:
    org_data = f.read().decode('utf-8')

In [32]:
org_data

'Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, 
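
To see the effect of gzip on disk, the two files can be compared by size. The sketch below repeats the steps in a temporary directory so that it runs on its own; it assumes text mode (`'wt'` / `'rt'`) in place of the manual `encode()` / `decode()` calls used above, which is an alternative and not the approach taken in these notes.

```python
import gzip
import os
import tempfile

text = "Life is too short, You need python." * 10000

with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "org_data.txt")
    gz_path = os.path.join(tmp, "compressed.txt.gz")

    # Write the same text once uncompressed and once through gzip
    with open(plain_path, "w", encoding="utf-8") as f:
        f.write(text)
    with gzip.open(gz_path, "wt", encoding="utf-8") as f:
        f.write(text)

    plain_size = os.path.getsize(plain_path)
    gz_size = os.path.getsize(gz_path)

    # Text mode decodes the bytes back to str while reading
    with gzip.open(gz_path, "rt", encoding="utf-8") as f:
        round_trip = f.read()

print(plain_size, gz_size)
```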

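### zipfile
- `zipfile` is a module for bundling multiple files into a single `.zip` archive; by default the entries are stored uncompressed (`ZIP_STORED`) unless a compression method such as `zipfile.ZIP_DEFLATED` is passed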

In [35]:
import zipfile

In [36]:
# Bundle the files into one archive
with zipfile.ZipFile('./sample/새파일.zip', 'w') as myzip:
    myzip.write('./sample/새파일1.txt')
    myzip.write('./sample/새파일2.txt')
    myzip.write('./sample/새파일3.txt')

In [38]:
# Extract the archive
with zipfile.ZipFile('./sample/새파일.zip') as myzip:
    myzip.extractall()
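
`ZipFile` stores entries without compression unless a method is requested. The sketch below passes `compression=zipfile.ZIP_DEFLATED`, lists the archive contents, and extracts into a chosen directory. The file names (`a.txt`, `bundle.zip`, `extracted`) are hypothetical and everything lives in a temporary directory.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample files, created only for this example
    names = ["a.txt", "b.txt", "c.txt"]
    for name in names:
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write("Life is too short, You need python." * 1000)

    zip_path = os.path.join(tmp, "bundle.zip")
    # ZIP_DEFLATED enables compression; the default ZIP_STORED only bundles
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as myzip:
        for name in names:
            # arcname keeps the temporary directory out of the stored path
            myzip.write(os.path.join(tmp, name), arcname=name)

    with zipfile.ZipFile(zip_path) as myzip:
        listed = myzip.namelist()
        # (original size, compressed size) for each entry
        sizes = [(info.file_size, info.compress_size) for info in myzip.infolist()]
        out_dir = os.path.join(tmp, "extracted")
        myzip.extractall(out_dir)
        extracted = sorted(os.listdir(out_dir))

print(listed)
print(sizes)
```

Passing a target directory to `extractall()` keeps extracted files out of the current working directory.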

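> ### tarfile
> - `tarfile` is a module for bundling multiple files into a single `.tar` archive; plain `'w'` mode only archives the files, and a mode such as `'w:gz'` is needed to compress them as well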

In [39]:
import tarfile

In [40]:
# Bundle the files into one archive
with tarfile.open('./sample/새파일.tar', 'w') as mytar:
    mytar.add('./sample/새파일1.txt')
    mytar.add('./sample/새파일2.txt')
    mytar.add('./sample/새파일3.txt')
    mytar.add('./sample/새파일4.txt')

In [42]:
# Extract the archive
with tarfile.open('./sample/새파일.tar') as mytar:
    mytar.extractall()
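
A tar archive opened with mode `'w'` is not compressed. As a minimal sketch, the snippet below writes the same files once with `'w'` and once with `'w:gz'`, compares the resulting sizes, and reads one member back with `extractfile()`. The file names are hypothetical and a temporary directory is used throughout.

```python
import os
import tarfile
import tempfile

content = "Life is too short, You need python." * 1000

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample files, created only for this example
    names = ["a.txt", "b.txt"]
    for name in names:
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write(content)

    plain_tar = os.path.join(tmp, "bundle.tar")
    gz_tar = os.path.join(tmp, "bundle.tar.gz")
    # 'w' only archives; 'w:gz' archives and gzip-compresses
    for path, mode in ((plain_tar, "w"), (gz_tar, "w:gz")):
        with tarfile.open(path, mode) as mytar:
            for name in names:
                mytar.add(os.path.join(tmp, name), arcname=name)

    plain_size = os.path.getsize(plain_tar)
    gz_size = os.path.getsize(gz_tar)

    with tarfile.open(gz_tar, "r:gz") as mytar:
        listed = mytar.getnames()
        # Read one member directly without extracting it to disk
        first = mytar.extractfile(names[0]).read().decode("utf-8")

print(plain_size, gz_size)
print(listed)
```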