# **Chapter 4. [Folders/Directories] Building a Folder Management Program**


---
### 📝 **Table of Contents**
> 4-1. Project overview <br>
> 4-2. Working with directories - os.path, pathlib <br>
> 4-3. Reading and saving files - fileinput, pickle <br>
> 4-4. Finding, copying, and moving files - glob, fnmatch, shutil <br>
> **4-5. File compression - zlib, gzip, zipfile, tarfile** <br>
> 4-6. Project practice

## 4-5. File Compression

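> ### Why is data compression needed?
> - Large data **transfers slowly** and is more likely to **run into transmission problems**
> - Types of data compression
>    - `Lossless compression`: compression with no loss of data at all
>    - `Lossy compression`: compression that discards only information people are unlikely to notice
> - Compression ratio: original data size / compressed data size
> - Compression performance and time depend on the compression algorithm used
> - `Compression`: encoding
> - `Decompression`: decoding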
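
The trade-off between compressed size and time can be illustrated with a small sketch. It assumes `zlib` (introduced later in this section) as the compressor and uses its `level` argument as a stand-in for "different algorithms"; the levels chosen and the `results` dictionary are only illustrative, and timings will vary by machine.

```python
import time
import zlib

# Sample text encoded to bytes, because zlib works on bytes rather than str.
raw = ("Life is too short, You need python." * 10000).encode("utf-8")

results = {}
for level in (1, 6, 9):  # 1 favors speed, 9 favors size, 6 is a middle setting
    start = time.perf_counter()
    packed = zlib.compress(raw, level)
    elapsed = time.perf_counter() - start
    # Compression ratio = original size / compressed size
    results[level] = len(raw) / len(packed)
    print(f"level {level}: {len(packed)} bytes, "
          f"ratio {results[level]:.2f}, {elapsed * 1000:.3f} ms")
```

A ratio above 1 means the compressed data is smaller than the original.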

> ### Run-Length Encoding
> - A representative lossless compression method
>
> <img align='left' src='img/run_length-encoding.png' width='600' height='600'/>
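
As a rough illustration of the idea, run-length encoding replaces each run of identical symbols with the symbol and its count. The sketch below is one minimal way to write it; the function names `rle_encode` and `rle_decode` and the sample string are hypothetical and are not taken from the figure.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Expand (char, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)


sample = "AAAABBBCCD"
encoded = rle_encode(sample)
print(encoded)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded))  # AAAABBBCCD
```

Because decoding restores the input exactly, the method is lossless. It only saves space when the data contains long runs of repeated symbols.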

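> ### zlib
> - `zlib` is a module for compressing and decompressing data
> - The `compress()` and `decompress()` functions operate on bytes, so a string must be encoded before compression and decoded after decompression
> - Use it when data needs to be made smaller before it is transferred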

### Compressing String Data

In [1]:
import zlib

In [2]:
# Large string data (350,000 bytes)
data = "Life is too short, You need python." * 10000

In [3]:
data

'Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, 

In [4]:
print(len(data))

350000


#### Compressing with zlib

In [5]:
# Encode the string to UTF-8 bytes, then compress
compress_data = zlib.compress(data.encode(encoding='utf-8'))
print(len(compress_data))

1077


In [6]:
compress_data

b'x\x9c\xed\xca\xb1\x11\x80 \x10\x00\xb0U~\x00\xcfIX\xc0^<lxO\xb0p{\xc7\xb0I\xea\x94\xf3\xa8q\x8e\x98\x991Z\xdes\x89-\x9f\xe8\xb5\xeeq\xbd\xb3e_\x8b\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\x8a\xa2(\

In [7]:
# Compression ratio
print(f'zlib : {round(len(data) / len(compress_data), 2)}')

zlib : 324.98


#### Decompressing with zlib

In [8]:
org_data = zlib.decompress(compress_data).decode('utf-8')
print(len(org_data))

350000


In [9]:
org_data

'Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, 

> ### gzip
> - `gzip` is a module for compressing and decompressing files
> - Internally it uses the `zlib` module
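
The relationship between the two modules can be checked with a short sketch. It assumes in-memory data rather than a file, using `gzip.compress()`, and relies on the `wbits` argument of `zlib.decompress()` to tell zlib that the input carries a gzip header and trailer.

```python
import gzip
import zlib

raw = ("Life is too short, You need python." * 10000).encode("utf-8")

# Compress in memory; the result is a complete gzip stream.
gz_bytes = gzip.compress(raw)
print(gz_bytes[:2])  # gzip magic number: b'\x1f\x8b'

# MAX_WBITS | 16 makes zlib expect the gzip wrapper around the compressed data.
restored = zlib.decompress(gz_bytes, wbits=zlib.MAX_WBITS | 16)
print(restored == raw)  # True
```

In other words, a gzip stream is zlib-compressed data wrapped in the gzip file format, which is why either module can decode it when configured accordingly.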

In [10]:
import gzip

In [11]:
# Large string data (350,000 bytes)
data = "Life is too short, You need python." * 10000

In [12]:
# Save the original data
with open('org_data.txt', 'w') as f:
    f.write(data)

#### Compressing with gzip

In [13]:
with gzip.open('compressed.txt.gz', 'wb') as f:
    f.write(data.encode('utf-8'))

#### Decompressing with gzip

In [15]:
with gzip.open('compressed.txt.gz', 'rb') as f:
    org_data = f.read().decode('utf-8')

In [16]:
org_data

'Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, You need python.Life is too short, 

> ### zipfile
> - `zipfile` is a module for bundling multiple files into a single `.zip` archive, optionally compressing them

In [17]:
import zipfile

In [18]:
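# Bundle the files into one archive
# (ZIP_DEFLATED compresses the entries; the default ZIP_STORED only stores them)
with zipfile.ZipFile('./sample/새파일.zip', 'w', compression=zipfile.ZIP_DEFLATED) as myzip:
    myzip.write('./sample/새파일1.txt')
    myzip.write('./sample/새파일2.txt')
    myzip.write('./sample/새파일3.txt')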

In [19]:
# Extract the archive (into the current working directory)
with zipfile.ZipFile('./sample/새파일.zip') as myzip:
    myzip.extractall()
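
To inspect an archive without extracting it, `namelist()`, `infolist()`, and `read()` can be used. The following is a self-contained sketch: the temporary directory and the file names `file1.txt` to `file3.txt` and `bundle.zip` are hypothetical stand-ins for the `./sample` files used above.

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # Hypothetical sample files, created only for this sketch.
    for i in (1, 2, 3):
        (tmp / f"file{i}.txt").write_text(
            "Life is too short, You need python." * 1000, encoding="utf-8"
        )

    archive = tmp / "bundle.zip"
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as myzip:
        for i in (1, 2, 3):
            path = tmp / f"file{i}.txt"
            # arcname stores only the file name, not the temporary directory path.
            myzip.write(path, arcname=path.name)

    with zipfile.ZipFile(archive) as myzip:
        names = myzip.namelist()
        # Map each entry to (original size, compressed size) in bytes.
        sizes = {info.filename: (info.file_size, info.compress_size)
                 for info in myzip.infolist()}
        first_text = myzip.read("file1.txt").decode("utf-8")

print(names)
print(sizes)
```

Comparing `file_size` with `compress_size` shows whether the entries were actually compressed or merely stored.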

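> ### tarfile
> - `tarfile` is a module for bundling multiple files into a single `.tar` archive; compression is optional and is chosen through the open mode (for example `'w:gz'`)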

In [20]:
import tarfile

In [21]:
# Bundle the files into one archive (mode 'w' writes an uncompressed tar)
with tarfile.open('./sample/새파일.tar', 'w') as mytar:
    mytar.add('./sample/새파일1.txt')
    mytar.add('./sample/새파일2.txt')
    mytar.add('./sample/새파일3.txt')

In [None]:
# Extract the archive (into the current working directory)
with tarfile.open('./sample/새파일.tar') as mytar:
    mytar.extractall()
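
The effect of the open mode on archive size can be seen by writing the same files twice, once with `'w'` and once with `'w:gz'`. This is a sketch under assumptions: the temporary directory and the names `file1.txt` to `file3.txt`, `bundle.tar`, and `bundle.tar.gz` are hypothetical and do not come from the `./sample` folder above.

```python
import tarfile
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # Hypothetical sample files, created only for this sketch.
    for i in (1, 2, 3):
        (tmp / f"file{i}.txt").write_text(
            "Life is too short, You need python." * 1000, encoding="utf-8"
        )

    plain = tmp / "bundle.tar"
    gzipped = tmp / "bundle.tar.gz"
    # 'w' only bundles the files; 'w:gz' bundles them and gzip-compresses the result.
    for target, mode in ((plain, "w"), (gzipped, "w:gz")):
        with tarfile.open(str(target), mode) as mytar:
            for i in (1, 2, 3):
                path = tmp / f"file{i}.txt"
                mytar.add(str(path), arcname=path.name)

    plain_size = plain.stat().st_size
    gz_size = gzipped.stat().st_size

    # The default read mode detects the compression automatically.
    with tarfile.open(str(gzipped)) as mytar:
        tar_names = mytar.getnames()
        member_text = mytar.extractfile("file1.txt").read().decode("utf-8")

print(f"tar: {plain_size} bytes, tar.gz: {gz_size} bytes")
print(tar_names)
```

Reading a member with `extractfile()` avoids writing anything to disk, which is convenient when only the contents are needed.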