PyTorch and TensorFlow are the two most widely used deep learning frameworks.
This post compares their core syntax side by side.


### Tensor operations
* Both frameworks support matrix-style tensor operations in much the same way.

In [3]:
# PyTorch
import torch

# Create tensors
x = torch.tensor([1, 2, 3])
y = torch.tensor([4.0, 5.0, 6.0])

# Tensor operation
print(x.dtype)  # x holds integers, so its dtype is int64
z = x + y
print(z)


torch.int64
tensor([5., 7., 9.])
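The PyTorch cell adds an integer tensor to a float tensor, and the result comes back as floats. That works because PyTorch applies type promotion. The short check below is only a sketch of that behavior: `torch.result_type` reports the dtype an operation on the two tensors would produce.

```python
import torch

x = torch.tensor([1, 2, 3])        # integer literals -> torch.int64
y = torch.tensor([4.0, 5.0, 6.0])  # float literals -> torch.float32

# Ask PyTorch which dtype an operation on x and y would produce.
promoted = torch.result_type(x, y)

z = x + y
print(x.dtype, y.dtype, promoted, z.dtype)
```

TensorFlow is stricter about mixed dtypes by default, which is probably why the TensorFlow cell below writes both tensors with float literals. An explicit `tf.cast` is the usual way to align dtypes there.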


In [None]:
# TensorFlow
import tensorflow as tf

# Create tensors
x = tf.constant([1.0, 2.0, 3.0])
y = tf.constant([4.0, 5.0, 6.0])

# Tensor operation
z = x + y
print(z)

tf.Tensor([5. 7. 9.], shape=(3,), dtype=float32)


### Autograd
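* PyTorch: gradients are usually computed explicitly by calling `backward()` on the loss.
* TensorFlow: training typically goes through `fit`, which computes gradients internally, so no explicit call is needed. For cases where gradients are needed directly, TensorFlow provides `tf.GradientTape`, which records the operations to differentiate.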

In [None]:
# PyTorch
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2
z = y.mean()  # a simple function standing in for a loss function
z.backward()

print(x.grad)


tensor([0.6667, 0.6667, 0.6667])


In [None]:
# TensorFlow
x = tf.Variable([1.0, 2.0, 3.0])
with tf.GradientTape() as tape:
    y = x * 2
    z = tf.reduce_mean(y)  # a simple function standing in for a loss function

grads = tape.gradient(z, x)
print(grads)


tf.Tensor([0.6666667 0.6666667 0.6666667], shape=(3,), dtype=float32)
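Both frameworks report about 0.6667 for every element. With z = (1/3) Σ 2·x_i, each partial derivative ∂z/∂x_i is 2/3 regardless of the value of x. The framework-free sketch below checks this numerically with central finite differences; `f` and `numerical_grad` are illustrative names, not part of either library.

```python
def f(x):
    """Same toy 'loss' as above: mean of 2 * x."""
    return sum(2.0 * xi for xi in x) / len(x)


def numerical_grad(func, x, eps=1e-6):
    """Approximate d func / d x[i] with central finite differences."""
    grads = []
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += eps
        minus[i] -= eps
        grads.append((func(plus) - func(minus)) / (2 * eps))
    return grads


x = [1.0, 2.0, 3.0]
grads = numerical_grad(f, x)
print([round(g, 4) for g in grads])
```

Autograd in both frameworks gets the same numbers analytically, by applying the chain rule through the recorded operations instead of perturbing the inputs.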


### Sequential Model
* Both provide a Sequential API. In TensorFlow (Keras), the activation function can be passed as an option to `Dense` (the counterpart of `Linear`).

In [None]:
import torch
import torch.nn as nn

# PyTorch Sequential model definition
model = nn.Sequential(
    nn.Linear(3, 5),  # input size 3, output size 5
    nn.ReLU(),
    nn.Linear(5, 4),  # input size 5, output size 4
    nn.ReLU(),
    nn.Linear(4, 1)   # input size 4, output size 1
)

# Print the model structure
print(model)


Sequential(
  (0): Linear(in_features=3, out_features=5, bias=True)
  (1): ReLU()
  (2): Linear(in_features=5, out_features=4, bias=True)
  (3): ReLU()
  (4): Linear(in_features=4, out_features=1, bias=True)
)


In [None]:
# TensorFlow
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# TensorFlow Sequential model definition
model = Sequential([
    Dense(5, input_shape=(3,), activation='relu'),  # input size 3, output size 5
    Dense(4, activation='relu'),  # input size 5 (inferred), output size 4
    Dense(1)  # input size 4 (inferred), output size 1
])

# Print the model summary
model.summary()


Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 5)                 20        
                                                                 
 dense_1 (Dense)             (None, 4)                 24        
                                                                 
 dense_2 (Dense)             (None, 1)                 5         
                                                                 
Total params: 49 (196.00 Byte)
Trainable params: 49 (196.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
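The `Param #` column in the summary follows from simple arithmetic: a fully connected layer has one weight per input/output pair plus one bias per output. The sketch below reproduces the numbers; `dense_params` is a hypothetical helper written for this illustration.

```python
def dense_params(n_in, n_out):
    """Parameters of a fully connected layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out


layer_sizes = [(3, 5), (5, 4), (4, 1)]  # (inputs, outputs) of the three layers above
counts = [dense_params(n_in, n_out) for n_in, n_out in layer_sizes]
total = sum(counts)

print(counts, total)
print(total * 4, "bytes as float32")  # 4 bytes per float32 parameter
```

The same counts apply to the PyTorch `nn.Linear` layers, since both models have identical layer sizes.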


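### CNN model comparison
* In TensorFlow, only the first layer needs an input shape; the input size of every later layer is inferred automatically. In PyTorch, layers such as `nn.Linear` take their input size explicitly (for example `64 * 7 * 7` below).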

In [None]:
import torch
import torch.nn as nn

# PyTorch Sequential model definition (CNN)
model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, padding=1),  # 1 input channel, 32 output channels
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),  # 2x2 max pooling

    nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1),  # 32 input channels, 64 output channels
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),  # 2x2 max pooling

    nn.Flatten(),  # flatten the feature maps into a 1D vector per sample
    nn.Linear(64 * 7 * 7, 128),  # 64*7*7 input features, 128 output features
    nn.ReLU(),
    nn.Linear(128, 10)  # 128 input features, 10 output features (number of classes)
)

# Print the model structure
print(model)


Sequential(
  (0): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU()
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (6): Flatten(start_dim=1, end_dim=-1)
  (7): Linear(in_features=3136, out_features=128, bias=True)
  (8): ReLU()
  (9): Linear(in_features=128, out_features=10, bias=True)
)
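The `64 * 7 * 7` passed to the first `nn.Linear` has to be worked out by hand in PyTorch. The sketch below traces the spatial size through the layers, assuming a 28x28 input (the MNIST size used later in this post); `conv_out` is a hypothetical helper that applies the standard output-size formula.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                   # assumed MNIST input: 28x28
size = conv_out(size, kernel=3, padding=1)  # Conv2d(k=3, p=1)   -> 28
size = conv_out(size, kernel=2, stride=2)   # MaxPool2d(2, 2)    -> 14
size = conv_out(size, kernel=3, padding=1)  # Conv2d(k=3, p=1)   -> 14
size = conv_out(size, kernel=2, stride=2)   # MaxPool2d(2, 2)    -> 7

flat_features = 64 * size * size            # 64 channels after the second conv
print(size, flat_features)
```

The result matches `in_features=3136` in the printed model.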


In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# TensorFlow Sequential model definition (CNN)
model = Sequential([
    Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(28, 28, 1)),  # 1 input channel, 32 output channels
    MaxPooling2D((2, 2)),  # 2x2 max pooling

    Conv2D(64, (3, 3), padding='same', activation='relu'),  # 32 input channels (inferred), 64 output channels
    MaxPooling2D((2, 2)),  # 2x2 max pooling

    Flatten(),  # flatten the feature maps into a 1D vector per sample
    Dense(128, activation='relu'),  # input size inferred, 128 output units
    Dense(10)  # 128 inputs (inferred), 10 output units (number of classes)
])

# Print the model summary
model.summary()


Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 28, 28, 32)        320       
                                                                 
 max_pooling2d (MaxPooling2  (None, 14, 14, 32)        0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 14, 14, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 7, 7, 64)          0         
 g2D)                                                            
                                                                 
 flatten (Flatten)           (None, 3136)              0         
                                                                 
 dense_3 (Dense)             (None, 128)              
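The convolution rows of the summary can be checked the same way as the dense layers. A 2D convolution holds one k x k filter for every input/output channel pair, plus one bias per output channel. The helper `conv2d_params` below is a hypothetical name used only for this sketch.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights of all k x k filters plus one bias per output channel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


conv1 = conv2d_params(in_channels=1, out_channels=32, kernel_size=3)
conv2 = conv2d_params(in_channels=32, out_channels=64, kernel_size=3)
print(conv1, conv2)
```

Pooling and flatten layers have no trainable parameters, which is why their rows show 0.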

### Subclassing
* Both support model subclassing. TensorFlow (Keras) uses a method named `call` where PyTorch uses `forward`.

In [None]:
import torch
import torch.nn as nn

# PyTorch model definition (subclassing)
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.layer1 = nn.Linear(3, 5)
        self.layer2 = nn.Linear(5, 4)
        self.layer3 = nn.Linear(4, 1)

    def forward(self, x):
        x = torch.relu(self.layer1(x))
        x = torch.relu(self.layer2(x))
        x = self.layer3(x)
        return x

model = MyModel()

# Print the model structure
print(model)


MyModel(
  (layer1): Linear(in_features=3, out_features=5, bias=True)
  (layer2): Linear(in_features=5, out_features=4, bias=True)
  (layer3): Linear(in_features=4, out_features=1, bias=True)
)
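Once defined, a subclassed PyTorch model is normally used by calling the instance itself rather than calling `forward` directly. The sketch below shows this with a small stand-in model; `TinyModel` and the random input batch are made up for illustration.

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):  # hypothetical two-layer model for illustration
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(3, 5)
        self.layer2 = nn.Linear(5, 1)

    def forward(self, x):
        return self.layer2(torch.relu(self.layer1(x)))


model = TinyModel()
batch = torch.randn(2, 3)           # 2 samples, 3 features each

out_call = model(batch)             # preferred: goes through nn.Module.__call__
out_forward = model.forward(batch)  # same computation, but bypasses hooks

print(out_call.shape)
```

Keras models follow the same pattern: the instance is called like a function, and Keras dispatches to the `call` method.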


In [None]:
import tensorflow as tf
from tensorflow.keras import layers, Model

# TensorFlow model definition (subclassing)
class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.dense1 = layers.Dense(5, activation='relu')
        self.dense2 = layers.Dense(4, activation='relu')
        self.dense3 = layers.Dense(1)

    def call(self, inputs):
        x = self.dense1(inputs)
        x = self.dense2(x)
        return self.dense3(x)

model = MyModel()

# Print the model summary
model.build((None, 3))  # build the model so the input size is defined
model.summary()


Model: "my_model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_5 (Dense)             multiple                  20        
                                                                 
 dense_6 (Dense)             multiple                  24        
                                                                 
 dense_7 (Dense)             multiple                  5         
                                                                 
Total params: 49 (196.00 Byte)
Trainable params: 49 (196.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


### Functional API
* PyTorch collects stateless operation functions in `torch.nn.functional`.
* TensorFlow (Keras) offers the Functional API, which builds models flexibly by calling layers on tensors.
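In PyTorch the "functional" name therefore means something different from the Keras Functional API: it is a set of plain functions that mirror the layer modules. As a small sketch, the module form and the functional form of ReLU give the same result on a made-up input.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([-1.0, 0.0, 2.0])

relu_module = nn.ReLU()       # module form: an object that can sit inside nn.Sequential
out_module = relu_module(x)
out_functional = F.relu(x)    # functional form: a plain function call inside forward()

print(out_module, out_functional)
```

Layers with learnable weights (such as `nn.Conv2d` and `nn.Linear`) are usually kept as modules so their parameters are registered, while parameter-free operations such as ReLU or pooling are often called through `F`.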


In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F

# PyTorch model definition (using torch.nn.functional)
class MyCNN(nn.Module):
    def __init__(self):
        super(MyCNN, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, padding=1)  # 1 input channel, 32 output channels
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)  # 32 input channels, 64 output channels
        self.fc1 = nn.Linear(64 * 7 * 7, 128)  # 64*7*7 input features, 128 output features
        self.fc2 = nn.Linear(128, 10)  # 128 input features, 10 output features (number of classes)

    def forward(self, x):
        x = F.relu(self.conv1(x))  # ReLU activation
        x = F.max_pool2d(x, kernel_size=2, stride=2)  # 2x2 max pooling
        x = F.relu(self.conv2(x))  # ReLU activation
        x = F.max_pool2d(x, kernel_size=2, stride=2)  # 2x2 max pooling
        x = torch.flatten(x, 1)  # flatten everything except the batch dimension
        x = F.relu(self.fc1(x))  # ReLU activation
        x = self.fc2(x)  # final output layer (raw logits)
        return x

model_torch = MyCNN()

# Print the model structure
print(model_torch)


MyCNN(
  (conv1): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (fc1): Linear(in_features=3136, out_features=128, bias=True)
  (fc2): Linear(in_features=128, out_features=10, bias=True)
)


In [None]:
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.models import Model

# TensorFlow model definition using the Functional API
inputs = Input(shape=(28, 28, 1))  # define the input shape
x = Conv2D(32, (3, 3), padding='same', activation='relu')(inputs)  # 1 input channel, 32 output channels
x = MaxPooling2D((2, 2))(x)  # 2x2 max pooling
x = Conv2D(64, (3, 3), padding='same', activation='relu')(x)  # 32 input channels, 64 output channels
x = MaxPooling2D((2, 2))(x)  # 2x2 max pooling
x = Flatten()(x)  # flatten the feature maps into a 1D vector per sample
x = Dense(128, activation='relu')(x)  # input size inferred, 128 output units
outputs = Dense(10, activation='softmax')(x)  # 128 inputs, 10 output units (number of classes)

# Build the model from its inputs and outputs
model_tf = Model(inputs=inputs, outputs=outputs)

# Print the model summary
model_tf.summary()


Model: "model_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_3 (InputLayer)        [(None, 28, 28, 1)]       0         
                                                                 
 conv2d_6 (Conv2D)           (None, 28, 28, 32)        320       
                                                                 
 max_pooling2d_6 (MaxPoolin  (None, 14, 14, 32)        0         
 g2D)                                                            
                                                                 
 conv2d_7 (Conv2D)           (None, 14, 14, 64)        18496     
                                                                 
 max_pooling2d_7 (MaxPoolin  (None, 7, 7, 64)          0         
 g2D)                                                            
                                                                 
 flatten_3 (Flatten)         (None, 3136)              0   

### Training
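* TensorFlow (Keras) trains with a single `fit` call, and extras such as early stopping can be passed in as options (callbacks).
* TensorFlow uses a GPU by default when one is available, whereas the PyTorch code moves the model and data to the device explicitly.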
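In Keras, early stopping is typically supplied as a callback to `fit` (for example `tf.keras.callbacks.EarlyStopping`). A hand-written PyTorch loop has to carry that logic itself. The sketch below shows one minimal way to do it; `EarlyStopper` is a hypothetical helper, and the validation losses are made-up numbers standing in for values computed during training.

```python
class EarlyStopper:
    """Hypothetical helper: stop when the loss has not improved for `patience` epochs."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, loss):
        if loss < self.best - self.min_delta:
            self.best = loss        # improvement: remember it and reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1    # no improvement this epoch
        return self.bad_epochs >= self.patience


# Made-up validation losses, standing in for values computed inside a training loop.
val_losses = [0.50, 0.40, 0.41, 0.42, 0.39]

stopper = EarlyStopper(patience=2)
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break

print(stopped_at, stopper.best)
```

With these numbers the loop stops at epoch 4, after two epochs without improvement over the best loss of 0.40. In a real loop the check would sit at the end of each epoch, after evaluating on a validation set.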

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Device setup: use the GPU when available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load and preprocess the dataset
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
train_loader = DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)

# Define the model, loss function, and optimizer
model = MyCNN().to(device)  # move the model to the selected device
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Train the model
num_epochs = 5
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)  # move the batch to the selected device
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(train_loader):.4f}')

# Save the trained model weights
torch.save(model.state_dict(), 'path_to_model_weights.pth')


Epoch [1/5], Loss: 0.1396
Epoch [2/5], Loss: 0.0443
Epoch [3/5], Loss: 0.0304
Epoch [4/5], Loss: 0.0205
Epoch [5/5], Loss: 0.0166


In [None]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.data import Dataset

# Load and preprocess the dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
y_train = tf.keras.utils.to_categorical(y_train, 10)  # one-hot labels for categorical_crossentropy

# Build the dataset object
train_dataset = Dataset.from_tensor_slices((x_train, y_train))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64).prefetch(tf.data.AUTOTUNE)

# Compile the model (assumes model_tf from the Functional API cell is already defined)
model_tf.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
num_epochs = 5
model_tf.fit(train_dataset, epochs=num_epochs)

# Save the trained model
model_tf.save('path_to_model_weights.keras')


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
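The two training cells feed labels differently. PyTorch's `nn.CrossEntropyLoss` takes raw logits and integer class indices, while the Keras cell one-hot encodes the labels for `categorical_crossentropy` and ends the model with a softmax. Both compute the same quantity, as this framework-free sketch with made-up logits suggests (the function names are illustrative).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ce_from_index(logits, label):
    """Cross entropy from raw logits and an integer class index."""
    return -math.log(softmax(logits)[label])


def ce_from_one_hot(probs, one_hot):
    """Cross entropy from probabilities and a one-hot label vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


logits = [2.0, 0.5, -1.0]   # made-up scores for a 3-class problem
label = 0
one_hot = [1.0, 0.0, 0.0]

loss_index = ce_from_index(logits, label)
loss_one_hot = ce_from_one_hot(softmax(logits), one_hot)
print(loss_index, loss_one_hot)
```

Keras also provides `sparse_categorical_crossentropy`, which accepts integer labels directly and would make the `to_categorical` step unnecessary.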


### Inference

In [None]:
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Device setup: use the GPU when available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load and preprocess the dataset
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
test_loader = DataLoader(dataset=test_dataset, batch_size=1, shuffle=False)

# Rebuild the model (assumes the MyCNN class is already defined) and load the trained weights
model_torch = MyCNN().to(device)
model_torch.load_state_dict(torch.load('path_to_model_weights.pth', map_location=device))  # map_location lets GPU-saved weights load on CPU
model_torch.eval()  # switch to evaluation mode

# Run predictions
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)  # move the batch to the selected device
        outputs = model_torch(images)
        _, predicted = torch.max(outputs, 1)
        print(f'Predicted: {predicted.item()}, Ground Truth: {labels.item()}')


Streaming output truncated to the last 5000 lines.
Predicted: 3, Ground Truth: 3
Predicted: 9, Ground Truth: 9
Predicted: 9, Ground Truth: 9
Predicted: 8, Ground Truth: 8
Predicted: 4, Ground Truth: 4
Predicted: 1, Ground Truth: 1
Predicted: 0, Ground Truth: 0
Predicted: 6, Ground Truth: 6
Predicted: 0, Ground Truth: 0
Predicted: 9, Ground Truth: 9
Predicted: 6, Ground Truth: 6
Predicted: 8, Ground Truth: 8
Predicted: 6, Ground Truth: 6
Predicted: 1, Ground Truth: 1
Predicted: 1, Ground Truth: 1
Predicted: 9, Ground Truth: 9
Predicted: 8, Ground Truth: 8
Predicted: 9, Ground Truth: 9
Predicted: 2, Ground Truth: 2
Predicted: 3, Ground Truth: 3
Predicted: 5, Ground Truth: 5
Predicted: 5, Ground Truth: 5
Predicted: 9, Ground Truth: 9
Predicted: 4, Ground Truth: 4
Predicted: 2, Ground Truth: 2
Predicted: 1, Ground Truth: 1
Predicted: 9, Ground Truth: 9
Predicted: 4, Ground Truth: 4
Predicted: 3, Ground Truth: 3
Predicted: 9, Ground Truth: 9
Predicted: 6, Ground Truth: 6
Predicted: 0, Ground

In [None]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.data import Dataset

# Load and preprocess the dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_test = x_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0
y_test = tf.keras.utils.to_categorical(y_test, 10)

# Build the dataset object
test_dataset = Dataset.from_tensor_slices((x_test, y_test)).batch(1)

# Load the trained model (the return value must be assigned to be used)
model_tf = tf.keras.models.load_model('path_to_model_weights.keras')

# Run predictions
for images, labels in test_dataset:
    outputs = model_tf(images, training=False)
    predicted = tf.argmax(outputs, axis=1)
    ground_truth = tf.argmax(labels, axis=1)
    print(f'Predicted: {predicted.numpy()[0]}, Ground Truth: {ground_truth.numpy()[0]}')


Streaming output truncated to the last 5000 lines.
Predicted: 3, Ground Truth: 3
Predicted: 9, Ground Truth: 9
Predicted: 9, Ground Truth: 9
Predicted: 8, Ground Truth: 8
Predicted: 4, Ground Truth: 4
Predicted: 1, Ground Truth: 1
Predicted: 0, Ground Truth: 0
Predicted: 6, Ground Truth: 6
Predicted: 0, Ground Truth: 0
Predicted: 9, Ground Truth: 9
Predicted: 6, Ground Truth: 6
Predicted: 8, Ground Truth: 8
Predicted: 6, Ground Truth: 6
Predicted: 1, Ground Truth: 1
Predicted: 1, Ground Truth: 1
Predicted: 9, Ground Truth: 9
Predicted: 8, Ground Truth: 8
Predicted: 9, Ground Truth: 9
Predicted: 2, Ground Truth: 2
Predicted: 3, Ground Truth: 3
Predicted: 5, Ground Truth: 5
Predicted: 5, Ground Truth: 5
Predicted: 9, Ground Truth: 9
Predicted: 4, Ground Truth: 4
Predicted: 2, Ground Truth: 2
Predicted: 1, Ground Truth: 1
Predicted: 9, Ground Truth: 9
Predicted: 4, Ground Truth: 4
Predicted: 3, Ground Truth: 3
Predicted: 9, Ground Truth: 9
Predicted: 6, Ground Truth: 6
Predicted: 0, Ground
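Printing one line per test sample floods the output, which is why both cells were truncated. A common alternative is to aggregate the predictions into a single accuracy figure. The sketch below shows the idea in plain Python with made-up scores and labels; in the cells above the predictions would come from `torch.max` or `tf.argmax`.

```python
def argmax(values):
    """Index of the largest value in one row of class scores."""
    return max(range(len(values)), key=lambda i: values[i])


# Made-up model outputs (one row of class scores per sample) and true labels.
outputs = [
    [0.1, 2.3, -0.5],
    [1.7, 0.2, 0.4],
    [0.0, 0.3, 0.9],
    [0.8, 0.1, 0.6],
]
labels = [1, 0, 2, 2]

predictions = [argmax(row) for row in outputs]
correct = sum(int(p == t) for p, t in zip(predictions, labels))
accuracy = correct / len(labels)
print(predictions, accuracy)
```

With a real test set, a larger batch size than 1 would also make the evaluation loop much faster.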


*The examples were created with the help of ChatGPT.*