<a href="https://colab.research.google.com/github/ancestor9/2025_Winter_Deep-Learning-with-TensorFlow/blob/main/20260114_04_pytorch/tensor_matmul_convolution.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **1. TensorFlow Tensor [click](https://www.tensorflow.org/guide/tensor)**

##### Copyright 2020 The TensorFlow Authors.

In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Introduction to Tensors

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://www.tensorflow.org/guide/tensor"><img src="https://www.tensorflow.org/images/tf_logo_32px.png" />View on TensorFlow.org</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/guide/tensor.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/tensorflow/docs/blob/master/site/en/guide/tensor.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
  <td>
    <a href="https://storage.googleapis.com/tensorflow_docs/docs/site/en/guide/tensor.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" />Download notebook</a>
  </td>
</table>

In [None]:
import tensorflow as tf
import numpy as np

Tensors are multi-dimensional arrays with a uniform type (called a `dtype`).  You can see all supported `dtypes` at `tf.dtypes`.

If you're familiar with [NumPy](https://numpy.org/devdocs/user/quickstart.html), tensors are (kind of) like `np.arrays`.

All tensors are immutable like Python numbers and strings: you can never update the contents of a tensor, only create a new one.
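A minimal sketch of what immutability means in practice, assuming eager execution (the TensorFlow 2 default). The variable names are illustrative only, and `tf.tensor_scatter_nd_update` is just one of several ways to build a modified copy.

```python
import tensorflow as tf

t = tf.constant([1, 2, 3])

# In-place item assignment is not supported on a tf.Tensor.
try:
    t[0] = 10
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Instead, build a NEW tensor with the changed value; the original is untouched.
updated = tf.tensor_scatter_nd_update(t, indices=[[0]], updates=[10])

print(assignment_failed)         # True
print(t.numpy().tolist())        # [1, 2, 3]
print(updated.numpy().tolist())  # [10, 2, 3]
```

If you need state that changes over time (for example model weights), `tf.Variable` is the usual tool.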


## Basics

First, create some basic tensors.

Here is a "scalar" or "rank-0" tensor. A scalar contains a single value, and no "axes".

In [None]:
# This will be an int32 tensor by default; see "dtypes" below.
rank_0_tensor = tf.constant(4)
print(rank_0_tensor)

tf.Tensor(4, shape=(), dtype=int32)


A "vector" or "rank-1" tensor is like a list of values. A vector has one axis:

In [None]:
# Let's make this a float tensor.
rank_1_tensor = tf.constant([2.0, 3.0, 4.0])
print(rank_1_tensor)

tf.Tensor([2. 3. 4.], shape=(3,), dtype=float32)


A "matrix" or "rank-2" tensor has two axes:

In [None]:
# If you want to be specific, you can set the dtype (see below) at creation time
rank_2_tensor = tf.constant([[1, 2],
                             [3, 4],
                             [5, 6]], dtype=tf.float16)
print(rank_2_tensor)

tf.Tensor(
[[1. 2.]
 [3. 4.]
 [5. 6.]], shape=(3, 2), dtype=float16)


<table>
<tr>
  <th>A scalar, shape: <code>[]</code></th>
  <th>A vector, shape: <code>[3]</code></th>
  <th>A matrix, shape: <code>[3, 2]</code></th>
</tr>
<tr>
  <td>
   <img src="https://github.com/tensorflow/docs/blob/master/site/en/guide/images/tensor/scalar.png?raw=1" alt="A scalar, the number 4" />
  </td>

  <td>
   <img src="https://github.com/tensorflow/docs/blob/master/site/en/guide/images/tensor/vector.png?raw=1" alt="The line with 3 sections, each one containing a number."/>
  </td>
  <td>
   <img src="https://github.com/tensorflow/docs/blob/master/site/en/guide/images/tensor/matrix.png?raw=1" alt="A 3x2 grid, with each cell containing a number.">
  </td>
</tr>
</table>


Tensors may have more axes; here is a tensor with three axes:

In [None]:
# There can be an arbitrary number of
# axes (sometimes called "dimensions")
rank_3_tensor = tf.constant([
  [[0, 1, 2, 3, 4],
   [5, 6, 7, 8, 9]],
  [[10, 11, 12, 13, 14],
   [15, 16, 17, 18, 19]],
  [[20, 21, 22, 23, 24],
   [25, 26, 27, 28, 29]],])

print(rank_3_tensor)

tf.Tensor(
[[[ 0  1  2  3  4]
  [ 5  6  7  8  9]]

 [[10 11 12 13 14]
  [15 16 17 18 19]]

 [[20 21 22 23 24]
  [25 26 27 28 29]]], shape=(3, 2, 5), dtype=int32)


There are many ways you might visualize a tensor with more than two axes.

<table>
<tr>
  <th colspan=3>A 3-axis tensor, shape: <code>[3, 2, 5]</code></th>
</tr>
<tr>
  <td>
   <img src="https://github.com/tensorflow/docs/blob/master/site/en/guide/images/tensor/3-axis_numpy.png?raw=1"/>
  </td>
  <td>
   <img src="https://github.com/tensorflow/docs/blob/master/site/en/guide/images/tensor/3-axis_front.png?raw=1"/>
  </td>

  <td>
   <img src="https://github.com/tensorflow/docs/blob/master/site/en/guide/images/tensor/3-axis_block.png?raw=1"/>
  </td>
</tr>

</table>

# **2. PyTorch Inner Product [click](https://pytorch.org/docs/stable/generated/torch.matmul.html)**

In [None]:
import torch

In [None]:
# vector x vector
tensor1 = torch.randn(3)
tensor2 = torch.randn(3)
print(torch.matmul(tensor1, tensor2))
torch.matmul(tensor1, tensor2).size()

tensor(-1.5560)


torch.Size([])
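To make the vector x vector case concrete, here is a small sketch with fixed values instead of random ones (the values are chosen only for illustration). For two 1-D tensors, `torch.matmul` returns the dot product, which is the sum of the element-wise products.

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0, 6.0])

via_matmul = torch.matmul(a, b)  # 1-D x 1-D -> 0-D tensor (scalar)
via_dot = torch.dot(a, b)        # explicit dot product
via_sum = (a * b).sum()          # 1*4 + 2*5 + 3*6

print(via_matmul.item(), via_dot.item(), via_sum.item())  # 32.0 32.0 32.0
print(via_matmul.shape)                                   # torch.Size([])
```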

In [None]:
# matrix x vector
tensor1 = torch.randn(3, 4)
tensor2 = torch.randn(4)
print(torch.matmul(tensor1, tensor2))
torch.matmul(tensor1, tensor2).size()

tensor([ 1.1461, -2.3819,  0.2535])


torch.Size([3])

In [None]:

# batched matrix x broadcasted vector
tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(4)

print(torch.matmul(tensor1, tensor2))
torch.matmul(tensor1, tensor2).size()

tensor([[ 0.2762, -1.1717,  1.7380],
        [ 1.2485,  1.2325,  1.2556],
        [-0.9750,  0.5600, -0.0521],
        [-1.1922,  0.4728,  0.4964],
        [ 0.3522,  0.4364, -1.2004],
        [ 2.9120, -0.2097,  0.6990],
        [ 0.0270,  1.2568, -0.0344],
        [ 0.2918, -0.8756,  2.4513],
        [ 2.5257,  1.0546,  1.3443],
        [-2.3078,  0.8173,  1.0562]])


torch.Size([10, 3])

In [None]:

# batched matrix x batched matrix
tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(10, 4, 5)

# print(torch.matmul(tensor1, tensor2))
torch.matmul(tensor1, tensor2).size()

torch.Size([10, 3, 5])

In [None]:

# batched matrix x broadcasted matrix
tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(4, 5)

# print(torch.matmul(tensor1, tensor2))
torch.matmul(tensor1, tensor2).size()

torch.Size([10, 3, 5])
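The broadcasted cases above can be checked against an explicit loop over the batch axis. This is only a sketch for intuition, with illustrative variable names: `torch.matmul` gives the same result as multiplying each of the 10 matrices separately.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is reproducible
batch = torch.randn(10, 3, 4)
vec = torch.randn(4)
mat = torch.randn(4, 5)

# Batched matrix x broadcasted vector: each (3, 4) matrix times the same vector.
looped_vec = torch.stack([m @ vec for m in batch])  # shape (10, 3)

# Batched matrix x broadcasted matrix: each (3, 4) matrix times the same (4, 5) matrix.
looped_mat = torch.stack([m @ mat for m in batch])  # shape (10, 3, 5)

same_vec = torch.allclose(torch.matmul(batch, vec), looped_vec, atol=1e-5)
same_mat = torch.allclose(torch.matmul(batch, mat), looped_mat, atol=1e-5)

print(same_vec, same_mat)                   # True True
print(looped_vec.shape, looped_mat.shape)   # torch.Size([10, 3]) torch.Size([10, 3, 5])
```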

# **3. Deep Learning Inner Product**
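- 8 answers "How many samples are there?" (the amount of data, i.e. the batch size)
- 3 answers "How many numbers represent each sample?" (the width of the hidden layer)
- Make sure you understand the sizes below.

>> Input size: torch.Size([8, 4])

>> Output size: torch.Size([8, 3])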

In [None]:
import torch.nn as nn

# 1. Define the input data (Batch_size=8, Input_features=4)
input_data = torch.randn(8, 4)

# 2. Create a linear layer
# 4 input features -> 3 hidden-layer nodes
layer = nn.Linear(in_features=4, out_features=3) # What happens if you change out_features?
# layer = nn.Linear(in_features=4, out_features=128)

# 3. Compute the hidden layer (multiply by the weights, then add the bias)
hidden_output = layer(input_data)

# 4. Apply an activation function (ReLU, etc.)
relu = nn.ReLU()
activated_output = relu(hidden_output)

print(f"Input size: {input_data.shape}")        # torch.Size([8, 4])
print(f"Output size: {activated_output.shape}") # torch.Size([8, 3])

Input size: torch.Size([8, 4])
Output size: torch.Size([8, 3])
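The linear layer is itself an inner product. As a sketch (variable names are illustrative), `nn.Linear` stores its weight with shape `(out_features, in_features)` and computes `x @ W.T + b`, which is why an `(8, 4)` input becomes an `(8, 3)` output.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the run is reproducible
x = torch.randn(8, 4)
layer = nn.Linear(in_features=4, out_features=3)

print(layer.weight.shape)  # torch.Size([3, 4])
print(layer.bias.shape)    # torch.Size([3])

# Reproduce the layer by hand: (8, 4) @ (4, 3) + (3,) -> (8, 3)
manual = x @ layer.weight.T + layer.bias
matches = torch.allclose(layer(x), manual, atol=1e-5)

print(manual.shape)  # torch.Size([8, 3])
print(matches)       # True
```

Changing `out_features` only changes the number of rows in the weight matrix, and therefore the last dimension of the output.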


# **4. Inner Products of Various Tensors [click](https://www.youtube.com/watch?v=pPIFauuiwEU)**

In [None]:
import torch

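# 1. Create the data shown in the image
# Matrix A (5 x 4)
A = torch.randn(5, 4)

# 3D Tensor (4 x 3 x 2)
tensor_3d = torch.randn(4, 3, 2)

# 2. Perform the operation
# matmul contracts the last axis of its first argument, so with matmul you would
# permute tensor_3d to (3, 2, 4) and multiply it by A transposed, (4, 5).
# The most intuitive approach, however, is einsum (see below).

# einsum, used in the next cell, follows the logic of the image most closely.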

In [None]:
# 'ij' (Matrix A: 5x4)
# 'jkl' (3D Tensor: 4x3x2)
# 'j' is the shared dimension, so the inner product is taken over j and j disappears.
# The remaining dimensions are i, k, l, i.e. (5, 3, 2).

result = torch.einsum('ij,jkl->ikl', A, tensor_3d)

print(f"Matrix A size: {A.shape}")             # torch.Size([5, 4])
print(f"3D tensor size: {tensor_3d.shape}")    # torch.Size([4, 3, 2])
print(f"Result tensor size: {result.shape}")   # torch.Size([5, 3, 2])

Matrix A size: torch.Size([5, 4])
3D tensor size: torch.Size([4, 3, 2])
Result tensor size: torch.Size([5, 3, 2])
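The same contraction over `j` can be written without `einsum`. The sketch below (variable names are illustrative) compares three equivalent routes: `torch.tensordot`, a reshape to an ordinary matrix product, and the permute-then-`matmul` route mentioned in the comments above.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is reproducible
A = torch.randn(5, 4)
T = torch.randn(4, 3, 2)

via_einsum = torch.einsum('ij,jkl->ikl', A, T)

# Contract axis 1 of A with axis 0 of T.
via_tensordot = torch.tensordot(A, T, dims=([1], [0]))

# Flatten (3, 2) into 6 columns, do a plain (5, 4) @ (4, 6), then restore the shape.
via_reshape = (A @ T.reshape(4, 6)).reshape(5, 3, 2)

# Move j to the last axis: (3, 2, 4) @ (4, 5) -> (3, 2, 5), then reorder to (5, 3, 2).
via_matmul = torch.matmul(T.permute(1, 2, 0), A.T).permute(2, 0, 1)

all_equal = all(
    torch.allclose(via_einsum, other, atol=1e-5)
    for other in (via_tensordot, via_reshape, via_matmul)
)
print(via_matmul.shape)  # torch.Size([5, 3, 2])
print(all_equal)         # True
```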


# **5. Understanding Chapter 8 Image classification [click](https://deeplearningwithpython.io/chapters/chapter08_image-classification/)**

<img src ='https://deeplearningwithpython.io/images/ch08/how_convolution_works.fb611af4.png' width = 400>




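- Input: (5, 5, 2), i.e. 5×5 spatial size with 2 channels
- Kernels: 3 kernels, each of shape (3, 3, 2)
- Output: (3, 3, 3), i.e. 3×3 spatial size with 3 channels
- Spatial size shrinks: 5 - 3 + 1 = 3 (with no padding)
- Depth: determined by the number of kernels (3)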
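The spatial-size arithmetic above is a special case of the usual output-size formula, `(n + 2*padding - k) // stride + 1`. The helper below is a hypothetical function written for this note, not part of PyTorch.

```python
def conv_output_size(n, k, padding=0, stride=1):
    """Output length along one spatial axis for input size n and kernel size k."""
    return (n + 2 * padding - k) // stride + 1

print(conv_output_size(5, 3))             # 3  (the case above: 5 - 3 + 1)
print(conv_output_size(5, 3, padding=1))  # 5  (padding of 1 keeps the size)
print(conv_output_size(5, 3, stride=2))   # 2
```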

In [54]:
import torch
import torch.nn.functional as F

# Input Feature Map: (Height=5, Width=5, Depth=2)
Input_Feature_Map = torch.randn(5, 5, 2)

# PyTorch convolution expects (batch, channels, height, width), so convert:
# (5, 5, 2) → (1, 2, 5, 5)
input_tensor = Input_Feature_Map.permute(2, 0, 1).unsqueeze(0)
print(f"Input shape: {input_tensor.shape}")  # torch.Size([1, 2, 5, 5])

# Define 3 kernels: conceptually each kernel has shape (3, 3, 2)
# In PyTorch the layout is (out_channels, in_channels, kernel_height, kernel_width)
num_kernels = 3  # Output depth
kernel_size = 3
in_channels = 2

kernels = torch.randn(num_kernels, in_channels, kernel_size, kernel_size)
print(f"Kernels shape: {kernels.shape}")  # torch.Size([3, 2, 3, 3])

# Run the convolution (padding=0, stride=1)
output = F.conv2d(input_tensor, kernels, padding=0, stride=1)
print(f"Output shape: {output.shape}")  # torch.Size([1, 3, 3, 3])

# Convert the result back to (Height, Width, Depth)
Output_Feature_Map = output.squeeze(0).permute(1, 2, 0)
print(f"Output Feature Map shape: {Output_Feature_Map.shape}")  # torch.Size([3, 3, 3])

print("\n=== Details ===")
print(f"Input: {Input_Feature_Map.shape} → (5, 5, 2)")
print(f"Number of kernels: {num_kernels}, each kernel shape: (3, 3, 2)")
print(f"Output: {Output_Feature_Map.shape} → (3, 3, 3)")
print("\nSpatial size: 5×5 → 3×3 (shrinks by 2 pixels per dimension because padding=0)")
print("Depth: 2 → 3 (one output channel per kernel)")

Input shape: torch.Size([1, 2, 5, 5])
Kernels shape: torch.Size([3, 2, 3, 3])
Output shape: torch.Size([1, 3, 3, 3])
Output Feature Map shape: torch.Size([3, 3, 3])

=== Details ===
Input: torch.Size([5, 5, 2]) → (5, 5, 2)
Number of kernels: 3, each kernel shape: (3, 3, 2)
Output: torch.Size([3, 3, 3]) → (3, 3, 3)

Spatial size: 5×5 → 3×3 (shrinks by 2 pixels per dimension because padding=0)
Depth: 2 → 3 (one output channel per kernel)


### **Key Point (Input Patch @ Kernel = Output Patch)**

- If an input patch has shape (3, 3, 2), the kernel must have **exactly the same shape, (3, 3, 2)**.
- The inner product is computed as follows:

>> 1. Input patch: (3, 3, 2) = 18 values in total --> flatten to (1, 18)

>> 2. Kernel: (3, 3, 2) = 18 weights in total --> flatten to (1, 18)

>> 3. Operation: multiply the values element-wise at each position, then sum them all (**Convolution**)

>> 4. (3×3×2 = 18 multiplications) → sum everything → a single scalar value



- The output depth is 3 because there are 3 kernels.
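The flatten-and-dot-product view can be checked numerically. The sketch below uses illustrative variable names and the PyTorch channel-first layout, so a patch is stored as (2, 3, 3) rather than (3, 3, 2). It first reproduces one output value from a single patch, then uses `F.unfold` to extract all 9 patches and compute the whole convolution as one matrix product.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so the run is reproducible
x = torch.randn(1, 2, 5, 5)        # (batch, channels, height, width)
kernels = torch.randn(3, 2, 3, 3)  # (out_channels, in_channels, kh, kw)
reference = F.conv2d(x, kernels)   # shape (1, 3, 3, 3)

# One output value = one flattened patch (18 values) dotted with one flattened kernel.
patch = x[0, :, 0:3, 0:3]          # top-left patch
single = torch.dot(patch.reshape(-1), kernels[0].reshape(-1))
single_ok = torch.allclose(single, reference[0, 0, 0, 0], atol=1e-4)

# All patches at once: unfold yields 9 columns, each a flattened 18-value patch.
patches = F.unfold(x, kernel_size=3)        # shape (1, 18, 9)
out = kernels.reshape(3, 18) @ patches      # shape (1, 3, 9)
out = out.reshape(1, 3, 3, 3)
full_ok = torch.allclose(out, reference, atol=1e-4)

print(patches.shape)       # torch.Size([1, 18, 9])
print(single_ok, full_ok)  # True True
```

Each of the 3 rows of `kernels.reshape(3, 18)` produces one output channel, which is why the output depth equals the number of kernels.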