## Installing GPU-enabled TensorFlow on Windows (CUDA)

- [TensorFlow installation procedure](https://velog.io/@hsedmr/TensorFlow-%EC%84%A4%EC%B9%98-%EC%A0%88%EC%B0%A8)
- Note: TensorFlow 2.10 was the last TensorFlow release to support the GPU on native Windows.

#### - Recommended versions: Windows [py39_gpu210]
- TensorFlow-2.10.0
- Python-3.9 (3.7-3.10)
- CUDA-11.2
- cuDNN-8.1.0

#### - Colab versions: Ubuntu
- TensorFlow-2.15.0
- Python-3.10.12 (3.9-3.11) (numpy 1.25.2) (pandas 2.0.3) (matplotlib 3.7.1)
- CUDA-12.2
- cuDNN-8.9.0

#### - Do It Deep Learning versions: Windows [doitdl], CPU build
- TensorFlow-2.13.0 (2.0rc1/rc2/2.0/2.1/2.2/2.3/2.4/2.5/2.6)
- Python-3.8 (3.6-3.9), numpy 1.16/1.18/1.19, scikit-learn 0.21/0.22/0.23/0.24
- CUDA-11.2
- cuDNN-8.1.0

#### - NLP versions: Windows [nlpdl]
- TensorFlow-2.0.0 (keras 2.3.1) (Gensim 3.8.1) (sklearn 0.21.3)
- Python (3.5-3.7) (numpy 1.16.5) (matplotlib 2.2.3) (nltk 3.4.5) (konlpy 0.5.1) (pandas 0.25.1)
- CUDA-10.0
- cuDNN-7.4.0
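
The two Windows GPU version sets above can be kept in one place so that the matching CUDA/cuDNN install command is generated rather than retyped. This is a minimal sketch: the dictionary layout and the helper name `conda_gpu_install_command` are illustrative assumptions, and the `nlpdl` Python version is taken as 3.7, the top of the listed 3.5-3.7 range.

```python
# Version sets transcribed from the lists above. The dictionary layout and the
# helper name `conda_gpu_install_command` are illustrative assumptions.
ENVIRONMENTS = {
    "py39_gpu210": {"tensorflow": "2.10.0", "python": "3.9", "cuda": "11.2", "cudnn": "8.1.0"},
    # Python 3.7 is assumed here as the top of the listed 3.5-3.7 range.
    "nlpdl": {"tensorflow": "2.0.0", "python": "3.7", "cuda": "10.0", "cudnn": "7.4.0"},
}


def conda_gpu_install_command(env_name):
    """Build a conda-forge command installing the CUDA/cuDNN pair for one environment."""
    versions = ENVIRONMENTS[env_name]
    return (
        "conda install -c conda-forge "
        f"cudatoolkit={versions['cuda']} cudnn={versions['cudnn']}"
    )


for name in ENVIRONMENTS:
    print(name, "->", conda_gpu_install_command(name))
```

Keeping the pairs together makes it harder to mix, say, CUDA 11.2 with cuDNN 7.4.0 by accident.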

### Step.1 Anaconda virtual environment

- conda create --name py39_gpu210 python=3.9
- conda create --name nlpdl python=3.7
- conda activate py39_gpu210
- List environments: conda env list
- Remove an environment: conda env remove -n ENV_NAME
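
After activating the environment, it can help to confirm from inside Python that the interpreter is the one you expect. The sketch below assumes that `conda activate` has set the `CONDA_DEFAULT_ENV` environment variable and uses the 3.7-3.10 range listed for TensorFlow 2.10; the function names are hypothetical helpers.

```python
import os
import sys


def python_in_range(version, low=(3, 7), high=(3, 10)):
    """Return True if (major, minor) lies within [low, high], inclusive."""
    return low <= tuple(version[:2]) <= high


def active_conda_env(environ=os.environ):
    """Name of the active conda environment, or None when no environment is active."""
    return environ.get("CONDA_DEFAULT_ENV")


print("conda env:", active_conda_env())
print("python:", tuple(sys.version_info[:2]),
      "in range" if python_in_range(sys.version_info) else "outside 3.7-3.10")
```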

In [1]:
import sys
sys.version 
# Example result: '3.9.13 (main, Aug 25 2022, 18:24:45) \n[Clang 12.0.0 ]'

'3.9.12 (main, Jun  1 2022, 06:34:44) \n[Clang 12.0.0 ]'

In [2]:
import numpy as np
print(np.__version__)
# 1.21.6

1.23.4


In [6]:
import matplotlib.pyplot as plt

### Step.2 GPU setup: installing CUDA and cuDNN

- For py39_gpu210: conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0
- For nlpdl: conda install -c conda-forge cudatoolkit=10.0 cudnn=7.4.0
- Verify the CUDA installation: nvcc --version

In [3]:
!nvcc --version

zsh:1: command not found: nvcc
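
The `command not found` output above only says that `nvcc` is not on `PATH`. The following sketch, using only the standard library, performs the same check without raising a shell error. One assumption worth verifying for your setup: the conda `cudatoolkit` package is understood to provide runtime libraries rather than the `nvcc` compiler, so a missing `nvcc` may not by itself mean that TensorFlow cannot use the GPU.

```python
import shutil
import subprocess


def nvcc_version(which=shutil.which):
    """Return nvcc's version banner, or None if nvcc is not on PATH."""
    path = which("nvcc")
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


banner = nvcc_version()
print(banner if banner else "nvcc not found on PATH")
```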


### Step.3 Installing TensorFlow

- pip install --upgrade pip
- TensorFlow releases above 2.10 do not support the GPU on native Windows, so pin the version below 2.11:
- pip install "tensorflow<2.11"
- pip install tensorflow==2.5.0
- With CUDA:
- conda install tensorflow==2.0.0
- conda install tensorflow-gpu==2.0.0
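
The `"tensorflow<2.11"` pin can also be checked after installation. This is a small sketch with hypothetical helper names; it parses the version string by hand so that it has no dependencies, and it encodes only the rule stated above (GPU support on native Windows ends with 2.10).

```python
def parse_version(text):
    """Turn '2.10.0' into (2, 10, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def native_windows_gpu_supported(tf_version):
    """True if the release is below 2.11, the last line with native Windows GPU support."""
    return parse_version(tf_version) < (2, 11)


for version in ("2.9.0", "2.10.0", "2.11.0"):
    print(version, native_windows_gpu_supported(version))
```

In a notebook, the same check can be applied to the installed package with `native_windows_gpu_supported(tf.__version__)`.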

In [1]:
import tensorflow as tf
tf.__version__



'2.9.0'

In [3]:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import numpy as np
import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.debugging.set_log_device_placement(True)

def x_norm(x): 
    x = np.array(x)
    x = x / 100 * 0.99 + 0.01
    return x

x = [12.0, 28.0, 36.5, 42.0, 29.8]
x = x_norm(x)

y = [53.6, 82.4, 97.7, 107.6, 85.64]

# Check if TensorFlow is using GPU
gpus = tf.config.experimental.list_physical_devices('GPU')
print('Number of GPUs:', len(gpus))
if gpus:
    print('GPU available')
else:
    print('GPU not available')

# random_uniform -> random.uniform
W = tf.Variable(tf.random.uniform([1], -1.0, 1.0), name="Weight")
b = tf.Variable(tf.random.uniform([1], -1.0, 1.0), name="Bias")

X = tf.placeholder(tf.float32, name="X")
Y = tf.placeholder(tf.float32, name="Y")

hypothesis = tf.add(tf.multiply(W, X), b)
print(hypothesis)

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.3)
cost = tf.reduce_mean(tf.square(Y - hypothesis))
train_op = optimizer.minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    print(sess.run(W), sess.run(b))

    for step in range(10000):
        _, cost_val = sess.run([train_op, cost], feed_dict={X: x, Y: y})

        if step % 1000 == 0:
            print("Step: ", step, "  Cost: ", cost_val, "  W: ", sess.run(W), "  b: ", sess.run(b))

    print("X: 20, Y:", sess.run(hypothesis, feed_dict={X: x_norm(20)}))
    print("X: 30, Y:", sess.run(hypothesis, feed_dict={X: x_norm(30)}))
    print("X: 40, Y:", sess.run(hypothesis, feed_dict={X: x_norm(40)}))
    print("X: 50, Y:", sess.run(hypothesis, feed_dict={X: x_norm(50)}))
    print("X: 60, Y:", sess.run(hypothesis, feed_dict={X: x_norm(60)}))


Number of GPUs: 1
GPU available
Tensor("Add_1:0", dtype=float32)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: METAL, pci bus id: <undefined>

random_uniform/RandomUniform: (RandomUniform): /job:localhost/replica:0/task:0/device:GPU:0
random_uniform/sub: (Sub): /job:localhost/replica:0/task:0/device:GPU:0
random_uniform/mul: (Mul): /job:localhost/replica:0/task:0/device:GPU:0
random_uniform: (AddV2): /job:localhost/replica:0/task:0/device:GPU:0
Weight: (VariableV2): /job:localhost/replica:0/task:0/device:CPU:0
Weight/Assign: (Assign): /job:localhost/replica:0/task:0/device:CPU:0
Weight/read: (Identity): /job:localhost/replica:0/task:0/device:CPU:0
random_uniform_1/RandomUniform: (RandomUniform): /job:localhost/replica:0/task:0/device:GPU:0
random_uniform_1/sub: (Sub): /job:localhost/replica:0/task:0/device:GPU:0
random_uniform_1/mul: (Mul): /job:localhost/replica:0/task:0/device:GPU:0
random_uniform_1: (AddV2): /job:localhost/replica:0/task:0/device:GPU:0
Bias
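
The graph in the cell above performs plain gradient descent on the mean squared error of a line `W * X + b`. The sketch below reproduces that computation in pure Python so the update rule is visible without TensorFlow. It is not the notebook's code: the weights start at 0.0 instead of random values (an assumption made for reproducibility), and `fit_line` is a hypothetical helper name.

```python
def x_norm(x):
    """Same scaling as the cell above: map 0..100 onto 0.01..1.0."""
    return x / 100 * 0.99 + 0.01


def fit_line(xs, ys, learning_rate=0.3, steps=10000, w=0.0, b=0.0):
    """Minimise mean((w*x + b - y)^2) with fixed-step gradient descent."""
    n = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [x_norm(v) for v in [12.0, 28.0, 36.5, 42.0, 29.8]]
ys = [53.6, 82.4, 97.7, 107.6, 85.64]
w, b = fit_line(xs, ys)

for raw in (20, 30, 40, 50, 60):
    print("X:", raw, "Y:", round(w * x_norm(raw) + b, 2))
```

The sample data satisfy `y = 1.8 * x + 32` on the raw inputs, so the fitted predictions should approach 68, 86, 104, 122, and 140.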

### Step.4 Verifying the TensorFlow installation

- Verify the CPU setup: python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"

- Verify the GPU setup: python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

In [14]:
# Verify the CPU setup
import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))

tf.Tensor(-1363.5104, shape=(), dtype=float32)


In [15]:
# Verify the GPU setup
import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
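
When several devices are present, a per-type count is easier to read than the raw list. The sketch below calls `tf.config.list_physical_devices()` with no argument to list every device type; `summarize_devices` is a hypothetical helper, and the fallback branch uses stand-in objects only so that the example runs where TensorFlow is not installed.

```python
from types import SimpleNamespace


def summarize_devices(devices):
    """Count devices by type, e.g. {'CPU': 1, 'GPU': 1}."""
    counts = {}
    for device in devices:
        counts[device.device_type] = counts.get(device.device_type, 0) + 1
    return counts


try:
    import tensorflow as tf
    devices = tf.config.list_physical_devices()
except ImportError:
    # Stand-in object so the sketch still runs without TensorFlow installed.
    devices = [SimpleNamespace(name="/physical_device:CPU:0", device_type="CPU")]

print(summarize_devices(devices))
```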


In [16]:

import tensorflow as tf

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0


model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
          loss='sparse_categorical_crossentropy',
          metrics=['accuracy'])


model.fit(x_train, y_train, epochs=5)

model.evaluate(x_test,  y_test, verbose=2)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
Epoch 1/5


2024-07-16 10:42:48.962016: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
2024-07-16 10:42:49.162307: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.


Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


2024-07-16 10:43:34.362353: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.


313/313 - 1s - loss: 0.0702 - accuracy: 0.9798 - 1s/epoch - 4ms/step


[0.07016460597515106, 0.9797999858856201]
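
The `313/313` in the evaluation line is the number of batches, not the number of samples. As a quick check, assuming Keras's default `batch_size` of 32 and the standard MNIST split of 60,000 training and 10,000 test images:

```python
import math


def steps_per_epoch(num_samples, batch_size=32):
    """Number of batches needed to cover num_samples once (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)


print("training steps per epoch:", steps_per_epoch(60000))
print("evaluation steps:", steps_per_epoch(10000))
```

The evaluation count matches the `313/313` shown above; the two numbers returned by `model.evaluate` are the loss and the accuracy, in the order given to `model.compile`.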

### Step.5 Adding the kernel to Jupyter Notebook

- pip install ipykernel

- python -m ipykernel install --user --name py39_gpu210 --display-name "py39_gpu210"

- List kernels: jupyter kernelspec list
- Remove a kernel: jupyter kernelspec uninstall venv
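
Once the kernel is registered, a quick way to confirm that a notebook is really running in the intended environment is to inspect `sys.executable` from a cell. This is a sketch with a hypothetical helper name; it assumes the environment name appears as a directory in the interpreter path, which is the usual conda layout.

```python
import re
import sys


def executable_in_env(executable, env_name):
    """True if env_name is one of the directory components of the interpreter path."""
    parts = re.split(r"[\\/]+", executable)
    return env_name in parts


print(sys.executable)
print("running in py39_gpu210:", executable_in_env(sys.executable, "py39_gpu210"))
```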