# Configuration
To configure my environment I used:
```
srun --time=2:00:00 --mem=8G --ntasks 1 --gres=gpu:1 --partition=plgrid-gpu-v100 --account=plglscclass24-gpu --pty /bin/bash
conda config --add envs_dirs ${SCRATCH}/.conda/envs
conda config --add pkgs_dirs ${SCRATCH}/.conda/pkgs
conda activate $PLG_GROUPS_STORAGE/plgglscclass/.conda/envs/tf2-gpu
conda create --name lab2 --clone $PLG_GROUPS_STORAGE/plgglscclass/.conda/envs/tf2-gpu

conda activate lab2
conda install -c conda-forge cupy pandas matplotlib numpy

cp $PLG_GROUPS_STORAGE/plgglscclass/lsc_lab02.ipynb $HOME
jupyter notebook --no-browser --port=2324 --ip=ag0008
```

In [1]:
import tensorflow as tf
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Flatten

# Load and preprocess the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

physical_devices = tf.config.experimental.list_physical_devices('GPU')
print(physical_devices)

# Define the model
model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=5)

# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print('\nTest accuracy:', test_acc)

2025-03-17 18:54:29.644766: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-03-17 18:54:30.031007: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-03-17 18:54:30.236374: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-03-17 18:54:30.268932: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-03-17 18:54:30.877575: I tensorflow/core/platform/cpu_feature_guar

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


  super().__init__(**kwargs)
2025-03-17 18:54:39.668583: I tensorflow/core/common_runtime/gpu/gpu_device.cc:2021] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31141 MB memory:  -> device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:5b:00.0, compute capability: 7.0


Epoch 1/5


I0000 00:00:1742234083.780742 2367196 service.cc:146] XLA service 0x15183801b590 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1742234083.781656 2367196 service.cc:154]   StreamExecutor device (0): Tesla V100-SXM2-32GB, Compute Capability 7.0
2025-03-17 18:54:43.958958: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:268] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2025-03-17 18:54:44.281544: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:531] Loaded cuDNN version 90300


 146/1875 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.6616 - loss: 1.1456

I0000 00:00:1742234085.249079 2367196 device_compiler.h:188] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.


1875/1875 ━━━━━━━━━━━━━━━━━━━━ 4s 982us/step - accuracy: 0.8784 - loss: 0.4247
Epoch 2/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 4s 967us/step - accuracy: 0.9643 - loss: 0.1216
Epoch 3/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 3s 973us/step - accuracy: 0.9768 - loss: 0.0771
Epoch 4/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 3s 969us/step - accuracy: 0.9832 - loss: 0.0558
Epoch 5/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 3s 968us/step - accuracy: 0.9866 - loss: 0.0432
313/313 - 1s - 3ms/step - accuracy: 0.9767 - loss: 0.0770

Test accuracy: 0.9767000079154968


In [1]:
import numpy as np
import cupy as cp
import time

# Matrix size
N = 10000

# CPU-based matrix multiplication using NumPy
# Note: np.random.rand returns float64, and the timer also covers matrix generation
start_time = time.time()
A_cpu = np.random.rand(N, N)
B_cpu = np.random.rand(N, N)
C_cpu = np.dot(A_cpu, B_cpu)
print(C_cpu[1, 1])
cpu_time = time.time() - start_time
print(f"CPU time: {cpu_time:.2f} seconds")

# GPU-based matrix multiplication using CuPy
# Note: float32 is used here, so the comparison with the float64 CPU run is not like-for-like
start_time = time.time()
A_gpu = cp.random.rand(N, N, dtype=cp.float32)
B_gpu = cp.random.rand(N, N, dtype=cp.float32)
C_gpu = cp.dot(A_gpu, B_gpu)
# Printing copies the element to the host, which waits for the GPU to finish
print(C_gpu[1, 1])
gpu_time = time.time() - start_time
print(f"GPU time: {gpu_time:.2f} seconds")

2486.638899912158
CPU time: 27.88 seconds
2509.3665
GPU time: 3.53 seconds
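
The two timings above mix the cost of random matrix generation with the cost of the product, and the two runs use different precisions. A minimal sketch of a stricter comparison could look like the following. It assumes the same dtype on both sides and times only the multiplication. The helper `time_matmul` and the smaller `N` are illustrative and not part of the original notebook, and the CuPy branch is skipped when CuPy or a CUDA device is not available.

```python
import time

import numpy as np

try:
    import cupy as cp  # optional: needs a CUDA-capable machine
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA driver/device
except Exception:
    cp = None


def time_matmul(xp, n, dtype, sync=None):
    """Time only the n x n matrix product; input generation is excluded."""
    a = xp.random.rand(n, n).astype(dtype)
    b = xp.random.rand(n, n).astype(dtype)
    if sync is not None:
        sync()  # make sure the inputs exist before the timer starts
    start = time.perf_counter()
    c = xp.dot(a, b)
    if sync is not None:
        sync()  # GPU kernels are asynchronous; wait until the product is done
    return time.perf_counter() - start, c


N = 2000  # assumption: smaller than the notebook's 10000 to keep the sketch quick

cpu_time, c_cpu = time_matmul(np, N, np.float32)
print(f"CPU time (float32, product only): {cpu_time:.3f} seconds")

if cp is not None:
    sync = cp.cuda.Stream.null.synchronize
    time_matmul(cp, 256, cp.float32, sync)  # warm-up run, result discarded
    gpu_time, c_gpu = time_matmul(cp, N, cp.float32, sync)
    print(f"GPU time (float32, product only): {gpu_time:.3f} seconds")
else:
    print("CuPy or a CUDA device is not available; GPU run skipped")
```

The warm-up call is there because the first GPU operation in a process usually pays one-time initialization costs that would otherwise be counted in the measurement.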


# Problems

I had problems with the second cell. When I first ran it, the GPU part failed with a memory allocation error. After restarting Jupyter, the problem disappeared.
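
One possible cause, which is an assumption and was not verified in this lab, is that the TensorFlow cell had already reserved most of the GPU memory in the same kernel process. By default TensorFlow maps nearly all of the GPU memory, which leaves little for CuPy. A minimal sketch of a workaround is to enable memory growth before TensorFlow touches the GPU. The helper name `enable_memory_growth` is hypothetical.

```python
try:
    import tensorflow as tf
except ImportError:
    tf = None  # the sketch degrades to a no-op without TensorFlow


def enable_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand.

    Must run before any GPU operation in the process; afterwards
    TensorFlow raises a RuntimeError. Returns the number of GPUs configured.
    """
    if tf is None:
        return 0
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


configured = enable_memory_growth()
print(f"Memory growth enabled on {configured} GPU(s)")
```

This would go at the very top of the notebook, before the model is built. Restarting the kernel, as done here, also releases the memory because it ends the process that held it.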