# System Preparation 

This notebook collects reference notes on getting set up to develop with SUBER.

- [GPU Preparation](#gpu_prep)
- [PyTorch Installation](#torch_install)
- [PyTorch Examples]()
- [Transformers and Tokenizers]()

TODO -- fix links above.


## <a id="gpu_prep">GPU Preparation</a>

The NVIDIA driver and nvcc (the CUDA toolkit) should be on the same CUDA major version. I've noticed that the Ubuntu 22.04 installs on some systems are badly out of **alignment**, so try to get both to the same version.


```
sudo apt-get purge 'nvidia*' 'cuda*'

sudo apt-get install nvidia-driver-535

sudo reboot

wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run

chmod a+x cuda_12.2.0_535.54.03_linux.run
 
sudo ./cuda_12.2.0_535.54.03_linux.run # And follow the prompts

# Edit your .bashrc and add these two lines, without the leading hash marks.
# export PATH=/usr/local/cuda-12.2/bin${PATH:+:${PATH}}
# export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

# I also had to do this. If nvcc --version does not run, check the permissions.
sudo chmod -R 755 /usr/local/cuda-12.2
```
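
When `nvcc --version` does not run, it helps to know whether the binary is missing, not executable, or simply not on `PATH`. The following is a minimal sketch of such a check using only the Python standard library. The function name `diagnose_tool` and the constant `CUDA_BIN` are made up for this example, and the install directory is assumed to be the one used in the commands above.

```python
import os
import shutil

# Assumption: the toolkit lives where the runfile installer above put it.
CUDA_BIN = "/usr/local/cuda-12.2/bin"


def diagnose_tool(name, install_dir):
    """Return a short status string for an executable such as nvcc."""
    if shutil.which(name) is not None:
        return "ok"  # found on PATH and executable
    candidate = os.path.join(install_dir, name)
    # Note: if install_dir itself cannot be traversed by this user,
    # the file will also look missing here.
    if not os.path.exists(candidate):
        return "missing"
    if not os.access(candidate, os.X_OK):
        return "not executable"  # permissions problem, see the chmod above
    return "not on PATH"  # installed, but PATH does not include install_dir


print(diagnose_tool("nvcc", CUDA_BIN))
```

A result of `not on PATH` points at the `.bashrc` lines, while `not executable` or `missing` points at the permissions or the install itself.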

The results should be something like this:

```
acshell@ip-10-114-92-249:~$ nvidia-smi | grep -i "cuda version" | awk '{print $9}'
12.2
acshell@ip-10-114-92-249:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0

```
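
The same comparison can be scripted. The sketch below parses the `release X.Y` line from the nvcc output shown above and compares its major version with the value printed by the `nvidia-smi` pipeline. The helper names `nvcc_release` and `parse_version` are hypothetical, and the sample strings are copied from the session above rather than read from a live system.

```python
import re

# nvcc output copied from the sample session above.
NVCC_OUTPUT = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0
"""


def nvcc_release(text):
    """Extract (major, minor) from the 'release X.Y' line of nvcc --version."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no 'release X.Y' line found")
    return int(match.group(1)), int(match.group(2))


def parse_version(text):
    """Turn a string such as '12.2' into (12, 2)."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


driver_cuda = parse_version("12.2")  # value printed by the nvidia-smi pipeline
toolkit = nvcc_release(NVCC_OUTPUT)
print("same major version:", driver_cuda[0] == toolkit[0])
```

On a real machine the two strings could be captured with `subprocess.run(..., capture_output=True, text=True)` instead of being pasted in.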

## <a id="torch_install">PyTorch Installation</a>

The CUDA version that PyTorch was built against should be within one minor version of the installed CUDA toolkit, and the toolkit in turn needs to align with the NVIDIA driver. Pay close attention to versions to make this happen.

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

```
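
The alignment rule can be written down as a small check. This is only a sketch: the function name `cuda_alignment` is made up, and the two version strings are typed in by hand (12.1 for the `cu121` wheel, 12.2 for the toolkit installed earlier).

```python
def cuda_alignment(torch_cuda, toolkit_cuda):
    """Compare two 'major.minor' version strings.

    Returns (same_major, minor_gap), where minor_gap is the absolute
    difference between the two minor versions.
    """
    t_major, t_minor = (int(p) for p in torch_cuda.split(".")[:2])
    k_major, k_minor = (int(p) for p in toolkit_cuda.split(".")[:2])
    return t_major == k_major, abs(t_minor - k_minor)


same_major, minor_gap = cuda_alignment("12.1", "12.2")
print(same_major and minor_gap <= 1)  # True for the versions used in these notes
```

In a live session the first argument could be `torch.version.cuda`. That attribute is `None` on CPU-only builds, so it should be checked before being passed in.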


In [1]:

import torch
print(f"Is Cuda available? {torch.cuda.is_available()}.")  # Should return True
print(f"Torch Cuda Version is {torch.version.cuda}.")  # Should return '12.1'

import time

Is Cuda available? True.
Torch Cuda Version is 12.1.


### Torch Examples

Here are some examples showing that it works.

In [2]:
import torch
import time

# Define the size of the tensors
size = 10000

# Create two large random tensors for CPU
tensor1_cpu = torch.randn(size, size)
tensor2_cpu = torch.randn(size, size)

# Perform matrix multiplication on the CPU and time it
start_time = time.time()
result_cpu = torch.matmul(tensor1_cpu, tensor2_cpu)
end_time = time.time()

print(f"Matrix multiplication on CPU took: {end_time - start_time:.4f} seconds")
print(f"Result tensor size on CPU: {result_cpu.size()}")

# Check if CUDA is available and perform the same test on the GPU
if torch.cuda.is_available():
    device = torch.device("cuda")
    
    # Copy both tensors to the GPU so the same data is multiplied
    tensor1_gpu = tensor1_cpu.to(device)
    tensor2_gpu = tensor2_cpu.to(device)

    # Perform matrix multiplication on the GPU and time it
    torch.cuda.synchronize()  # Ensure all CUDA operations are finished
    start_time = time.time()
    result_gpu = torch.matmul(tensor1_gpu, tensor2_gpu)
    torch.cuda.synchronize()  # Ensure the GPU has finished the computation
    end_time = time.time()

    print(f"Matrix multiplication on GPU took: {end_time - start_time:.4f} seconds")
    print(f"Result tensor size on GPU: {result_gpu.size()}")
else:
    print("CUDA is not available on this system.")


Matrix multiplication on CPU took: 2.8040 seconds
Result tensor size on CPU: torch.Size([10000, 10000])
Matrix multiplication on GPU took: 0.2448 seconds
Result tensor size on GPU: torch.Size([10000, 10000])
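
The timing pattern in the cell above (synchronize, start the clock, run, synchronize, stop the clock) can be wrapped in a reusable helper. The sketch below is a generic version that takes the synchronization function as an argument, so it runs without a GPU. The name `time_call` and its parameters are assumptions made for this example.

```python
import time


def time_call(fn, sync=None, repeats=3):
    """Return the best wall-clock time of fn() over several runs.

    sync is an optional zero-argument callable that blocks until queued
    work has finished, e.g. torch.cuda.synchronize for GPU code.
    """
    best = float("inf")
    for _ in range(repeats):
        if sync is not None:
            sync()  # drain earlier work so it is not billed to this run
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for the work launched by fn() to complete
        best = min(best, time.perf_counter() - start)
    return best


# CPU-only demonstration; no GPU is needed to run this.
elapsed = time_call(lambda: sum(range(100_000)))
print(f"best of 3: {elapsed:.6f} seconds")
```

For the GPU case the call would look like `time_call(lambda: torch.matmul(tensor1_gpu, tensor2_gpu), sync=torch.cuda.synchronize)`. Taking the best of several runs is one way to keep one-time startup costs in the first GPU call from inflating the number.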


## Stable Baselines 3 

### Install Stable Baselines 3

```
pip install 'stable-baselines3[extra]'

```


In [3]:
import stable_baselines3
print(stable_baselines3.__version__)

2.3.2


### SB3 Example

Note: it takes many iterations and the right algorithm to get good results; this just shows that it works.

In [4]:
import gymnasium as gym
from stable_baselines3 import PPO
import matplotlib.pyplot as plt
from IPython import display
import time

# Create the CartPole-v1 environment with the "rgb_array" render mode
env = gym.make("CartPole-v1", render_mode="rgb_array")

# Create the PPO model (you can replace PPO with other algorithms if you want)
model = PPO("MlpPolicy", env, verbose=1)

# Train the agent for 10,000 steps
model.learn(total_timesteps=10000)

# Run the trained agent for 1,000 steps (frames are not displayed here)
obs, info = env.reset()

# Set up the plot for dynamic updates
#plt.ion()  # Turn on interactive mode for matplotlib
#fig, ax = plt.subplots()

for _ in range(1000):
    action, _states = model.predict(obs)
    obs, reward, terminated, truncated, info = env.step(action)

    # Start a new episode when the current one ends
    if terminated or truncated:
        obs, info = env.reset()

env.close()


Using cuda device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 22.8     |
|    ep_rew_mean     | 22.8     |
| time/              |          |
|    fps             | 614      |
|    iterations      | 1        |
|    time_elapsed    | 3        |
|    total_timesteps | 2048     |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 27.5        |
|    ep_rew_mean          | 27.5        |
| time/                   |             |
|    fps                  | 564         |
|    iterations           | 2           |
|    time_elapsed         | 7           |
|    total_timesteps      | 4096        |
| train/                  |             |
|    approx_kl            | 0.007973788 |
|    clip_fraction        | 0.0858      |
|    clip_range           | 0.2         |
|    entropy_loss  
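
The `ep_rew_mean` and `ep_len_mean` rows in the log match because CartPole gives a reward of 1 for every step. To judge the trained agent in the test loop, one option is to collect the return of each finished episode. The sketch below shows that bookkeeping on a made-up list of steps. `episode_returns` and the sample `log` are hypothetical and not part of Stable Baselines 3.

```python
def episode_returns(steps):
    """Sum rewards per episode from (reward, terminated, truncated) tuples."""
    returns, total = [], 0.0
    for reward, terminated, truncated in steps:
        total += reward
        if terminated or truncated:
            returns.append(total)
            total = 0.0
    return returns  # an unfinished final episode is ignored


# Made-up log: a 3-step episode that terminates, then a 2-step one that is truncated.
log = [(1.0, False, False), (1.0, False, False), (1.0, True, False),
       (1.0, False, False), (1.0, False, True)]
returns = episode_returns(log)
print(returns, sum(returns) / len(returns))  # [3.0, 2.0] 2.5
```

In the loop above, the same tuples could be appended after each `env.step(action)` call and passed to the function once the loop ends.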

## Install Transformers and Tokenizers

```
pip install -U transformers tokenizers

```

### Transformer and Tokenizer Example

In [14]:
from transformers import BertTokenizer, BertModel
import torch

# Load pre-trained BERT tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

# Sample text
text = "Transformers are amazing for NLP tasks."

# Tokenize the input text
inputs = tokenizer(text, return_tensors="pt")

# Get the tokenized input IDs
input_ids = inputs["input_ids"]

# Decode the token IDs back to text
decoded_text = tokenizer.decode(input_ids[0], skip_special_tokens=True)

# Print original text, tokenized input, and decoded text
print("Original Text:", text)
print("Tokenized Input IDs:", input_ids)
print("Decoded Text:", decoded_text)


Original Text: Transformers are amazing for NLP tasks.
Tokenized Input IDs: tensor([[  101, 19081,  2024,  6429,  2005, 17953,  2361,  8518,  1012,   102]])
Decoded Text: transformers are amazing for nlp tasks.
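
The output has ten IDs for a six-word sentence: 101 and 102 are BERT's `[CLS]` and `[SEP]` markers, the period gets its own ID, and "NLP" appears to be split into two pieces (17953, 2361). WordPiece tokenizers split words by greedy longest-match-first lookup against a vocabulary. The sketch below illustrates that idea on a toy vocabulary. The function `wordpiece` and `toy_vocab` are made up for this example and are not the Transformers implementation or BERT's real vocabulary.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one lowercase word."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece fits, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


# Toy vocabulary, made up for this example (not BERT's real one).
toy_vocab = {"transformers", "tasks", "nl", "##p"}
print(wordpiece("nlp", toy_vocab))    # ['nl', '##p']
print(wordpiece("tasks", toy_vocab))  # ['tasks']
```

To see the pieces the real tokenizer produces, `tokenizer.tokenize(text)` returns them as strings.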


### Sentence Transformers

```bash
pip install sentence-transformers

```

This package is needed as well.

In [6]:
from sentence_transformers import SentenceTransformer

# Load a pre-trained model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Encode a list of sentences
sentences = ["Transformers are amazing for NLP tasks.", "Sentence embeddings are useful."]
embeddings = model.encode(sentences)

# Print the sentence embeddings
print(embeddings)




[[-9.13389921e-02 -2.08391752e-02  3.78195643e-02 -1.00276303e-02
  -2.18986664e-02  6.64148200e-03 -4.42283116e-02  4.97135893e-02
   2.80648023e-02  1.45602422e-02  2.62472089e-02  8.02669078e-02
   4.97584185e-03  8.78880322e-02  4.61270250e-02  3.79769392e-02
   3.22095528e-02  1.52603416e-02 -4.78855930e-02 -8.71268734e-02
   1.09329306e-01  8.22059810e-02  1.47923604e-02 -5.11702746e-02
   5.15239798e-02  6.55859485e-02 -5.36913313e-02 -4.96964194e-02
   3.65458801e-02 -9.47064348e-03 -4.13955748e-02  5.72568886e-02
  -6.83491975e-02  5.84685169e-02 -6.26850203e-02  7.58528113e-02
   1.44350417e-02  1.83785856e-02  1.41708143e-02 -5.39689213e-02
  -3.47521044e-02 -1.90681927e-02  1.80511642e-02 -2.25276724e-02
   4.55970727e-02 -3.47323082e-02 -2.73609255e-02 -1.98490676e-02
  -8.14858940e-04 -2.79270653e-02 -3.95562313e-02 -5.84855042e-02
   5.34983799e-02  1.18243903e-01 -1.34977922e-02  2.81568766e-02
   8.00181460e-03 -5.52489385e-02  1.07748061e-02 -8.26784372e-02
  -7.50691
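
Raw embedding values are hard to read on their own. They are usually compared with cosine similarity, which is close to 1 for vectors pointing the same way and near 0 for unrelated ones. The sketch below is a plain-Python version run on made-up 3-dimensional vectors. The function name and the sample vectors are assumptions for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors; real sentence embeddings have many more dimensions.
u = [1.0, 0.0, 1.0]
v = [1.0, 0.0, 0.0]
print(round(cosine_similarity(u, v), 4))  # 0.7071
```

With the cell above, `cosine_similarity(embeddings[0], embeddings[1])` would compare the two example sentences. The library also provides `sentence_transformers.util.cos_sim` for the same purpose.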

## Setup Summary

More notes can be added here, but PyTorch and Stable Baselines 3 are the two main modules. Both will pull in extra requirements along the way, but these should not be a huge issue.