# System Preparation 

This notebook is a set of reference notes for getting set up to develop with SUBER.

- [GPU Preparation](#gpu_prep)
- [PyTorch Installation](#torch_install)
- [PyTorch Examples]()
- [Transformers and Tokenizers]()

TODO -- fix links above.

## Conda or Python Virtual Environments
I switched to ordinary Python virtual environments because Anaconda itself was becoming a chore. Adding a module I wanted would take forever and, in many cases, stall. The Python version used for this project is Python 3.10.12.
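
Before installing anything, it can help to confirm that the notebook kernel is actually running inside the virtual environment and on the expected interpreter. This is a minimal sketch using only the standard library. The helper names `in_virtualenv` and `python_matches` are made up for illustration, and the `(3, 10)` target comes from the version noted above.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


def python_matches(major: int = 3, minor: int = 10) -> bool:
    """Return True when the running interpreter is major.minor."""
    return sys.version_info[:2] == (major, minor)


print(f"Inside a virtual environment: {in_virtualenv()}")
print(f"Python {sys.version.split()[0]} (3.10 expected: {python_matches()})")
```

If the first line prints `False`, the kernel is most likely using the system interpreter rather than the environment you created.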


## <a id="gpu_prep">GPU Preparation</a>

The CUDA version reported by the GPU driver (`nvidia-smi`) and the CUDA toolkit version (`nvcc`) should share the same major version. I've noticed that Ubuntu 22.04 installs on some systems are badly out of **alignment**. Try to get them to the same version.


```
sudo apt-get purge 'nvidia*' 'cuda*'

sudo apt-get install nvidia-driver-535

sudo reboot

wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run

chmod a+x cuda_12.2.0_535.54.03_linux.run
 
sudo ./cuda_12.2.0_535.54.03_linux.run # And follow the prompts

# Add these two lines to your .bashrc, without the leading '#':
# export PATH=/usr/local/cuda-12.2/bin${PATH:+:${PATH}}
# export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

# If `nvcc --version` does not run afterwards, check the permissions. I also had to do this:
sudo chmod -R 755 /usr/local/cuda-12.2
```

The results should be something like this:

```
acshell@ip-10-114-92-249:~$ nvidia-smi | grep -i "cuda version" | awk '{print $9}'
12.2
acshell@ip-10-114-92-249:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0

```
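
The same comparison can be scripted instead of read by eye. The sketch below is one possible approach, not a definitive check. It assumes `nvidia-smi` prints a `CUDA Version: X.Y` field and `nvcc --version` prints `release X.Y`, as in the output above. The helper names (`parse_smi_cuda`, `parse_nvcc_release`, `run`) are hypothetical.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text: str):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_smi_cuda(text: str):
    """Extract (major, minor) from the `CUDA Version:` field of `nvidia-smi`."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def run(cmd):
    """Return the command's stdout, or None if the tool is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout if result.returncode == 0 else None


smi_out = run(["nvidia-smi"])
nvcc_out = run(["nvcc", "--version"])
driver_cuda = parse_smi_cuda(smi_out) if smi_out else None
toolkit_cuda = parse_nvcc_release(nvcc_out) if nvcc_out else None

if driver_cuda is None or toolkit_cuda is None:
    print(f"Could not read both versions: driver={driver_cuda}, toolkit={toolkit_cuda}")
elif driver_cuda[0] == toolkit_cuda[0]:
    print(f"OK: driver CUDA {driver_cuda} and nvcc {toolkit_cuda} share a major version")
else:
    print(f"Mismatch: driver CUDA {driver_cuda} vs nvcc {toolkit_cuda}")
```

A `None` on the toolkit side usually means `nvcc` is not on `PATH`, which points back to the `.bashrc` exports or the permissions fix above.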

## <a id="torch_install">PyTorch Installation</a>

You should do this first. If this doesn't work, nothing will. 

The CUDA version that PyTorch is built against should be within a minor version of the installed CUDA toolkit, and the toolkit in turn needs to align with the NVIDIA driver. Pay close attention to the versions to make this happen.
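
This rule of thumb can be checked before installing anything by comparing the wheel tag in the PyTorch index URL (for example `cu121`) with the toolkit version. The sketch below encodes only the rule stated in these notes (same major version, minor versions at most one apart). It is not an official compatibility matrix, and the function names are hypothetical.

```python
def parse_wheel_tag(tag: str):
    """Turn a PyTorch wheel tag such as 'cu121' into (12, 1)."""
    digits = tag[2:]
    if not tag.startswith("cu") or len(digits) < 2 or not digits.isdigit():
        raise ValueError(f"Unexpected wheel tag: {tag!r}")
    # Assumption: the last digit is the minor version, the rest is the major.
    return int(digits[:-1]), int(digits[-1])


def roughly_aligned(wheel_tag: str, toolkit: str) -> bool:
    """Apply the rule of thumb: same major, minor versions at most one apart."""
    wheel_major, wheel_minor = parse_wheel_tag(wheel_tag)
    tk_major, tk_minor = (int(part) for part in toolkit.split(".")[:2])
    return wheel_major == tk_major and abs(wheel_minor - tk_minor) <= 1


print(roughly_aligned("cu121", "12.2"))  # True: 12.1 wheel, 12.2 toolkit
print(roughly_aligned("cu118", "12.2"))  # False: major versions differ
```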

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

```


In [None]:

import torch

print(f"Is Cuda available? {torch.cuda.is_available()}.")  # Should print True
print(f"Torch Cuda Version is {torch.version.cuda}.")  # Should print 12.1

### Torch Examples

Here are some examples showing that it works.

In [None]:
import torch
import time

# Define the size of the tensors
size = 10000

# Create two large random tensors for CPU
tensor1_cpu = torch.randn(size, size)
tensor2_cpu = torch.randn(size, size)

# Perform matrix multiplication on the CPU and time it
start_time = time.time()
result_cpu = torch.matmul(tensor1_cpu, tensor2_cpu)
end_time = time.time()

print(f"Matrix multiplication on CPU took: {end_time - start_time:.4f} seconds")
print(f"Result tensor size on CPU: {result_cpu.size()}")

# Check if CUDA is available and perform the same test on the GPU
if torch.cuda.is_available():
    device = torch.device("cuda")
    
    # Copy the same two tensors to the GPU
    tensor1_gpu = tensor1_cpu.to(device)
    tensor2_gpu = tensor2_cpu.to(device)

    # Perform matrix multiplication on the GPU and time it
    torch.cuda.synchronize()  # Ensure all CUDA operations are finished
    start_time = time.time()
    result_gpu = torch.matmul(tensor1_gpu, tensor2_gpu)
    torch.cuda.synchronize()  # Ensure the GPU has finished the computation
    end_time = time.time()

    print(f"Matrix multiplication on GPU took: {end_time - start_time:.4f} seconds")
    print(f"Result tensor size on GPU: {result_gpu.size()}")
else:
    print("CUDA is not available on this system.")


## Stable Baselines 3 

### Install Stable Baselines 3

```
pip install "stable-baselines3[extra]"

```


In [None]:
import stable_baselines3
print(stable_baselines3.__version__)

### SB3 Example

Note that it takes many iterations and the proper algorithm to get good results; this just shows it working.

In [None]:
import gymnasium as gym
from stable_baselines3 import PPO

# Create the CartPole-v1 environment with the "rgb_array" render mode
env = gym.make("CartPole-v1", render_mode="rgb_array")

# Create the PPO model (you can replace PPO with other algorithms if you want)
model = PPO("MlpPolicy", env, verbose=1)

# Train the agent for 10,000 steps
model.learn(total_timesteps=10000)

# Run the trained agent for 1,000 steps, resetting whenever an episode ends
obs, info = env.reset()

for _ in range(1000):
    action, _states = model.predict(obs)
    obs, reward, terminated, truncated, info = env.step(action)

    if terminated or truncated:
        obs, info = env.reset()

env.close()


## Install Transformers and Tokenizers

```
pip install -U transformers tokenizers

```

### Transformer and Tokenizer Example

In [None]:
from transformers import BertTokenizer, BertModel
import torch

# Load pre-trained BERT tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

# Sample text
text = "Transformers are amazing for NLP tasks."

# Tokenize the input text
inputs = tokenizer(text, return_tensors="pt")

# Get the tokenized input IDs
input_ids = inputs["input_ids"]

# Decode the token IDs back to text
decoded_text = tokenizer.decode(input_ids[0], skip_special_tokens=True)

# Print original text, tokenized input, and decoded text
print("Original Text:", text)
print("Tokenized Input IDs:", input_ids)
print("Decoded Text:", decoded_text)


## Sentence Transformers

```bash
pip install sentence-transformers

```

This package is needed as well.

### Sentence Transformers Example

In [None]:
from sentence_transformers import SentenceTransformer

# Load a pre-trained model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Encode a list of sentences
sentences = ["Transformers are amazing for NLP tasks.", "Sentence embeddings are useful."]
embeddings = model.encode(sentences)

# Print the sentence embeddings
print(embeddings)


## Other Stuff

If you are using Jupyter notebook, be sure to install `jupyterlab` and `ipywidgets` with pip.

```bash
pip install jupyterlab ipywidgets
```

## Setup Summary

More notes can be added here, but PyTorch and Stable Baselines 3 are the two main modules. Extra dependencies required by both will come up but should not be a huge issue.
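
As a final check, the installed versions of the packages mentioned in these notes can be listed in one go. This is a minimal sketch using the standard-library `importlib.metadata`. The `PACKAGES` list and the `installed_versions` helper are illustrative, so adjust the list to whatever your setup actually needs.

```python
from importlib import metadata

# Distribution names as used in the pip commands throughout these notes.
PACKAGES = [
    "torch",
    "stable-baselines3",
    "transformers",
    "tokenizers",
    "sentence-transformers",
    "jupyterlab",
    "ipywidgets",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name:24s} {version or 'NOT INSTALLED'}")
```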