# Creating an Environment with Specific Python and PyTorch Versions on Colab
This document guides free-tier users through creating an execution environment with specific versions of Python, PyTorch, and TensorFlow on Google Colab.

It uses a Miniconda installer built for Python 3.8 to create a Python 3.7 environment that can run both PyTorch 1.7.1 and TensorFlow 2.1.0.

Author: [Tarek Liu](https://github.com/liuyuweitarek)

Please note:

1. How to deal with the Colab time limit?
  
  This document does not work around the time limit for free users, who still need to wait for a certain period before using Colab again. Therefore, please **make good use of checkpoints to save and resume training progress**.
2. Can I use a Colab cell as a real-time console?
  
  Unfortunately, I did not succeed. With this setup, code typed into a cell is not executed interactively inside the virtual environment; the only method that works is "activate the virtual environment" -> "run a file". This makes development and debugging more cumbersome. If you have a better approach, feel free to share it with me or submit a PR. Thank you!
3. Use a Miniconda installer whose bundled Python version is higher than the Python version you want in the virtual environment.
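
Note 1 can be put into practice with a small save-and-resume helper. The sketch below is framework-agnostic and uses only the standard library; the file layout of `state` and the helper names `save_checkpoint`, `load_checkpoint`, and `train` are assumptions made for illustration. With PyTorch, the same pattern is usually written with `torch.save` and `torch.load` on `model.state_dict()`.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write `state` to a temporary file, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(state, handle)
    # Renaming in one step avoids leaving a half-written checkpoint behind.
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the saved state, or `default` when no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as handle:
        return pickle.load(handle)


def train(checkpoint_path, total_epochs):
    """Toy loop that resumes from the last saved epoch instead of starting over."""
    state = load_checkpoint(checkpoint_path, {"epoch": 0, "history": []})
    for epoch in range(state["epoch"], total_epochs):
        state["history"].append(epoch)  # stand-in for one epoch of real training
        state["epoch"] = epoch + 1
        save_checkpoint(checkpoint_path, state)
    return state
```

Pointing `checkpoint_path` at a folder on the mounted Drive keeps the file available after the Colab session ends.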

## Connect to Google Drive


In [34]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Change to the working directory

In [35]:
import os
os.chdir("/content/drive/MyDrive/custom_env_colab")
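
The later cells refer to `requirements.txt`, `main.py`, and the cuDNN archive by relative path, so they are expected to sit in this working directory. A minimal sketch for checking that before continuing is shown here; the helper name `missing_files` and the `EXPECTED_FILES` list are assumptions for illustration, not part of the original notebook.

```python
import os

# Files that later cells reference by relative path.
EXPECTED_FILES = [
    "requirements.txt",
    "main.py",
    "cudnn-10.1-linux-x64-v7.6.5.32.tgz",
]


def missing_files(workdir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in `workdir`."""
    return [
        name for name in expected
        if not os.path.isfile(os.path.join(workdir, name))
    ]


workdir = "/content/drive/MyDrive/custom_env_colab"
if os.path.isdir(workdir):  # only meaningful once Drive is mounted
    print("missing:", missing_files(workdir))
```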

In [36]:
%env PYTHONPATH = # /env/python

env: PYTHONPATH=# /env/python
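
The cell above overrides Colab's default `PYTHONPATH`. The likely purpose (an assumption, since the notebook does not state it) is to keep Colab's own packages from leaking into the interpreter of the conda environment. The sketch below shows the same idea in plain Python by launching a child interpreter with `PYTHONPATH` removed; both helper names are hypothetical.

```python
import os
import subprocess
import sys


def env_without_pythonpath(base_env=None):
    """Return a copy of the environment with PYTHONPATH removed."""
    env = dict(os.environ if base_env is None else base_env)
    env.pop("PYTHONPATH", None)
    return env


def child_pythonpath(env):
    """Return what a child interpreter prints for PYTHONPATH ('None' when unset)."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ.get('PYTHONPATH'))"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()
```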


## Build the conda virtual environment

In [37]:
!wget https://repo.anaconda.com/miniconda/Miniconda3-py38_4.12.0-Linux-x86_64.sh
!sudo chmod +x Miniconda3-py38_4.12.0-Linux-x86_64.sh
!./Miniconda3-py38_4.12.0-Linux-x86_64.sh -b -f -p /usr/local
!conda update --yes conda

--2024-08-25 18:55:47--  https://repo.anaconda.com/miniconda/Miniconda3-py38_4.12.0-Linux-x86_64.sh
Resolving repo.anaconda.com (repo.anaconda.com)... 104.16.32.241, 104.16.191.158, 2606:4700::6810:bf9e, ...
Connecting to repo.anaconda.com (repo.anaconda.com)|104.16.32.241|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 76120962 (73M) [application/x-sh]
Saving to: ‘Miniconda3-py38_4.12.0-Linux-x86_64.sh.1’


2024-08-25 18:55:49 (56.5 MB/s) - ‘Miniconda3-py38_4.12.0-Linux-x86_64.sh.1’ saved [76120962/76120962]

PREFIX=/usr/local
Unpacking payload ...
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve

In [38]:
import sys
sys.path.append('/usr/local/lib/python3.8/site-packages')

In [39]:
!conda create -n myenv python=3.7 --yes

Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: done


  current version: 4.12.0
  latest version: 24.7.1

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /usr/local/envs/myenv

  added / updated specs:
    - python=3.7


The following NEW packages will be INSTALLED:

  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main
  _openmp_mutex      pkgs/main/linux-64::_openmp_mutex
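
Once `conda create` finishes, the new environment can be confirmed programmatically. `conda env list --json` prints a JSON object whose `envs` key lists environment paths; the sketch below parses such text. The helper name `has_env` and the sample string are illustrative assumptions, with the path taken from the package plan above.

```python
import json
import posixpath


def has_env(conda_env_list_json, name):
    """Return True if the JSON from `conda env list --json` contains envs/<name>."""
    for path in json.loads(conda_env_list_json)["envs"]:
        parent, leaf = posixpath.split(path.rstrip("/"))
        if leaf == name and posixpath.basename(parent) == "envs":
            return True
    return False


# Sample text in the shape conda prints; on Colab, capture the real output instead.
sample = '{"envs": ["/usr/local", "/usr/local/envs/myenv"]}'
print(has_env(sample, "myenv"))
```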

In [40]:
%%shell
eval "$(conda shell.bash hook)"
conda activate myenv
python -m pip install -r requirements.txt --default-timeout=3000
python -m pip install torch==1.7.1+cu101 --extra-index-url https://download.pytorch.org/whl --no-cache-dir

Collecting tensorflow-gpu==2.1.0
  Using cached tensorflow_gpu-2.1.0-cp37-cp37m-manylinux2010_x86_64.whl (421.8 MB)
Collecting protobuf==3.20.1
  Using cached protobuf-3.20.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.0 MB)
Collecting wrapt>=1.11.1
  Using cached wrapt-1.16.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (77 kB)
Collecting scipy==1.4.1
  Using cached scipy-1.4.1-cp37-cp37m-manylinux1_x86_64.whl (26.1 MB)
Collecting keras-applications>=1.0.8
  Using cached Keras_Applications-1.0.8-py3-none-any.whl (50 kB)
Collecting six>=1.12.0
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting termcolor>=1.1.0
  Using cached termcolor-2.3.0-py3-none-any.whl (6.9 kB)
Collecting tensorflow-estimator<2.2.0,>=2.1.0rc0
  Using cached tensorflow_estimator-2.1.0-py2.py3-none-any.whl (448 kB)
Collecting google-pasta>=0.1.6
  Using cached google_pasta-0.2.0-py3-none-any.whl (57 kB)
Collecting opt-einsum>=2.3.2
  Usi
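
After the installation, it can help to confirm which versions the environment actually resolves. A minimal sketch follows; it could be saved as a script (for example a hypothetical `check_env.py`) and run the same way as `main.py`. The helper names are assumptions, and the code avoids `importlib.metadata` because that module is not available on Python 3.7.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__, 'unknown' if it has none, or None if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


def report(module_names):
    """Map each module name to the version found in the current interpreter."""
    return {name: installed_version(name) for name in module_names}
```

Calling `print(report(["torch", "tensorflow"]))` inside the activated environment should show the versions that were pinned during installation.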



## Replace CUDA modules
If you encounter an error like:

```
cp: cannot stat '/usr/local/cuda-11.8/targets/x86_64-linux/lib/libcublas.so.11.11.3.6': No such file or directory
```
keep running the cells through [Execute your main function](https://colab.research.google.com/drive/14xMiWxgmwWKLHQr6zTGVLQ4JBdfdrcYJ#scrollTo=77KCEyR_XFkp), then see [What if you encounter a not-found error while replacing CUDA modules?](https://colab.research.google.com/drive/14xMiWxgmwWKLHQr6zTGVLQ4JBdfdrcYJ#scrollTo=O6XBpWU_kvn5&line=18&uniqifier=1) for details.

In [41]:
!cp -P /usr/local/cuda-11.8/targets/x86_64-linux/lib/libcublas.so.11.11.3.6 /usr/lib64-nvidia/libcublas.so.10
!cp -P /usr/local/cuda-11.8/targets/x86_64-linux/lib/libcusolver.so.11.4.1.48 /usr/lib64-nvidia/libcusolver.so.10
!cp -P /usr/local/cuda-11.8/targets/x86_64-linux/lib/libcusparse.so.11.7.5.86 /usr/lib64-nvidia/libcusparse.so.10
!cp -P /usr/local/cuda-11.8/targets/x86_64-linux/lib/libcudart.so.11.8.89 /usr/lib64-nvidia/libcudart.so.10.1

cp: cannot stat '/usr/local/cuda-11.8/targets/x86_64-linux/lib/libcublas.so.11.11.3.6': No such file or directory
cp: cannot stat '/usr/local/cuda-11.8/targets/x86_64-linux/lib/libcusolver.so.11.4.1.48': No such file or directory
cp: cannot stat '/usr/local/cuda-11.8/targets/x86_64-linux/lib/libcusparse.so.11.7.5.86': No such file or directory
cp: cannot stat '/usr/local/cuda-11.8/targets/x86_64-linux/lib/libcudart.so.11.8.89': No such file or directory


## Match the cuDNN version

In [42]:
!tar -xzvf cudnn-10.1-linux-x64-v7.6.5.32.tgz
!cp -P cuda/include/cudnn.h /usr/lib64-nvidia
!cp -P cuda/lib64/libcudnn* /usr/lib64-nvidia
!chmod a+r /usr/lib64-nvidia/libcudnn*

cuda/include/cudnn.h
cuda/NVIDIA_SLA_cuDNN_Support.txt
cuda/lib64/libcudnn.so
cuda/lib64/libcudnn.so.7
cuda/lib64/libcudnn.so.7.6.5
cuda/lib64/libcudnn_static.a


## Execute your main function here

In [43]:
%%shell
eval "$(conda shell.bash hook)"
conda activate myenv
python main.py

2024-08-25 18:58:32.929766: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2024-08-25 18:58:32.929919: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2024-08-25 18:58:32.929940: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2024-08-25 18:58:34.283375: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2024-08-25 18:58:34.303808: I tensorflow/stream_executor/cuda/cuda_g



## What if you encounter a "not found" error while replacing CUDA modules?
Because the CUDA version on Google Colab changes over time, the copy step may fail with an error such as:
```
cp: cannot stat '/usr/local/cuda-11.8/targets/x86_64-linux/lib/libcublas.so.11.11.3.6': No such file or directory
```
In that case, replace the files that the console reports as missing, or the files listed in the log below, with the corresponding versions available in the current runtime.

```
2024-08-25 18:14:43.631701: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2024-08-25 18:14:43.631808: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2024-08-25 18:14:43.631841: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2024-08-25 18:14:44.443595: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2024-08-25 18:14:44.461980: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-08-25 18:14:44.462181: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:00:04.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.75GiB deviceMemoryBandwidth: 298.08GiB/s
2024-08-25 18:14:44.462245: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2024-08-25 18:14:44.462357: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2024-08-25 18:14:44.462461: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2024-08-25 18:14:44.463143: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2024-08-25 18:14:44.463312: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2024-08-25 18:14:44.463431: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10'; dlerror: libcusparse.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2024-08-25 18:14:44.468145: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
```
These files are missing:

`libnvinfer.so.6` and `libnvinfer_plugin.so.6` belong to TensorRT; if you do not use that package, you can ignore them.

`libcublas.so.10`, `libcufft.so.10`, `libcusolver.so.10`, and `libcusparse.so.10` are the modules that really matter for GPU usage. Let's find them!
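
The list of missing libraries can also be pulled out of the log automatically. The sketch below is a minimal example that matches the `Could not load dynamic library '...'` wording shown in the log above; the function name and the choice to skip the TensorRT entries by default are assumptions for illustration.

```python
import re

MISSING_LIBRARY = re.compile(r"Could not load dynamic library '([^']+)'")
TENSORRT_PREFIX = "libnvinfer"  # TensorRT libraries, ignorable if TensorRT is unused


def missing_libraries(log_text, ignore_tensorrt=True):
    """Return the unique library names TensorFlow failed to load, in order of appearance."""
    names = []
    for name in MISSING_LIBRARY.findall(log_text):
        if ignore_tensorrt and name.startswith(TENSORRT_PREFIX):
            continue
        if name not in names:
            names.append(name)
    return names
```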

---
First, in a new cell, check which CUDA version the current Colab runtime provides:
```
!ls -ll /usr/local

#Output

total 132
drwxr-xr-x  1 root root  4096 Aug 25 16:06 bin
drwxr-xr-x  3 root root  4096 Aug 22 13:42 colab
drwxr-xr-x  2 root root  4096 Aug 25 16:06 compiler_compat
drwxr-xr-x  2 root root  4096 Aug 25 16:06 condabin
drwxr-xr-x  2 root root  4096 Aug 25 16:06 conda-meta
lrwxrwxrwx  1 root root    22 Nov 10  2023 cuda -> /etc/alternatives/cuda
lrwxrwxrwx  1 root root    25 Nov 10  2023 cuda-12 -> /etc/alternatives/cuda-12
drwxr-xr-x  1 root root  4096 Nov 10  2023 cuda-12.2
```
The runtime now ships `cuda-12.2`.
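
The same lookup can be done without reading the listing by eye. This is a minimal sketch that assumes the versioned toolkit folders follow the `cuda-X.Y` naming seen in the listing above; the function name is hypothetical.

```python
import glob
import os
import re


def versioned_cuda_dirs(root="/usr/local"):
    """Return real `cuda-X.Y` directories under `root`, newest version first."""
    found = []
    for path in glob.glob(os.path.join(root, "cuda-*")):
        match = re.fullmatch(r"cuda-(\d+)\.(\d+)", os.path.basename(path))
        # Skip symlinks such as `cuda-12` and anything that is not a directory.
        if match and os.path.isdir(path) and not os.path.islink(path):
            version = (int(match.group(1)), int(match.group(2)))
            found.append((version, path))
    return [path for _, path in sorted(found, reverse=True)]


print(versioned_cuda_dirs())
```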

Check whether counterparts of the missing files exist in the `cuda-12.2` folder:
```
# libcublas.so.10, libcufft.so.10, libcusolver.so.10, libcusparse.so.10
!ls -ll /usr/local/cuda-12.2/targets/x86_64-linux/lib/ | grep -E 'libcublas.so|libcufft.so|libcusolver.so|libcusparse.so'

# OUTPUT
lrwxrwxrwx 1 root root        15 Aug 16  2023 libcublas.so -> libcublas.so.12
lrwxrwxrwx 1 root root        21 Aug 16  2023 libcublas.so.12 -> libcublas.so.12.2.5.6
-rw-r--r-- 1 root root 106675248 Aug 16  2023 libcublas.so.12.2.5.6
lrwxrwxrwx 1 root root        14 Aug 16  2023 libcufft.so -> libcufft.so.11
lrwxrwxrwx 1 root root        22 Aug 16  2023 libcufft.so.11 -> libcufft.so.11.0.8.103
-rw-r--r-- 1 root root 178387496 Aug 16  2023 libcufft.so.11.0.8.103
lrwxrwxrwx 1 root root        17 Aug 16  2023 libcusolver.so -> libcusolver.so.11
lrwxrwxrwx 1 root root        25 Aug 16  2023 libcusolver.so.11 -> libcusolver.so.11.5.2.141
-rw-r--r-- 1 root root 115505432 Aug 16  2023 libcusolver.so.11.5.2.141
lrwxrwxrwx 1 root root        17 Aug 16  2023 libcusparse.so -> libcusparse.so.12
lrwxrwxrwx 1 root root        25 Aug 16  2023 libcusparse.so.12 -> libcusparse.so.12.1.2.141
-rw-r--r-- 1 root root 263825056 Aug 16  2023 libcusparse.so.12.1.2.141
```
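Now update the copy commands to point at the versions found above. According to the log, `libcudart.so.10.1` already loads successfully, so its line is dropped, and a line for `libcufft` is added. Change the commands from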
```
!cp -P /usr/local/cuda-11.8/targets/x86_64-linux/lib/libcublas.so.11.11.3.6 /usr/lib64-nvidia/libcublas.so.10
!cp -P /usr/local/cuda-11.8/targets/x86_64-linux/lib/libcusolver.so.11.4.1.48 /usr/lib64-nvidia/libcusolver.so.10
!cp -P /usr/local/cuda-11.8/targets/x86_64-linux/lib/libcusparse.so.11.7.5.86 /usr/lib64-nvidia/libcusparse.so.10
!cp -P /usr/local/cuda-11.8/targets/x86_64-linux/lib/libcudart.so.11.8.89 /usr/lib64-nvidia/libcudart.so.10.1
```
to
```
!cp -P /usr/local/cuda-12.2/targets/x86_64-linux/lib/libcublas.so.12.2.5.6 /usr/lib64-nvidia/libcublas.so.10
!cp -P /usr/local/cuda-12.2/targets/x86_64-linux/lib/libcufft.so.11.0.8.103 /usr/lib64-nvidia/libcufft.so.10
!cp -P /usr/local/cuda-12.2/targets/x86_64-linux/lib/libcusolver.so.11.5.2.141 /usr/lib64-nvidia/libcusolver.so.10
!cp -P /usr/local/cuda-12.2/targets/x86_64-linux/lib/libcusparse.so.12.1.2.141 /usr/lib64-nvidia/libcusparse.so.10
```
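
Since the version suffixes change whenever Colab updates CUDA, the copy list can also be derived from whatever files are present. The sketch below is one possible way to do that under stated assumptions: the `WANTED` mapping mirrors the target names in the commands above, and the helper names `real_library`, `plan_copies`, and `apply_plan` are hypothetical.

```python
import glob
import os
import shutil

# Target names TensorFlow asked for in the log, keyed by library stem.
WANTED = {
    "libcublas": "libcublas.so.10",
    "libcufft": "libcufft.so.10",
    "libcusolver": "libcusolver.so.10",
    "libcusparse": "libcusparse.so.10",
}


def real_library(lib_dir, stem):
    """Return the fully versioned (non-symlink) file for `stem`, or None if missing."""
    candidates = [
        path for path in glob.glob(os.path.join(lib_dir, stem + ".so.*"))
        if os.path.isfile(path) and not os.path.islink(path)
    ]
    return sorted(candidates)[-1] if candidates else None


def plan_copies(lib_dir, dest_dir, wanted=WANTED):
    """Pair each available source file with the file name TensorFlow expects."""
    plan = []
    for stem, target_name in sorted(wanted.items()):
        source = real_library(lib_dir, stem)
        if source is not None:
            plan.append((source, os.path.join(dest_dir, target_name)))
    return plan


def apply_plan(plan):
    """Copy each source file over its target name."""
    for source, target in plan:
        shutil.copy(source, target)
```

On the runtime shown above, `plan_copies("/usr/local/cuda-12.2/targets/x86_64-linux/lib", "/usr/lib64-nvidia")` would be expected to yield the same four source and target pairs as the `cp` commands; printing the plan before calling `apply_plan` makes it easy to review.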


## Fix the error and try again

In [44]:
!cp -P /usr/local/cuda-12.2/targets/x86_64-linux/lib/libcublas.so.12.2.5.6 /usr/lib64-nvidia/libcublas.so.10
!cp -P /usr/local/cuda-12.2/targets/x86_64-linux/lib/libcufft.so.11.0.8.103 /usr/lib64-nvidia/libcufft.so.10
!cp -P /usr/local/cuda-12.2/targets/x86_64-linux/lib/libcusolver.so.11.5.2.141 /usr/lib64-nvidia/libcusolver.so.10
!cp -P /usr/local/cuda-12.2/targets/x86_64-linux/lib/libcusparse.so.12.1.2.141 /usr/lib64-nvidia/libcusparse.so.10

In [45]:
%%shell
eval "$(conda shell.bash hook)"
conda activate myenv
python main.py

2024-08-25 19:00:23.478262: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2024-08-25 19:00:23.478383: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2024-08-25 19:00:23.478402: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2024-08-25 19:00:24.470802: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2024-08-25 19:00:24.489992: I tensorflow/stream_executor/cuda/cuda_g

