# Data Driven AI for Remote Sensing Hackathon - Sample Notebook for experimentation


### Install the required packages (skip if already installed)

In [28]:
!pip install -r requirements.txt

Collecting git+https://github.com/IBM/terratorch.git (from -r requirements.txt (line 17))
  Cloning https://github.com/IBM/terratorch.git to /tmp/pip-req-build-qhobajir
  Running command git clone --filter=blob:none --quiet https://github.com/IBM/terratorch.git /tmp/pip-req-build-qhobajir
  Resolved https://github.com/IBM/terratorch.git to commit 2683140e3862954f62212df3417208692c6b879a
  Installing build dependencies ... [?25ldone
[?25h  Getting requirements to build wheel ... [?25ldone
[?25h  Preparing metadata (pyproject.toml) ... [?25ldone


### Install necessary libraries (skip if already installed)

In [29]:
!sudo apt-get update && sudo apt-get install ffmpeg libsm6 libxext6  -y

Hit:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Get:2 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]   
Hit:3 http://archive.ubuntu.com/ubuntu jammy InRelease                         
Get:4 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]
Hit:5 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Fetched 257 kB in 1s (261 kB/s)
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
libsm6 is already the newest version (2:1.2.3-1build2).
libxext6 is already the newest version (2:1.3.4-1build1).
ffmpeg is already the newest version (7:4.4.2-0ubuntu0.22.04.1).
0 upgraded, 0 newly installed, 0 to remove and 14 not upgraded.


### Test the pytorch GPU setup

In [30]:
import torch
import time

def test_pytorch_gpu():
    print(f"PyTorch version: {torch.__version__}")
    
    # Check if CUDA is available
    if torch.cuda.is_available():
        print("CUDA is available. GPU can be used.")
        device = torch.device("cuda")
        print(f"Current CUDA device: {torch.cuda.current_device()}")
        print(f"GPU name: {torch.cuda.get_device_name(0)}")
    else:
        print("CUDA is not available. GPU cannot be used.")
        return
    
    # Create a large tensor on GPU
    size = 5000
    x = torch.randn(size, size, device=device)
    y = torch.randn(size, size, device=device)
    
    # Perform matrix multiplication
    start_time = time.time()
    result = torch.matmul(x, y)
    end_time = time.time()
    
    print(f"Matrix multiplication of {size}x{size} tensors took {end_time - start_time:.4f} seconds")
    
    # Verify the result
    print(f"Result shape: {result.shape}")
    print(f"Result sum: {result.sum().item():.4f}")
    
    # Move result back to CPU for further processing if needed
    result_cpu = result.cpu()
    print(f"Result successfully moved back to CPU. Shape: {result_cpu.shape}")


test_pytorch_gpu()

PyTorch version: 2.3.1.post300
CUDA is available. GPU can be used.
Current CUDA device: 0
GPU name: Tesla T4
Matrix multiplication of 5000x5000 tensors took 0.0001 seconds
Result shape: torch.Size([5000, 5000])
Result sum: -799450.8750
Result successfully moved back to CPU. Shape: torch.Size([5000, 5000])


### Create directories needed for data, model, and config preparations

In [31]:
!mkdir datasets
!mkdir models
!mkdir configs

mkdir: cannot create directory ‘datasets’: File exists
mkdir: cannot create directory ‘models’: File exists
mkdir: cannot create directory ‘configs’: File exists


### install git-lfs and clone the datasets

In [32]:
! sudo apt-get install git-lfs; git lfs install

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
git-lfs is already the newest version (3.0.2-1ubuntu0.2).
0 upgraded, 0 newly installed, 0 to remove and 14 not upgraded.
Updated git hooks.
Git LFS initialized.


In [33]:
### Clone the dataset. Should use this dataset for training.

In [34]:
! cd datasets; git clone https://huggingface.co/datasets/Muthukumaran/fire_scars_hackathon_dataset

fatal: destination path 'fire_scars_hackathon_dataset' already exists and is not an empty directory.


### Unzip the dataset and move it to the datasets directory. Takes a while to download and unzip.

In [35]:
! cd datasets; ! tar -xvzf fire_scars_hackathon_dataset/fire_scars_train_val.tar.gz

fire_scars_train_val/
fire_scars_train_val/train/
fire_scars_train_val/train/subsetted_512x512_HLS.S30.T10SEH.2018280.v1.4.mask.tif
fire_scars_train_val/train/subsetted_512x512_HLS.S30.T10SEH.2018280.v1.4_merged.tif
fire_scars_train_val/train/subsetted_512x512_HLS.S30.T10SEH.2019305.v1.4.mask.tif
fire_scars_train_val/train/subsetted_512x512_HLS.S30.T10SEH.2019305.v1.4_merged.tif
fire_scars_train_val/train/subsetted_512x512_HLS.S30.T10SEH.2020190.v1.4.mask.tif
fire_scars_train_val/train/subsetted_512x512_HLS.S30.T10SEH.2020190.v1.4_merged.tif
fire_scars_train_val/train/subsetted_512x512_HLS.S30.T10SEH.2020285.v1.4.mask.tif
fire_scars_train_val/train/subsetted_512x512_HLS.S30.T10SEH.2020285.v1.4_merged.tif
fire_scars_train_val/train/subsetted_512x512_HLS.S30.T10SEJ.2018185.v1.4.mask.tif
fire_scars_train_val/train/subsetted_512x512_HLS.S30.T10SEJ.2018185.v1.4_merged.tif
fire_scars_train_val/train/subsetted_512x512_HLS.S30.T10SEJ.2018220.v1.4.mask.tif
fire_scars_train_val/train/subsetted_5

### Modify the model config file. 

# **Note:** You SHOULD change the config file to play with the training parameters. Also, change the paths within `< >` to the correct paths.

## Run the training using terratorch.
#### This will take a while to complete.
#### The training logs will be saved to the EFS mount point.

In [38]:
!terratorch fit --config configs/fire_scars.yaml

2024-10-23 06:18:56.514359: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-10-23 06:18:56.529098: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-23 06:18:56.549088: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-23 06:18:56.555818: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-23 06:18:56.569370: I tensorflow/core/platform/cpu_feature_guar

In [37]:
pip install wandb

Note: you may need to restart the kernel to use updated packages.


In [15]:
import wandb

In [16]:
wandb.init(project="remotesensing102")

[34m[1mwandb[0m: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
[34m[1mwandb[0m: Currently logged in as: [33map5132449[0m ([33map5132449-srm-institute-of-science-and-technology[0m). Use [1m`wandb login --relogin`[0m to force relogin


In [19]:
wandb.init

<function wandb.sdk.wandb_init.init(job_type: 'str | None' = None, dir: 'StrPath | None' = None, config: 'dict | str | None' = None, project: 'str | None' = None, entity: 'str | None' = None, reinit: 'bool | None' = None, tags: 'Sequence | None' = None, group: 'str | None' = None, name: 'str | None' = None, notes: 'str | None' = None, magic: 'dict | str | bool | None' = None, config_exclude_keys: 'list[str] | None' = None, config_include_keys: 'list[str] | None' = None, anonymous: 'str | None' = None, mode: 'str | None' = None, allow_val_change: 'bool | None' = None, resume: 'bool | str | None' = None, force: 'bool | None' = None, tensorboard: 'bool | None' = None, sync_tensorboard: 'bool | None' = None, monitor_gym: 'bool | None' = None, save_code: 'bool | None' = None, id: 'str | None' = None, fork_from: 'str | None' = None, resume_from: 'str | None' = None, settings: 'Settings | dict[str, Any] | None' = None) -> 'Run'>

In [45]:
cd wandb

/home/sagemaker-user/rsds-hackathon-24/wandb


In [46]:
!wandb sync ../logs

Find logs at: /tmp/debug-cli.sagemaker-user.log
Found 4 tfevent files in /home/sagemaker-user/rsds-hackathon-24/logs/fire_scars
Syncing: https://wandb.ai/ap5132449-srm-institute-of-science-and-technology/rsds-hackathon-24/runs/tpjr76er ...
2024-10-23 06:29:07.042092: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-10-23 06:29:07.056997: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-23 06:29:07.077966: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-23 06:29:07.084220: E external/local_

In [25]:
terratorch predict -c configs/fire_scars.yaml

[0m[01;36mdebug-internal.log[0m@  [01;36mlatest-run[0m@                    settings
[01;36mdebug.log[0m@           [01;34mrun-20241023_051756-70b9o16r[0m/


In [26]:
cd RTC:rsds-hackathon-24


[Errno 2] No such file or directory: 'RTC:rsds-hackathon-24'
/home/sagemaker-user/rsds-hackathon-24/wandb


In [27]:
cd ..

/home/sagemaker-user/rsds-hackathon-24
