# Installation of necessary packages

In a Python 3.10 environment, run `pip install -r requirements.txt` in the terminal to install all the necessary packages.

In [3]:
%pwd
%pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


# Launch training process

## Set configurations
First run `accelerate config` to configure the training run. If you can install [**DeepSpeed**](https://github.com/microsoft/DeepSpeed), you can use the DeepSpeed integration in [**Accelerate**](https://huggingface.co/docs/accelerate/index) to speed up the training process.

#### For a configuration without **DeepSpeed**, I recommend the following answers:

In which compute environment are you running? -> **This machine**,

Which type of machine are you using? -> **No distributed training**,

Do you want to run your training on CPU only? -> **NO**,

Do you wish to optimize your script with torch dynamo? -> **NO**,

Do you want to use DeepSpeed? -> **NO**,

What GPU(s) should be used for training on this machine as a comma-seperated list? -> **all**,

Would you like to enable numa efficiency?(Currently only supported on NVIDIA hardware) -> **yes**,

Do you wish to use FP16 or BF16 (mixed precision)? -> **fp16**.

#### If you have **DeepSpeed**:

ZeRO optimization stage 0 is enough for training on 1 GPU. The number of gradient accumulation steps depends on the training batch size; usually we keep $batch\_size \times gradient\_accumulation = 8$. In addition, using fp8 (mixed precision) requires additional installation steps.
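
The batch size rule above can be sketched as a small helper. This is only an illustration: the function name `accumulation_steps` and the default target of 8 are assumptions taken from the rule of thumb, not from `trainC.py`.

```python
def accumulation_steps(batch_size: int, effective_batch_size: int = 8) -> int:
    """Return the number of gradient accumulation steps such that
    batch_size * steps == effective_batch_size (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    if effective_batch_size % batch_size != 0:
        raise ValueError(
            f"batch_size {batch_size} does not divide {effective_batch_size}"
        )
    return effective_batch_size // batch_size


# Show the accumulation steps for a few common per-step batch sizes.
for bs in (1, 2, 4, 8):
    print(f"batch_size={bs} -> gradient_accumulation={accumulation_steps(bs)}")
```

With a per-step batch size of 8 the rule gives 1 accumulation step; smaller batch sizes need proportionally more steps to reach the same effective batch size.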

I recommend the following answers:

How many gradient accumulation steps you're passing in your script? -> 1,

Do you want to use gradient clipping? -> yes,

What is the gradient clipping value? -> 1.0,

Do you want to enable `deepspeed.zero.Init` when using ZeRO Stage-3 for constructing massive models? -> NO,

Do you want to enable Mixture-of-Experts training (MoE)? -> NO,

How many GPU(s) should be used for distributed training? -> 1.

## Launch training

Running `accelerate launch trainC.py` starts the training process directly. You can change the parameters in the file, for instance the number of epochs or the learning rate.

In [9]:
!accelerate launch trainC.py

[2024-08-28 11:24:58,399] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
  def forward(ctx, input, weight, bias=None):
  def backward(ctx, grad_output):
  @torch.cuda.amp.custom_fwd(cast_inputs=torch.float32)
[2024-08-28 11:25:02,316] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
  def forward(ctx, input, weight, bias=None):
  def backward(ctx, grad_output):
[2024-08-28 11:25:02,791] [INFO] [comm.py:637:init_distributed] cdb=None
[2024-08-28 11:25:02,791] [INFO] [comm.py:668:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/caffin/桌面/image_satellite/trainC.py", line 39, in <module>
[rank0]:     model, optimizer, train_dataloader, scheduler = accelerator.prepare(
[rank0]:   File "/home/caffin/anaconda3/envs/test/lib/python3.10/site-packages/accelerate/accelerator.py", line 1303, in prepare
[rank0]:     

If you encounter errors such as `RuntimeError: expected scalar type Half but found Float` or `RuntimeError: Input type (float) and bias type (c10::Half) should be the same`, cast the mismatched tensors to a matching type by adding `.to(torch.float16)` or `.to(input.dtype)`.
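
As a minimal sketch of the two casts mentioned above, the snippet below uses stand-in tensors rather than the actual model from `trainC.py`. The names `cast_like`, `weight`, and `inputs` are hypothetical and exist only for illustration.

```python
import torch


def cast_like(tensor: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
    """Return `tensor` cast to the dtype of `reference` (hypothetical helper)."""
    return tensor.to(reference.dtype)


# Stand-ins for a half-precision parameter and a float32 input batch.
weight = torch.zeros(2, 4, dtype=torch.float16)
inputs = torch.ones(3, 4, dtype=torch.float32)

# Option 1: cast to an explicit target dtype.
inputs_fp16 = inputs.to(torch.float16)

# Option 2: follow whatever dtype the other tensor already has.
inputs_like = cast_like(inputs, weight)

print(inputs.dtype, inputs_fp16.dtype, inputs_like.dtype)
```

The second form is usually the safer choice inside a `forward` method, since it keeps working if the mixed precision setting changes.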

# Inference pipeline

Inference can be run directly in **example.ipynb**.