- Step 1: Build the Docker image
sh cli/docker_env/build_image.sh
- Step 2: Run the Docker container
sh cli/docker_env/run_container.sh
- Step 3: Install dependencies
sh cli/docker_env/after_docker.sh
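- Optional sanity check after the steps above (a minimal sketch; the actual image and container names are defined inside the scripts above, so adjust accordingly):
# Confirm the image was built and the container is up
docker images
docker ps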
- Note: ensure directories and file paths are set up consistently for runs with and without Docker
export CUDA_VISIBLE_DEVICES=0
deepspeed \
--master_port 8888 training_hub/train_lisa/train_lisa.py \
--filepath-config training_hub/train_lisa/config_lisa/config_lisa_vqa_vindr_medsam_llavamed_docker.yaml \
--local_rank 0
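- To train on multiple GPUs, a hedged variant of the launch above (GPU indices 0,1 are an assumption; adjust to your machine). The --local_rank flag is omitted because the deepspeed launcher injects it into each process:
deepspeed \
--include localhost:0,1 \
--master_port 8888 training_hub/train_lisa/train_lisa.py \
--filepath-config training_hub/train_lisa/config_lisa/config_lisa_vqa_vindr_medsam_llavamed_docker.yaml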
- All configuration parameters are centralized in a single file
- The path to this file is passed through the training argument --filepath-config
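- To inspect the centralized configuration quickly:
less training_hub/train_lisa/config_lisa/config_lisa_vqa_vindr_medsam_llavamed_docker.yaml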
- This source code requires third-party API keys to be set as environment variables
- Prepare your third-party API keys and export them as follows:
export HF_TOKEN=""
export WANDB_API_KEY=""
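- To verify the keys are picked up before starting a long run, a minimal check (huggingface-cli ships with the huggingface_hub package; recent versions read HF_TOKEN from the environment):
huggingface-cli whoami
wandb login "$WANDB_API_KEY"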
- Dataset directories, including images and labels, are set in the configuration file
- Change "dirpath_labels" and "dirpath_images" to point to your working directories
- The dataset is stored on the A6000 machine and on your PC (/media/hieu/6ac7d369-b609-4b09-97b0-27ed881b25f9/duong/data-center/VinDr/train_png_16bit)
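- A quick path check before training (the variables below are placeholders mirroring "dirpath_images" and "dirpath_labels"; substitute the values from your configuration file):
DIRPATH_IMAGES=/path/to/VinDr/train_png_16bit   # placeholder for "dirpath_images"
DIRPATH_LABELS=/path/to/VinDr/labels            # placeholder for "dirpath_labels"
test -d "$DIRPATH_IMAGES" && test -d "$DIRPATH_LABELS" && echo "dataset paths OK"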
- Models, including tokenizers, vision models, and LMMs, are loaded from the Hugging Face libraries
- A reference to the best model combination is given in the configuration file
- Download the MedSAM checkpoint from the MedSAM project and place it at the path set by "vision_pretrained" in the configuration file
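- A quick check that the checkpoint sits where the config expects (the filename below is an assumption; use whatever you downloaded):
# Show the configured path, then confirm the file exists
grep vision_pretrained training_hub/train_lisa/config_lisa/config_lisa_vqa_vindr_medsam_llavamed_docker.yaml
test -f /path/to/medsam_checkpoint.pth && echo "MedSAM checkpoint found"   # placeholder path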
- The dataset and dataloader are defined in src/dataloader_hub/lisa_dataloader
- The model is defined in src/model_hub/lisa
- The training script is defined in training_hub/train_lisa
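- For a quick orientation over the code layout listed above:
ls src/dataloader_hub/lisa_dataloader src/model_hub/lisa training_hub/train_lisa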