!!! Some of the repositories referenced below are already included in this repository, so their dependencies can be installed directly !!!
Assuming you have conda installed, let's prepare a conda env:
conda_env_name=h3vlfm_world
conda create -n $conda_env_name python=3.9 cmake=3.14.0
conda activate $conda_env_name
Install the proper version of torch:
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
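As an optional sanity check (not part of the upstream instructions), you can confirm that the CUDA build of torch was installed:
import torch
print(torch.__version__)          # this cu118 wheel typically reports 2.1.0+cu118
print(torch.cuda.is_available())  # should print True on a machine with a working CUDA driver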
Following Habitat-lab's instructions, install Habitat-sim:
conda install habitat-sim=0.3.1 withbullet -c conda-forge -c aihabitat
Then install Habitat-lab:
cd habitat-lab
pip install -e habitat-lab
pip install -e habitat-baselines
cd ..
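Optionally, verify that both packages import cleanly (a minimal check; it assumes the conda env created above is active):
import habitat       # from the habitat-lab checkout installed above
import habitat_sim   # from the conda habitat-sim=0.3.1 build
print("habitat-lab and habitat-sim imported successfully")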
Following MobileSAM's instructions, install MobileSAM:
pip install git+https://github.com/ChaoningZhang/MobileSAM.git
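To confirm the MobileSAM install, the sketch below follows the usage shown in the MobileSAM README; the checkpoint path is an assumption and should point to wherever your mobile_sam.pt weights live (the MobileSAM repo ships them under its weights/ directory):
import torch
from mobile_sam import sam_model_registry, SamPredictor

device = "cuda" if torch.cuda.is_available() else "cpu"
sam = sam_model_registry["vit_t"](checkpoint="MobileSAM/weights/mobile_sam.pt")  # adjust the path to your checkout
sam.to(device=device).eval()
predictor = SamPredictor(sam)
print("MobileSAM loaded")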
Following GroundingDINO's instructions, install GroundingDINO:
You may need to point CUDA_HOME at a CUDA toolkit version <= 11.8 (matching the cu118 torch build above):
export CUDA_HOME=/path/to/cuda-11.8
cd GroundingDINO/
pip install -e . --no-dependencies
Then download and place the pretrained model weights:
mkdir weights
cd weights
wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
cd ..
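As an optional check that GroundingDINO and its weights load, the sketch below follows the quickstart in the GroundingDINO README; the paths are relative to the GroundingDINO/ checkout, so adjust them if your layout differs:
from groundingdino.util.inference import load_model

model = load_model(
    "groundingdino/config/GroundingDINO_SwinT_OGC.py",  # config shipped with the repo
    "weights/groundingdino_swint_ogc.pth",               # checkpoint downloaded above
)
print(type(model).__name__)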
pip install salesforce-lavis==1.0.2
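A minimal sketch for checking the LAVIS install, based on the LAVIS quickstart; the model name and type here are only examples (this repo may load a different LAVIS model), and the call downloads weights on first use:
import torch
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"
model, vis_processors, txt_processors = load_model_and_preprocess(
    name="blip_caption", model_type="base_coco", is_eval=True, device=device
)
print("LAVIS model loaded")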
We use a finetuned version of the RedNet semantic segmentation model.
Therefore, you need to download the segmentation model checkpoint into the RedNet/model directory.
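The exact checkpoint filename is not fixed here, so the following is only an illustrative check (it assumes a .pth file) that something was placed under RedNet/model:
from pathlib import Path

ckpt_dir = Path("RedNet/model")
ckpts = sorted(ckpt_dir.glob("*.pth")) if ckpt_dir.is_dir() else []
print(ckpts if ckpts else f"no checkpoint found in {ckpt_dir} -- download the finetuned RedNet model first")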
git clone https://github.com/Peterande/D-FINE.git
git clone https://github.com/CSAILVision/places365.git
pip install flask
pip install open3d
pip install dash
pip install scikit-learn
pip install joblib
pip install seaborn
pip install faster_coco_eval
pip install calflops
pip install flash-attn --no-build-isolation
pip install modelscope
pip install opencv-python==4.10.0.84
pip install transformers==4.37.0
pip install openpyxl
pip install supervision==0.25.1
pip install yapf==0.43.0
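Since several of these packages are pinned to exact versions (notably transformers, opencv-python and supervision), an optional check that the pins resolved as expected:
import cv2, transformers, supervision
print(cv2.__version__)           # expected 4.10.0
print(transformers.__version__)  # expected 4.37.0
print(supervision.__version__)   # expected 0.25.1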
- Download Scene & Episode Datasets
Follow the instructions for HM3D and MatterPort3D in Habitat-lab's Datasets.md.
- Locate Datasets
The file structure should look like this:
data
└── datasets
└── objectnav
├── hm3d
│ └── v1
│ ├── train
│ │ ├── content
│ │ └── train.json.gz
│ └── val
│ ├── content
│ └── val.json.gz
└── mp3d
└── v1
├── train
│ ├── content
│ └── train.json.gz
└── val
├── content
└── val.json.gz
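To confirm the episode datasets ended up in the layout shown above, a small optional check:
from pathlib import Path

for dataset in ("hm3d", "mp3d"):
    for split in ("train", "val"):
        gz = Path("data/datasets/objectnav") / dataset / "v1" / split / f"{split}.json.gz"
        print(gz, "OK" if gz.is_file() else "MISSING")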
Run the following commands:
./scripts/launch_vlm_servers_qwen25_gdino_with_ram.sh
python -u -m falcon.run --config-name=experiments/qwen25_gdino_objectnav_hm3d_debug_scene.yaml habitat_baselines.num_environments=1 > debug/20250219/eval_llm_single_floor_gdino.log 2>&1
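Note: the output redirection assumes the debug/20250219/ directory already exists; create it first (for example with mkdir -p debug/20250219), otherwise the shell cannot open the log file. Adjust the log path to whatever date or run name you prefer.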