VMamba Environment Setup and Troubleshooting Guide Created: April 2026

# VMamba Environment Setup and Troubleshooting Manual

This is the **VMamba Environment Setup and Troubleshooting Manual (Directly Reusable Version)**, compiled from the issues actually encountered during this setup.  
Every problem follows the same structure:

**Symptom → Cause → Solution → How to Quickly Identify in the Future**

You can refer directly to this guide if you encounter similar problems again.

---

# 1. Final Stable Environment

The stable combination that ultimately worked this time is:

```text
Python 3.10
PyTorch 2.1.0
CUDA 11.8
gcc/g++ 11
mmengine 0.10.1
mmcv 2.1.0
mmsegmentation 1.2.2
mmdet 3.3.0
mmpretrain 1.2.0
numpy 1.26.4
opencv-python-headless 4.10.0.84
```

This combination is stable because:

* PyTorch publishes official builds of `2.1.0` for CUDA 11.8.
* MMCV ships prebuilt wheels only for specific PyTorch/CUDA combinations; when the combination is too new, installation usually falls back to a source build.
* Many binary packages are compiled against the NumPy 1.x ABI and break under NumPy 2.x. Pinning `numpy<2` is the standard fix.
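
To compare a live environment against the list above, a small helper along these lines may be useful. This is only a sketch: `check_pins` is a hypothetical name, and it assumes the input is in `pip freeze` format (one `name==version` per line). Packages listed in another form, such as `name @ file://...`, would be reported as missing.

```bash
# check_pins: hypothetical helper, not part of VMamba or pip.
# Reads `pip freeze`-style lines on stdin and compares them with the
# pins given as arguments (each in the form name==version).
check_pins() {
    freeze=$(cat)   # keep the listing so it can be scanned once per pin
    rc=0
    for pin in "$@"; do
        name=${pin%%==*}
        want=${pin#*==}
        got=$(printf '%s\n' "$freeze" |
            awk -F'==' -v n="$name" 'tolower($1) == tolower(n) { print $2; exit }')
        if [ -z "$got" ]; then
            echo "MISSING  $name (expected $want)"
            rc=1
        elif [ "$got" = "$want" ]; then
            echo "OK       $name $got"
        else
            echo "MISMATCH $name (expected $want, found $got)"
            rc=1
        fi
    done
    return "$rc"
}
```

Typical use would be `python -m pip freeze | check_pins numpy==1.26.4 mmcv==2.1.0 mmengine==0.10.1`; a non-zero exit status means at least one pin did not match.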

---

# 2. Problem 1: `pip install .` Reports `No module named 'torch'`

## Symptom

When executing in the `kernels/selective_scan` directory:

```bash
pip install .
```

The following error occurs:

```text
ModuleNotFoundError: No module named 'torch'
```

This appeared clearly in your logs.

## Cause

`pip install .` enables **build isolation** by default: the package is built in a temporary environment that contains only its declared build requirements.  
That environment does not include the `torch` from your current conda environment, but the `selective_scan` build imports torch, so it fails immediately.

## Solution

Use instead:

```bash
pip install . --no-build-isolation
```

## How to Quickly Identify in the Future

If both of the following are true:

* `python -c "import torch"` succeeds in the current environment
* But `pip install .` fails with "No module named 'torch'"

It's almost certainly caused by **build isolation**.
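
The two checks above can be folded into a small guard around the install command. The sketch below is illustrative only: `install_ext`, `PYTHON`, and `PIP` are hypothetical names chosen here, not pip or VMamba settings. Keep in mind that with build isolation disabled, pip does not install build requirements for you, so everything the build imports must already be present in the environment.

```bash
# install_ext: hypothetical wrapper. PYTHON and PIP are illustrative
# override variables for this sketch, not pip or VMamba settings.
install_ext() {
    py=${PYTHON:-python}
    pip_cmd=${PIP:-pip}
    # Without build isolation the build sees the current environment,
    # so torch must already be importable here.
    if ! "$py" -c "import torch" 2>/dev/null; then
        echo "torch is not importable in this environment; install it first" >&2
        return 1
    fi
    "$pip_cmd" install . --no-build-isolation
}
```

Run `install_ext` from inside `kernels/selective_scan`. If it stops at the first check, the problem is the environment itself rather than build isolation.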

---

# 3. Problem 2: CUDA Compilation Reports `unsupported GNU version`

## Symptom

When compiling `selective_scan`, `nvcc` reports:

```text
#error -- unsupported GNU version! gcc versions later than 11 are not supported!
```

This appeared clearly in the logs.

## Cause

You are using **CUDA 11.8**.  
Its `nvcc` supports host GCC versions only up to 11, so **nvcc refuses to compile when GCC is newer**.

This is not an issue with VMamba code or Python packages, but a **CUDA toolchain compatibility issue**.

## Solution

Install gcc-11 / g++-11 and select them for the current shell session:

```bash
sudo apt update
sudo apt install gcc-11 g++-11 -y

export CC=gcc-11
export CXX=g++-11
```

Then recompile:

```bash
cd ~/projects/VMamba-main/kernels/selective_scan
pip install . --no-build-isolation
```

This is exactly how you succeeded later.

## How to Quickly Identify in the Future

Whenever you see:

* `nvcc`
* `unsupported GNU version`
* `gcc versions later than 11 are not supported`

Immediately think: **Switch to gcc-11**.
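
The same check can be run before building instead of waiting for `nvcc` to fail. The function below is a minimal sketch under stated assumptions: `gcc_ok_for_cuda118` is a hypothetical name, and it only tests the upper bound of 11 quoted in the error message, not any lower bound.

```bash
# gcc_ok_for_cuda118: hypothetical helper. Takes a GCC version string
# (e.g. "11.4.0" or "13") and succeeds only if the major version is <= 11,
# the upper bound quoted in the nvcc error message.
gcc_ok_for_cuda118() {
    major=${1%%.*}
    case $major in
        '' | *[!0-9]*)
            echo "cannot parse GCC version: '$1'" >&2
            return 2
            ;;
    esac
    [ "$major" -le 11 ]
}
```

For example, `gcc_ok_for_cuda118 "$("${CC:-gcc}" -dumpversion)" || echo "switch to gcc-11"` checks whichever compiler the build would pick up.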

---

# 4. Problem 3: Compilation Succeeds but Import Fails with `libc10.so` Not Found

## Symptom

`selective_scan` installed successfully, but import fails with:

```text
ImportError: libc10.so: cannot open shared object file
```

This also appeared in your logs.

## Cause

This is a **runtime dynamic linking failure**, not a compilation failure.

`selective_scan_cuda_oflex.so` depends on PyTorch shared libraries during import, such as:

* `libc10.so`
* `libtorch_cpu.so`

These libraries are typically located in:

```text
.../site-packages/torch/lib
```

And not necessarily in:

```text
$CONDA_PREFIX/lib
```

Therefore, adding only `$CONDA_PREFIX/lib` to `LD_LIBRARY_PATH` is insufficient.

## Solution

First locate the actual torch lib directory:

```bash
python -c "import torch, os; print(os.path.join(os.path.dirname(torch.__file__), 'lib'))"
```

Then add it to `LD_LIBRARY_PATH`:

```bash
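export TORCH_LIB="$(python -c "import torch, os; print(os.path.join(os.path.dirname(torch.__file__), 'lib'))")"
# ${VAR:+:$VAR} avoids a trailing ":" (an empty entry means the current directory) when LD_LIBRARY_PATH is unset
export LD_LIBRARY_PATH="$TORCH_LIB${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"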
```

Test again:

```bash
python -c "import torch; import selective_scan_cuda_oflex; print('OK')"
```

This later printed `OK` successfully for you.

## How to Quickly Identify in the Future

If:

* The extension `.so` compiled successfully
* But import fails with `libc10.so` / `libtorch_cpu.so` not found

First check the `torch/lib` path, rather than reinstalling torch.
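
If the export ends up in a shell startup file, it helps to make it idempotent so the path is not added again on every new shell. The following is a minimal sketch; `prepend_ld_path` is a hypothetical helper name, not something provided by PyTorch or conda.

```bash
# prepend_ld_path: hypothetical helper. Puts a directory at the front of
# LD_LIBRARY_PATH unless it is already listed, and never leaves an empty
# entry (an empty entry would mean "current directory").
prepend_ld_path() {
    dir=$1
    [ -n "$dir" ] || return 1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}
```

It would be called with the directory printed by the `python -c` command in the Solution, for example `prepend_ld_path "$TORCH_LIB"`.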

---

# 5. Problem 4: `import torch` Fails with `_OutOfMemoryError` Attribute Error

## Symptom

When verifying torch in a new environment:

```text
AttributeError: module 'torch._C' has no attribute '_OutOfMemoryError'
```

## Cause

This indicates that **the PyTorch installation itself is corrupted**, or there is a conflict between the Python layer and the underlying binary layer.

The problem is not in mmengine/mmcv; **torch itself is broken**.

## Solution

Do not continue installing mm packages.  
Recreate the environment from scratch and reinstall PyTorch using an officially supported version combination.

One stable combination officially provided by PyTorch is:

```bash
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=11.8 -c pytorch -c nvidia
```

This command is listed on PyTorch's official "Previous PyTorch Versions" page.

## How to Quickly Identify in the Future

If even:

```bash
python -c "import torch"
```

fails, and the error occurs near `torch._C`, fix torch first before installing anything else.

---

# 6. Problem 5: `libtorch_cpu.so: undefined symbol: iJIT_NotifyEvent`

## Symptom

When verifying torch again:

```text
ImportError: .../libtorch_cpu.so: undefined symbol: iJIT_NotifyEvent
```

## Cause

This is a typical **version conflict between PyTorch and MKL / intel-openmp**.  
Reports in the PyTorch issue tracker attribute this missing symbol to newer MKL releases (2024.1 and later).

## Solution

Downgrade MKL and intel-openmp in the current conda environment:

```bash
conda install "mkl<2024.1" "intel-openmp<2024.1" -c conda-forge -y
```

Then re-verify:

```bash
python -c "import torch; print(torch.__version__); print(torch.cuda.is_available()); print(torch.version.cuda)"
```

You later successfully obtained:

```text
2.1.0
True
11.8
```

## How to Quickly Identify in the Future

If you see:

```text
undefined symbol: iJIT_NotifyEvent
```

First check:

* `mkl`
* `intel-openmp`

Rather than reinstalling CUDA right away.
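
Checking those two packages can be scripted. The sketch below assumes the usual `conda list` column layout (name, version, build, channel) and uses the same `2024.1` bound as the downgrade command; `flag_new_mkl` is a hypothetical name.

```bash
# flag_new_mkl: hypothetical helper. Reads `conda list` output on stdin and
# flags mkl / intel-openmp versions at or above 2024.1, the bound used in
# the downgrade command. Exit status is 1 if anything was flagged.
flag_new_mkl() {
    awk '
        $1 == "mkl" || $1 == "intel-openmp" {
            split($2, v, ".")
            if (v[1] + 0 > 2024 || (v[1] + 0 == 2024 && v[2] + 0 >= 1)) {
                print $1, $2, "too new"
                bad = 1
            } else {
                print $1, $2, "ok"
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

Usage would be `conda list | flag_new_mkl`. No output at all means neither package is installed in the active environment.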

---

# 7. Problem 6: `mim install mmcv` Falls Back to Source Build

## Symptom

When you installed `mmcv`, `mim` did not fetch a wheel but downloaded:

```text
mmcv-2.2.0.tar.gz
```

It then started a source build, which reported various build errors.

## Cause

The official MMCV installation mechanism matches wheels based on:

* CUDA version
* PyTorch version
* MMCV version

If the current combination is too new and no matching wheel exists, it falls back to a source build. Source builds are more prone to failure.

## Solution

Avoid compiling from source; instead, switch to a combination for which a prebuilt wheel exists, such as:

```text
torch 2.1.0 + cu118
mmcv 2.1.0
```

This is also the combination that eventually worked for you.

## How to Quickly Identify in the Future

If `mim install mmcv` downloads a `.tar.gz` instead of a `.whl`, be cautious.  
This usually means that no prebuilt wheel exists for the current PyTorch/CUDA/MMCV combination.
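
When the install output is long, the downloaded file name is easy to miss. The sketch below scans a saved log for mmcv artifacts and labels each one. `mmcv_artifact_kind` is a hypothetical name, and the code assumes only that the file name appears somewhere in the log as a whitespace-separated word or at the end of a URL.

```bash
# mmcv_artifact_kind: hypothetical helper. Reads an install log on stdin and
# reports whether each mmcv file mentioned in it is a wheel or a source archive.
mmcv_artifact_kind() {
    awk '{
        for (i = 1; i <= NF; i++) {
            n = split($i, parts, "/")      # drop any URL or directory prefix
            file = parts[n]
            if (file ~ /^mmcv-.*\.whl$/) {
                print "wheel: " file
            } else if (file ~ /^mmcv-.*\.tar\.gz$/) {
                print "sdist: " file " (will be compiled locally)"
            }
        }
    }'
}
```

One way to use it is to save the output first, for example `mim install mmcv==2.1.0 2>&1 | tee install.log`, and then run `mmcv_artifact_kind < install.log`.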

---

# 8. Problem 7: NumPy 2.x ABI Incompatibility with Current Binary Packages

## Symptom

When importing `mmcv` or `torch`, a message similar to the following appears:

```text
A module that was compiled using NumPy 1.x cannot be run in NumPy 2.x
```

Despite the message, the mm package versions were still printed successfully.

## Cause

Many current binary packages are still compiled against the **NumPy 1.x ABI**.  
If the environment has **NumPy 2.x**, compatibility issues may arise. This is a known issue in the PyTorch community.

## Solution

Pin NumPy to 1.26.4:

```bash
python -m pip install numpy==1.26.4
```

## How to Quickly Identify in the Future

Whenever you see:

* `compiled using NumPy 1.x`
* And the current `numpy.__version__` is 2.x

Prioritize downgrading to:

```bash
numpy==1.26.4
```

---

# 9. Problem 8: OpenCV in Turn Requires `numpy>=2`

## Symptom

After downgrading NumPy to 1.26.4, you saw:

```text
opencv-python 4.13.0.92 requires numpy>=2
opencv-python-headless 4.13.0.92 requires numpy>=2
```

## Cause

The installed OpenCV version (4.13.0.92) is too new and requires NumPy 2.x.  
This conflicts with the just-downgraded `numpy 1.26.4`.

## Solution

Uninstall both OpenCV packages, then reinstall only the headless variant, pinned to a version compatible with NumPy 1.x:

```bash
python -m pip uninstall -y opencv-python opencv-python-headless
python -m pip install --no-cache-dir opencv-python-headless==4.10.0.84
```

You later verified successfully:

```text
numpy 1.26.4
cv2 4.10.0
```

## How to Quickly Identify in the Future

In server/WSL environments, prioritize installing only:

```bash
opencv-python-headless
```

Do not install both:

* `opencv-python`
* `opencv-python-headless`
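
A duplicate install can be spotted from the package listing. The sketch below also matches the two `contrib` variants published on PyPI, since all four distributions provide the same `cv2` module. `opencv_variants` is a hypothetical name, and the input is assumed to be in `pip freeze` format.

```bash
# opencv_variants: hypothetical helper. Reads `pip freeze` output on stdin,
# prints every OpenCV distribution found, and fails if more than one is installed.
opencv_variants() {
    awk -F'==' '
        tolower($1) ~ /^opencv-(contrib-)?python(-headless)?$/ { print; n++ }
        END { exit (n > 1 ? 1 : 0) }
    '
}
```

For example, `python -m pip freeze | opencv_variants || echo "more than one OpenCV package installed"`.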

---

# 10. Final Correct Installation Order

For future VMamba environment setup from scratch, the recommended order is as follows.

## 1) Create Environment

```bash
conda create -n vmamba_seg python=3.10 -y
conda activate vmamba_seg
```

## 2) Install PyTorch

```bash
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=11.8 -c pytorch -c nvidia -y
conda install "mkl<2024.1" "intel-openmp<2024.1" -c conda-forge -y
```

## 3) Verify torch

```bash
python -c "import torch; print(torch.__version__); print(torch.cuda.is_available()); print(torch.version.cuda)"
```

## 4) Install OpenMMLab Dependencies

```bash
python -m pip install --upgrade pip
python -m pip install setuptools==81.0.0 wheel
python -m pip install -U openmim

mim install mmengine==0.10.1
mim install mmcv==2.1.0
python -m pip install mmsegmentation==1.2.2 mmdet==3.3.0 mmpretrain==1.2.0
```

## 5) Fix NumPy/OpenCV

```bash
python -m pip install numpy==1.26.4
python -m pip uninstall -y opencv-python opencv-python-headless
python -m pip install --no-cache-dir opencv-python-headless==4.10.0.84
```

## 6) Verify mm Packages

```bash
python -c "import mmengine, mmcv, mmseg, mmdet, mmpretrain; print(mmengine.__version__, mmcv.__version__, mmseg.__version__, mmdet.__version__, mmpretrain.__version__)"
```

## 7) Compile selective_scan

```bash
export CC=gcc-11
export CXX=g++-11

cd ~/projects/VMamba-main/kernels/selective_scan
pip install . --no-build-isolation
```

## 8) Verify selective_scan

```bash
python -c "import torch; import selective_scan_cuda_oflex; print('selective_scan ok')"
```

If the import fails here because `libc10.so` cannot be found, add the torch library directory to `LD_LIBRARY_PATH` and retry:

```bash
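export TORCH_LIB="$(python -c "import torch, os; print(os.path.join(os.path.dirname(torch.__file__), 'lib'))")"
# ${VAR:+:$VAR} avoids a trailing ":" (an empty entry means the current directory) when LD_LIBRARY_PATH is unset
export LD_LIBRARY_PATH="$TORCH_LIB${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"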
python -c "import torch; import selective_scan_cuda_oflex; print('selective_scan ok')"
```

---

# 11. General Troubleshooting Order for Future Issues

No matter what environment problem you encounter in the future, diagnose it in these three layers first:

## Layer 1: PyTorch Core Layer

First check:

```bash
python -c "import torch; print(torch.__version__); print(torch.cuda.is_available()); print(torch.version.cuda)"
```

If this step fails, do not proceed further.

## Layer 2: Binary Compatibility Layer

Check:

* Whether NumPy is 2.x
* Whether OpenCV requires `numpy>=2`
* Whether `libc10.so` is missing

## Layer 3: Custom Extension Layer

Check:

* Whether `--no-build-isolation` was used
* Whether gcc is version 11 or older (CUDA 11.8's `nvcc` rejects anything newer)
* Whether `selective_scan` can actually be imported

---

# 12. One-Sentence Summary

All the issues you encountered this time boil down to three categories:

1. **Build Issues**  
   e.g., missing `--no-build-isolation`, wrong gcc version

2. **Dynamic Linking Issues**  
   e.g., `libc10.so` not found

3. **Version Compatibility Issues**  
   e.g., conflicts between PyTorch + MKL, NumPy 2.x, OpenCV, and MMCV combinations

As long as you categorize problems into these three types in the future, you won't get lost.
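
As a rough illustration of this triage, the sketch below maps error signatures to the three categories. It covers only signatures quoted in this manual, and `classify_error` is a hypothetical name, so treat it as a starting point rather than a complete classifier.

```bash
# classify_error: hypothetical helper. Reads log text on stdin and maps the
# error signatures quoted in this manual to the three categories above.
classify_error() {
    log=$(cat)
    case $log in
        *"No module named 'torch'"* | *"unsupported GNU version"*)
            echo "build issue"
            ;;
        *"iJIT_NotifyEvent"* | *"compiled using NumPy 1.x"* | *"requires numpy>=2"*)
            echo "version compatibility issue"
            ;;
        *"cannot open shared object file"*)
            echo "dynamic linking issue"
            ;;
        *)
            echo "unknown"
            return 1
            ;;
    esac
}
```

For example, `classify_error < build.log` prints one category for the whole log, where `build.log` stands for any saved error output.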
