FAQ
Common questions and solutions for the Whales Identification project.
- Installation Issues
- Model Issues
- API Issues
- Docker Issues
- Performance Issues
- Licensing Questions
- Usage Questions
Problem: Missing OpenCV system dependencies
Solution (Ubuntu/Debian):
sudo apt-get update
sudo apt-get install -y libgl1-mesa-glx libglib2.0-0 libsm6 libxext6 libxrender-dev libgomp1
Solution (macOS):
brew install opencv # Usually not required on macOS
Why it happens: OpenCV (cv2) requires system libraries for image processing. These are not installed by default on some Linux distributions.
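To confirm that the system libraries are now in place, one option is to try importing cv2 from Python. The following is a minimal sketch, not part of the project; the helper name `check_import` is hypothetical.

```python
import importlib


def check_import(module_name):
    """Try to import a module and report whether it loaded.

    Returns (True, version or "unknown") on success and
    (False, error message) on failure.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # A missing shared library (for example libGL) surfaces as ImportError.
        return False, str(exc)
    return True, getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    ok, detail = check_import("cv2")
    print("cv2 import OK, version:" if ok else "cv2 import failed:", detail)
```

If the import fails, the error message usually names the missing shared library, which helps pick the right apt package.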
Problem: Hugging Face CLI not installed
Solution:
pip install huggingface_hub==0.20.3
Verify:
huggingface-cli --version
Why it happens: The model download script uses huggingface-cli to download models from Hugging Face Hub.
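If the CLI is installed but the script still cannot find it, the executable may not be on PATH (for example, when it was installed into a different virtual environment). A small sketch using only the standard library; the function name `cli_path` is hypothetical:

```python
import shutil


def cli_path(name="huggingface-cli"):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = cli_path()
    if path is None:
        print("huggingface-cli is not on PATH; activate the right environment or reinstall")
    else:
        print("huggingface-cli found at", path)
```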
Problem: Running command from wrong directory
Solution:
# Backend commands must run from whales_be_service/
cd whales_be_service
poetry install
# Frontend commands must run from frontend/ (start from the project root)
cd frontend
npm install
Why it happens: Poetry looks for pyproject.toml in the current directory. The backend's pyproject.toml is in whales_be_service/, not the project root.
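When it is unclear which directory holds the Poetry project, a short script can list the candidates. This is an illustrative sketch only; `find_poetry_projects` is a hypothetical helper that looks in the given directory and its immediate subdirectories.

```python
from pathlib import Path


def find_poetry_projects(root="."):
    """Return directories (root itself or its direct children) containing pyproject.toml."""
    root = Path(root)
    candidates = [root] + sorted(p for p in root.iterdir() if p.is_dir())
    return [d for d in candidates if (d / "pyproject.toml").is_file()]


if __name__ == "__main__":
    for project_dir in find_poetry_projects("."):
        print("Run poetry commands from:", project_dir)
```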
Problem: Docker images not built
Solution:
# Build images
docker compose build
# Force rebuild (if needed)
docker compose build --no-cache
# Start services
docker compose up
Why it happens: Docker needs to build images before running containers.
Answer: Models are available from two sources:
- Hugging Face (Recommended):
  ./scripts/download_models.sh
  - URL: https://huggingface.co/baltsat/Whales-Identification/tree/main
  - File: efficientnet_b4_512_fold0.ckpt (2.1 GB)
- Yandex Disk (Alternative):
  - URL: https://disk.yandex.ru/d/GshqU9o6nNz7ZA
  - Download all models
Verify download:
ls -lh models/efficientnet_b4_512_fold0.ckpt
# Should show ~2.1 GB file
Problem: Models not downloaded or wrong path
Solution:
# 1. Check models directory exists
ls models/
# 2. Download models
./scripts/download_models.sh
# 3. Verify model exists
ls -lh models/efficientnet_b4_512_fold0.ckpt
# 4. Check path in whale_infer.py
# Should be: models/efficientnet_b4_512_fold0.ckpt
Why it happens: The API expects models in the models/ directory, but they are not committed to git (they are excluded via .gitignore).
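The same verification can be scripted so that a missing or truncated download is caught before the API starts. The sketch below assumes the path and approximate size quoted in this FAQ; the function name `check_model_file` and the 20% size tolerance are illustrative choices, not project settings.

```python
from pathlib import Path

EXPECTED_PATH = Path("models/efficientnet_b4_512_fold0.ckpt")  # path quoted in this FAQ
EXPECTED_GB = 2.1  # approximate size quoted in this FAQ


def check_model_file(path=EXPECTED_PATH, expected_gb=EXPECTED_GB, tolerance=0.2):
    """Return (ok, message) describing whether the checkpoint looks complete.

    tolerance is the allowed relative deviation from expected_gb (an assumption).
    """
    path = Path(path)
    if not path.is_file():
        return False, f"{path} not found; run ./scripts/download_models.sh"
    size_gb = path.stat().st_size / 1e9
    if abs(size_gb - expected_gb) > expected_gb * tolerance:
        return False, f"{path} is {size_gb:.2f} GB, expected about {expected_gb} GB"
    return True, f"{path} looks complete ({size_gb:.2f} GB)"


if __name__ == "__main__":
    ok, message = check_model_file()
    print("OK:" if ok else "Problem:", message)
```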
Problem: Running on CPU instead of GPU
Solution:
# Check if GPU is available
python -c "import torch; print(torch.cuda.is_available())"
# If False, install CUDA-enabled PyTorch
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
Expected times:
- GPU (V100): 2-4 seconds
- CPU: 5-15 seconds
- If >30 seconds: check for bottlenecks
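To compare your own setup against these ranges, you can time the inference call directly. This is a generic timing sketch; `time_call` and `needs_investigation` are hypothetical helpers, and `fn` stands in for whatever function runs a single prediction in your code.

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Call fn several times and return the median wall-clock time in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    durations.sort()
    return durations[len(durations) // 2]


def needs_investigation(seconds, threshold=30.0):
    """True when a measured time exceeds the 30-second mark listed above."""
    return seconds > threshold


if __name__ == "__main__":
    # Placeholder workload; replace with your single-image prediction call.
    elapsed = time_call(sum, range(1_000_000))
    print(f"median: {elapsed:.4f} s, investigate: {needs_investigation(elapsed)}")
```

Taking the median over a few runs reduces the effect of one-off costs such as the first CUDA call or model warm-up.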
Problem: Image too large or batch size too big
Solution:
# Resize large images before inference
from PIL import Image

def resize_if_needed(image_path, max_size=1920):
    img = Image.open(image_path)
    if max(img.size) > max_size:
        ratio = max_size / max(img.size)
        new_size = tuple(int(dim * ratio) for dim in img.size)
        img = img.resize(new_size, Image.LANCZOS)
    return img
Or reduce batch size:
# Instead of batch_size=32
batch_size = 16 # Or even 8
Problem: File format not supported or MIME type mismatch
Solution:
# Supported formats: JPG/JPEG and PNG
# Convert if needed (convert is part of ImageMagick)
convert whale.bmp whale.jpg
# Verify MIME type
file --mime-type whale.jpg
# Should be: image/jpeg or image/png
Why it happens: The API only accepts the image/jpeg and image/png content types, for security reasons.
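A file with a .jpg extension is not necessarily a JPEG. One way to check before uploading is to inspect the file's leading bytes, which is roughly what the `file` command does. The sketch below is illustrative and is not claimed to match the API's own validation; `sniff_image_type` is a hypothetical helper.

```python
JPEG_MAGIC = b"\xff\xd8\xff"        # JPEG files start with these bytes
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"    # 8-byte PNG signature


def sniff_image_type(data):
    """Return 'image/jpeg', 'image/png', or None based on the leading bytes."""
    if data.startswith(JPEG_MAGIC):
        return "image/jpeg"
    if data.startswith(PNG_MAGIC):
        return "image/png"
    return None


def sniff_file(path):
    """Read only the first bytes of a file and classify it."""
    with open(path, "rb") as fh:
        return sniff_image_type(fh.read(8))
```

A `None` result for a file named whale.jpg suggests it was renamed rather than converted.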
Problem: Multiple possible causes
Solution:
# 1. Check logs
docker compose logs backend
# 2. Common causes:
# - Model not loaded: download models
# - OpenCV error: install system dependencies
# - CUDA error: check GPU availability
# 3. Test with simple request
curl -X POST "http://localhost:8000/predict-single" \
  -F "file=@small_test_image.jpg"
Problem: CORS error or wrong backend URL
Solution:
# 1. Verify backend is running
curl http://localhost:8000/docs
# 2. Check CORS settings in main.py
# Should allow frontend origin:
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:8080", "http://localhost:5173"],
    ...
)
# 3. Check frontend API URL
# In frontend/.env or vite.config.ts
VITE_BACKEND_URL=http://localhost:8000
Problem: Another service using port 8000 or 8080
Solution (macOS/Linux):
# Find process using port
lsof -i :8000
# Kill process
kill -9 <PID>
Solution (Windows):
netstat -ano | findstr :8000
taskkill /PID <PID> /F
Alternative: Change ports in docker-compose.yml:
services:
  backend:
    ports:
      - "8001:8000" # Change 8000 → 8001
Problem: User not in docker group
Solution:
# Add user to docker group
sudo usermod -aG docker $USER
# Refresh groups
newgrp docker
# Verify
docker ps
Why it happens: Docker daemon requires root or docker group membership.
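Group changes made with usermod only take effect in new login sessions, which is why the refresh step matters. The following Unix-only sketch checks whether the current process already carries a given group; `process_in_group` is a hypothetical helper and uses only the standard library.

```python
import grp
import os


def process_in_group(group_name="docker"):
    """Return True if the current process belongs to the named group (Unix only)."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        # The group does not exist on this system.
        return False
    # Check both the primary group and the supplementary groups.
    return gid == os.getgid() or gid in os.getgroups()


if __name__ == "__main__":
    if process_in_group("docker"):
        print("This session has docker group membership")
    else:
        print("Not in the docker group yet; log out and back in, or run newgrp docker")
```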
Problem: Docker out of disk space
Solution:
# Clean up old images
docker system prune -a
# Check space
df -h
# Remove unused volumes
docker volume prune
Problem: Processing 100 images sequentially
Solution:
# Use a thread pool for batch inference (assumes model is already loaded)
from concurrent.futures import ThreadPoolExecutor

def process_batch_parallel(images, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(model.predict, images))
    return results
Or use GPU batch processing:
# Process in batches of 16
batch_size = 16
for i in range(0, len(images), batch_size):
    batch = images[i:i+batch_size]
    tensor_batch = torch.stack([preprocess(img) for img in batch])
    predictions = model(tensor_batch)
Problem: Model loaded multiple times or not released
Solution:
import torch
# Ensure model is loaded once
if not hasattr(app.state, 'model'):
    app.state.model = load_model()
# Disable gradient tracking during inference
with torch.no_grad():
    predictions = app.state.model(tensor)
# Clear CUDA cache if using GPU
torch.cuda.empty_cache()
Answer:
- Happy Whale data: CC-BY-NC-4.0 (non-commercial)
- Ministry RF data: Research-only license
- ImageNet pretrained weights: Non-commercial terms
For commercial use:
- Train models from scratch with your own data
- Use commercial datasets
- Contact data providers for commercial licenses
See:
Answer: ✅ Yes! The code is MIT licensed.
You can:
- Modify the code
- Use in your own projects
- Fork the repository
- Contribute back via pull requests
You must:
- Include original copyright notice
- Include MIT license text
See: LICENSE
Answer: ✅ Yes! Models can be used for non-commercial research.
Requirements:
- Cite the original datasets (Happy Whale, Ministry RF)
- Acknowledge the project in publications
- Share results with the community
Citation:
@software{whales_identification_2024,
  author = {Baltsat, Konstantin and Tarasov, Artem and Vandanov, Sergey and Serov, Alexandr},
  title = {Whales Identification: ML Library for Marine Mammal Detection},
  year = {2024},
  url = {https://github.com/0x0000dead/whales-identification}
}
Answer:
Recommended:
- Resolution: ≥1920×1080
- Format: JPG or PNG
- Lighting: Good natural lighting
- Angle: Aerial view, 30-45° angle
- Distance: Whale fills 20-80% of frame
Minimum:
- Resolution: ≥800×600
- No extreme blur or occlusion
- Whale clearly visible
Accuracy impact:
- High-quality: 90-93% precision
- Low-quality (<800×600): 70-80% precision (15-20% drop)
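The resolution thresholds above can be checked before upload. The sketch below only encodes the two size limits from this answer; it assumes the limits apply regardless of orientation (portrait or landscape), and `resolution_tier` is a hypothetical helper.

```python
def resolution_tier(width, height):
    """Classify an image size against the recommended and minimum resolutions."""
    long_side, short_side = max(width, height), min(width, height)
    if long_side >= 1920 and short_side >= 1080:
        return "recommended"
    if long_side >= 800 and short_side >= 600:
        return "minimum"
    return "below minimum"


if __name__ == "__main__":
    for size in [(3840, 2160), (1280, 720), (640, 480)]:
        print(size, "->", resolution_tier(*size))
```

Blur, occlusion, and lighting are not captured by a size check, so an image in the "recommended" tier can still perform poorly.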
Answer:
Limits:
- Single image: Max 10 MB
- Batch ZIP: Max 50 MB, max 100 images
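A batch archive can be checked against these limits locally before it is uploaded. This is a sketch under stated assumptions: 1 MB is taken as 1024 × 1024 bytes, only .jpg, .jpeg, and .png entries are counted as images, and `check_batch_zip` is a hypothetical helper rather than part of the API.

```python
import os
import zipfile

MAX_ZIP_BYTES = 50 * 1024 * 1024  # 50 MB limit from this FAQ (binary MB assumed)
MAX_IMAGES = 100                  # image-count limit from this FAQ
IMAGE_EXTS = (".jpg", ".jpeg", ".png")


def check_batch_zip(zip_path):
    """Return a list of limit violations; an empty list means the archive fits."""
    problems = []
    size = os.path.getsize(zip_path)
    if size > MAX_ZIP_BYTES:
        problems.append(f"archive is {size / 1024 / 1024:.1f} MB, limit is 50 MB")
    with zipfile.ZipFile(zip_path) as zf:
        images = [n for n in zf.namelist() if n.lower().endswith(IMAGE_EXTS)]
    if len(images) > MAX_IMAGES:
        problems.append(f"archive holds {len(images)} images, limit is {MAX_IMAGES}")
    return problems
```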
Recommendations:
- Small batches (1-10): <1 minute
- Medium batches (10-50): 1-5 minutes
- Large batches (50-100): 5-15 minutes
For larger datasets:
- Split into multiple ZIP files
- Use script to process in chunks:
import os
import zipfile

def split_batch(image_dir, max_per_zip=50):
    images = sorted(os.listdir(image_dir))
    for i in range(0, len(images), max_per_zip):
        batch = images[i:i+max_per_zip]
        zip_name = f"batch_{i//max_per_zip + 1}.zip"
        with zipfile.ZipFile(zip_name, 'w') as zipf:
            for img in batch:
                zipf.write(os.path.join(image_dir, img), img)
        print(f"Created {zip_name} with {len(batch)} images")
Answer: Depends on your use case:
Best accuracy (93%):
- Vision Transformer L/32
- Use for: Research, validation, high-value species
Production API (91%, 2s):
- Vision Transformer B/16
- Use for: API deployments, GPU servers
Real-time (<1s, 88%):
- EfficientNet-B0
- Use for: Real-time apps, edge devices
Edge devices (82%, 0.8s):
- ResNet-54
- Use for: Jetson Nano, low-power devices
See Model Cards for detailed comparison.
Answer:
The model predicts one whale per image. For multiple whales:
Workaround:
- Manually crop each whale
- Upload each crop separately
Planned feature (v0.2.0):
- Object detection with YOLO/Faster R-CNN
- Automatic cropping of multiple whales
- Batch prediction on all detections
Answer: 1,000 individual whales and dolphins across species including:
- Humpback Whale (Megaptera novaeangliae)
- Blue Whale (Balaenoptera musculus)
- Fin Whale (Balaenoptera physalus)
- Gray Whale (Eschrichtius robustus)
- Beluga Whale (Delphinapterus leucas)
- Right Whale (Eubalaena spp.)
- Sperm Whale (Physeter macrocephalus)
- Orca (Orcinus orca)
- Bottlenose Dolphin (Tursiops truncatus)
- Spinner Dolphin (Stenella longirostris)
- ... and more
Full mapping: whales_be_service/config.yaml
Dataset: ~80,000 images (~60,000 train + ~20,000 test) labeled for 1,000 individual marine mammals
Answer:
Overall metrics (Vision Transformer L/32):
- Precision@1: 93.2%
- Precision@5: 97.8%
- Recall (Sensitivity): 91.5%
- Specificity: 92.3%
- F1-Score: 0.923
- Inference Time: 3.5s (GPU), 7.5s (CPU)
Technical specification (ТЗ) compliance: ✅ All metrics meet the requirements (Precision ≥80%, Recall >85%, Specificity >90%, F1 >0.6, Time ≤8s)
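The reported F1-score is consistent with the precision and recall figures if F1 is computed as the harmonic mean, F1 = 2PR / (P + R). The sketch below reproduces that arithmetic and the threshold comparison. It assumes Precision@1 is the precision used in the F1 calculation and uses the CPU inference time as the worst case; both are assumptions, not statements from the project.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Reported values from this FAQ (fractions rather than percentages).
reported = {"precision": 0.932, "recall": 0.915, "specificity": 0.923, "time_s": 7.5}
reported["f1"] = f1_score(reported["precision"], reported["recall"])

# Requirement thresholds quoted above.
checks = {
    "precision": reported["precision"] >= 0.80,
    "recall": reported["recall"] > 0.85,
    "specificity": reported["specificity"] > 0.90,
    "f1": reported["f1"] > 0.6,
    "time_s": reported["time_s"] <= 8,
}

if __name__ == "__main__":
    print(f"F1 = {reported['f1']:.3f}")  # F1 = 0.923
    print("all requirements met:", all(checks.values()))
```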
Per-species (top performers):
- Humpback Whale: 95.3%
- Blue Whale: 94.1%
- Orca: 94.8%
Limitations:
- 15-20% accuracy drop on:
- Low-resolution (<800×600)
- Heavy occlusion (>50%)
- Poor lighting (night, fog)
- Extreme angles
See Model Cards for detailed metrics.
- Installation Guide - Setup instructions
- Usage Guide - How to use the API
- API Reference - Complete API docs
- Architecture - System design
- Testing - Testing guide
- Contributing - Development guide
- GitHub Issues: Report a bug
- GitHub Discussions: Ask a question
- ✅ Search existing issues
- ✅ Check documentation
- ✅ Review this FAQ
- ✅ Try troubleshooting steps
Provide:
- ✅ Clear description of problem
- ✅ Steps to reproduce
- ✅ Error messages (full stack trace)
- ✅ Environment info (OS, Python version, Docker version)
- ✅ Expected vs actual behavior
Example:
## Problem
API returns 500 error when uploading image
## Steps to Reproduce
1. Start Docker: `docker compose up`
2. Upload whale_001.jpg via frontend
3. Error appears
## Error Message
Internal Server Error: Model not found
## Environment
- OS: Ubuntu 22.04
- Python: 3.11.6
- Docker: 24.0.5
- Model: Downloaded from Hugging Face
## Expected
Prediction result with species name
## Actual
500 error
Last Updated: September 1, 2025
Version: 0.1.0