# Frequently Asked Questions

## General

### What is FlashOptim?

FlashOptim is a model optimization toolkit for compressing and accelerating deep learning models. It supports quantization, pruning, distillation, and neural architecture search.

### Which models are supported?

FlashOptim works with any PyTorch model, with first-class support for FlashVision detection and classification models.

### What hardware is required?

- **Training/Optimization**: GPU recommended (CUDA 11.8+)
- **Inference**: CPU, GPU, or edge devices (via ONNX/TensorRT export)

---

## Quantization

### PTQ vs QAT — which should I use?

- **PTQ** is faster (no training required) but may lose 1-2% accuracy
- **QAT** requires training but typically recovers accuracy to within 0.5%
- Start with PTQ; use QAT if accuracy drop is unacceptable

### How many calibration samples do I need?

Typically 100-500 representative samples are sufficient. More samples help with histogram-based calibration.

---

## Pruning

### Does unstructured pruning actually speed up inference?

Not without sparse hardware or sparse inference engines. Use structured pruning for guaranteed speedup on standard hardware.

### Can I combine pruning with quantization?

Yes! Apply pruning first, fine-tune, then quantize. This gives both sparsity and reduced precision.

---

## Distillation

### Does the teacher need to be the same architecture?

No. Feature distillation with projection layers can handle different architectures. Logit distillation works regardless of architecture.

### What temperature should I use?

Start with T=4. Higher temperatures (5-10) work better when the teacher is much larger.

---

## NAS

### How long does a NAS search take?

Depends on strategy and search space:
- Random search: minutes to hours
- Evolutionary: hours to days
- Use proxy tasks to speed up evaluation

---

## Deployment

### What export formats are supported?

- ONNX (recommended)
- TensorRT (planned)
- OpenVINO (planned)
- CoreML (planned)

### How do I benchmark my optimized model?

```bash
flashoptim benchmark --model optimized/model.onnx --device cpu --warmup 10 --runs 100
```