# Download Annotation Models for MS2LDA

## Recommended Method: Use the CLI

The easiest way to download all necessary MS2LDA data files is to use the command line:

```bash
ms2lda --only-download
```

This will download:
- ✅ Spec2Vec models and embeddings (~5.7 GB)
- ✅ Fingerprint calculation JARs (43 MB)
- ✅ MotifDB reference databases

## Alternative: Download Only Spec2Vec Models

If you only need the Spec2Vec annotation models, you can use the function directly:

First import the function. The fallback adds the repository root to `sys.path` when MS2LDA is not installed as a package:

```python
import sys
from pathlib import Path

try:
    import MS2LDA
except ImportError:
    # Not installed: assume this notebook sits two directories below the repository root.
    sys.path.insert(0, str(Path.cwd().parent.parent))
    import MS2LDA

from MS2LDA.utils import download_model_and_data
```

Its signature and default download URLs can be inspected with `help`:

```python
help(download_model_and_data)
```

```text
Help on function download_model_and_data in module MS2LDA.utils:

download_model_and_data(file_urls=['https://zenodo.org/records/12625409/files/020724_Spec2Vec_pos_CleanedLibraries.model?download=1', 'https://zenodo.org/records/12625409/files/020724_Spec2Vec_pos_CleanedLibraries.model.syn1neg.npy?download=1', 'https://zenodo.org/records/12625409/files/020724_Spec2Vec_pos_CleanedLibraries.model.wv.vectors.npy?download=1', 'https://zenodo.org/records/12625409/files/positive_s2v_library.pkl?download=1'], mode='positive')
    Downloads the spec2vec model and the needed datasets for the automated annotation
```

Then download the Spec2Vec models only:

```python
download_model_and_data(mode="positive")
```

## What Gets Downloaded?

The `download_model_and_data()` function downloads the following files to enable automated motif annotation:

1. **Spec2Vec Model** (9.5 MB)
   - Neural network model for converting spectra to embeddings
   
2. **Reference Embeddings** (976 MB)
   - Pre-computed embeddings for ~150,000 reference spectra
   
3. **Reference Database** (4.1 GB)
   - SQLite database with compound names, SMILES, and metadata
   
4. **Model Weights** (636 MB total)
   - Neural network weight files

**Total download size: ~5.7 GB**
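Given the size of the download, it can be worth checking free disk space beforehand. The following is a minimal sketch using only the Python standard library. The helpers `required_bytes` and `has_enough_space` and the 10% safety margin are illustrative assumptions and not part of MS2LDA; the sizes are the ones listed above, with 1 GB taken as 1000 MB.

```python
import shutil
from pathlib import Path

# File sizes as listed above, in MB (1 GB treated as 1000 MB for a rough estimate).
SIZES_MB = {
    "Spec2Vec model": 9.5,
    "Reference embeddings": 976,
    "Reference database": 4100,
    "Model weights": 636,
}


def required_bytes(sizes_mb=SIZES_MB, margin=0.10):
    """Return the summed size in bytes, padded by a safety margin (assumed 10%)."""
    total_mb = sum(sizes_mb.values())
    return int(total_mb * (1 + margin) * 1_000_000)


def has_enough_space(target_dir=".", needed=None):
    """Check whether the filesystem holding target_dir has at least `needed` bytes free."""
    if needed is None:
        needed = required_bytes()
    free = shutil.disk_usage(Path(target_dir)).free
    return free >= needed


if __name__ == "__main__":
    total_gb = sum(SIZES_MB.values()) / 1000
    print(f"Listed files total {total_gb:.1f} GB")  # prints: Listed files total 5.7 GB
    print("Enough free space here:", has_enough_space("."))
```

Pass the directory where the files will be stored as `target_dir`; it must already exist.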

These files are saved to: `<MS2LDA_package>/Add_On/Spec2Vec/model_positive_mode/`
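To confirm that a download completed, one option is to check that the expected files are present in that directory. The sketch below derives file names from the default `file_urls` shown in the `help` output. It assumes the files keep their URL names on disk, which this document does not confirm, and `missing_model_files` is a hypothetical helper rather than an MS2LDA function.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

# Default URLs copied from the help() output of download_model_and_data.
DEFAULT_URLS = [
    "https://zenodo.org/records/12625409/files/020724_Spec2Vec_pos_CleanedLibraries.model?download=1",
    "https://zenodo.org/records/12625409/files/020724_Spec2Vec_pos_CleanedLibraries.model.syn1neg.npy?download=1",
    "https://zenodo.org/records/12625409/files/020724_Spec2Vec_pos_CleanedLibraries.model.wv.vectors.npy?download=1",
    "https://zenodo.org/records/12625409/files/positive_s2v_library.pkl?download=1",
]


def expected_filenames(urls=DEFAULT_URLS):
    """Extract the last path component of each URL, ignoring the query string."""
    return [PurePosixPath(urlparse(url).path).name for url in urls]


def missing_model_files(model_dir, urls=DEFAULT_URLS):
    """Return the expected file names that are not present in model_dir.

    Assumption: downloaded files are stored under the same names as in the URLs.
    """
    model_dir = Path(model_dir)
    return [name for name in expected_filenames(urls) if not (model_dir / name).is_file()]
```

Call it with the actual location of `model_positive_mode` on your system; an empty list means every expected file was found.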

## Note

This notebook exists for backward compatibility. New users should prefer the CLI command `ms2lda --only-download`, which downloads all necessary files, including the MotifDB reference databases and the fingerprint JARs.