Merged
24 commits
0536eff
faster squidpy notebook
LucaMarconato Aug 11, 2025
b1945bf
autorun: storage format; spatialdata from 2872036 (0.4.1.dev7+g7604a3…
LucaMarconato Aug 12, 2025
86285dd
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 12, 2025
7c9eb30
autorun: storage format; spatialdata from 2872036 (0.4.1.dev7+g7604a3…
LucaMarconato Aug 12, 2025
f2c66ea
Merge branch 'fix/notebooks' of https://github.com/scverse/spatialdat…
LucaMarconato Aug 12, 2025
72fac10
autorun: storage format; spatialdata from 2872036 (0.4.1.dev7+g7604a3…
LucaMarconato Aug 12, 2025
0965ca8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 12, 2025
3cf3f45
autorun: storage format; spatialdata from 2872036 (0.4.1.dev7+g7604a3…
LucaMarconato Aug 12, 2025
eb74239
Merge branch 'fix/notebooks' of https://github.com/scverse/spatialdat…
LucaMarconato Aug 12, 2025
fd91971
removed file to ignore
LucaMarconato Aug 12, 2025
fa2d764
autorun: storage format; spatialdata from 2872036 (0.4.1.dev7+g7604a3…
LucaMarconato Aug 12, 2025
c48be76
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 12, 2025
d53edf4
autorun: storage format; spatialdata from 2872036 (0.4.1.dev7+g7604a3…
LucaMarconato Aug 12, 2025
4266d8a
Merge branch 'fix/notebooks' of https://github.com/scverse/spatialdat…
LucaMarconato Aug 12, 2025
05cb0fb
autorun: storage format; spatialdata from 2872036 (0.4.1.dev7+g7604a3…
LucaMarconato Aug 12, 2025
94c30ef
autorun: storage format; spatialdata from 2872036 (0.4.1.dev7+g7604a3…
LucaMarconato Aug 12, 2025
40f6236
autorun: storage format; spatialdata from 2872036 (0.4.1.dev7+g7604a3…
LucaMarconato Aug 12, 2025
6312ba9
simplify and improve notebooks requiring aligned data and densenet no…
LucaMarconato Aug 13, 2025
d34329b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 13, 2025
8ef39a5
Merge branch 'main' into fix/notebooks
LucaMarconato Aug 14, 2025
094ebc0
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 14, 2025
50bbae0
fix docs
LucaMarconato Aug 14, 2025
b551d81
Merge branch 'fix/notebooks' of https://github.com/scverse/spatialdat…
LucaMarconato Aug 14, 2025
43bfde8
fix docs
LucaMarconato Aug 14, 2025
2 changes: 2 additions & 0 deletions .gitignore
@@ -46,3 +46,5 @@ notebooks/examples/logs/

# others
node_modules/
notebooks/examples/_latest_run_notebook.ipynb
notebooks/developers_resources/storage_format/_latest_run_notebook.ipynb
1 change: 1 addition & 0 deletions .readthedocs.yaml
@@ -13,3 +13,4 @@ python:
      path: .
      extra_requirements:
        - doc
        - pre
101 changes: 53 additions & 48 deletions datasets/README.md
@@ -1,35 +1,28 @@
# Spatial omics datasets

Here you can find all datasets necessary to run the example notebooks already converted to the ZARR file format.

If you want to convert additional datasets check out the scripts available in the [spatialdata sandbox](https://github.com/giovp/spatialdata-sandbox).

:::{note}
S3 URLs cannot be opened directly in a web browser. They should be treated as Zarr stores.
For example, appending `.zgroup` to any of the URLs will allow you to see that file.
:::


| Technology | Sample | File Size | Filename (spatialdata-sandbox) | download data | work with data remotely (**see note below**) | license |
| :---------------------------------------- | :-------------------------------------------------------- | --------: | :----------------------------- | :---------------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------- | :---------------- |
| Visium HD | Mouse intestin [^2] | 1 GB | visium_hd_3.0.0_id | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_hd_3.0.0_io.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_hd_3.0.0_io.zarr/) | CCA |
| Visium | Breast cancer [^3] | 1.5 GB | visium_associated_xenium_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_associated_xenium_io.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_associated_xenium_io.zarr/) | CCA |
| Xenium | Breast cancer [^3] | 2.8 GB | xenium_rep1_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep1_io.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep1_io.zarr/) | CCA |
| Xenium | Breast cancer [^3] | 3.7 GB | xenium_rep2_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep2_io.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep2_io.zarr/) | CCA |
| CyCIF (MCMICRO output) | Small lung adenocarcinoma [^4] | 250 MB | mcmicro_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/mcmicro_io.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/mcmicro_io.zarr/) | CC BY-NC 4.0 DEED |
| MERFISH | Mouse brain [^5] | 50 MB | merfish | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/merfish.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/merfish.zarr/) | CC0 1.0 DEED |
| MIBI-TOF | Colorectal carcinoma [^6] | 25 MB | mibitof | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/mibitof.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/mibitof.zarr/) | CC BY 4.0 DEED |
| Imaging Mass Cytometry (Steinbock output) | 4 different cancers (SCCHN, BCC, NSCLC, CRC) [^7][^8][^9] | 820 MB | steinbock_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/steinbock_io.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/steinbock_io.zarr/) | CC BY 4.0 DEED |
| Molecular Cartography (SPArrOW output) | Mouse Liver [^10][^11] | 70 MB | MouseLiver | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/mouse_liver.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/mouse_liver.zarr) | CC BY 4.0 DEED |
| SpaceM | T cells [^12] | 116 MB | spacem_scseahorse1 | [.zarr.zip](https://s3.embl.de/spatialdata/raw_data/20220121_ScSeahorse1.zip) | NA | CC BY 4.0 DEED |
| SpaceM | Hepa and NIH3T3 cells [^13] | 59 MB | spacem_hepanih3t3 | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/spacem_helanih3t3.zip) | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/spacem_helanih3t3.zarr) | CC BY 4.0 DEED |

For the first 3 datasets, we also provide a version of them in which they are all aligned in a common coordinate system, and where we added the cell-type information, as described in our paper, to annotate the Xenium cells.
| Technology | Sample | File Size | Filename (spatialdata-sandbox) | download data | work with data remotely (**see note below**) | license |
| :---------------------------------------- | :-------------------------------------------------------- | --------: | :-------------------------- | :---------------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------- | :---------------- |
| Visium | Breast Cancer [^3] | 1.5 GB | visium_associated_xenium_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_associated_xenium_io_aligned.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_associated_xenium_io_aligned.zarr/) | CCA |
| Xenium | Breast Cancer [^3] | 2.8 GB | xenium_rep1_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep1_io_aligned.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep1_io_aligned.zarr/) | CCA |
| Xenium | Breast Cancer [^3] | 3.7 GB | xenium_rep2_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep2_io_aligned.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep2_io_aligned.zarr/) | CCA |
Here you can find all datasets necessary to run the example notebooks already converted
to the SpatialData Zarr file format.

Scripts to convert data from several other technologies into SpatialData Zarr are
available in the [spatialdata sandbox](https://github.com/giovp/spatialdata-sandbox); in
particular:

- CyCIF (MCMICRO output)
- Imaging Mass Cytometry, IMC (Steinbock output)
- seqFISH

| Technology | Sample | File Size | Filename (spatialdata-sandbox) | download data (latest stable release) | license |
|:------------------------------------------|:----------------------------------------------------------|----------:|:-------------------------------|:------------------------------------------------------------------------------------------------|:------------------|
| Visium HD | Mouse intestin [^1] | ~2.4 GB | visium_hd_3.0.0_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_hd_3.0.0_io.zip) | CCA |
| Visium | Breast cancer [^2] | ~1.5 GB | visium_associated_xenium_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/visium_associated_xenium_io.zip) | CCA |
| Xenium | Breast cancer [^2] | ~2.8 GB | xenium_rep1_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_rep1_io.zip) | CCA |
| Xenium | Lung cancer [^3] | ~5.4 GB | xenium_2.0.0_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/xenium_2.0.0_io.zip) | CCA |
| CyCIF (MCMICRO output) | Small lung adenocarcinoma [^4] | ~250 MB | mcmicro_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/mcmicro_io.zip) | CC BY-NC 4.0 DEED |
| MERFISH | Mouse brain [^5] | ~50 MB | merfish | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/merfish.zip) | CC0 1.0 DEED |
| MIBI-TOF | Colorectal carcinoma [^6] | ~25 MB | mibitof | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/mibitof.zip) | CC BY 4.0 DEED |
| Imaging Mass Cytometry (Steinbock output) | 4 different cancers (SCCHN, BCC, NSCLC, CRC) [^7][^8][^9] | ~800 MB | steinbock_io | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/steinbock_io.zip) | CC BY 4.0 DEED |
| Molecular Cartography (SPArrOW output) | Mouse Liver [^10][^11] | ~70 MB | MouseLiver | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/mouse_liver.zip) | CC BY 4.0 DEED |
| SpaceM | Hepa and NIH3T3 cells [^12] | ~60 MB | spacem_hepanih3t3 | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/spacem_helanih3t3.zip) | CC BY 4.0 DEED |
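
A minimal sketch of how one of the archives in the table could be fetched and opened, assuming the `spatialdata` package is installed and that the archive unpacks to a Zarr store. The helper names (`dataset_zip_url`, `download_and_extract`, `load_dataset`) and the `data` target folder are hypothetical. The name of the folder inside each archive is not stated here, so inspect the extracted directory before calling `load_dataset`. Note that the archive stem in the URL does not always match the "Filename" column (for example `mouse_liver.zip` for `MouseLiver`).

```python
import urllib.request
import zipfile
from pathlib import Path

# Base URL shared by the download links in the table above.
BASE_URL = "https://s3.embl.de/spatialdata/spatialdata-sandbox"


def dataset_zip_url(archive_stem: str) -> str:
    """Build the download URL for an archive stem such as 'mibitof'."""
    return f"{BASE_URL}/{archive_stem}.zip"


def download_and_extract(archive_stem: str, target_dir: str = "data") -> Path:
    """Download the .zip archive (if missing) and unpack it into target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    zip_path = target / f"{archive_stem}.zip"
    if not zip_path.exists():
        urllib.request.urlretrieve(dataset_zip_url(archive_stem), zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


def load_dataset(zarr_path: str):
    """Read an extracted Zarr store; requires the spatialdata package."""
    import spatialdata  # imported lazily so the helpers above work without it

    return spatialdata.read_zarr(zarr_path)


print(dataset_zip_url("mibitof"))
```

The smaller datasets (MIBI-TOF, MERFISH) are the quickest way to try this out before downloading the multi-gigabyte ones.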

## Licenses abbreviations

@@ -40,39 +33,51 @@ For the first 3 datasets, we also provide a version of them in which they are al

The data retains the license of the original published data.

<!-- to add: raccoon, blobs, "additional resources for methods developers" -->
<!-- Artificial datasets
| Description | File Size| Filename | download data | work with data remotely [^1] |
| :--------------------- | :------------------------- | --------:| :-------------------------- | :---------------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------- || - | - | 11 kB| toy | [.zarr.zip](https://s3.embl.de/spatialdata/spatialdata-sandbox/toy.zip) | [S3](https://s3.embl.de/spatialdata/spatialdata-sandbox/toy.zarr/) | -->

# Artificial datasets

Also, here you can find [additional datasets and resources for methods developers](https://github.com/scverse/spatialdata-notebooks/blob/main/notebooks/developers_resources/storage_format/).
Also, here you can
find [additional datasets and resources for methods developers](https://github.com/scverse/spatialdata-notebooks/blob/main/notebooks/developers_resources/storage_format/).

# References

If you use the datasets please cite the original sources and double-check their license.

[^2]: From https://www.10xgenomics.com/datasets/visium-hd-cytassist-gene-expression-libraries-of-mouse-intestine
[^1]:
From https://www.10xgenomics.com/datasets/visium-hd-cytassist-gene-expression-libraries-of-mouse-intestine

[^3]: Janesick, A. et al. High resolution mapping of the breast cancer tumor microenvironment using integrated single cell, spatial and in situ analysis of FFPE tissue. bioRxiv 2022.10.06.510405 (2022) doi:10.1101/2022.10.06.510405.
[^2]: Janesick, A. et al. High resolution mapping of the breast cancer tumor
microenvironment using integrated single cell, spatial and in situ analysis of FFPE
tissue. bioRxiv 2022.10.06.510405 (2022) doi:10.1101/2022.10.06.510405.

[^4]: Schapiro, D. et al. MCMICRO: A scalable, modular image-processing pipeline for multiplexed tissue imaging. Cold Spring Harbor Laboratory 2021.03.15.435473 (2021) doi:10.1101/2021.03.15.435473.
[^3]:
From https://www.10xgenomics.com/datasets/preview-data-ffpe-human-lung-cancer-with-xenium-multimodal-cell-segmentation-1-standard

[^5]: Moffitt, J. R. et al. Molecular, spatial, and functional single-cell profiling of the hypothalamic preoptic region. Science 362, (2018).
[^4]: Schapiro, D. et al. MCMICRO: A scalable, modular image-processing pipeline for
multiplexed tissue imaging. Cold Spring Harbor Laboratory 2021.03.15.435473 (2021) doi:
10.1101/2021.03.15.435473.

[^6]: Hartmann, F. J. et al. Single-cell metabolic profiling of human cytotoxic T cells. Nat. Biotechnol. (2020) doi:10.1038/s41587-020-0651-8.
[^5]: Moffitt, J. R. et al. Molecular, spatial, and functional single-cell profiling of
the hypothalamic preoptic region. Science 362, (2018).

[^7]: Windhager, J., Bodenmiller, B. & Eling, N. An end-to-end workflow for multiplexed image processing and analysis. bioRxiv 2021.11.12.468357 (2021) doi:10.1101/2021.11.12.468357.
[^6]: Hartmann, F. J. et al. Single-cell metabolic profiling of human cytotoxic T cells.
Nat. Biotechnol. (2020) doi:10.1038/s41587-020-0651-8.

[^8]: Eling, N. & Windhager, J. Example imaging mass cytometry raw data. (2022). doi:10.5281/zenodo.5949116.
[^7]: Windhager, J., Bodenmiller, B. & Eling, N. An end-to-end workflow for multiplexed
image processing and analysis. bioRxiv 2021.11.12.468357 (2021) doi:
10.1101/2021.11.12.468357.

[^9]: Eling, N. & Windhager, J. steinbock results of IMC example data. (2022). doi:10.5281/zenodo.7412972.
[^8]: Eling, N. & Windhager, J. Example imaging mass cytometry raw data. (2022). doi:
10.5281/zenodo.5949116.

[^10]: Guilliams, Martin, et al. "Spatial proteogenomics reveals distinct and evolutionarily conserved hepatic macrophage niches." Cell 185.2 (2022) doi:10.1016/j.cell2021.12.018
[^9]: Eling, N. & Windhager, J. steinbock results of IMC example data. (2022). doi:
10.5281/zenodo.7412972.

[^11]: Pollaris, Lotte, et al. "SPArrOW: a flexible, interactive and scalable pipeline for spatial transcriptomics analysis." bioRxiv (2024) doi:10.1101/2024.07.04.601829
[^10]: Guilliams, Martin, et al. "Spatial proteogenomics reveals distinct and
evolutionarily conserved hepatic macrophage niches." Cell 185.2 (2022) doi:
10.1016/j.cell2021.12.018

[^12]: See https://github.com/giovp/spatialdata-sandbox/blob/main/spacem_scseahorse1/README.md
[^11]: Pollaris, Lotte, et al. "SPArrOW: a flexible, interactive and scalable pipeline
for spatial transcriptomics analysis." bioRxiv (2024) doi:10.1101/2024.07.04.601829

[^13]: See https://github.com/giovp/spatialdata-sandbox/blob/main/spacem_helanih3t3/README.md
[^12]:
See https://github.com/giovp/spatialdata-sandbox/blob/main/spacem_helanih3t3/README.md
@@ -923,4 +923,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
@@ -3,4 +3,4 @@
"spatialdata_software_version": "0.4.1.dev11+gb889b53",
"version": "0.1"
}
}
}
@@ -1437,4 +1437,4 @@
}
},
"zarr_consolidated_format": 1
}
}
2 changes: 1 addition & 1 deletion notebooks/examples/alignment_using_landmarks.ipynb
@@ -570,7 +570,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
"version": "3.11.0"
},
"vscode": {
"interpreter": {
33 changes: 33 additions & 0 deletions notebooks/examples/alignment_utils.py
@@ -0,0 +1,33 @@
from spatialdata.transformations import Affine
from spatialdata import SpatialData
from spatialdata.transformations import (
    BaseTransformation,
    Sequence,
    get_transformation,
    set_transformation,
)


AFFINE_VISIUM_XENIUM = Affine(
    [
        [1.61711846e-01, 2.58258090e00, -1.24575040e04],
        [-2.58258090e00, 1.61711846e-01, 3.98647301e04],
        [0.0, 0.0, 1.0],
    ],
    input_axes=("x", "y"),
    output_axes=("x", "y"),
)


def postpone_transformation(
    sdata: SpatialData,
    transformation: BaseTransformation,
    source_coordinate_system: str,
    target_coordinate_system: str,
):
    for element_type, element_name, element in sdata._gen_elements():
        old_transformations = get_transformation(element, get_all=True)
        if source_coordinate_system in old_transformations:
            old_transformation = old_transformations[source_coordinate_system]
            sequence = Sequence([old_transformation, transformation])
            set_transformation(element, sequence, target_coordinate_system)
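
A minimal sketch of the composition that `postpone_transformation` sets up, assuming `Sequence` applies its transformations in list order (the element's existing transformation first, then the new one). Plain Python stands in for the library here: the helpers `matmul3` and `apply_affine` and the example `old` matrix are hypothetical and only illustrate the arithmetic; the matrix values are copied from `AFFINE_VISIUM_XENIUM` above.

```python
import math

# Matrix copied from AFFINE_VISIUM_XENIUM (homogeneous 2D affine, axes x, y).
AFFINE = [
    [1.61711846e-01, 2.58258090e00, -1.24575040e04],
    [-2.58258090e00, 1.61711846e-01, 3.98647301e04],
    [0.0, 0.0, 1.0],
]


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def apply_affine(m, point):
    """Apply a 3x3 homogeneous affine matrix to an (x, y) point."""
    x, y = point
    return (
        m[0][0] * x + m[0][1] * y + m[0][2],
        m[1][0] * x + m[1][1] * y + m[1][2],
    )


# Hypothetical pre-existing transformation of an element: scale by 2.
old = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]

# Applying `old` first and AFFINE second corresponds to the product AFFINE @ old.
composed = matmul3(AFFINE, old)

point = (100.0, 200.0)
step_by_step = apply_affine(AFFINE, apply_affine(old, point))
in_one_go = apply_affine(composed, point)

# The linear part of AFFINE is a rotation combined with a uniform scale.
scale = math.hypot(AFFINE[0][0], AFFINE[1][0])
angle_deg = math.degrees(math.atan2(AFFINE[1][0], AFFINE[0][0]))
print(step_by_step, in_one_go, scale, angle_deg)
```

Chaining the old and new transformations, rather than overwriting the old one, keeps each element's original mapping intact and registers the combined mapping under the target coordinate system.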