# Cropping the data

In [2]:
from pathlib import Path
from insitupy import InSituData, CACHE

## Load Xenium data into `InSituData` object

The Xenium data can now be read by passing the path to the `InSituPy` project folder.

In [3]:
insitupy_project = Path(CACHE / "out/demo_insitupy_project")
xd = InSituData.read(insitupy_project)

In [4]:
xd

InSituData
Method:		Xenium
Slide ID:	0001879
Sample ID:	Replicate 1
Path:		C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
Metadata file:	.ispy

No modalities loaded.

In [5]:
# load all data modalities except the transcripts
xd.load_all(skip="transcripts")

In [6]:
xd

InSituData
Method:		Xenium
Slide ID:	0001879
Sample ID:	Replicate 1
Path:		C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
Metadata file:	.ispy
    ➤ images
       nuclei:	(25778, 35416)
       CD20:	(25778, 35416)
       HER2:	(25778, 35416)
       HE:	(25778, 35416, 3)
    ➤ cells
       matrix
           AnnData object with n_obs × n_vars = 157600 × 297
           obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
           var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
           uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
           obsm: 'X_pca', 'X_umap', 'annotations', 'regions', 'spatial'
           varm: 'PCs'
           layers: 'c

In [7]:
# Visualize the data
xd.show()

## Cropping of data

Two cropping methods are implemented: cropping by coordinate limits and cropping by a named region.

### Option 1: Crop using limit values

In [7]:
# crop to a rectangular window using the xlim/ylim arguments
xd_cropped = xd.crop(xlim=(2000, 3000), ylim=(2000, 3000))
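As a rough mental model of what cropping by limits does to cell coordinates, the sketch below keeps only the points that fall inside a rectangular window. It is a plain-Python illustration and not InSituPy's implementation: the function name `points_in_window` and the sample coordinates are made up for this example, and treating the boundaries as inclusive is an assumption.

```python
def points_in_window(points, xlim, ylim):
    """Return the (x, y) points that lie inside the window (bounds inclusive)."""
    (xmin, xmax), (ymin, ymax) = xlim, ylim
    return [(x, y) for x, y in points if xmin <= x <= xmax and ymin <= y <= ymax]


# Hypothetical cell centroids; only the middle two lie inside the window.
cells = [(1500.0, 2500.0), (2100.0, 2900.0), (2999.0, 2001.0), (3200.0, 2500.0)]
kept = points_in_window(cells, xlim=(2000, 3000), ylim=(2000, 3000))
print(kept)
```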

In [8]:
xd_cropped

InSituData
Method:		Xenium
Slide ID:	0001879
Sample ID:	Replicate 1
Path:		C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
Metadata file:	.ispy
    ➤ images
       nuclei:	(4706, 4706)
       CD20:	(4706, 4706)
       HER2:	(4706, 4706)
       HE:	(4706, 4706, 3)
    ➤ cells
       matrix
           AnnData object with n_obs × n_vars = 4550 × 297
           obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
           var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
           uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
           obsm: 'X_pca', 'X_umap', 'annotations', 'regions', 'spatial'
           varm: 'PCs'
           layers: 'counts', 'n
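
The image shape in the output above is consistent with the limits being given in micrometers rather than in pixels. The check below assumes a pixel size of 0.2125 µm, the value commonly reported for Xenium morphology images. That constant is an assumption here and is not read from the dataset.

```python
# Assumed pixel size in µm per pixel (not taken from the dataset metadata).
PIXEL_SIZE_UM = 0.2125

xlim = (2000, 3000)
width_um = xlim[1] - xlim[0]                # 1000 µm wide window
width_px = round(width_um / PIXEL_SIZE_UM)  # convert µm to pixels
print(width_px)  # 4706
```

The result matches the `(4706, 4706)` image shape reported for the cropped object.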

In [9]:
xd_cropped.show()

### Option 2: Crop from `regions`

We can also crop a region from the dataset. The region is specified as a tuple of the form `(region_key, region_name)`.

In [10]:
xd_cropped = xd.crop(region_tuple=("demo_regions", "Region1"))

In [11]:
xd_cropped

InSituData
Method:		Xenium
Slide ID:	0001879
Sample ID:	Replicate 1
Path:		C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
Metadata file:	.ispy
    ➤ images
       nuclei:	(2701, 3309)
       CD20:	(2701, 3309)
       HER2:	(2701, 3309)
       HE:	(2701, 3309, 3)
    ➤ cells
       matrix
           AnnData object with n_obs × n_vars = 2289 × 297
           obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
           var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
           uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
           obsm: 'X_pca', 'X_umap', 'annotations', 'regions', 'spatial'
           varm: 'PCs'
           layers: 'counts', 'n
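
Conceptually, cropping by region comes down to deciding which coordinates belong to a stored shape. The sketch below shows the standard ray-casting test for a point inside a polygon. It is only an illustration: the polygon and the function name `point_in_polygon` are hypothetical, and InSituPy may well rely on a geometry library for this step.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: cast a ray to the right of (x, y) and count edge crossings."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means inside
    return inside


# Hypothetical rectangular region given as a list of vertices.
region = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(point_in_polygon(2.0, 1.5, region))
print(point_in_polygon(5.0, 1.5, region))
```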

In [12]:
xd_cropped.show()

## Saving the cropped data

### Saving to the existing project path is not possible

Because the data has been cropped, it cannot be saved to the existing project path. Calling `.save()` issues a warning and leaves the project unchanged:

In [13]:
xd_cropped.save()

Project is neither saved nor updated. Try `saveas()` instead to save the data to a new project folder. A reason for this could be the data has been cropped in the meantime.
  warn(
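
The output above ends in `warn(`, which suggests the message is a Python warning, so a script would continue past it. One way to make such a case fail loudly is the standard `warnings` module, sketched below with a stand-in function. `fake_save` is hypothetical and only mimics a call that warns. The warning category is not shown in the output, so `UserWarning` is an assumption.

```python
import warnings


def fake_save():
    """Stand-in for a call that warns instead of raising (hypothetical)."""
    warnings.warn("Project is neither saved nor updated.", UserWarning)


save_failed = False
with warnings.catch_warnings():
    # Escalate warnings to exceptions inside this block only.
    warnings.simplefilter("error")
    try:
        fake_save()
    except UserWarning as exc:
        save_failed = True
        print(f"save failed: {exc}")
```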


Reloading does not work either, because the cropped data has not yet been saved as an `InSituPy` project.

In [14]:
xd_cropped.reload()

No modalities with existing save path found. Consider saving the data with `saveas()` first.


### Saving to new project directory

In [15]:
cropped_insitupy_project = insitupy_project.parent / f"{insitupy_project.name}_cropped"
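
The expression above derives a sibling folder next to the original project. The sketch below shows the result with `PurePosixPath`, so it behaves the same on any platform. The path `/data/demo_insitupy_project` is a made-up example.

```python
from pathlib import PurePosixPath

# Hypothetical project location, used only to show the naming pattern.
project = PurePosixPath("/data/demo_insitupy_project")
cropped = project.parent / f"{project.name}_cropped"
print(cropped)  # /data/demo_insitupy_project_cropped
```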

In [16]:
xd_cropped.saveas(cropped_insitupy_project, overwrite=True)

Saving data to C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project_cropped
Saved.


### Reload from `InSituPy` project folder

Reloading from the project folder makes visualization more efficient. Only the modalities that were loaded before cropping can be reloaded in this step.

In [17]:
# reload from insitupy project
xd_cropped = InSituData.read(cropped_insitupy_project)
xd_cropped.load_all()

In [18]:
xd_cropped

InSituData
Method:		Xenium
Slide ID:	0001879
Sample ID:	Replicate 1
Path:		C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project_cropped
Metadata file:	.ispy
    ➤ images
       nuclei:	(2701, 3309)
       CD20:	(2701, 3309)
       HER2:	(2701, 3309)
       HE:	(2701, 3309, 3)
    ➤ cells
       matrix
           AnnData object with n_obs × n_vars = 2289 × 297
           obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
           var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
           uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
           obsm: 'X_pca', 'X_umap', 'annotations', 'regions', 'spatial'
           varm: 'PCs'
           layers: 'cou

In [61]:
xd_cropped.show()