In [None]:
### ASPIRE v0.9.1 Demo

### Data Sources

`Source` objects represent datasets on disk. On creation, metadata is extracted from the dataset. The image data itself is loaded and processed as needed, in batches, to reduce memory use.
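
To make the lazy, batched loading concrete, here is a minimal sketch. `ToyLazySource` is a hypothetical stand-in written for this illustration, not ASPIRE's `Source` class: it stores only lightweight metadata at construction and produces pixel data when a slice is requested.

```python
import numpy as np


class ToyLazySource:
    """Hypothetical stand-in for a lazily loaded image stack (not ASPIRE's Source)."""

    def __init__(self, n_images, size, seed=0):
        # Only lightweight metadata is stored at construction time.
        self.n = n_images
        self.L = size
        self.seed = seed

    def images(self, start, num):
        # Pixel data is produced only for the requested slice.
        stop = min(start + num, self.n)
        out = np.empty((stop - start, self.L, self.L))
        for k, i in enumerate(range(start, stop)):
            # Seeding per index makes image i reproducible across calls.
            rng = np.random.default_rng(self.seed + i)
            out[k] = rng.standard_normal((self.L, self.L))
        return out

    def batches(self, batch_size):
        # Yield the stack in chunks so at most batch_size images are held at once.
        for start in range(0, self.n, batch_size):
            yield self.images(start, batch_size)


src = ToyLazySource(n_images=10, size=8)
batch_shapes = [b.shape for b in src.batches(batch_size=4)]

# A statistic over the whole stack can be accumulated one batch at a time.
running_mean = sum(b.sum(axis=0) for b in src.batches(4)) / src.n
```

The running mean is accumulated without ever holding all ten images in memory, which is the point of batching.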

#### Pre-processed with RELION

Many datasets are pre-processed with RELION (particle picking, denoising, and CTF estimation) before being uploaded to EMPIAR. The metadata, including CTF parameters, is stored in a `.star` file, which contains all the information needed to load the dataset.
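
For readers unfamiliar with the format, the sketch below parses a tiny STAR document by hand. The parser (`parse_star_loop`) and the example text are illustrative assumptions; it handles only a single `loop_` block and is not how `RelionSource` reads the file.

```python
def parse_star_loop(text):
    """Parse a single `loop_` block of a STAR document into a list of dicts."""
    labels, rows, in_loop = [], [], False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line == "loop_":
            in_loop = True
        elif in_loop and line.startswith("_"):
            # Label lines look like "_rlnDefocusU #2"; keep only the name.
            labels.append(line.split()[0])
        elif in_loop and labels:
            # Data rows are whitespace-separated values, one per label.
            rows.append(dict(zip(labels, line.split())))
    return rows


# Made-up example content in the usual RELION layout.
example = """
data_particles

loop_
_rlnImageName #1
_rlnDefocusU #2
_rlnDefocusV #3
000001@Extract/stack.mrcs 10500.0 10300.0
000002@Extract/stack.mrcs 9800.0 9650.0
"""

particles = parse_star_loop(example)

# The image name encodes "index@stack file".
index, stack = particles[0]["_rlnImageName"].split("@")
```

Each row ties one particle image to its per-particle parameters, which is why the STAR file alone is enough to locate and describe every image.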


In [None]:
# Load Relion Source
from aspire.source import RelionSource
import os
import numpy as np

# Put your path to the ASPIRE-0.9.1 Demo Repo here
root_folder = ""
rln_data_folder = os.path.join(root_folder, "relion_data")
starfile_path = os.path.join(rln_data_folder, "Polish/job028/shiny.star")

rln_src = RelionSource(starfile_path, data_folder=rln_data_folder, max_rows=4000, pixel_size=1.2)

Peek at the first 10 images. This sample data, of the beta-galactosidase enzyme, comes from the RELION tutorial dataset.

In [None]:
# Load and display raw images
rln_src.images(0,10).show()

Source objects carry metadata for each particle, stored as rows of a pandas DataFrame. We can inspect all the metadata that was read from the RELION STAR file.

In [None]:
## Show metadata
rln_src._metadata

Source objects have built-in operations. We can add a downsampling operation to the source, which will be applied when we ask the Source to load images via the `images()` method.
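
As background, one common way to downsample is to crop the image's Fourier transform. The sketch below assumes that approach and uses plain NumPy; `fourier_downsample` is a hypothetical helper, and ASPIRE's own implementation may differ in centering and scaling details.

```python
import numpy as np


def fourier_downsample(img, new_size):
    """Downsample a square image by keeping its central new_size x new_size frequencies."""
    L = img.shape[0]
    # Centered 2D spectrum: the zero frequency sits at index L // 2.
    spec = np.fft.fftshift(np.fft.fft2(img))
    start = L // 2 - new_size // 2
    crop = spec[start:start + new_size, start:start + new_size]
    small = np.fft.ifft2(np.fft.ifftshift(crop))
    # Rescale so that the mean intensity of the image is preserved.
    return small.real * (new_size / L) ** 2


# A low-frequency test pattern survives downsampling from 128 to 64 pixels.
L = 128
x = np.arange(L)
img = np.cos(2 * np.pi * 2 * x / L)[:, None] * np.ones((1, L))
small = fourier_downsample(img, 64)
expected = np.cos(2 * np.pi * 2 * np.arange(64) / 64)[:, None] * np.ones((1, 64))
```

Frequencies above the new Nyquist limit are discarded, so fine detail is lost while low-frequency content is reproduced on the coarser grid.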

In [None]:
# Downsampling
rln_src.downsample(64)
rln_src.images(0,10).show()

Since CTF information from RELION has been loaded, we can correct for CTF via phase flipping:

In [None]:
# Phase flip
rln_src.phase_flip()
rln_src.images(0,10).show()
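
Conceptually, phase flipping multiplies each image's Fourier transform by the sign of the CTF, so that negative lobes of the transfer function no longer invert contrast. The NumPy sketch below illustrates this with a made-up radial CTF (`toy_ctf` and its `defocus_term` are assumptions for the illustration, not physical parameters or ASPIRE code).

```python
import numpy as np


def toy_ctf(size, defocus_term):
    """Radially symmetric toy CTF on an FFT-ordered frequency grid."""
    k = np.fft.fftfreq(size)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    # The constant 0.1 loosely mimics an amplitude-contrast phase offset.
    return np.sin(defocus_term * (kx**2 + ky**2) + 0.1)


def phase_flip(img, ctf):
    """Multiply the image spectrum by sign(CTF), leaving amplitudes unchanged."""
    return np.fft.ifft2(np.fft.fft2(img) * np.sign(ctf)).real


rng = np.random.default_rng(0)
clean = rng.standard_normal((64, 64))
ctf = toy_ctf(64, defocus_term=200.0)

# Simulate imaging: the clean spectrum is multiplied by the CTF.
observed = np.fft.ifft2(np.fft.fft2(clean) * ctf).real
flipped = phase_flip(observed, ctf)

# After flipping, the effective transfer function is |CTF|, which is non-negative.
flipped_spectrum = np.fft.fft2(flipped)
expected_spectrum = np.fft.fft2(clean) * np.abs(ctf)
```

Phase flipping corrects only the sign; the amplitude attenuation near the CTF zeros remains.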

Assuming an anisotropic noise distribution in these images, we can use ASPIRE's noise whitening tools to estimate and then whiten the noise:

In [None]:
# Whiten
from aspire.noise import AnisotropicNoiseEstimator
estimator = AnisotropicNoiseEstimator(rln_src)
rln_src.whiten(estimator.filter)
rln_src.images(0,10).show()
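
The idea behind whitening can be sketched with synthetic data: estimate the noise power spectral density (PSD), then divide each image's spectrum by its square root. This NumPy example uses simulated pure-noise images and a simple averaged periodogram; it is an assumption-laden illustration, not the estimator ASPIRE uses on real particles.

```python
import numpy as np

rng = np.random.default_rng(1)
n, L = 2000, 16

# Build colored noise by filtering white noise with a known, non-flat transfer function.
k = np.fft.fftfreq(L)
kx, ky = np.meshgrid(k, k, indexing="ij")
color = 1.0 / (1.0 + 20.0 * (kx**2 + ky**2))
white = rng.standard_normal((n, L, L))
noise = np.fft.ifft2(np.fft.fft2(white) * color).real

# Estimate the noise PSD by averaging periodograms over all images.
psd = np.mean(np.abs(np.fft.fft2(noise)) ** 2, axis=0) / L**2

# Whiten: divide each spectrum by the square root of the estimated PSD.
whitened = np.fft.ifft2(np.fft.fft2(noise) / np.sqrt(psd)).real

# The PSD of the whitened images is flat, with unit power at every frequency.
psd_after = np.mean(np.abs(np.fft.fft2(whitened)) ** 2, axis=0) / L**2
```

Because the same images are used to estimate and to apply the filter, the resulting spectrum is flat by construction.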

#### Command line option: `aspire preprocess`
The process we did above can also be done from the command line via:

```
aspire preprocess \
    --data_folder=/Users/langfield/ASPIRE_demo/relion_data \
    --starfile_in=/Users/langfield/ASPIRE_demo/relion_data/Polish/job028/shiny.star \
    --starfile_out=/Users/langfield/ASPIRE_demo/relion_data/preprocessed.star \
    --max_rows=4000 \
    --downsample=64 \
    --phase_flip \
    --whiten
```

Now that we have done some preprocessing, we can showcase Covariance Wiener Filter (CWF) denoising. This step generates
denoised particle images via the CWF method. The denoised images are used in the classification step of class averaging.
The averaging itself, however, uses the original preprocessed images.

In [None]:
# Cov2D denoising and peek
from aspire.denoising import DenoiserCov2D
cwf_denoiser = DenoiserCov2D(rln_src)

classification_src = cwf_denoiser.denoise()
classification_src.images(0,10).show()
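
For intuition only, the sketch below shows Wiener shrinkage in a deliberately simplified setting: independent coefficients with known signal and noise variances. This is an assumption made for illustration. The actual CWF method estimates a covariance from the data and accounts for the CTF, which this toy example does not attempt.

```python
import numpy as np

rng = np.random.default_rng(2)
n_coef, n_img = 50, 5000

# Assumed model: each coefficient has its own signal variance; noise is white with variance 1.
signal_var = np.linspace(0.1, 10.0, n_coef)
noise_var = 1.0
signal = rng.standard_normal((n_img, n_coef)) * np.sqrt(signal_var)
noisy = signal + rng.standard_normal((n_img, n_coef)) * np.sqrt(noise_var)

# Wiener weights keep coefficients where signal dominates and shrink noisy ones.
weights = signal_var / (signal_var + noise_var)
denoised = weights * noisy

# Compare mean squared error before and after shrinkage.
mse_noisy = np.mean((noisy - signal) ** 2)
mse_denoised = np.mean((denoised - signal) ** 2)
```

Coefficients with low signal-to-noise ratio are pulled toward zero, which lowers the overall mean squared error at the cost of some bias.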

#### Command line option: `aspire denoise`
CWF denoising can also be done from the command line via:

```
aspire denoise \
    --data_folder=/Users/langfield/ASPIRE_demo/relion_data \
    --starfile_in=/Users/langfield/ASPIRE_demo/relion_data/preprocessed.star \
    --starfile_out=/Users/langfield/ASPIRE_demo/relion_data/preprocessed_denoised.star \
    --max_rows=4000 \
    --max_resolution=64
```

#### Raw micrographs with particle locations

Some data sources come to us without this preprocessing, and we must do additional work to prepare them for later stages of the ASPIRE pipeline.

In [None]:
### Micrograph
# Take a look at a raw micrograph
import mrcfile
import matplotlib.pyplot as plt
import os

mrc_data_folder = os.path.join(root_folder, "micrographs")
mrc_filename = os.path.join(mrc_data_folder, "sample.mrc")
with mrcfile.open(mrc_filename, mode="r") as mrc:
    mrc_img = mrc.data

plt.figure(figsize=(10,10))
plt.imshow(mrc_img, cmap="gray")
plt.show()


Run APPLE, ASPIRE's particle picking tool:

In [None]:
# Picked particles
from aspire.apple import Apple
apple_picker = Apple(particle_size=78, output_dir=mrc_data_folder)
centers, particles_img = apple_picker.process_micrograph(mrc_filename)


Display the picked particles:

In [None]:
## Display picked particles
plt.figure(figsize=(10,10))
plt.imshow(particles_img, cmap="gray")
plt.show()

#### Command line option: `aspire apple`
Particle picking can also be done from the command line via:

```
aspire apple \
    --mrc_path=/Users/langfield/ASPIRE_demo/micrographs/sample.mrc \
    --output_dir=/Users/langfield/ASPIRE_demo/micrographs \
    --create_jpg \
    --particle_size=78
```

This has now created a STAR file containing a list of particle center coordinates. We can represent this new data source with a `CentersCoordinateSource`.

In [None]:
# CoordinateSource
from aspire.source import CentersCoordinateSource

coords_filename = os.path.join(mrc_data_folder, "sample_applepick.star")

ctrs_src = CentersCoordinateSource(files=[(mrc_filename, coords_filename)], particle_size=78)
ctrs_src.images(0,10).show()
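
What a coordinate-based source has to do, at minimum, is cut a box around each picked center. The sketch below shows that step in NumPy. `crop_particles` is a hypothetical helper, and the `(x, y)` convention (x as column, y as row) and the choice to skip boxes crossing the edge are assumptions of this example.

```python
import numpy as np


def crop_particles(micrograph, centers, particle_size):
    """Cut square boxes of side particle_size around each (x, y) center."""
    half = particle_size // 2
    height, width = micrograph.shape
    boxes = []
    for x, y in centers:
        top, left = y - half, x - half
        # Skip boxes that would extend past the micrograph edge.
        if top < 0 or left < 0:
            continue
        if top + particle_size > height or left + particle_size > width:
            continue
        boxes.append(micrograph[top:top + particle_size, left:left + particle_size])
    if not boxes:
        return np.empty((0, particle_size, particle_size), dtype=micrograph.dtype)
    return np.stack(boxes)


# Synthetic 200 x 300 micrograph whose pixel values encode their position.
micrograph = np.arange(200 * 300, dtype=float).reshape(200, 300)
centers = [(50, 60), (10, 100), (250, 150)]  # (x, y) pixel coordinates
stack = crop_particles(micrograph, centers, particle_size=78)
```

The second center lies too close to the left edge for a 78-pixel box, so only two particles are returned.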

Note that the `CentersCoordinateSource` has blank metadata, since the only information we have gleaned from the micrograph is the particle locations.

In [None]:
# metadata
ctrs_src._metadata

We can use ASPIRE's own CTF estimator tool:

In [None]:
# Estimate the CTF of this micrograph
from aspire.ctf import estimate_ctf
results = estimate_ctf(
    data_folder=mrc_data_folder,
    pixel_size=1,
    cs=2.0,
    amplitude_contrast=0.07,
    voltage=300.0,
    num_tapers=2,
    psd_size=512,
    g_min=30.0,
    g_max=5.0,
    output_dir=mrc_data_folder,
    dtype=np.float64,
)
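
To show how the microscope parameters above enter a CTF model, here is a sketch of a one-dimensional radial CTF under one common convention. Sign and defocus conventions differ between packages, the defocus value is made up, and `electron_wavelength` and `ctf_1d` are hypothetical helpers, not part of `aspire.ctf`.

```python
import numpy as np


def electron_wavelength(voltage_kv):
    """Relativistic electron wavelength in angstroms for a voltage given in kV."""
    volts = voltage_kv * 1e3
    return 12.2643247 / np.sqrt(volts * (1.0 + 0.978466e-6 * volts))


def ctf_1d(k, defocus, cs_mm, voltage_kv, amplitude_contrast):
    """Radial CTF at spatial frequency k (1/angstrom); defocus in angstroms, Cs in mm."""
    lam = electron_wavelength(voltage_kv)
    cs = cs_mm * 1e7  # convert mm to angstroms
    # Phase aberration: defocus term minus spherical aberration term.
    chi = np.pi * lam * defocus * k**2 - 0.5 * np.pi * cs * lam**3 * k**4
    w = amplitude_contrast
    return np.sqrt(1 - w**2) * np.sin(chi) + w * np.cos(chi)


# Frequencies up to Nyquist for 1 angstrom pixels; the defocus is an assumed value.
k = np.linspace(0.0, 0.5, 501)
curve = ctf_1d(k, defocus=15000.0, cs_mm=2.0, voltage_kv=300.0, amplitude_contrast=0.07)
wavelength = electron_wavelength(300.0)
```

Estimating the CTF amounts to finding the defocus (and astigmatism) for which a model like this best matches the oscillations in the micrograph's power spectrum.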

Peek at the estimated CTF

In [None]:
# Load the estimated CTF image (sample.ctf), closing the file afterwards
with mrcfile.open(os.path.join(mrc_data_folder, "sample.ctf")) as mrc:
    ctf_data = mrc.data
plt.figure(figsize=(10,10))
plt.imshow(ctf_data, cmap="gray")
plt.show()

#### Command line option: `aspire estimate-ctf`
CTF estimation can also be done from the command line via:

```
aspire estimate-ctf \
    --data_folder=/Users/langfield/ASPIRE_demo/micrographs \
    --pixel_size=1 \
    --cs=2.0 \
    ... \
    --output_dir=/Users/langfield/ASPIRE_demo/micrographs
```

#### Command line option: `aspire extract-particles`
Loading a micrograph source, cropping particles, and saving them to a STAR file can be done from the command line via:

```
aspire extract-particles \
    --mrc_paths=/Users/langfield/ASPIRE_demo/micrographs/sample.mrc \
    --coord_paths=/Users/langfield/ASPIRE_demo/micrographs/sample_applepick.star \
    --starfile_out=/Users/langfield/ASPIRE_demo/micrographs/saved_source.star \
    --centers \
    --downsample=64 \
    --whiten \
    --particle_size=78
```

### Image Bases

Now we can demonstrate the speed and accuracy of the Fourier-Bessel and Prolate Spheroidal bases for 2D images.

In [None]:
from aspire.basis import FBBasis2D, PSWFBasis2D
from aspire.image import Image

# Get the first 10 denoised particles as an ASPIRE Image
cwf_particles = classification_src.images(0,10)
fb_basis = FBBasis2D((64,64), dtype=np.float64)
fb_coeffs = fb_basis.evaluate_t(cwf_particles.asnumpy())
fb_imgs = fb_basis.evaluate(fb_coeffs)
# fb_imgs is a NumPy array, which we can convert into an ASPIRE Image object
print("Original images:")
cwf_particles.show()
print("Fourier-Bessel images")
Image(fb_imgs).show()
print("Differences")
(cwf_particles - Image(fb_imgs)).show()
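
The `evaluate_t` / `evaluate` round trip above can be understood as projecting onto a basis and then synthesizing from the coefficients. The sketch below mimics that with a random orthonormal basis in NumPy. The basis `B` and the two helper functions are hypothetical and only borrow the notebook's naming; real Fourier-Bessel or PSWF bases are constructed very differently.

```python
import numpy as np

rng = np.random.default_rng(3)
L, n_basis = 16, 40

# Hypothetical orthonormal basis: the columns of B span a 40-dimensional subspace of R^(L*L).
B, _ = np.linalg.qr(rng.standard_normal((L * L, n_basis)))


def evaluate_t(images):
    # Analysis: inner product of each flattened image with each basis function.
    return images.reshape(len(images), -1) @ B


def evaluate(coeffs):
    # Synthesis: linear combination of basis functions, reshaped to images.
    return (coeffs @ B.T).reshape(len(coeffs), L, L)


images = rng.standard_normal((5, L, L))
recon = evaluate(evaluate_t(images))
residual = images - recon

# An image already in the span of the basis is reproduced to machine precision.
in_span = evaluate(rng.standard_normal((1, n_basis)))
roundtrip_error = np.abs(evaluate(evaluate_t(in_span)) - in_span).max()
```

The "Differences" images in the notebook play the role of `residual` here: they show whatever part of each image the chosen basis cannot represent.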


Now we'll do the same with the Prolate Spheroidal basis:

In [None]:
ps_basis = PSWFBasis2D((64,64), dtype=np.float64)
ps_coeffs = ps_basis.evaluate_t(cwf_particles.asnumpy())
ps_imgs = ps_basis.evaluate(ps_coeffs)
print("Original images:")
cwf_particles.show()
print("Prolate Spheroidal images:")
ps_imgs.show()
print("Differences")
(cwf_particles - ps_imgs).show()