# Getting started with astir

## 0. Load necessary libraries

In [1]:
from astir.data_readers import from_csv_yaml
import pandas as pd

## 1. Load data

We start by pointing to the expression data, stored as a CSV file, and the marker gene information, stored as a YAML file:

In [2]:
expression_mat_path = "../../../astir/tests/test-data/sce.csv"
yaml_marker_path = "../../../astir/tests/test-data/jackson-2020-markers.yml"

We can view both the expression data and marker data:

In [3]:
!head -n 20 ../../../astir/tests/test-data/jackson-2020-markers.yml


cell_states:
  RTK_signalling:
    - Her2
    - EGFR
  proliferation:
    - Ki-67
    - phospho Histone
  mTOR_signalling:
    - phospho mTOR
    - phospho S6
  apoptosis:
    - cleaved PARP
    - Cleaved Caspase3

cell_types:
  Stromal:
    - Vimentin
    - Fibronectin
  B cells:


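The marker file is a two-level mapping: the top-level keys `cell_states` and `cell_types` each map a name to a list of marker proteins. As a rough plain-Python sketch of that layout, the dictionary below copies the `cell_states` entries from the output above (the `cell_types` section is truncated there, so it is left out). The variable names are illustrative and are not part of astir.

```python
# Illustrative only: a hand-written copy of the `cell_states` section shown
# above, to make the name -> list-of-markers layout explicit.
cell_states = {
    "RTK_signalling": ["Her2", "EGFR"],
    "proliferation": ["Ki-67", "phospho Histone"],
    "mTOR_signalling": ["phospho mTOR", "phospho S6"],
    "apoptosis": ["cleaved PARP", "Cleaved Caspase3"],
}

# Count the states and collect the distinct marker names they refer to.
n_states = len(cell_states)
state_markers = sorted({m for markers in cell_states.values() for m in markers})

print(n_states)
print(state_markers)
```

The four keys here correspond to the four cell states reported when the `Astir` object is printed later in this tutorial.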
In [4]:
pd.read_csv(expression_mat_path, index_col=0)[['EGFR','E-Cadherin', 'CD45', 'Cytokeratin 5']].head()

| | EGFR | E-Cadherin | CD45 | Cytokeratin 5 |
|---|---|---|---|---|
| BaselTMA_SP41_186_X5Y4_3679 | 0.346787 | 0.938354 | 0.22773 | 0.095283 |
| BaselTMA_SP41_153_X7Y5_246 | 0.833752 | 1.364884 | 0.068526 | 0.124031 |
| BaselTMA_SP41_20_X12Y5_197 | 0.110006 | 0.177361 | 0.301222 | 0.05275 |
| BaselTMA_SP41_14_X1Y8_84 | 0.282666 | 1.122174 | 0.606941 | 0.093352 |
| BaselTMA_SP41_166_X15Y4_266 | 0.209066 | 0.402554 | 0.588273 | 0.064545 |
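To make the role of `index_col=0` concrete, here is a small self-contained sketch that reads two of the rows shown above from an in-memory string instead of the test file. The exact header layout of `sce.csv` (an unnamed first column holding the cell identifiers) is an assumption based on the output above, not something confirmed by the file itself.

```python
import io

import pandas as pd

# Assumed layout: the first column holds cell identifiers and the remaining
# columns hold per-protein expression values (two rows copied from above).
csv_text = """,EGFR,E-Cadherin,CD45,Cytokeratin 5
BaselTMA_SP41_186_X5Y4_3679,0.346787,0.938354,0.22773,0.095283
BaselTMA_SP41_153_X7Y5_246,0.833752,1.364884,0.068526,0.124031
"""

# index_col=0 turns the identifier column into the row index, so every
# remaining column is a numeric expression feature.
expr = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Select a subset of proteins by column name, as in the cell above.
subset = expr[["EGFR", "CD45"]]
print(subset)
```

Without `index_col=0`, the identifiers would be read as an ordinary column rather than as row labels.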


We can then create an `Astir` object using the `from_csv_yaml` function. For more data loading options, see the data loading tutorial.

In [5]:
ast = from_csv_yaml(expression_mat_path, marker_yaml=yaml_marker_path)
print(ast)

Astir object with 6 columns of cell types, 4 columns of cell states and 100 rows.


## 2. Fitting cell types

To fit cell types, call:

In [6]:
ast.fit_type()

training astir:   0%|          | 0/10 [00:00<?, ?epochs/s]

---------- 1/5 Cell Type Classification ----------


training astir: 100%|██████████| 10/10 [00:01<00:00,  7.64epochs/s]
training astir: 100%|██████████| 10/10 [00:00<00:00, 60.73epochs/s]
training astir:   0%|          | 0/10 [00:00<?, ?epochs/s]

Done!
---------- 2/5 Cell Type Classification ----------
Done!
---------- 3/5 Cell Type Classification ----------


training astir: 100%|██████████| 10/10 [00:00<00:00, 57.64epochs/s]
training astir: 100%|██████████| 10/10 [00:00<00:00, 59.49epochs/s]
training astir:   0%|          | 0/10 [00:00<?, ?epochs/s]

Done!
---------- 4/5 Cell Type Classification ----------
Done!
---------- 5/5 Cell Type Classification ----------


training astir: 100%|██████████| 10/10 [00:00<00:00, 55.89epochs/s]

Done!





We can then get cell type assignment probabilities by calling:

In [9]:
assignments = ast.get_celltype_probabilities()
assignments

| | Stromal | B cells | T cells | Macrophage | Epithelial (basal) | Epithelial (luminal) | Other |
|---|---|---|---|---|---|---|---|
| BaselTMA_SP41_186_X5Y4_3679 | 0.319220 | 0.131815 | 0.112066 | 0.319723 | 0.003948 | 0.006126 | 0.107103 |
| BaselTMA_SP41_153_X7Y5_246 | 0.004108 | 0.004771 | 0.003579 | 0.002551 | 0.092131 | 0.883193 | 0.009667 |
| BaselTMA_SP41_20_X12Y5_197 | 0.437544 | 0.106634 | 0.103579 | 0.239754 | 0.004641 | 0.004611 | 0.103237 |
| BaselTMA_SP41_14_X1Y8_84 | 0.041962 | 0.031774 | 0.539347 | 0.010087 | 0.010139 | 0.355061 | 0.011630 |
| BaselTMA_SP41_166_X15Y4_266 | 0.377869 | 0.092916 | 0.194404 | 0.263221 | 0.002855 | 0.003429 | 0.065306 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| BaselTMA_SP41_114_X13Y4_1057 | 0.035546 | 0.044499 | 0.041883 | 0.027560 | 0.140432 | 0.661050 | 0.049030 |
| BaselTMA_SP41_141_X11Y2_2596 | 0.026889 | 0.032341 | 0.014774 | 0.026627 | 0.196066 | 0.646148 | 0.057153 |
| BaselTMA_SP41_100_X15Y5_170 | 0.307612 | 0.152112 | 0.068384 | 0.162715 | 0.045153 | 0.058820 | 0.205205 |
| BaselTMA_SP41_14_X1Y8_2604 | 0.346417 | 0.112759 | 0.104856 | 0.340324 | 0.002034 | 0.002894 | 0.090717 |


Each row corresponds to a cell and each column to a cell type; each entry is the probability that the cell belongs to that cell type.

To fetch an array corresponding to the most likely cell type assignments, call

In [10]:
# TODO
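One way to obtain such labels without relying on a dedicated astir accessor is to take the column with the highest probability in each row. The sketch below does this with pandas on three rows copied from the table above. In the notebook, the same `idxmax` call could be applied to `assignments` directly, assuming it is a regular pandas DataFrame as its display suggests. This is a generic workaround, not necessarily how astir itself assigns cell types.

```python
import pandas as pd

columns = [
    "Stromal", "B cells", "T cells", "Macrophage",
    "Epithelial (basal)", "Epithelial (luminal)", "Other",
]

# Three rows copied from the probability table above.
probs = pd.DataFrame(
    [
        [0.319220, 0.131815, 0.112066, 0.319723, 0.003948, 0.006126, 0.107103],
        [0.004108, 0.004771, 0.003579, 0.002551, 0.092131, 0.883193, 0.009667],
        [0.437544, 0.106634, 0.103579, 0.239754, 0.004641, 0.004611, 0.103237],
    ],
    index=[
        "BaselTMA_SP41_186_X5Y4_3679",
        "BaselTMA_SP41_153_X7Y5_246",
        "BaselTMA_SP41_20_X12Y5_197",
    ],
    columns=columns,
)

# Sanity check: each row is a probability distribution over cell types.
row_sums = probs.sum(axis=1)

# Most likely cell type per cell, and the probability backing that call.
most_likely = probs.idxmax(axis=1)
confidence = probs.max(axis=1)

print(pd.DataFrame({"cell_type": most_likely, "probability": confidence}))
```

Keeping the maximum probability next to the label is useful because some calls are close: for the first cell, Macrophage (0.319723) and Stromal (0.319220) are nearly tied, whereas the second cell is clearly Epithelial (luminal).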

## 3. Fitting cell state

As before, to fit cell states, call:

In [11]:
ast.fit_state()

training astir: 100%|██████████| 100/100 [00:00<00:00, 789.04epochs/s]
training astir:   0%|          | 0/100 [00:00<?, ?epochs/s]

---------- 1/5 Cell State Classification ----------
---------- 2/5 Cell State Classification ----------


training astir: 100%|██████████| 100/100 [00:00<00:00, 796.65epochs/s]
training astir: 100%|██████████| 100/100 [00:00<00:00, 842.76epochs/s]
training astir:   0%|          | 0/100 [00:00<?, ?epochs/s]

---------- 3/5 Cell State Classification ----------
---------- 4/5 Cell State Classification ----------


training astir: 100%|██████████| 100/100 [00:00<00:00, 818.31epochs/s]
training astir: 100%|██████████| 100/100 [00:00<00:00, 766.65epochs/s]
training astir: 0epochs [00:00, ?epochs/s]

---------- 5/5 Cell State Classification ----------





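Cell state assignments can then be retrieved from the fitted object. In the astir version used to render this page, the call below fails because `get_cellstates` is not defined on the `Astir` object, so check the accessor name against your installed version: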

In [13]:
states = ast.get_cellstates()
states

AttributeError: 'Astir' object has no attribute 'get_cellstates'