Breon Schmidt edited this page Jun 30, 2020 · 10 revisions

Welcome to ALLSorts!

ALLSorts is a B-Cell Acute Lymphoblastic Leukemia (B-ALL) subtype classifier that takes gene expression counts and makes predictions across 18 molecular subtypes and 5 meta-subtypes! It is a Python-based implementation utilising the incredible scikit-learn.

Hold up - I need some background here...

  1. What is B-Cell ALL? B-ALL is a form of Acute Lymphoblastic Leukemia (ALL), the most common paediatric cancer. It occurs when the maturation of B-Cell lymphoblasts is arrested, leading to their gradual accumulation.

  2. Eep! What are subtypes then? It turns out that B-ALL can arise through a variety of causal mechanisms, known as subtypes, with some conveying a higher risk than others. The World Health Organisation (WHO) has outlined 9 subtypes that encapsulate these distinct mechanisms (2 of which are provisional entries) [1]. However, a recent study from St. Jude Children's Research Hospital has revealed the existence of perhaps 23 [2]!

  3. And a classifier helps... how? Given that treatment can be adjusted based on which subtype of B-ALL a patient has, it would be very useful to have some way of identifying it! With RNA Sequencing (RNA-Seq) we can quantify the activity of genes. And, as it turns out, there are distinct patterns across genes that are indicative of different subtypes. If we can learn the pattern that defines each subtype, we can build a pipeline for identifying the subtype of a new sample. A classifier is a supervised machine learning method that attempts to do just that. In short, it attempts to learn a model from labelled examples, after which it can perform a predictive task. In this case, that task is assigning a subtype to a sample.

  4. ...Meta-subtypes? One interesting feature of subtypes is that some are more closely related than others: some are phenocopies of another established group. For example, Ph and Ph-like differ only in which causal mechanism creates the similar phenotype (hence the *-like). ALLSorts groups these similar subtypes into 5 meta-subtypes and performs classification hierarchically, i.e. B-ALL Sample > Ph Group > Ph / Ph-like.
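
Want to see the hierarchical idea in code? Here is a minimal sketch, not the ALLSorts implementation. It assumes a toy nearest-centroid classifier as a stand-in for the real scikit-learn models, two made-up "genes" per sample, and a hypothetical "Other" subtype that sits outside any meta-subtype. Only the Ph Group > Ph / Ph-like relationship comes from the text above; every class name, function name and number here is invented for illustration.

```python
from collections import defaultdict
from math import dist

# Assumed hierarchy: meta-subtype -> member subtypes (only the Ph group is from the text).
HIERARCHY = {"Ph Group": ["Ph", "Ph-like"]}


class NearestCentroid:
    """Toy stand-in classifier: predicts the label whose mean vector is closest."""

    def fit(self, X, y):
        rows_by_label = defaultdict(list)
        for row, label in zip(X, y):
            rows_by_label[label].append(row)
        # Column-wise mean of the training rows for each label.
        self.centroids_ = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in rows_by_label.items()
        }
        return self

    def predict_one(self, row):
        return min(self.centroids_, key=lambda label: dist(row, self.centroids_[label]))


class HierarchicalClassifier:
    """Stage 1 picks a meta-subtype (or a standalone subtype); stage 2 refines within it."""

    def __init__(self, hierarchy):
        self.hierarchy = hierarchy
        self.parent_of = {
            child: parent for parent, children in hierarchy.items() for child in children
        }

    def fit(self, X, y):
        # Stage 1: relabel grouped subtypes with their meta-subtype.
        top_labels = [self.parent_of.get(label, label) for label in y]
        self.top_ = NearestCentroid().fit(X, top_labels)
        # Stage 2: one classifier per meta-subtype, trained only on its own members.
        self.children_ = {}
        for parent in self.hierarchy:
            members = [(row, label) for row, label in zip(X, y)
                       if self.parent_of.get(label) == parent]
            self.children_[parent] = NearestCentroid().fit(
                [row for row, _ in members], [label for _, label in members]
            )
        return self

    def predict_one(self, row):
        top = self.top_.predict_one(row)
        if top in self.children_:
            return [top, self.children_[top].predict_one(row)]
        return [top]


# Made-up expression values for two hypothetical genes per sample.
X = [[9.0, 1.0], [8.0, 2.0], [9.0, 5.0], [8.0, 6.0], [1.0, 1.0], [2.0, 2.0]]
y = ["Ph", "Ph", "Ph-like", "Ph-like", "Other", "Other"]

model = HierarchicalClassifier(HIERARCHY).fit(X, y)
print(" > ".join(["B-ALL Sample"] + model.predict_one([8.5, 5.0])))
# prints: B-ALL Sample > Ph Group > Ph-like
```

The point of the two stages is that the first decision only has to separate broad groups, while the second classifier sees nothing but the samples inside one group and can focus on the subtler difference between, say, Ph and Ph-like.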

Look... I just came here to use the classifier...

Fine. You can click on the various Wiki pages in the sidebar to get you started. You might want to start with installing ALLSorts on your system.

Is development ongoing?

It's a part of my PhD project and will be updated periodically. But it's exciting that it exists publicly now for you to use, no? Keep an eye on Projects to get an idea of what I'm up to.

References?

[1] Arber, D. A., Orazi, A., Hasserjian, R., Thiele, J., Borowitz, M. J., Le Beau, M. M., … Vardiman, J. W. (2016). The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood, 127(20), 2391–2405.

[2] Gu, Z., Churchman, M. L., Roberts, K. G., Moore, I., Zhou, X., Nakitandwe, J., … Mullighan, C. G. (2019). PAX5-driven subtypes of B-progenitor acute lymphoblastic leukemia. Nature Genetics, 51(2), 296–307.