We believe the large volume of single-cell and other omics data available in the public domain contains valuable information for characterizing cellular biological properties in different tissue microenvironments. Analyzing these data can provide substantial new insights for fundamental understanding in biochemistry and in molecular and cell biology.
The central goal of this knowledge base is to provide resources that pioneer the development of biologically explainable methodologies for single-cell data analysis. The highly complex experimental settings and the errors in biological omics data create strong demand for cutting-edge representation, transfer, and multi-task learning approaches that support biologically explainable data interpretation. At the same time, recently generated single-cell multi-omics data are comparable in scale to social network or imaging data, which could facilitate the benchmarking of newly developed AI and data mining frameworks.
Tackling these challenges requires an interdisciplinary educational approach that can translate state-of-the-art topics, problem setups, research language, benchmarks, and metrics between the fields of artificial intelligence, data mining, computational biology, and biotechnology.
We will bridge these research fields by providing a series of computational and biological challenges and a well-annotated data Testbed.
The educational component is also designed to facilitate the training of future scientists among college and high school STEM students. College and high school interns in the Biomedical Data Research Lab at Indiana University School of Medicine (PIs: Drs. Chi Zhang and Sha Cao) will be directly involved in the construction of this knowledge base, with a specific focus on the educational component.
- Developing a series of computational methods to predict gene sets, networks, or specific data representations in single-cell and other omics data that annotate the biological characteristics of different tissue contexts.
- Providing a high-quality Testbed
- Facilitating interactive methodology development among computer scientists, statisticians, and computational biologists
- Providing an educational environment for (1) computer scientists to learn the current challenges in computational modeling of biological omics data, (2) computational biologists to learn the capabilities of state-of-the-art AI methods, and (3) a broader range of trainees to learn the whole process and take part in practical applications.
- High school/undergraduate training (students' notes)
- Testbed with well-cleaned, paired benchmark sets and questions for the development of novel AI algorithms (summary and relevant publications)
- Exposing state-of-the-art AI methods and frameworks to computational biologists (summary of relevant publications)
- Benchmarking newly developed methods and pipelines against state-of-the-art methods
- Constructing a transcriptomics reference map to annotate cell types and cell-type-specific functions in different brain microenvironments
Ph.D. candidate, School of Medicine, Indiana University
Assistant Professor of Medical & Molecular Genetics, School of Medicine, Indiana University