Welcome!

ACES

ACES is a library for automatically extracting cohorts from event-stream datasets for downstream machine learning tasks. See below for an overview of ACES and how it could be useful in your workflows!

---
glob:
maxdepth: 2
---
README <readme>
Usage Guide <usage>
Task Examples <notebooks/examples>
Predicates DataFrame <notebooks/predicates>
MEDS Data Tutorial <notebooks/tutorial_meds>
Technical Details <technical>
Computational Profile <profiling>
Module API Reference <api/modules>
License <license>

Why ACES?

If you have a dataset and want to leverage it for machine learning tasks, the ACES ecosystem offers a streamlined and user-friendly approach. Here's how you can easily transform, prepare, and utilize your dataset with MEDS and ACES for efficient and effective machine learning:

I. Transform to MEDS

  • Simplicity: Converting your dataset to the Medical Event Data Standard (MEDS) is more straightforward and user-friendly than converting to other Common Data Models (CDMs).
  • Minimal Bias: The conversion keeps your data as close to its raw form as possible, which limits the biases introduced along the way.
  • MEDS-ETL: See MEDS-ETL for detailed instructions and ETLs to transform your dataset into the MEDS format!
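
As a rough illustration of what this conversion targets, the sketch below flattens one wide raw record into long-format event rows. The column names (`subject_id`, `time`, `code`, `numeric_value`) are assumed from the MEDS schema and should be checked against the MEDS documentation; the raw record layout, the code strings, and the `to_events` helper are hypothetical.

```python
from datetime import datetime


def to_events(raw_row):
    """Flatten one wide raw record into long-format, MEDS-style event rows."""
    subject_id = raw_row["patient"]
    time = datetime.fromisoformat(raw_row["charted_at"])
    events = []
    for name, value in raw_row["measurements"].items():
        events.append(
            {
                "subject_id": subject_id,       # assumed MEDS column name
                "time": time,                   # assumed MEDS column name
                "code": f"LAB//{name}",         # hypothetical code vocabulary
                "numeric_value": float(value),  # assumed MEDS column name
            }
        )
    return events


# Hypothetical raw record with two measurements charted at the same time.
raw = {
    "patient": 17,
    "charted_at": "2021-03-04T08:30:00",
    "measurements": {"lactate": 2.1, "creatinine": 0.9},
}
events = to_events(raw)
```

Each measurement becomes its own row, so the raw values are carried over unchanged rather than being mapped onto a fixed set of harmonized fields.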

II. Identify Predicates

  • Task-Specific Concepts: Identify the predicates (data concepts) required for your specific machine learning tasks.
  • Pre-Defined Criteria: Utilize our pre-defined criteria across various tasks and clinical areas to expedite this process.
  • MEDS-DEV: Access our benchmark of tasks to find relevant predicates!

III. Set Dataset-Agnostic Criteria

  • Standardization: Combine the identified predicates with standardized, dataset-agnostic criteria files.
  • Examples: Refer to the MEDS-DEV examples for guidance on how to structure your criteria files for your private datasets!
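
One way to picture this combination: the task criteria refer to predicates only by name, while a dataset-specific mapping supplies the codes. The sketch below merges the two as plain dictionaries and checks that every referenced predicate is defined. The key names (`predicates`, `trigger`, `windows`, `has`, `label`) and the count-bound format are assumptions made for illustration, and `build_config` is a hypothetical helper; consult the Usage Guide and the MEDS-DEV examples for the actual file schema.

```python
# Dataset-agnostic task criteria (assumed structure): predicates by name only.
TASK_CRITERIA = {
    "trigger": "admission",
    "windows": {
        "input": {"has": {"admission": (1, None)}},
        "target": {"has": {"death": (None, 0)}, "label": "discharge"},
    },
}

# Dataset-specific predicate definitions (hypothetical codes).
MY_PREDICATES = {
    "admission": {"code": "ADMISSION"},
    "discharge": {"code": "DISCHARGE"},
    "death": {"code": "DEATH"},
}


def referenced_predicates(criteria):
    """Collect every predicate name the criteria mention."""
    names = {criteria["trigger"]}
    for window in criteria["windows"].values():
        names.update(window.get("has", {}))
        if "label" in window:
            names.add(window["label"])
    return names


def build_config(criteria, dataset_predicates):
    """Merge criteria with predicates, failing if any predicate is undefined."""
    missing = referenced_predicates(criteria) - set(dataset_predicates)
    if missing:
        raise KeyError(f"undefined predicates: {sorted(missing)}")
    return {"predicates": dict(dataset_predicates), **criteria}


config = build_config(TASK_CRITERIA, MY_PREDICATES)
```

Keeping the two pieces separate means the same criteria can be reused on a private dataset by swapping in a different predicate mapping.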

IV. Run ACES

  • Run the ACES Command-Line Interface tool (aces-cli) to extract the cohort for your task. See the Usage Guide for more information!
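
For scripted pipelines, the invocation can be assembled in Python. The sketch below only builds and prints the command without running it. The argument names (`data.path`, `data.standard`, `cohort_dir`, `cohort_name`) are assumed here and should be verified against the Usage Guide; all paths and the task name are placeholders, and `build_aces_command` is a hypothetical helper.

```python
import shlex


def build_aces_command(data_path, cohort_dir, cohort_name, standard="meds"):
    """Assemble an aces-cli invocation as an argument list (not executed here)."""
    return [
        "aces-cli",
        f"data.path={data_path}",        # assumed argument name
        f"data.standard={standard}",     # assumed argument name
        f"cohort_dir={cohort_dir}",      # assumed argument name
        f"cohort_name={cohort_name}",    # assumed argument name
    ]


# Placeholder paths and task name.
command = build_aces_command("/path/to/meds/data", "/path/to/criteria", "my_task")
print(shlex.join(command))
```

Once the arguments have been checked against the Usage Guide, the list can be passed to `subprocess.run(command, check=True)`.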

V. Run MEDS-Tab

  • Painless Reproducibility: Use MEDS-Tab to obtain comparable, reproducible, and well-tuned XGBoost results tailored to your dataset-specific feature space!

By following these steps, you can transform your dataset, define the necessary criteria, and leverage powerful machine learning tools within the ACES and MEDS ecosystem. This approach both simplifies the process and supports high-quality, reproducible results for your machine-learning-for-health projects. For new datasets in a reasonable raw format, Steps I-V can reliably be completed in no more than a week of full-time human effort!