Suggestion: Standardize `model.predict()` output format across all models

Hi team,

Thanks for the great work on biolearn. The modular model interface and `GeoData` integration make it really convenient to work with methylation-based predictors.

While using multiple models in a batch prediction workflow, I noticed that the `model.predict()` output format varies depending on the model:

1. Most models return a DataFrame where:
   - Sample IDs are stored as the row index (pandas `.index`)
   - The predicted value is stored in a single "Predicted" column

2. Some models (e.g. `Zhang_10`) return sample IDs as a regular "id" or "SampleID" column rather than as the DataFrame index

3. Models like `TwelveCellDeconvoluteBloodEPIC` return **transposed** matrices:
   - Rows = cell types
   - Columns = sample IDs (requiring a `.T` to match other models)

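To ensure downstream code (e.g. batch pipelines, `pd.concat`, or tidy reshaping) works reliably, would it be possible to standardize the `.predict()` output for all models, for example on the layout most models already use (sample IDs as the index, predictions as columns)?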
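
For illustration, here is a rough sketch of the kind of normalization I currently do on my side. It uses only pandas and toy frames with made-up values that mimic the three shapes above; `normalize_prediction`, the `sample_id` index name, and the `model_a`/`model_b`/`deconv` labels are hypothetical names of mine, not part of biolearn.

```python
import pandas as pd

# Column names that some models use for sample IDs (taken from the cases above).
ID_COLUMNS = ("id", "SampleID")


def normalize_prediction(df, sample_ids):
    """Return a copy of df indexed by sample ID, one column per predicted value.

    Hypothetical helper, not a biolearn function.
    """
    sample_ids = pd.Index(sample_ids)
    out = df.copy()
    for col in ID_COLUMNS:
        if col in out.columns:
            # Case 2: sample IDs are stored in a regular column.
            out = out.set_index(col)
            break
    else:
        # Case 3: sample IDs are the columns, so transpose.
        rows_are_samples = out.index.isin(sample_ids).all()
        cols_are_samples = out.columns.isin(sample_ids).all()
        if not rows_are_samples and cols_are_samples:
            out = out.T
    # Align row order with the requested samples and name the index.
    return out.reindex(sample_ids).rename_axis("sample_id")


# Toy frames mimicking the three output shapes (values are made up).
samples = ["S1", "S2"]
indexed = pd.DataFrame({"Predicted": [0.1, 0.2]}, index=samples)
id_column = pd.DataFrame({"id": samples, "Predicted": [0.3, 0.4]})
transposed = pd.DataFrame(
    {"S1": [0.6, 0.4], "S2": [0.7, 0.3]}, index=["CellA", "CellB"]
)

raw_outputs = {"model_a": indexed, "model_b": id_column, "deconv": transposed}
normalized = {
    name: normalize_prediction(frame, samples)
    for name, frame in raw_outputs.items()
}

# With a shared index, the outputs line up column-wise without special cases.
combined = pd.concat(normalized, axis=1)
print(combined)
```

If `.predict()` returned this layout directly, a per-model workaround like the one above would no longer be needed.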

Really appreciate all the effort you've put into this framework!

Suggestion: Standardize `model.predict()` output format across all models #161
