Hi. I'm Simon.
I built this because I got tired of watching the same mistakes over and over, my own work included: test-set leakage, preprocessing that bleeds across folds, models "assessed" repeatedly until the number looked right. Nobody sets out to cheat or deceive themselves.
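To make "preprocessing that bleeds across folds" concrete, here is a minimal sketch (plain NumPy, not this library) of the classic version of the mistake: standardization statistics computed on the full dataset before splitting, so the test rows leak into the transform. The shifted test distribution is an assumption chosen to make the leak visible.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(5.0, 2.0, size=(80, 1))
test = rng.normal(7.0, 2.0, size=(20, 1))  # shifted distribution: exactly what leakage hides
X = np.vstack([train, test])

# Leaky: mean/std estimated on ALL rows, test included, before any split.
mu_leaky, sd_leaky = X.mean(axis=0), X.std(axis=0)

# Clean: estimate the transform on the training split only,
# then apply the frozen parameters to the test split.
mu, sd = train.mean(axis=0), train.std(axis=0)
test_clean = (test - mu) / sd

# The two transforms differ, so a model scored with the leaky one
# has quietly "seen" the test distribution.
print(mu_leaky, mu)
```

Inside cross-validation the same rule applies per fold: fit the preprocessor on the training folds only, every time.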
The tools just made it too easy. ml tries to fix that. split() tracks where your data came from. evaluate() is the practice exam; assess() is the final, and it counts. cv() rotates folds within the dev set; the test set stays sealed. verify() lets you check the chain afterwards. The API doesn't lecture you, it just makes the wrong thing harder to do.
v1.1.0 is on PyPI now: 11 algorithms, an optional Rust backend, and the same workflow in Python and R. Julia support is next. There's a preprint if you want the theory behind it. One person, zero funding, built in evenings and weekends.
If you tried an earlier version and things broke: I'm sorry. The API moved too fast. I take stability seriously now, but I won't lie: feature engineering inside the CV loop (rolling windows, lag features) is next, and if the correct design requires a breaking change to prepare(), it will break. Correctness over comfort.
What would actually help me: try it on your data. Tell me where it confused you, where it crashed, where the API felt wrong. File an issue, no matter how small. If something is over-engineered, tell me. I'd rather hear it now than after it's baked in. I'm not looking for drive-by PRs.
I am looking for people who care about making ML less leaky. If that's you: welcome. Pick something, try to break it or improve it, and tell me what you think.
Cheers