Marco/ruff and ty checks#153
Conversation
* tentative maturin update * bumping the version * reverting the pyproject change, not important * [chore] Cleaner wds (#147) * Some cleanup + better benchmarks * cargo fmt, broken dev setup * removing a silly change * better workload spread while wds, tarball extraction is still the bottleneck * better workload spread while wds, tarball extraction is still the bottleneck * adding a small plot helper * Renaming a confusing variable + adding more benchmarks * Updating pyproject + PD12M bench on Epyc * Fix the missing attributes from wds samples * slightly cleaner code * Adding a unit test + cleaner error handling * Adding backtrace to the tests * code review * Updating the version post-merge * [infra] Update the build process (#148) * tentative maturin update * bumping the version * reverting the pyproject change, not important
| repos: | ||
| - repo: https://github.com/pre-commit/pre-commit-hooks | ||
| rev: v4.3.0 | ||
| - repo: builtin |
There was a problem hiding this comment.
https://prek.j178.dev/builtin/
faster though not compatible with standard precommit
There was a problem hiding this comment.
ooh new to me, could be useful at Mistral actually / big pre-commit users and it's quite slow
| "prek>=0.2.29", | ||
| ] | ||
|
|
||
| dev = [ | ||
| "matplotlib", | ||
| "pandas", | ||
| "prek>=0.2.29", |
There was a problem hiding this comment.
actually I'm not sure how we want to split the dependencies between dev and test, if there should be overlap?
There was a problem hiding this comment.
above looks good to me, basically test == unit tests and dev is benchmarks really ?
| if sweep: | ||
| results_sweep = {} | ||
| for num_workers in range(2, (os.cpu_count() * 2 or 2), 2): | ||
| for num_workers in range(2, (os.cpu_count() or 1) * 2, 2): |
There was a problem hiding this comment.
ah yes this is really crap, I've been changing that in branches to just go *2 until we reach cpu count. I wanted to so something more iterative here but that's a bit ugly/hard to read
| def custom_transform(sample): | ||
| if "jpg" in sample: | ||
| sample["jpg"] = transform(sample["jpg"]) | ||
| if "png" in sample: | ||
| sample["png"] = transform(sample["png"]) | ||
| return sample | ||
|
|
||
| if transform: | ||
|
|
||
| def custom_transform(sample): | ||
| if "jpg" in sample: | ||
| sample["jpg"] = transform(sample["jpg"]) | ||
| if "png" in sample: | ||
| sample["png"] = transform(sample["png"]) | ||
| return sample | ||
| else: | ||
| custom_transform = lambda x: x # noqa: E731 |
There was a problem hiding this comment.
will revisit this
There was a problem hiding this comment.
supposed to commit this apparently, but I'm not too sure
There was a problem hiding this comment.
yes, I'm still not super used to it but it's more precise than the toml in that everything is versioned, so it makes sure we're all working with the same binaries. We should have done this before, my bad
blefaudeux
left a comment
There was a problem hiding this comment.
looking good to me, thanks @MarcoForte ! Happy to add you to the authors if you'd like to keep going with some additions by the way
| - name: Install deps | ||
| run: | | ||
| uv pip install maturin twine | ||
| uv tool install maturin |
There was a problem hiding this comment.
ah I forgot but twine is not required indeed (top of head) I think that maturin does everything here
Ruff and ty and prek for precommit checks
CICD caught the intentional error in this commit