# chore: bump version to 0.18.0 #439
Coverage Report: 278 files skipped due to complete coverage. Coverage success: total of 97.98% is above 97.98% 🎉
## Changelog

### Added

- Support for multiple loggers (`tensorboard`, `wandb`, `comet_ml`, `aim`, `mlflow`, `clearml`, `dvclive`, `csv`, `json`, `rich`) in `edsnlp.train` via the `logger` parameter. Default is [`json` and `rich`] for backward compatibility.
- Gradient accumulation via sub-batches: for instance, `batch_size = 10000 tokens` and `sub_batch_size = 5 splits` accumulate batches of 2000 tokens.
- New `pyarrow_write_kwargs` parameter to pass extra arguments to `pyarrow.dataset.write_dataset`.
- New `end_value` parameter to configure whether the learning rate should decay to zero or to another value.
- New `eds.explode` pipe that splits one document into multiple documents, one per span yielded by its `span_getter` parameter, each new document containing exactly that single span.
- New "Training a span classifier" tutorial, and reorganized deep-learning docs.
- `ScheduledOptimizer` now warns when a parameter selector does not match any parameter.
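The gradient-accumulation arithmetic in the entry above can be sketched in plain Python. This is a hypothetical helper for illustration only, not `edsnlp.train`'s actual internals: a batch budget of 10000 tokens split into 5 sub-batches yields accumulation steps of 2000 tokens each.

```python
def split_token_budget(token_budget: int, num_splits: int) -> list:
    """Divide a batch's token budget into `num_splits` sub-batch budgets.

    The sub-batches sum back to the full budget; gradients would be
    accumulated across them before a single optimizer step.
    """
    base, remainder = divmod(token_budget, num_splits)
    # Spread any remainder over the first sub-batches (sizes differ by <= 1).
    return [base + (1 if i < remainder else 0) for i in range(num_splits)]

# batch_size = 10000 tokens, sub_batch_size = 5 splits -> five 2000-token steps
print(split_token_budget(10_000, 5))  # [2000, 2000, 2000, 2000, 2000]
```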
### Fixed

- `use_sections` in `eds.history` should now correctly handle cases where other sections follow history sections.
- Fixed the `words[-10:10]` syntax in the trainable span classifier's `context_getter` parameter.
- `post_init` was applied after the instantiation of the optimizer: if the model discovered new labels, and therefore changed its parameter tensors to reflect that, these new tensors were not taken into account by the optimizer, which could lead to subpar performance. Now `post_init` is applied before the optimizer is instantiated, so that the optimizer can correctly handle the new tensors.
- Fixed `write_parquet` and support for `polars` in `pyproject.toml`: all implemented readers and writers are now correctly registered as entry points.
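The `post_init`-ordering bug above can be illustrated with a pure-Python analogy (hypothetical `TinyOptimizer`/`TinyModel` classes, not EDS-NLP's or PyTorch's actual code): an optimizer keeps references to the parameter objects it was given at construction, so any tensor replaced afterwards is silently left untrained.

```python
class TinyOptimizer:
    """Minimal stand-in for an optimizer: it remembers the exact parameter
    objects it was given at construction time."""
    def __init__(self, params):
        self.params = list(params)

    def tracked(self, param):
        # Identity check, like an optimizer holding references to tensors.
        return any(p is param for p in self.params)


class TinyModel:
    def __init__(self):
        self.weight = [0.0, 0.0]  # stand-in for a parameter tensor

    def post_init(self, num_labels):
        # Discovering new labels replaces the parameter tensor entirely.
        self.weight = [0.0] * num_labels


# Buggy order (pre-fix): optimizer built first, then post_init swaps tensors.
model = TinyModel()
opt = TinyOptimizer([model.weight])
model.post_init(num_labels=3)
print(opt.tracked(model.weight))  # False: the new tensor is never updated

# Fixed order: post_init first, so the optimizer sees the final tensors.
model = TinyModel()
model.post_init(num_labels=3)
opt = TinyOptimizer([model.weight])
print(opt.tracked(model.weight))  # True
```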
### Changed

- Section cues in `eds.history` are now section titles, and not the full section.
- 💥 Validation metrics are now found under the root field `validation` in the training logs (e.g. `metrics['validation']['ner']['micro']['f']`).
- It is now recommended to define optimizer groups of `ScheduledOptimizer` as a list of dicts of optim hyper-parameters, each containing a `selector` regex key, rather than as a single dict with selectors as keys and dicts of optim hyper-parameters as values. This allows for more flexibility in defining the optimizer groups and is more consistent with the rest of the EDS-NLP API. It also makes it easier to reference group values from other places in config files, since their path no longer contains a complex regex string. See the updated training tutorials for more details.

---

- If this PR is a bug fix, the bug is documented in the test suite.
- Changes were documented in the changelog (pending section).
- If necessary, changes were made to the documentation (e.g. a new pipeline).
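To illustrate the recommended `ScheduledOptimizer` group format from the Changed section, here is a minimal sketch of how per-group `selector` regexes could map onto parameter names. The parameter names and the `assign_groups` helper are hypothetical, for illustration only; this is not EDS-NLP's actual matching code.

```python
import re

# Hypothetical parameter names, in the spirit of PyTorch module paths.
param_names = ["embedding.weight", "classifier.weight", "classifier.bias"]

# Recommended style: a list of groups, each carrying its own `selector` regex
# alongside the optim hyper-parameters, instead of a dict keyed by regexes.
groups = [
    {"selector": "embedding", "lr": 1e-5},
    {"selector": "classifier", "lr": 1e-3},
]

def assign_groups(names, groups):
    """Assign each parameter name the hyper-parameters of the first group
    whose `selector` regex matches it."""
    assignment = {}
    for name in names:
        for group in groups:
            if re.search(group["selector"], name):
                assignment[name] = group["lr"]
                break
    return assignment

# embedding parameters get lr=1e-5, classifier parameters get lr=1e-3
print(assign_groups(param_names, groups))
```

Because each group is a plain list element, a config file can reference e.g. the second group by its index rather than by a regex-valued key.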