John Snow Labs NLP Test 1.4.0: Enhancing Support for Toxicity test and new QA benchmark datasets (NarrativeQA, TruthfulQA, QuAC, HellaSwag, MMLU and OpenBookQA)
Overview
NLP Test 1.4.0 adds new capabilities for testing Large Language Models for toxicity, and support for new QA benchmark datasets (NarrativeQA, TruthfulQA, QuAC, HellaSwag, MMLU and OpenBookQA) in robustness, representation, fairness and accuracy tests. It also adds several new robustness tests, along with many other enhancements and bug fixes!
A big thank you to our early-stage community for their contributions, feedback, questions, and feature requests.
Make sure to give the project a star right here.
New Features & Enhancements
- Adding support for NarrativeQA dataset #487
- Adding support for toxicity task #488
- Adding support for TruthfulQA dataset #477
- Adding support for new dyslexia swap test for robustness testing #474
- Adding support for new slangificator test for robustness testing #463
- Adding support for new abbreviation test for robustness testing #471
- Adding support for OpenBookQA dataset #479
- Adding support for MMLU dataset #481
- Adding support for HellaSwag dataset #486
- Adding support for QuAC dataset #484
- Adding new tutorial notebooks #497
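To give a feel for what a robustness test such as the new abbreviation test checks, here is a minimal, generic sketch: perturb each input, then measure how often the model's prediction stays the same. This is not NLP Test's implementation. The abbreviation map, the `abbreviate` and `pass_rate` helpers, and the toy model are all hypothetical names invented for this illustration.

```python
import re

# Hypothetical abbreviation map, for illustration only (not the library's table).
ABBREVIATIONS = {
    "please": "pls",
    "thanks": "thx",
    "tomorrow": "tmrw",
    "because": "bc",
}

# Match any mapped word as a whole word, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, ABBREVIATIONS)) + r")\b",
    re.IGNORECASE,
)


def abbreviate(text: str) -> str:
    """Replace every mapped word with its (lowercase) abbreviation."""
    return _PATTERN.sub(lambda m: ABBREVIATIONS[m.group(0).lower()], text)


def pass_rate(model, texts, perturb) -> float:
    """Fraction of texts whose prediction is unchanged after perturbation."""
    if not texts:
        raise ValueError("texts must not be empty")
    unchanged = sum(model(t) == model(perturb(t)) for t in texts)
    return unchanged / len(texts)


def toy_model(text: str) -> str:
    """Stand-in classifier: 'polite' only if the literal word 'please' appears."""
    return "polite" if "please" in text.lower() else "neutral"


samples = [
    "Please send the report tomorrow.",
    "The meeting moved because of rain.",
]
perturbed = [abbreviate(s) for s in samples]
rate = pass_rate(toy_model, samples, abbreviate)
```

The toy model is deliberately brittle: it changes its answer on the first sample once "Please" becomes "pls", so only one of the two predictions survives the perturbation. A real run would swap in an actual model and a larger dataset.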
How to Use
Get started now!
pip install nlptest
Create your test harness in 3 lines of code
# Set your OpenAI API key (fill in your own key)
import os
os.environ['OPENAI_API_KEY'] = ''
# Import and create a Harness object
from nlptest import Harness
h = Harness(task='toxicity', model='text-davinci-002', hub='openai', data='toxicity-test-tiny')
# Generate test cases, run them and view a report
h.generate().run().report()
Documentation
Community support
- Slack: For live discussion with the NLP Test community, join the #nlptest channel
- GitHub: For bug reports, feature requests, and contributions
- Discussions: To engage with other community members, share ideas, and show off how you use NLP Test!
We would love to have you join the mission: open an issue, a PR, or give us some feedback on features you'd like to see!
Changelog
What's Changed
- updated/doc by @Prikshit7766 in #459
- docs/Update documentation of models by @RakshitKhajuria in #465
- refactor user prompt by @alytarik in #472
- Feature/dyslexia swap feature by @ArkajyotiChakraborty in #417
- Feature/add support for abbreviation test by @RakshitKhajuria in #471
- Hotfix/get rid of some dependencies by @chakravarthik27 in #473
- Draft: refactor/perturbations and samples to support QA. by @chakravarthik27 in #460
- feature/Add speech to text typo by @Prikshit7766 in #475
- hotfix/get rid of inflect dependency and refactoring robustness by @RakshitKhajuria in #478
- Added TruthfulQA Dataset by @RakshitKhajuria in #477
- feature/Add support for slangificator test by @Prikshit7766 in #463
- Dataset/OpenBookQA datasets by @Prikshit7766 in #479
- Datasets/MMLU Datasets by @Prikshit7766 in #481
- Docs/update model hub-summarization nb-readme by @RakshitKhajuria in #480
- Hotfix/fixed some tests and refactored number_to_word.py by @RakshitKhajuria in #483
- Dataset/quac dataset by @Prikshit7766 in #484
- Feature/dyslexia swap test by @alytarik in #474
- Feature/hellaswag dataset by @alytarik in #486
- Feature/narrativeqa dataset by @alytarik in #487
- Feature/create toxicity test 438 by @chakravarthik27 in #488
- hot-fix/fix-slangify-test by @RakshitKhajuria in #489
- DRAFT : Docs/update nb and docs by @RakshitKhajuria in #490
- Update datasets by @RakshitKhajuria in #493
- Fix/toxicity by @chakravarthik27 in #492
- Feature/add tutorial nbs by @ArshaanNazir in #497
- default toxicity config by @chakravarthik27 in #498
- docs/add dataset notebooks by @alytarik in #499
- Release/1.4.0 by @ArshaanNazir in #500
New Contributors
- @ArkajyotiChakraborty made their first contribution in #417
Full Changelog: v1.3.0...v1.4.0