The Corpus of Linguistic Acceptability (CoLA) in its full form consists of 10657 sentences from 23 linguistics publications, expertly annotated for acceptability (grammaticality) by their original authors. The public version provided here contains the 9594 sentences that make up the training and development sets, and excludes the 1063 sentences of a held-out test set.
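As a rough illustration of working with the public sentences, the sketch below parses a small in-memory sample and confirms that the split sizes quoted above add up. It assumes a tab-separated layout with four columns (source code, 0/1 acceptability label, the original author's mark, sentence); that layout, the `Example` type, the `read_examples` helper, and the two sample rows are illustrative assumptions, not taken from the corpus, so check them against the files you actually download.

```python
import csv
import io
from typing import NamedTuple


class Example(NamedTuple):
    source: str         # code for the publication the sentence came from
    label: int          # 1 = acceptable, 0 = unacceptable
    original_mark: str  # author's original annotation, e.g. "*" or empty
    sentence: str


# Made-up rows in the ASSUMED four-column, tab-separated layout.
SAMPLE_TSV = (
    "ex01\t1\t\tThe cat sat on the mat.\n"
    "ex01\t0\t*\tCat the mat on sat.\n"
)


def read_examples(handle):
    """Parse tab-separated rows into Example records (hypothetical helper)."""
    # QUOTE_NONE keeps quotation marks inside sentences as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [Example(row[0], int(row[1]), row[2], row[3]) for row in reader if row]


examples = read_examples(io.StringIO(SAMPLE_TSV))
acceptable = sum(e.label for e in examples)
print(f"{len(examples)} sentences, {acceptable} labeled acceptable")

# Sanity check on the sizes stated in the text: public + held-out = full corpus.
assert 9594 + 1063 == 10657
```

To read a real file, pass an open file handle (for example `open(path, encoding="utf-8", newline="")`) to `read_examples` in place of the `io.StringIO` sample.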
BERT (Bidirectional Encoder Representations from Transformers), released in late 2018, is a method of pretraining language representations that was used to create models that NLP practitioners can download and use. We can either use these models to extract high-quality language features from text data, or fine-tune them on a specific task (classification, entity recognition, question answering, etc.).
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding