Data augmentation for NLP, presented at EMNLP 2019
Updated Mar 19, 2023 - Python
🐍💯pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection module that works out-of-the-box.
📃Language-model-based sentence scoring library
ICLR 2018 Quick-Thought vectors
Extract information from a web corpus using Open Information Extraction.
TensorFlow implementation of Variational Attention for Sequence-to-Sequence Models (COLING 2018)
A web application that interfaces with two GEC (grammatical error correction) systems. [web instance is down]
TensorFlow implementation of Stochastic Wasserstein Autoencoder for Probabilistic Sentence Generation (NAACL 2019).
A program that can generate a secure password of up to 100 characters, extract securely selected words from the diceware wordlist, generate a password from a sentence, and check for vulnerabilities in a given password.
A sentence segmentation library with wide language support optimized for speed and utility.
word2vec with a context based on sentences.
A PyTorch implementation of Quick-Thought vectors.
Yet Another Sequence Encoder - Encode sequences to a vector of vectors in Python!
63k Chinese sentences with Simplified, Traditional, pinyin and English translations for offline use
Source code for ACL 2023 paper "miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings".