word2vec++ is a library and set of tools implementing Distributed Representations of Words (word2vec), written from scratch in C++11
Updated Oct 14, 2023 - C++
Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through human language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More recent techniques, such as deep learning, have produced results in language modeling, parsing, and other natural-language tasks.
Code accompanying the CVPR 2019 paper: https://arxiv.org/abs/1812.04155
Customizable machine translation in C++
Do NLP without coding! Simple NLP framework.
The FlexFringe tool for flexible learning of state machines (deterministic automata) from traces. See the paper at https://arxiv.org/abs/2203.16331
Short-text classification
Preprocessing steps, feature extraction techniques, and a Naive Bayes classifier implemented in C++. All of the steps are also implemented in Python for comparative analysis.
pg_mystem is a PostgreSQL extension for lemmatization (morphological normalization) of Russian-language texts, based on Yandex Mystem
Telegram Data Clustering Contest (Bossy Gnu's submission)
Lemmagen Python bindings exported from https://pypi.python.org/pypi/Lemmagen
This is our Minor Project-1, on feature extraction for spam email detection using natural language processing.
Training a new TR detection model with new fonts, using Tesseract OCR engine 5.2.
An advanced project that uses OpenAI's API services to create an interactive chatbot. Users can ask questions by voice or in writing and receive responses as both text and speech. Evyan-OpenAI-ChatBoT combines powerful natural language processing with user-friendly design to enhance conversational engagement.
Peter Norvig's Spell Corrector Implemented in C++
Implementation of a text classifier in C++. This project implements each stage of a text-based classifier, from preparing the data (tokenization, padding, and creating the embedding matrix) to feeding it to an MLP to see how it performs.
Fast word-like N-gram embeddings
Neural network for binary sentiment analysis made in C++. Data cleaned with Python/Jupyter notebook. Coded from scratch using STL.
Analyzes the correlation between coding style and coding proficiency, and whether coding styles show regional variations.
Created by Alan Turing