Given a collection of documents, this project tokenizes and stems every word in the collection. It is implemented in Java.
Updated Feb 16, 2017 - Java
A grammar describes the syntax of a programming language and may be defined in Backus-Naur form (BNF). A lexer performs lexical analysis, turning text into tokens. A parser takes tokens and builds a data structure such as an abstract syntax tree (AST). The parser is concerned with context: does the sequence of tokens fit the grammar? A lexer and a parser built for a specific grammar together form the front end of a compiler or interpreter.
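As a rough illustration of the lexer and parser split described above, here is a minimal sketch in Java. The grammar, the class name `LexerParserSketch`, and the choice to render the AST as a parenthesized prefix string are all assumptions made for this example; none of them come from any repository listed here.

```java
import java.util.ArrayList;
import java.util.List;

class LexerParserSketch {
    // Hypothetical grammar used only for this sketch (BNF-like):
    //   expr   ::= term   (('+' | '-') term)*
    //   term   ::= factor (('*' | '/') factor)*
    //   factor ::= NUMBER | '(' expr ')'
    enum Kind { NUMBER, OP, LPAREN, RPAREN, EOF }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    // Lexer: turns raw text into a flat list of tokens, ending with EOF.
    static List<Token> lex(String src) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                out.add(new Token(Kind.NUMBER, src.substring(start, i)));
            } else if ("+-*/".indexOf(c) >= 0) {
                out.add(new Token(Kind.OP, String.valueOf(c)));
                i++;
            } else if (c == '(') {
                out.add(new Token(Kind.LPAREN, "("));
                i++;
            } else if (c == ')') {
                out.add(new Token(Kind.RPAREN, ")"));
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at " + i);
            }
        }
        out.add(new Token(Kind.EOF, ""));
        return out;
    }

    private final List<Token> tokens;
    private int pos = 0;

    LexerParserSketch(List<Token> tokens) { this.tokens = tokens; }

    private Token peek() { return tokens.get(pos); }
    private Token next() { return tokens.get(pos++); }

    private boolean atOp(String ops) {
        return peek().kind == Kind.OP && ops.contains(peek().text);
    }

    // Parser: checks that the token sequence fits the grammar and builds
    // the AST, rendered here as a prefix string such as "(+ 1 2)".
    private String expr() {
        String left = term();
        while (atOp("+-")) {
            String op = next().text;
            String right = term();
            left = "(" + op + " " + left + " " + right + ")";
        }
        return left;
    }

    private String term() {
        String left = factor();
        while (atOp("*/")) {
            String op = next().text;
            String right = factor();
            left = "(" + op + " " + left + " " + right + ")";
        }
        return left;
    }

    private String factor() {
        Token t = next();
        if (t.kind == Kind.NUMBER) return t.text;
        if (t.kind == Kind.LPAREN) {
            String inner = expr();
            if (next().kind != Kind.RPAREN) throw new IllegalArgumentException("Expected ')'");
            return inner;
        }
        throw new IllegalArgumentException("Unexpected token '" + t.text + "'");
    }

    // Runs the lexer, then the parser, and rejects trailing tokens.
    static String parse(String src) {
        LexerParserSketch p = new LexerParserSketch(lex(src));
        String ast = p.expr();
        if (p.peek().kind != Kind.EOF) throw new IllegalArgumentException("Trailing input");
        return ast;
    }

    public static void main(String[] args) {
        System.out.println(parse("1 + 2 * 3"));   // prints (+ 1 (* 2 3))
        System.out.println(parse("(1 + 2) * 3")); // prints (* (+ 1 2) 3)
    }
}
```

The lexer only knows about characters, while the parser only knows about token kinds, so each half could be replaced independently (for example, by a generated lexer) without touching the other.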
An interpreter for a small imperative language.
Exploring Information Retrieval | Indexer Creation
Cloned from https://github.com/phuonglh/vn.vitk, with added code for processing a directory of input files. Example command: bin/spark-submit /pathTo/vn.vitk-3.0.jar -i /pathTo/DataFolder/
Cool interpreted programming language
An interpreter for a (very) simple functional programming language.
An implementation of the 'bad apple' programming language in roughly 450 lines
Command-line lexical analyzer which only accepts arithmetic expressions
Java Brainfuck Interpreter
Various self-made operations on Java string characters
Simple inverted index implementation with term analysis and scoring.
My legal background gave me a deep appreciation for the importance of language. It's not just words; every case depends on understanding them precisely. This connection led me to coding, where I built a text-processing pipeline with Stanford CoreNLP.