Vincent Hellendoorn edited this page Jul 10, 2017 · 8 revisions

What is SLP?

SLP stands for Software Language Processing, a play on Natural Language Processing (NLP). This young field tackles the complexity of software by treating code first and foremost as a means of communication between developers, albeit one that also serves a very concrete purpose. This perspective lets us combine insights from linguistics, software engineering research and programming languages research to build better models of software. Seen this way, code is a kind of language, and language models are very useful for language (think of Siri, auto-correct and chat-bots: all powered by language models).

Why a new tool?

In code, we are beginning to discover these benefits too: good language models enable great code completion, could help hunt down bugs, suggest better variable names and even help you get your pull requests accepted! But code also differs from natural languages like English in some crucial ways. It is dynamic, evolving rapidly with every commit that adds new identifiers, methods, modules and even entire libraries. It is also deeply hierarchical, with files in packages in modules in ecosystems, and so on. Traditional language models are too static to deal with this.

So what does SLP-Core do?

SLP-Core lets you mix and dynamically update any combination of language models. We first demonstrate that power in the form of nested n-gram models: simple count-based models that adapt to your project structure and are able to outperform deep-learning (LSTM) models, while working even better in combination with them, as reported in our FSE 2017 paper.
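To make "mixing" and "dynamic updating" a bit more concrete, here is a minimal Python sketch of the general idea. It is not SLP-Core's API and not the exact scheme from the paper: the class `NGramCounter`, the function `mix`, the bigram-only counts and the confidence formula are all assumptions made up for illustration.

```python
from collections import Counter


class NGramCounter:
    """Hypothetical count-based bigram model that can be updated in place."""

    def __init__(self):
        self.context_counts = Counter()  # how often each token was seen as a context
        self.pair_counts = Counter()     # how often each (previous, next) pair was seen

    def _update(self, tokens, delta):
        # Add (or remove) every adjacent token pair in the sequence.
        for prev, nxt in zip(tokens, tokens[1:]):
            self.context_counts[prev] += delta
            self.pair_counts[(prev, nxt)] += delta

    def learn(self, tokens):
        self._update(tokens, +1)

    def forget(self, tokens):
        # Only meaningful for sequences that were learned earlier.
        self._update(tokens, -1)

    def prob(self, prev, nxt):
        """Return (probability, confidence) for nxt following prev."""
        seen = self.context_counts[prev]
        if seen <= 0:
            return 0.0, 0.0
        # Assumed confidence: grows towards 1 as the context is seen more often.
        return self.pair_counts[(prev, nxt)] / seen, seen / (seen + 1.0)


def mix(models, prev, nxt):
    """Confidence-weighted average of several models (an illustrative choice)."""
    scored = [model.prob(prev, nxt) for model in models]
    total = sum(conf for _, conf in scored)
    if total == 0:
        return 0.0
    return sum(p * conf for p, conf in scored) / total


# A "global" model trained on other code and a "local" one for the current project.
global_model = NGramCounter()
global_model.learn(["x", "=", "foo", "(", ")"])

local_model = NGramCounter()
local_model.learn(["x", "=", "bar", "(", ")"])
local_model.learn(["y", "=", "bar", "(", ")"])

print(mix([global_model, local_model], "=", "bar"))

# When a file changes, its old contents can be forgotten without retraining.
local_model.forget(["y", "=", "bar", "(", ")"])
print(mix([global_model, local_model], "=", "bar"))
```

Because the models are just counts, learning and forgetting a file is cheap, which is what makes it practical to keep one model per level of a project hierarchy and combine them at prediction time.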

In the long term, the SLP-Team's goal is to make the (increasing) power of language models available to software developers. SLP-Core is part of a bigger ecosystem in the making that will include components for deep learning, syntax-based models and evolution-tracing models for source code.
