Skip to content

Commit

Permalink
cEP-0018: Integration of ANTLR in coala
Browse files Browse the repository at this point in the history
Propose integration of ANTLR in coala.

Closes #118
  • Loading branch information
virresh committed Jun 6, 2018
1 parent d7de189 commit 49c6cec
Show file tree
Hide file tree
Showing 2 changed files with 275 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ in the `cEP-0000.md` document.
| [cEP-0012](cEP-0012.md) | coala's Command Line Interface | This cEP describes the design of coala's Command Line Interface (the action selection screen). |
| [cEP-0013](cEP-0013.md) | Cobot Enhancement and porting | This cEP describes about the new features that are to be added to cobot as a part of the [GSoC project](https://summerofcode.withgoogle.com/projects/#4913450777051136). |
| [cEP-0014](cEP-0014.md) | Generate relevant coafile sections | This cEP proposes a framework for coala-quickstart to generate more relevant `.coafile` sections based on project files and user preferences. |
| [cEP-0018](cEP-0018.md) | Integration of ANTLR into coala | This cEP describes how an API based on ANTLR will be constructed and maintained |
| [cEP-0019](cEP-0019.md) | Meta-review System | This cEP describes the details of the process of implementing the meta-review system as a part of the [GSoC'18 project](https://summerofcode.withgoogle.com/projects/#5188493739819008). |
| [cEP-0027](cEP-0027.md) | coala Bears Testing API | This cEP describes the implementation process of `BaseTestHelper` class and `GlobalBearTestHelper` class to improve testing API of coala as a part of the [GSoC'18 project](https://summerofcode.withgoogle.com/projects/#6625036551585792). |
| [cEP-0021](cEP-0021.md) | Profile Bears | This cEP describes the detailed implementation of a profiling interface as a part of the [GSoC'18 project](https://summerofcode.withgoogle.com/projects/#6109762077327360). |
274 changes: 274 additions & 0 deletions cEP-0018.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,274 @@
# Integration of ANTLR into coala core

| Metadata | |
| -------- | --------------------------------------------- |
| cEP | 18 |
| Version | 1.0 |
| Title | Integration of ANTLR into coala core |
| Authors | Viresh Gupta <mailto:viresh16118@iiitd.ac.in> |
| Status | Proposed |
| Type | Feature |

## Abstract

This document describes how an API based on ANTLR will be constructed and
maintained.

## Introduction

ANTLR provides parsers for various language grammars and thus we can provide
an interface from within coala and make it available to bear writers so that
they can write advanced linting bears. This will aim at supporting the more
flexible visitor based method of AST traversal (as opposed to listener based
mechanisms).

The proposal is to introduce a parallel concept to the coala-bears library,
which will be called `coala-antlr` hereon.

## Proposed Change

Here is the detailed implementation stepwise:

1. There will be a separate repository named as `coala-antlr` which will
be installable via `pip install coala-antlr`.
2. A new package would be introduced in the coala-antlr repository,
which would maintain all the dependencies for `antlr-ast` based
bears. This would be the `coantlib` package
3. The `coantlib` package will provide endpoints relevant to the
visitor model of traversing the ast generated via ANTLR.
4. Another package named `coantparsers` will be created inside
`coala-antlr` repository. This package will hold parsers and lexers
generated for various languages generated (with python target) beforehand
for the supported set of grammars.
5. `coantlib` will have an `ASTLoader` class that will be responsible
for loading the AST of a given file depending on the user input, using file
extension as a fallback.
For e.g, a .c file will be loaded using the parser from
`coantlib` for c language if no language is specified.
6. Another `ASTWalker` class will be responsible for providing a single method
to resolve the language supplied by the bear and create a walker instance
of the appropriate type.
7. Also `coantlib.walkers` will have several walker classes derived from
`ASTWalker` class that will be language specific and grammar dependant.
For e.g a `Py3ASTWalker`.
8. The bears dependant on the `coantlib` will reside in the same repository as
the `coantlib`. All of them will be in a different package, the `coantbears`
package, all of them installable together via a single option in the
setup.py along with `coantlib`.
9. The bears will be able to use a high level API by defining the `LANGUAGE`
instance variable, which will return to them a walker with high level API's
in case the language is supported. (for e.g if `LANGUAGE='Python'`, they
will automatically get a `BasePyASTWalker` instance in the walker class
variable)

## Management of the new `coala-antlr` repository

Managing a new repository is a heavy task, and this will be highly automated as
follows:

1. The parsers in `coala-antlr/coantparsers` would be generated and
pushed via CI builds whenever a new commit is pushed.
2. The cib tool can be enhanced to deal with the installation of bears that
require only some specified parsers (for e.g a `PyUnusedVarBear` would
only require parser for python).
3. There will be nightly cron jobs for keeping parsers up-to-date.
4. For adding support for a new language, a new Walker class deriving
`ASTWalker` would be added and the rest will be automatically taken care by
the library.

## Code Samples/Prototypes

Here is a prototype for the implementations within `coantlib`:

```python
import antlr4
import coantparsers
import inspector

from antlr4 import InputStream
from coalib.bearlib.languages.definitions import *
from coalib.bearlib.languages.Language import parse_lang_str, Languages


def resolve_with_package_info(lang):
"""
Use the inspector methods to search amongst all
classes that inherit ASTWalker within this module
and return that class whose supported language matches
"""


class ASTLoader(object):

mapping = {
C: [coantparsers.clexer, coantparsers.cparser],
Python: [coantparsers.pylexer, coantparsers.pyparser],
JavaScript: [coantparsers.jslexer, coantparsers.jsparser],
...
}

@staticmethod
def load_file(file, filename, lang='auto'):
"""
Loads file's AST into memory
"""
if(lang == 'auto'):
ext = get_extension(filename)
else:
ext = Languages(parse_lang_str(lang)[0])[0]
if ext is None:
raise FileExtensionNotSupported('Unknown File Type')
inputStream = InputStream(''.join(file))
lexer = mapping[ext][0](inputStream)
parser = mapping[ext][1](lexer)
return parser


class ASTWalker(object):
treeRoot = None
tree = None

@staticmethod
def make_walker():
"""
Resolves walker object for appropriate type
"""
if lang == 'auto':
return ASTWalker
else:
return resolve_with_package_info(lang)

@staticmethod
def get_walker(file, filename, lang):
"""
Returns a walker to the required language
"""
ret_class = make_walker(lang)
if not ret_class:
raise Exception('Un-supported Language')
else:
return ret_class(file, filename)

def __init__(file, filename):
self.tree = ASTLoader.load_file(file, filename)
self.treeRoot = self.tree
...
```

### Prototype of `Py3ASTWalker`

These kinds of Walkers will supply a high level walker API

```python
"""Prototype of Python ASTWalkers."""


class BasePyASTWalker(ASTWalker):
"""Base class for Python Walkers. This will be an abstract class."""

LANGUAGES = {'Python'}

def get_imports():
"""Return list of strings containing import statements."""

def get_methods():
"""Return list of strings containing all methods from the file."""
...


class Py2ASTWalker(BasePyASTWalker):
"""Python 2 specific walker."""

LANGUAGES = {'Python 2'}

def get_print_statements():
"""Return list of print statements, since print is a keyword in py2."""
...


class Py3ASTWalker(BasePyASTWalker):
"""Python 3 specific walker."""

LANGUAGES = {'Python3'}

def get_integer_div_stmts():
"""
Return a list of integer divide statements.
since / and // are different in py3, this is py3 specific
"""
...
```

### Prototype of `ASTBear` class implementation

```python
"""Prototype of ASTBear class."""

from coantlib import ASTWalker
from coalib.bears.LocalBear import Bear


class ASTBear(LocalBear):
"""
ASTBear class.
This class is similar to Local Bear except for the added capability of
initialising walker for use by the instance of this bear.
"""

walker = None

def initialise(file, filename):
"""Initialise self.walker according to the file and language."""
if LANGUAGES:
self.walker = ASTWalker.get_walker(file,
filename,
self.LANGUAGES[0])
else:
self.walker = ASTWalker.get_walker(file, filename, 'auto')

def run(self,
filename,
file,
tree,
*args,
dependency_results=None,
**kwargs
):
"""Actual run method - to be implemented by bear."""
raise NotImplementedError('Needs to be done by bear')
```

### A test bear

```python
"""A sample Bear."""

from coantlib.ASTBear import ASTBear
from coalib.results.Result import Result


class TestBear(ASTBear):
"""A test bear that uses walker from coantlib for linting."""

def run(self,
filename,
file,
tree,
*args,
dependency_results=None,
**kwargs
):
"""Essence of logic for bear."""
self.initialise(file, filename)
violations = {}
method_list = self.walker.get_methods()
for method in method_list:
"""
logic for checking naming convention
and yielding an apt result
wherever required
"""
...
```

0 comments on commit 49c6cec

Please sign in to comment.