Browse files

cEP 9: Extract and use information

Extract and utilize information from project files.
  • Loading branch information...
satwikkansal committed May 25, 2017
1 parent 3c5ec97 commit eb4f194fe8b4cdede65061c08f880c2880936332
Showing with 209 additions and 0 deletions.
  1. +209 −0
@@ -0,0 +1,209 @@
| Metadata | |
| ------------ |----------------------------------------------------------|
| cEP | 9 |
| Version | 0.1 |
| Title | Extract and utilize information from project files |
| Authors | Satwik Kansal <> |
| Status | Proposed |
| Type | Feature |
## Abstract
There are many kinds of project files like ".editorconfig", "Gruntfile.js", etc. that contain meta-information that can used by tools like coala-quickstart to provide better analysis. This cEP proposes a mechnaism to represent and extract information from project files and an interface to utilize the extracted information.
## Proposed Approach
### 1. Implementation of the “Info” class
A subclass of `Info` class will represent a specific kind of information. For example, the class `IgnorePathInfo` may represent “ignored file paths”.
class Info:
description = "Some information"
# for validation
value_type = ("data-type of the information value, can also be an ``Enum``"
"of possible choices.")
def __init__(self,
value: value_type
:param source: Source from where the information is discovered.
:param value: Value of the infomation to be represented.
:param scope: Scope of the applicability of the information. Some
of the values may be "bear-specific" or "global".
self.source = source
self._value = value
# optional metadata
self.scope = scope
def value(self):
return self._value
def name(self):
return self.__class__.__name__
### 2. Implementation of “InfoExtractor” class
Any derived class of the `InfoExtractor` abstract class consists of:
- Methods for parsing the project file.
- Methods to find information in the parsed file.
- A collection of `Info` objects representing the information extracted from a particular kind of project file.
- Files to parse and extract information from.
from coalib.parsing.Globbing import glob
class InfoExtractor:
# Kinds of files supported by the InfoExtractor
supported_files = []
# Meta-data
# A list of ``Info`` classes relevant to the class
supported_information_kinds = []
def __init__(self,
:param target_files: list of file globs to extract information
:param project_directory: Absolute path to project directory in which
the target files will be searched.
self.target_files = target_files = project_directory
self._information = dict()
def information(self):
Return extracted information (if any)
return self._information
def parse_file(self, filename):
Parses the given file and returns the parsed file.
raise NotImplementedError
def _add_info(self, info):
Method to organize and add the supplied information in self.information attribute.
:param info: Collection of ``Info`` instances.
def extract_information(self):
Extracts the information, saves in the object and returns it.
filenames = retrieve_files(self.target_files)
for fname in filenames:
pfile = self.parse_file(fname)
file_info = find_information(pfile)
if file_info:
self._add_info(fname, file_info)
return self.information
def find_information(self, fname, parsed_file):
Returns a collection of ``Info`` instances.
raise NotImplementedError
def retrieve_files(file_globs, directory):
Returns matched filenames acoording to the list of file globs and
supported files of the extractor.
matched_files = []
for g in file_globs:
return matched_files
## Example
import json
class LicenseUsedInfo(Info):
description = "License of the project."
value_type = Enum('MIT', 'Apache 2.0', 'GNU GPL')
class PackageJSONInfoExtractor(InfoExtractor):
supported_files = "package.json"
supported_information_kinds = [LicenseUsedInfo]
def parse_file(self, filename):
return json.loads(filename)
def find_information(self, fname, parsed_file):
if parsed_file.get("license"):
return LicenseUsedInfo(
project_dir = "any_project_dir"
pjson_info = PackageJSONInfoExtractor(
license_used = pjson_info.get("LicenseUsedInfo")
Another example
class LinterUsedInfo(Info):
description = "Linter used in project."
value_type = str
def __init__(self,
value: value_type,
super().__init__(source, value)
self.is_supported_by_coala = is_supported_by_coala
class GruntfileExtractor(InfoExtractor):
def parse_file(self, filename):
# parse the JS file using some library function
parsed_file = parse_javascript(filename)
return parsed_file
def find_information(self, fname, parsed_file):
# extract lint subtasks that coala supports from parsed Gruntfile.js
linters_used = ["jshint", "csslint"]
return [LinterUsedInfo(fname, l, True) for l in linters_used]
gfe = GruntfileExtractor("Gruntfile.js", project_dir)
coala_supported_linters = gfe.information.get("LinterUsedInfo")
## Possible Applications
This mechanism can be used by coala-quickstart to automatically extract some of the setting values for the bears and also to suggest project-relevant bears.
Related Links:
- to extract information

0 comments on commit eb4f194

Please sign in to comment.