STAC is a tool that supports Static Textual Analysis of Code. It is a lightweight tool that integrates the basic features of code indexing into a single stand-alone solution.
STAC uses regular expressions to recognize the different text patterns in source code. Therefore, the code does not have to compile, or even to be complete, for STAC to work. Different regular expressions are provided to match the programming language of the project being indexed.
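The following is a minimal sketch of regex-based extraction, not STAC's actual implementation. The patterns `COMMENT_RE` and `IDENTIFIER_RE` and the function `extract_tokens` are hypothetical names chosen for illustration, and the patterns assume a C-family language (they do not handle comment markers inside string literals). The point is that the input snippet is incomplete and would not compile, yet comments and identifiers can still be recovered.

```python
import re

# Hypothetical patterns for a C-family language; STAC's own expressions may differ.
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def extract_tokens(source):
    """Return (comments, identifiers) found in a source string."""
    comments = COMMENT_RE.findall(source)
    # Blank out comments so their words are not counted as identifiers.
    code_only = COMMENT_RE.sub(" ", source)
    identifiers = IDENTIFIER_RE.findall(code_only)
    return comments, identifiers


# An incomplete snippet: the statement and the method body are never closed.
snippet = """
// compute the total price
public int getTotalPrice(Order order) {
    int sum = order.basePrice
"""

comments, identifiers = extract_tokens(snippet)
print(comments)
print(identifiers)
```

Because the matching is purely textual, no parser or compiler is involved, which is why broken or partial files can still be indexed.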
STAC uses camel-case splitting to separate tokens into natural-language words and system-specific words (such as abbreviations and acronyms). The user can choose "advance splitting" to split tokens that do not follow proper camel-casing.
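Camel-case splitting can be sketched with a single regular expression. This is an illustrative version under assumed rules (acronyms are kept whole, underscores also act as separators), not the splitter STAC ships; `CAMEL_RE` and `split_camel_case` are hypothetical names.

```python
import re

# Alternatives, tried in order:
#   1. an acronym that is followed by a capitalized word (e.g. "XML" in "XMLDocument")
#   2. a word, optionally starting with one capital letter
#   3. a remaining run of capitals (e.g. a trailing acronym)
#   4. a run of digits
CAMEL_RE = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def split_camel_case(token):
    """Split an identifier on underscores and camel-case boundaries."""
    parts = []
    for chunk in re.split(r"[_\W]+", token):
        parts.extend(CAMEL_RE.findall(chunk))
    return parts


print(split_camel_case("getTotalPrice"))     # ['get', 'Total', 'Price']
print(split_camel_case("parseXMLDocument"))  # ['parse', 'XML', 'Document']
print(split_camel_case("user_id"))           # ['user', 'id']
```

Tokens with no case boundaries at all (for example `totalprice`) pass through this sketch unsplit; handling them would need a dictionary-based approach.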
STAC uses the Porter Stemmer (http://www.tartarus.org/~martin/PorterStemmer) to reduce words to their morphological roots by removing derivational and inflectional suffixes.
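To give a feel for suffix removal, the sketch below implements only step 1a of the published Porter algorithm (plural handling). The full stemmer has several further steps, and this is not the code STAC uses; the function name `porter_step_1a` is chosen here for illustration.

```python
def porter_step_1a(word):
    """Apply step 1a of the Porter algorithm to a lowercase word.

    Rules, tried in order (longest suffix first):
      SSES -> SS, IES -> I, SS -> SS, S -> (removed)
    """
    if word.endswith("sses"):
        return word[:-2]
    if word.endswith("ies"):
        return word[:-2]
    if word.endswith("ss"):
        return word
    if word.endswith("s"):
        return word[:-1]
    return word


for w in ["caresses", "ponies", "caress", "cats"]:
    print(w, "->", porter_step_1a(w))
```

Note that stems such as `poni` are not dictionary words, which is the main difference from the lemmas described next.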
STAC integrates Stanford CoreNLP (http://stanfordnlp.github.io/CoreNLP/) to reduce words to their lemmas (linguistically valid forms).
- STAC provides a spell-checking feature.
- STAC generates basic statistical information about the project.
- STAC integrates a user-defined dictionary that lets the user specify tokens that should not be split.
- STAC maintains two lists of stop words: one of natural language words (e.g., the, shall) and one of programming keywords (e.g., private, int).
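Stop-word filtering with two separate lists can be sketched as follows. The list contents beyond the four examples given above, and the names `NATURAL_STOP_WORDS`, `PROGRAMMING_STOP_WORDS`, and `remove_stop_words`, are assumptions made for illustration; STAC's own lists are not reproduced here.

```python
# Hypothetical, deliberately tiny lists; real lists would be much longer.
NATURAL_STOP_WORDS = {"the", "shall", "a", "of"}
PROGRAMMING_STOP_WORDS = {"private", "int", "public", "void"}


def remove_stop_words(tokens):
    """Drop tokens that appear in either stop list (case-insensitive)."""
    stop_words = NATURAL_STOP_WORDS | PROGRAMMING_STOP_WORDS
    return [t for t in tokens if t.lower() not in stop_words]


tokens = ["private", "int", "total", "The", "price"]
print(remove_stop_words(tokens))  # ['total', 'price']
```

Keeping the two lists separate makes it possible to swap the keyword list per programming language while leaving the natural language list unchanged.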
Currently STAC supports the following programming languages:
- C# (.cs)
- Java (.java)
- C++ (.cpp, .h)
For more details on using STAC, see Help.md.
In order to open and modify the C# source project you will need:
Once these are available, open SourceCodeIndexer.sln in the src directory in Visual Studio and set SourceCodeIndexer.UI as the startup project.