ltcmelo/psychec


Psyche-C

Psyche-C is a compiler frontend for the C programming language, designed specifically for implementing static analysis tools. These are the features that set it apart:

  • Clean separation between the syntactic and semantic compiler phases.
  • Algorithmic and heuristic syntax disambiguation.
  • Type inference of missing struct, union, enum, and typedef
    (i.e., tolerance and "recovery" against #include failures).
  • API inspired by that of the Roslyn .NET compiler.
  • AST resembling that of LLVM's Clang frontend.

Library and API

Psyche-C is implemented as a library. Its native API is in C++ (APIs for other languages are on the way).

void analyse(const SourceText& srcText, const FileInfo& fi)
{
    ParseOptions parseOpts;
    parseOpts.setTreatmentOfAmbiguities(ParseOptions::TreatmentOfAmbiguities::DisambiguateAlgorithmically);
    
    auto tree = SyntaxTree::parseText(srcText,
                                      TextPreprocessingState::Preprocessed,
                                      TextCompleteness::Fragment,
                                      parseOpts,
                                      fi.fileName());

    auto compilation = Compilation::create("code-analysis");
    compilation->addSyntaxTree(tree.get());

    AnalysisVisitor analysis(tree.get(), compilation->semanticModel(tree.get()));
    analysis.run(tree->translationUnitRoot());
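}

// `override` belongs on the in-class declaration, not on this out-of-class definition.
SyntaxVisitor::Action AnalysisVisitor::visitFunctionDefinition(const FunctionDefinitionSyntax* node)
{
    // Look up the symbol declared by this function definition.
    const auto* sym = semaModel->declaredSymbol(node);
    if (sym && sym->kind() == SymbolKind::Function) {
        const FunctionSymbol* funSym = sym->asFunction();
        // ...
    }
    return Action::Skip;
}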

The cnippet Driver

Psyche-C comes with the cnippet driver so that it can also be used as an ordinary C parser.

void f()
{
    int ;
}

If you "compile" the snippet above with cnippet, you'll see a diagnostic similar or identical to the one GCC or Clang would produce.

~ cnip test.c
test.c:4:4 error: declaration does not declare anything
int ;
    ^
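
For context, the diagnostic reflects the C rule that a declaration must declare at least a declarator, a tag, or the members of an enumeration (C11, section 6.7, paragraph 2). The following is a minimal sketch of declarations that satisfy the rule, with the rejected one left in a comment. The names point, counter, and sum_point are hypothetical and are not part of Psyche-C.

```c
/* OK: declares the tag `point`, even though no object is declared. */
struct point { int x; int y; };

/* OK: declares the enumeration constants RED (0) and GREEN (1). */
enum { RED, GREEN };

/* OK: declares the declarator `counter` (zero-initialised at file scope). */
int counter;

/* Not OK: `int ;` has a type specifier but no declarator, tag, or
 * enumerator, so it "does not declare anything". */
/* int ; */

/* Hypothetical helper that uses the declarations above. */
int sum_point(struct point p)
{
    return p.x + p.y + GREEN;
}
```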

NOTE: Semantic analysis isn't yet complete.

Type Inference

Psyche-C can infer the missing types of a code snippet (a.k.a. an incomplete program or program fragment).

void f()
{
    T v = 0;
    v->value = 42;
    v->next = v;
}

If you compile the snippet above with GCC or Clang, you'll see a diagnostic such as "declaration for T is not available".
With cnippet, "compilation" succeeds, as the following definitions are (implicitly) synthesised.

typedef struct TYPE_2__ TYPE_1__;
struct TYPE_2__ 
{
    int value;
    struct TYPE_2__* next;
} ;
typedef TYPE_1__* T;
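
One way to check that these definitions do type the snippet is to place them in front of a function with the same body shape and compile the result with an ordinary C compiler. The sketch below does this under one assumption: make_node is a hypothetical variant of f that allocates storage with malloc, because the original f starts from a null pointer, which type-checks but must not be executed.

```c
#include <stdlib.h>

/* Definitions as synthesised for the snippet. */
typedef struct TYPE_2__ TYPE_1__;
struct TYPE_2__
{
    int value;
    struct TYPE_2__* next;
};
typedef TYPE_1__* T;

/* Hypothetical variant of f(): same member accesses as the snippet,
 * but `v` points to allocated storage so the code can safely run.
 * Returns NULL if allocation fails; the caller frees the node. */
T make_node(void)
{
    T v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    v->value = 42;  /* `value` is typed as int */
    v->next = v;    /* `next` is typed as a pointer to the same struct */
    return v;
}
```

Calling make_node() from a main function and reading back value and next exercises both inferred members; the node should be released with free afterwards.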

These are a few applications of type inference for C:

  • Enabling, on incomplete source-code, static analysis techniques that require fully-typed programs.
  • Compiling partial code (e.g., a snippet retrieved from a bug tracker) for object-code inspection.
  • Generating test-input data for a function in isolation (without its dependencies).
  • Quick prototyping of an algorithm, without the need for explicit types.

NOTE: Type inference isn't yet available on master, only in the original branch.

Documentation and Resources

Building and Testing

Except for type inference, which is written in Haskell, Psyche-C is written in C++17; cnippet is written in Python 3.

To build:

cmake CMakeLists.txt && make -j 4

To run the tests:

./test-suite

Related Publications