Preprocessor

pad edited this page Oct 22, 2010 · 8 revisions

Introduction

A recurring problem with the C/C++ parser included in pfff is that it fails on code using certain cpp idioms. For instance, given:

$ cat class.cpp
#define SLOT
class A {
        public SLOT:
          void foo(int);
};
int main() {}

The parser will choke with:

$ /pfff -parse_cpp class.cpp
parse error
 = File "class.cpp", line 3, column 8,  charpos = 31
    around = 'SLOT', whole content =    public SLOT:
ERROR-RECOV: found sync '}' at line 5
ERROR-RECOV: found sync bis, eating } and ;
badcount: 4
bad: #define SLOT
bad: class A {
BAD:!!!!!       public SLOT:
bad:      void foo(int);
bad: };

This is "normal". The C/C++ parser in pfff tries to parse the source code as-is, without first running cpp, the C preprocessor. The reason is that in a refactoring context you cannot call cpp: it expands all macros and ifdefs, producing a representation of the source code that is inconvenient to work on. Indeed, a refactoring applied to the expanded code would not easily propagate back to the original source (before expansion). To remain usable, the C++ parser recognizes a few macro idioms such as FOR_EACH, attributes, etc. This is documented in [1].
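To illustrate the kind of idiom involved, here is a small sketch of a FOR_EACH-style macro. The macro definition and the function name sum_all are assumptions made for this example; they are not taken from pfff or from any particular code base. Without expansion, the macro use looks like a function call followed by a block, which is not a valid C++ statement, so a parser working on unpreprocessed code has to treat it specially.

```cpp
#include <vector>

// Hypothetical foreach-style macro, defined here only for illustration.
#define FOR_EACH(var, container) for (const auto& var : container)

// Sums the elements of a vector using the macro.
int sum_all(const std::vector<int>& values) {
    int total = 0;
    // Unexpanded, this line reads as "call(...) { block }", which the plain
    // C++ grammar does not accept; cpp turns it into a range-based for loop.
    FOR_EACH(v, values) {
        total += v;
    }
    return total;
}
```

A regular compiler never sees the problem because cpp runs first. A tool that must keep FOR_EACH visible in its output cannot take that route.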

Supported idioms

The macro file hook

TODO Use -D or -macro_file.

TODO New semantics for the -macro_file option: by default, macros are now expanded only when necessary.

Partial expansion

TODO Handling of expanded code containing ##. Now compute the result.
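As a reminder of what ## does in standard cpp, here is a minimal sketch of token pasting. The macro MAKE_GETTER and the struct Point are hypothetical names chosen for this example. It only shows the behavior of the C preprocessor itself and says nothing about how pfff computes the result.

```cpp
// Hypothetical macro: ## pastes "get_" and the argument into one identifier.
#define MAKE_GETTER(field) int get_##field() const { return field; }

struct Point {
    int x;
    int y;
    MAKE_GETTER(x)  // cpp produces: int get_x() const { return x; }
    MAKE_GETTER(y)  // cpp produces: int get_y() const { return y; }
};
```

The identifiers get_x and get_y do not appear anywhere in the unexpanded source, which is why expanded code containing ## needs the pasted token to be computed before it can be parsed.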

TODO Expand macros only when needed, that is, when there is actually a parse error, and also leverage the definitions of macros in the parsed file (or in an optional_standard.h file passed as a parameter). This should reduce the need for many hardcoded definitions in standard.h.

Special macro keywords

TODO YACFE_IDENT_BUILDER

The macro file "builtins"

TODO To force use the -macro_file_builtins option instead.

Help generating the macro file

TODO Top 10 recurring problematic macros. Consider the ident tokens also in the 2 lines before the error line for the 10-most-problematic-parsing-errors diagnostic.

TODO A new -extract_macros command line action to help the parser. Works with the -macro_file option, e.g.

$ ./spatch -extract_macros ~/linux > /tmp/alldefs.h
$ ./spatch -macro_file /tmp/alldefs.h -sp_file foo.cocci -dir ~/linux

Debugging parsing errors

TODO -verbose_parsing

References

[1] Yoann Padioleau, Parsing C/C++ Code without Pre-processing, CC'2009
http://padator.org/papers/yacfe-cc09.pdf