This repository has been archived by the owner on Jun 4, 2019. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 204
/
parse_php.mli.nw
258 lines (199 loc) · 7.55 KB
/
parse_php.mli.nw
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
\section{The main entry point of \pfff, [[Parse_php.parse]]}
The [[parse_php.mli]] file defines the main function
to parse a PHP file:
<<parse_php.mli>>=
<<type parsing_stat>>
<<type program2>>
(* This is the main function *)
val parse :
?pp:string option ->
Common.filename -> (program2 * parsing_stat)
val parse_program:
?pp:string option ->
Common.filename -> Ast_php.program
<<extra parse function signature>>
@
The parser does not just return the AST of the file
(normally a [[Ast_php.program]] type, an alias
for [[Ast_php.toplevel list]])
but also the tokens associated with each toplevel elements and its string
representation (the [[program2]] type below), as well as parsing
statistics (the [[parsing_stat]] type defined in the next section).
<<type program2>>=
type program2 = toplevel2 list
and toplevel2 =
Ast_php.toplevel (* NotParsedCorrectly if parse error *) * info_item
(* the token list contains also the comment-tokens *)
and info_item = (string * Parser_php.token list)
@
\footnote{
The previous snippet contains a note about the [[NotParsedCorrectly]]
constructor which was originally used to provide error recovery
in the parser. This is not used any more but it may be back in the futur.
}
Returning also the tokens is useful as the AST
itself by default does not contain the comment or whitespace tokens
(except when one call the [[comment_annotate_php]] function
in [[pfff/analyzis_php/]]) but some later processing
phases may need such information. For instance the \pfff
semantic code visualizer ([[pfff_browser]] in [[pfff/gui/]]) need
those information to colorize not only the code but also the comments.
If one does not care about those extra information, the
[[program_of_program2]] function helps getting only the ``raw'' AST:
<<parse_php.mli>>=
val program_of_program2 : program2 -> Ast_php.program
@
See the definition of [[Ast_php.program]] in the next
chapter.
The [[pp]] argument of [[parse]], for PHP preprocessing support,
will be explained in Section~\ref{sec:pp}.
\iffacebook
By default it will be equal to [[Some "xhpize"]] so that
the file is preprocessed by XHP first and then parsed by \pfff.
\fi
The [[parse_php.mli]] defines also a PHP tokenizer, a subpart
of the parser that may be useful on its own.
\l to redo mark slee checkModule, or simply to help debug the full parser
<<parse_php.mli>>=
val tokens: Common.filename -> Parser_php.token list
@
%(*val parse_piece_by_piece : Common.filename -> (program2 * parsing_stat)*)
\section{Parsing statistics}
<<type parsing_stat>>=
type parsing_stat = {
filename: Common.filename;
mutable correct: int;
mutable bad: int;
}
@
<<parse_php.mli>>=
val print_parsing_stat_list: parsing_stat list -> unit
@
\section{[[pfff -parse_php]]}
You can test the parsing capability of \pfff by calling it
with the [[-parse_php]] command line argument:
<<test_parsing_php actions>>=
"-parse_php", " <file or dir>",
Common.mk_action_n_arg test_parse_php;
@
<<test_parse_php>>=
let test_parse_php xs =
let ext = ".*\\.\\(php\\|phpt\\)$" in
(* could now use Lib_parsing_php.find_php_files_of_dir_or_files *)
let fullxs = Common.files_of_dir_or_files_no_vcs_post_filter ext xs in
let stat_list = ref [] in
<<initialize -parse_php regression testing hash>>
Common.check_stack_nbfiles (List.length fullxs);
fullxs +> List.iter (fun file ->
pr2 ("PARSING: " ^ file);
let (xs, stat) = Parse_php.parse file in
Common.push2 stat stat_list;
<<add stat for regression testing in hash>>
);
Parse_php.print_parsing_stat_list !stat_list;
<<print regression testing results>>
@
<<initialize -parse_php regression testing hash>>=
let newscore = Common.empty_score () in
@
<<add stat for regression testing in hash>>=
let s = sprintf "bad = %d" stat.Parse_php.bad in
if stat.Parse_php.bad = 0
then Hashtbl.add newscore file (Common.Ok)
else Hashtbl.add newscore file (Common.Pb s)
;
@
<<print regression testing results>>=
let dirname_opt =
match xs with
| [x] when is_directory x -> Some x
| _ -> None
in
let score_path = Filename.concat Config.path "tmp" in
dirname_opt +> Common.do_option (fun dirname ->
pr2 "--------------------------------";
pr2 "regression testing information";
pr2 "--------------------------------";
let str = Str.global_replace (Str.regexp "/") "__" dirname in
Common.regression_testing newscore
(Filename.concat score_path
("score_parsing__" ^str ^ ext ^ ".marshalled"))
);
()
@
\section{Preprocessing support, [[pfff -pp]]}
\label{sec:pp}
It is not uncommon for programmers to extend their programming
language by using preprocessing tools such as [[cpp]] or [[m4]]. \pfff
by default will probably not be able to parse such files as they may
contain constructs which are not proper PHP constructs (but [[cpp]] or
[[m4]] constructs). A solution is to first call your preprocessor on
your file and feed the result to \pfff. Some little support is provided
by \pfff to make it easy to call your preprocessor on-the-fly by accepting
a [[-pp]] command line argument.
\l #line ?
\l Expanded \ref{sec:ast-virtual-elements}
<<flag_parsing_php.ml pp related flags>>=
let verbose_pp = ref false
let caching_parsing = ref false
(* in facebook context, we want xhp support by default *)
let xhp_builtin = ref true
(* Alternative way to get xhp by calling xhpize as a preprocessor.
* Slower than builtin_xhp and have some issues where the comments
* are removed, unless you use the experimental_merge_tokens_xhp
* but which has some issues itself.
*)
let pp_default = ref (None: string option)
let xhp_command = "xhpize"
let obsolete_merge_tokens_xhp = ref false
let cmdline_flags_pp () = [
"-pp", Arg.String (fun s -> pp_default := Some s),
" <cmd> optional preprocessor (e.g. xhpize)";
"-verbose_pp", Arg.Set verbose_pp,
" ";
<<other cmdline_flags_pp>>
]
@
\section{XHP support, [[pfff -parse_xhp]]}
A useful PHP preprocessor is XHP\cite{XHP}. Some special
support for XHP is provided in \pfff:
XHP is something between a programmatic UI library and a full templating system.
<<other cmdline_flags_pp>>=
"-xhp", Arg.Set xhp_builtin,
" parsing XHP constructs (default)";
"-xhp_with_xhpize", Arg.Unit (fun () ->
xhp_builtin := false;
pp_default := Some xhp_command),
" parsing XHP using xhpize as a preprocessor";
"-no_xhp", Arg.Clear xhp_builtin,
" ";
@
One can either use the [[-pp xhpize]] or [[-xhp]]
command line options as a first
way to handle PHP files using XHP. Here is how to use it:
\begin{verbatim}
$ ./pfff -pp xhpize foo_with_xhp_construct.php
or
$ ./pfff -xhp foo_with_xhp_construct.php
\end{verbatim}
Note that this is only a partial solution to the management of XHP
or other extensions. Indeed, in a refactoring context, one would
prefer to have in the AST a direct representation of the actual
source file. So, \pfff also supports certain extensions directly
in the AST as explained in Section~\ref{sec:ast-xhp}.
\section{Xdebug traces parsing}
As explained later in Section~\ref{sec:ast-xdebug}, \pfff
provides some support to parse xdebug traces. The grammar and ASTs
have been extended to handle certain xdebug constructs used
in the traces. Moreover, a faster version of [[Parse_php.parse]]
specialized to the parsing of PHP expressions in those traces
is provided:
<<extra parse function signature>>=
val xdebug_expr_of_string: string -> Ast_php.expr
val class_def_of_string: string -> Ast_php.class_def
@
\section{Other parsing functions}
<<extra parse function signature>>=
val expr_of_string: string -> Ast_php.expr
val program_of_string: string -> Ast_php.program
@