
Added some files with analysis of Compiler

1 parent 3363bbf commit 02949ea567c90aed62b211202cb827458246fafc @dmiller dmiller committed Sep 8, 2009
@@ -50,6 +50,10 @@ public override Type ClrType
get { return typeof(Boolean); }
}
+ #endregion
+
+ #region Code generation
+
public override Expression GenDlr(GenContext context)
{
return Expression.Constant(_val);
@@ -83,22 +83,12 @@ public override Expression GenDlr(GenContext context)
for (int i = 0; i < argCount; i++ )
args[i] = Compiler.MaybeBox(((Expr)_args.nth(i)).GenDlr(context));
- Expression call = GenerateInvocation(InvocationReturnType, fn, args);
+ Expression call = GenerateInvocation(fn, args);
return call;
}
- private Type InvocationReturnType
- {
- get
- {
- return (_tag == null)
- ? null
- : Compiler.TagToType(_tag);
- }
- }
-
- private static Expression GenerateInvocation(Type returnType, Expression fn, Expression[] args)
+ private static Expression GenerateInvocation(Expression fn, Expression[] args)
{
MethodInfo mi;
Expression[] actualArgs;
Binary file not shown.
@@ -0,0 +1,174 @@
+Somewhere (far) below are some questions on the internals of the Clojure compiler,
+all related to one particular aspect of the compiler, but one that is ubiquitous.
+
+I've been trying to analyze the context parameter used in parsing and emitting code:
+the one of type C, where
+
+public enum C{
+ STATEMENT, //value ignored
+ EXPRESSION, //value required
+ RETURN, //tail position relative to enclosing recur frame
+ EVAL
+}
+
+This parameter is passed to IParser.parse() and Expr.emit() and some others,
+so it is everywhere.
+
+The first three values are fairly obvious. The last is a bit more mysterious.
+And the root calls into the parser (to analyze), for both eval and compile, pass C.EVAL.
+
+
+I stepped through all the Compiler code, looking for how the context parameter is used.
+
+
+Parsing
+=======
+
+Parsing subforms
+----------------
+
+There are several idioms for how the context is passed on when parsing subforms:
+
+(a) pass the current value through to the subform
+(b) pass C.EXPRESSION to the subform -- i.e., the subform's value is required
+(c) pass a value to the subform computed by: context == C.EVAL ? context : C.EXPRESSION
+ in other words, pass along EVAL if that's the current context,
+ otherwise, we need a value (do not pass along STATEMENT or RETURN)
+(d) ignore the context -- it is not used
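The first three idioms above can be sketched as small Java functions. This is only an illustration: the class SubformContextSketch and its method names are made up for this note and do not appear in the compiler source; only the enum C is taken from the text.

```java
// Hypothetical sketch: SubformContextSketch and its method names are
// illustrative and are not part of the Clojure compiler.
final class SubformContextSketch {
    // Mirrors the enum C quoted earlier in this file.
    enum C { STATEMENT, EXPRESSION, RETURN, EVAL }

    // Idiom (a): hand the current context to the subform unchanged.
    static C passThrough(C context) {
        return context;
    }

    // Idiom (b): the subform's value is always required.
    static C requireValue(C context) {
        return C.EXPRESSION;
    }

    // Idiom (c): keep EVAL if that is the current context,
    // otherwise demand a value (never pass along STATEMENT or RETURN).
    static C evalOrExpression(C context) {
        return context == C.EVAL ? context : C.EXPRESSION;
    }

    public static void main(String[] args) {
        // Show what idiom (c) passes down for each incoming context.
        for (C context : C.values()) {
            System.out.println(context + " -> " + evalOrExpression(context));
        }
    }
}
```

Idiom (d) needs no function at all, since the context argument is simply never read.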
+
+There are two special cases:
+
+(a) in BodyExpr, all subforms but the last are parsed with C.STATEMENT unless the
+ current context is C.EVAL (in which case use C.EVAL).
+ The last subform is parsed with the current context.
+(b) The finally clause of a TryExpr is parsed with C.STATEMENT.
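The BodyExpr rule in (a) can be modeled as a function from the current context and the number of subforms to the list of contexts handed to each subform. This is a sketch of the rule as described here, not the compiler's code; BodyContextSketch and contextsForBody are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: BodyContextSketch and contextsForBody are
// illustrative names, not part of the Clojure compiler.
final class BodyContextSketch {
    // Mirrors the enum C quoted earlier in this file.
    enum C { STATEMENT, EXPRESSION, RETURN, EVAL }

    // Returns the context each of the formCount body subforms is parsed with.
    static List<C> contextsForBody(C context, int formCount) {
        List<C> result = new ArrayList<>();
        for (int i = 0; i < formCount; i++) {
            boolean isLast = (i == formCount - 1);
            if (isLast) {
                // The last subform inherits the current context.
                result.add(context);
            } else {
                // Earlier subforms are statements, unless we are in EVAL.
                result.add(context == C.EVAL ? C.EVAL : C.STATEMENT);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [STATEMENT, STATEMENT, RETURN]
        System.out.println(contextsForBody(C.RETURN, 3));
    }
}
```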
+
+
+Parsing escape
+--------------
+
+There are several expression types where the form to be parsed wraps itself in
+a no-arg function -- (fn [] form) -- and parses the resulting form instead.
+
+Those are:
+
+LetExpr: if ( context == C.EVAL || ( context == C.EXPRESSION && isLoop ) ) ...
+LetFnExpr: if ( context == C.EVAL ) ...
+ThrowExpr: if ( context == C.EVAL ) ...
+TryExpr: if ( context != C.RETURN ) ...
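The four conditions above can be gathered into a single predicate, which makes it easier to see which combinations trigger the (fn [] form) wrapping. The class FnWrapSketch, the enum Kind, and the method wrapsInFn are hypothetical names introduced for this note; the conditions themselves are the ones listed above.

```java
// Hypothetical sketch: FnWrapSketch, Kind, and wrapsInFn are illustrative
// names, not part of the Clojure compiler.
final class FnWrapSketch {
    // Mirrors the enum C quoted earlier in this file.
    enum C { STATEMENT, EXPRESSION, RETURN, EVAL }

    // The four expression types listed above.
    enum Kind { LET, LETFN, THROW, TRY }

    // True when the parser would re-parse the form as (fn [] form).
    // isLoop only matters for LET (i.e., let vs. loop).
    static boolean wrapsInFn(Kind kind, C context, boolean isLoop) {
        switch (kind) {
            case LET:
                return context == C.EVAL || (context == C.EXPRESSION && isLoop);
            case LETFN:
            case THROW:
                return context == C.EVAL;
            case TRY:
                return context != C.RETURN;
            default:
                throw new IllegalArgumentException("unknown kind: " + kind);
        }
    }

    public static void main(String[] args) {
        // Tabulate the predicate for every kind and context (isLoop = true).
        for (Kind kind : Kind.values()) {
            for (C context : C.values()) {
                System.out.println(kind + " in " + context + ": "
                        + wrapsInFn(kind, context, true));
            }
        }
    }
}
```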
+
+
+recur context
+-------------
+
+RecurExpr's parser throws an exception if the context is not C.RETURN.
+
+
+
+Emitting code
+=============
+
+Emitting subforms
+-----------------
+
+There are several idioms for how the context is passed on to subforms when they are
+asked to emit code.
+
+(a) pass through the current context to subform emit calls
+(b) pass C.EXPRESSION to subform emit calls
+(c) ignored
+
+Again, BodyExpr and TryExpr are the exceptions, just as in parsing:
+
+(a) BodyExpr: all subforms but last are emitted with context C.STATEMENT,
+ the last form is emitted using the current context
+(b) TryExpr: the finally clause is emitted with C.STATEMENT. Also, if the
+ current context is not C.STATEMENT, code is emitted to save and restore
+ the value from the body's evaluation, giving a value to be returned to the caller.
+
+
+Popping the stack
+-----------------
+
+The context is used to determine when a value needs to be popped from the runtime
+evaluation stack:
+
+ if(context == C.STATEMENT) gen.pop();
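To make the effect of that line concrete, here is a toy model in which "emitting" a constant is replaced by directly pushing onto a stand-in stack. Everything here (StatementPopSketch, emitConstant, depthAfter, the use of a Deque) is a hypothetical simplification; the real compiler emits bytecode through gen rather than manipulating a stack itself.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: a Deque stands in for the runtime evaluation stack.
// None of these names come from the Clojure compiler.
final class StatementPopSketch {
    // Mirrors the enum C quoted earlier in this file.
    enum C { STATEMENT, EXPRESSION, RETURN, EVAL }

    // Stand-in for emitting a constant: the value lands on the stack,
    // and is discarded again when the context says it is ignored.
    static void emitConstant(Deque<Object> stack, C context, Object value) {
        stack.push(value);
        if (context == C.STATEMENT) {
            stack.pop();
        }
    }

    // Stack depth left behind after emitting one constant in the given context.
    static int depthAfter(C context) {
        Deque<Object> stack = new ArrayDeque<>();
        emitConstant(stack, context, Boolean.TRUE);
        return stack.size();
    }

    public static void main(String[] args) {
        for (C context : C.values()) {
            System.out.println(context + " leaves depth " + depthAfter(context));
        }
    }
}
```

In this model only STATEMENT leaves the stack empty; every other context leaves the value in place for whoever asked for it.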
+
+
+
+Clearing locals
+---------------
+
+Finally, the context is used to force locals to be cleared before a tail call:
+
+ if(context == C.RETURN)
+ {
+ FnMethod method = (FnMethod) METHOD.deref();
+ method.emitClearLocals(gen);
+ }
+
+
+Getting started
+===============
+
+When #eval is handed a form to evaluate, it does one of several things (after macroexpansion):
+
+(a) If the form is (do form-1 form-2 ... ), eval each form-i in turn and return
+ the value of the last one.
+
+(b) If the form is an IPersistentCollection that does not begin with a Symbol whose
+ name starts with "def", wrap the form in a no-arg function -- (fn [] form),
+ parse it starting with context C.EXPRESSION, then instantiate and eval the function.
+
+(c) otherwise, parse it starting with context C.EVAL.
+
+
+When #compile1 is handed a form to compile, it is similar, except case (b) is eliminated.
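The three-way dispatch in #eval, and the two-way dispatch in #compile1, can be sketched as a classifier over forms. This is a toy model under loud assumptions: a java.util.List stands in for any IPersistentCollection, a String at the head of a list stands in for a Symbol, and the names EvalDispatchSketch, Strategy, classifyForEval, and classifyForCompile are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: Lists model collections and head Strings model
// Symbols. None of these names come from the Clojure compiler.
final class EvalDispatchSketch {
    enum Strategy { EVAL_EACH_DO_FORM, WRAP_IN_FN, PARSE_WITH_EVAL }

    // Case (a): the form is (do form-1 form-2 ...).
    static boolean isDoForm(Object form) {
        if (!(form instanceof List)) return false;
        List<?> list = (List<?>) form;
        return !list.isEmpty() && "do".equals(list.get(0));
    }

    // True when the form begins with a symbol whose name starts with "def".
    static boolean isDefForm(Object form) {
        if (!(form instanceof List)) return false;
        List<?> list = (List<?>) form;
        return !list.isEmpty()
                && list.get(0) instanceof String
                && ((String) list.get(0)).startsWith("def");
    }

    static Strategy classifyForEval(Object form) {
        if (isDoForm(form)) return Strategy.EVAL_EACH_DO_FORM;   // case (a)
        if (form instanceof List && !isDefForm(form)) {
            return Strategy.WRAP_IN_FN;                          // case (b)
        }
        return Strategy.PARSE_WITH_EVAL;                         // case (c)
    }

    // Same as eval, except that case (b) is eliminated.
    static Strategy classifyForCompile(Object form) {
        return isDoForm(form) ? Strategy.EVAL_EACH_DO_FORM
                              : Strategy.PARSE_WITH_EVAL;
    }

    public static void main(String[] args) {
        Object call = Arrays.asList("+", 1, 2);
        Object def = Arrays.asList("defn", "f");
        System.out.println(classifyForEval(call));
        System.out.println(classifyForEval(def));
        System.out.println(classifyForCompile(call));
    }
}
```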
+
+
+Questions
+=========
+
+The reason I'm so interested in this is that as currently written, the compiler
+in ClojureCLR does not do _any_ of this. However, I have to add a little more
+machinery to make sure that I catch recur calls occurring in the wrong places,
+so I found myself analyzing the machinery in ClojureJVM that handles that.
+And I've learned that I ignore things that Rich has done at my own peril.
+So I'm trying to make sure I'm not missing some crucial aspect of the use of
+Context in parsing forms and emitting code.
+
+
+(0) What am I missing? (read that question any way you want)
+
+(1) Is there a succinct description for the meaning of C.EVAL?
+
+(2) Why are forms of the shape (defXXX ... ) singled out for special treatment
+ in eval?
+
+
+(3) (T or F) The forms that wrap themselves as (fn [] form) do so because they need
+ to make sure they have a context for local variable mangling, closing over, etc.,
+ that is provided only by a FnExpr. (Those forms are let, letfn, throw, try.)
+
+
+(4) Do we need an interpreted eval? Is there a problem with just wrapping any form
+ in an (fn [] form) and processing that?
+
+
+(5) If there is no need for the three separate cases in eval, is there any need for the
+ information carried by a context of C.EVAL?
+
+
+(6) If we don't have to worry about popping values from the runtime stack where values
+ are being ignored, is there any need for the information carried by C.STATEMENT?
+ (In ClojureCLR, the DLR handles this automatically.)
+
+And while I'm at it: C.RETURN is used for tail-call detection. Its use for recur is obvious.
+Its use with emitClearLocals is less obvious:
+
+
+(7) Are there any succinct examples of the problem being solved by emitClearLocals?
+
+(8) Why is emitClearLocals not called in more places? (It is called for InstanceMethodExpr,
+ InvokeExpr, NewExpr, and StaticMethodExpr.)
