SICP metacircular evaluator #33

Open
wants to merge 9 commits into
base: master


@whilo whilo commented Sep 9, 2013

Hi,

I have worked through parts of SICP for our own language implementation https://github.com/lamda-lang/jekyll. I am interested in writing a Clojure/Lamda runtime with the help of ClojureC (instead of bootstrapping from a C runtime), even if only for the fun of it. So I have implemented the metacircular evaluator in Clojure and ported it to ClojureC over the weekend. I have added missing cljs functions, but a reader is still missing.
Do you have any recommendations on how to approach the reader? I would try to port cljs.reader, but I am still unfamiliar with most of the pieces I am poking at.

Christian

runtime with ClojureC. Copy/port some cljs functions to core.cljc,
some reader still needs porting. Fix README.
@schani
Owner

schani commented Sep 9, 2013

This is wonderful! I'll have a close look at it as soon as I have time.

Regarding the reader: I took a cursory glance at the ClojureScript reader and it seems that apart from some string manipulation stuff we should be in pretty good shape to use it with little modification. Email me if there are any particular points you're curious about or need help with - mark.probst@gmail.com

Port reader from ClojureScript, works for:
Integer
Ratio
Float/Double
String
Symbol
Keyword
Vector
List
Map

Some problem corrupts symbols though, still investigating, possibly a
garbage collection error. Can be tested by applying nested arithmetic
expressions repeatedly with the metacircular REPL.
@whilo
Author

whilo commented Sep 13, 2013

Ok. The reader basically seems to work, but symbols get corrupted after a short delay. Just compose simple arithmetic expressions like (+ (* 4 10) 2) or (cons 5 (cdr '(1 2))) and extend them on the metacircular REPL. At some point of extension (the first one was enough on my machine) the symbol "+" is sometimes corrupted, and sometimes the expression evaluates fine. The longer the expression, the more likely the corruption, it seems. I suspected a garbage collection issue, but just copying the string with "make_string_copy" in "make_symbol" hasn't helped.

There is also a bug in the order of namespace initialization in the driver code for me, btw: the generated driver C code initializes the sample.metacircular (or cljc.user) namespace after cljc.core, which leads to uninitialized externs. I have adjusted it by hand for now.

We can transfer this discussion to email, but then it is not documented publicly. Whatever you prefer, I am still fairly new to GitHub.

@whilo
Author

whilo commented Sep 13, 2013

I meant the driver initializes the user namespace before cljc.core:

    int MAIN_FUNCTION_NAME(int argc, char *argv[]) {
        environment_t *env = NULL;
        cljc_init ();
        BEGIN_MAIN_CODE;
        init_sample_DOT_metacircular ();
        init_cljc_DOT_core ();
        return integer_get (FUNCALL1 ((closure_t*)VAR_NAME (cljc_DOT_core_SLASH_main_exit_value),
            cljc_core_apply (2, (closure_t*)VALUE_NONE, VAR_NAME (sample_DOT_metacircular_SLASH__main),
                FUNCALL2 ((closure_t*)VAR_NAME (cljc_DOT_core_SLASH_vector_from_c_string_array),
                    make_integer (argc), make_raw_pointer (argv)),
                VALUE_NONE, NULL)));
        END_MAIN_CODE;
    }

Maybe I missed something; it worked initially with my checkout from about three months ago.
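
Why the order of the init calls matters can be illustrated with a minimal sketch in plain C. All names here (`core_plus`, `user_add`, `init_core`, `init_user`) are hypothetical and are not part of the generated ClojureC driver; the sketch only assumes that a namespace's init function copies values out of the namespaces it depends on.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*binop_t)(int, int);

/* Hypothetical "core" namespace: its top-level var stays NULL until init_core runs. */
static binop_t core_plus = NULL;

static int plus_impl(int a, int b) { return a + b; }

static void init_core(void) { core_plus = plus_impl; }

/* Hypothetical "user" namespace: its init captures a value from core. */
static binop_t user_add = NULL;

static void init_user(void) { user_add = core_plus; }

/* Resets both namespaces, runs the inits in the requested order, and
   reports whether user_add ended up pointing at a real function. */
static int user_ready_after(int core_first) {
    core_plus = NULL;
    user_add = NULL;
    if (core_first) {
        init_core();
        init_user();
    } else {
        init_user();   /* captures NULL: core is not initialized yet */
        init_core();
    }
    return user_add != NULL;
}

/* Calls through the captured pointer; only valid after a correct init order. */
static int call_user_add(int a, int b) {
    assert(user_add != NULL);
    return user_add(a, b);
}
```

With the dependency initialized first, `user_ready_after(1)` reports success and `call_user_add` works; with the order swapped, the user namespace is left holding a null pointer, which matches the "uninitialized externs" symptom in spirit.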

Fix symbol string corruption by using strdup to copy the symbol
string. Probably this leaks still. Add Objective-C code to
parse-float. REPL works now as expected :-D
@whilo
Author

whilo commented Sep 14, 2013

I have had a look at https://github.com/kanaka/clojurescript and I suspect I have to port compiler.cljs (and analyzer.cljs, which looks straightforward) to emit the cljc runtime primitives (eval). For most primitive types it should be easy to emit within cljc; the problem is integration with the cljc environment. I'd prefer having one environment, basically allowing vars to be (re)defined at runtime. I am not sure whether this is feasible, though. I have not yet looked into how to implement the macro system.

This allows cljc.reader to read: #inst "2013-09-02T16:42:00.000-00:00",
but since we don't have a Date type yet it just emits a vector of the
respective fields as integers. Cleanups.
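
As an illustration of splitting such a timestamp into integer fields, here is a minimal C sketch. The function name `parse_inst` and the choice and order of output fields are assumptions made for this example; the commit does not say which fields cljc.reader actually emits. The sketch only handles the exact layout shown above, with a numeric offset.

```c
#include <assert.h>
#include <stdio.h>

/* Parses "YYYY-MM-DDThh:mm:ss.mmm" followed by an offset "+hh:mm" or "-hh:mm".
   Writes year, month, day, hour, minute, second, millisecond into out[0..6]
   and the signed offset in minutes into out[7].
   Returns 0 on success and -1 if the input does not match this layout. */
static int parse_inst(const char *s, int out[8]) {
    assert(s != NULL && out != NULL);
    char sign;
    int off_hours, off_minutes;
    int n = sscanf(s, "%4d-%2d-%2dT%2d:%2d:%2d.%3d%c%2d:%2d",
                   &out[0], &out[1], &out[2], &out[3], &out[4], &out[5], &out[6],
                   &sign, &off_hours, &off_minutes);
    if (n != 10 || (sign != '+' && sign != '-')) {
        return -1;
    }
    out[7] = (sign == '-' ? -1 : 1) * (off_hours * 60 + off_minutes);
    return 0;
}
```

No range checking is done here (month 13 would be accepted), and a trailing "Z" offset is rejected; a real reader would need both.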
@schani
Owner

schani commented Sep 23, 2013

I'm sorry I'm only responding so late - it's been quite busy lately.

About the string issue: This isn't documented (of course), but make_string() assumes that the string it is given will not mutate or go away. If it might, use make_string_copy(), which does what your modified make_string() does.
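
The difference between borrowing a string and copying it can be sketched in plain C as follows. The `sym_t` record and the `sym_borrow`/`sym_copy` helpers are hypothetical stand-ins for this illustration, not ClojureC's actual `make_string()` or `make_string_copy()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol record; not ClojureC's actual layout. */
typedef struct {
    const char *name;
    int owns_name;   /* 1 if name was allocated by sym_copy */
} sym_t;

/* Borrows the caller's buffer: only safe if that buffer never changes or goes away. */
static sym_t sym_borrow(const char *name) {
    assert(name != NULL);
    sym_t s = { name, 0 };
    return s;
}

/* Takes a private copy, so later changes to the caller's buffer do not matter. */
static sym_t sym_copy(const char *name) {
    assert(name != NULL);
    size_t n = strlen(name) + 1;
    char *copy = malloc(n);
    assert(copy != NULL);
    memcpy(copy, name, n);
    sym_t s = { copy, 1 };
    return s;
}

/* Releases the name only if this symbol owns it. */
static void sym_free(sym_t *s) {
    if (s->owns_name) {
        free((void *)s->name);
    }
    s->name = NULL;
}
```

If a reader reuses one token buffer for every token, a symbol created with `sym_borrow` silently changes its name when the next token is read, while one created with `sym_copy` keeps it. That is one plausible way to get the kind of symbol corruption reported earlier in this thread, though the thread itself does not confirm the exact cause.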

Symbols and keywords: Why didn't you use intern_symbol() and intern_keyword()? They make symbols and keywords unique, so they can be compared via pointer comparison. I don't know why we would want uninterned symbols or keywords.
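
A minimal sketch of what interning buys, in plain C. The `intern` function and its fixed-size table are assumptions made for this example; the real `intern_symbol()` and `intern_keyword()` are not shown in this thread and may work differently (a hash table would be the usual choice).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SYMBOLS 256

/* Hypothetical intern table: one stored copy per distinct name. */
static char *interned[MAX_SYMBOLS];
static size_t interned_count = 0;

/* Returns the unique stored pointer for `name`, adding a copy on first use. */
static const char *intern(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < interned_count; i++) {
        if (strcmp(interned[i], name) == 0) {
            return interned[i];   /* already known: hand back the same pointer */
        }
    }
    assert(interned_count < MAX_SYMBOLS);
    size_t n = strlen(name) + 1;
    char *copy = malloc(n);
    assert(copy != NULL);
    memcpy(copy, name, n);
    interned[interned_count++] = copy;
    return copy;
}
```

Once every symbol goes through `intern`, two symbols are equal exactly when their pointers are equal, so an evaluator can dispatch on `sym == plus_sym` instead of calling `strcmp` each time.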

I'm not sure what you mean by porting compiler.cljs. ClojureC has "ported" compiler.cljs to emit C instead of JavaScript.

As you said, one of the difficulties is interacting with the environment. In the simplest case, redefining variables. Currently this isn't possible because ClojureC top-level variables are represented in the generated C as top-level C variables, and code that references them does so directly via these variables. To make them visible to an interpreter we'd have to build namespace datastructures, like Clojure does, where variables are registered via their symbols. It'll be even harder for protocols, fields, etc, but I'm sure it can be done.
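
One possible shape for such a namespace data structure, sketched in plain C. Everything here (`ns_register`, `ns_lookup`, `ns_redefine`, the `long` value type) is hypothetical and chosen for brevity; it is not how ClojureC represents vars today, and real values would be runtime objects rather than `long`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a generated top-level C variable. */
static long answer = 42;

/* Registry entry: a qualified name plus a pointer to the variable's storage. */
typedef struct {
    const char *name;
    long *slot;
} var_entry_t;

#define MAX_VARS 64
static var_entry_t ns_table[MAX_VARS];
static size_t ns_count = 0;

/* Makes an existing C variable visible under a symbolic name. */
static void ns_register(const char *name, long *slot) {
    assert(name != NULL && slot != NULL);
    assert(ns_count < MAX_VARS);
    ns_table[ns_count].name = name;
    ns_table[ns_count].slot = slot;
    ns_count++;
}

/* Returns the storage for `name`, or NULL if no such var is registered. */
static long *ns_lookup(const char *name) {
    for (size_t i = 0; i < ns_count; i++) {
        if (strcmp(ns_table[i].name, name) == 0) {
            return ns_table[i].slot;
        }
    }
    return NULL;
}

/* Redefines a var by name; returns 0 on success, -1 if the name is unknown. */
static int ns_redefine(const char *name, long value) {
    long *slot = ns_lookup(name);
    if (slot == NULL) {
        return -1;
    }
    *slot = value;
    return 0;
}
```

Because the table stores a pointer to the variable rather than a copy of its value, compiled code that reads `answer` directly still sees a redefinition made by name from an interpreter.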

- Port analyzer and make it accessible from the REPL. Add respective
  runtime data primitives to cljc.core. Doesn't work yet, because
  sample.metacircular building seems to depend on init_cljc_DOT_*
  constructor order in driver.c, either the REGEX constants are not
  properly initialized or the vtable of the methods is null (?).

- Port printing to new IPrintWithWriter protocol and move string
  conversion to str. Isolation of IPrintable protocol still TODO.

- Add UUID. Fix reading of sets and other '#' initialized objects.

- Make StringBuilder mutable and behave like in Java or JavaScript, not
  returning a new StringBuilder on -append!. This makes code more easily
  portable.

- Date type still missing as well as port of macro system removed for
  now.
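
The mutable StringBuilder behaviour mentioned in the list above (append modifies the builder in place instead of returning a new one) can be sketched in plain C. The `sb_t` type and `sb_append` function are hypothetical names for this illustration and do not reflect the actual cljc.core implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; data is always NUL-terminated once non-NULL. */
typedef struct {
    char *data;
    size_t len;
    size_t cap;
} sb_t;

static void sb_init(sb_t *sb) {
    sb->data = NULL;
    sb->len = 0;
    sb->cap = 0;
}

/* Appends `s` in place, growing the buffer geometrically when needed. */
static void sb_append(sb_t *sb, const char *s) {
    assert(sb != NULL && s != NULL);
    size_t n = strlen(s);
    if (sb->len + n + 1 > sb->cap) {
        size_t cap = sb->cap ? sb->cap : 16;
        while (cap < sb->len + n + 1) {
            cap *= 2;
        }
        char *grown = realloc(sb->data, cap);
        assert(grown != NULL);
        sb->data = grown;
        sb->cap = cap;
    }
    memcpy(sb->data + sb->len, s, n + 1);   /* copies the terminating NUL too */
    sb->len += n;
}
```

Callers keep using the same `sb_t` after each append, which is the calling pattern that code ported from Java or JavaScript tends to assume.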
@whilo
Author

whilo commented Sep 24, 2013

I need your help. I have ported the analyzer and have gotten along with gdb and trial and error so far, but I don't understand why initialization order is important and how I can start calling the analyze function from the metacircular REPL to continue towards eval and the compiler. Maybe you could have a look at some point at why either the reader regex constants or the vtables are broken (assert errors), depending on the initialization order in driver.c?

@whilo
Author

whilo commented Sep 24, 2013

Oh, I just saw your comment after posting.

"Symbols and keywords: Why didn't you use intern_symbol() and intern_keyword()? They make symbols and keywords unique, so they can be compared via pointer comparison. I don't know why we would want uninterned symbols or keywords."

Ok. I just don't understand that properly yet, so I will look into it again and revert to the static intern methods. Making them unique for pointer comparison sounds reasonable.

"I'm not sure what you mean by porting compiler.cljs. ClojureC has "ported" compiler.cljs to emit C instead of JavaScript."

Yes, but in fact it would not emit C sources; it would emit runtime objects directly for eval (not byte code or source code). If I am getting this totally wrong, I have to reinvestigate. I would like to have a simple Clojure REPL (without macros or anything complicated) as soon as possible, as this makes development much more interesting. I thought I would port the analyzer and then use compiler.cljs to create (emit) runtime primitives from the analyzed output.

For redefining, I guess we need to create the "namespaces" map and environment model in Clojure primitives which map to the C symbols at runtime, but can be altered at runtime as well. This is not crucial for getting some REPL (with an empty/hacky env) running, though, so I would try to focus on that first.

@schani
Owner

schani commented Sep 28, 2013

I'm not sure what you mean by "runtime primitives". ClojureC doesn't have runtime primitives for interpreted code, unless what you mean is standard closures, like SICP does in 4.1.7: http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-26.html#%_sec_4.1.7

My guess is that you can probably turn compiler.clj into something that generates those closures, but note that it's currently not even self-hosting: It requires Clojure to run, not ClojureC. Which doesn't mean that it can't be made to run on ClojureC, there's just probably quite a few parts where ClojureC is still missing stuff.
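
The idea from SICP 4.1.7, analyzing an expression once into closures and then only running those closures, can be sketched in plain C for a tiny arithmetic language. This is an illustration only: the `expr_t` and `proc_t` types are hypothetical, environments and variables are left out entirely, and nothing here uses ClojureC's actual closure representation.

```c
#include <assert.h>
#include <stdlib.h>

/* Source expression: a number literal or a binary arithmetic operation. */
typedef enum { EXPR_NUM, EXPR_ADD, EXPR_MUL } expr_kind_t;

typedef struct expr {
    expr_kind_t kind;
    long value;                    /* used by EXPR_NUM */
    const struct expr *lhs, *rhs;  /* used by EXPR_ADD and EXPR_MUL */
} expr_t;

/* Result of analysis: a code pointer plus the data it closes over. */
typedef struct proc {
    long (*run)(const struct proc *self);
    long value;
    struct proc *lhs, *rhs;
} proc_t;

static long run_num(const proc_t *p) { return p->value; }
static long run_add(const proc_t *p) { return p->lhs->run(p->lhs) + p->rhs->run(p->rhs); }
static long run_mul(const proc_t *p) { return p->lhs->run(p->lhs) * p->rhs->run(p->rhs); }

/* Walks the expression once. The dispatch on expression kind happens here,
   so running the result later involves no further syntax inspection. */
static proc_t *analyze(const expr_t *e) {
    assert(e != NULL);
    proc_t *p = calloc(1, sizeof *p);
    assert(p != NULL);
    switch (e->kind) {
    case EXPR_NUM:
        p->run = run_num;
        p->value = e->value;
        break;
    case EXPR_ADD:
        p->run = run_add;
        p->lhs = analyze(e->lhs);
        p->rhs = analyze(e->rhs);
        break;
    case EXPR_MUL:
        p->run = run_mul;
        p->lhs = analyze(e->lhs);
        p->rhs = analyze(e->rhs);
        break;
    }
    return p;
}

/* Frees an analyzed tree. */
static void proc_free(proc_t *p) {
    if (p == NULL) {
        return;
    }
    proc_free(p->lhs);
    proc_free(p->rhs);
    free(p);
}
```

Analyzing the tree for `(+ (* 4 10) 2)` yields a `proc_t` that can be run any number of times without re-examining the syntax, which is the property that makes this approach attractive for a REPL's eval.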

I'll take a look at the initialization order issue.

@schani
Owner

schani commented Sep 29, 2013

I fixed the initialization order bug and wrote a build script for your interpreter: https://github.com/schani/clojurec/tree/ghubber-no-analyzer

How did you make the reader build without the changes I made (referring to cljc.string by full name)? That's a bug I haven't fixed yet.

…. Analyzing of

data primitives seems to work.

Merge branch 'ghubber-no-analyzer' of https://github.com/schani/clojurec

Conflicts:
	src/cljc/cljc/reader.cljc
@whilo
Author

whilo commented Sep 30, 2013

Thanks! I have also compiled cljc.string from core.cljc separately with Leiningen, then built the resulting C file and linked it in. Maybe that helped, I am not sure. I have also changed the order in driver.c manually and added the init routines for cljc.string and cljc.reader. I just wanted to see how far I could go, or whether the C environment and code base would get me stuck before a first version was running. The last time I hacked on KDE C++ libs with multithreading, years ago, I met my mental limits XD (and later again in Java with multithreading). This is one reason why I came to Clojure. But I see value in a natively integrated VM, similar to https://github.com/halgari/clojure-metal
I also think that the FOSS GNU/Linux environment (which I am using) is a comparatively interesting C guest for a VM and an important "frontier" for Clojure. If ClojureC gets support for calling out to C libraries at some point, this could be used directly from the VM/REPL as well.
I have just tried your patch and build script and added the analyzer. Analyzing seems to work for data primitives.

My plan was to slowly port analyzer.cljs and compiler.cljs from https://github.com/kanaka/clojurescript with the help of the static ClojureC environment, using it more as a vehicle at first. (This compiler is completely separate from compiler.clj, the Clojure compiler of ClojureC, and needs a new, non-conflicting name.) I would like to share functions with ClojureC, like the reader and core.cljc. Initially I would try to use ClojureC's closures to implement a basic REPL, as you mention. After that I would try to allow compiler.cljs/c (not compiler.clj) to emit runtime versions of protocols, etc. and integrate it with ClojureC. If this is not possible, I need to port more and will probably need some feedback on the integration of C primitives. For now I thought it was enough to make the analyze function accessible from the metacircular REPL, to see what is being emitted from analyze.cljc. In the end, if something usable ever comes out of it, the compilers should converge. Self-hosted ClojureScript is waiting for feature expressions in Clojure 1.6, it seems. With that and a self-hosted compiler, I can imagine that ClojureScript and the different implementations of runtime primitives could converge again. But I am still a n00b in runtime design and Clojure. Most important for me is to have an ideal hackable environment to play around with the runtime. This is the only use of a Clojure VM implementation I can see atm., as it allows leveraging the immutability properties of Clojure to explore new optimizations in JIT design (memoization) from inside the REPL, and therefore makes environment properties and optimizations adjustable at runtime (imagine e.g. core.logic having specialized JIT routines, as a very far-off perspective).
It also allows linking in LLVM, as halgari did with clojure-metal, or building other static primitives with C and ClojureC to model a stack machine, etc., even in parallel to an unoptimized statically compiled runtime.
For shell hacking on *nix, kanaka's self-hosted ClojureScript compiler on node.js is probably better as long as ClojureC has no tight integration with *nix and no serious optimizations (V8 is way above what one can achieve atm.).
Maybe we can talk at some point (no hurry). You don't happen to go to EuroClojure, do you?

Make cljc.string/index-of function return -1 instead of nil for
"not-found" similar to Java and JavaScript. Analyzing of quoted
expressions seems to work basically.
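
The -1 convention for "not found" can be sketched in plain C as follows. The function name `index_of` mirrors the commit message, but this is only an illustration built on the standard `strstr`, not the cljc.string implementation.

```c
#include <assert.h>
#include <string.h>

/* Returns the byte offset of the first occurrence of `needle` in `haystack`,
   or -1 if `needle` does not occur. */
static long index_of(const char *haystack, const char *needle) {
    assert(haystack != NULL && needle != NULL);
    const char *hit = strstr(haystack, needle);
    return hit ? (long)(hit - haystack) : -1;
}
```

Code ported from Java or JavaScript often tests the result with a numeric comparison such as `>= 0`, so returning a number in every case lets such code run unchanged, whereas a nil result would need a separate check.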
@whilo
Author

whilo commented Oct 4, 2013

I have read about https://github.com/Bronsa/CinC/ on planet.clojure and then on the clojure google list. Instead of getting the REPL working first I probably should approach the clojure compiler hackers there, although I feel like an amateur still.

I guess I should post there and ask for feedback. I will ping you here again with a link, once I have done so.

@whilo
Author

whilo commented Oct 7, 2013

@whilo
Author

whilo commented Dec 3, 2013

I have to take more time to better understand CinC and continue studying SICP. Are you interested in merging the metacircular evaluator as an example, the reader and (some of) the core changes?
I will remove analyzer.cljc, as it is not used atm. except as a playground. If you need any further changes or adjustments, just drop me a note here.
