Skip to content

Commit

Permalink
first chapter
Browse files Browse the repository at this point in the history
  • Loading branch information
Ducasse committed Jun 11, 2011
1 parent 1ccf086 commit b2f1e3f
Showing 1 changed file with 97 additions and 1 deletion.
98 changes: 97 additions & 1 deletion VMIntro/VMIntro.tex
Expand Up @@ -31,8 +31,104 @@
%=================================================================
%\renewcommand{\nnbb}[2]{#2} % Disable editorial comments

\chapter{Traits at Work}
\chapter{Getting Started}

My main goals of this ÒJourney through the VMÓ are that the reader learns about the VM and to spread the VM knowledge, mostly for people who doesnÕt know anything about VM. For these same reasons months ago IÕve created a special mailing list for beginners. Please, ask everything you donÕt understand, correct me when I say something wrong, give your opinion, etc. There are always people ready to help in the mailing lists. So, letÕs start.

\section{Intro to VM}
As you may know, there are different Smalltalk dialects and each of them has its own VM. In this Journey through the VM, we will use the PharoVM. I hope there are newcomers reading this blog, so I will explain for them: Pharo, was a fork of another Smalltalk dialect called Squeak. However, both of them still uses the same VM. To avoid confusions, I will try to just say VM, but know that it is the Pharo/Squeak VM.

As far as I am aware of, most, if not all, Smalltalk VMs are implemented in C (or maybe parts in C++). Fortunately for us, happy smalltalkers, the Pharo/Squeak VM consist of two parts of code: a) VMMaker (written in Smalltalk) ; b) Òplatform codeÓ (C/Smalltalk-C hand-written code)

\subsection{VMMaker}
VMMaker is a normal repository in squeaksource, but its name is very misleading for me. One would imagine that it is a tool for building the VM, but the truth is that it is much more than that. In this VMMaker repository, there are different interesting packages but for the moment, we will concentrate in the most important one: the package VMMaker itself:

It is where the Smalltalk part of the VM is contained. I consider this part of the VM as the ÒcoreÓ, and it is written in Smalltalk. In fact, it is not really in Smalltalk but instead in a subset (limited) of Smalltalk called SLANG, which is then translated to C. So, even if this is much happier thank coding in C, you have to be aware that writing in SLANG has some limitations.

VMMaker has two very important classes: Interpreter and ObjectMemory. As it says in Interpreter class comment, they both together represents a complete implementation of the Smalltalk-80 virtual machine, derived originally from the Blue Book specification but quite different in some areas.

As you may know, every time to save a method in Smalltalk, the Compiler takes such source code, it analyzes, validates, parses, and finally compiles it, generating a CompiledMethod instance. In this object, all the literals and bytecodes are encoded. So, one of the main tasks of the VM is to interpret (thatÕs why its name) and execute Smalltalk bytecodes that reads from the image. In addition, Interpreter implements some VM responsibilities like methods lookup, cache and execution, primitives, arithmetic operations, etc.

ObjectMemory describes the object memory of the VM. Its responsibility includes allocating/deallocating memory, Garbage Collector and memory compaction, it defines the structure and flags of the object header, object enumeration, etc.


\subsection{Platform code}

Fortunately, the VM is decoupled from the issue of the supporting platform specific stuff, hence the Òplatform codeÓ. It contains those parts that depends on the OS like sockets, display and files, where performance is needed, etc. As I told you, I understand the VMMaker part as the ÒcoreÓ. But there are a lot of other features that the VM should provide and they are not part of the ÒcoreÓ but instead they are written like plugins. There several plugins like FilePlugin, SocketPlugin, SoundPlugin, etc. We are not going further now with the plugins since there will be a post later for that (you will be even able to write you own plugin!!!). What is important to understand now is that the platform has code both things: for the VM ÒcoreÓ part, and for plugins.

Another good reason to have the Òplatform codeÓ is to increase VM portabilty. Suppose you need to run the VM on a particular hardware/OS that supports C. In this case, you will probably need to change a little or even anything of the VM core (VMMaker). What you will need is to create a specific platform code for that hardware/OS. I think this is one of the reasons why you see Squeak VM running everywhere.

Most of the times that you want to hack, play or do research (I'm not sure that there is a difference anyway between these verbs) with the VM, you usually modify the VMMaker part. It is not likely that you will need to modify something in the platform code.

\section{How the VM is built?}
So...we have two parts of the VM: VMMaker (written in SLANG) and the platform code written in C (for the new MacOS platform there are parts written in Objective-C in order to talk with the cocoa library). There are two steps:

\begin{enumerate}
\item Translate VMMaker VM entities (ObjectMemory, Interpreter, the plugins, etc) to C. This is done by the classes that are in the category 'VMMaker-Translation to C'.

\item Compile the C source code generated from the previous step, together with the platform code.
We will not go further on this topic now because this is explained in the next chapter.
\end{enumerate}

\section{The VM is written is Smalltalk? really? is that cool?}

Ok, this can be controversial, so as always, this is just my point of view. I love Smalltalk. I love browsing for senders, implementors, references, doing refactorings, browsing methods and classes, etc. If you are reading this blog, you probably understand my feelings.
There are people who love C. Ok, I donÕt. IÕve even contributed (with code and fixes) to the OpenDBX library (written in C), so I can do it. But... do you know how cool is to be able to read and modify the VM from the same image that you usually use ? SLANG is quite limited, and sometimes it looks like C, of course, but it is still much better than coding in C for me:

\begin{enumerate}
\item You use the SAME Smalltalk image you use for other purposes

\item You have all the tools provided by the Smalltalk IDE: senders, implementors, refactorings, monticello, versions, etc. You browse Interpreter and ObjectMemory classes like if they were regular classes (indeed they are!).

\item Most of the cases, you donÕt need to deal with C (sometimes you do need).

\item It can be versioned in Monticello just like any other project.

\item This is cool: you take you image, you modify something in VMMaker, generate sources, go to a command line and using gcc (and the platform code) compiles the vm. Then, you take the compiled VM and run the same image you used for generating it hhehehe. IsnÕt that cool?

\item Finally, and this is the most important for several persons, the VM can be simulated. Since it is Smalltalk, you can run a InterpreterSimulator, inside a host image/VM. But we will talk about this later in the journey.
\end{enumerate}

\section{Cog VM and current status}

Some time ago, Eliot Miranda (thanks to Teleplace company) started to work on a new VM for Squeak (and all its fork, like Pharo). This VM is called Cog VM and it has been already released. In fact, is the default VM included in the PharoOneClick 1.1.1 and beyond. Cog VM includes a lot of new features, but in a glance:

\begin{enumerate}
\item Real and optimized block closure implementation. This is why from the image side blocks are now instances of BlockClosure instead of BlockContext.

\item Context-to-stack mapping.
\item JIT (just in time compiler) that compiles methods to machine code.
\item PIC (polymorphic inline caching).
\item Multi-threading.
\end{enumerate}

If you are a little around Smalltalk you may have heard about Cog VM and Stack VM. What is the big difference? Stack VM implements 1) and 2). And Cog VM is on top of the Stack VM and adds 3) and 4). Finally, there is CogMT VM which is on top of Cog VM and adds multi-threading support for external calls (like FFI for example). Now, to be able to put names to all these VMs and not get confused, we will call our old VM Interpreter VM.

Cog VMs have increased performance around 4x-10x. In addition, it has refactored a couple of things. For example, in Interpreter VM, the Interpreter was a subclass of ObjectMemory. But that was necessary in order to easily translate to C. In Cog, there are new classes like CoInterpreter and NewObjectMemory. But the good news is that we can have composition!! The CoInterpreter has an instance variable that is the object memory (in this case an instance of NewObjectMemory). This is awesome for us, and of course, it required changes in the SLANG to C translator. So as you can see, Cog is really important for the Squeak and Pharo community.

\section{Links}
One of the problems I found when trying to learn the VM is that there is little documentation and even more, it is not very known. This is why in every post I will try to put the links I know to the related topics. Keep and share them. They may be useful if you want to go further in the VM or in certain areas. Notice that it is probable that some information in there is outdated, but thatÕs better than nothing

\begin{itemize}
\item Smalltalk-80: The Language and its Implementation. Also knowns as Blue Book or THE book. You can download it from \url{http://stephane.ducasse.free.fr/FreeBooks.html}.

\item Chapter in the Pharo collaborative book. http://book.pharo-project.org/book/Virtual-Machine/
\item Official website of the Squeak VM. \url{http://www.squeakvm.org/index.html}

\item A Tour of the Squeak Object Engine. This is a chapter in the book 'Squeak, Open Personal Computing and Multimedia'. \url{http://stephane.ducasse.free.fr/FreeBooks/CollectiveNBlueBook/Rowledge-Final.pdf}

\item Screencasts related to the VM. \url{http://www.pharocasts.com/search/label/vm}

\item Eliot Miranda Cog Blog. \url{http://www.mirandabanda.org/cogblog/}

\item Back to the Future: The Story of Squeak, A Practical Smalltalk Written in Itself.

\item VMMaker in Squeak swiki. \url{http://wiki.squeak.org/squeak/2105}

\item The Squeak Virtual Machine in squeak swiki. \url{http://wiki.squeak.org/squeak/1447}
\end{itemize}
\section{End words}
So, that was all for today. I tried to give you a first overview of the VM. The first posts may be ÒboringÓ but they will get better If you have questions, remarks, or any kind of feedback, let me know.

%=========================================================
\ifx\wholebook\relax\else
Expand Down

0 comments on commit b2f1e3f

Please sign in to comment.