New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bootstrap code loader #29

Merged
merged 28 commits into from Jun 14, 2017

Conversation

Projects
None yet
3 participants
@arrdem
Member

arrdem commented Jun 3, 2017

This PR is gonna take a while, just because it being "done" basically means that I've nailed the design for the analyzer & analysis state. This also requires having figured out the internal representation for types, macros, functions and a whole lot of fun things.

I think that the way this is gonna work out is that Ox will have to be a two-pass compiler
system. From a simple data availability & phasing perspective, it's not possible to support macros in an load-order insensitive language, and load-order sensitivity is one of the things I was hoping to kill in this project by doing whole-program compiles.

The only way I see to unify a language which features global type inference with a language
featuring macros is to first provide a load-order sensitive interpretive phase, during which time types are dynamically checked (but still present). This makes macro expansion possible in the interpretive phase. Once all code has been loaded and all macros have been expanded, then it is possible to do static type checking, code generation & optimization from the interpretive environment state.

The implication for the design of Ox's internal state is that it must provide abstract rather than concrete bindings, as for the most part concrete bindings won't be established yet.

Typing Haskell In Haskell is deeply influential in this design.

@arrdem

This comment has been minimized.

Show comment
Hide comment
@arrdem

arrdem Jun 4, 2017

Member

Twitter thread of rubber-ducking with @lojikil https://twitter.com/arrdem/status/871098267727151104

TL;DR the way to move forwards here seems to be by using phasing and delayed evaluation, combined with imposing the restriction that all top level forms establish bindings. If we restrict the grammar of top level forms to

(ns foo
  ; require is restricted to as, and bare requirements.
  ; refer and rename could be supported, but qualified imports are largely better.
  (require (as bar b))
  (require (referring bar [+]))
  (require (renaming bar {div /})))

; scheme style syntax-rules & define-syntax are nice.
; clojure-style macro functions would also work.
; The essential property is that a module level binding is established either way.
(define-syntax <name> body)

; defs are used to name types, functions & arbitrary expressions.
; Again, a module level binding is established this way.
(def <name> &optional docstring body)

; providing allows macro uses to occur at the top level.
; Note that at present, macros cannot occur at the top level because it's not possible
; to statically determine what symbol(s) are provided by a given form (un-computed
; macroexpansion). The providing form solves this problem by allowing "top-level"
; macros to emit definitions only in a context where the set of defs to be emitted is
; explicitly associated with the form which will emit them.
(providing [& names] & body)

This design allows a trivial lex-time static analyzer to compute the full set of defs in the program WITHOUT having to actually perform macroexpansion. After the initial lex & analysis has been performed, then definitions can be lazily macroexpanded as required since the set of definitions is already fully determined. This design also has the advantage of fully supporting mutually recursive modules AND load-order independence.

Member

arrdem commented Jun 4, 2017

Twitter thread of rubber-ducking with @lojikil https://twitter.com/arrdem/status/871098267727151104

TL;DR the way to move forwards here seems to be by using phasing and delayed evaluation, combined with imposing the restriction that all top level forms establish bindings. If we restrict the grammar of top level forms to

(ns foo
  ; require is restricted to as, and bare requirements.
  ; refer and rename could be supported, but qualified imports are largely better.
  (require (as bar b))
  (require (referring bar [+]))
  (require (renaming bar {div /})))

; scheme style syntax-rules & define-syntax are nice.
; clojure-style macro functions would also work.
; The essential property is that a module level binding is established either way.
(define-syntax <name> body)

; defs are used to name types, functions & arbitrary expressions.
; Again, a module level binding is established this way.
(def <name> &optional docstring body)

; providing allows macro uses to occur at the top level.
; Note that at present, macros cannot occur at the top level because it's not possible
; to statically determine what symbol(s) are provided by a given form (un-computed
; macroexpansion). The providing form solves this problem by allowing "top-level"
; macros to emit definitions only in a context where the set of defs to be emitted is
; explicitly associated with the form which will emit them.
(providing [& names] & body)

This design allows a trivial lex-time static analyzer to compute the full set of defs in the program WITHOUT having to actually perform macroexpansion. After the initial lex & analysis has been performed, then definitions can be lazily macroexpanded as required since the set of definitions is already fully determined. This design also has the advantage of fully supporting mutually recursive modules AND load-order independence.

@lojikil

This comment has been minimized.

Show comment
Hide comment
@lojikil

lojikil Jun 4, 2017

Love it (of course), and interested to see where you take it for sure (also, now watching this, lol dunno why I wasn't before).

lojikil commented Jun 4, 2017

Love it (of course), and interested to see where you take it for sure (also, now watching this, lol dunno why I wasn't before).

@arrdem

This comment has been minimized.

Show comment
Hide comment
@arrdem

arrdem Jun 4, 2017

Member

This isn't quite legal Haskell but it's pretty close to what I'm playing with once you allow for the PackageLoader class to be an interface instance serving to map PackageIdentifiers to resources and load them, returning a realized Package.

-- Ox.hs

data Form = Form
  { expr     :: LispVal
  , provides :: [Atom]
  } deriving (Show, Eq)

data Either3 a b c = A a
                   | B b
                   | C c

data FormId = FormId Integer

data Namespace = Namespace
  { name        :: Atom
  , forms       :: (Map FormId Form)
  , aliases     :: (Map Atom Atom) -- (require (as a b))
  , definitions :: (Map Atom (Either FormId
                                     Definition))
  } deriving (Show, Eq)

data Definiton = Alias Atom -- (require (referring a [b c]))
               | Data Any -- FIXME: Type definition
               | Function { documentation :: String
                          , macrop        :: Bool
                          , value         :: Any
                          }

data PackageIdentifier v = PackageIentifier Atom v

data Package v = Package
  { id           :: PackageIdentifier
  , namespaces   :: (Map Atom Namespace)
  , dependencies :: (Set PackageIdentifier)
  }

data VM = VM
  { loader   :: PackageLoader
  , packages :: (Map PackageIdentifier Package)
  }
Member

arrdem commented Jun 4, 2017

This isn't quite legal Haskell but it's pretty close to what I'm playing with once you allow for the PackageLoader class to be an interface instance serving to map PackageIdentifiers to resources and load them, returning a realized Package.

-- Ox.hs

data Form = Form
  { expr     :: LispVal
  , provides :: [Atom]
  } deriving (Show, Eq)

data Either3 a b c = A a
                   | B b
                   | C c

data FormId = FormId Integer

data Namespace = Namespace
  { name        :: Atom
  , forms       :: (Map FormId Form)
  , aliases     :: (Map Atom Atom) -- (require (as a b))
  , definitions :: (Map Atom (Either FormId
                                     Definition))
  } deriving (Show, Eq)

data Definiton = Alias Atom -- (require (referring a [b c]))
               | Data Any -- FIXME: Type definition
               | Function { documentation :: String
                          , macrop        :: Bool
                          , value         :: Any
                          }

data PackageIdentifier v = PackageIentifier Atom v

data Package v = Package
  { id           :: PackageIdentifier
  , namespaces   :: (Map Atom Namespace)
  , dependencies :: (Set PackageIdentifier)
  }

data VM = VM
  { loader   :: PackageLoader
  , packages :: (Map PackageIdentifier Package)
  }

arrdem added some commits Jun 4, 2017

@vendethiel

This comment has been minimized.

Show comment
Hide comment
@vendethiel

vendethiel Jun 4, 2017

AFAIK, "delay static type checking after macros are expanded" is hiw Nim does it, but I don't know how it works internally

vendethiel commented Jun 4, 2017

AFAIK, "delay static type checking after macros are expanded" is hiw Nim does it, but I don't know how it works internally

@arrdem

This comment has been minimized.

Show comment
Hide comment
@arrdem

arrdem Jun 5, 2017

Member

2017-06-05 01 03 48
2017-06-05 01 03 15

Today's notes... trying to hammer out how to phase the loading of packages given that packages need to be loaded namespace-at-a-time (files / readable resources are the loading unit) while packages and namespaces in packages may be mutually referencing/recursive and I'd like to choose an architecture which preserves a maximum of parallelism at least if possible.

Either packages need to, like namespaces, have explicit provides semantics to solve this problem or packages kinda need to go away and I just drop the whole multiple simultaneous versions goal of the project. Maybe there's a third road...

Member

arrdem commented Jun 5, 2017

2017-06-05 01 03 48
2017-06-05 01 03 15

Today's notes... trying to hammer out how to phase the loading of packages given that packages need to be loaded namespace-at-a-time (files / readable resources are the loading unit) while packages and namespaces in packages may be mutually referencing/recursive and I'd like to choose an architecture which preserves a maximum of parallelism at least if possible.

Either packages need to, like namespaces, have explicit provides semantics to solve this problem or packages kinda need to go away and I just drop the whole multiple simultaneous versions goal of the project. Maybe there's a third road...

@arrdem arrdem changed the title from Implement runtime & types to Bootstrap code loader Jun 9, 2017

arrdem added some commits Jun 9, 2017

@arrdem

This comment has been minimized.

Show comment
Hide comment
@arrdem

arrdem Jun 14, 2017

Member

SHIP AND ITERATE.

Member

arrdem commented Jun 14, 2017

SHIP AND ITERATE.

@arrdem arrdem merged commit 9f603e8 into develop Jun 14, 2017

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@arrdem arrdem deleted the runtime branch Jun 14, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment