Way to split code into multiple files #40

oxinabox · 2019-12-19T23:54:24Z

Per brief discussion at NeurIPS.
Sometimes input is not really IO,
in that it is deterministic, e.g. the loading of data.
It is then just code in another format, in another file.
This kind of input is thus actually more closely linked to metaprogramming than normal IO.

But right now, AFAIK, we can't actually load Dex code from another file.

I propose the addition of :include file/path.dx
as a command allowed to occur at top level.

Then I will do some metaprogramming in some other language (obs. Julia), in order to generate some Dex code that contains my data.
Which I will :include at the top of my script.

I don't want to put it directly in my script as it's probably going to be thousands of lines long.
Also I might want to regenerate it.
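
A minimal sketch of the generation step described above, written in Python rather than Julia. The helper names (`dex_literal`, `write_data_file`) and the variable name `trainImages` are hypothetical, and the `name = [...]` form is only an assumption about what a Dex table literal looks like; adjust it to whatever the parser accepts.

```python
from pathlib import Path


def dex_literal(value):
    """Render nested Python lists of numbers as a bracketed literal string."""
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(dex_literal(v) for v in value) + "]"
    # Always emit a float so that every element has the same textual form.
    return repr(float(value))


def write_data_file(path, name, data):
    """Write `name = <literal>` to `path`, overwriting any earlier version."""
    # ASSUMPTION: `name = [...]` is valid top-level syntax in the target file.
    Path(path).write_text(f"{name} = {dex_literal(data)}\n")


rows = [[0.0, 0.5], [1.0, 0.25]]
generated = f"trainImages = {dex_literal(rows)}\n"
```

Because the output is a separate file, regenerating the data only means rerunning the script; the main script that includes it stays untouched.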

A possible generalisation of this would be includeby::(String->AST)->Path->Nothing
Which would take in a function to do the metaprogramming.
But I don't think we are anywhere near there yet?


dougalm · 2020-01-01T19:24:56Z

Yes, we definitely want an import system. I think I'll just follow Haskell here, starting with an unqualified import Foo that gives you access to the whole top-level namespace of foo.dx.

> Then I will do some metaprogramming in some other language (obs. Julia), in order to generate some Dex code that contains my data.

That's a good place to start for now, but I think we'd ideally have a dedicated serialization format (or two) rather than actually executing Dex code containing mostly literals. You'd still load it as if you were importing a module, but our implementation would just need to parse it and put the data in memory, rather than compile and execute it. I'm imagining two formats: textual and binary. The textual one might be a subset of Dex syntax (like JSON is a subset of javascript). The binary one could be close to Dex's internal runtime data structures, so that it could just be memmapped.
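
To illustrate the "parse it, don't execute it" idea for a textual format, here is a rough Python sketch. It assumes a hypothetical file layout of one `name = <literal>` per line, where each literal happens to be JSON-compatible; this is not Dex's actual format, and `load_literal_file` is an invented name.

```python
import json


def load_literal_file(text):
    """Parse lines of the form `name = <JSON-compatible literal>` into a dict.

    Nothing is compiled or executed: each right-hand side goes straight to a
    JSON parser, so expressions such as function calls are rejected.
    """
    env = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, sep, rhs = line.partition("=")
        if not sep or not name.strip().isidentifier():
            raise ValueError(f"line {lineno}: expected `name = literal`")
        try:
            env[name.strip()] = json.loads(rhs)
        except json.JSONDecodeError as err:
            raise ValueError(f"line {lineno}: not a literal: {err}") from None
    return env


data = load_literal_file("xs = [1.0, 2.5, 3.0]\nys = [[1, 2], [3, 4]]\n")
```

The point of restricting the grammar is that the loader can put values in memory directly, without going through type inference or code generation.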

oxinabox · 2020-01-01T20:07:59Z

Some time this week, I intend to write a few slides explaining that separating files from namespaces, and import (accessing things from a namespace) from include (near direct text transfer), is (against common wisdom) a good thing.
Title of that part of the talk: "namespaces are overrated, let's have less of them".
Which is not the whole argument, at all.
But I am yet to write it.
Other key points include:

- Making it easy to create and use local packages, as opposed to full packages managed by a package manager, means people don't take that jump, and thus neither release things nor benefit from dependency management.
- Overly long files, or overly empty namespaces.

The short version is: I think we should have an include right now, not an import.

I think a binary format would be good.
When I had a mere 200 examples from fashionMNIST as text constants, it was taking ages to reload in web.
Though an include would help there, as it can avoid reparsing to check for changes, I guess.

dougalm · 2020-01-03T01:36:37Z

Implementing include is also quite a bit simpler than import, and we can use ordinary unix file paths (foo/bar.dx) rather than inventing our own name resolution mechanisms (Foo.Bar). You've convinced me! 3087694 still has a few rough edges but it's a start.

This still isn't a good way to load data, since compiling huge literals will be very slow (several LLVM instructions per scalar). I'll keep this issue open until we have a dedicated text or binary serialization solution.

oxinabox · 2020-01-07T10:57:41Z

A change also needs to be made to WebOutput.hs,
so that it knows it needs to watch out for changes in any include'd files.
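
A rough Python sketch of the bookkeeping this implies (not the actual WebOutput.hs logic): collect every file reachable through include lines, then compare modification times. It assumes the `:include path` syntax proposed at the top of this thread and that paths resolve relative to the including file; both are assumptions, and all function names are hypothetical.

```python
import os
import re

# ASSUMPTION: includes look like `:include some/path.dx`, one per line.
INCLUDE_RE = re.compile(r"^:include\s+(\S+)\s*$")


def watched_files(entry):
    """Return the entry file plus every file reachable via include lines."""
    seen = set()
    stack = [os.path.abspath(entry)]
    while stack:
        path = stack.pop()
        if path in seen:
            continue  # already visited; also guards against include cycles
        seen.add(path)
        base = os.path.dirname(path)
        with open(path) as f:
            for line in f:
                m = INCLUDE_RE.match(line)
                if m:
                    # ASSUMPTION: paths are relative to the including file.
                    stack.append(os.path.abspath(os.path.join(base, m.group(1))))
    return seen


def snapshot(paths):
    """Map each path to its modification time in nanoseconds."""
    return {p: os.stat(p).st_mtime_ns for p in paths}


def changed(old, new):
    """Paths whose mtime differs between snapshots, or that dropped out."""
    return {p for p in new if old.get(p) != new[p]} | (old.keys() - new.keys())
```

A watcher would take a snapshot after each successful evaluation and re-evaluate whenever `changed` returns a non-empty set.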

This means we can efficiently(ish) load "dex object" files and write them to memory directly rather than compiling them as huge LLVM literals (see #40). It's still not very fast: loading a length-100k vector of integers takes 4 seconds. It doesn't hit LLVM, but it still uses the general program parser and type inference. We can probably make it faster with literal-specific ones.

dougalm · 2020-01-10T18:58:01Z

I just made a little binary data format (0c7f0fa to 70742cb). It's a mmappable format, so it may also help us build a zero-copy FFI. There's a Python module to dump NumPy arrays into it. I'd welcome a Julia one ;).
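
For readers unfamiliar with mmappable formats, here is a small stdlib-only Python sketch of the general idea. The layout (an 8-byte tag, a 64-bit element count, then raw float64 values in native byte order) is invented for illustration and is not the format from the commits above; `dump_f64` and `load_f64` are hypothetical names.

```python
import array
import mmap
import struct

MAGIC = b"SKETCH01"             # hypothetical tag, not Dex's real header
HEADER = struct.Struct("=8sQ")  # tag + element count = 16 bytes, so the
                                # payload starts on an 8-byte boundary


def dump_f64(path, values):
    """Write a header followed by the values as raw float64s."""
    payload = array.array("d", values)
    with open(path, "wb") as f:
        f.write(HEADER.pack(MAGIC, len(payload)))
        payload.tofile(f)  # native byte order, no per-element encoding


def load_f64(path):
    """Map the file and return (mapping, view) without copying the payload."""
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    magic, count = HEADER.unpack_from(mm, 0)
    if magic != MAGIC:
        raise ValueError("not a sketch-format file")
    start = HEADER.size
    # Reinterpret the mapped bytes as float64s in place.
    view = memoryview(mm)[start:start + 8 * count].cast("d")
    return mm, view
```

Loading costs roughly one header read regardless of array size, which is the property that makes this kind of format attractive both for data loading and for a zero-copy FFI.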

oxinabox · 2020-01-10T20:03:40Z

> I'd welcome a Julia one ;).

Seems fun.

dougalm · 2020-02-26T20:41:43Z

I think load and include are good enough for now. Closing.

oxinabox mentioned this issue Dec 28, 2019

WIP: neural network demo #47

Closed

dougalm added a commit that referenced this issue Jan 3, 2020

Add an include directive, starting to address #40.

3087694

oxinabox changed the title ~~Way to split code I to multiple files~~ Way to split code into multiple files Jan 7, 2020

dougalm closed this as completed Feb 26, 2020

oxinabox mentioned this issue Jun 20, 2020

include seems not to be bringing things into scope #107

Closed
