Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] added script for Literate.jl #59

Merged
merged 12 commits into from Oct 24, 2018
Merged

Conversation

DhairyaLGandhi
Copy link
Member

WIP script for converting annotated scripts with their notebook counterparts using Literate.jl.

@MikeInnes
Copy link
Member

Ok so a few things we should do to make this easily scalable:

  • There should be an explicit listing of .jl files to convert (toml file?), since many of the things in this repo are supporting files. Rather than walking the directory we can just convert those alone
  • We should preserve the model zoo's existing folder structure in output, so that we can copy supporting files (especially project and manifest files)
  • Part of the listing should include any support jl or data files to be copied over without conversion. The manifest and project files are copied over by default if they exist.

So for example, if we start with model-zoo/text/lang-detection/{model.jl,scrape.jl,Project.toml,...} we want to end up with model-zoo/notebooks/text/lang-detection/{model.ipynb,scrape.jl,Project.toml,...}.

A couple of smaller things: this script can go in a scripts folder with its own project and manifest file for dependencies. Also, we will want all notebooks to have a header that activates the current environment; not sure if Literate.jl provides something for that, but we should figure out how best to do it.

@DhairyaLGandhi
Copy link
Member Author

Example Conf.toml might look like this:

[[Project]]
	name = "mnist"
	root = "vision/mnist"

	[[Project.conf]]

		name = "mnist with MLP"
		script = "mlp.jl" # "name_of_file.jl"
		deps = ["vae.jl"] # say

	[[Project.conf]]

		script = "conv.jl"
		deps = ["mlp.jl"]

[[Project]]

	name = "char-rnn"
	root = "text/char-rnn"

	[[Project.conf]]
		script = "char-rnn.jl"
		deps = []

@MikeInnes
Copy link
Member

Can we just commit that (maybe as Notebooks.toml) into the script folder? The whole thing may as well be self-contained.

How about

[MNIST]
root="vision/mnist"
notebook = ["mlp.jl", "conv.jl"]
deps = ["vae.jl", "input.txt"]

or is there a particular reason for splitting the mnist config like that?

@DhairyaLGandhi
Copy link
Member Author

That was done to be explicit about the config for a particular script that is converted.

Otherwise mapping it like you suggested is a good idea. We would need to add a dict to map the scripts to the final names.

@MikeInnes
Copy link
Member

MikeInnes commented Oct 9, 2018

I realised it would potentially be problematic to activate and execute multiple notebooks in the same process (which could end up with conflicting versions). So I had a quick go at a version that can be run easily as a subprocess, like julia convert.jl MNIST.

Something we'll also have to figure out – fredrikekre/Literate.jl#37

@DhairyaLGandhi
Copy link
Member Author

This adds a notebook.jl script which reads the TOML file and starts subprocesses for all the projects using the convert.jl script. I also have the convert.jl script remove any files that the notebooks generate during execution.

notebook = "mlp.jl"
deps = ["vae.jl"]

[CHAR-RNN]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should just lower case these so they're easier to type (mnist was only all caps because it's an acronym anyway).

@MikeInnes
Copy link
Member

Great. I think the last thing is to add a quick preprocessing step so that the mnist notebook looks good. Firstly we should add an extra # in using CuArrays so that it shows up as commented code in the notebook. Secondly we should add the ] activate .; instantiate line to the top of the script.

script/convert.jl Outdated Show resolved Hide resolved
@MikeInnes
Copy link
Member

Are we using the convert_to_nb script now, or should we delete that file? Otherwise I think this is good to go.

@DhairyaLGandhi
Copy link
Member Author

Removed the unused file.
Thanks for all the help!

@MikeInnes
Copy link
Member

Great work @dhairyagandhi96!

@MikeInnes MikeInnes merged commit fa85dd2 into FluxML:master Oct 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants