#LyX 1.3 created this file. For more info see http://www.lyx.org/
\lyxformat 221
\textclass article
\begin_preamble
\usepackage{hyperref}
\usepackage{color}
\definecolor{orange}{cmyk}{0,0.4,0.8,0.2}
\definecolor{brown}{cmyk}{0,0.75,0.75,0.35}
% Use and configure listings package for nicely formatted code
\usepackage{listings}
\lstset{
language=Python,
basicstyle=\small\ttfamily,
commentstyle=\ttfamily\color{blue},
stringstyle=\ttfamily\color{brown},
showstringspaces=false,
breaklines=true,
postbreak = \space\dots
}
\end_preamble
\language english
\inputencoding auto
\fontscheme palatino
\graphics default
\paperfontsize 11
\spacing single
\papersize Default
\paperpackage a4
\use_geometry 1
\use_amsmath 0
\use_natbib 0
\use_numerical_citations 0
\paperorientation portrait
\leftmargin 1in
\topmargin 0.9in
\rightmargin 1in
\bottommargin 0.9in
\secnumdepth 3
\tocdepth 3
\paragraph_separation skip
\defskip medskip
\quotes_language english
\quotes_times 2
\papercolumns 1
\papersides 1
\paperpagestyle default
\layout Title
Interactive Notebooks for Python
\newline
\size small
An IPython project for Google's Summer of Code 2005
\layout Author
Fernando P
\begin_inset ERT
status Collapsed
\layout Standard
\backslash
'{e}
\end_inset
rez
\begin_inset Foot
collapsed true
\layout Standard
\family typewriter
\size small
Fernando.Perez@colorado.edu
\end_inset
\layout Abstract
This project aims to develop a file format and interactive support for documents
which can combine Python code with rich text and embedded graphics.
The initial requirements are limited to editing such documents
with a normal programming editor, with final rendering to PDF or HTML
done by calling an external program.
The editing component would have to be integrated with IPython.
\layout Abstract
This document was written by the IPython developer; it is made available
to students looking for projects of interest and for inclusion in their
application.
\layout Section
Project overview
\layout Standard
Python's interactive interpreter is one of the language's most appealing
features for certain types of usage, yet the basic shell which ships with
the language is very limited.
Over the last few years, IPython
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://ipython.scipy.org}
\end_inset
\end_inset
has become the de facto standard interactive shell in the scientific computing
community, and it enjoys wide popularity with general audiences.
All the major Linux distributions (Fedora Core via Extras, SUSE, Debian)
and OS X (via fink) carry IPython, and Windows users report using it as
a viable system shell.
\layout Standard
However, IPython is currently a command-line-only application, based on
the readline library and hence limited to single-line editing.
While this is sufficient in many contexts, there are use
cases where integration in a graphical user interface (GUI) is desirable.
\layout Standard
In particular, we wish to have an interface where users can execute Python
code, input regular text (neither code nor comments) and keep inline graphics,
which we will call
\emph on
Python notebooks
\emph default
.
This kind of system is very popular in scientific computing; well-known
implementations can be found in Mathematica
\begin_inset ERT
status Collapsed
\layout Standard
\backslash
texttrademark
\end_inset
\SpecialChar ~
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.wolfram.com/products/mathematica}
\end_inset
\end_inset
and Maple
\begin_inset ERT
status Collapsed
\layout Standard
\backslash
texttrademark
\end_inset
\SpecialChar ~
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.maplesoft.com}
\end_inset
\end_inset
, among others.
However, these are proprietary (and quite expensive) systems aimed at an
audience of mathematicians, scientists and engineers.
\layout Standard
The full-blown implementation of a graphical shell supporting this kind
of work model is probably too ambitious for a summer project.
Simultaneous support for rich-text editing, embedded graphics and syntax-highlighted
code is extremely complex, and likely to require far more effort than
can be mustered by an individual developer for a short-term project.
\layout Standard
This project will thus aim to build the necessary base infrastructure to
be able to edit such documents from a plain text editor, and to render
them to suitable formats for printing or online distribution, such as HTML,
PDF or PostScript.
This model follows that for the production of LaTeX documents, which can
be edited with any text editor.
\layout Standard
Such documents would be extremely useful for many audiences beyond scientists:
one can use them to produce richly documented code, to explore a
problem with Python while keeping all relevant information in one place,
to distribute enhanced Python-based educational materials, etc.
\layout Standard
Demand for such a system exists, as evidenced by repeated requests made
to me by IPython users over the last few years.
Unfortunately, IPython is only a spare-time project for me, and I have not
had the time to devote to this, despite being convinced of its long-term
value and wide appeal.
\layout Standard
If this project is successful, the infrastructure laid out by it will be
immediately useful for Python users wishing to maintain `literate' programs
which include rich formatting.
In addition, this will open the door for the future development of graphical
shells which can render such documents in real time: this is exactly the
development model successfully followed by the LyX
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.lyx.org}
\end_inset
\end_inset
document processing system.
\layout Section
Implementation effort
\layout Subsection
Specific goals
\layout Standard
This is a brief outline of the main points of this project.
The next section provides details on all of them.
The student(s) working on the project would need to:
\layout Enumerate
Make design decisions for the internal file structure to enable valid Python
notebooks.
\layout Enumerate
Implement the rendering library, capable of processing an input notebook
through reST or LaTeX and producing HTML or PDF output, as well as exporting
a `pure-code' Python file stripped of all markup calls.
\layout Enumerate
Study existing programming editor widgets to find the most suitable one
for extending with an IPython connector for interactive execution of the
notebooks.
\layout Subsection
Complexity level
\layout Standard
This project is relatively complicated.
While I will gladly assist the student with design and implementation issues,
it will require a fair amount of thinking in terms of overall library architecture.
The actual implementation does not require any sophisticated concepts,
but rather a reasonably broad knowledge of a wide set of topics (markup,
interaction with external programs and libraries, namespace tricks to provide
runtime changes in the effect of the markup calls, etc.).
\layout Standard
While raw novices are welcome to try, I suspect that it may be a bit too
much for them.
Students wanting to apply should keep in mind, if the money is an important
consideration, that Google only gives the $4500 reward upon
\emph on
successful completion
\emph default
of the project.
So don't bite off more than you can chew.
Obviously, if the money doesn't matter, anyone is welcome to participate, since
the project can be a very interesting learning experience, and it will
provide a genuinely useful tool for many.
\layout Section
Technical details
\layout Subsection
The files
\layout Standard
A basic requirement of this project will be that the Python notebooks shall
be valid Python source files, typically with a
\family typewriter
.py
\family default
extension.
A renderer program can be used to process the markup calls in them and
generate output.
If run at a regular command line, these files should execute like normal
Python files.
But when run via a special rendering script, the result should be a properly
formatted file.
Output formats could be PDF or HTML depending on user-supplied options.
\layout Standard
A reST markup mode should be implemented, as reST is already widely used
in the Python community and is a very simple format to write.
The following is a sketch of what such files could look like using reST
markup:
\layout Standard
\begin_inset ERT
status Open
\layout Standard
\backslash
lstinputlisting{nbexample.py}
\end_inset
\layout Standard
A LaTeX markup mode should also be implemented.
Here is a mockup of what code using the LaTeX mode could look like:
\layout Standard
\begin_inset ERT
status Open
\layout Standard
\backslash
lstinputlisting{nbexample_latex.py}
\end_inset
\layout Standard
At this point, it must be noted that the code above is simply a sketch of
these ideas, not a finalized design.
An important part of this project will be to think about what the best
API and structure for this problem should be.
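\layout Standard
As a minimal illustration of the requirement that a notebook runs as ordinary
Python yet can also be rendered, the sketch below uses markup calls which return
immediately unless a rendering mode has been switched on.
All names in it (set_rendering, title, text, render) are hypothetical and are
not taken from the example files referenced above; the real API remains to
be designed.

```python
"""Hypothetical notebook markup API: every name here is illustrative only."""

# Fragments collected while rendering; stays empty during plain execution.
_fragments = []
_rendering = False


def set_rendering(active):
    """Switch the markup calls between recording mode and no-op mode."""
    global _rendering
    _rendering = bool(active)
    del _fragments[:]


def title(heading):
    """Record a reST title, or do nothing when the file runs as plain code."""
    if _rendering:
        _fragments.append(heading + "\n" + "=" * len(heading) + "\n")


def text(body):
    """Record a reST paragraph, or do nothing when the file runs as plain code."""
    if _rendering:
        _fragments.append(body.strip() + "\n")


def render():
    """Join the recorded fragments into a single reST document."""
    return "\n".join(_fragments)


def add(x, y):
    """Ordinary notebook code, unaffected by the markup calls around it."""
    return x + y


def notebook_body():
    """The part a user would write: code interleaved with markup calls."""
    title("Adding numbers")
    text("The function below is regular Python.")
    return add(2, 3)


# Plain execution: the markup calls return immediately and record nothing.
plain_result = notebook_body()
plain_output = render()

# Rendering: the same calls now build up a reST document.
set_rendering(True)
rendered_result = notebook_body()
rendered_output = render()
```

\layout Standard
In both modes the code itself computes the same result; only the effect of
the markup calls changes, which is the kind of runtime switch a renderer
script would need.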
\layout Subsection
From notebooks to PDF, HTML or Python
\layout Standard
Once a clean API for markup has been specified, converters will be written
to take a Python source file which uses notebook constructs, and generate
final output in printable formats, such as HTML or PDF.
For example, if
\family typewriter
nbfile.py
\family default
is a Python notebook, then
\layout LyX-Code
$ pynb --export=pdf nbfile.py
\layout Standard
should produce
\family typewriter
nbfile.pdf
\family default
, while
\layout LyX-Code
$ pynb --export=html nbfile.py
\layout Standard
would produce an HTML version.
The actual rendering will be done by calling appropriate utilities, such
as the reST toolchain or LaTeX, depending on the markup used by the file.
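\layout Standard
The command lines above could be parsed along the following lines.
This is only a sketch using the standard argparse module: the function names
and the rule for deriving the output file name are assumptions, and the actual
call to the reST or LaTeX toolchain is deliberately left out.

```python
import argparse
import os

# Export formats named in the text; the mapping to file suffixes is assumed.
SUFFIXES = {"pdf": ".pdf", "html": ".html"}


def build_parser():
    """Command-line parser for a hypothetical ``pynb`` front end."""
    parser = argparse.ArgumentParser(prog="pynb")
    parser.add_argument("--export", choices=sorted(SUFFIXES), required=True,
                        help="output format to generate")
    parser.add_argument("notebook", help="path to the notebook source file")
    return parser


def plan_export(argv):
    """Return (format, input path, output path) for the given arguments."""
    args = build_parser().parse_args(argv)
    stem, _extension = os.path.splitext(args.notebook)
    return args.export, args.notebook, stem + SUFFIXES[args.export]


# The two invocations shown in the text, expressed as argument lists.
pdf_plan = plan_export(["--export=pdf", "nbfile.py"])
html_plan = plan_export(["--export=html", "nbfile.py"])
```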
\layout Standard
Although the notebooks will be valid Python files, executing them on their
own still makes every markup call compute and return its result, which is
not needed when the file is being treated as pure code.
For this reason, a module which executes these files while turning the markup calls
into no-ops should be written.
Using Python 2.4's -m switch, one can then use something like
\layout LyX-Code
$ python -m notebook nbfile.py
\layout Standard
and the notebook file
\family typewriter
nbfile.py
\family default
will be executed without any overhead introduced by the markup (other than
making calls to functions which return immediately).
Finally, an exporter to clean code can be trivially implemented, so that:
\layout LyX-Code
$ pynb --export=python nbfile.py nbcode.py
\layout Standard
would export only the code in
\family typewriter
nbfile.py
\family default
to
\family typewriter
nbcode.py
\family default
, removing the markup completely.
This can be used to generate final production versions of large modules
implemented as notebooks, if one wants to eliminate the markup overhead.
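\layout Standard
One possible way to implement such an export to clean code is to drop the
markup calls from the parsed syntax tree.
The sketch below assumes that the markup functions are called title and text
(hypothetical names) and that they appear as bare statements.
It relies on ast.unparse, which requires Python 3.9 or later and discards
comments and the original formatting; a body consisting only of markup calls
would be left empty and would need extra handling.

```python
import ast

# Assumed names of the markup functions; the real API is still to be designed.
MARKUP_NAMES = {"title", "text"}


def _is_markup_call(node):
    """True for a bare statement such as title("...") or nb.title("...")."""
    if not (isinstance(node, ast.Expr) and isinstance(node.value, ast.Call)):
        return False
    func = node.value.func
    if isinstance(func, ast.Name):
        return func.id in MARKUP_NAMES
    if isinstance(func, ast.Attribute):
        return func.attr in MARKUP_NAMES
    return False


class _StripMarkup(ast.NodeTransformer):
    """Remove expression statements which are calls to markup functions."""

    def visit_Expr(self, node):
        # Returning None deletes the statement from its parent's body.
        return None if _is_markup_call(node) else node


def strip_markup(source):
    """Return the source with all bare markup calls removed."""
    tree = _StripMarkup().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


NOTEBOOK = (
    'title("Demo")\n'
    "\n"
    "def add(x, y):\n"
    "    return x + y\n"
    "\n"
    'text("Result follows.")\n'
    "total = add(2, 3)\n"
)
clean = strip_markup(NOTEBOOK)
```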
\layout Subsection
The editing environment
\layout Standard
The first and most important part of the project should be the definition
of a clean API and the implementation of the exporter modules as indicated
above.
Ultimately, such files can be developed using any text editor, since they
are nothing more than regular Python code.
\layout Standard
But once these goals are reached, further integration with an editor will
be done, without the need for a full-blown GUI shell.
In fact, the (X)Emacs editors can already provide for interactive
usage of such files.
Using python-mode in (X)Emacs, one can pass highlighted regions of a file
for execution to an underlying Python process, and the results are printed
in the Python window.
With recent versions of python-mode, IPython can be used instead of the
plain Python interpreter, so that IPython's extended interactive capabilities
become available within (X)Emacs (improved tracebacks, automatic debugger
integration, variable information, easy filesystem access to Python, etc.).
\layout Standard
But even with IPython integration, the usage within (X)Emacs is not ideal
for a notebook environment, since the Python process buffer is separate
from the Python file.
Therefore, the next stage of the project will be to enable tighter integration
between the editing and execution environments.
The basic idea is to provide an easy way to mark regions of the file to
be executed interactively, and to have the output inserted automatically
into the file.
The following listing is a mockup of what the resulting file could look
like:
\layout Standard
\begin_inset ERT
status Open
\layout Standard
\backslash
lstinputlisting{nbexample_output.py}
\end_inset
\layout Standard
Basically, the editor will execute
\family typewriter
add(2,3)
\family default
and insert the string representation of the output into the file, so it
can be used for rendering later.
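\layout Standard
The execute-and-insert step described above can be sketched as follows.
The function insert_output and the output markup call it writes are both
hypothetical, and the sketch only handles a single-line expression at the
top level of the file; a real editor connector would hand the code to IPython
rather than call eval directly.

```python
def insert_output(source, lineno):
    """Evaluate the expression on 1-based line `lineno` and record its repr.

    Lines before `lineno` are executed first so that the expression can use
    the names they define.  The result is inserted on the following line as
    a call to a hypothetical ``output`` markup function.
    """
    lines = source.splitlines()
    # Provide a do-nothing ``output`` so earlier inserted calls stay harmless.
    namespace = {"output": lambda text: None}
    exec("\n".join(lines[: lineno - 1]), namespace)
    value = eval(lines[lineno - 1], namespace)
    lines.insert(lineno, "output(%r)" % repr(value))
    return "\n".join(lines) + "\n"


NOTEBOOK = "def add(x, y):\n    return x + y\n\nadd(2, 3)\n"
updated = insert_output(NOTEBOOK, 4)
```

\layout Standard
Storing the string representation as an argument to a regular function call
keeps the updated file valid Python, so it can still be executed or rendered
later.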
\layout Section
Available resources
\layout Standard
IPython currently has all the necessary infrastructure for code execution,
albeit in a rather messy code base.
Most I/O is already abstracted out, a necessary condition for embedding
in a GUI (since you are not writing to stdout/stderr but to the GUI's text
area).
\layout Standard
For interaction with an editor, it will be necessary to identify a good
programming editor with a Python-compatible license, which can be extended
to communicate with the underlying IPython engine.
IDLE, the Tk-based IDE which ships with Python, should obviously be considered.
The Scintilla editing component
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.scintilla.org}
\end_inset
\end_inset
may also be a viable candidate.
\layout Standard
It will also be interesting to look at the LyX editor
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.lyx.org}
\end_inset
\end_inset
, which already offers a Python client
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://wiki.lyx.org/Tools/PyClient}
\end_inset
\end_inset
.
Since LyX has very sophisticated LaTeX support, this is a very interesting
direction to consider for the future (though LyX makes a poor programming
editor).
\layout Section
Support offered to the students
\layout Standard
The IPython project already has an established Open Source infrastructure,
including CVS repositories, a bug tracker and mailing lists.
As the main author and sole maintainer of IPython, I will personally assist
the student(s) funded with architectural and design guidance, preferably
on the public development mailing list.
I expect them to start working by submitting patches until they show, by
the quality of their work, that they can be granted CVS write access.
I expect most actual implementation work to be done by the students, though
I will provide assistance if they need it with a specific technical issue.
\layout Standard
If more than one applicant is accepted to work on this project, there is
more than enough work to be done which can be coordinated between them.
\layout Section
Licensing and copyright
\layout Standard
IPython is licensed under BSD terms, and copyright of all sources rests
with the original authors of the core modules.
Over the years, all external contributions have been small enough patches
that they have been simply folded into the main source tree without additional
copyright attributions, though explicit credit has always been given to
all contributors.
\layout Standard
I expect the students participating in this project to contribute enough
standalone code that they can retain the copyright to it if they so desire,
as long as they agree to license all their work under BSD terms.
\layout Section
Acknowledgements
\layout Standard
I'd like to thank John D.
Hunter, the author of matplotlib
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://matplotlib.sf.net}
\end_inset
\end_inset
, for lengthy discussions which helped clarify much of this project.
In particular, the important decision to embed the notebook markup
calls in true Python functions, instead of specially-tagged strings or comments,
was an idea he pushed hard enough to convince me to adopt.
\layout Standard
My conversations with Brian Granger, the author of PyXG
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://hammonds.scu.edu/~classes/pyxg.html}
\end_inset
\end_inset
and braid
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://hammonds.scu.edu/~classes/braid.html}
\end_inset
\end_inset
, have also been very useful in clarifying details of the necessary underlying
infrastructure and future evolution of IPython for this kind of system.
\layout Standard
Thank you also to the IPython users who have, in the past, discussed this
topic with me either in private or on the IPython or Scipy lists.
\the_end