WEBVTT
00:00:00.001 --> 00:00:03.160
One of the fastest growing areas in Python is scientific computing.
00:00:03.160 --> 00:00:07.700
In scientific computing with Python, there are a few key packages that make it really special.
00:00:07.700 --> 00:00:11.420
These include NumPy, SciPy, and the related packages.
00:00:11.420 --> 00:00:16.800
But the one that brings it all together, visually, is IPython, now known as Project Jupyter.
00:00:16.800 --> 00:00:20.780
And that's the topic of episode 44 of Talk Python to Me.
00:00:20.780 --> 00:00:25.540
You'll learn about the big split, plans for the recent $6 million in funding,
00:00:25.540 --> 00:00:30.700
Jupyter at CERN and the Large Hadron Collider with Min Ragan-Kelley and Matthias Bussonnier.
00:00:54.080 --> 00:01:01.200
Welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries, the ecosystem, and the personalities.
00:01:01.200 --> 00:01:05.320
This is your host, Michael Kennedy. Follow me on Twitter where I'm @mkennedy.
00:01:05.320 --> 00:01:09.200
Keep up with the show and listen to past episodes at talkpython.fm.
00:01:09.200 --> 00:01:11.800
And follow the show on Twitter via @talkpython.
00:01:11.800 --> 00:01:15.380
This episode is brought to you by Hired and SnapCI.
00:01:15.380 --> 00:01:22.080
Thank them for supporting the show on Twitter via @Hired_HQ and @Snap_CI.
00:01:23.080 --> 00:01:25.200
Hi, folks. No news this week.
00:01:25.200 --> 00:01:29.100
I do have a big announcement coming, and I'm really looking forward to sharing it with you all,
00:01:29.100 --> 00:01:31.900
but I'm not quite ready to talk about it yet, so stay tuned.
00:01:31.900 --> 00:01:36.280
For now, let's get right to the interview with the Project Jupyter core devs,
00:01:36.280 --> 00:01:38.200
Min Ragan-Kelley and Matthias Bussonnier.
00:01:38.200 --> 00:01:40.840
Matthias, Min, welcome to the show.
00:01:40.840 --> 00:01:41.500
Thanks.
00:01:41.500 --> 00:01:43.120
Thanks, Mike, for having us here.
00:01:43.540 --> 00:01:49.660
Yeah, I'm really super excited to talk about Python intersected with science in this thing called
00:01:49.660 --> 00:01:51.780
IPython, or what's become Project Jupyter.
00:01:51.780 --> 00:01:53.800
So that's going to be really great.
00:01:53.800 --> 00:01:58.000
And before we get to that, though, let's just talk about how you got involved.
00:01:58.000 --> 00:01:59.320
How did you get into programming?
00:01:59.320 --> 00:02:03.040
How did you get involved with IPython and all that stuff?
00:02:03.040 --> 00:02:03.740
What's your background?
00:02:03.740 --> 00:02:04.760
Min, you want to go first?
00:02:04.760 --> 00:02:05.200
Sure.
00:02:05.200 --> 00:02:11.520
Yeah, so I was an undergrad in physics at Santa Clara University, working with Brian Granger,
00:02:11.520 --> 00:02:13.980
one of the founders of the IPython project.
00:02:13.980 --> 00:02:21.720
And I was interested in computing and simulation and things, and ended up working on the interactive
00:02:21.720 --> 00:02:28.640
parallel computing part of IPython as my undergrad thesis, and started doing my numerical simulation
00:02:28.640 --> 00:02:34.000
homework stuff in Python, even though the classes were taught in MATLAB and Octave and things,
00:02:34.000 --> 00:02:39.980
and enjoying the scientific Python ecosystem of NumPy and Matplotlib and things.
00:02:39.980 --> 00:02:44.360
And that's kind of how I came to the project and scientific Python in general.
00:02:44.360 --> 00:02:45.280
Yeah, that's really cool.
00:02:45.280 --> 00:02:48.000
And was IPython already a thing when you got started?
00:02:48.000 --> 00:02:48.660
Yeah.
00:02:48.660 --> 00:02:54.260
Fernando created IPython in 2001, and I was doing my undergrad a few years after that.
00:02:54.260 --> 00:02:59.440
And so I joined the project after it had been around for about five years in 2006, and I've
00:02:59.440 --> 00:03:02.000
been working on it for the past 10 years, I guess, now.
00:03:02.000 --> 00:03:03.480
Yeah, about 10 years.
00:03:03.480 --> 00:03:04.560
So how time flies.
00:03:04.560 --> 00:03:05.700
Matthias, how about you?
00:03:05.700 --> 00:03:11.600
Oh, so I came to the project much later than Min.
00:03:11.600 --> 00:03:16.620
I actually started programming a long time ago and came across one of the huge refactorings
00:03:16.620 --> 00:03:18.100
of IPython that Min did.
00:03:18.100 --> 00:03:22.140
I think it finished in summer 2011.
00:03:22.920 --> 00:03:27.820
Just after that, they released the Qt console, so the current IPython team at the time.
00:03:27.820 --> 00:03:35.080
And the project was much friendlier and in good shape for beginner Python programmers.
00:03:35.080 --> 00:03:42.560
At the time, I was beginning my PhD in biophysics in Paris, and I started contributing to the project.
00:03:42.760 --> 00:03:45.560
It was my first big contribution to an open source project.
00:03:45.560 --> 00:03:52.540
And I started to spend my nights and weekends during my PhD improving IPython, which was helping
00:03:52.540 --> 00:03:55.140
me a lot with my PhD.
00:03:55.140 --> 00:04:00.940
And I quickly became a core contributor, and I've stayed in the team since then.
00:04:01.100 --> 00:04:07.500
Maybe a good place to start talking about this whole project is maybe we can start with the
00:04:07.500 --> 00:04:07.760
history.
00:04:07.760 --> 00:04:12.060
Originally, this project was called IPython and IPython Notebooks, right?
00:04:12.400 --> 00:04:19.440
Yeah, IPython was around for a good 10 years before we got a version of the notebook out, although
00:04:19.440 --> 00:04:24.960
we had been working on various versions of notebooks for about five years, and most attempts kind
00:04:24.960 --> 00:04:25.940
of didn't go anywhere.
00:04:25.940 --> 00:04:27.780
So what did it look like?
00:04:27.780 --> 00:04:31.620
What did the product look like in the early days?
00:04:31.620 --> 00:04:38.020
Yeah, so initially, Fernando created IPython as just a better interactive shell for Python,
00:04:38.360 --> 00:04:44.980
so giving you some better tab completions, nice colorful tracebacks, things like that.
00:04:44.980 --> 00:04:50.560
Also, Python's a nice, verbose language, but when you're doing interactive stuff, some of
00:04:50.560 --> 00:04:55.320
the bash shell syntax is nicer to type when you're doing ls and cd and everything.
00:04:55.320 --> 00:05:01.280
So one of the things that Fernando did early on was add this notion of magics for extending
00:05:01.280 --> 00:05:06.060
the Python language to give convenient commands for interactively typed things.
00:05:06.060 --> 00:05:13.340
Like, you can type cd in IPython, which you can't type in a regular Python environment.
00:05:13.340 --> 00:05:20.360
And then there are magics that are particularly useful for the scientific visualization things
00:05:20.360 --> 00:05:27.880
and things like the timeit magic for profiling and good Matplotlib integration for the event loop
00:05:27.880 --> 00:05:28.500
and things like that.
00:05:29.000 --> 00:05:31.280
I know what the shell looks like today.
00:05:31.280 --> 00:05:35.820
You know, you can load it up and it kind of looks like a shell and you type in there,
00:05:35.820 --> 00:05:41.300
but you can do things like plot graphs and those will pop up into separate windows and things like that, right?
00:05:41.300 --> 00:05:45.280
Yeah, and most of that is provided by tools like Matplotlib,
00:05:45.280 --> 00:05:50.880
but often those tools need a little bit of help to make sure that the terminal stays responsive.
00:05:50.880 --> 00:05:58.020
And that's one of the things that IPython helps with in terms of what Python calls the input hook
00:05:58.020 --> 00:06:04.300
to ensure that the terminal remains responsive while a GUI event loop is also running.
00:06:04.300 --> 00:06:05.620
Right, yeah, very cool.
00:06:05.620 --> 00:06:12.340
So how did it go from that to the more, I want to think of it as like articles or published style,
00:06:12.460 --> 00:06:18.280
these notebooks that you can use to communicate almost like finished work rather than something interactive?
00:06:18.280 --> 00:06:24.700
Notebook style interfaces, since a lot of the IPython folks come from a physics background.
00:06:24.700 --> 00:06:32.520
So Brian and Fernando, who started the project, were doing their graduate work in physics at Boulder at the same time.
00:06:32.520 --> 00:06:34.520
And then I was Brian's physics student.
00:06:34.520 --> 00:06:37.800
And notebook environments are pretty common.
00:06:37.980 --> 00:06:43.940
There are various commercial and non-commercial products that have kind of notebook processing environments,
00:06:43.940 --> 00:06:47.400
especially for math analysis.
00:06:47.400 --> 00:06:51.460
So often code is not the best representation of math,
00:06:51.460 --> 00:06:54.420
but there are rich, you know, rendered mathematical expressions that are nice.
00:06:54.420 --> 00:07:00.600
Brian and Fernando knew that they wanted a notebook type interface fairly early on,
00:07:00.600 --> 00:07:04.220
but the tools just weren't there to build it.
00:07:04.220 --> 00:07:07.120
And IPython wasn't in a shape to really support it.
00:07:07.260 --> 00:07:10.840
So slowly at first, and then it kind of picked up speed.
00:07:10.840 --> 00:07:14.400
We added the pieces for putting that together.
00:07:14.400 --> 00:07:20.180
But it was kind of in, it was on the horizon for many years before it actually happened.
00:07:20.180 --> 00:07:20.600
Sure.
00:07:20.600 --> 00:07:23.280
It took a while to build the maturity into it.
00:07:23.280 --> 00:07:27.820
What are some of those building blocks that it was waiting on?
00:07:27.820 --> 00:07:36.820
Yeah, I think that web technologies, and WebSockets in particular, were among the pieces that were missing to actually build a notebook.
00:07:36.820 --> 00:07:44.800
If I remember correctly, one of the latest prototypes that we did not release was using AJAX polling.
00:07:45.820 --> 00:07:53.320
But the ability to actually push a result to the web front end once, as soon as the kernel gets a result,
00:07:53.320 --> 00:07:59.000
is one of the key factors that pushed the notebook forward and allowed us to do the notebook.
00:07:59.000 --> 00:08:08.660
Actually, the first prototype of the notebook that you know nowadays was still using a draft of WebSockets, which stayed in draft state for a long time.
00:08:08.660 --> 00:08:15.600
And so we were really bleeding edge on this technology and adopting a lot of everything, everything in browser,
00:08:15.600 --> 00:08:19.720
and everything to just rely on what current browsers can do for the notebook.
00:08:19.720 --> 00:08:27.200
There is, by the way, and we can put that in the show notes of the podcast, a really nice blog post by Fernando that recaps the history of IPython.
00:08:27.200 --> 00:08:38.100
And even 150 lines of Python, which is a version of IPython when it was like a few weeks old, which is IPython 0.1,
00:08:38.100 --> 00:08:43.620
that we can dig up for people who are interested in trying a really early prototype.
00:08:43.620 --> 00:08:45.660
Yeah, go back and see the history.
00:08:45.660 --> 00:08:54.340
That's a really interesting point, Matthias, because it's easy to think of the web as being this very rich, powerful, capable platform,
00:08:54.340 --> 00:08:58.140
because it has been for the last five years or so.
00:08:58.140 --> 00:09:03.600
But 10 or even more than 10 years ago, it was not, right?
00:09:03.600 --> 00:09:07.080
It was basically just documents on the web, right?
00:09:07.080 --> 00:09:10.340
You had a little bit of JavaScript, and that was about it, right?
00:09:10.340 --> 00:09:11.940
Yeah, I think so.
00:09:11.940 --> 00:09:13.840
I haven't used that much.
00:09:13.840 --> 00:09:17.580
I was not developing on the web that much 10 years ago.
00:09:17.580 --> 00:09:20.320
I was more of a C, C++ person.
00:09:20.320 --> 00:09:23.900
Min, maybe you were doing more web development at the time?
00:09:23.900 --> 00:09:33.780
Yeah, I wrote an early version of the web-based notebook for IPython during the summer of 2006, 2007, I think.
00:09:33.780 --> 00:09:37.840
Even then, the tools available really weren't...
00:09:37.840 --> 00:09:41.620
It was not a particularly pleasant thing to work with.
00:09:42.120 --> 00:09:43.340
I bet it wasn't.
00:09:43.340 --> 00:09:52.540
Did you end up in a lot of situations where you're like, oh, this only works in Firefox, and this one only works in IE, and just partly working in a lot of places?
00:09:52.540 --> 00:09:54.580
It's frankly still like that.
00:09:54.580 --> 00:09:56.740
It still never works in IE.
00:09:56.740 --> 00:09:58.860
Yeah.
00:09:58.860 --> 00:10:01.520
Yeah, it's hard to love IE, I know.
00:10:02.060 --> 00:10:16.320
Well, but I mean, recent versions of Internet Explorer are actually really nice and have good standards implementations and everything, but the reputation of IE6 kind of overshadows.
00:10:16.840 --> 00:10:18.720
Yeah, it definitely casts a long shadow.
00:10:18.720 --> 00:10:30.200
And, you know, Microsoft, I think just last week, possibly, like very recently, just ended support for all versions of IE other than, I think, 11 and onward.
00:10:30.520 --> 00:10:33.840
Maybe 10 and onward, but certainly knocked out a whole bunch of them.
00:10:33.840 --> 00:10:38.580
And, you know, once that kicks in, that's going to be a good day for everyone that has to work on the web.
00:10:38.580 --> 00:10:41.380
Microsoft has done a lot of things recently.
00:10:41.380 --> 00:10:51.020
Last week also, if I remember correctly, they did release as open source the JavaScript engine that will power the next version of their browser.
00:10:51.660 --> 00:11:10.360
So Google has V8, which powers both Chrome and Node.js, and which is actually one of the technologies that helped the notebook become reality, because JavaScript was painfully slow 10 years ago and is now really, really fast thanks to V8.
00:11:10.360 --> 00:11:20.140
And so it's really nice to see nowadays Microsoft actually releasing open source software and contributing to the community.
00:11:20.620 --> 00:11:35.560
And I hope that in the next few years, Microsoft will shake off the fact that everybody is complaining about IE and everything, and actually ship nice software, with not that many security bugs, and so on and so forth.
00:11:35.560 --> 00:11:44.500
It will be really nice if that comes along because a lot of people run their software and it would be, you know, the world would be a better place if it works really well.
00:11:44.500 --> 00:11:46.900
I certainly think they're on the right path.
00:11:46.900 --> 00:11:48.200
I think it's pretty interesting.
00:11:48.480 --> 00:11:57.920
So one thought I had while you guys were talking about this is, what's the cross-platform story for IPython and Jupyter in general?
00:11:57.920 --> 00:12:05.120
Does it work kind of equally well on Windows, Linux, OS X or are there places that are more equal than others?
00:12:05.120 --> 00:12:11.880
Linux and OS X are a little bit more equal than Windows, but it should work.
00:12:12.380 --> 00:12:13.580
It should work everywhere.
00:12:13.580 --> 00:12:28.400
And even though all of our developers and everything are working exclusively on Linux and OS X, when we do user surveys and things, we find that roughly half or even slightly more than half of our users are running Windows.
00:12:28.820 --> 00:12:46.880
So even though it often doesn't work quite as well or we frequently during the development process will introduce bugs that we don't notice for a while, Windows really is a first-class platform for the kind of local desktop app that happens to use a web browser for UI case of the notebook.
00:12:46.880 --> 00:12:55.480
There are certain aspects of installation that are often more challenging on Windows, especially in terms of installing kernels other than the Python one.
00:12:55.480 --> 00:13:01.020
So installing multi-language kernels is more challenging on Windows.
00:13:01.020 --> 00:13:05.240
And I think that's not necessarily a specific deficiency of Windows.
00:13:05.240 --> 00:13:10.020
It's more just the kind of developer maintainers don't tend to use Windows.
00:13:10.020 --> 00:13:15.760
So the documentation and education often just don't cover what you need to do for Windows as well.
00:13:15.760 --> 00:13:16.540
Right.
00:13:16.540 --> 00:13:25.700
If you don't develop and test deploying your packages in the underlying compilers that have to make them go, well, you're more likely to run into problems, right?
00:13:25.700 --> 00:13:26.360
Yeah.
00:13:26.360 --> 00:13:31.560
I would say also that Travis CI, so continuous integration, is often on Linux only.
00:13:31.560 --> 00:13:33.800
Setting up on Windows is painful.
00:13:33.800 --> 00:13:41.180
So we catch bugs much more often with continuous integration on Linux.
00:13:41.180 --> 00:13:44.820
So it's less prone to bugs on Linux.
00:13:44.820 --> 00:13:55.640
And the other thing is, I don't always like to say good things about half-proprietary tools, but conda has changed a lot of things over the last few years.
00:13:55.900 --> 00:13:59.240
It was really painful to install Python on many systems.
00:13:59.240 --> 00:14:10.280
And now it's one of the solutions, especially at Software Carpentry boot camps, where we ask people to just install conda and run conda install jupyter, which now even comes vendored in it.
00:14:10.280 --> 00:14:14.980
And it almost always works out of the box.
00:14:14.980 --> 00:14:19.440
And especially for beginners, it's a really, really nice tool.
00:14:19.660 --> 00:14:26.560
Yeah, conda has really moved the bar for how easy it is to get set up, especially on Windows.
00:14:26.560 --> 00:14:32.920
There are lots of different ways to install things on Unix-y platforms that work fairly reliably.
00:14:32.920 --> 00:14:42.680
But the binaries provided by conda and Anaconda are extremely valuable for beginners, especially on Windows, where people don't tend to have a working compiler set up.
00:14:42.680 --> 00:14:47.880
And a lot of the scientific packages won't build on people's Windows machines.
00:14:48.880 --> 00:14:51.500
So having binaries is extremely important.
00:14:51.500 --> 00:14:58.800
And the binaries provided by conda and Anaconda have been extremely valuable, especially for people getting started in scientific Python.
00:14:58.800 --> 00:15:07.380
Yeah, I think I still have scars from the "unable to find vcvarsall.bat" sort of errors trying to do stuff on Windows.
00:15:07.380 --> 00:15:14.580
And we had Travis Oliphant on Show 34, who is behind conda and Continuum and all that.
00:15:14.780 --> 00:15:26.080
And I think it's a really cool thing that those guys are doing, sort of taking that build configuration step and just pre-building it and shipping the binaries, like you say.
00:15:26.080 --> 00:15:29.260
That really helps people when they're getting started, I think.
00:15:29.660 --> 00:15:47.980
Yeah, it's made a huge difference, especially, as Matthias mentioned, in the workshop setting, the kind of Software Carpentry and Python boot camp type environments, where, just a few years ago, you would often spend the first day on installation, basically.
00:15:48.280 --> 00:15:51.340
Which is a high price to pay in a two-day workshop.
00:15:51.340 --> 00:15:53.860
And now it's often down to an hour.
00:15:53.860 --> 00:15:54.580
It's awesome.
00:15:54.580 --> 00:15:55.780
It's a super high price to pay.
00:15:55.780 --> 00:15:58.540
And it's also super discouraging, right?
00:15:58.540 --> 00:16:01.700
People come not because they want to learn how to configure their compiler.
00:16:01.700 --> 00:16:04.140
They want to come build something amazing, right?
00:16:04.140 --> 00:16:08.080
And they've got to, like, plow through all these nasty configuration edge cases.
00:16:08.480 --> 00:16:09.480
And, yeah, very, very cool.
00:16:09.480 --> 00:16:19.640
So, before we move further, you know, just the other day, I was trying to describe IPython to somebody in, like, one or two sentences.
00:16:19.640 --> 00:16:21.920
And I didn't do a super job, I think.
00:16:21.920 --> 00:16:29.420
Could you guys maybe give me your elevator pitch for what is Jupyter or IPython, which becomes Jupyter?
00:16:29.420 --> 00:16:30.560
It's really tough.
00:16:30.560 --> 00:16:32.740
Have you seen The Lego Movie?
00:16:32.740 --> 00:16:35.560
Do you know the song "Everything Is Awesome"?
00:16:35.560 --> 00:16:36.660
Yes.
00:16:37.940 --> 00:16:39.580
That would be my pitch.
00:16:39.580 --> 00:16:42.000
Yeah.
00:16:42.000 --> 00:16:43.400
Everything is awesome.
00:16:43.400 --> 00:16:43.720
Okay.
00:16:43.720 --> 00:16:44.520
Yeah.
00:16:44.520 --> 00:16:58.520
So, I would say the IPython and Jupyter projects together provide tools for interactive computing and reproducible research and software-based communication.
00:16:58.520 --> 00:16:59.700
Okay.
00:16:59.700 --> 00:17:01.360
It's kind of the high-level gist.
00:17:01.360 --> 00:17:06.440
It's fairly different than a lot of what's out there from a programmer's perspective.
00:17:06.660 --> 00:17:08.680
So, it does take a little explaining, doesn't it?
00:17:08.680 --> 00:17:10.060
Yeah.
00:17:10.060 --> 00:17:16.280
So, we have things like an environment in which to do the interactive programming and do the exploratory work.
00:17:16.940 --> 00:17:23.360
And then we also have things like the notebook document format, which are for distributing the communication and sharing it with other people.
00:17:23.720 --> 00:17:26.240
So, those are kind of the two aspects.
00:17:26.240 --> 00:17:31.220
And Fernando likes to say we have tools for the life cycle of a computational idea.
00:17:31.220 --> 00:17:33.080
That's a very cool way to put it.
00:17:33.080 --> 00:17:34.080
It's a very cool tagline.
00:17:34.080 --> 00:17:34.620
I like it.
00:17:34.620 --> 00:17:39.340
We're talking about IPython because that's the historical place.
00:17:39.340 --> 00:17:42.840
And we're talking about Jupyter because that's the present and the future.
00:17:42.840 --> 00:17:46.320
Could you guys maybe talk about how it went from one to the other?
00:17:46.400 --> 00:17:47.080
What's the story there?
00:17:47.080 --> 00:17:47.940
Yeah.
00:17:47.940 --> 00:18:03.940
So, when we started working on building these UIs with rich media displays, the first one of which was the Qt console, the first step of that was separating the front end from what we call the kernel, which is where code runs.
00:18:04.360 --> 00:18:10.520
That meant essentially establishing a network protocol for a REPL, basically.
00:18:10.520 --> 00:18:19.440
And with that, we have the ability, an expression of, okay, I'm going to send an execute request that has some code for the kernel to evaluate.
00:18:19.440 --> 00:18:25.440
And then the kernel sends messages back that are display formats of various types.
00:18:25.440 --> 00:18:28.980
So, it can send back PNGs or HTML or text.
00:18:29.460 --> 00:18:44.040
We realized, not entirely on purpose, this wasn't what we set out to do, but we realized when we had this protocol that there was nothing Python-specific about it, that any language that understands a REPL can talk this protocol.
00:18:44.040 --> 00:18:52.860
And because the UI and the code execution were in different processes, there's no reason that the two need to be in the same language.
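NOTE
Editorial sketch, not part of the spoken transcript. The request and reply exchange described above might look roughly like the following. The field names follow the published Jupyter messaging documentation as best I recall it. The make_message helper and the "listener" user name are hypothetical, and the real wire format (ZeroMQ multipart frames with a signature) is left out on purpose.
```python
import json
import uuid
from datetime import datetime, timezone
def make_message(msg_type, content, session, parent_header=None):
    """Hypothetical helper: build a dict with the four parts of a protocol message."""
    header = {
        "msg_id": uuid.uuid4().hex,
        "session": session,
        "username": "listener",  # assumption: any user name would do here
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": msg_type,
        "version": "5.3",
    }
    return {
        "header": header,
        "parent_header": parent_header or {},
        "metadata": {},
        "content": content,
    }
session = uuid.uuid4().hex
# The front end asks the kernel to evaluate some code.
request = make_message("execute_request", {"code": "1 + 1", "silent": False}, session)
# A kernel written in any language can answer with results keyed by MIME type.
reply = make_message(
    "execute_result",
    {"execution_count": 1, "data": {"text/plain": "2"}, "metadata": {}},
    session,
    parent_header=request["header"],
)
wire = json.dumps(request)  # simplified stand-in for the real serialization
```
Nothing in either message names Python. The reply only points back at the request through parent_header, which is why a front end can pair with a kernel in another language.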
00:18:53.860 --> 00:19:07.160
Communities like, the first big one was the Julia language community, essentially saw the UI, specifically the notebook UI, and said, you know, we like that, we want to use that, we'd rather not reimplement it.
00:19:07.160 --> 00:19:09.680
So, what they implemented was the protocol.
00:19:09.680 --> 00:19:12.980
And once they implemented the protocol, they got the UI for free.
00:19:13.640 --> 00:19:25.760
The result of that, since we didn't set out to design that, there were a bunch of rough edges where we had assumed Python, but they were kind of incidental, smaller assumptions to work around.
00:19:25.760 --> 00:19:40.420
And so, since that started, we've been kind of refining protocols and things to remove Python and IPython assumptions, so that the UI is separate from the language in which execution happens.
00:19:40.620 --> 00:19:49.780
Because we don't really, you know, a lot of the benefits of the protocol and the display stuff, there's no reason it should be confined to code executing in Python.
00:19:49.780 --> 00:19:53.220
Yeah, that's a really happy coincidence, isn't it?
00:19:53.220 --> 00:19:54.200
That's excellent.
00:19:54.200 --> 00:19:54.640
Yeah.
00:20:05.520 --> 00:20:07.700
This episode is brought to you by Hired.
00:20:07.700 --> 00:20:13.300
Hired is a two-sided, curated marketplace that connects the world's knowledge workers to the best opportunities.
00:20:13.300 --> 00:20:20.800
Each offer you receive has salary and equity presented right up front, and you can view the offers to accept or reject them before you even talk to the company.
00:20:20.800 --> 00:20:26.520
Typically, candidates receive five or more offers within the first week, and there are no obligations, ever.
00:20:26.520 --> 00:20:28.100
Sounds awesome, doesn't it?
00:20:28.100 --> 00:20:29.760
Well, did I mention the signing bonus?
00:20:30.040 --> 00:20:33.140
Everyone who accepts a job from Hired gets a $1,000 signing bonus.
00:20:33.140 --> 00:20:35.940
And as Talk Python listeners, it gets way sweeter.
00:20:35.940 --> 00:20:41.660
Use the link hired.com/talkpythontome, and Hired will double the signing bonus to $2,000.
00:20:41.660 --> 00:20:43.800
Opportunity's knocking.
00:20:43.800 --> 00:20:47.240
Visit hired.com/talkpythontome and answer the call.
00:20:53.240 --> 00:20:55.780
Matthias, where did Jupyter come from?
00:20:55.780 --> 00:20:57.620
It used to be called IPython.
00:20:57.620 --> 00:21:00.260
Obviously, that doesn't make sense if you're not using Python.
00:21:00.260 --> 00:21:11.180
We had been thinking about renaming part of the project for much longer than the time since we actually announced that we would be renaming to Jupyter.
00:21:11.180 --> 00:21:16.640
Of course, we were aware that users, especially non-Python users, were confused.
00:21:16.640 --> 00:21:20.280
Like, I want to use a notebook with R.
00:21:20.280 --> 00:21:22.060
Why should I install IPython?
00:21:22.060 --> 00:21:30.040
And you have to understand that many users don't even distinguish between Python and IPython.
00:21:30.040 --> 00:21:34.740
And many users also write IPython with lowercase i, and everyone knows that it's uppercase i.
00:21:34.740 --> 00:21:37.500
It's not made by Apple.
00:21:37.500 --> 00:21:37.920
Come on.
00:21:37.920 --> 00:21:38.480
Yeah.
00:21:38.480 --> 00:21:50.280
And so, yeah, we were searching for another name, something that is easy to Google, that is not already taken, where we can get a domain name.
00:21:50.280 --> 00:21:54.980
And that would have a connotation, a scientific connotation.
00:21:54.980 --> 00:22:03.800
And we wanted to give thanks to the astro community that has been using IPython for a long, long time, almost since the beginning.
00:22:04.620 --> 00:22:12.480
And I still remember one day, Fernando wrote a mail to the whole team and said, hey, I just found this name.
00:22:12.480 --> 00:22:14.000
What do you think?
00:22:14.000 --> 00:22:16.860
And everybody agreed.
00:22:16.860 --> 00:22:26.080
And within a couple of days, we decided to grab all the domain names and start working on actually separating the project and everything.
00:22:26.080 --> 00:22:29.260
It has been a really tough transition.
00:22:29.840 --> 00:22:33.060
People were really, really confused about the renaming.
00:22:33.060 --> 00:22:35.020
People are still confused.
00:22:35.020 --> 00:22:42.900
But especially for new users, the distinction between Jupyter and IPython is really, really useful.
00:22:42.900 --> 00:22:47.340
And also, it allowed Jupyter to become something slightly bigger.
00:22:47.340 --> 00:22:49.700
That was also in the back of our minds.
00:22:49.700 --> 00:22:56.400
Which was that Jupyter is more of a specification: you have a protocol and you have a set of tools.
00:22:56.800 --> 00:23:05.160
What is part of Jupyter is much broader and it can allow anybody to basically say, hey, I implement the Jupyter protocol.
00:23:05.160 --> 00:23:10.440
And so, it's easier to say, hey, I have a Jupyter Atom plugin.
00:23:11.580 --> 00:23:17.700
There are also legal issues around that: using trademarks that are really close to Python is difficult.
00:23:17.700 --> 00:23:24.320
And Jupyter, being a brand new namespace (and you know that namespaces are great, we should use more of them),
00:23:24.320 --> 00:23:34.420
allows people to use that and say that they are multi-language in a much better way than saying they are compatible with IPython.
00:23:34.680 --> 00:23:38.060
Because IPython is also strongly associated with being a shell.
00:23:38.060 --> 00:23:40.960
And Jupyter is more than just a notebook.
00:23:40.960 --> 00:23:46.380
So, having Jupyter is much better and we are happy with that.
00:23:46.380 --> 00:23:48.080
Yeah, it makes perfect sense.
00:23:48.080 --> 00:23:54.180
I'm sure the transition was a little confusing for people who have been doing IPython or they've heard about IPython.
00:23:54.180 --> 00:23:55.300
They were going to look into it.
00:23:55.300 --> 00:23:56.500
Now it's this other thing.
00:23:57.180 --> 00:24:01.020
But there's more than just a couple of languages that are supported, right?
00:24:01.020 --> 00:24:01.920
How many are supported?
00:24:01.920 --> 00:24:06.420
It depends on what you mean by supported.
00:24:06.420 --> 00:24:17.420
We have a wiki page which is still on the IPython repository, which lists, if I remember correctly, 50 or almost 60 languages.
00:24:17.420 --> 00:24:22.000
It means that you can have languages that have many kernels.
00:24:22.000 --> 00:24:29.260
It means that someone at some point wrote a kernel or a toy kernel that works with IPython.
00:24:29.260 --> 00:24:32.800
And if I remember correctly, we have around 60.
00:24:32.800 --> 00:24:34.020
60.
00:24:34.020 --> 00:24:40.200
That means probably if you have a language you care about, it probably works with Jupyter, right?
00:24:40.200 --> 00:24:41.300
Or it's very much an edge case.
00:24:41.300 --> 00:24:44.180
Most kernels won't have all the features.
00:24:44.700 --> 00:24:53.420
I would say that the one I know works with most of the features is the Python one, because we maintain it.
00:24:53.420 --> 00:24:55.820
So you can see it as a reference implementation.
00:24:55.820 --> 00:25:02.900
There are other Python ones, like toys that are only a few hundred lines to show you how to implement that.
00:25:02.900 --> 00:25:07.540
The Julia kernel is pretty feature-complete.
00:25:07.540 --> 00:25:18.240
Actually, many of the features that we have in the IPython kernel were moved by the Julia team into the Julia language itself.
00:25:18.240 --> 00:25:30.200
So actually having implemented the protocol, having seen the notebook UI, allowed them to make much better abstractions for the Julia language and actually improve performance in some small areas.
00:25:30.400 --> 00:25:32.400
So that's almost a thing.
00:25:32.400 --> 00:25:38.080
The Haskell kernel also has a really good maintainer and has really nice features.
00:25:39.120 --> 00:25:47.840
Like if you write some code in Haskell and you can rewrite it in a more compact form, the Haskell kernel will tell you that after running your code.
00:25:47.840 --> 00:25:49.840
It says, hey, you can rewrite it this way.
00:25:49.840 --> 00:25:52.560
It will be more compact and more readable by someone who does Haskell.
00:25:53.440 --> 00:25:57.280
So Ruby had some activity at some point.
00:25:57.280 --> 00:26:00.580
I'm not sure now how much activity there is.
00:26:00.580 --> 00:26:03.440
And we definitely have people from the R kernel.
00:26:03.440 --> 00:26:08.160
The R kernel was created by Thomas Kluyver, who is now back in the UK, still working with us.
00:26:08.160 --> 00:26:20.480
And it has been taken over by some R people who are actively contributing to the R kernel and are also reporting and fixing a lot of bugs in IPython itself.
00:26:20.660 --> 00:26:33.680
Yeah, and another active kernel author community is the Calico project, which is from the CS department of Bryn Mawr by Doug Blank, where it's kind of a multi-language.
00:26:33.680 --> 00:26:37.040
The kernel itself is a multi-language environment.
00:26:37.040 --> 00:26:39.440
They can actually switch between different runtimes.
00:26:39.440 --> 00:26:41.900
That does some pretty cool stuff.
00:26:41.900 --> 00:26:46.080
And they've been very helpful with implementation and protocol testing and things.
00:26:46.080 --> 00:26:46.920
Oh, that's cool.
00:26:47.040 --> 00:26:58.420
Yeah, and that's kind of related to what I was going to ask you next is if I want to write something in Python and then something in C++ and then something in R, can I do that in like one notebook and have the data work together?
00:26:58.420 --> 00:27:00.240
Well, yes.
00:27:01.420 --> 00:27:03.600
So there are a couple things to that.
00:27:03.600 --> 00:27:10.760
One is we have chosen in the notebook to associate one notebook with one kernel.
00:27:10.760 --> 00:27:16.020
So there's one process determining how to interpret the code cells and produce output as a result.
00:27:16.260 --> 00:27:34.400
There's another project derived from IPython called Beaker Notebook that doesn't do this, that associates each cell with a kernel and then defines a data interchange for moving data around that allows running code and passing data around from JavaScript to R, Python, and so on.
00:27:34.940 --> 00:27:43.380
However, a kernel, from Jupyter's perspective, can itself define semantics for running code in other languages.
00:27:43.380 --> 00:27:49.780
And IPython, this is where sort of the distinction between IPython and Jupyter comes up.
00:27:49.780 --> 00:27:53.660
That as far as Jupyter is concerned, there's one kernel associated with the notebook.
00:27:53.660 --> 00:28:09.980
But the IPython kernel can define these things called cell magics that say this is shorthand for actually compiling a block of C++ code with Cython and then running that, or the R magic that actually hands off code to an R interpreter.
00:28:09.980 --> 00:28:17.460
As far as Jupyter is concerned, there's only one kernel per notebook, but the kernels themselves can actually provide some of this multi-language functionality.
00:28:17.460 --> 00:28:18.800
And IPython does.
00:28:18.800 --> 00:28:31.540
Yeah, to extend on what Min said, there is another kernel, one of the Calico kernels, actually, which is one kernel that implements many languages. I won't go into the details.
00:28:31.540 --> 00:28:41.780
And there is this nice distinction that a kernel is not always one language, it can be many languages, and in particular the Calico kernel uses triple percent syntax to say,
00:28:41.780 --> 00:28:46.120
hey kernel, change how you parse the next string.
00:28:46.660 --> 00:28:50.820
And so you can actually switch between three or four languages.