Commit a3db472

small edits to pydata berlin

mrocklin committed May 29, 2015
1 parent 954c3db commit a3db472
Showing 6 changed files with 29 additions and 17 deletions.
14 changes: 7 additions & 7 deletions docs/source/_static/presentations/markdown/dask-array.md
@@ -20,19 +20,19 @@ Continuum Analytics

### Related work

-* Parallel BLAS implementations - ScaLAPACK, Plasma, ...
-* Distributed arrays - PETSc/Trillinos, Elemental, HPF
-* Parallel collections - Hadoop/Spark (Dryad, Disco, ...)
-* Task scheduling frameworks - Luigi, swift-lang, ...
-* Python big-numpy projects: Distarray, Spartan, Biggus
+* Parallel BLAS implementations -- ScaLAPACK, Plasma, ...
+* Distributed arrays -- PETSc/Trillinos, Elemental, HPF
+* Parallel collections -- Hadoop/Spark (Dryad, Disco, ...)
+* Task scheduling frameworks -- Luigi, swift-lang, ...
+* Python big-numpy projects -- Distarray, Spartan, Biggus
* Custom solutions with MPI, ZMQ, ...

<hr>

### Distinguishing features of `dask.array`

-* Full ndarray support, no serious linear algebra
-* Shared memory parallelism, not distributed
+* Full ndarray support, instead of serious linear algebra
+* Focus on shared memory parallelism (workstation, not cluster)
* Immediately usable - `conda/pip` installable
* Dask includes other non-array collections
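The "immediately usable" bullet above can be sketched with a toy computation. This is a minimal illustration, assuming `dask` is installed via `conda/pip`; the array size and chunk shape are arbitrary choices for the example:

```python
import dask.array as da

# Build one large logical array out of many small NumPy chunks;
# nothing is computed until .compute() is called.
x = da.ones((4000, 4000), chunks=(1000, 1000))

# Each chunk's partial mean is computed in parallel threads,
# then the partial results are combined.
result = x.mean().compute()
print(result)  # an array of ones has mean 1.0
```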

2 changes: 1 addition & 1 deletion docs/source/_static/presentations/markdown/dask-core.md
@@ -27,7 +27,7 @@ Dead simple task scheduling
![](images/embarrassing.gif)


-## Useful for more than just arrays
+## Dask works for more than just arrays


## `dask.bag`
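As a small sketch of the non-array collections mentioned here, `dask.bag` applies the same task-graph machinery to unordered sequences; the data below is purely illustrative:

```python
import dask.bag as db

# A bag is an unordered collection split across partitions.
b = db.from_sequence(range(5), npartitions=2)

# map/sum only build a task graph; the synchronous scheduler
# then runs it in the current process, keeping the example
# deterministic and dependency-free.
total = b.map(lambda x: x ** 2).sum().compute(scheduler="synchronous")
print(total)  # 0 + 1 + 4 + 9 + 16 = 30
```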
27 changes: 19 additions & 8 deletions docs/source/_static/presentations/markdown/foundations.md
@@ -1,3 +1,6 @@
+### PyData builds off of NumPy and Pandas


### NumPy and Pandas provide foundational data structures

<img src="images/jenga.png" width="100%">
@@ -37,7 +40,7 @@ Date: Thu Feb 1 08:32:30 2001 +0000
### These limitations affect the PyData ecosystem


-### Hardware has changed since 1999
+### Hardware has changed since 2001

![](images/multicore-cpu.png)

@@ -48,7 +51,18 @@ Date: Thu Feb 1 08:32:30 2001 +0000
* Fast Solid State Drives (disk is now extended memory)


-### Problems have changed since 1999
+### Hardware has changed since 2001
+
+![](images/xeon-phi.jpg)
+
+* Multiple cores
+* 4 cores -- cheap laptop
+* 32 cores -- workstation
+* Distributed memory clusters in big data warehousing
+* Fast Solid State Drives (disk is now extended memory)
+
+
+### Problems have changed since 2001

* Larger datasets
* Messier data
@@ -67,12 +81,9 @@ Date: Thu Feb 1 08:32:30 2001 +0000

* The Global Interpreter Lock (GIL) stops two Python threads from
manipulating Python objects simultaneously
-* Can use multiple processes in simple cases
-* PyData could cheat the GIL
-
-because we rely on C/Fortran code
-
-but we don't take advantage of this
+* Solutions:
+  * Compute in separate processes (hard to share data)
+  * Release the GIL and use C/Fortran code
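The thread-based workaround can be sketched with the standard library (chunk boundaries here are arbitrary). Note the caveat in the comments: pure-Python `sum` does not release the GIL, so this shows the sharing pattern rather than a real speedup, which requires GIL-releasing C/Fortran kernels such as NumPy reductions:

```python
from concurrent.futures import ThreadPoolExecutor

# Split the work into chunks, one per thread.
chunks = [range(i * 1000, (i + 1) * 1000) for i in range(4)]

# Threads share memory, so no data is copied between workers.
# True parallelism, however, needs kernels that release the GIL
# (e.g. NumPy); pure-Python sums still execute one at a time.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(sum, chunks))

total = sum(partial_sums)
print(total)  # same answer as sum(range(4000))
```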


### PyData rests on single-threaded foundations
@@ -26,6 +26,7 @@

* [Bottleneck issue](https://github.com/kwgoodman/bottleneck)


### Final thoughts

[http://dask.pydata.org](http://dask.pydata.org)
@@ -34,4 +34,4 @@ Continuum Analytics

* Gigabyte - Fits in memory, need one core (laptop)
* Terabyte - Fits on disk, need ten cores (workstation)
-* Petabyte - Fits on many disks, need 1000 cores (distributed cluster)
+* Petabyte - Fits on many disks, need 1000 cores (cluster)
