python · petered · Apr 5, 2018 · Apr 5, 2018 · Apr 5, 2018 · Apr 5, 2018
diff --git a/pep-9999.rst b/pep-9999.rst
@@ -0,0 +1,136 @@
+=========
+Reduce-Map Comprehensions (and the "last" builtin)
+=========
+
+Abstract
+--------
+
+This PEP proposes the addition of new "Reduce-Map" comprehension.  The Reduce-Map comprehension would perform a reduction while keeping intermediate elements.  We also propose a built-in function "last", which, combined with a Reduce-Map comprehension, would allow us to replace the :code:`reduce` builtin with a call to get the last element of a Reduce-Map generator.
+
+Rationale
+--------
+
+One current awkwardness of Python comes about when you want to perform a chain of operations and save all intermediate processing steps.  Let's use the common example of computing an Exponential Moving Average.
+
+.. code-block:: python
+
+    def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float=0.):
+        running_average = []
+        average = initial_value
+        for xt in signal:
+            average = (1-decay)*average + decay*xt
+            running_average.append(average)
+        return running_average
+
+    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
+    smooth_signal = exponential_moving_average(signal, decay=0.05)
+
+A more savvy Pythonista may rewrite this as a generator:
+
+.. code-block:: python
+
+    def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float=0.):
+        average = initial_value
+        for xt in signal:
+            average = (1-decay)*average + decay*xt
+            yield average
+
+    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
+    smooth_signal = list(exponential_moving_average(signal, decay=0.05))
+
+Still, this seems like an awful lot of lines to express such a simple thing.
+
+
+Proposal
+--------
+
+**Reduce-Map Comprehensions**
+
+We propose the syntax:
+
+.. code-block:: python
+
+    smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]
+
+- Like in a reduce, the :code:`average` variable would be updated and carried forward through iterations.  
+- The :code:`from` keyword, normally reserved for imports, would be repurposed here to specify an initial value. 
+- If :code:`average` were not specified in the initial conditions, it could (1) be taken from the scope, or (2) raise a :code:`NameError`.
+
+**The "last" builtin**
+
+Because it may be a common case to only want the last element in this expression (as it is when you use reduce), we propose an additional :code:`last` builtin function to handle this nicely.  For example, if you just wanted the last element of the moving average, you could go: 
+
+.. code-block:: python
+
+    last_smooth_element = last(average = (1-decay)*average + decay*x for x in signal from average=0.)
+
+When used on a ReduceMap generator, the :code:`last` builtin would
+
+- Take the final value produced by the iterable (if the iterable yields any items) OR
+- Take the initial value (defined in the :code:`from ...` block) otherwise.
+
+When used on a normal generator (or a Reduce-Map generator without a :code:`from ...` initializer), :code:`last` would behave like:
+
+.. code-block:: python
+
+    def last_on_normal_generator(generator):
+        """The proposed `last` builtin, as it would behave on a non-ReduceMap generator"""
+        x = next(generator)
+        for x in generator:
+            pass
+        return x
+
+Like :code:`next`, this would throw a :code:`StopIteration` if given an empty generator with no :code:`from ...` initializer.
+
+
+Extension: Specifying the value to keep
+--------
+
+It may be the case that there are variables in the loop that you want to carry forward through the reduction, but that you do not want in the result.  An example that comes to mind is running a Recurrent Neural Network (RNN).  In an RNN, we have an update function:
+
+.. code-block:: python
+
+    def rnn_step(input_data, last_hidden_state):
+        """
+        :param Array[n_samples,n_input_dim] input_data: The input data at the current time step
+        :param Array[n_samples,n_hidden_dim] last_hidden_state: The hidden state from the last time step
+        :return Tuple[Array[n_samples,n_output_dim], Array[n_samples,n_hidden_dim]]: The output and hidden state.
+        """
+        ...  # Compute update here
+        return output_data, new_hidden_state
+
+
+Using the aformentioned Reduce-Map comprehension, we may run this as:
+
+.. code-block:: python
+
+   output_and_hidden_generator = (current_output, hidden = rnn_step(current_input, hidden) for current_input in input_timeseries from hidden=np.zeros((n_samples, n_hidden_dim)))
+
+
+Which would return a generator of 2-tuples.
+
+However, we generally do not want to keep the hidden values (they are just use to carry forward internal state).  The proposed extension is to enable the optional syntax: 
+
+.. code-block:: python
+
+   output_generator = (current_output, hidden = rnn_step(current_input, hidden) -> current_output for current_input in input_timeseries from hidden=np.zeros((n_samples, n_hidden_dim)))
+
+Where the :code:`-> current_output` at the end signifies that we only want to keep the :code:`current_output` at each iteration in the result.
+
+This saves us from the somewhat less readable alternatives of 
+
+- :code:`output_timeseries, _ = zip(*output_and_hidden_generator)` (which also prevents garbage collection of hidden states)
+- :code:`output_timeseries = list(oh[0] for oh in output_and_hidden_generator)` 
+
+And allows the persion building the generator to control which variables should be internal, rather than the user of the generator.  
+
+Just as before, if we're only interested in the last output, we can use:
+
+.. code-block:: python
+
+   final_output = last(current_output, hidden = rnn_step(current_input, hidden) -> current_output for current_input in input_timeseries from hidden=np.zeros((n_samples, n_hidden_dim)))
+
+Conclusion
+-----------
+
+I think there are many situations where the ReduceMap comprehension would be useful.  This construct is easily readable, would save many lines of code, and as a bonus would allow us to replace the :code:`reduce` builtin with a more readable, Pythonic comprehension.