# A2. Vectorized thinking

<div class="alert alert-warning">**Goal of this notebook:**
Understand some more of the systems context behind vectorized thinking.
</div>

Vectorized thinking is great for conciseness. Surely no one would prefer
```
assert len(xs) == len(ys)
[xs[i] + ys[i] for i in range(len(xs))]
```
when they can just write
```
xs + ys
```
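The two snippets above only agree if `xs` and `ys` are array-like objects that overload `+` elementwise. The sketch below assumes NumPy arrays, which the text does not name explicitly; with plain Python lists, `+` concatenates instead.

```python
import numpy as np

# Small example vectors (illustrative values, not from the notes).
xs = np.array([1.0, 2.0, 3.0])
ys = np.array([10.0, 20.0, 30.0])

# Explicit element-by-element version, written as a list comprehension.
assert len(xs) == len(ys)
looped = [xs[i] + ys[i] for i in range(len(xs))]

# Vectorized version: NumPy applies + elementwise to the whole arrays.
vectorized = xs + ys

# Caution: on plain Python lists, + concatenates rather than adds.
concatenated = [1.0, 2.0] + [10.0, 20.0]
```

Here `vectorized` holds the same three sums as `looped`, while `concatenated` is a four-element list.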

But vectorized coding is perhaps more important for performance. Suppose `xs` and `ys` are both very large vectors. With a list comprehension, it's almost impossible for the compiler to figure out that the operation can be parallelized, with each core of a multicore machine working on its own chunk of the data.

Here's another example. Suppose we have a vector `xs` where each entry is the contents of a web page. The entire vector is 100 TB, it's sharded across machines, and we want to compute
```
m = float('-inf')
for x in xs:
    m = max(m, f(x))
```
where `f` is some scoring function that is slow to evaluate. Again, it's hard for a compiler to see how to achieve parallelism when the function is written out as an explicit iteration. (Maybe not hard in this case, but hard in the general case!) But it's easy to see how the computation _could_ be distributed: tell each machine in the cluster to work on the fragment of data it can reach and compute the maximum over its fragment, then assemble the maximums from each machine. If we write the computation abstractly, e.g.
```
reduce(pairwise_max, map(f, xs), float('-inf'))
```
then the structure of the computation is explicit, and the compiler has a chance to figure out how to distribute it.
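The "maximum per fragment, then maximum of maximums" plan can be sketched on a single machine. This is only an illustration under stated assumptions: `f` is a hypothetical stand-in scoring function, `shards` is a made-up nested list standing in for the data each machine can reach, and a thread pool stands in for the cluster. Note that Python's `functools.reduce` takes the initial value as its third argument.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def f(x):
    # Hypothetical stand-in for the slow scoring function: page length.
    return len(x)


def pairwise_max(a, b):
    # Maximum of two values; associative, so partial results can be combined.
    return a if a >= b else b


def shard_max(shard):
    # Work done by one "machine": the maximum score over its own fragment.
    return reduce(pairwise_max, map(f, shard), float("-inf"))


# Hypothetical shards: each inner list is the fragment one machine holds.
shards = [["a", "bbb"], ["cc"], ["dddd", "e"]]

# Each shard is processed independently; threads stand in for machines.
with ThreadPoolExecutor() as pool:
    partial_maxes = list(pool.map(shard_max, shards))

# Assemble the per-shard maximums into the overall maximum.
m = reduce(pairwise_max, partial_maxes, float("-inf"))
```

The split is valid because taking a maximum is associative: combining per-shard maximums gives the same answer as one pass over all the data. A real system would replace the thread pool with whatever distributed framework holds the shards.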

Vectorized thinking means avoiding `for` loops and instead writing our computations in a way that shows our intention more clearly, to give the compiler a chance to figure out what can be distributed and parallelized.