---
numbering:
  title:
    offset: 1
---

(ch3.2)=
# Function Operations

:::{caution} Under Construction
We have not integrated the demos into the text yet. To start interacting with them, go to this [Notebook](https://datahub.berkeley.edu/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fds-modules%2FDATA-89&branch=main&urlpath=tree%2FDATA-89%2Fbasic_functions_week_3.ipynb).
:::

A function operation is a procedure we can apply to transform or combine functions. Function operations are the essential tools that make mathematical modeling expressive. They are also the key to breaking complicated functions down into bite-sized pieces. The better you get at recognizing functions, and the richer your album of mental images, the more efficiently you will be able to break formulas down into their pieces, visualize each piece, then visualize their combinations. 

:::{tip} Strategy
Try to expand your function as a transformation, or combination, of simpler pieces. Since visualizing transforms and combinations is hard, use as few pieces as you can.
:::

## Linear Transformations...
1. **to the Input:** These transforms are used to generalize almost every distribution family. They need to be instinctive. 
    - **Horizontal Translation:** Replace $f(x)$ with $f(x - s)$ to **translate** the function horizontally by a shift $s$. For example, $f(x - 3)$ looks like $f(x)$ shifted horizontally to the right by 3 units. 

    ![Horizontal Translation](normal_shift_example.svg "Horizontal Translation")

    - **Dilation:** Replace $f(x)$ with $f(x/a)$ for some $a \neq 0$ to **dilate** the function. You can think of $a$ as controlling a zoom factor on the horizontal axis.
        - Using $0 < a < 1$ compresses the function by making it narrower. 
        - Using $a > 1$ expands the function by making it wider. For example, setting $a = 3$ makes the function three times wider.

        ![Horizontal Dilation](normal_dilation_example.svg "Horizontal Dilation")

        - If $a < 0$ then the result is also reflected about the vertical line $x = 0$. 

    - **Generic:** Replace $f(x)$ with $f((x - s)/a)$.

1. **to the Output:** We can apply the same operations to the outputs of functions. 
    - **Vertical Translation:** Replace $f(x)$ with $f(x) + h$ to **translate** the function vertically by a height $h$. For example, $f(x) + 2$ looks like $f(x)$ shifted vertically by 2 units.

    ![Vertical Translation](vertical_translation_example.svg "Vertical Translation")

    - **Vertical Scaling:** Replace $f(x)$ with $c f(x)$ to **scale** the function. You can think of $c$ as controlling a zoom factor on the vertical axis. 
        - Using $0 < c < 1$ shrinks the function by making it shorter. For example, replacing $f(x)$ with $\frac{1}{3} f(x)$ compresses $f$ vertically by a factor of 3.

        ![Vertical Scaling](vertical_scaling_example.svg "Vertical Scaling")

        - Using $c > 1$ expands the function by making it taller.
        - If $c < 0$ then the function is also reflected about the horizontal axis.

    - **Generic:** Replace $f(x)$ with $c f(x) + h$.
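
The input and output transformations listed above can be sketched in a few lines of Python. This is only an illustrative sketch: the example function `bump` and the helper names `transform_input` and `transform_output` are assumptions made for this example, not part of the course demos.

```python
import math

def bump(x):
    """Example function f: an unnormalized bell curve centered at 0."""
    return math.exp(-0.5 * x * x)

def transform_input(f, s=0.0, a=1.0):
    """Return x -> f((x - s) / a): shift right by s, widen by a factor of |a|."""
    if a == 0:
        raise ValueError("the dilation a must be nonzero")
    return lambda x: f((x - s) / a)

def transform_output(f, c=1.0, h=0.0):
    """Return x -> c * f(x) + h: scale vertically by c, then shift up by h."""
    return lambda x: c * f(x) + h

shifted = transform_input(bump, s=3.0)           # peak moves from x = 0 to x = 3
widened = transform_input(bump, a=3.0)           # three times wider
raised = transform_output(bump, c=1 / 3, h=2.0)  # one third as tall, lifted by 2

print(shifted(3.0), bump(0.0))  # equal: the peak now sits at x = 3
print(widened(3.0), bump(1.0))  # equal: x = 3 now plays the role of x = 1
print(raised(0.0))              # (1/3) * f(0) + 2
```

Returning new functions, rather than arrays of values, lets the two kinds of transformation be chained in either order.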

:::{important} Maintaining Normalization

All distribution functions must be normalized. For instance, in [Section 2.4](#ch2.4) we saw that, for any PDF:

$$\int_{x = -\infty}^{\infty} \text{PDF}(x) dx = 1. $$

It is standard practice to define a generic family of models by setting $\text{PDF}(x) \propto g((x - s)/a)$ for some nonnegative function $g$, shift $s$, and horizontal dilation $a$. These parameters are often called a **location** and a **scale** parameter. The location parameter controls the horizontal position of the distribution. The scale parameter controls its breadth. 

The $\propto$ notation hides the normalization constant. This is useful, since the essence of a distribution is its shape, which is determined by the functional form $g$. However, when $g$ depends on some free parameters, then we should always remember that:

$$\text{PDF}(x) = c(s,a) g((x - s)/a) $$

for some constant $c(s,a)$ that also depends on the parameters. 

The location parameter has no effect on the normalizing constant since it just shifts the distribution. The scale parameter does. It can make the distribution wider or narrower. Just like a rectangle, if we make a distribution twice as wide, we double its area. So, to keep the distribution normalized, we must also make it half as tall. 

Generically:

$$\text{PDF}(x) = \frac{C}{|a|} g \left(\frac{(x - s)}{a} \right) $$

where $C$ is just a number determined by $g$ ($C = 1/(\int_{x = -\infty}^{\infty} g(x) dx)$). That way, if we make the distribution wider by adjusting the scale parameter we also make it shorter.

:::
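
The claim that dividing by $|a|$ preserves normalization can be checked numerically. The sketch below assumes a bell-shaped $g$ and uses a simple midpoint rule on a wide finite interval as a stand-in for the integral over the whole real line. The helper names `integrate` and `make_pdf`, and the parameter values tried, are hypothetical choices for illustration.

```python
import math

def integrate(f, lo, hi, n=20000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) for i in range(n)) * dx

def g(x):
    """Unnormalized shape function (an assumed example): a bell curve."""
    return math.exp(-0.5 * x * x)

# C depends only on g. The interval [-50, 50] stands in for the real line.
C = 1.0 / integrate(g, -50.0, 50.0)

def make_pdf(s, a):
    """Location-scale family: PDF(x) = C / |a| * g((x - s) / a)."""
    return lambda x: C / abs(a) * g((x - s) / a)

areas = {}
for s, a in [(0.0, 1.0), (3.0, 1.0), (0.0, 2.0), (-1.0, 0.5)]:
    areas[(s, a)] = integrate(make_pdf(s, a), -50.0, 50.0)
    print(f"s = {s}, a = {a}: area = {areas[(s, a)]:.6f}")
```

Each printed area should be very close to 1. Doubling $a$ halves the peak height, which is what keeps the area fixed.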

Run the code cell below to visualize linear transformations of the inputs and outputs of a function.

🦺🔨🧱 Under construction. Here soon!

## Function Combinations:

:::{tip} Decomposing Functions
Every function studied in this class can be broken into simpler parts that are combined either by an algebraic operation or via a composition. Look for ways to break functions into recognizable parts. 
:::

1. **Algebraic Combination:** 
    - **Function Addition and Multiplication:** As they sound, $f(x) + g(x)$ or $f(x) \times g(x)$. 
        - Visualize the function sums like a stacked plot where the two functions sit on top of one another. 
        - Visualizing function products takes practice, and is often best left to the tools from [Section 3.1](#ch3.1) and [Section 3.3](#ch3.3). Unfortunately, many distributions are expressed as products of functions. When given a product, always check the roots and sign of each factor separately. 
    - **Linear Combination:** This is a special version of function addition. It looks like $a f(x) + b g(x)$ for some coefficients $a$ and $b$ that scale each term in the combination.  
        - You can visualize a linear combination by drawing its two component functions, $a f(x)$ and $b g(x)$, separately, then adding them together to produce the combination. In the figure below, the green and blue bumps are the component functions. The red curve is their linear combination. Varying $a$ or $b$ makes the associated bump taller or shorter.

        ![Linear Combination](Linear_Combination.png "Linear Combination")

        - Alternately, you can use a stacked plot convention where you first draw $a f(x)$, then you draw $a f(x) + b g(x)$ where the difference between your first curve and your second curve is $b g(x)$. Here's the same combination, using a stacked convention. In this example we drew the blue bump first, then added the green bump on top of it.

        ![Linear Combination Stacked](Linear_Combination_Stacked.png "Linear Combination Stacked")

        - Important examples in probability are mixture distributions.

        :::{tip} Mixture Distributions
        :class: dropdown

        To construct a mixture, sample in stages. For example, suppose that we have two large populations. We first pick a population to sample from at random, then, from that population, draw a sample of $n$ individuals. Count the number of the sampled individuals who have a characteristic of interest. Suppose that, in the first population, 1 in 5 individuals have the characteristic, and in the second, 2 of 3 do. 
        
        This process can be modelled as follows. First, draw a Bernoulli random variable $I \sim \text{Bernoulli}(p)$ where $p$ is the chance we select the second sample population. Then, if $I = 0$, draw $X \sim \text{Binomial}(n,1/5)$. If $I = 1$, draw $X \sim \text{Binomial}(n,2/3)$. 
        
        :::{caution} A Minor Caveat
        These shouldn't be exactly Binomial since we usually sample without replacement, but, if the population is much larger than $n$, it's not a bad approximation.
        :::

        Then, what is the PMF for $X$? Well, we can find the chance that $X = x$ by partitioning, then using the multiplication rule. Alternately, draw an outcome tree. In either case:

        $$\begin{aligned}
        \text{PMF}(x) & = \text{Pr}(X = x) = (1 - p) \binom{n}{x} \left(\frac{1}{5} \right)^x \left(\frac{4}{5} \right)^{n - x} +  p \binom{n}{x} \left(\frac{2}{3} \right)^x \left(\frac{1}{3} \right)^{n - x}  \\
        & = (1 - p) \text{PMF}_{X|I = 0}(x) + p \text{PMF}_{X|I = 1}(x)
        \end{aligned}$$

        where $\text{PMF}_{X|I = 0}(x)$ is the PMF when we draw from the first population, and $\text{PMF}_{X|I = 1}(x)$ is the PMF when we draw from the second. The resulting PMF is a *mixture* of the two PMFs since it is a linear combination of the two with weights $1 - p$ and $p$.

        :::
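
The mixture PMF from the two-population example can be evaluated directly as a linear combination of two Binomial PMFs. This is a minimal sketch: the sample size `n = 10` and selection chance `p = 0.4` are made-up values, and the function names are chosen only for illustration.

```python
from math import comb

def binomial_pmf(x, n, q):
    """Chance of x successes in n independent trials with success chance q."""
    return comb(n, x) * q**x * (1 - q) ** (n - x)

def mixture_pmf(x, n, p):
    """Weight 1 - p on Binomial(n, 1/5) and weight p on Binomial(n, 2/3)."""
    return (1 - p) * binomial_pmf(x, n, 1 / 5) + p * binomial_pmf(x, n, 2 / 3)

n, p = 10, 0.4  # hypothetical sample size and chance of picking population two
pmf = [mixture_pmf(x, n, p) for x in range(n + 1)]

print(sum(pmf))  # should equal 1 up to rounding error
for x, value in enumerate(pmf):
    print(x, round(value, 4))
```

Because the weights $1 - p$ and $p$ sum to one, and each component PMF sums to one, the mixture is automatically normalized.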


Run the code cell below to visualize function addition and multiplication. 

🦺🔨🧱 Under construction. Here soon!

2. **Function Composition:** The **composition** of $h$ and $g$ is $(h \circ g)(x) = h(g(x))$. Many distributions are expressed as compositions. 
    - To visualize an arbitrary function composition, proceed as follows: 
        1. Draw the inner function, $g(x)$, and the outer function $h(x)$. Clearly distinguish them with different colors or markers so we don't mix them up. 
        1. Add to your plot the line $y = x$. This line is useful since we can use it to exchange inputs and outputs.
        1. Work an input at a time. Pick some $x$. Add a point at $(x,0)$ on the x-axis. Trace a lightly dashed line vertically upwards so we can remember where we started.
            - Next, add a point at $(x,g(x))$ where your dashed vertical meets $g(x)$. We've now produced the output of the inner function. 
            - To pass the output of the inner function into the input of the outer function, trace horizontally across from $(x,g(x))$ to $(g(x),g(x))$. This is the intersection of the horizontal line passing through $(x,g(x))$ with the $y = x$ line. Then, trace vertically from $(g(x),g(x))$ to $(g(x),h(g(x)))$. This is the intersection of the vertical line leaving $(g(x),g(x))$ with the outer function $h$. 
            -  We've now recovered $h(g(x))$. To plot it at the correct input, trace horizontally until you intercept the lightly dashed line leaving the original $x$. That is, from $(g(x),h(g(x)))$ to $(x,h(g(x)))$.
    - This process is a bit involved at first, but it's a nice visual procedure. Once you get the hang of it, you can use it to very quickly evaluate compositions of arbitrary $h$ and $g$. Just repeat the process for a bunch of different $x$ values. It is good practice to try this by hand at least once. 
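
The graphical construction above can also be written as a short routine that returns the corner points visited for one input. This is a sketch under assumed choices: the inner function is the negated quadratic $g(x) = -\frac{1}{2} x^2 + 1$, the outer function is the exponential, and the name `trace_composition` is hypothetical.

```python
import math

def g(x):
    """Inner function: a negated quadratic (concave)."""
    return -0.5 * x * x + 1.0

def h(y):
    """Outer function: the exponential (increasing and nonnegative)."""
    return math.exp(y)

def trace_composition(x, inner, outer):
    """Return the corner points visited by the graphical construction."""
    y = inner(x)
    z = outer(y)
    return [
        (x, 0.0),  # start on the x-axis
        (x, y),    # up to the inner function
        (y, y),    # across to the line y = x
        (y, z),    # vertically to the outer function
        (x, z),    # back across to the original input
    ]

path = trace_composition(1.0, g, h)
for point in path:
    print(point)
```

The last point in the path is $(x, h(g(x)))$, a point on the graph of the composition. Repeating the trace over many inputs sketches the whole composite curve.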
    
Run the code cell below to visualize the composition of two functions. The dashed orange lines represent the procedure provided above. Try building up an example composition. A good place to start is $f(x) = e^{-\frac{1}{2} x^2 + 1}$, where the inner function $g(x) = -\frac{1}{2} x^2 + 1$ is a negated quadratic and the outer function $h(x) = e^x$ is an exponential. You'll practice with this example in discussion.

- This is an example where the inner function is concave, and the outer function is both monotonically increasing and nonnegative. This recipe, *inner concave*, *outer monotonically increasing and nonnegative*, is a good procedure for building density functions. The outer function ensures that the composition returns a nonnegative number. It is usually selected so that it converges to zero in the limit as its input approaches the smallest possible output of the inner function. 
- As another example, try $f(x) = h(g(x))$ with $g(x) = 0.2 \times(1  + x^2)$ and $h(x) = 1/x$. 

![Composite](Composite.png "Composite")

🦺🔨🧱 Under construction. Here soon!

## Inverses:

If $f$ is strictly monotonic, then its inverse, $f^{-1}$, is the function that accepts outputs of $f$ and returns the matching input.
- In other words, given $f(x) = y$, $f^{-1}(y) = x$. 

- Inverses are constructed by reflecting $f$ about the $y = x$ line (exchange inputs and outputs).
    - To reflect, do the following:
        1. Draw $f(x)$.
        1. Draw the line $y = x$ which exchanges inputs and outputs.
        1. Sketch the reflection of $f$ across $y = x$. 
        1. If you struggle to sketch the reflection, work one input at a time:
            - Select some $x$. Add a point at $(x,f(x)).$
            - Trace horizontally to $(f(x),f(x))$. This is the intersection of the horizontal line through $(x,f(x))$ with the $y = x$ line. 
            - Trace vertically from $(x,f(x))$ to $(x,x)$. 
            - You now have two sides, and three corners, of a square. Complete the square by adding in the missing corner at $(f(x),x)$. You have now swapped the inputs and outputs of $f$. This new point is the reflection. 
        1. Repeat this process for many inputs. The resulting curve is the inverse function since it accepts outputs of $f$, and returns the matching inputs. 
            - The image below shows an example. The blue function is $f$, the dashed grey line is the $y = x$ line that matches inputs and outputs, and the red curve is the inverse produced by reflecting across $y = x$. The orange filled square is the square produced by the process described above.

![Inverse Example](Inverse_function.png "Example Inverse")
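
The reflection procedure amounts to swapping the coordinates of each point on the graph of $f$. The sketch below assumes an increasing example function, $f(x) = x^3 + x$, chosen only for illustration. The helper names `reflection_square` and `inverse_points` are hypothetical.

```python
def f(x):
    """Example increasing function (an assumed choice for illustration)."""
    return x**3 + x

def reflection_square(x, func):
    """Corners of the square used to reflect (x, f(x)) across y = x."""
    y = func(x)
    # Start point, corner on y = x, second corner on y = x, reflected point.
    return [(x, y), (y, y), (x, x), (y, x)]

def inverse_points(func, xs):
    """Swap inputs and outputs to get points on the graph of the inverse."""
    return [(func(x), x) for x in xs]

xs = [i / 4 for i in range(-8, 9)]  # inputs from -2 to 2
reflected = inverse_points(f, xs)

print(reflection_square(1.0, f))
print(reflected[0], reflected[-1])
```

Each pair in `reflected` has the form $(f(x), x)$, so plotting the pairs traces the inverse: the first coordinate is an output of $f$ and the second is the input that produced it.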

- The most important example in probability is the pair formed by the exponential and logarithm functions.
- Run the code cell below to visualize function inverses. Start with a linear function, and see how the inverse varies as we vary the initial function. The square you see represents the graphical construction outlined above. It is good practice to try this by hand at least once. 
    - Once you've run the code below, go back to the composition demo provided above, and pick inner and outer functions that are related by an inverse. For example, $e^x$ and $\log(x)$. Then, the graphical construction used to create the composite will trace the boundary of the reflecting square, always returning to the $(x,x)$ corner. In other words, $f^{-1}(f(x)) = x$. 
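
The round trip through an inverse pair can be checked numerically. This sketch takes $f(x) = e^x$ as the inner function and $\log$ as the outer function. The helper name `round_trip_corners` and the test inputs are illustrative assumptions.

```python
import math

def round_trip_corners(x):
    """Corners visited when composing log (outer) with exp (inner) at x."""
    y = math.exp(x)     # inner function f
    back = math.log(y)  # outer function, the inverse of f
    return [(x, y), (y, y), (y, back), (x, back)]

inputs = [-2.0, 0.0, 0.5, 3.0]
for x in inputs:
    corners = round_trip_corners(x)
    # The final corner should sit on the y = x line, up to rounding error.
    print(x, corners[-1])
```

Up to floating point rounding, the final corner is $(x, x)$, so the trace follows the boundary of the reflecting square and returns to the $y = x$ line.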

🦺🔨🧱 Under construction. Here soon!