# 3.2 Asymptotic notation: formal definitions

(3.2notation)=
## Remarks on abuses of notation

A few remarks on $O$-notation, $\Omega$-notation and $\Theta$-notation are in order, as the abuse of writing a formula in place of a function leads to some confusion.

We denote by $\mathbb{Z}=\left\{\ldots,-2,-1,0,1,2,\ldots\right\}$ the set of integers, and by $\mathbb{R}$ the set of real numbers. Subscripts with unary predicates (possibly curried from many-ary predicates) are used to denote restricted sets. So, for example, $\mathbb{Z}_{\geq 0} = \left\{n\in\mathbb{Z}\mid n\geq 0\right\}$ is a way to denote the set of ***natural numbers***. From now on, we only consider functions which are defined at least on a set of the form $\mathbb{Z}_{\geq a}$ for some $a\in\mathbb{Z}$ and which are asymptotically nonnegative (as defined in the book). As all definitions deal with limits at infinity, we do not need to worry too much about domains of functions being the same in our setting.

First, recall that a function does not depend on any variable. So a proper mathematical definition of $O$-notation would be as follows:

> Given an (asymptotically nonnegative) function $f$, we let
> \begin{equation*}O(f) = \left\{ g : \limsup_{n\to\infty}\dfrac{g(n)}{f(n)}<\infty\right\}.\end{equation*}

The other asymptotic notations ($\Omega$, $\Theta$, $o$ and $\omega$) are defined similarly.
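
As a purely numerical illustration of the definition of $O(f)$, the following Python sketch evaluates the largest value of $g(n)/f(n)$ over a finite range of $n$. The names `tail_ratio_sup`, `g` and `f`, as well as the sample functions, are assumptions made for this example. A finite computation can only suggest, and never prove, that the limit superior is finite.

```python
from typing import Callable


def tail_ratio_sup(g: Callable[[int], float],
                   f: Callable[[int], float],
                   start: int,
                   stop: int) -> float:
    """Largest value of g(n)/f(n) for start <= n < stop.

    This is only a finite stand-in for the limit superior.
    """
    return max(g(n) / f(n) for n in range(start, stop))


def g(n: int) -> float:
    # Hypothetical sample function: g(n) = 3n^2 + 10n.
    return 3 * n * n + 10 * n


def f(n: int) -> float:
    # Hypothetical comparison function: f(n) = n^2.
    return n * n


# Here g(n)/f(n) = 3 + 10/n, which decreases towards 3,
# so the maxima over later and later ranges stay bounded.
for start, stop in [(10, 1_000), (1_000, 10_000), (10_000, 20_000)]:
    print(start, stop, tail_ratio_sup(g, f, start, stop))
```

In this example the printed values approach 3 from above, which is consistent with $g\in O(f)$.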

Of course, when there is no confusion, we may always specify a variable to be used for all formulas, and identify a function with a formula in the usual manner, and we will not refrain from doing this. However, this ought to be avoided when more than one variable is being used, as inconsistencies could arise. For example, as done in the book (p. 58), we allow ourselves to use $O(f)$ in place of a function $g\in O(f)$. This is well and good for simple expressions. In particular, the expression "$\sum_{i=1}^n O(i)$" ought to be interpreted as a function of the form
\begin{equation*}\sum_{i=1}^n f_i,\end{equation*}
where $f_i\in O(i)$ for each $i$. That is, $f_1\in O(1)$, $f_2\in O(2)$, $\ldots$. However, the exact opposite is said in the book (p. 58), where it is stated that "*in the expression $\sum_{i=1}^n O(i)$ there is a single anonymous function (a function of $i$)*".

This problem arises from another convention: "*The number of anonymous functions in an expression is understood to be equal to the number of times the asymptotic notation appears.*" So, in fact, sometimes "$O(f(n))$" should be read as the **value** $g(n)$ for some $g\in O(f)$ and some parameter $n$, and not as the function $g$. This ought to be more precisely written as $[O(f)](n)$.

With this interpretation, $\sum_{i=1}^n O(f(i))$ actually stands for $\sum_{i=1}^n g(i)$ (which is a value depending on $n$) for some function $g\in O(f)$. This applies in particular to $\sum_{i=1}^n O(i)$, as long as we want this to denote the value (depending on $n$) of some function $g$, by taking $f$ to be the identity function in the previous sentence.
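
To make the "single anonymous function" reading concrete, here is a minimal Python sketch. The function `g` (with $g(i)=3i+5$) is a hypothetical member of $O(i)$ chosen only for illustration, and `summed` is an assumed name for the resulting function of $n$.

```python
def g(i: int) -> int:
    """A hypothetical member of O(i): g(i) = 3i + 5."""
    return 3 * i + 5


def summed(n: int) -> int:
    """The value sum_{i=1}^{n} g(i), which depends on n only."""
    return sum(g(i) for i in range(1, n + 1))


for n in (10, 100, 1_000):
    # Closed form for this g: 3n(n+1)/2 + 5n, so the ratio to n^2 tends to 3/2.
    print(n, summed(n), summed(n) / n**2)
```

The same function `g` is evaluated at every index $i$, and the result is a single function of $n$. For this particular choice of `g` it grows like $n^2$.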

Similar problems appear in items (f) and (g) of Problem 3-5, as the question statements are not well-formed (using undefined notation).

## Remarks on running times

Recall that ***running time*** is a notion that depends on a specific input, and not solely on the algorithm being executed (as is explicitly stated on p. 30). For an algorithm (or procedure) $A$ and a valid input $x$, we denote by $\operatorname{time}(A,x)$ the time that algorithm $A$ takes to evaluate $A(x)$.

However, we are interested in analysing running time with respect to some notion of "size" of an input, which itself depends on context: If an input is an array (as in sorting algorithms), the "size" is (usually) the length of the array; if the input is an integer (as in an algorithm which checks whether a number is prime or not), its "size" could be its absolute value.

So, to properly define "*running time*", we need a notion of "size", which should be clear (even if implicit) from context. The following definitions are the ones used throughout the book:

> The **worst-case running time** of an algorithm $A$ is the function
> \begin{equation*}n\longmapsto \sup\left\{\operatorname{time}(A,x):x\text{ has size }n\right\},\end{equation*}
> and the **best-case running time** of $A$ is the function
> \begin{equation*}n\longmapsto \inf\left\{\operatorname{time}(A,x):x\text{ has size }n\right\}.\end{equation*}
>
> We say that algorithm $A$ has **running time $O(f)$** if its *worst*-case running time is so. Similarly, algorithm $A$ has **running time $\Omega(f)$** if its *best*-case running time is so, and **running time $\Theta(f)$** if it has both running time $O(f)$ and $\Omega(f)$.
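
The definitions above can be explored on a small example. The following Python sketch makes two assumptions that are not part of the text: "time" is taken to be the number of key comparisons made by insertion sort, and the inputs of size $n$ are taken to be the permutations of `range(n)` (for a comparison-based sort on distinct keys, only the relative order of the keys matters). The names `insertion_sort_comparisons` and `worst_and_best` are hypothetical.

```python
from itertools import permutations
from typing import Iterable, Tuple


def insertion_sort_comparisons(items: Iterable[int]) -> int:
    """Sort a copy of `items` by insertion sort; return the number of key comparisons."""
    a = list(items)
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1      # one comparison of a[i] with key
            if a[i] <= key:
                break
            a[i + 1] = a[i]       # shift the larger element to the right
            i -= 1
        a[i + 1] = key
    return comparisons


def worst_and_best(n: int) -> Tuple[int, int]:
    """(sup, inf) of the comparison count over all permutations of range(n)."""
    counts = [insertion_sort_comparisons(p) for p in permutations(range(n))]
    return max(counts), min(counts)


for n in range(1, 7):
    print(n, worst_and_best(n))
```

Under these assumptions the supremum equals $n(n-1)/2$ (attained by the reversed input) and the infimum equals $n-1$ (attained by the sorted input), so this cost measure is $O(n^2)$ and $\Omega(n)$ in the sense defined above.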

Let us finish this discussion with a simple theorem. It has obvious analogues for $\Omega$ and $\Theta$ asymptotics, which we omit.

> **Theorem**: Algorithm $A$ has running time $O(f)$ if and only if for any sequence of inputs $(x_n)_{n=1}^\infty$ with each $x_n$ of size $n$, we have $\operatorname{time}(A,x_n)=O(f(n))$.

Indeed, the "only if" part is immediate from the definition of supremum, so we take care of the "if" part. Assume that the latter property is true. Let $w\colon n\mapsto w(n)$ denote the worst-case running time function (whose parameter is the input size $n$).

First, note that we can assume that $f(n)>0$ for all sufficiently large $n$. If this is not the case, then there are infinitely many integers $n$ such that $f(n)=0$. Then for all sufficiently large such $n$ and all choices $x_n$ of inputs of size $n$ we have $\operatorname{time}(A,x_n)=0$, by the definition of $O(f)$, from which it follows that $w(n)=0$ for all such $n$ as well. Thus, we can simply ignore those $n$ for which $f(n)=0$.

Similarly, we can also assume that $w(n)<\infty$ for all $n$, since $w(n)=\infty$ would imply $f(n)=\infty$ as well for sufficiently large $n$.

For any $n\in\mathbb{N}$, choose an input $x_n$ of size $n$ such that $\operatorname{time}(A,x_n)>w(n)-\dfrac{f(n)}{n}$. Then

\begin{align*}
\limsup_{n\to\infty}\dfrac{w(n)}{f(n)}
  &\leq\limsup_{n\to\infty}\dfrac{\operatorname{time}(A,x_n)+f(n)/n}{f(n)}\\
  &=\limsup_{n\to\infty}\left(\dfrac{\operatorname{time}(A,x_n)}{f(n)}+\dfrac{1}{n}\right)\\
  &<\infty,
\end{align*}
where the last step uses the assumed property of the sequence $(x_n)$.

Therefore, $w(n)=O(f(n))$, which means that $A$ has running time $O(f)$.

## 3.2-1

> Let $f(n)$ and $g(n)$ be asymptotically nonnegative functions. Using the basic definition of $\Theta$-notation, prove that $\max\left\{f(n),g(n)\right\} = \Theta(f(n)+g(n))$.

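For all sufficiently large $n$ we have $f(n)\geq 0$ and $g(n)\geq 0$, so
\begin{equation*}\max\{f(n),g(n)\}\leq f(n)+g(n)\leq 2\max\{f(n),g(n)\}.\end{equation*}
Thus the definition of $\Theta$-notation is satisfied with the constants $c_1=1/2$ and $c_2=1$.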

## 3.2-2

> Explain why the statement "The running time of algorithm $A$ is at least $O(n^2)$" is meaningless.

The dry answer is that the phrase "$f$ is at least $O(g)$" ($f$ being a function, e.g. the running time of algorithm $A$) has never been defined and does not make sense, not even in terms of its components. So there we go: it is meaningless.

If we want a more interesting answer, we can recall that, in common usage, saying that "*$x$ is at least $y$*" means that $x$ is greater than or equal to $y$ in some sense depending on context. As we are comparing functions asymptotically, to say that "_$f$ is at least $O(g)$_" should mean that "$f\geq O(g)$", or that "_$f\geq g'$ for some $g'\in O(g)$_" (again using the convention that "$O(g)$" denotes an anonymous function in this class).

However, ***every*** function satisfies this: simply take $g'=0$. So the phrase is vacuous and useless with this interpretation.

This is in contrast to what happens if we apply the same reasoning to $\Omega$, which becomes meaningful (although unnecessary): Saying that "*$f$ is at least $\Omega(g)$*" ought to mean that "*$f\geq g'$ for some $g'\in\Omega(g)$*", which, by transitivity, simply means that $f\in\Omega(g)$. This is exactly what the book means by saying that "_$f$ grows **at least as fast** as a certain rate_" (in this case, the rate of growth of $g$).

## 3.2-3

> Is $2^{n+1}=O(2^n)$? Is $2^{2^n}=O(2^n)$?

We have
\begin{equation*}\limsup_{n\to\infty}\dfrac{2^{n+1}}{2^n}=\limsup_{n\to\infty}2 = 2<\infty,\end{equation*}
so $2^{n+1}=O(2^n)$. (Actually, $2^{n+1}=\Theta(2^n)$.)

As for the second part,
\begin{equation*}\limsup_{n\to\infty}\dfrac{2^{2^n}}{2^n}=\limsup_{n\to\infty}2^{2^n-n}=\infty,\end{equation*}
as $\lim_{n\to\infty}(2^n-n)=\infty$. So $2^{2^n}\neq O(2^n)$.
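
The two ratios above can also be tabulated exactly for small $n$. This is only a sanity check of the computation, not a proof. The helper names `ratio_first` and `ratio_second` are assumptions made for this sketch.

```python
from fractions import Fraction


def ratio_first(n: int) -> Fraction:
    """Exact value of 2^(n+1) / 2^n."""
    return Fraction(2 ** (n + 1), 2 ** n)


def ratio_second(n: int) -> Fraction:
    """Exact value of 2^(2^n) / 2^n, which equals 2^(2^n - n)."""
    return Fraction(2 ** (2 ** n), 2 ** n)


for n in range(1, 7):
    # The first ratio stays constant, the second grows without bound.
    print(n, ratio_first(n), ratio_second(n))
```

The first column of ratios is constantly 2, while the second is $2^{2^n-n}$ and already exceeds $10^{17}$ at $n=6$.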

## 3.2-4

> Prove Theorem 3.1.

This is immediate from the definitions.

## 3.2-5

> Prove that the running time of an algorithm is $\Theta(g(n))$ if and only if its worst-case running time is $O(g(n))$ and its best-case running time is $\Omega(g(n))$.

This was discussed above, before the exercise solutions, as "running time" of an algorithm is not really well-defined in the book.

## 3.2-6

> Prove that $o(g(n))\cap \omega(g(n))$ is the empty set.

We can do even better: $O(g(n))\cap \omega(g(n))$ is the empty set (and, similarly, $o(g(n))\cap \Omega(g(n))$ is also the empty set).

Indeed, if $f(n)=O(g(n))$, then there exists $c>0$ such that, for sufficiently large $n$,
\begin{equation*}f(n)\leq cg(n).\end{equation*}
But if also $f(n)=\omega(g(n))$, then also for all sufficiently large $n$ we have
\begin{equation*}c g(n) < f(n),\end{equation*}
which is a contradiction to the previous inequality.

## 3.2-7

> We can extend our notation to the case of two parameters $n$ and $m$ that can go to $\infty$ independently at different rates. For a given function $g(n,m)$, we denote by $O(g(n,m))$ the set of functions
> \begin{equation*}\begin{array}{rcl}
    O(g(n,m))=\left\{ f(n,m) \right.&:
        &\text{there exist positive constants } c, n_0, \text{ and }m_0\\
        &&\text{such that }0 \leq f(n,m) \leq c g(n,m)\\
        &&\left.\text{for all }n\geq n_0\text{ or }m\geq m_0\right\}.\end{array}\end{equation*}
> Give corresponding definitions for $\Omega(g(n,m))$ and $\Theta(g(n,m))$.

\begin{equation*}\begin{array}{rcl}
    \Omega(g(n,m))=\left\{ f(n,m) \right.&:
        &\text{there exist positive constants } c, n_0, \text{ and }m_0\\
        &&\text{such that }0 \leq c g(n,m) \leq f(n,m)\\
        &&\left.\text{for all }n\geq n_0\text{ or }m\geq m_0\right\},\end{array}\end{equation*}
and $\Theta(g(n,m))=O(g(n,m))\cap\Omega(g(n,m))$.