% Copyright (c) 2023 Carl Martin Ludvig Sinander.
% This program is free software: you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation, either version 3 of the License, or
% (at your option) any later version.
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
% You should have received a copy of the GNU General Public License
% along with this program. If not, see <https://www.gnu.org/licenses/>.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
There is a single agent (or `buyer') and one indivisible unit of a good.
The agent's valuation $t \in [0,1]$ for the good is privately known to her.
She has quasi-linear expected-utility preferences, meaning that her payoff from getting the good with probability $q \in [0,1]$ and paying $p \in \R$ is $t q - p$.
(This entails an assumption of risk-neutrality over monetary lotteries, but that is not important for most results. What \emph{is} important for several results is that preferences are strongly separable between the good and money. See §\ref{sec:ch1:risk-neutrality} for a discussion.)

A \emph{principal} can design any mechanism she likes:
a mechanism specifies various (observable) actions the agent can take,%
\footnote{The actions are often called `messages'.}
and as a function of these actions, the probability $q$ with which the agent receives the good and the payment $p$ that she makes.
The agent chooses optimally among these actions.
Let's write $q(t)$ for the probability with which the agent of type $t$ gets the good, given her actions, and $p(t)$ for the payment she makes.
The map $q : [0,1] \to [0,1]$ is called the induced \emph{allocation} of the good, while $p : [0,1] \to \R$ is the induced \emph{payment rule.}

We'd first like to know which allocations and payment rules the principal can achieve,
given that only the agent knows her valuation $t$ and that she can be relied upon to choose optimally.
(We'll also assume that the agent can be relied upon to break any indifferences she might experience in whatever way we'd like her to. That's not a big deal when there's just one agent.)
%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%
\section{The revelation principle}
\label{sec:ch1:revelation}
%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%
We may drastically reduce the space of mechanisms we consider.
For any allocation $q : [0,1] \to [0,1]$ and payment rule $p : [0,1] \to \R$,
we may define a simple `direct revelation mechanism' (DRM):
the agent is asked to make a report $r \in [0,1]$ of her type,
and is then given the good with probability $q(r)$ and pays $p(r)$.
This DRM is denoted simply by $(q,p)$.
(So `$(q,p)$' can denote an allocation--payment rule pair,
\emph{or} a direct mechanism; mathematically the same, but psychologically distinct.)
A DRM is called \emph{truthful,} or \emph{incentive-compatible (IC),}
iff every type of the agent weakly prefers to report her own type.
Clearly if $(q,p)$ is an IC DRM, then it induces the allocation $q$ and payment rule $p$.
\begin{namedthm}[Revelation principle.]
%
\label{observation:rev}
%
If the allocation and payment rule $(q,p)$ are induced by some mechanism,
then $(q,p)$ (viewed as a DRM) is incentive-compatible
(and thus induces the allocation $q$ and the payment rule $p$).
%
\end{namedthm}
The revelation principle is both deep and trivial.
By the latter, I mean that fully understanding it is tantamount to considering it obvious.
\begin{proof}
%
Let $(q,p)$ be induced by some mechanism.
Fix a type $t$.
Since type $t$ of the agent behaves optimally,
her payoff from her outcome $(q(t),p(t))$ is at least as high as the payoff she could get from any deviation.
One deviation would be for her to take whatever actions some other type $r \in [0,1]$ takes; we call this `mimicking type $r$'.
This would obviously give type $t$ the outcome $(q(r),p(r))$.
We infer that type $t$ likes her own outcome $(q(t),p(t))$ weakly better than the outcome $(q(r),p(r))$ of any other type $r \neq t$.
Now consider the DRM $(q,p)$.
The \emph{only} deviations available to $t$
are to mimic type $r$, for some $r \neq t$.
We just showed that each such deviation is unprofitable;
thus $(q,p)$ is IC.
%
\end{proof}
The revelation principle allows us to restrict attention to incentive-compatible direct revelation mechanisms for analytical purposes, and that's what we'll do.
(But once we've finished analysing and have found an optimal mechanism,
we shall have occasion to consider whether it admits a natural \emph{indirect implementation:} i.e. a more natural mechanism that induces the same allocation and payment rule.)
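If you like to check such things mechanically, incentive-compatibility of a direct mechanism can be verified on a discretised type space. The following Python sketch does so for a simple illustrative DRM; the grid and the particular mechanism (a price of $1/2$ for the good) are assumptions of the sketch, not part of the text:

```python
# Illustrative direct mechanism on a discretised type space; the grid
# and the price 1/2 are assumptions of this sketch, not from the text.
types = [i / 100 for i in range(101)]
q = {r: 1.0 if r >= 0.5 else 0.0 for r in types}  # allocation rule
p = {r: 0.5 * q[r] for r in types}                # payment rule

def payoff(t, r):
    """Payoff of type t from reporting r: t*q(r) - p(r)."""
    return t * q[r] - p[r]

# Incentive-compatibility: truthful reporting is weakly optimal
# for every type, i.e. no report r beats the report r = t.
for t in types:
    assert payoff(t, t) >= max(payoff(t, r) for r in types) - 1e-12
print("no type gains by misreporting: the DRM is IC")
```

By the revelation principle, this is exactly the kind of check that suffices: only deviations to other types' reports need to be ruled out.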
%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%
\section{The envelope theorem}
\label{sec:ch1:env}
%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%
Fix a (direct revelation) mechanism $(q,p)$.
Type $t$'s problem is to choose an action (her report) $r \in [0,1]$, where her objective $f(r,t)$ is given by
%
\begin{equation*}
f(r,t) = t q(r) - p(r) .
\end{equation*}
%
So, viewed as a maximisation programme, the agent's problem is to choose her report $r$ to maximise $f(r,t)$.
Suppose that our mechanism is an incentive-compatible one,
meaning that the report $r=t$ yields a global maximum of $f(\cdot,t)$.
One implication of incentive-compatibility is that no type $t$ wishes to mimic a \emph{nearby} type $r$; this is called \emph{local incentive-compatibility.}
Intuitively, local IC is captured by the first-order condition
%
\begin{equation*}
\left. \frac{\dd}{\dd m} f( t+m, t ) \right|_{m=0} = 0 ,
\quad \text{or more compactly,} \quad
f_1(t,t)=0 .
\end{equation*}
%
Formally, we'll say that $(q,p)$ is locally IC if this first-order condition holds for a.e. type $t \in [0,1]$.

Now let's re-express that.
Write $V(t) \coloneqq f(t,t)$ for the value of truthful reporting.
Differentiating on both sides yields
%
\begin{equation*}
V'(t)
= f_1(t,t) + f_2(t,t) .
\end{equation*}
%
The first term on the RHS is exactly the thing that's zero if our mechanism is a locally IC one.
So local IC is equivalent to
%
\begin{equation*}
V'(t) = f_2(t,t)
\quad \text{for a.e. $t \in [0,1]$,}
\end{equation*}
%
or (integrating)
%
\begin{equation*}
V(t) = V(0) + \int_0^t f_2(s,s) \dd s
\quad \text{for every $t \in [0,1]$.}
\end{equation*}
%
This is called the \emph{envelope formula.}
Recalling that $f(r,t) = t q(r) - p(r)$ by definition,
the envelope formula more explicitly reads
%
\begin{equation*}
t q(t) - p(t) = - p(0) + \int_0^t q .
\end{equation*}
%
(Here `$\int_0^t q$' is a shorter way of writing `$\int_0^t q(s) \dd s$'.)
The fact that IC mechanisms satisfy the envelope formula is an instance of the \emph{envelope theorem.}
(The envelope theorem is more general: it applies to \emph{any} parametrised maximisation problem. Other applications are Shephard's lemma, Roy's identity and Hotelling's lemma.)

The above argument has a big hole: given our completely arbitrary mechanism $(q,p)$, there is no reason why the agent's reporting payoff $f(r,t) = t q(r) - p(r)$ should be a differentiable function of $r$; if it is not, then the derivative
%
\begin{equation*}
\left. \frac{\dd}{\dd m} f( t+m, t ) \right|_{m=0}
\equiv f_1(t,t)
\end{equation*}
%
simply does not exist.%
\footnote{There's another issue:
the step from `$V'(t) = f_2(t,t)$ for a.e. $t \in [0,1]$' to `$V(t) = V(0) + \int_0^t f_2(s,s) \dd s$ for every $t \in [0,1]$'
is valid iff $V$ is \emph{absolutely continuous---}see §\ref{sec:ch1:envelope_rigorous}.}
But the above argument \emph{can} be made rigorous: it really is true that the envelope formula is a fancy way of rewriting `local optimality' (suitably defined; see \textcite{Sinander2022}).
Alternatively and more traditionally,
one can use a completely different (unintuitive but elegant) argument to derive the envelope formula directly from (global) IC \parencite{MilgromSegal2002}---see §\ref{sec:ch1:envelope_rigorous} at the end of this chapter.
Either way, we have learned:
\begin{namedthm}[Mirrlees envelope theorem.]
%
\label{proposition:ic_env}
%
Any IC mechanism $(q,p)$ satisfies the envelope formula:
%
\begin{equation*}
t q(t) - p(t) = - p(0) + \int_0^t q
\quad \text{for every $t \in [0,1]$.}
\end{equation*}
%
\end{namedthm}
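To make the formula concrete, here is a small numeric check in Python. The mechanism used is an illustrative assumption, not one from the text: $q(t) = t$ with $p(t) = t^2/2$, which is IC because type $t$'s reporting payoff $tr - r^2/2$ is maximised at $r = t$.

```python
# Illustrative IC mechanism (an assumption of this sketch): q(t) = t,
# p(t) = t**2 / 2. Reporting payoff t*r - r**2/2 is maximised at r = t.
def q(t):
    return t

def p(t):
    return t ** 2 / 2

# Check t*q(t) - p(t) = -p(0) + \int_0^t q on a grid (trapezoid rule;
# exact here because q is linear).
n = 1000
integral = 0.0
for i in range(n + 1):
    t = i / n
    if i > 0:
        s = (i - 1) / n
        integral += (q(s) + q(t)) / 2 * (t - s)
    assert abs((t * q(t) - p(t)) - (-p(0) + integral)) < 1e-9
print("envelope formula verified on the grid")
```

Both sides equal $t^2/2$ here, so the check is an identity; the point is that the same script works for any candidate $(q,p)$.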
\begin{remark}[revenue/payoff equivalence]
%
\label{remark:rev_equivalence}
%
It follows that any two indirect mechanisms
that induce the same allocation $q$
\emph{and} induce a payment of $p(0)=0$ from type $t=0$
must actually induce the exact same payment rule $p$.
(Why? Prove it!)
Thus any two such mechanisms provide the same payoff to every type of the agent,
and also the same payment (or `revenue' from the perspective of the principal/seller collecting the money).
%
\end{remark}
\begin{remark}
%
\label{remark:env_powerful}
%
Why is the envelope formula so restrictive (and thus the envelope theorem so powerful)?
Fix a type $t \in (0,1)$.
The IC constraint $V(r) \geq f(t,r)$
deterring a higher type $r>t$ from mimicking $t$
may be rewritten (by subtracting $V(t)$ from both sides and dividing by $r-t$) as
%
\begin{equation}
\frac{V(r) - V(t)}{r-t} \geq \frac{f(t,r) - f(t,t)}{r-t} ,
\label{eq:env_ineq_above}
\end{equation}
%
so that letting $r \downarrow t$ yields $V'(t) \geq f_2(t,t)$.
Similarly, the IC constraint $V(r') \geq f(t,r')$
deterring a \emph{lower} type $r'<t$ from mimicking $t$
may be rewritten
%
\begin{equation}
\frac{V(t) - V(r')}{t-r'} \leq \frac{f(t,t) - f(t,r')}{t-r'} ,
\label{eq:env_ineq_below}
\end{equation}
%
which as $r' \uparrow t$ yields $V'(t) \leq f_2(t,t)$.
Putting together our two conclusions, we have
%
\begin{equation*}
f_2(t,t) \leq V'(t) \leq f_2(t,t) ,
\quad \text{or} \quad
V'(t) = f_2(t,t) .
\end{equation*}
%
Economically, we obtain an equality because we must deter other types $r,r'$ that are \emph{arbitrarily} close to $t$ from mimicking her. (Formally, because the set $[0,1]$ of types is a connected set.)
If types were spaced out discretely, then we would obtain only the more permissive `inequality' envelope formula comprising \eqref{eq:env_ineq_above}--\eqref{eq:env_ineq_below}.
%
\end{remark}
\paragraph{The literature.}
The intuitive treatment in this section can be made formal: see \textcite{Sinander2022}.
In §\ref{sec:ch1:envelope_rigorous} below, we'll give a rigorous treatment of the envelope theorem from a different perspective \parencite{MilgromSegal2002}.
%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%
\section{Characterisation of incentive-compatibility}
\label{sec:ch1:ic}
%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%
\begin{namedthm}[Spence--Mirrlees lemma.]
%
\label{proposition:SM_lemma}
%
A mechanism $(q,p)$ is IC if and only if
it satisfies the envelope formula
and $q$ is increasing.
%
\end{namedthm}
\begin{proof}
%
Fix a mechanism $(q,p)$.
Write $V(t) \coloneqq t q(t) - p(t)$ for the value of truthful reporting.
Observe first that \emph{if} $(q,p)$ satisfies the envelope formula, then the payoff loss of type $t$ from mimicking type $r$ instead of reporting truthfully is
%
\begin{align}
V(t) - [ t q(r) - p(r) ]
&= V(t) - V(r)
+ [ r q(r) - p(r) ]
- [ t q(r) - p(r) ]
\nonumber
\\
&= \int_r^t q(s) \dd s
- (t-r) q(r)
\nonumber
\\
&= \int_r^t \left[ q(s) - q(r) \right] \dd s ,
\label{eq:dev_payoff}
\tag{$\star$}
\end{align}
%
where the second equality applies the envelope formula and the final equality writes $(t-r) q(r)$ as $\int_r^t q(r) \dd s$.
Suppose that $(q,p)$ is IC.
We've seen (the \hyperref[proposition:ic_env]{Mirrlees envelope theorem} above) that it satisfies the envelope formula.
So by IC and \eqref{eq:dev_payoff}, we must have
%
\begin{equation*}
\int_r^t \left[ q(s) - q(r) \right] \dd s \geq 0
\quad \text{for all $r,t \in [0,1]$,}
\end{equation*}
%
which is only possible if $q$ is increasing.
(Right? Convince yourself.)
Conversely, suppose that $(q,p)$ satisfies the envelope formula and that $q$ is increasing.
Then type $t$'s payoff loss from mimicking $r$ is given by \eqref{eq:dev_payoff} as
%
\begin{equation*}
\int_r^t \left[ q(s) - q(r) \right] \dd s ,
\end{equation*}
%
which is non-negative since $q$ is increasing.
Thus $(q,p)$ is IC.
%
\end{proof}
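The `if' direction of the proof can also be seen numerically: pick any increasing allocation, define payments by the envelope formula with $p(0)=0$, and verify on a grid that no type gains by misreporting. In the following Python sketch the allocation $q(t) = t^2$ and the grid are illustrative assumptions:

```python
# Sketch of the `if' direction of the Spence--Mirrlees lemma: build a
# mechanism from an increasing allocation via the envelope formula and
# check IC on a grid. The choice q(t) = t**2 is illustrative only.
n = 200
grid = [i / n for i in range(n + 1)]
q = [t ** 2 for t in grid]  # any increasing allocation works here

# p(t) = t q(t) - \int_0^t q  (envelope formula with p(0) = 0;
# integral by the trapezoid rule)
p, integral = [], 0.0
for i, t in enumerate(grid):
    if i > 0:
        integral += (q[i - 1] + q[i]) / 2 * (grid[i] - grid[i - 1])
    p.append(t * q[i] - integral)

# Incentive-compatibility of the resulting direct mechanism:
for i, t in enumerate(grid):
    best_deviation = max(t * q[j] - p[j] for j in range(n + 1))
    assert t * q[i] - p[i] >= best_deviation - 1e-9
print("mechanism built from the envelope formula is IC")
```

Replacing `q` with any other increasing list keeps the check passing, while a non-monotone `q` makes it fail, exactly as the lemma predicts.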
\paragraph{The literature.}
The Spence--Mirrlees lemma has been extended in various ways.
At the highest level of generality, the `outcome' $q$ belongs to an abstract space $Q$ ($=[0,1]$ in the text) equipped with a partial order,
and the agent's payoff is some function $f(q,p,t)$ ($=tq-p$ in the text).
Versions of the result go through whenever $f$ satisfies the \emph{Spence--Mirrlees (`single-crossing') condition,}
defined in terms of how different types' indifference curves in $q$--$p$ space cross each other.%
\footnote{When preferences have the quasi-linear form $f(q,p,t) = g(q,t) - p$, Spence--Mirrlees requires precisely that $g$ be \emph{supermodular.}
In general, a broader (ordinal) notion of supermodularity (or `complementarity') characterises the Spence--Mirrlees condition \parencite[][Theorem 3]{MilgromShannon1994}.}
See \textcite[§4]{Sinander2022} for an overview of such results (plus a new, general result).
%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%
\section{Participation}
\label{sec:ch1:part}
%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%
It is natural to assume that the agent can walk away.
In particular, she may consume an outside option worth zero to her (no good, no payment).
It is without loss of generality to focus on mechanisms that induce every type of the agent to participate.
This is because if type $t$ were not participating, then we could invite her to participate and award her the outcome $(q(t),p(t)) = (0,0)$ if she does, which is exactly as good as non-participation.
We may thus focus on IC mechanisms $(q,p)$ that induce participation, i.e. those such that every type $t$'s payoff $t q(t) - p(t)$ is non-negative. Such mechanisms are called \emph{individually rational (IR).}
\begin{corollary}
%
\label{corollary:ic_ir}
%
A mechanism $(q,p)$ is IC and IR if and only if
it satisfies the envelope formula,
$q$ is increasing,
and $p(0) \leq 0$.
%
\end{corollary}
\begin{proof}
%
`$\implies$' direction:
if $(q,p)$ is IR, then the payoff $0 \times q(0) - p(0) = -p(0)$ of type $t=0$ must be at least zero (the value of the outside option).
And we've already seen (the \hyperref[proposition:SM_lemma]{Spence--Mirrlees lemma}) that IC requires the other properties.
`$\Longleftarrow$' direction:
fix a mechanism $(q,p)$, and
write $V(t) \coloneqq t q(t) - p(t)$ for the value of type $t$.
If $(q,p)$ satisfies the first two properties, then we've seen (the \hyperref[proposition:SM_lemma]{Spence--Mirrlees lemma}) that it is IC.
If it furthermore satisfies $p(0) \leq 0$,
then $V(0) = -p(0) \geq 0$,
and so by the \hyperref[proposition:ic_env]{Mirrlees envelope theorem}
%
\begin{equation*}
V(t) = V(0) + \int_0^t q \geq 0
\quad \text{for every $t \in [0,1]$,}
\end{equation*}
%
which is to say that $(q,p)$ is IR.
%
\end{proof}
%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%
\section{The optimality of posting a price}
\label{sec:ch1:post}
%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%
Suppose that the good is owned by a monopolist (principal) who wishes to sell it so as to maximise expected profit.
(Equivalently, re-interpret the agent as a continuum of consumers, each with a privately-known valuation; the monopolist wishes to maximise profit. This problem bears the antiquated name of \emph{second-degree price discrimination.})
The monopolist may use any mechanism she likes, but must take into account that only the agent knows her valuation and that she may choose to walk away.

Here's a very simple (indirect) mechanism that the monopolist could adopt: post a price!
More fully, the monopolist sets a price $\pi \in \R_+$
and gives the agent two options:
purchase the good at price $\pi$ (pay $\pi$ and get the good for sure)
or don't (pay nothing and get nothing).
What allocation and payments does this induce?
Agents of type $t > \pi$ will purchase the good, so $(q(t),p(t)) = (1,\pi)$ for them, while agents of type $t < \pi$ will not, so $(q(t),p(t)) = (0,0)$.
\begin{exercise}
%
\label{exercise:posted_price}
%
(a) This allocation--payment pair (viewed as a direct mechanism) is IC; why?
(b) Verify that this mechanism satisfies the envelope formula.
(c) There are other mechanisms $(q,p')$, i.e. mechanisms with the same allocation but different payments. Describe them.
%
\end{exercise}
This is about as simple a mechanism as can be devised.
One thing that makes it simple is that it does not make use of the monopolist's power to allocate the good randomly, which is in principle very powerful.
Nonetheless:
\begin{theorem}[\cite{Myerson1981}]
%
\label{theorem:Myerson}
%
There is a posted-price mechanism that is optimal.
%
\end{theorem}
The rest of this section is devoted to proving this result.
Myerson's original argument is based on a duality technique called `(ironed) virtual valuations'. This technique is important and useful; indeed, Myerson's proof directly extends to the case of multiple agents. (This yields Myerson's celebrated result that a second-price auction with a reserve price is optimal.)
We shall instead pursue a direct, convexity-based argument that I learned from Eran Shmaya.
We begin by using our preceding results (in particular, \Cref{corollary:ic_ir}) greatly to narrow down the space of mechanisms under consideration.
It is obviously not optimal to subsidise type $t=0$ (and thus by the envelope formula to lower the payments of all types), so $p(0)=0$ is optimal.
The monopolist therefore merely has to choose the allocation, which can be any increasing function $q : [0,1] \to [0,1]$.
The payment rule $p$ is then pinned down by the envelope formula and $p(0)=0$ as
%
\begin{equation*}
p(t)
= t q(t) - \int_0^t q
\quad \text{for each $t \in [0,1]$.}
\end{equation*}
This narrows our problem down to one of choosing from among the space of all increasing functions $[0,1] \to [0,1]$.
That's still a very large (infinite-dimensional) space, though, so we aren't out of the woods yet.
It isn't obvious, for example, that it is never optimal to allocate with interior probability (selling a coin toss), as randomisation could have incentive benefits.

The monopolist's revenue is just the agent's payment, which depends on her type according to the equation above.
The monopolist views the agent's valuation as a random variable; we'll denote it by $T$, and write $F$ for its CDF.
Her expected revenue is thus
%
\begin{equation*}
\E_{T \sim F}( p(T) )
= \E_{T \sim F}\left( T q(T) - \int_0^T q \right)
\eqqcolon R(q) .
\end{equation*}
%
The monopolist's problem is to maximise $R(q)$
by choosing $q$ from the space $\mathcal{Q}$ of all increasing maps $q : [0,1] \to [0,1]$.
Note that $\mathcal{Q}$ is a convex set: if $q$ and $q'$ are both increasing maps $[0,1] \to [0,1]$, then so is $\alpha q + (1-\alpha) q'$ for any scalar $\alpha \in [0,1]$.
Note further that the objective $R$ is linear:
%
\begin{equation*}
R( \alpha q + (1-\alpha) q' ) = \alpha R(q) + (1-\alpha) R(q')
\quad \text{for any $q,q' \in \mathcal{Q}$ and $\alpha \in [0,1]$.}
\end{equation*}
%
$R$ is also continuous in a suitable sense.
Finally, $\mathcal{Q}$ is (in fact) compact in a suitable sense.
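The linearity of $R$ can be seen concretely on a grid. In the following Python sketch, the uniform distribution $F$, the grid, and the two allocations compared are illustrative assumptions:

```python
# Grid sketch of the linearity of the revenue functional R; the uniform
# F, the grid and the two allocations are illustrative assumptions.
n = 500
grid = [i / n for i in range(n + 1)]

def revenue(q):
    """R(q) = E[p(T)], with p from the envelope formula and p(0) = 0,
    for T ~ Uniform[0,1] (trapezoid rule throughout)."""
    p, integral = [], 0.0
    for i, t in enumerate(grid):
        if i > 0:
            integral += (q[i - 1] + q[i]) / 2 * (grid[i] - grid[i - 1])
        p.append(t * q[i] - integral)
    # E[p(T)] = \int_0^1 p for the uniform distribution
    return sum((p[i - 1] + p[i]) / 2 * (grid[i] - grid[i - 1])
               for i in range(1, n + 1))

q1 = list(grid)                                  # q(t) = t
q2 = [1.0 if t >= 0.5 else 0.0 for t in grid]    # a posted price of 1/2
a = 0.3
mix = [a * x + (1 - a) * y for x, y in zip(q1, q2)]
assert abs(revenue(mix) - (a * revenue(q1) + (1 - a) * revenue(q2))) < 1e-9
print("R is linear in the allocation q")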
Now let's do some convex geometry (see \cref{ch:convexity} for a little overview).
An \emph{extreme point} of $\mathcal{Q}$ is an element of $\mathcal{Q}$ that cannot be expressed as the convex combination of two \emph{distinct} elements of $\mathcal{Q}$.
\begin{observation}
%
\label{observation:bauer}
%
Any convex and suitably continuous function $\phi : \mathcal{Q} \to \R$
is maximised at an extreme point of $\mathcal{Q}$.
%
\end{observation}
\begin{proof}
%
It is intuitive, and in fact true, that any element of the compact convex set $\mathcal{Q}$ can be written as an (infinite) convex combination of the extreme points of $\mathcal{Q}$.
Results like this constitute a little field called Choquet theory,
and Choquet's theorem%
\footnote{Or its generalisation, the Choquet--Bishop--de Leeuw theorem. See e.g. \textcite{Phelps2001}.}
says that
we may for any $q \in \mathcal{Q}$ find a probability measure $\mu$ defined on $\ext \mathcal{Q}$ such that $q = \int_{\ext \mathcal{Q}} q' \mu(\dd q')$.%
\footnote{That's a Lebesgue integral;
it's a fancy way of saying that $q$ is a convex combination of $q'$s in $\ext \mathcal{Q}$, with $\mu(\{q'\}) \in [0,1]$ being the weight placed on $q' \in \ext \mathcal{Q}$.}
Now, let $q \in \mathcal{Q}$ maximise a convex and suitably continuous function $\phi$ on $\mathcal{Q}$.
Then since we have $q = \int_{\ext \mathcal{Q}} q' \mu(\dd q')$ for some probability measure $\mu$ on $\ext \mathcal{Q}$, we have
%
\begin{equation*}
\phi(q) \leq \int_{\ext \mathcal{Q}} \phi(q') \mu(\dd q')
\end{equation*}
%
by Jensen's inequality,
and thus $\phi(q) \leq \phi(q')$ for some $q' \in \ext \mathcal{Q}$.
Since $q$ is optimal, $q'$ must be, too.
%
\end{proof}
\Cref{observation:bauer} permits us to conclude that the monopolist's problem $\max_{q \in \mathcal{Q}} R(q)$
admits a solution that is an extreme point of the space $\mathcal{Q}$ of increasing functions $[0,1] \to [0,1]$.
I claim that the extreme points of $\mathcal{Q}$
are exactly those functions
$q : [0,1] \to [0,1]$
that satisfy
%
\begin{equation*}
q(t) =
\begin{cases}
0 & \text{for $t<t^\star$} \\
1 & \text{for $t>t^\star$}
\end{cases}
\quad \text{and} \quad q(t^\star) \in \{0,1\}
\quad
\text{for some $t^\star \in [0,1]$.}
\end{equation*}
%
I will call such functions \emph{impulses} (my term).
It is a fact that all and only impulses are extreme points of $\mathcal{Q}$.
\begin{exercise}
%
\label{exercise:incr_ext_pnts}
%
Prove it! That is,
(a) show that every impulse is an extreme point of $\mathcal{Q}$, and
(b) show that every extreme point of $\mathcal{Q}$ is an impulse.%
\footnote{Part (b) is harder, so here's a hint.
Prove the contra-positive: fix any non-impulse $q \in \mathcal{Q}$, and try to show that it isn't an extreme point, by constructing distinct $q^-,q^+ \in \mathcal{Q}$ such that $q = \alpha q^- + (1-\alpha) q^+$ for some $\alpha \in (0,1)$.
Be careful to ensure that your $q^-$ and $q^+$ are legitimately elements of $\mathcal{Q}$, i.e. that they take values in $[0,1]$ and are increasing.}
% A solution: $q^-(t) \coloneqq q(t) - \min\left\{ q(t), 1-q(t) \right\}$, $q^+(t) \coloneqq q(t) + \min\left\{ q(t), 1-q(t) \right\}$ and $\alpha=1/2$.}
%
\end{exercise}
We conclude that there is a revenue-maximising allocation $q$
that is an impulse.
In other words, all types above a threshold get the good for sure,
while those below do not get the good.
This is a remarkably simple allocation rule; for one thing, it is deterministic!

What are the implied payments? You can calculate the payment rule from the envelope formula and the condition $p(0)=0$. (Try it!)
Or from scratch: although each type $t$ can make many different reports in the direct mechanism, they fall into two categories: reports $< t^\star$, which yield $q=0$,
and reports $>t^\star$, which yield $q=1$.
Clearly IC requires that all types $<t^\star$ make the same payment;
and similarly for types $>t^\star$.
By $p(0)=0$, the former group pay zero;
let's write $\pi$ for what the latter group pay.
Evidently type $t^\star$ must be indifferent; so $\pi = t^\star$.

We have shown that whatever the distribution $F$ of the agent's type,
there is a posted-price mechanism that is optimal.
We have not described the optimal price $\pi$; this depends on the distribution $F$, and is easily characterised via a first-order condition.%
\footnote{You may have derived this first-order condition when studying monopoly pricing in introductory microeconomics.}
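For concreteness, here is the first-order condition at work in a Python sketch; the uniform distribution is an illustrative assumption. Expected revenue from posting the price $\pi$ is $\pi (1 - F(\pi))$, and for $F$ uniform on $[0,1]$ the maximiser is $\pi^\star = 1/2$:

```python
# Expected revenue from posting price pi is pi * (1 - F(pi)).
# Illustrative assumption: T uniform on [0,1], so F(pi) = pi and the
# first-order condition 1 - 2*pi = 0 gives pi* = 1/2.
prices = [i / 10000 for i in range(10001)]
revenue = [pi * (1 - pi) for pi in prices]
pi_star = prices[max(range(len(prices)), key=revenue.__getitem__)]
assert abs(pi_star - 0.5) < 1e-3
print(f"optimal posted price for uniform types: {pi_star:.3f}")
```

Swapping in any other CDF `F` in the `revenue` line gives the optimal price for that distribution.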
\paragraph{The literature.}
The theorem is due to \textcite{Myerson1981}; his result is actually more general, as it covers the case of several agents.
He proved it by developing a duality technique based on `(ironed) virtual valuations' that has proved useful in other environments.
The direct convexity-based approach here is more intuitive to me,
and has other applications: \textcite{KleinerMoldovanuStrack2021} is a nice recent example.
More generally, much of the structure of modern economic theory is simply convex structure, as described by convex analysis. The standard reference here is \textcite{Rockafellar1970}; for Choquet theory in particular, see e.g. \textcite{Phelps2001}.
\begin{exercise}
%
\label{exercise:ambiguity}
%
Suppose that the monopolist is not a Bayesian decision-maker:
instead of having a single belief $F$ about the agent's valuation,
she entertains an entire set $\mathcal{F}$ of beliefs $F$.
For each given belief $F$, let us write
%
\begin{equation*}
R_F(q) \coloneqq \E_{T \sim F}( p(T) )
= \E_{T \sim F}\left( T q(T) - \int_0^T q \right)
\end{equation*}
%
for the monopolist's expected revenue from an increasing allocation $q$.
\begin{enumerate}[label=(\alph*)]
\item
Suppose to begin with that the monopolist is a `maxmax' decision-maker, meaning that she evaluates an allocation $q$ according to the most optimistic of the beliefs $F$ in the set $\mathcal{F}$:%
\footnote{This could be interpreted as `motivated reasoning'.}
%
\begin{equation*}
\mathcal{U}(q) = \max_{F \in \mathcal{F}} R_F(q) .
\end{equation*}
%
Prove that there is a posted-price mechanism that is optimal.
(Hint: $\mathcal{U}$ is not linear. But\dots?)
\item
Suppose instead that the monopolist has `maxmin' preferences:
she evaluates an increasing allocation $q$ according to the pessimistic criterion
%
\begin{equation*}
\mathcal{V}(q) = \min_{F \in \mathcal{F}} R_F(q) .
\end{equation*}
%
(This captures `uncertainty-aversion', as exemplified by the Ellsberg paradox.)
Can our argument above be salvaged? Explain.
\item
Maintain the maxmin assumption,
and further suppose that $\mathcal{F}$ has a least element $\underline{F}$, in the sense of first-order stochastic dominance.
(That is, every $F \in \mathcal{F}$ first-order stochastically dominates $\underline{F}$.)
\emph{Reminder: $F$ first-order stochastically dominates $G$
if and only if $\E_{T \sim F}( \phi(T) ) \geq \E_{T' \sim G}( \phi(T') )$ for every increasing $\phi : [0,1] \to \R$.}
\begin{enumerate}[label=(\roman*)]
\item Fix an increasing allocation $q$.
Let $p$ be the payment rule induced by $q$ via the envelope formula and the condition $p(0)=0$.
Show that $p$ is increasing.
\item
Prove that
$\mathcal{V}(q) = R_{\underline{F}}(q)$
for every increasing allocation $q$.
\item
Show that there is a posted-price mechanism which is optimal.
\end{enumerate}
\end{enumerate}
%
\end{exercise}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{The role of commitment}
\label{sec:ch1:commitment}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The monopolist's ability to commit to a mechanism is the backbone of the above analysis. To see why it matters, consider an optimal posted-price mechanism with price $\pi$. Although it maximises \emph{expected} revenue, the monopolist might get unlucky ex post: if the agent's valuation turns out to be less than $\pi$, then the good stays with the monopolist, and she earns no revenue at all.
Having observed the agent's failure to purchase, the monopolist may reasonably infer that the agent's valuation is less than $\pi$, but quite possibly still positive.
Were she able to, the monopolist would now very much like to offer the good for sale once more, this time at a lower price.
Were the agent to expect the monopolist to behave in this way, however,
it would change the agent's behaviour in the first place: even if her valuation exceeds $\pi$, she may decline to purchase at this price since doing so will secure her a better deal.
Continuing this reasoning suggests, correctly, that the revelation principle is invalid absent commitment by the monopolist.
The trouble is that there are now IC constraints not just for the agent, but also for the monopolist.

The monopolist's inability to commit not to try to sell the good again is an inherently dynamic problem,
and so we need a dynamic model.
(We were able to avoid this previously because the revelation principle made static mechanisms without loss.)
We'll let the `length' of a period be $\Delta>0$,
so that the periods are $n \in \{0,\Delta,2\Delta,\dots\}$.
The discount rate (for both the agent and the monopolist) is $r>0$.
(So between adjacent periods, each party discounts by the factor $\delta = e^{-r\Delta}$.)
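As a quick numerical aside (an illustrative sketch; the function name is mine), note how fast the per-period discount factor $\delta = e^{-r\Delta}$ approaches $1$ as the period length shrinks---this frequent-offers limit is exactly the regime of the Coase conjecture below.

```python
import math

def discount_factor(r, period_length):
    """Discount factor between adjacent periods of the given length."""
    return math.exp(-r * period_length)

r = 0.05  # illustrative common discount rate for agent and monopolist
for period_length in (1.0, 0.1, 0.01):
    print(period_length, discount_factor(r, period_length))
# As the period length shrinks to zero, the factor tends to 1:
# very little is lost by waiting one more period.
```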
Note that we are assuming the good to be durable (non-perishable): its value does not diminish over time.
Previously, we allowed the monopolist to commit (in each period) to either (a) hand over the good (immediately) or (b) to retain the good (forever).
As discussed above, it is plausible to suppose that the monopolist is unable to commit to (b): if she fails to sell the good, then she cannot bind her future selves not to try to sell it again.
It remains reasonable (do you agree?) to assume that the monopolist has the power to commit to (a) hand over the good: this means that she is able to prevent her future selves from clawing back the good from the agent.%
\footnote{This commitment power on the monopolist's part presumably comes from an external system of enforced property rights. Where such enforcement or rights are absent, a monopolist may not be able to prevent her future self from expropriating the agent.}
This suggests the following interaction within each period: the monopolist offers the good for sale at a price, and the agent accepts or rejects.
If the agent accepts, then the monopolist gives her the good (and thus commits permanently to relinquish the good---she cannot claw it back).
If not, then the monopolist keeps the good until the next period.%
\footnote{The monopolist might wish to offer a new price right away, instead of waiting until the next period. We are ruling this out as infeasible---the (possibly very short) `period length' $\Delta>0$ captures a constraint on how long it takes to carry out one round of bargaining between buyer and monopolist.}
This is a bargaining game, and we'll be interested in its (perfect Bayesian) equilibria. (Let's not get bogged down in defining PBE formally, though.)
Let's assume that the agent's valuation $T$ is supported on an interval $S \subseteq [0,1]$. Recall that we normalise the monopolist's valuation of the good to zero.
The \emph{Coase conjecture} asserts that the monopolist's bargaining power is small when offers are frequent ($\Delta$ is small): in equilibrium, she sells at a low price.
More formally, the claim is roughly that the good sells a.s. in finite time,
at a price $p^\Delta$ that approaches $\inf S$ as $\Delta \to 0$.
The classical intuition for this conjecture is that when the period length $\Delta$ is small, the monopolist in period $n$ engages in near-perfect competition with her period-$(n+1)$ self.
This intuition is incomplete, it seems to me, at least when $\inf S > 0$, because `perfect competition' surely means selling at marginal cost (which we've normalised to zero), rather than at price $\inf S$!
Here's a better intuition that I learned from Francesco Nava: the seller's lack of commitment constrains her eventually to sell the good whatever the buyer's value, and constrains her no further than that; thus she sells the good at the highest market-clearing price, which is $\inf S$.
Is the conjecture true?
If we restrict attention to stationary equilibria,
then given some technical assumptions,
it is indeed true \parencite{GulSonnenscheinWilson1986}.
This can be proved easily, in about a page: see the nice and short note by Qingmin Liu (\citeyear{Liu2015}).
What about non-stationary equilibria?
(The results below involve some technical assumptions.)
If $\inf S > 0$ (the `gap case'), there is an (essentially) unique equilibrium, and it is stationary;
but if $\inf S = 0$ (the `no-gap case'), a folk theorem holds,
meaning that there exist non-stationary `reputational' equilibria
which support trade at any price between zero and the full-commitment monopoly price, provided $\Delta>0$ is small enough \parencite{GulSonnenscheinWilson1986,AusubelDeneckere1989}.
\paragraph{The literature.}
The conjecture is from \textcite{Coase1972}.
The big papers are \textcite{FudenbergLevineTirole1985,GulSonnenscheinWilson1986,AusubelDeneckere1989};
see \textcite{AusubelCramtonDeneckere2002} for a survey.
For the intuition I attributed to Francesco Nava, see \textcite{NavaSchiraldi2019} and the references therein.
For a taste of recent work on this topic, consider \textcite{DovalSkreta2021,BrzustowskiGeorgiadisharrisSzentes2023,LomysYamashita2022}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Selling several goods}
\label{sec:ch1:multi-d}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Suppose now that the monopolist has two goods to sell.
We consider the simplest case: the agent's payoff is $t_1 q_1 + t_2 q_2 - p$.
The monopolist's revenue-maximisation problem with two goods is an open problem, despite its apparent simplicity!
This is true even when $t_1,t_2$ are assumed to be statistically independent.
In this section, we'll try to get a feel for what the issues are, and why this problem is hard.
Mechanisms are now $(q_1,q_2,p) : [0,1]^2 \to [0,1]^2 \times \R$.
It remains true that an allocation $(q_1,q_2)$ is implementable (i.e. a payment rule $p$ can be found such that $(q_1,q_2,p)$ is IC)
exactly if $(q_1,q_2)$ is suitably monotone;
in particular, \emph{cyclically monotone.}%
\footnote{The term comes from convex analysis \parencite[see][]{Rockafellar1970}. This link between implementability and convex analysis was spotted by \textcite{Rochet1987}.}
Let's write $\mathcal{Q}_2$ for all cyclically monotone allocations.
$\mathcal{Q}_2$ is convex (and compact),
and revenue is a linear function $R : \mathcal{Q}_2 \to \R$,
so our previous analysis tells us that there must be an optimal allocation that is an extreme point of $\mathcal{Q}_2$.
But characterising these extreme points is hard, and there are many of them.
More broadly: cyclic monotonicity is intractable.
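To see concretely what cyclic monotonicity demands, here's a brute-force checker (an illustrative sketch with a hypothetical three-type allocation; under the standard sign convention, $q$ is cyclically monotone iff $\sum_k q(t^k) \cdot (t^{k+1} - t^k) \leq 0$ around every finite cycle of types). Even this naive check must visit every cycle, which gives a taste of the intractability.

```python
from itertools import permutations

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def is_cyclically_monotone(types, q, tol=1e-9):
    """Brute-force check on a finite type set: for every cycle
    t^0 -> t^1 -> ... -> t^0, require
    sum_k q(t^k) . (t^{k+1} - t^k) <= 0."""
    n = len(types)
    for length in range(2, n + 1):
        for cycle in permutations(range(n), length):
            total = 0.0
            for i in range(length):
                t_cur = types[cycle[i]]
                t_next = types[cycle[(i + 1) % length]]
                step = tuple(b - a for a, b in zip(t_cur, t_next))
                total += dot(q[cycle[i]], step)
            if total > tol:
                return False
    return True

# A hypothetical three-type, two-good example:
types = [(1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
q = [(0.5, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(is_cyclically_monotone(types, q))  # True: this allocation is implementable
```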
So what can happen? For one thing, optimal mechanisms can easily feature random assignment.
Here's an example from \textcite{HartReny2015}:
\begin{proposition}
%
\label{proposition:hartreny}
%
Suppose that the agent's valuation $(t_1,t_2)$ can be either
$(1,0)$, $(0,2)$ or $(3,3)$, each with equal probability.
The uniquely optimal direct mechanism is
%
\begin{equation*}
( q_1(t), q_2(t), p(t) )
=
\begin{cases}
\left( \tfrac{1}{2}, 0, \tfrac{1}{2} \right)
& \text{for $t=(1,0)$} \\
( 0, 1, 2 )
& \text{for $t=(0,2)$} \\
( 1, 1, 5 )
& \text{for $t=(3,3)$.}
\end{cases}
\end{equation*}
%
\end{proposition}
There is an IC constraint for each pair of distinct types (so six IC constraints), plus an IR constraint for each type.
\begin{exercise}
%
\label{exercise:hartreny_ic}
%
Verify that this mechanism is IC and IR.
%
\end{exercise}
Some of these constraints bind, and others do not.
The `downward' IC constraints that deter type $t=(3,3)$ from pretending to be one of the lower-valuation types both bind:
this type's truthful payoff is
%
\begin{equation*}
3 \times 1 + 3 \times 1 - 5
= 1 ,
\end{equation*}
%
while she earns
%
\begin{equation*}
3 \times \tfrac{1}{2} + 3 \times 0 - \tfrac{1}{2}
= 1
\end{equation*}
%
by mimicking type $t=(1,0)$
and earns
%
\begin{equation*}
3 \times 0 + 3 \times 1 - 2
= 1
\end{equation*}
%
by mimicking type $t=(0,2)$.
You can also easily verify that the IR constraints for types $(1,0)$ and $(0,2)$ both bind.
This is where the random outcome for type $(1,0)$ is valuable to the monopolist.
By randomising, she is able to keep \emph{both} `downward' IC constraints binding, while still giving type $(1,0)$ a payoff of zero.
Using deterministic outcomes, it is typically not possible to satisfy several downward IC constraints with equality without ceding `information rents' to the lower types in question.
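As a numerical sanity check on these claims (an illustrative sketch; the code and variable names are mine), one can compute each type's payoff from each report and confirm all the IC and IR constraints, along with the binding pattern just described:

```python
types = [(1, 0), (0, 2), (3, 3)]
# (q1, q2, p) for each type, as in the proposition:
mech = {(1, 0): (0.5, 0.0, 0.5), (0, 2): (0.0, 1.0, 2.0), (3, 3): (1.0, 1.0, 5.0)}

def payoff(t, r):
    """Payoff of true type t from reporting r."""
    q1, q2, p = mech[r]
    return t[0] * q1 + t[1] * q2 - p

for t in types:
    assert payoff(t, t) >= -1e-9                      # IR
    for r in types:
        assert payoff(t, t) >= payoff(t, r) - 1e-9    # IC (six non-trivial ones)

# Both downward IC constraints bind, and the low types' IR constraints bind:
assert payoff((3, 3), (3, 3)) == payoff((3, 3), (1, 0)) == payoff((3, 3), (0, 2)) == 1.0
assert payoff((1, 0), (1, 0)) == 0.0 and payoff((0, 2), (0, 2)) == 0.0
```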
If that was too loose for you, here's a proof.
It's a lot of algebra, \emph{but with `economic' remarks added in italics.}
\begin{proof}
%
For an arbitrary mechanism $(q_1,q_2,p)$,
we may write
$(\alpha_1,\beta_1,\pi_1)$, $(\alpha_2,\beta_2,\pi_2)$ and $(\alpha_3,\beta_3,\pi_3)$ for $( q_1(t), q_2(t), p(t) )$
for $t = (1,0)$, $t = (0,2)$ and $t = (3,3)$, respectively.
Expected revenue is $\frac{1}{3}\pi_1 + \frac{1}{3}\pi_2 + \frac{1}{3}\pi_3$.
The mechanism in the proposition is easily shown to satisfy all of the IC and IR constraints.
We shall show that it is uniquely optimal in a relaxed revenue-maximisation problem in which some of these constraints are ignored; that obviously implies that it is uniquely optimal in the original problem.
So consider the problem of choosing a mechanism subject only to the IR constraints for types $(1,0)$ and $(0,2)$
and the `downward' IC constraints $(3,3) \to (1,0)$ and $(3,3) \to (0,2)$, i.e.
%
\begin{align*}
\alpha_1 - \pi_1
&\geq 0 \\
2 \beta_2 - \pi_2
&\geq 0 \\
3 \alpha_3 + 3 \beta_3 - \pi_3
&\geq 3 \alpha_1 + 3 \beta_1 - \pi_1 \\
3 \alpha_3 + 3 \beta_3 - \pi_3
&\geq 3 \alpha_2 + 3 \beta_2 - \pi_2 .
\end{align*}
%
Rewriting yields
%
\begin{align*}
\pi_3 + 3 \alpha_1 + 3 \beta_1 - 3 \alpha_3 - 3 \beta_3
&\leq \pi_1
\leq \alpha_1
\\
\pi_3 + 3 \alpha_2 + 3 \beta_2 - 3 \alpha_3 - 3 \beta_3
&\leq \pi_2
\leq 2 \beta_2 .
\end{align*}
%
Clearly to maximise $\frac{1}{3}\pi_1 + \frac{1}{3}\pi_2 + \frac{1}{3}\pi_3$,
we must choose $\pi_1 = \alpha_1$ and $\pi_2 = 2 \beta_2$.
\emph{(IR binds for the two low-valuation types: it is not optimal to give information rents to `low' types.)}
The remaining constraints are
%
\begin{align*}
\pi_3
&\leq 3 \alpha_3 + 3 \beta_3 - 2 \alpha_1 - 3 \beta_1
\\
\pi_3
&\leq 3 \alpha_3 + 3 \beta_3 - 3 \alpha_2 - \beta_2 .
\end{align*}
%
It is then clearly optimal to let $\alpha_3 = \beta_3 = 1$ and $\beta_1 = \alpha_2 = 0$.
\emph{(This is intuitive: the high-valuation type gets both goods, while each lower-valuation type gets none of the good that she does not value.)}
Clearly $\pi_3$ should be set as high as possible subject to these two constraints:
%
\begin{equation*}
\pi_3 = \min\left\{ 6 - 2 \alpha_1, 6 - \beta_2 \right\} .
\end{equation*}
%
The remainder of the problem is to choose $\alpha_1,\beta_2$ to maximise
%
\begin{equation*}
\frac{1}{3}\alpha_1 + \frac{2}{3}\beta_2
+ \frac{1}{3}\min\left\{ 6 - 2 \alpha_1, 6 - \beta_2 \right\} .
\end{equation*}
%
\emph{The trade-off here is that
raising $\alpha_1$ allows a higher payment ($\pi_1 = \alpha_1$) to be extracted from type $t=(1,0)$,
but also tightens the IC constraint \textrm{`$(3,3) \to (1,0)$'}, potentially requiring $\pi_3$ to be lowered.
Similarly for $\beta_2$.}
Since this expression is increasing in $\beta_2$ (raising $\beta_2$ earns $\tfrac{2}{3}$ at the margin and costs at most $\tfrac{1}{3}$ through the $\min$), it is optimal to set $\beta_2 = 1$.
We now merely have to choose $\alpha_1$ to maximise
%
\begin{equation*}
\frac{1}{3}\alpha_1 + \frac{2}{3}
+ \frac{1}{3}\min\left\{ 6 - 2 \alpha_1, 5 \right\} .
\end{equation*}
%
This is (uniquely) achieved by setting $\alpha_1 = 1/2$.
\emph{The economics of the last step are as follows.
As we decrease $\alpha_1$ from $1$,
we decrease the payment $\pi_1 = \alpha_1$ that we may extract from type $(1,0)$,
but the amount $\pi_3 = 6-2\alpha_1$ that we may charge type $(3,3)$ rises twice as quickly since lowering $\alpha_1$ slackens the IC constraint $(3,3) \to (1,0)$;
so this is worth it.
But once $\alpha_1$ hits $1/2$, decreasing it further continues to be costly, but without the benefit, because now the other downward IC constraint $(3,3) \to (0,2)$ binds, preventing us from raising $\pi_3$ any further.
Thus $\alpha_1 = 1/2$ is uniquely optimal.}
%
\end{proof}
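The final optimisation step can also be confirmed numerically (an illustrative brute-force sketch, not a substitute for the argument): grid-search the reduced objective over $(\alpha_1, \beta_2)$.

```python
def reduced_objective(a1, b2):
    """Expected revenue in the relaxed problem after substituting
    pi1 = a1, pi2 = 2*b2, alpha3 = beta3 = 1, beta1 = alpha2 = 0,
    and pi3 = min(6 - 2*a1, 6 - b2)."""
    return a1 / 3 + 2 * b2 / 3 + min(6 - 2 * a1, 6 - b2) / 3

# Brute-force grid search over [0,1]^2 in steps of 0.01.
grid = [i / 100 for i in range(101)]
best = max((reduced_objective(a1, b2), a1, b2) for a1 in grid for b2 in grid)
value, a1_star, b2_star = best
print(a1_star, b2_star, round(value, 9))  # 0.5 1.0 2.5
```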
\paragraph{The literature.}
\textcite{HartReny2015} provide a good guide to why this problem is difficult and why optimal mechanisms are stranger than one might expect; they also provide good references to the literature.
Computer scientists have taken an interest in this slice of economic theory, and much recent progress has come from that corner,
including the closest thing to a recent breakthrough: \textcite{DaskalakisDeckelbaumTzamos2017}.%
\footnote{Their proof uses fancy optimal-transport techniques; see \textcite{Galichon2016} for an economist's introduction. \textcite{KleinerManelli2019} found an elementary proof.}
Although `simple' mechanisms (e.g. selling each good separately at a posted price, or selling all the goods in one big bundle at a posted price) are typically not optimal, they can sometimes do rather well: see \textcite{HartNisan2017}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Incentives via side bets}
\label{sec:ch1:CremerMclean}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Return to the case of a single good (with commitment),
and let $\mathcal{T} \subseteq [0,1]$ denote the set of types.
We previously assumed that $\mathcal{T} = [0,1]$,
but in this section, it will sometimes be convenient to suppose that $\mathcal{T}$ is finite.
In defining a `mechanism' at the beginning of this chapter, we implicitly ruled out incentive schemes in which the outcome $(q,p) \in [0,1] \times \R$ depends on factors beyond the agent's control: only the agent's actions in the mechanism were allowed to affect the outcome.
We now relax this assumption. There is a contractible \emph{signal}
whose realisation becomes known
(i) late enough that the agent has already chosen her actions in the mechanism, but
(ii) early enough that monetary payments can be made contingent on it.
(It won't matter whether the allocation of the good can depend on the realisation of the signal.)
Suppose for simplicity that there are only finitely many possible signal realisations, $\mathcal{X}=\{1,\dots,\abs*{\mathcal{X}}\}$.
The signal can arise from a noisy monitoring technology.
(For example, any external information about the market value of the good.)
Alternatively, the signal could be a \emph{second} agent's type,
provided (as will turn out to be the case) that this second agent can costlessly be prevailed upon truthfully to reveal her type.
Each type $t \in \mathcal{T}$ of the agent has a belief $\mu(t)$ about how likely various signal realisations are.
Formally, a \emph{belief} (about the signal) is a vector $\mu \in [0,1]^{\abs*{\mathcal{X}}}$ whose entries sum to unity. We write $\Delta$ for the set of all beliefs.
\begin{remark}
%
\label{remark:signal_exante}
%
This model is `ex interim', which you may find unintuitive.
In the (slightly more restrictive) `ex ante' version of the model,
the signal $X$ and agent type $T$ are random variables, drawn from some joint (`prior') distribution,
and the belief $\mu(t)$ of type $t \in \mathcal{T}$
is simply the vector whose $x^\text{th}$ entry (for $x \in \{1,\dots,\abs*{\mathcal{X}}\}$) is $\mu(t)_x = \PP( X=x | T=t )$.
Insofar as $X$ and $T$ are not independent, the signal is informative about the agent's type.
In (only) this case, the reverse is also true:
the agent's type is informative about the signal, meaning that from observing her own type, the agent learns something about the likely realisation of the signal.
In other words, $\mu(t)$ differs across types $t \in \mathcal{T}$
if (and only if) $X,T$ are not independent.%
\footnote{A simple example: if $X$ and $T$ are `positively correlated' in the sense of affiliation, then higher types have `higher' beliefs in the sense of monotone likelihood ratio.}
%
\end{remark}
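Here is a small numerical illustration of the `ex ante' version (the joint distributions are made up): compute $\mu(t)$ as the row-conditional of a joint pmf over $(T,X)$, and observe that the rows coincide exactly when $X$ and $T$ are independent.

```python
def conditional_beliefs(joint):
    """Row-conditionals mu(t)_x = P(X = x | T = t) of a joint pmf
    indexed as joint[t][x]."""
    beliefs = []
    for row in joint:
        total = sum(row)
        beliefs.append([p / total for p in row])
    return beliefs

# Independent case: every type has the same belief about the signal.
indep = [[0.2 * 0.5, 0.2 * 0.5],
         [0.8 * 0.5, 0.8 * 0.5]]
print(conditional_beliefs(indep))   # [[0.5, 0.5], [0.5, 0.5]]

# Correlated case: beliefs differ across types.
corr = [[0.4, 0.1],
        [0.1, 0.4]]
print(conditional_beliefs(corr))    # [[0.8, 0.2], [0.2, 0.8]]
```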
As before, an \emph{allocation} is a map $q : \mathcal{T} \to [0,1]$.
A \emph{contingent payment rule} is a vector-valued map $p : \mathcal{T} \to \R^{\abs*{\mathcal{X}}}$;
the interpretation is that if the agent reports that her type is $r \in \mathcal{T}$ and the signal turns out to be $x \in \mathcal{X}$,
then the agent pays $p(r)_x \in \R$.
A \emph{direct revelation mechanism} is now $(q,p)$,
where $q$ is an allocation
and $p$ is a contingent payment rule.
A direct revelation mechanism $(q,p)$ is \emph{incentive-compatible} iff each type $t \in \mathcal{T}$ finds it optimal (in expectation, given her belief) to report truthfully:
%
\begin{equation*}
q(t) t - p(t) \cdot \mu(t)
\geq q(r) t - p(r) \cdot \mu(t)
\quad \text{for all $r,t \in \mathcal{T}$.}
\end{equation*}
%
As before, a revelation principle permits us without loss to restrict attention to incentive-compatible direct revelation mechanisms, henceforth called simply `IC mechanisms'.
An allocation $q$ is called \emph{implementable} iff there is a contingent payment rule $p$ such that $(q,p)$ is incentive-compatible.
Suppose first that all types of the agent have the same belief about the signal ($\mu(r) = \mu(t)$ for all $r,t \in \mathcal{T}$).
(In the `ex ante' model of \Cref{remark:signal_exante}, this means precisely that the signal is uninformative about the agent's type---they are independent random variables.)
In this case, contingent payments cannot help with incentive provision:
it remains true that (all and) only increasing allocations $q$ are implementable.
\begin{exercise}
%
\label{exercise:cremer-mclean_samebelief}
%
Prove it!
%
\end{exercise}
As soon as different types of the agent have different beliefs about the signal
(i.e. the signal is informative about type),
contingent payments can be used to provide additional incentives.
The reason is that different types then have different preferences over `side bets' (monetary gambles on the realisation of the signal),
which can be exploited to induce self-selection.
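Here is a minimal sketch of such a side bet (all numbers hypothetical): with two signal realisations and two types whose beliefs differ, we can find a bet $z$ with $z \cdot \mu(t) = 0$; adding a large multiple of $z$ to type $t$'s payment leaves a truthful $t$'s expected payment unchanged, while making it arbitrarily costly for the other type to mimic $t$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

mu_t = (0.8, 0.2)   # belief of type t about the two signal realisations
mu_r = (0.2, 0.8)   # belief of some other type r

# A side bet with zero expected cost under mu_t:
z = (-0.2, 0.8)
assert abs(dot(z, mu_t)) < 1e-12

# Under mu_r the same bet has expected cost 0.6 > 0, and scaling the bet
# scales that cost: mimicking t can be made arbitrarily unattractive for r,
# at no expected cost to a truthful t.
for scale in (1.0, 10.0, 100.0):
    bet = tuple(scale * x for x in z)
    assert abs(dot(bet, mu_t)) < 1e-9
    assert dot(bet, mu_r) > 0.5 * scale
print(round(dot(z, mu_r), 9))  # 0.6
```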
The power of side bets is most clearly illustrated
in the case in which \emph{all} types have `separate' beliefs.
The appropriate notion of `separate' is slightly stronger than `distinct' ($\mu(r) \neq \mu(t)$ for all $r,t \in \mathcal{T}$):