First paper draft for Passwords'15 conference
ept committed Sep 9, 2015
1 parent c147cba commit d79cc75
Showing 2 changed files with 333 additions and 1 deletion.
12 changes: 11 additions & 1 deletion Makefile
@@ -1,6 +1,6 @@
.SUFFIXES = .tex .bib .aux .bbl .dvi .ps .pdf

all: octokey.pdf
all: octokey.pdf pass15.pdf

octokey.pdf: octokey.bbl
	pdflatex octokey
@@ -12,5 +12,15 @@ octokey.bbl: references.bib octokey.aux
octokey.aux: *.tex
	pdflatex octokey

pass15.pdf: pass15.bbl
	pdflatex pass15
	pdflatex pass15

pass15.bbl: references.bib pass15.aux
	bibtex pass15

pass15.aux: *.tex
	pdflatex pass15

clean:
	rm -f *.{log,aux,out,bbl,blg,dvi,ps,pdf}
322 changes: 322 additions & 0 deletions pass15.tex
@@ -0,0 +1,322 @@
\documentclass{llncs}
%\usepackage[utf8]{inputenc}
\usepackage{amsmath} % for \mod
\usepackage[hyphens]{url}
%\usepackage{doi}
\usepackage{hyperref}
%\usepackage[hyphenbreaks]{breakurl} % Fix URL line breaking when using dvips (e.g. arxiv.org)

\newcommand*{\concat}{\mathbin{\|}}
\hyphenation{time-stamp}

\begin{document}
\title{Strengthening Public Key Authentication against Key Theft}
\subtitle{Short Paper}
\author{Martin Kleppmann\inst{1} \and Conrad Irwin\inst{2}}
\institute{
\email{martin@kleppmann.com} \and \email{conrad.irwin@gmail.com}
}
\maketitle

\begin{abstract}
Authentication protocols based on an asymmetric keypair (e.g.\ SSH public key authentication, TLS
client certificates, FIDO UAF and U2F) can provide strong authentication provided that the private
key is adequately protected. Use of dedicated cryptographic hardware helps, but does not solve all
risks of key theft. In this paper we discuss algorithms for further protecting private key material
against theft, based on mediated RSA (mRSA) signatures. We show how users can revoke lost or stolen
devices and provision new devices without relying on a trusted authority. When private key material
is encrypted with a password, we show how to prevent offline brute-force attacks using a
zero-knowledge proof.
\end{abstract}

\section{Public Key Authentication}\label{sec:intro}

In a public key authentication system, each username $r$ is associated with a public key. For
example, when RSA~\cite{RSA} is used,\footnote{In this paper we focus on RSA. We hope to extend our
approach to support other public-key cryptosystems such as ECC in future work.} a user's public key
$(n, e)$ consists of the modulus $n$ and the public exponent $e$. A service that needs to
authenticate users may store a set of known public keys for a given username $r$, or it may rely on
a certificate authority (CA) to associate usernames with public keys.

Whenever a user wishes to log in, they must prove ownership of the corresponding private key
$(n, d)$, where $n$ is the same modulus as in the public key, and $d$ is the private exponent. This
ownership proof is often implemented by constructing an authentication request (consisting of the
username, a session identifier or challenge, and other properties), signing it on the client using
the private key, and verifying the signature in the service. Variations of this pattern are used in
SSH~\cite{SSH}, TLS client certificates~\cite{TLS}, and FIDO U2F~\cite{FIDOOverview}.

In this paper we focus on the computation of the signature using an RSA private key. For clarity, we
omit full protocol details, and describe a simple abstract protocol for website authentication. Our
technique can be adapted to operate within any of the aforementioned protocols.

\subsection{Constructing a Signature}\label{sec:mandate}

To log in or sign up to a service, the user's client first requests a challenge $c$ from the
service. It then calculates the RSA signature $s$:
\begin{equation}
s = m^d = H(c \concat u \concat r)^d \mod n
\end{equation}
where $u$ is the URL of the service, $r$ is the username, and $(n, d)$ is the private key. The
symbol $\concat$ denotes encoding and concatenating the values into a byte string. $H$ is shorthand
for the \textsc{EMSA-PSS-Encode} operation (hashing and padding) defined in PKCS\#1~\cite{PKCS1}.

The client then constructs the \emph{mandate}, which combines the RSA-signed message and the user's
public key:
\begin{equation}
\mathit{mandate} = s \concat c \concat u \concat r \concat n \concat e \enspace.
\end{equation}

The mandate is sent to the server over TLS.\footnote{A channel binding~\cite{ChannelBinding} or
Origin-Bound Certificate~\cite{Dietz12} of this TLS connection may be incorporated into the
signature, e.g.\ encoded in the challenge $c$.} The server can verify the mandate by checking that
$s$ is a valid PKCS\#1 signature, $c$ and $u$ are valid for this service, and that $(n, e)$ is an
acceptable public key for user $r$.
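
As a reminder of what the first of these checks involves, the following sketch spells out
RSASSA-PSS verification as specified in PKCS\#1 (the length parameter is omitted, and the symbol
$m'$ is introduced here for illustration only). Because the PSS encoding is randomized, the server
does not recompute $H(c \concat u \concat r)$; it recovers the encoded message from the signature
and checks it against the message:
\begin{align}
m' &= s^e \mod n \\
\textsc{EMSA-PSS-Verify}(c \concat u \concat r,\; m') &= \text{``consistent''} \enspace.
\end{align}
For an honestly generated signature, $m' = m^{de} = m \mod n$, since $de = 1 \mod \phi(n)$.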

\subsection{Human-to-Machine Authentication}\label{sec:human-to-machine}

The protocol of Sect.~\ref{sec:mandate} is a machine-to-machine authentication protocol, and it
needs to be preceded by a human-to-machine authentication step: for example, a password or biometric
information can be used by the client device to unlock or decrypt the private key.

We assume that the human-to-machine authentication step is weaker than a cryptographic signature
(e.g.\ due to using a weak encryption password), and that it can feasibly be broken by an attacker
if the device storing the private key is lost or compromised. Thus, the goal of human-to-machine
authentication is only to delay an attacker for long enough that the user has enough time to revoke
the compromised device's key (see Sect.~\ref{sec:management}).

In Sect.~\ref{sec:ratelimit} we discuss a technique for strengthening the human-to-machine
authentication step.

\section{Key Management}\label{sec:management}

If the device storing the private key is lost or stolen, the user needs a mechanism for revoking it.
This raises the question: how can the system ensure that only the legitimate owner of the key may
revoke it (to prevent denial of service), in the absence of a key identifying the user (since it has
been lost)? Various approaches have been proposed:

\begin{itemize}
\item If the user's identity was originally established out-of-band by a CA, the same process can be
used to confirm that the revocation request is genuine, and the CA can add the user's certificate to
a revocation list (CRL).
\item A separate revocation key, perhaps stored offline on paper, can be used. However, this key
would also be prone to loss as it is only rarely needed.
\end{itemize}

In this section we discuss a user-friendly approach for revoking lost devices and enrolling new
devices that does not depend on a CA. It is based on the assumption that users have multiple devices
(e.g.\ laptop, smartphone, tablet, game console) on which they access services.

\subsection{Key Revocation}\label{sec:revocation}

To mitigate the risk of key theft, we ensure that the private exponent $d$ is never stored on any
one device, even in encrypted form. Instead, we split it into key fragments that are distributed
among the user's devices. We use the \emph{mediated RSA} (mRSA) scheme~\cite{Boneh01,Kutyiowski12},
which is based on the fact that
\begin{equation}
s = m^d = m^{d_a + d_b} = m^{d_a} m^{d_b} \mod n
\end{equation}
provided that $d = d_a + d_b \mod \phi(n)$.
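
A short justification of this identity, added here as a sketch: the condition means that
$d_a + d_b = d + t\,\phi(n)$ for some integer $t$ (a symbol used only in this derivation), and by
Euler's theorem $m^{\phi(n)} = 1 \mod n$ whenever $\gcd(m, n) = 1$, which holds for all but a
negligible fraction of messages. Hence
\begin{equation}
m^{d_a} m^{d_b} = m^{d + t\,\phi(n)} = m^d \left(m^{\phi(n)}\right)^t = m^d \mod n \enspace.
\end{equation}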

If two devices $a$ and $b$ each store a key fragment $d_a$ and $d_b$ respectively, and those
fragments sum to the private exponent $d$, then we call those devices \emph{paired}. ($d$ could be
split into any number of fragments $f$, but we focus on the case $f=2$.) In order to
generate a valid signature, any two paired devices need to collaborate.

If device $a$ wants to generate a mandate, it can send a signing request $\mathit{req}$ to device $b$:
\begin{equation}
\mathit{req} = H(c \concat u \concat r) \concat n \concat e
\end{equation}
where the public key $(n, e)$ indicates which key should be used, in case device $b$ stores multiple
keys. Device $b$ then uses its key fragment $d_b$ to calculate a response:
\begin{equation}
\mathit{resp} = H(c \concat u \concat r)^{d_b} = m^{d_b} \mod n
\end{equation}
and returns $\mathit{resp}$ to $a$. Now, $a$ can calculate the signature $s$:
\begin{equation}
s = H(c \concat u \concat r)^{d_a} \cdot \mathit{resp} = m^{d_a} m^{d_b} \mod n \enspace,
\end{equation}
construct a mandate with a valid signature, and thus log in.
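
The exchange can be checked by hand on a toy key. The numbers below are illustrative assumptions
chosen for this sketch; they are far too small to be secure, and $m$ stands in for the encoded
message $H(c \concat u \concat r)$. Take $n = 33 = 3 \cdot 11$, so that $\phi(n) = 20$, with
$e = 3$ and $d = 7$, split as $d_a = 3$ and $d_b = 4$, and let $m = 2$:
\begin{align}
\mathit{resp} &= m^{d_b} = 2^4 = 16 \mod 33 \\
s &= m^{d_a} \cdot \mathit{resp} = 2^3 \cdot 16 = 128 = 29 \mod 33 \\
s^e &= 29^3 = 24389 = 2 = m \mod 33
\end{align}
The combined value equals the ordinary signature $m^d = 2^7 = 128 = 29 \mod 33$, and it verifies
under the public key $(n, e)$.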

If a device is lost, stolen or compromised, this scheme allows the user to revoke that device's
login capability: every device that is paired with the lost device must be instructed to delete the
key fragment from the pairing with the lost device. If the user physically controls all devices that
are paired with the lost device, this can simply be done via the user interface. When all the paired
fragments have been deleted, the key fragments on the lost device become useless.

\subsection{The Mediator Service}\label{sec:mediator}

Splitting a key across two physical devices provides limited benefit: a user must carry both devices
with them, and if both are stolen at the same time, the revocation capability is lost. However,
there is a simple solution: one of the user's `devices' may be a remote service on the internet,
which we call the \emph{mediator}. This service stores key fragments that are paired with each of
the user's physical devices, and responds to signing requests by performing the modular
exponentiation using its key fragments. This allows a user to authenticate with services using only
one physical device -- the coordination with the mediator happens automatically behind the scenes.

When the user requires a device to be revoked, they must authenticate the revocation request from
one of their other devices (see Sect.~\ref{sec:ratelimit} for an algorithm). This implies that a
user must pair at least two physical devices with the mediator, so that the remaining device can
revoke a lost device. A paper print-out of the key can serve as a last resort in case all devices are
lost or destroyed.

The mediator need only be partially trusted. It cannot authenticate as the user without the
cooperation of one of the user's physical devices. The user only needs to trust the mediator to not
collude with attackers who steal devices, and to correctly delete key fragments when the user
requires key revocation. The user's privacy is protected by hashing the message
$c \concat u \concat r$ before sending it to the mediator, so the mediator does not learn which
services the user is logging in to, or which usernames they are using.

From the point of view of a service that uses public key authentication, the mediator does not even
exist: a service simply verifies the RSA signature on a mandate, and does not care how that
signature was constructed. This is in contrast to federated login systems such as OpenID, where the
relying party must trust the identity provider.

% TODO section on enrolling new devices

\section{Rate Limiting Password Guesses}\label{sec:ratelimit}

Besides enabling key revocation, mRSA can also be used to strengthen the human-to-machine
authentication step against offline attacks.

For example, say the key fragment on a device is encrypted with a symmetric key derived from a
password. Consider an attacker who has stolen this encrypted fragment. In order to brute-force the
password, the attacker needs a way of determining whether a password guess is correct. However, a
key fragment is just a uniformly distributed random number; by itself, the correctly decrypted key
fragment is almost indistinguishable from the garbage that results from attempting to decrypt with
the wrong password (see Sect.~\ref{sec:fragment-encryption}).

Assuming the attacker has no other key fragments, they can only determine whether the password guess
was correct by communicating with the mediator and testing whether they are able to construct a
valid signature. This gives us an opportunity to rate-limit password guessing attempts: if the
mediator receives too many requests based on an incorrect password, it can block further attempts
and advise the user to revoke the device pairing. Similar ideas have been used to strengthen key
agreement protocols against weak passwords~\cite{Bellovin92}.

In order to achieve this, we must design the protocol as a zero-knowledge proof, such that an
attacker must communicate with the mediator for every password guess, but without revealing the
password or the decrypted key fragment to the mediator. An algorithm is described in
Sect.~\ref{sec:mediator-auth}.

\subsection{Authenticating Requests to the Mediator}\label{sec:mediator-auth}

Say the key fragment $d_a$ has been encrypted with password $\mathit{pass}$, and the attacker has
stolen the encrypted fragment $\mathit{efrag}$:
\begin{equation}
\mathit{efrag} = \mathrm{encrypt}(\mathrm{PBKDF2}(\mathit{pass}), d_a) \enspace.
\end{equation}
The attacker now guesses $\mathit{pass}^\prime$ and computes a guess $d_a^\prime$ of the plaintext:
\begin{equation}
d_a^\prime = \mathrm{decrypt}(\mathrm{PBKDF2}(\mathit{pass}^\prime), \mathit{efrag}) \enspace.
\end{equation}

To check whether $d_a^\prime = d_a$ the attacker needs to contact the mediator where $d_b$ is held.
We modify the mediator's request processing as follows:

\begin{enumerate}
\item In addition to the signing request $\mathit{req}$, the client is required to submit a
signature $s_\mathit{req}$:
\begin{align}
\mathit{req} &= H(c \concat u \concat r) \concat n \concat e \\
s_\mathit{req} &= H(\mathit{req} \concat \mathit{cb})^{d_a^\prime} \mod n
\end{align}
where $\mathit{cb}$ is the \texttt{tls-unique} channel binding~\cite{ChannelBinding}
of the TLS connection between the client and the mediator.
\item Using the channel binding $\mathit{cb}^\prime$ of the TLS connection's server side, the
mediator computes
\begin{equation}
s_\mathit{req} \cdot H(\mathit{req} \concat \mathit{cb}^\prime)^{d_b} =
H(\mathit{req} \concat \mathit{cb})^{d_a^\prime} \cdot
H(\mathit{req} \concat \mathit{cb}^\prime)^{d_b} \mod n
\end{equation}
and checks whether the result is a valid PKCS\#1 signature of
$\mathit{req} \concat \mathit{cb}^\prime$ for the user's public key $(n, e)$. This check succeeds if
$d_a^\prime = d_a$ (i.e.\ the user's password was correct), and if $\mathit{cb}^\prime = \mathit{cb}$
(preventing MITM and replay attacks).
\item If the signature is valid, the mediator computes
\begin{equation}
\mathit{resp} = H(c \concat u \concat r)^{d_b} \mod n
\end{equation}
as before, and returns it to the client. If the signature is not valid, the mediator returns ``bad
signature''. A password-guessing attacker learns that the password guess $\mathit{pass}^\prime$ was
incorrect, but otherwise nothing is revealed that would help them guess the password.
\end{enumerate}
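
To see why the check in step~2 behaves as claimed, the following sketch writes $h$ for the encoded
value and assumes that client and mediator obtain the same $h$ when
$\mathit{cb}^\prime = \mathit{cb}$; that is, $H$ is treated as deterministic here. Both $h$ and
this assumption are introduced only for illustration. Raising the mediator's product to the public
exponent gives
\begin{equation}
\left(h^{d_a^\prime} \cdot h^{d_b}\right)^e = h^{e(d_a + d_b)} \cdot h^{e(d_a^\prime - d_a)}
= h \cdot h^{e(d_a^\prime - d_a)} \mod n \enspace,
\end{equation}
which equals $h$ when $d_a^\prime = d_a$. For a wrong guess, the extra factor
$h^{e(d_a^\prime - d_a)}$ is in general not $1$, so the verification fails.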

Note that although the mediator computes an RSA signature using the user's private key, the value
being signed ($\mathit{req} \concat \mathit{cb}$) cannot be used to construct a mandate, so the
mediator cannot log in to services on the user's behalf.

This protection against password guessing only works if the attacker does not have any knowledge of
previous requests to the mediator. If the attacker knows $x^{d_a} \mod n$ (a request) or
$x^{d_b} \mod n$ (a response) for even a single known $x$, they can brute-force the password without contacting the
mediator, and thus circumvent the rate-limiting. It is therefore important that communication with
the mediator is protected from eavesdropping (using TLS) and is not logged on the device.
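
Concretely, the offline test would look as follows (a sketch; the symbols $y$ and $z$ for the
leaked values are introduced here). Given a leaked request value $y = x^{d_a} \mod n$ for a known
$x$, the attacker checks each guess $d_a^\prime$ locally:
\begin{equation}
x^{d_a^\prime} \stackrel{?}{=} y \mod n \enspace.
\end{equation}
Given instead a leaked response $z = x^{d_b} \mod n$, the attacker combines it with the guess and
verifies the result against the public key:
\begin{equation}
\left(x^{d_a^\prime} \cdot z\right)^e \stackrel{?}{=} x \mod n \enspace.
\end{equation}
Neither test involves the mediator, so the rate limit never applies.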

\subsection{Key Fragment Encryption}\label{sec:fragment-encryption}

The method described in Sect.~\ref{sec:ratelimit} for rate-limiting password guesses depends on a
correctly decrypted key fragment being indistinguishable, without contacting the mediator, from the
output of decryption with an incorrect password. In this section we propose an encryption scheme
that approximately satisfies this requirement.

We first derive an encryption key from the password using a slow, memory-hard key derivation
function such as scrypt~\cite{Percival09}. The parameters of the key derivation function (salt, cost
parameter, pseudorandom function used, etc.) are stored in cleartext. We then generate a key stream
using a symmetric block cipher such as AES-128 in CTR mode~\cite{Lipmaa00}.

Let $k$ be the minimum number of bits required to encode the RSA modulus $n$ (i.e.\ the RSA key
length). To encrypt the key fragment $d_a$, we first encode it as a $k$-bit string, using zeros for
the most significant bits if necessary. We then take the first $k$ bits of the AES-CTR key stream
and XOR them with the $d_a$ bit string:
\begin{equation}
\mathit{efrag} = \mathit{ctr} \concat
(\mathrm{AESCTR}(\mathit{ctr}, \mathrm{scrypt}(\mathit{pass}))_{\{0 \dots k-1\}} \oplus d_a)
\end{equation}
where $\mathit{ctr}$ is a 128-bit random nonce that is incremented by AESCTR for each subsequent
block of key stream.
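
For illustration, decryption with a guessed password $\mathit{pass}^\prime$ reverses the XOR.
Writing $\mathit{body}$ for the $k$ bits of $\mathit{efrag}$ that follow the nonce (a name
introduced only for this sketch):
\begin{equation}
d_a^\prime = \mathrm{AESCTR}(\mathit{ctr}, \mathrm{scrypt}(\mathit{pass}^\prime))_{\{0 \dots k-1\}}
\oplus \mathit{body} \enspace.
\end{equation}
The ciphertext as defined carries no padding, checksum or authentication tag, so every guess
yields some $k$-bit value and the decryption step alone never signals an error.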

Decrypting the key fragment with an incorrect password yields a uniformly distributed pseudo-random
number between 0 and $2^k$, whereas the correct key fragment is uniformly distributed between 0 and $d$.
Since $d < 2^k$, a password guess that results in a larger decrypted value is less likely to be
correct than a password guess that results in a smaller decrypted value. A password-guessing
attacker can use this knowledge to prioritize guesses, but they cannot entirely rule out guesses
without contacting the mediator.

To quantify the bias, we repeatedly generated 2048-bit RSA keys using OpenSSL. Approximately 90\% of
private exponents were in the range $0.05 < 2^{-k} d < 0.8$, with a fairly uniform distribution
within that range. When key fragments $d_a$ (chosen uniformly from $[0, d]$) were encoded in $k$
bits, they had on average 2.8 high-order zero bits, and the top bit was zero in 94\% of key
fragments.
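
To put these figures in perspective, here is a back-of-the-envelope sketch that uses only the
measurements above and the assumption that a wrong password yields uniformly distributed bits. A
wrong guess has a zero top bit with probability $1/2$ and, on average,
$\sum_{j=1}^{k} 2^{-j} \approx 1$ high-order zero bits. Observing the top bit of a decrypted
candidate therefore changes the odds that the guess is correct by the likelihood ratios
\begin{align}
\frac{\Pr[\text{top bit } 0 \mid \text{correct}]}{\Pr[\text{top bit } 0 \mid \text{wrong}]}
&\approx \frac{0.94}{0.5} = 1.88 \\
\frac{\Pr[\text{top bit } 1 \mid \text{correct}]}{\Pr[\text{top bit } 1 \mid \text{wrong}]}
&\approx \frac{0.06}{0.5} = 0.12 \enspace.
\end{align}
A zero top bit thus roughly doubles the odds in favour of a guess, while a one bit reduces them by
about a factor of eight: useful for ordering guesses, but not for confirming one.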

% Although passwords are the prevalent authentication mechanism on the web today, there are some
% niches in which public key authentication systems have been successfully adopted. For example:
%
% \begin{itemize}
% \item Remote SSH access to servers (TODO citation) is often authenticated with a DSA, ECDSA or
% RSA signature. The user's public key is added to an \verb'authorized_keys' file on the
% server through an out-of-band process (e.g.\ by another user who already has access). When a
% user wishes to log in, the private key on the user's client machine is used to sign the SSH
% session ID, and the server verifies the signature using the list of authorized keys.
% \item TLS client certificates~\cite{TLS} are used in some countries for authenticating tax
% returns and access to public services~\cite{Parsovs14}. A keypair is associated with a user
% identifier by a certificate authority (CA), and a TLS server advertises the CAs from which
% it accepts client certificates. A TLS client can authenticate to the server by signing a
% digest of the TLS key exchange messages with its private key, and sending its public key and
% certificate to the server.
% \item FIDO UAF and U2F use hardware devices for user authentication in web applications (UAF
% replaces password authentication, and U2F augments password authentication with a second
% factor). (TODO citation)
% TODO Is OATH (soft token protocol) symmetric crypto? What about smart cards?
% \end{itemize}

% In public key authentication systems, a user proves ownership of a private key to a service by
% generating a digital signature. If the service already knows the user's public key, or if a
% certificate from a trusted authority associates the public key with a user identifier, then the
% service can confirm that a request was made by the legitimate user.

% If one of a user's devices is lost or stolen, it is desirable for the user to be able to revoke
% the keys of that particular device (using one of their other devices), without affecting the
% validity of keys on their other devices. This implies that different devices must use different
% key material.

\bibliographystyle{splncs03}
\bibliography{references}

\end{document}
