diff --git a/Makefile b/Makefile
index d81adf3..c9f75a4 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 .SUFFIXES = .tex .bib .aux .bbl .dvi .ps .pdf
 
-all: octokey.pdf
+all: octokey.pdf pass15.pdf
 
 octokey.pdf: octokey.bbl
 	pdflatex octokey
@@ -12,5 +12,15 @@ octokey.bbl: references.bib octokey.aux
 octokey.aux: *.tex
 	pdflatex octokey
 
+pass15.pdf: pass15.bbl
+	pdflatex pass15
+	pdflatex pass15
+
+pass15.bbl: references.bib pass15.aux
+	bibtex pass15
+
+pass15.aux: *.tex
+	pdflatex pass15
+
 clean:
 	rm -f *.{log,aux,out,bbl,blg,dvi,ps,pdf}
diff --git a/pass15.tex b/pass15.tex
new file mode 100644
index 0000000..a8976eb
--- /dev/null
+++ b/pass15.tex
@@ -0,0 +1,322 @@
+\documentclass{llncs}
+%\usepackage[utf8]{inputenc}
+\usepackage{amsmath} % for \mod
+\usepackage[hyphens]{url}
+%\usepackage{doi}
+\usepackage{hyperref}
+%\usepackage[hyphenbreaks]{breakurl} % Fix URL line breaking when using dvips (e.g. arxiv.org)
+
+\newcommand*{\concat}{\mathbin{\|}}
+\hyphenation{time-stamp}
+
+\begin{document}
+\title{Strengthening Public Key Authentication against Key Theft}
+\subtitle{Short Paper}
+\author{Martin Kleppmann\inst{1} \and Conrad Irwin\inst{2}}
+\institute{
+  \email{martin@kleppmann.com} \and \email{conrad.irwin@gmail.com}
+}
+\maketitle
+
+\begin{abstract}
+Authentication protocols based on an asymmetric keypair (e.g.\ SSH public key authentication, TLS
+client certificates, FIDO UAF and U2F) can provide strong authentication provided that the private
+key is adequately protected. Use of dedicated cryptographic hardware helps, but does not solve all
+risks of key theft. In this paper we discuss algorithms for further protecting private key material
+against theft, based on mediated RSA (mRSA) signatures. We show how users can revoke lost or stolen
+devices and provision new devices without relying on a trusted authority. When private key material
+is encrypted with a password, we show how to prevent offline brute-force attacks using a
+zero-knowledge proof.
+\end{abstract}
+
+\section{Public Key Authentication}\label{sec:intro}
+
+In a public key authentication system, each username $r$ is associated with a public key. For
+example, when RSA~\cite{RSA} is used,\footnote{In this paper we focus on RSA. We hope to extend our
+approach to support other public-key cryptosystems such as ECC in future work.} a user's public key
+$(n, e)$ consists of the modulus $n$ and the public exponent $e$. A service that needs to
+authenticate users may store a set of known public keys for a given username $r$, or it may rely on
+a certificate authority (CA) to associate usernames with public keys.
+
+Whenever a user wishes to log in, they must prove ownership of the corresponding private key
+$(n, d)$, where $n$ is the same modulus as in the public key, and $d$ is the private exponent. This
+ownership proof is often implemented by constructing an authentication request (consisting of the
+username, a session identifier or challenge, and other properties), signing it on the client using
+the private key, and verifying the signature in the service. Variations of this pattern are used in
+SSH~\cite{SSH}, TLS client certificates~\cite{TLS}, and FIDO U2F~\cite{FIDOOverview}.
+
+In this paper we focus on the computation of the signature using an RSA private key. For clarity, we
+omit full protocol details, and describe a simple abstract protocol for website authentication. Our
+technique can be adapted to operate within any of the aforementioned protocols.
+
+\subsection{Constructing a Signature}\label{sec:mandate}
+
+To log in or sign up to a service, the user's client first requests a challenge $c$ from the
+service. It then calculates the RSA signature $s$:
+\begin{equation}
+s = m^d = H(c \concat u \concat r)^d \mod n
+\end{equation}
+where $u$ is the URL of the service, $r$ is the username, and $(n, d)$ is the private key. The
+symbol $\concat$ denotes encoding and concatenating the values into a byte string.
$H$ is shorthand
+for the \textsc{EMSA-PSS-Encode} operation (hashing and padding) defined in PKCS\#1~\cite{PKCS1}.
+
+The client then constructs the \emph{mandate}, which combines the RSA-signed message and the user's
+public key:
+\begin{equation}
+\mathit{mandate} = s \concat c \concat u \concat r \concat n \concat e \enspace.
+\end{equation}
+
+The mandate is sent to the server over TLS.\footnote{A channel binding~\cite{ChannelBinding} or
+Origin-Bound Certificate~\cite{Dietz12} of this TLS connection may be incorporated into the
+signature, e.g.\ encoded in the challenge $c$.} The server can verify the mandate by checking that
+$s$ is a valid PKCS\#1 signature, $c$ and $u$ are valid for this service, and that $(n, e)$ is an
+acceptable public key for user $r$.
+
+\subsection{Human-to-Machine Authentication}\label{sec:human-to-machine}
+
+The protocol of Sect.~\ref{sec:mandate} is a machine-to-machine authentication protocol, and it
+needs to be preceded by a human-to-machine authentication step: for example, a password or biometric
+information can be used by the client device to unlock or decrypt the private key.
+
+We assume that the human-to-machine authentication step is weaker than a cryptographic signature
+(e.g.\ due to using a weak encryption password), and that it can feasibly be broken by an attacker
+if the device storing the private key is lost or compromised. Thus, the goal of human-to-machine
+authentication is only to delay an attacker for long enough that the user has enough time to revoke
+the compromised device's key (see Sect.~\ref{sec:management}).
+
+In Sect.~\ref{sec:ratelimit} we discuss a technique for strengthening the human-to-machine
+authentication step.
+
+\section{Key Management}\label{sec:management}
+
+If the device storing the private key is lost or stolen, the user needs a mechanism for revoking it.
+This raises the question: how can the system ensure that only the legitimate owner of the key may
+revoke it (to prevent denial of service), in the absence of a key identifying the user (since it has
+been lost)? Various approaches have been proposed:
+
+\begin{itemize}
+\item If the user's identity was originally established out-of-band by a CA, the same process can be
+used to confirm that the revocation request is genuine, and the CA can add the user's certificate to
+a revocation list (CRL).
+\item A separate revocation key, perhaps stored offline on paper, can be used. However, this key
+would also be prone to loss as it is only rarely needed.
+\end{itemize}
+
+In this section we discuss a user-friendly approach for revoking lost devices and enrolling new
+devices that does not depend on a CA. It is based on the assumption that users have multiple devices
+(e.g.\ laptop, smartphone, tablet, game console) on which they access services.
+
+\subsection{Key Revocation}\label{sec:revocation}
+
+To mitigate the risk of key theft, we ensure that the private exponent $d$ is never stored on any
+one device, even in encrypted form. Instead, we split it into key fragments that are distributed
+among the user's devices. We use the \emph{mediated RSA} (mRSA) scheme~\cite{Boneh01,Kutyiowski12},
+which is based on the fact that
+\begin{equation}
+s = m^d = m^{d_a + d_b} = m^{d_a} m^{d_b} \mod n
+\end{equation}
+provided that $d = d_a + d_b \mod \phi(n)$.
+
+If two devices $a$ and $b$ store key fragments $d_a$ and $d_b$ respectively, and those
+fragments sum to the private exponent $d$, then we call those devices \emph{paired}. ($d$ could be
+split into any number of fragments $f$, but we focus on the case $f=2$.) In order to
+generate a valid signature, any two paired devices need to collaborate.
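+
+As a toy illustration of the mRSA identity, consider the following worked example. The parameters
+are chosen here purely for exposition: they are far too small to be secure, and the padding step
+$H$ is omitted. Take $n = 33$, $e = 3$ and $d = 7$, so that $\phi(n) = 20$ and
+$ed = 21 = 1 \mod \phi(n)$. One possible split is $d_a = 15$, $d_b = 12$, since
+$15 + 12 = 27 = 7 \mod 20$. For $m = 2$:
+\begin{align}
+  m^{d_a} &= 2^{15} = 32 \mod 33 \\
+  m^{d_b} &= 2^{12} = 4 \mod 33 \\
+  m^{d_a} m^{d_b} &= 32 \cdot 4 = 128 = 29 \mod 33 \\
+  m^d &= 2^7 = 128 = 29 \mod 33
+\end{align}
+Both routes give $s = 29$, and verification with the public key succeeds because
+$s^e = 29^3 = 2 = m \mod 33$.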
+
+If device $a$ wants to generate a mandate, it can send a signing request $\mathit{req}$ to device $b$:
+\begin{equation}
+\mathit{req} = H(c \concat u \concat r) \concat n \concat e
+\end{equation}
+where the public key $(n, e)$ indicates which key should be used, in case device $b$ stores multiple
+keys. Device $b$ then uses its key fragment $d_b$ to calculate a response:
+\begin{equation}
+\mathit{resp} = H(c \concat u \concat r)^{d_b} = m^{d_b} \mod n
+\end{equation}
+and returns $\mathit{resp}$ to $a$. Now, $a$ can calculate the signature $s$:
+\begin{equation}
+s = H(c \concat u \concat r)^{d_a} \cdot \mathit{resp} = m^{d_a} m^{d_b} \mod n \enspace,
+\end{equation}
+construct a mandate with a valid signature, and thus log in.
+
+If a device is lost, stolen or compromised, this scheme allows the user to revoke that device's
+login capability: every device that is paired with the lost device must be instructed to delete the
+key fragment from the pairing with the lost device. If the user physically controls all devices that
+are paired with the lost device, this can simply be done via the user interface. When all the paired
+fragments have been deleted, the key fragments on the lost device become useless.
+
+\subsection{The Mediator Service}\label{sec:mediator}
+
+Splitting a key across two physical devices provides limited benefit: a user must carry both devices
+with them, and if both are stolen at the same time, the revocation capability is lost. However,
+there is a simple solution: one of the user's `devices' may be a remote service on the internet,
+which we call the \emph{mediator}. This service stores key fragments that are paired with each of
+the user's physical devices, and responds to signing requests by performing the modular
+exponentiation using its key fragments. This allows a user to authenticate with services using only
+one physical device -- the coordination with the mediator happens automatically behind the scenes.
+
+When the user requires a device to be revoked, they must authenticate the revocation request from
+one of their other devices (see Sect.~\ref{sec:ratelimit} for an algorithm). This implies that a
+user must pair at least two physical devices with the mediator, so that the remaining device can
+revoke a lost device. A paper print-out of the key can serve as a last resort in case all devices
+are lost or destroyed.
+
+The mediator need only be partially trusted. It cannot authenticate as the user without the
+cooperation of one of the user's physical devices. The user only needs to trust the mediator to not
+collude with attackers who steal devices, and to correctly delete key fragments when the user
+requires key revocation. The user's privacy is protected by hashing the message
+$c \concat u \concat r$ before sending it to the mediator, so the mediator does not learn which
+services the user is logging in to, or which usernames they are using.
+
+From the point of view of a service that uses public key authentication, the mediator does not even
+exist: a service simply verifies the RSA signature on a mandate, and does not care how that
+signature was constructed. This is in contrast to federated login systems such as OpenID, where the
+relying party must trust the identity provider.
+
+% TODO section on enrolling new devices
+
+\section{Rate Limiting Password Guesses}\label{sec:ratelimit}
+
+Besides enabling key revocation, mRSA can also be used to strengthen the human-to-machine
+authentication step against offline attacks.
+
+For example, say the key fragment on a device is encrypted with a symmetric key derived from a
+password. Consider an attacker who has stolen this encrypted fragment. In order to brute-force the
+password, the attacker needs a way of determining whether a password guess is correct.
However, a
+key fragment is just a uniformly distributed random number; by itself, the correctly decrypted key
+fragment is almost indistinguishable from the garbage that results from attempting to decrypt with
+the wrong password (see Sect.~\ref{sec:fragment-encryption}).
+
+Assuming the attacker has no other key fragments, they can only determine whether the password guess
+was correct by communicating with the mediator and testing whether they are able to construct a
+valid signature. This gives us an opportunity to rate-limit password guessing attempts: if the
+mediator receives too many requests based on an incorrect password, it can block further attempts
+and advise the user to revoke the device pairing. Similar ideas have been used to strengthen key
+agreement protocols against weak passwords~\cite{Bellovin92}.
+
+In order to achieve this, we must design the protocol as a zero-knowledge proof, such that an
+attacker must communicate with the mediator for every password guess, but without revealing the
+password or the decrypted key fragment to the mediator. An algorithm is described in
+Sect.~\ref{sec:mediator-auth}.
+
+\subsection{Authenticating Requests to the Mediator}\label{sec:mediator-auth}
+
+Say the key fragment $d_a$ has been encrypted with password $\mathit{pass}$, and the attacker has
+stolen the encrypted fragment $\mathit{efrag}$:
+\begin{equation}
+\mathit{efrag} = \mathrm{encrypt}(\mathrm{PBKDF2}(\mathit{pass}), d_a) \enspace.
+\end{equation}
+The attacker now guesses $\mathit{pass}^\prime$ and computes a guess $d_a^\prime$ of the plaintext:
+\begin{equation}
+d_a^\prime = \mathrm{decrypt}(\mathrm{PBKDF2}(\mathit{pass}^\prime), \mathit{efrag}) \enspace.
+\end{equation}
+
+To check whether $d_a^\prime = d_a$ the attacker needs to contact the mediator where $d_b$ is held.
+We modify the mediator's request processing as follows:
+
+\begin{enumerate}
+\item In addition to the signing request $\mathit{req}$, the client is required to submit a
+signature $s_\mathit{req}$:
+\begin{align}
+  \mathit{req} &= H(c \concat u \concat r) \concat n \concat e \\
+  s_\mathit{req} &= H(\mathit{req} \concat \mathit{cb})^{d_a^\prime} \mod n
+\end{align}
+where $\mathit{cb}$ is the \texttt{tls-unique} channel binding~\cite{ChannelBinding}
+of the TLS connection between the client and the mediator.
+\item Using the channel binding $\mathit{cb}^\prime$ of the TLS connection's server side, the
+mediator computes
+\begin{equation}
+s_\mathit{req} \cdot H(\mathit{req} \concat \mathit{cb}^\prime)^{d_b} =
+    H(\mathit{req} \concat \mathit{cb})^{d_a^\prime} \cdot
+    H(\mathit{req} \concat \mathit{cb}^\prime)^{d_b} \mod n
+\end{equation}
+and checks whether the result is a valid PKCS\#1 signature of
+$\mathit{req} \concat \mathit{cb}^\prime$ for the user's public key $(n, e)$. This check succeeds if
+$d_a^\prime = d_a$ (i.e.\ the user's password was correct), and if $\mathit{cb}^\prime = \mathit{cb}$
+(preventing MITM and replay attacks).
+\item If the signature is valid, the mediator computes
+\begin{equation}
+\mathit{resp} = H(c \concat u \concat r)^{d_b} \mod n
+\end{equation}
+as before, and returns it to the client. If the signature is not valid, the mediator returns ``bad
+signature''. A password-guessing attacker learns that the password guess $\mathit{pass}^\prime$ was
+incorrect, but otherwise nothing is revealed that would help them guess the password.
+\end{enumerate}
+
+Note that although the mediator computes an RSA signature using the user's private key, the value
+being signed ($\mathit{req} \concat \mathit{cb}$) cannot be used to construct a mandate, so the
+mediator cannot log in to services on the user's behalf.
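+
+The following sketch may help to show why the check in step 2 behaves as described. Write
+$h = H(\mathit{req} \concat \mathit{cb}^\prime)$ (a symbol introduced here only for brevity), and
+assume that $\mathit{cb}^\prime = \mathit{cb}$ and that $h$ is invertible modulo $n$. The mediator's
+product is then
+\begin{equation}
+h^{d_a^\prime} \cdot h^{d_b} = h^{d_a^\prime + d_b} = h^{d} \cdot h^{d_a^\prime - d_a} \mod n \enspace,
+\end{equation}
+using $d = d_a + d_b \mod \phi(n)$. If $d_a^\prime = d_a$, the second factor is 1 and the product is
+exactly the signature $h^d$. For a wrong guess, the extra factor $h^{d_a^\prime - d_a}$ generally
+differs from 1, so verification against $(n, e)$ fails.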
+
+This protection against password guessing only works if the attacker does not have any knowledge of
+previous requests to the mediator. If the attacker knows $x^{d_a} \mod n$ (a request) or
+$x^{d_b} \mod n$ (a response) for any $x$, they can brute-force the password without contacting the
+mediator, and thus circumvent the rate-limiting. It is therefore important that communication with
+the mediator is protected from eavesdropping (using TLS) and is not logged on the device.
+
+\subsection{Key Fragment Encryption}\label{sec:fragment-encryption}
+
+The method described in Sect.~\ref{sec:ratelimit} for rate-limiting password guesses depends on a
+correctly decrypted key fragment being indistinguishable from the result of decrypting with an
+incorrect password guess, unless the mediator is contacted. In this section we propose an
+encryption scheme which satisfies that requirement.
+
+We first derive an encryption key from the password using a slow, memory-hard key derivation
+function such as scrypt~\cite{Percival09}. The parameters of the key derivation function (salt, cost
+parameter, pseudorandom function used, etc.) are stored in cleartext. We then generate a key stream
+using a symmetric block cipher such as AES-128 in CTR mode~\cite{Lipmaa00}.
+
+Let $k$ be the minimum number of bits required to encode the RSA modulus $n$ (i.e.\ the RSA key
+length). To encrypt the key fragment $d_a$, we first encode it as a $k$-bit string, using zeros for
+the most significant bits if necessary. We then take the first $k$ bits of the AES-CTR key stream
+and XOR them with the $d_a$ bit string:
+\[\mathit{efrag} = \mathit{ctr} \concat
+    (\mathrm{AESCTR}(\mathit{ctr}, \mathrm{scrypt}(\mathit{pass}))_{\{0 \dots k-1\}} \oplus d_a)\]
+where $\mathit{ctr}$ is a 128-bit random nonce that is incremented by AESCTR for each subsequent
+block of key stream.
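+
+Decryption mirrors encryption, because XOR with the same key stream is its own inverse. As a sketch,
+writing $\mathit{ct}$ for the $k$ bits of $\mathit{efrag}$ that follow $\mathit{ctr}$ (a name
+introduced here for illustration only), a password guess $\mathit{pass}^\prime$ yields
+\[d_a^\prime = \mathrm{AESCTR}(\mathit{ctr}, \mathrm{scrypt}(\mathit{pass}^\prime))_{\{0 \dots k-1\}}
+    \oplus \mathit{ct}\]
+which equals $d_a$ when $\mathit{pass}^\prime = \mathit{pass}$, and is otherwise a pseudo-random
+$k$-bit value.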
+
+Any attempt to decrypt the key fragment results in a uniformly distributed pseudo-random number
+between 0 and $2^k$, whereas the correct key fragment is uniformly distributed between 0 and $d$.
+Since $d < 2^k$, a password guess that results in a larger decrypted value is less likely to be
+correct than a password guess that results in a smaller decrypted value. A password-guessing
+attacker can use this knowledge to prioritize guesses, but they cannot entirely rule out guesses
+without contacting the mediator.
+
+To quantify the bias, we repeatedly generated 2048-bit RSA keys using OpenSSL. Approximately 90\% of
+private exponents were in the range $0.05 < 2^{-k} d < 0.8$, with a fairly uniform distribution
+within that range. When key fragments $d_a$ (chosen uniformly from $[0, d]$) were encoded in $k$
+bits, they had on average 2.8 high-order zero bits, and the top bit was zero in 94\% of key
+fragments.
+
+% Although passwords are the prevalent authentication mechanism on the web today, there are some
+% niches in which public key authentication systems have been successfully adopted. For example:
+%
+% \begin{itemize}
+% \item Remote SSH access to servers (TODO citation) is often authenticated with a DSA, ECDSA or
+% RSA signature. The user's public key is added to an \verb'authorized_keys' file on the
+% server through an out-of-band process (e.g.\ by another user who already has access). When a
+% user wishes to log in, the private key on the user's client machine is used to sign the SSH
+% session ID, and the server verifies the signature using the list of authorized keys.
+% \item TLS client certificates~\cite{TLS} are used in some countries for authenticating tax
+% returns and access to public services~\cite{Parsovs14}. A keypair is associated with a user
+% identifier by a certificate authority (CA), and a TLS server advertises the CAs from which
+% it accepts client certificates.
A TLS client can authenticate to the server by signing a
+% digest of the TLS key exchange messages with its private key, and sending its public key and
+% certificate to the server.
+% \item FIDO UAF and U2F use hardware devices for user authentication in web applications (UAF
+% replaces password authentication, and U2F augments password authentication with a second
+% factor). (TODO citation)
+% TODO Is OATH (soft token protocol) symmetric crypto? What about smart cards?
+% \end{itemize}
+
+% In public key authentication systems, a user proves ownership of a private key to a service by
+% generating a digital signature. If the service already knows the user's public key, or if a
+% certificate from a trusted authority associates the public key with a user identifier, then the
+% service can confirm that a request was made by the legitimate user.
+
+% If one of a user's devices is lost or stolen, it is desirable for the user to be able to revoke
+% the keys of that particular device (using one of their other devices), without affecting the
+% validity of keys on their other devices. This implies that different devices must use different
+% key material.
+
+\bibliographystyle{splncs03}
+\bibliography{references}{}
+
+\end{document}