
Web Security: safe communication and authentication

01es edited this page Jul 31, 2019 · 14 revisions

Web facing applications are exposed to a lot more risk than Intranet applications, and thus require a much more resilient security system, one that can withstand both eavesdropping and interrogative adversaries. This document provides a detailed overview of the security system for TG-based applications that is provided as part of the platform.

  1. Server Authentication for User Protection
  2. How HTTPS works
  3. TG Application Server over HTTPS
  4. Client Authentication for Server Protection
  5. Login and 2-factor authentication
  6. Password strength
  7. Preventing rapid-fire login attempts
  8. Reduced Sign-On instead of Single Sign-On
  9. Restoring a password
  10. Authenticators
  11. Misc

Server Authentication for User Protection

Application users should be certain that they are accessing the right application on the correct server, and that their communication with the server is secure. This basically means three things:

  • The server is authentic
  • The data integrity of the communicated information is ensured
  • The communication channel is encrypted

The HTTP protocol over TLS (Transport Layer Security, aka SSL, Secure Sockets Layer) provides a complete solution to the above requirements. Thus, the HTTPS protocol supported by the TG Application Server is the most rigorous solution to the problem of Server Authentication.

How HTTPS works

Without going too much into the details, it is beneficial to have a general understanding of HTTPS security and how it actually makes communication safe. There are two parts to the TLS mechanism. One has to do with encrypting the communication channel; the other has to do with certification, which identifies the server as the entity it claims to be. Let's start with channel encryption.

Channel encryption

The TLS clockwork employs both asymmetric and symmetric cryptography. Asymmetric cryptography uses two keys -- public and private -- for encryption/decryption of messages. If a message gets encrypted with a private key, it can only be decrypted with the corresponding public key, and vice versa. The public key, as the name suggests, gets sent to the client application (e.g. a Web Browser) via an open channel. Any eavesdropping adversary would be able to see it, and that is completely fine. The private key remains private to the server. So, unless the server is hacked, only the server should be in possession of the private key.

Thus, once the client receives the public key, any information encrypted by the client and sent back to the server can only be decrypted by the server. However, any information encrypted and sent by the server can be decrypted not only by the client, but also by an eavesdropping adversary who might have captured the public key during its initial transmission to the client.

This is where symmetric cryptography comes into play. Symmetric cryptography uses a single key for encoding/decoding, and is much faster than asymmetric cryptography at that. Therefore, upon receiving the public key, the client generates a key for symmetric cryptography, encrypts it with the server's public key and sends it back to the server. The key thus provided can only be decrypted at the server end. At this stage both client and server have a shared key and start using it to encode/decode all further communication. No eavesdropping adversary would be able to understand any of the information communicated this way.
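The hybrid scheme described above can be sketched with the JDK's standard crypto APIs. This is only an illustration of the idea, not how TLS actually negotiates keys (a real handshake involves key derivation, random nonces and authenticated modes):

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.util.Arrays;

public class HybridKeyExchange {

    // Returns true when the key recovered by the "server" matches the
    // key generated by the "client".
    public static boolean demonstrate() throws Exception {
        // Server side: generate an RSA key pair; the public key travels openly.
        KeyPairGenerator rsaGen = KeyPairGenerator.getInstance("RSA");
        rsaGen.initialize(2048);
        KeyPair serverKeys = rsaGen.generateKeyPair();

        // Client side: generate a random AES key for symmetric encryption.
        KeyGenerator aesGen = KeyGenerator.getInstance("AES");
        aesGen.init(128);
        SecretKey aesKey = aesGen.generateKey();

        // Client encrypts the AES key with the server's public key and sends it.
        Cipher rsa = Cipher.getInstance("RSA");
        rsa.init(Cipher.ENCRYPT_MODE, serverKeys.getPublic());
        byte[] wrapped = rsa.doFinal(aesKey.getEncoded());

        // Server decrypts with its private key; both sides now share the AES key
        // and can switch to fast symmetric encryption for all further traffic.
        rsa.init(Cipher.DECRYPT_MODE, serverKeys.getPrivate());
        SecretKey shared = new SecretKeySpec(rsa.doFinal(wrapped), "AES");

        return Arrays.equals(aesKey.getEncoded(), shared.getEncoded());
    }
}
```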

Server Certification

Another part of the TLS clockwork has to do with server identification, and this is where certification plays its role. As part of the HTTPS handshake between the client and server, the server presents a certificate of authenticity. This certificate has to be either signed by a Certificate Authority (CA) or self-signed.

A client such as a Web Browser has means for checking whether it can trust the presented certificate. If the certificate is signed by some CA, the client checks against its list of trusted CAs and makes a decision to trust or not to trust the presented certificate. If the certificate is self-signed, all Web Browsers reject it unless it was added to a list of exceptions (i.e. the user explicitly informed the browser to trust this certificate).

If the certificate is rejected then all communication with the server is stopped, as it could be an imposter trying to emulate the actual server that the client intends to connect to. However, if the certificate is accepted, the client carries on with the HTTPS handshake (exchanges the keys etc.).

An interesting thing about CAs is that any organisation (or person for that matter) could become a CA and start signing, for example, its own server certificates. Customers may choose to trust a particular lesser-known CA (e.g. FMS) by adding a corresponding CA certificate to their OS or browser. So, this all boils down to trust, and has nothing to do with the actual encryption of the communication channel.

Please note that each server certificate is issued for a single fully qualified domain name (FQDN). This information is used by the browser to check that the FQDN part of the URI being accessed matches the Common Name property of the presented certificate. Certificates for different servers should always be issued by signing Public/Private Keys that are unique to those servers (i.e. no server certificate should reuse the same key pair).

TG Application Server over HTTPS

Jetty, which is a Servlet Engine and an HTTP Server used as the basis for the TG Application Server, together with Restlet (a framework used for implementing the Resource Oriented architecture of the TG Application Server), supports the TLS infrastructure and can be configured to enable communication over HTTPS.

As in the case of any Java application, the Java KeyStore (JKS) infrastructure is used to establish TLS. The server's Public/Private Keys and its Certificate, which are used to establish HTTPS communication, are stored in a keystore file. A keystore file should be protected by a password. Access to the Private Keys stored there is also protected by yet another password.
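Reading the key material out of a keystore file in Java could look like the following sketch; the file path, alias and both passwords are placeholders, not the platform's actual configuration:

```java
import java.io.FileInputStream;
import java.security.Key;
import java.security.KeyStore;

public class KeystoreAccess {

    // Loads a private key entry from a JKS keystore file. Two distinct
    // passwords are involved: one for the keystore file, one for the key.
    public static Key loadPrivateKey(String path, char[] storePass,
                                     String alias, char[] keyPass) throws Exception {
        KeyStore ks = KeyStore.getInstance("JKS");
        try (FileInputStream in = new FileInputStream(path)) {
            // The keystore password also verifies the integrity of the file.
            ks.load(in, storePass);
        }
        // The key password unlocks the private key entry itself
        // (returns null if no entry exists under the given alias).
        return ks.getKey(alias, keyPass);
    }
}
```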

The keystore file password is mainly used to check file integrity when accessing it, in order to make sure that it was not tampered with. It also provides an additional layer of protection in case the keystore file gets leaked or stolen. The Private Key password is there to protect the Private Keys that are stored in a keystore file. This introduces yet another security layer to protect the most essential information, which is used for establishing secure communication between the client and server software. If a keystore gets stolen and cracked, then eavesdropping adversaries in possession of this information could easily read all communication, and could even set up a fake server to obtain user names and passwords by redirecting users to it.

In most cases, the keystore file is located on the same physical server as the application server, and is protected by the underlying OS. Thus, if the server security is sound there would be no way to steal the keystore file.

Another important concern is how to handle passwords to a keystore file and Private Keys. One common anti-pattern is to store those passwords in a class file as part of the source. This leads to two possible ways for an adversary to access the passwords. Firstly, the application source might get stolen (e.g. a GitHub repo gets hacked). In this case an adversary would obtain the passwords from the source files. Secondly, packed jar files of the application server might get leaked as a result of some (even accidental) security breach (e.g. jar files got uploaded to an insecure FTP server). By decompiling classes from jars, an adversary would obtain the passwords.

The most reasonable option is to provide those passwords as application server arguments during its start up. Even keeping them in a start up script on the deployment server would be much safer than keeping them in a class file (source or compiled).

In addition, different TLS certificates should be used for development and deployment purposes, and kept in different keystore files. In fact, the deployment keystore file should only be kept at the deployment server, and it would be best if the only person who knows the passwords to that keystore and its Private Keys is the administrator of the deployment server. This way any accidental information leaks would be significantly reduced.

Client Authentication for Server Protection

The previous section describes a way to protect the user from connecting to an untrusted server and from an eavesdropping adversary listening to the data being exchanged. This, however, does not protect the server from being accessed by an interrogative adversary, who would want to access the data stored at the server side. There are no bulletproof methods against user imposters, but there are approaches that would most certainly make the system really challenging to break.

If a server needs to be protected from anonymous access, a client authentication mechanism would need to be put in place. Unfortunately, unlike in the case of HTTPS, there are no standard reliable HTTP mechanisms for user authentication. However, there are very reasonable approaches on top of HTTP that have proved to be highly reliable. The article Dos and Don'ts of Client Authentication on the Web by the MIT Laboratory for Computer Science provides really good insight into the problem of client authentication and offers one possible solution that has been well analysed from different penetration perspectives. TG employs a system based on the proposed approach, but with its own modifications that further harden the proposed schema.

The rest of the discussion is presented in several subsections, each addressing specific concerns of the developed methodology for client authentication.

Login and 2-factor authentication

It is assumed that before the first time any user attempts to login to the system, they've been registered by a system administrator and have been provided with their username. No passwords are specified for such first time users.

When users access the system, they are presented with a login prompt where they can sign in by entering their credentials or, as would be the case for first time users, choose to sign up. The sign up functionality provides users with a prompt to enter their username (provided to them by a system administrator) and a password of their choosing (see more regarding this in the next subsection).

In order to validate that the person who is signing up is in fact a valid user and not an imposter, a cryptographically random token gets sent to the user's email and should be presented by the user to complete the sign up process. As a result of a successful sign up, the user's system record gets updated with an HMAC-SHA1 hash code of the password.
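Generating such a cryptographically random token could be sketched as follows; the 256-bit length and URL-safe Base64 encoding are illustrative assumptions, not the platform's actual choices:

```java
import java.security.SecureRandom;
import java.util.Base64;

public class SignupToken {

    // SecureRandom draws from a cryptographically strong entropy source,
    // unlike java.util.Random, which is predictable and must not be used here.
    private static final SecureRandom RANDOM = new SecureRandom();

    public static String newToken() {
        byte[] bytes = new byte[32]; // 256 bits of randomness
        RANDOM.nextBytes(bytes);
        // URL-safe encoding so the token can be embedded in an emailed link.
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}
```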

Once a user is signed up, s/he is required to login explicitly. Each explicit login is done in conjunction with 2-factor authentication, where the user should enter a pin code that gets sent via SMS as part of the login procedure. The user's cell phone number is expected to be entered into the system by the system administrator who registered the user in the first place.

In case an SMS sending capability is not provided, we could always use a weaker 1.5-factor authentication, where pin codes are provided to users via their email. In case of 2-factor authentication, the first factor refers to something that users know (a password), and the second factor refers to something users own (e.g. a cell phone). A pin sent via SMS would then only be accessible to a person who is in possession of the user's cell phone. The 1.5-factor authentication is better than just 1-factor authentication (password only), but is weaker than a 2-factor approach, as it would usually be easier for an adversary to gain access to a user's email account than to their physical cell phone.

Specifically for user sign up (a one off operation), a pin code could be provided to a user by a system administrator verbally in person or over a phone call. This would count as proper 2-factor authentication.

As of May 2016, Telstra seems to be providing free and paid SMS gateways. The free option includes up to 1000 SMS per month and 100 per day. Paid options are available with more messages.

Upon login the user may choose to mark the device that is used for accessing the application as trusted. This would tell the system not to require explicit login on this device in the future. More about this capability is discussed in later sections.

Password strength

The resilience of user passwords to guessing is one of the essential security aspects. If user passwords can be easily guessed, then no matter how clever and bulletproof the rest of the application security subsystem is, an adversary would have one less barrier to break before gaining access to the application. This would be almost like having no user passwords at all. The 2-factor authentication makes it a bit harder for an adversary to use a guessed password without access to the user's phone, but through social engineering this kind of information can also be obtained. Therefore, it is best for all parts of the security puzzle to be really hard to break or guess.

There are information entropy based approaches to measuring the strength of passwords (e.g. NIST Special Publication 800-63, the Dropbox approach). And there are implementations for password strength validation from Dropbox in JavaScript and from Virginia Tech Middleware in Java.

The strength of user passwords is only checked when users sign up or change their passwords. It is important to check password strength at both the client and server sides. At the client side users should be provided with instantaneous strength estimation feedback upon entering their password. This way they know that their password is reliable and would not be rejected by the system.
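As an illustration of the entropy-based idea only (not a substitute for the zxcvbn-style or VT Middleware validators mentioned above, which also catch dictionary words and common patterns), a naive character-pool estimate might look like:

```java
public class PasswordEntropy {

    // Naive entropy estimate in bits: the size of the character pool the
    // password draws from, raised to its length, expressed as log2.
    // Real validators are far more thorough; this is an upper-bound sketch.
    public static double bits(String password) {
        int pool = 0;
        if (password.chars().anyMatch(Character::isLowerCase)) pool += 26;
        if (password.chars().anyMatch(Character::isUpperCase)) pool += 26;
        if (password.chars().anyMatch(Character::isDigit))     pool += 10;
        if (password.chars().anyMatch(c -> !Character.isLetterOrDigit(c))) pool += 33;
        if (pool == 0) return 0;
        // log2(pool^length) = length * log2(pool)
        return password.length() * (Math.log(pool) / Math.log(2));
    }
}
```

This kind of estimate is what a client-side strength meter can compute instantly as the user types, while the server repeats the check authoritatively on submission.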

Preventing rapid-fire login attempts

The success of brute force methods for guessing passwords is largely based on the number of passwords that can be fed into the system in a given timeframe. If the target application does not restrict the number of passwords that can be presented for a given user, then the password will be picked very quickly on modern hardware.

The 2-factor or even 1.5-factor authentication helps a great deal to increase the number of combinations that would need to be presented to the system during a brute force attack. However, depending on the computational power used for attacking, such an approach would not stop, but only slow down the adversary.

A better approach is to slow down the rate at which passwords can be presented to the system for a given user. For example, three unsuccessful sequential entries for the same user could result in a 1 minute delay before another password could be presented for that user. The delay may increase exponentially with the number of unsuccessful sequential tries for the same username. Such an approach makes brute force attacks impractical.
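The throttling scheme above can be sketched as follows; the threshold of three free attempts and the one-minute base delay follow the example in the text, while the doubling schedule and its cap are assumptions:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LoginThrottle {
    private static final int FREE_ATTEMPTS = 3;
    private static final long BASE_DELAY_MS = 60_000; // 1 minute

    // Consecutive failure counts per username.
    private final Map<String, Integer> failures = new ConcurrentHashMap<>();

    // Delay (ms) that must pass before the next attempt for this username.
    public long requiredDelayMs(String username) {
        int count = failures.getOrDefault(username, 0);
        if (count < FREE_ATTEMPTS) return 0;
        // Exponential growth: 1 min, 2 min, 4 min, ... (capped to avoid overflow).
        return BASE_DELAY_MS << Math.min(count - FREE_ATTEMPTS, 20);
    }

    public void recordFailure(String username) {
        failures.merge(username, 1, Integer::sum);
    }

    public void recordSuccess(String username) {
        failures.remove(username); // a successful login resets the counter
    }
}
```

Note that the counter is keyed by username rather than by client IP, since a distributed attack can come from many addresses while still targeting one account.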

Reduced Sign-On instead of Single Sign-On

A quote from a Wikipedia entry states the following about the very nature of single sign-on (SSO):

Single sign-on (SSO) is a property of access control of multiple related, but independent software systems. With this property a user logs in once and gains access to all systems without being prompted to log in again at each of them.

SSO implies a mutual trust between multiple software systems that are potentially developed by different vendors. If one of such trusted systems gets hacked, the adversary automatically gains access to all other systems in the same SSO circle of trust.

A widely adopted alternative to this quite vulnerable approach is Reduced Sign-On (RSO). The main idea behind it is to reduce the number of times users need to enter their name and password, but require the users to login explicitly (at least once) to the independent software systems separately. This approach is especially applicable for Web facing software systems, where a weakness in one trusted application would give an adversary unrestricted access to the rest of the applications on the same server.

The TG security subsystem follows the RSO principle, and employs the concept of Trusted Devices, similar to the way Google does with their services. Upon login, the user indicates whether the device (PC, tablet etc.) that is currently being used can be trusted for future sessions. If so, the session authenticator for this device would be remembered as trusted by the system, and all future attempts to access the system from trusted devices would not require the user to login explicitly again.

Security sensitive parts of the application such as user password change or email change are exempt from this rule, and require an explicit login. This provides an additional security layer to protect the user privacy and system integrity.

Untrusted devices also do not require an explicit login for an uninterrupted/continuous user session. Each request to the system after the user explicitly logs in is recognised as part of an uninterrupted session if the time between sequential requests is sufficiently short (e.g. 5 minutes). If a session from an untrusted device expires (e.g. the 5 minute timeframe has lapsed between sequential requests), the user is required to login explicitly again.

The approach with trusted and untrusted devices provides a way to accommodate the requirement for RSO. For trusted devices this provides users with convenience almost identical to SSO. And at the same time, it provides a convenient and secure way for users to access the system from devices that cannot be trusted (e.g. a computer that does not belong to the user) at times when there is no other alternative.

Restoring a password

The process of restoring a password is almost identical to the sign up procedure. By expressing the need to restore their password (i.e. via a specially designated part of the application), the user gets sent an email containing a cryptographically random token. The hash code of this token is persisted at the server side against the user, to be used for validation in the password restoration process, and has an expiration time. All previously stored user sessions are removed.

Then, the user presents their username and the obtained reset token together with the proposed new password. If the presented reset token is still valid and matches the username (and the new password is strong), then the provided password becomes the current one. As with the sign up procedure, the user needs to login explicitly after successfully resetting their password. At this stage, 2-factor authentication requires the user to enter a valid pin code sent via SMS at the time of login.

Authenticators

Upon a successful explicit login, the server provides the corresponding client with a specially designed token, which is used to automatically authenticate the user for subsequent requests after the explicit login. Such tokens are often referred to as security tokens or simply authenticators. Their main purpose is to protect the user password by removing the need to transmit it over the communication channel as part of each request. This significantly reduces the chances of a user password being stolen.

In addition, authenticators are used to designate and track user sessions with the application, and provide a way to differentiate between trusted and untrusted devices used to access it. The main difference between authenticators for trusted and untrusted devices is the duration of their validity. For untrusted devices it is very short (e.g. 2-5 minutes), and for trusted devices much longer (24 hours or more). This ensures quick expiration of user sessions on untrusted devices, requiring an explicit login after a short period of inactivity, while sessions from trusted devices can be used for days without requiring an explicit login.

Authenticator structure and lifecycle

Authenticators in TG consist of the username, series identifier, expiration time and a hash code of these three parts (HMAC-SHA1), all separated by double colons:

username :: series identifier :: expiration time :: HMAC-SHA1

The username part should correspond to the name provided by the user upon an explicit login. It is considered to be easily guessable. That is, the username of any user registered in the system could easily be obtained by an adversary by means of algorithmic generation.

The series identifier part is a cryptographically strong random string, which uniquely identifies a user session in a way that is extremely difficult to guess. It is used in addition to the username in order to make authenticators more difficult to forge, in a way explained further in the text. A new series identifier is generated for each request, replacing the one previously associated with the corresponding user. Series identifiers are persisted into the session table. However, to prevent massive identifier leaking in case of a stolen database, their HMAC codes are stored instead of the actual series identifiers. The same username can be associated with several series identifiers in case several devices are being used by the corresponding user for working with the system (e.g. a user's tablet and a workstation).

The expiration time holds the timestamp (number of milliseconds since 1970-01-01, the Unix epoch) when the session becomes stale (i.e. expires). It allows a quick check to be performed in order to identify requests against expired sessions without doing any database lookups. This is especially convenient due to the stateless nature of the TG application server.

The HMAC-SHA1 part is the hash code of the string built as a concatenation of the first three parts. It is the cornerstone of authenticator verification, ensuring that the presented authenticator is indeed the one generated by the system and was not tampered with. Hash codes for authenticators are generated using a secret key (2048 or 4096 bits long), which is known only at the server side. Therefore, if even one character in the authenticator was tampered with, it would be immediately identified.
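Building and verifying such a four-part authenticator could be sketched as follows; the hex encoding of the HMAC and the exact parsing details are assumptions about the actual TG implementation:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class Authenticator {
    private static final String SEP = "::";

    // username :: series identifier :: expiration time :: HMAC-SHA1
    public static String build(String username, String seriesId,
                               long expiryMillis, byte[] secretKey) throws Exception {
        String content = username + SEP + seriesId + SEP + expiryMillis;
        return content + SEP + hmacSha1(content, secretKey);
    }

    public static boolean verify(String authenticator, byte[] secretKey) throws Exception {
        int cut = authenticator.lastIndexOf(SEP);
        if (cut < 0) return false;
        String content = authenticator.substring(0, cut);
        String presented = authenticator.substring(cut + SEP.length());
        // Constant-time comparison of the presented and recomputed hashes.
        return MessageDigest.isEqual(
                presented.getBytes(StandardCharsets.UTF_8),
                hmacSha1(content, secretKey).getBytes(StandardCharsets.UTF_8));
    }

    private static String hmacSha1(String content, byte[] key) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(key, "HmacSHA1"));
        byte[] code = mac.doFinal(content.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : code) hex.append(String.format("%02x", b));
        return hex.toString();
    }
}
```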

As can be deduced from the above, authenticators have several mutually checked layers of sophisticated protection:

  • Series identifiers are cryptographically random and thus difficult to guess in order to forge valid authenticators. Even if authenticators get intercepted by an adversary, and are used to synthesise authenticators for users other than the intercepted ones, then due to series identifier uniqueness this would be automatically recognised as an attack against those other users. Because series identifiers are associated with usernames, it is possible to identify whose authenticators were intercepted or leaked, and a symmetrical protective action could be taken (e.g. invalidate all sessions or even disable user accounts associated with the stolen authenticators until the situation is resolved).

  • The use of information hashing as part of authenticators provides a quick and reliable way to verify the authenticity of authenticators without any database lookups. So, even if somehow series identifier/username associations are obtained and new authenticators are forged, knowledge of the secret key would be required to generate a valid hash code. In order to obtain the secret key a sophisticated server attack is required. At the same time, if the secret key does get obtained by an adversary, users would still remain relatively safe due to the use of difficult to guess series identifiers that are required to forge valid authenticators.

Safekeeping of authenticators

First of all, it is critical to understand that authenticators identify users in exactly the same way as if they were providing their username and password with each request. This means that authenticators should be protected from being stolen or leaked.

The most obvious way to steal authenticators would be eavesdropping on the communication channel. However, this is easily avoided by using HTTPS, which should be taken for granted in the case of TG-based applications. Another less obvious way is obtaining authenticators from either the server or the client side, where authenticators have to be persisted in order to be reused.

In the case of the server side, persisting authenticators is required for validation. This is especially essential for identifying stolen authenticators, as explained further in the text. Safekeeping authenticators at the server side is achieved by hashing their series identifier part before persisting. This way there is no way to reconstruct the original value, by the very nature of cryptographic hashing (hash codes cannot be decoded; they are one way encoding only). One needs to have the original value in order to obtain a corresponding hash code. So, when a client request presents a series identifier as part of its authenticator, its hash code gets computed and then compared against the persisted value. It is important to note that such an approach also protects the system from Remote Timing Attacks.
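The server-side safekeeping just described might be sketched as follows. SHA-256 is used here purely for illustration (the text above mentions HMAC codes being stored), and the constant-time comparison is what guards against remote timing attacks:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class SeriesIdStore {

    // Only this one-way hash of a series identifier is ever persisted;
    // the original value cannot be reconstructed from it.
    public static byte[] hash(String seriesId) throws Exception {
        return MessageDigest.getInstance("SHA-256")
                .digest(seriesId.getBytes(StandardCharsets.UTF_8));
    }

    // Compare a presented identifier against the persisted hash without
    // leaking timing information: isEqual examines every byte regardless
    // of where the first mismatch occurs.
    public static boolean matches(String presented, byte[] storedHash) throws Exception {
        return MessageDigest.isEqual(hash(presented), storedHash);
    }
}
```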

For a client request to be authenticated it needs to include a valid authenticator. If a client request does not have an authenticator, it results in an HTTP 401 (Unauthorized) error, and the user is presented with a login prompt. This means that once issued by the server, the authenticator has to be persisted also at the client side and be automatically attached to the request the next time the user attempts to access the server. This is achieved by means of persistent cookies.

In order to protect the cookies at the client side, mainly from being used in cross-site scripting (XSS) attacks, they should always be marked as HttpOnly cookies. This tells the browser that such a cookie should only be accessed by the originating server, and any attempt to access it from client-side scripts is strictly forbidden. More about this can be read here and here.
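Flagging the authenticator cookie could look like this; java.net.HttpCookie is used here for a self-contained illustration, whereas a servlet container would set the same flags on its own cookie class (e.g. javax.servlet.http.Cookie), and the cookie name is an assumption:

```java
import java.net.HttpCookie;

public class AuthenticatorCookie {

    public static HttpCookie create(String authenticator) {
        HttpCookie cookie = new HttpCookie("authenticator", authenticator);
        cookie.setHttpOnly(true); // inaccessible to client-side scripts (anti-XSS)
        cookie.setSecure(true);   // only ever transmitted over HTTPS
        return cookie;
    }
}
```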

What happens if an authenticator gets stolen?

Although extremely unlikely, it is still possible to envisage a situation where users may have their authenticator stolen. In this subsection we'd like to discuss the worst case scenario under this condition and what can be done to mitigate such a risk.

There are two possible cases: an authenticator stolen from an untrusted device, and one stolen from a trusted device.

Untrusted Device

Due to the very short (minutes) session life for untrusted devices, an adversary can only obtain a still valid session if s/he has access to the authenticated device immediately after the user performed the last request from that device. In this case, the adversary would have full access to the same application functionality and data as the original user. Again, this is a very unlikely event, but still possible.

In order to prevent this from happening, the user must logout from the system before they stop using an untrusted device. If this does not happen, it would only be possible to recognise a potentially unauthorised access by performing usage pattern analysis (e.g. too intensive system usage due to multiple simultaneous or near-simultaneous requests that are not typical of normal user activity, or the same user having started using the system from a different device while there is still activity associated with another session).

Trusted Device

Due to the employment of HTTPS for communication and HttpOnly cookies for storing authenticators at the client side, the only way for an adversary to obtain an authenticator from a trusted device is to have direct access to that device (e.g. the user provided their credentials to a coworker to log into that device for some work related purpose).

If an adversary keeps using the same trusted device, then the only way to identify unauthorised access is by analysing the usage pattern. However, it is more likely that an authenticator would be stolen from a trusted device and then used to access the system from a different device by presenting the stolen authenticator.

In this case, there are two possible scenarios -- the usage pattern analysis as before, or the legitimate user tries to access the system from the same trusted device that the authenticator was stolen from. The latter case can easily be identified as an attack due to the continuity property of series identifiers. Basically, if an authenticator that passes HMAC verification is presented, but its series identifier is no longer associated with the username (due to regeneration upon every request), then all user sessions should be invalidated and an explicit login required. This would affect both the legitimate user and the adversary who was accessing the system with the stolen authenticator.

Deactivating users

System administrators should have the functionality to lock user accounts, which would result in the removal of all current authenticators and would prevent the user from logging in explicitly. Users should be presented with a relevant message stating that their account has been locked and the reason, which needs to be provided by the system administrator when performing the lock.

Dormant user accounts (e.g. those not used for more than 90 days) should be automatically deactivated.

Misc

How does changing your password every 90 days increase security?

In short, it does not. However, many organisations keep practicing this so-called "good security practice". Here is a summary from this research:

Password expiration is widely practiced, owing to the potential it holds for revoking attackers’ access to accounts for which they have learned (or broken) the passwords. In this paper we present the first large-scale measurement (we are aware of) of the extent to which this potential is realized in practice. Our study is grounded in a novel search framework and an algorithm for devising a search strategy that is approximately optimal. Using this framework, we confirm previous conjectures that the effectiveness of expiration in meeting its intended goal is weak. Our study goes beyond this, however, in evaluating susceptibility of accounts to our search techniques even when passwords in those accounts are individually strong, and the extent to which use of particular types of transforms predicts the transforms the same user might employ in the future. We believe our study calls into question the continued use of expiration and, in the longer term, provides one more piece of evidence to facilitate a move away from passwords altogether.

Please also refer to this discussion at Security StackExchange.
