Tests and results

Experiments

I tried to use the MTProto protocol during the development. The protocol is designed for access to a server API from applications running on mobile devices. It must be emphasized that a web browser is not such an application.

The protocol is subdivided into three virtually independent components:

High-level component (API query language): defines the method whereby API queries and responses are converted to binary messages.
Cryptographic (authorization) layer: defines the method by which messages are encrypted prior to being transmitted through the transport protocol.
Transport component: defines the method for the client and the server to transmit messages over some other existing network protocol (such as HTTP, HTTPS, TCP, UDP).

High-Level Component (RPC Query Language/API)

From the standpoint of the high-level component, the client and the server exchange messages inside a session. The session is attached to the client device (the application, to be more exact) rather than a specific http/https/tcp connection. In addition, each session is attached to a user key ID by which authorization is actually accomplished.

Several connections to a server may be open; messages may be sent in either direction through any of the connections (a response to a query is not necessarily returned through the same connection that carried the original query, although most often, that is the case; however, in no case can a message be returned through a connection belonging to a different session). When the UDP protocol is used, a response might be returned by a different IP address than the one to which the query had been sent.Authorization and Encryption

Prior to a message (or a multipart message) being transmitted over a network using a transport protocol, it is encrypted in a certain way, and an external header is added at the top of the message which is: a 64-bit key identifier (that uniquely identifies an authorization key for the server as well as the user) and a 128-bit message key. A user key together with the message key defines an actual 256-bit key which is what encrypts the message using AES-256 encryption. Note that the initial part of the message to be encrypted contains variable data (session, message ID, sequence number, server salt) that obviously influences the message key (and thus the AES key and iv). The message key is defined as the 128 middle bits of the SHA256 of the message body (including session, message ID, etc.), including the padding bytes, prepended by 32 bytes taken from the authorization key. Multipart messages are encrypted as a single message.

For a technical specification, see Mobile Protocol: Detailed Description

The first thing a client application must do is create an authorization key which is normally generated when it is first run and almost never changes.

The protocol’s principal drawback is that an intruder passively intercepting messages and then somehow appropriating the authorization key (for example, by stealing a device) will be able to decrypt all the intercepted messages post factum. This probably is not too much of a problem (by stealing a device, one could also gain access to all the information cached on the device without decrypting anything); however, the following steps could be taken to overcome this weakness:

Session keys generated using the Diffie-Hellman protocol and used in conjunction with the authorization and the message keys to select AES parameters. To create these, the first thing a client must do after creating a new session is send a special RPC query to the server (“generate session key”) to which the server will respond, whereupon all subsequent messages within the session are encrypted using the session key as well.
Protecting the key stored on the client device with a (text) password; this password is never stored in memory and is entered by a user when starting the application or more frequently (depending on application settings).
Data stored (cached) on the user device can also be protected by encryption using an authorization key which, in turn, is to be password-protected. Then, a password will be required to gain access even to those data.

There are several types of messages:

RPC calls (client to server): calls to API methods
RPC responses (server to client): results of RPC calls
Message received acknowledgment (or rather, notification of status of a set of messages)
Message status query
Multipart message or container (a container that holds several messages; needed to send several RPC calls at once over an HTTP connection, for example; also, a container may support gzip).

From the standpoint of lower level protocols, a message is a binary data stream aligned along a 4 or 16-byte boundary. The first several fields in the message are fixed and are used by the cryptographic/authorization system.

Each message, either individual or inside a container, consists of a message identifier (64 bits, see below), a message sequence number within a session (32 bits), the length (of the message body in bytes; 32 bits), and a body (any size which is a multiple of 4 bytes). In addition, when a container or a single message is sent, an internal header is added at the top (see below), then the entire message is encrypted, and an external header is placed at the top of the message (a 64-bit key identifier and a 128-bit message key).

A message body normally consists of a 32-bit message type followed by type-dependent parameters. In particular, each RPC function has a corresponding message type. For more detail, see Binary Data Serialization, Mobile Protocol: Service Messages.

All numbers are written as little endian. However, very large numbers (2048-bit) used in RSA and DH are written in the big endian format because that is what the OpenSSL library does.

Authorization and Encryption

Prior to a message (or a multipart message) being transmitted over a network using a transport protocol, it is encrypted in a certain way, and an external header is added at the top of the message which is: a 64-bit key identifier (that uniquely identifies an authorization key for the server as well as the user) and a 128-bit message key. A user key together with the message key defines an actual 256-bit key which is what encrypts the message using AES-256 encryption. Note that the initial part of the message to be encrypted contains variable data (session, message ID, sequence number, server salt) that obviously influences the message key (and thus the AES key and iv). The message key is defined as the 128 middle bits of the SHA256 of the message body (including session, message ID, etc.), including the padding bytes, prepended by 32 bytes taken from the authorization key. Multipart messages are encrypted as a single message.

For a technical specification, see Mobile Protocol: Detailed Description

The first thing a client application must do is create an authorization key which is normally generated when it is first run and almost never changes.

The protocol’s principal drawback is that an intruder passively intercepting messages and then somehow appropriating the authorization key (for example, by stealing a device) will be able to decrypt all the intercepted messages post factum. This probably is not too much of a problem (by stealing a device, one could also gain access to all the information cached on the device without decrypting anything); however, the following steps could be taken to overcome this weakness:

Session keys generated using the Diffie-Hellman protocol and used in conjunction with the authorization and the message keys to select AES parameters. To create these, the first thing a client must do after creating a new session is send a special RPC query to the server (“generate session key”) to which the server will respond, whereupon all subsequent messages within the session are encrypted using the session key as well.
Protecting the key stored on the client device with a (text) password; this password is never stored in memory and is entered by a user when starting the application or more frequently (depending on application settings).
Data stored (cached) on the user device can also be protected by encryption using an authorization key which, in turn, is to be password-protected. Then, a password will be required to gain access even to those data.

Beginning

PackTrack-Api was started 3 mounths ago to replace insecure chats. The unique existent network is running on a single server named chat.thelinuxsect.com.

Results

PAckTrack-Api uses http, wich is an open protocol that uses TCP and, optionally, TLS. An App client can connect to other the Api server to expand the PackTrack network. Users access Api networks by connecting a client to a server. There are many client implementations, such as PackTrack-App or PackTrack-Cli. The server do not require users to register an account but a user (username) is required before being connected.

It was originally a plain text protocol (although later extended), which on request was assigned port 80/TCP by IANA. However, the de facto standard has always been to run the Api on 443/TCP and nearby port numbers (for example TCP ports 6660–6669, 7000) to avoid having to run the Api software with root privileges.

The protocol specified that characters were 8-bit but did not specify the character encoding the text was supposed to use. This can cause problems when users using different clients and/or different platforms want to converse.

All client-to-server protocols in use today are descended from the protocol implemented in the http2 version of the HTTP2 server, and documented in RFC 2616. Since RFC 2616 was published, the new features in the http2.10 implementation led to the publication of several revised protocol documents; however, these protocol changes have not been widely adopted among other implementations.

Although many specifications on the PackTrack-Api protocol have been published, there is no official specification, as the protocol remains dynamic. Virtually no clients and very few servers rely strictly on the above RFCs as a reference.

The standard structure of a network of PackTrackApi servers is a tree. Messages are routed along only necessary branches of the tree but network state is sent to every server and there is generally a high degree of implicit trust between servers. This architecture has a number of problems. A misbehaving or malicious server can cause major damage to the network and any changes in structure, whether intentional or a result of conditions on the underlying network, require a net-split and net-join. This results in a lot of network traffic and spurious quit/join messages to users and temporary loss of communication to users on the splitting servers. Adding a server to a large network means a large background bandwidth load on the network and a large memory load on the server. Once established however, each message to multiple recipients is delivered in a fashion similar to multicast, meaning each message travels a network link exactly once. This is a strength in comparison to non-multicasting protocols such as Simple Mail Transfer Protocol (SMTP) or Extensible Messaging and Presence Protocol (XMPP).