El RIDO edited this page Apr 27, 2024 · 20 revisions

Data Types

All of the data types and JSON structures used here can be found in self-documenting JSON-LD.

Preparation

Data passed in

User created paste content. This is to be encrypted. [ref]

paste_data: UTF-8 string containing a JSON structure with the different components of the paste
paste_data_json = {
    "paste": "text content of the paste",
    "attachment": "[data URI as per RFC 2397]",
    "attachment_name": "filename.ext",
    "children": [
        "paste_id#key",
        "https://example.com/"
    ]
}
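
As a minimal sketch, the structure above could be serialized in Python as follows. The helper name `build_paste_data` is hypothetical, and leaving out unused keys is an assumption, not something the format description states.

```python
import json


def build_paste_data(text, attachment=None, attachment_name=None, children=None):
    """Serialize the paste components into the UTF-8 bytes that get encrypted."""
    data = {"paste": text}
    # Assumption: optional keys are simply left out when they are unused.
    if attachment is not None:
        data["attachment"] = attachment
        data["attachment_name"] = attachment_name
    if children:
        data["children"] = list(children)
    # ensure_ascii=False keeps non-ASCII characters as real UTF-8 bytes
    # instead of \uXXXX escapes.
    return json.dumps(data, ensure_ascii=False, separators=(",", ":")).encode("utf-8")


paste_data = build_paste_data("text content of the paste")
```

The result is a `bytes` object, which is what the later compression and encryption steps operate on.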

User set options and encryption params. This is authenticated, but not encrypted. [ref]

paste_aadata: UTF-8 string containing a JSON structure with the additional authenticated [meta] data of the paste
paste_aadata_json = [
    [
        base64(cipher_iv),
        base64(kdf_salt),
        kdf_iterations,
        kdf_keysize,
        cipher_tag_size,
        cipher_algo,
        cipher_mode,
        compression type - "zlib" or "none" (the rawdeflate library used before PrivateBin version 1.3 is not quite zlib compatible)
    ],
    format of the paste - "plaintext" or "syntaxhighlighting" or "markdown",
    open-discussion flag - 1 or 0,
    burn-after-reading flag - 1 or 0
]
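
The array above could be assembled like this. This is only a sketch: `build_paste_aadata` is a hypothetical helper, the default numbers are the ones given in the key derivation and encryption sections of this page, and the compact JSON separators are an assumption. Whatever serialization is chosen, the string must be byte-identical when encrypting and decrypting, because it is authenticated.

```python
import base64
import json


def build_paste_aadata(cipher_iv, kdf_salt, formatter="plaintext",
                       open_discussion=False, burn_after_reading=False,
                       compression="zlib"):
    """Return the adata as a Python list plus its JSON string form."""
    spec = [
        base64.b64encode(cipher_iv).decode("ascii"),
        base64.b64encode(kdf_salt).decode("ascii"),
        100000,       # kdf_iterations
        256,          # kdf_keysize in bits
        128,          # cipher_tag_size in bits
        "aes",        # cipher_algo
        "gcm",        # cipher_mode
        compression,  # "zlib" or "none"
    ]
    aadata = [spec, formatter, int(open_discussion), int(burn_after_reading)]
    # Assumption: compact separators, no whitespace between elements.
    return aadata, json.dumps(aadata, separators=(",", ":"))
```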

Comments are similar, but slightly different. [ref] [ref2]

comment_data_json = {
    "comment": "text content of the comment",
    "nickname": "comment_maker_nick"
}

# comments have a simpler aadata format too
comment_aadata_json = [
    base64(cipher_iv),
    base64(kdf_salt),
    kdf_iterations,
    kdf_keysize,
    cipher_tag_size,
    cipher_algo,
    cipher_mode,
    compression type - "zlib" or "none"
]

The paste password, if provided, is used in deriving the encryption key. The key encrypts the paste and all its comments.

paste_password: UTF-8 string

Note: Paste children may contain external URLs (PrivateBin pastes or other websites) or just paste IDs followed by the key. Neither form includes a password, so users can link to a new paste without giving access to it by changing the password during the clone.

Note: ECMAScript strings are UTF-16 encoded (this includes the contents of form fields retrieved via the DOM on an otherwise UTF-8 encoded web page) and need to be converted to UTF-8 first.

Process data

If paste_password is an empty string:

paste_key = random(32) # 32 bytes
paste_passphrase = paste_key

If a paste_password has been specified:

paste_key = random(32) # 32 bytes
paste_passphrase = paste_key + paste_password # string/byte concatenation

Note: The random(32) bytes in paste_key must be kept around until the end. Use a cryptographically secure random number generator to generate them. Encoded in base58, these bytes form the fragment part (the characters after #) of the paste URL.
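
A minimal Python sketch of both cases, using the standard library's `secrets` module as the cryptographically secure generator. The function name `make_passphrase` is hypothetical.

```python
import secrets


def make_passphrase(paste_password=""):
    """Return (paste_key, paste_passphrase) as bytes."""
    paste_key = secrets.token_bytes(32)  # 32 bytes from the OS CSPRNG
    # With an empty password the concatenation adds nothing, so both
    # cases described above collapse into a single expression.
    paste_passphrase = paste_key + paste_password.encode("utf-8")
    return paste_key, paste_passphrase
```

Keep `paste_key` separate from `paste_passphrase`: only the key ends up in the URL fragment, never the password.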

Processing of the paste_data, if compression is enabled (the default):

paste_blob = zlib.compress(paste_data)

Because of a bug in the deflate implementation used by older PrivateBin versions, a standards-conforming deflate algorithm can't be used for this step in format version 1.
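
For format version 2, the compression step might look like the sketch below. The text above writes `zlib.compress`, but it does not say whether the stream carries the zlib header or is raw deflate, so the `raw` switch is an assumption to be checked against the instance you target. Both function names are hypothetical.

```python
import zlib


def compress_paste(paste_data, compression="zlib", raw=False):
    """Compress paste_data, or return it unchanged for compression "none"."""
    if compression == "none":
        return paste_data
    # Negative wbits selects a raw deflate stream without zlib header/trailer.
    wbits = -zlib.MAX_WBITS if raw else zlib.MAX_WBITS
    compressor = zlib.compressobj(wbits=wbits)
    return compressor.compress(paste_data) + compressor.flush()


def decompress_paste(paste_blob, compression="zlib", raw=False):
    """Inverse of compress_paste; the same settings must be used."""
    if compression == "none":
        return paste_blob
    wbits = -zlib.MAX_WBITS if raw else zlib.MAX_WBITS
    return zlib.decompress(paste_blob, wbits=wbits)
```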

Key derivation (PBKDF2)

Since passwords and keys are usually too short to be usable for encryption, it is common practice to use salted key derivation to turn such low entropy input into the actual key to use during en/decryption.

kdf_salt = random(8) # 8 bytes
kdf_iterations = 100000 # was 10000 before PrivateBin version 1.3
kdf_keysize = 256 # bits of resulting kdf_key

kdf_key = PBKDF2_HMAC_SHA256(kdf_keysize, kdf_iterations, kdf_salt, paste_passphrase)
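
In Python this step maps onto `hashlib.pbkdf2_hmac` from the standard library. The sketch assumes the derivation input is the paste_passphrase from the previous step (the random key plus the optional password) and that the key size in bits is converted to bytes. `derive_key` is a hypothetical name.

```python
import hashlib


def derive_key(paste_passphrase, kdf_salt, kdf_iterations=100000, kdf_keysize=256):
    """Derive the encryption key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256",
        paste_passphrase,        # bytes: paste_key + optional password
        kdf_salt,                # bytes: random(8)
        kdf_iterations,
        dklen=kdf_keysize // 8,  # hashlib expects the length in bytes
    )
```

The salt and iteration count are not secret. They travel with the paste in its authenticated data so that a reader can repeat the derivation.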

Encryption

cipher_algo = "aes"
cipher_mode = "gcm" # was "ccm" before PrivateBin version 1.0
cipher_iv = random(16) # 128 bit
cipher_tag_size = 128

cipher_text = cipher(AES(kdf_key), GCM(cipher_iv, paste_aadata), paste_blob)

URL format of the paste

After submitting the paste (see format below) via POST request, the API will reply with a JSON string containing a status of the operation and either an error message or the id of the stored paste. You can use this to concatenate the final URL string, as follows:

privatebin_url = "https://example.com/?"
paste_id = ... # received from the API
encoded_key = base58(paste_key) # for paste_key, see above
paste_url = privatebin_url + paste_id + "#" + encoded_key

Alternatively, as of PrivateBin > 1.7, to create a URL that asks the user for confirmation before loading the paste, add a dash at the start of the fragment, directly after the #:

paste_url = privatebin_url + paste_id + "#-" + encoded_key

This second URL type is intended for burn-after-reading pastes, so that URL scanners in email or chat security systems that execute JavaScript don't trigger the deletion of the paste. Its use is optional, though: you can create burn-after-reading pastes without this URL type, or use it for regular pastes.
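
Both URL variants can be sketched in Python without external packages. The base58 alphabet used here is the Bitcoin one, which is an assumption about what `base58()` means on this page. `base58_encode` and `build_paste_url` are hypothetical helper names.

```python
# Assumption: Bitcoin-style base58 alphabet (no 0, O, I or l).
_B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data):
    """Encode bytes as base58, preserving leading zero bytes."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = _B58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by the first alphabet character.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return _B58_ALPHABET[0] * leading_zeros + encoded


def build_paste_url(privatebin_url, paste_id, paste_key, confirm=False):
    """Build the paste URL; confirm=True yields the "#-" variant."""
    fragment = ("-" if confirm else "") + base58_encode(paste_key)
    return privatebin_url + paste_id + "#" + fragment
```

Because the key sits in the URL fragment, browsers do not send it to the server when the paste is requested.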

Format version 2 (PrivateBin >= 1.3)

The main changes in this over version 1 are:

  • the use of a standards-conforming deflate implementation #193 and the option to turn compression off #38.
  • allow paste versioning, by including an encrypted link to another paste #255
  • increase the iterations in the used KDF to at least 10000 #350
  • proper use of adata for authenticating the meta data. Clients can be sure the server didn't change the static parts created with the paste. The dynamic parts of the meta data are stored separately.

Paste format:

{
    "v": 2,
    "adata": paste_aadata,
    "ct": base64(cipher_text),
    "meta": {
        "expire": "5min", # generated client side on paste creation, not returned by server
        "created": unix_timestamp_created, # generated server side, only returned for comments but not pastes
        "time_to_live": [seconds], # generated server side based on creation minus expiration timestamps, only returned for pastes
        "icon": [data URL] # generated on the server for every comment, returned only for comments
    }
}

The paste is JSON encoded and as such the order of properties doesn't matter. The order of list elements in arrays (i.e. the children or comments) is important and needs to be preserved.

The "meta" block is mostly filled in by the server when it responds to requests. On creation it is either absent (for comments) or contains only the "expire" value (for pastes). If it is missing, the paste is created with the server's configured default expiration setting. The server validates the format and rejects anything it detects as invalid.
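
A request body for creating a paste could then be assembled as in the sketch below. It assumes that "adata" is embedded as the parsed array rather than as a JSON string, and that "meta" can be omitted entirely when no expiration is chosen. Both points should be verified against the server. `build_paste_payload` is a hypothetical name.

```python
import base64
import json


def build_paste_payload(paste_aadata, cipher_text, expire=None):
    """Return the JSON body for creating a version 2 paste."""
    payload = {
        "v": 2,
        "adata": paste_aadata,  # assumption: the parsed array, not a string
        "ct": base64.b64encode(cipher_text).decode("ascii"),
    }
    if expire is not None:
        # Only "expire" is set by the client; the server fills in the rest.
        payload["meta"] = {"expire": expire}
    return json.dumps(payload)
```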

The meta data in "adata" and "meta" isn't encrypted for the following reasons:

  • created - Needed for comments as these need to be sorted by date on the server side to allow for (TBD) pagination. Could be useful for sorting of pastes if we ever offer an administration interface. Not really a secret, as the server knows this anyway.
  • expire - Required server side to handle expiration of pastes. When responding the server calculates the time_to_live based on this.
  • time_to_live - Calculated by the server from expire and created to check whether the paste has expired and needs to be deleted. Since the server has it anyway, it is returned to the client so the client doesn't need to repeat the calculation, which reduces the risk of an incorrect display if the client's clock isn't set correctly.
  • formatter - Required by the server to check if it supports its display (configurable option).
  • burnafterreading - Required to allow for deletion after first access.
  • opendiscussion - Required server side to know if comments are accepted for a given paste or not. When responding it is used to avoid searching for non-existing comments when they are disabled in the paste.

Format version 1 (PrivateBin <= 1.2.1)

The main difference to version 2 is the use of the RawDeflate library that isn't quite compliant with the deflate standard and with some inputs creates messages that can't be decompressed even by itself.

The key derivation also differs. If paste_password is an empty string:

paste_passphrase = base64(random(32)) # 32 bytes

If a paste_password has been specified:

paste_passphrase = base64(random(32)) + hex(sha256(paste_password))
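
For comparison with version 2, the version 1 passphrase could be built like this. This is a sketch that assumes standard base64 with padding and a lowercase hex digest. `make_v1_passphrase` is a hypothetical name.

```python
import base64
import hashlib
import secrets


def make_v1_passphrase(paste_password=""):
    """Return the format version 1 passphrase as a text string."""
    key_b64 = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
    if not paste_password:
        return key_b64
    # The password is hashed first and appended as hexadecimal text.
    digest = hashlib.sha256(paste_password.encode("utf-8")).hexdigest()
    return key_b64 + digest
```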

The paste_data is purely the paste contents, not a JSON structure:

paste_data: UTF-8 text

The meta data is not authenticated as part of the adata property and is instead part of the general meta data.

Before PrivateBin version 1.0 the cipher_mode was "ccm", from version 1.0 onwards it is "gcm". Both versions can be read, even by older PrivateBin instances.

This format uses 1000 iterations for key derivation when creating new messages, but it can read messages with a higher iteration count.

kdf_iterations = 1000
cipher_data = {"iv": cipher_iv,
               "v": 1,
               "iter": kdf_iterations,
               "ks": kdf_keysize,
               "ts": cipher_tag_size,
               "mode": cipher_mode,
               "adata": cipher_associated_data,
               "cipher": cipher_algo,
               "salt": kdf_salt,
               "ct": cipher_text}

Legacy Format (ZeroBin)

This is nearly identical to format version 1, but uses Base64.js version 1.7 which produces non-standard base64 encoding due to a faulty implementation. Pastes encoded in this format can't be read without enabling the legacy mode in PrivateBin.