# EMQX File Transfer

## Basic setup

Here we're prepping the notebook environment. Nothing worth looking at.

In [79]:
import json
import requests
import pprint
import logging
import mqttft

logging.basicConfig(level=logging.INFO)
logging.getLogger().setLevel(logging.INFO)

In [96]:
! docker-compose up -d

[1A[1B[0G[?25l[+] Running 1/0
[34m ⠿ Network emqx-bridge    Created                                          0.0s
[0m[37m ⠿ Container demo-emqx-1  Starting                                         0.1s
[0m[?25h[1A[1A[1A[0G[?25l[+] Running 1/2
[34m ⠿ Network emqx-bridge    Created                                          0.0s
[0m[37m ⠿ Container demo-emqx-1  Starting                                         0.2s
[0m[?25h[1A[1A[1A[0G[?25l[+] Running 1/2
[34m ⠿ Network emqx-bridge    Created                                          0.0s
[0m[37m ⠿ Container demo-emqx-1  Starting                                         0.3s
[0m[?25h[1A[1A[1A[0G[?25l[34m[+] Running 2/2[0m
[34m ⠿ Network emqx-bridge    Created                                          0.0s
[0m[34m ⠿ Container demo-emqx-1  Started                                          0.3s
[0m[?25h

In [None]:
! docker-compose ps

Let's obtain an authorization token for the various HTTP APIs we will use later in this demo.

In [98]:
r = requests.post('http://localhost:18083/api/v5/login', json={'username': 'admin', 'password': 'passw0rd'})
auth = r.json()
authHeaders = {'authorization': f'Bearer {auth["token"]}'}
authHeaders

{'authorization': 'Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2ODA3MjIxNzIyMzMsImlzcyI6IkVNUVgifQ.DQTaHImaGr0HUfhFSkuo-4ZeOsS2iyWuoif5xBbObJ8'}

## Local filesystem backend

Let's start with the local filesystem backend. We're setting up two things at once here: segments storage and exports storage, both of which will use the local filesystem. Segments storage holds intermediate files for in-progress transfers. Exports storage holds the resulting files of all successfully completed transfers.

In [99]:
ft_config = {
    'storage': {
        'type': "local",
        'segments': {
            'root': "/opt/emqx/data/file-transfer/segments"
        },
        'exporter': {
            'type': "local",
            'root': "/opt/emqx/data/file-transfer/exports"
        }
    }
}

r = requests.put('http://localhost:18083/api/v5/configs/file_transfer',
                   headers=authHeaders,
                   json=ft_config)
pprint.pp(r.json())

{'storage': {'exporter': {'root': '/opt/emqx/data/file-transfer/exports',
                          'type': 'local'},
             'segments': {'gc': {'interval': '1h',
                                 'maximum_segments_ttl': '24h',
                                 'minimum_segments_ttl': '5m'},
                          'root': '/opt/emqx/data/file-transfer/segments'},
             'type': 'local'}}


Ok, time to upload our first small text file.

In [100]:
mqttft.transfer('demo-client', 'file-4242', 'test.txt', host='localhost', port=1883, segment_size=1024)

DEBUG:root:Sending CONNECT (u0, p0, wr0, wq0, wf0, c1, k60) client_id=b'demo-client'
DEBUG:root:Sending PUBLISH (d0, q1, r0, m1), 'b'$file/file-4242/init'', ... (114 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m2), 'b'$file/file-4242/0'', ... (1024 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m3), 'b'$file/file-4242/1024'', ... (238 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m4), 'b'$file/file-4242/fin/1262'' (NULL payload)
DEBUG:root:Received CONNACK (0, 0)
DEBUG:root:Received PUBACK (Mid: 1)
DEBUG:root:Received PUBACK (Mid: 2)
DEBUG:root:Received PUBACK (Mid: 3)
DEBUG:root:Received PUBACK (Mid: 4)
DEBUG:root:Sending DISCONNECT
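
The log above suggests the shape of the topic sequence: an `init` message, one message per segment addressed by its byte offset, and a closing `fin` message whose topic carries the total file size. A minimal sketch that reproduces those topic names follows. `segment_topics` is a hypothetical helper written for illustration, not part of `mqttft`, and the layout is inferred only from the log output.

```python
def segment_topics(file_id, file_size, segment_size):
    """Return the topics a transfer would publish to, in order.

    Hypothetical helper: the topic layout is inferred from the debug log,
    not taken from the mqttft source.
    """
    topics = [f"$file/{file_id}/init"]
    # One topic per segment, named after the segment's byte offset.
    for offset in range(0, file_size, segment_size):
        topics.append(f"$file/{file_id}/{offset}")
    # The closing message carries the total file size in its topic.
    topics.append(f"$file/{file_id}/fin/{file_size}")
    return topics


for topic in segment_topics("file-4242", 1262, 1024):
    print(topic)
```

For the 1262-byte `test.txt` with 1024-byte segments this prints the same four topics that appear in the log.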


Let's see what we have now in the local filesystem.

In [103]:
! tree run/data/file-transfer

run/data/file-transfer
├── exports
│   ├── 36
│   │   └── 26
│   │       └── 7CDF642D027E224CBAC8BC80F1EAEEA77859
│   │           └── demo-client
│   │               └── file-pdf
│   │                   ├── riaklil-en.pdf
│   │                   └── riaklil-en.pdf.MANIFEST.json
│   ├── 9A
│   │   └── 65
│   │       └── 607A7007317C32B281F7A5D32991585A23EA
│   │           └── demo-client
│   │               └── file-4242
│   │                   ├── test.txt
│   │                   └── test.txt.MANIFEST.json
│   └── tmp
└── segments
    └── demo-client
        ├── file-4242
        └── file-pdf

17 directories, 4 files


In [91]:
! cat run/data/file-transfer/exports/**/demo-client/file-4242/test.txt

In publishing and graphic design, Lorem ipsum is a placeholder text commonly
used to demonstrate the visual form of a document or a typeface without relying
on meaningful content. Lorem ipsum may be used as a placeholder before final
copy is available. It is also used to temporarily replace text in a process
called greeking, which allows designers to consider the form of a webpage or
publication, without the meaning of the text influencing the design.

Lorem ipsum is typically a corrupted version of De finibus bonorum et malorum,
a 1st-century BC text by the Roman statesman and philosopher Cicero, with words
altered, added, and removed to make it nonsensical and improper Latin.

Versions of the Lorem ipsum text have been used in typesetting at least since
the 1960s, when it was popularized by advertisements for Letraset transfer
sheets.[1] Lorem ipsum was introduced to the digital world in the mid-1980s,
when Aldus employed it in graphic and word-processing templates for its desktop
pu

In [None]:
! jq . run/data/file-transfer/exports/**/demo-client/file-4242/test.txt.MANIFEST.json

Nice!

Now it's time to transfer something bigger. A book.

In [102]:
mqttft.transfer('demo-client', 'file-pdf', 'riaklil-en.pdf', host='localhost', port=1883, segment_size=102400)

DEBUG:root:Sending CONNECT (u0, p0, wr0, wq0, wf0, c1, k60) client_id=b'demo-client'
DEBUG:root:Sending PUBLISH (d0, q1, r0, m1), 'b'$file/file-pdf/init'', ... (123 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m2), 'b'$file/file-pdf/0'', ... (102400 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m3), 'b'$file/file-pdf/102400'', ... (102400 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m4), 'b'$file/file-pdf/204800'', ... (102400 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m5), 'b'$file/file-pdf/307200'', ... (102400 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m6), 'b'$file/file-pdf/409600'', ... (102400 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m7), 'b'$file/file-pdf/512000'', ... (102400 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m8), 'b'$file/file-pdf/614400'', ... (102400 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m9), 'b'$file/file-pdf/716800'', ... (102400 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m10), 'b'$file/file-pdf/819200'', ... (102400 bytes)
DEBU

Let's query the File Transfer API this time to see what files we now have.

In [104]:
r = requests.get('http://localhost:18083/api/v5/file_transfer/files', headers=authHeaders)
files = r.json()
pprint.pp(files)

{'files': [{'clientid': 'demo-client',
            'fileid': 'file-pdf',
            'metadata': {'checksum': '71858FA34C517BE2DDF78F18F0B9D4AF59A2B7E45D9407BF6B98944B00DEB411',
                         'name': 'riaklil-en.pdf',
                         'size': 2156579},
            'name': 'riaklil-en.pdf',
            'size': 2156579,
            'timestamp': '2023-04-05T18:16:27+00:00',
            'uri': '/api/v5/file_transfer/file?node=emqx%40emqx.dev&fileref=36%2F26%2F7CDF642D027E224CBAC8BC80F1EAEEA77859%2Fdemo-client%2Ffile-pdf%2Friaklil-en.pdf'},
           {'clientid': 'demo-client',
            'fileid': 'file-4242',
            'metadata': {'checksum': '08BA54A7562D1D5678ED12E77CA1088CCD866F6A0DCD2B62EACB76E4C6BBB0D7',
                         'name': 'test.txt',
                         'size': 1262},
            'name': 'test.txt',
            'size': 1262,
            'timestamp': '2023-04-05T18:16:19+00:00',
            'uri': '/api/v5/file_transfer/file?node=emqx%40emqx
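
Each `uri` in the response is a relative path with a percent-encoded query string. As a small sketch, the standard library can split it into its parts. The URI below is copied from the first entry of the response, and the variable names are illustrative only.

```python
from urllib.parse import urlsplit, parse_qs

# Relative URI copied from the first entry of the response above.
uri = ("/api/v5/file_transfer/file?node=emqx%40emqx.dev"
       "&fileref=36%2F26%2F7CDF642D027E224CBAC8BC80F1EAEEA77859"
       "%2Fdemo-client%2Ffile-pdf%2Friaklil-en.pdf")

parts = urlsplit(uri)
query = parse_qs(parts.query)  # parse_qs also percent-decodes the values

node = query["node"][0]
fileref = query["fileref"][0]
print(parts.path)
print(node)
print(fileref)
```

The decoded `fileref` is the same relative path under `exports` that the `tree` listing showed earlier, and `node` names the EMQX node holding the file.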

Nice. Let's try to download the file through the provided (relative) URI.

In [105]:
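# Stream the response so the whole file is not buffered in memory.
r = requests.get('http://localhost:18083' + files['files'][0]['uri'], headers=authHeaders, stream=True)
r.raise_for_status()
with open('download.pdf', 'wb') as fd:
    # iter_content() defaults to 1-byte chunks; ask for larger ones.
    for chunk in r.iter_content(chunk_size=65536):
        fd.write(chunk)
! ls -la download.pdf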

-rw-r--r-- 1 keynslug keynslug 2156579 Apr  5 21:17 download.pdf
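
The file size matches the `size` reported by the API, but the metadata also carries a `checksum`. Here is a minimal sketch of a stricter check. It assumes the checksum is a hex-encoded SHA-256 digest, which its 64 hex digits suggest but which this notebook does not confirm. Both helper names are hypothetical.

```python
import hashlib
import os


def sha256_hex(path, chunk_size=65536):
    """Return the uppercase hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fd:
        for chunk in iter(lambda: fd.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def matches_metadata(path, metadata):
    """Compare a local file against the 'size' and 'checksum' metadata fields.

    Assumption: 'checksum' is a hex-encoded SHA-256 digest.
    """
    return (os.path.getsize(path) == metadata["size"]
            and sha256_hex(path) == metadata["checksum"].upper())
```

With the listing from the earlier cell, `matches_metadata('download.pdf', files['files'][0]['metadata'])` would return `True` only if both the size and the digest agree.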


Perfect! Let's move to the S3 stuff.

## S3 Exporter

Remember that we previously configured the `local` storage exporter. Let's switch to the S3 exporter, which can upload completed transfers to any S3-compatible storage. This time it will be AWS S3 itself.

In [114]:
s3_secret = ! cat secrets/aws
s3_exporter_config = {
    'type': 's3',
    'host': "s3.us-east-1.amazonaws.com",
    'port': "443",

    'access_key_id': "AKIAXYPMVIWAHZSHOTES",
    'secret_access_key': s3_secret[0],

    'bucket': "keynslug-emqx-s3-demo",
    'acl': 'public_read',

    'transport_options': {
        'request_timeout': '30s',
        'ssl': {
            'enable': True
        }
    }
}

ft_config['storage']['exporter'] = s3_exporter_config

r = requests.put('http://localhost:18083/api/v5/configs/file_transfer',
                   headers=authHeaders,
                   json=ft_config)
! echo '{r.text}' | jq . | grep -v 'secret_access_key'

{
  "storage": {
    "exporter": {
      "access_key_id": "AKIAXYPMVIWAHZSHOTES",
      "acl": "public_read",
      "bucket": "keynslug-emqx-s3-demo",
      "host": "s3.us-east-1.amazonaws.com",
      "max_part_size": "5gb",
      "min_part_size": "5mb",
      "port": "443",
      "transport_options": {
        "connect_timeout": "15s",
        "enable_pipelining": 100,
        "pool_size": 8,
        "pool_type": "random",
        "request_timeout": "30s",
        "ssl": {
          "ciphers": [],
          "depth": 10,
          "enable": true,
          "hibernate_after": "5s",
          "reuse_sessions": true,
          "secure_renegotiate": true,
          "user_lookup_fun": "emqx_tls_psk:lookup",
          "verify": "verify_none",
          "versions": [
            "tlsv1.3",
            "tlsv1.2",
            "tlsv1.1",
            "tlsv1"
          ]
        }
      },
      "type": "s3",
      "url_expire_time": "1h"
    },
    "segments": {
      "gc": {
        "interval": 

Perfect! Let's repeat the transfers we already did to see that they now end up in AWS S3.

In [108]:
mqttft.transfer('demo-client', 'file-4242', 'test.txt', host='localhost', segment_size=1024)

DEBUG:root:Sending CONNECT (u0, p0, wr0, wq0, wf0, c1, k60) client_id=b'demo-client'
DEBUG:root:Sending PUBLISH (d0, q1, r0, m1), 'b'$file/file-4242/init'', ... (114 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m2), 'b'$file/file-4242/0'', ... (1024 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m3), 'b'$file/file-4242/1024'', ... (238 bytes)
DEBUG:root:Sending PUBLISH (d0, q1, r0, m4), 'b'$file/file-4242/fin/1262'' (NULL payload)
DEBUG:root:Received CONNACK (0, 0)
DEBUG:root:Received PUBACK (Mid: 1)
DEBUG:root:Received PUBACK (Mid: 2)
DEBUG:root:Received PUBACK (Mid: 3)
DEBUG:root:Received PUBACK (Mid: 4)
DEBUG:root:Sending DISCONNECT


In [None]:
mqttft.transfer('demo-client', 'file-pdf', 'riaklil-en.pdf', host='localhost', segment_size=102400)

Finally, let's look at what has changed in the File Transfer API response.

In [110]:
r = requests.get('http://localhost:18083/api/v5/file_transfer/files', headers=authHeaders)
pprint.pp(r.json())

{'files': [{'clientid': 'demo-client',
            'fileid': 'file-4242',
            'name': 'riaklil-en.pdf',
            'size': 2156579,
            'timestamp': '2023-04-05T16:57:14+00:00',
            'uri': 'https://s3.us-east-1.amazonaws.com:443/keynslug-emqx-s3-demo/demo-client/file-4242/riaklil-en.pdf?AWSAccessKeyId=AKIAXYPMVIWAHZSHOTES&Signature=Djgt3XFmNpFIP8IB0qqYr7yFxuE%3D&Expires=1680722699'},
           {'clientid': 'demo-client',
            'fileid': 'file-4242',
            'name': 'test.txt',
            'size': 1262,
            'timestamp': '2023-04-05T18:24:01+00:00',
            'uri': 'https://s3.us-east-1.amazonaws.com:443/keynslug-emqx-s3-demo/demo-client/file-4242/test.txt?AWSAccessKeyId=AKIAXYPMVIWAHZSHOTES&Signature=MgCJVwnIB%2FreILT9YEPlAiv1mDw%3D&Expires=1680722699'},
           {'clientid': 'demo-client',
            'fileid': 'file-pdf',
            'name': 'riaklil-en.pdf',
            'size': 2156579,
            'timestamp': '2023-04-05T18:23:46+00:
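
With the S3 exporter the `uri` values are absolute, signed S3 URLs rather than relative API paths, and each carries an `Expires` query parameter. The sketch below decodes it for the `test.txt` entry, assuming `Expires` is a Unix timestamp in seconds, as in AWS query-string authentication.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs

# Presigned URL copied from the test.txt entry above.
url = ("https://s3.us-east-1.amazonaws.com:443/keynslug-emqx-s3-demo/"
       "demo-client/file-4242/test.txt?AWSAccessKeyId=AKIAXYPMVIWAHZSHOTES"
       "&Signature=MgCJVwnIB%2FreILT9YEPlAiv1mDw%3D&Expires=1680722699")

query = parse_qs(urlsplit(url).query)
# Assumption: Expires is a Unix timestamp in seconds.
expires_at = datetime.fromtimestamp(int(query["Expires"][0]), tz=timezone.utc)
print(expires_at.isoformat())  # 2023-04-05T19:24:59+00:00
```

That is roughly one hour after the upload timestamps in the listing, which is consistent with the `url_expire_time` of `1h` in the exporter configuration.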

Well, time to try transferring something MASSIVE.

In [None]:
mqttft.transfer('demo-client', 'file-movie', 'tears_of_steel_1080p.80.webm', host='localhost', segment_size=204800)