The request body is too large and exceeds the maximum permissible limit #106

Closed
jmlero opened this issue Jan 29, 2016 · 7 comments

jmlero commented Jan 29, 2016

Using the Python blobxfer, many times and with many large files (always smaller than 190 GB), I receive the error "The request body is too large and exceeds the maximum permissible limit". I can upload some large files, but it is not possible for others, no matter how many retries or how many workers I use.

Also, I have this problem when the local resource to upload is a folder (a folder containing a single file), but if I try to upload the same file by specifying the file directly, the error does not appear.

I am using the following python packages:

pip freeze
azure==1.0.2
azure-common==1.0.0
azure-mgmt==0.20.1
azure-mgmt-common==0.20.0
azure-mgmt-compute==0.20.0
azure-mgmt-network==0.20.1
azure-mgmt-nspkg==1.0.0
azure-mgmt-resource==0.20.1
azure-mgmt-storage==0.20.0
azure-nspkg==1.0.0
azure-servicebus==0.20.1
azure-servicemanagement-legacy==0.20.1
azure-storage==0.20.2
elasticsearch==2.2.0
futures==3.0.3
python-dateutil==2.4.2
requests==2.9.1
six==1.10.0
urllib3==1.14
wheel==0.26.0

And blobxfer.py v0.9.9.5

As an example:

azure blobxfer parameters [v0.9.9.5]
subscription id: None
management cert: None
transfer direction: local->Azure
local resource: archive/
remote resource: None
max num of workers: 4
timeout: None
storage account: ---
use SAS: False
upload as page blob: False
auto vhd->page blob: False
container: ---
blob container URI: https://---.blob.core.windows.net/---
compute file MD5: True
skip on MD5 match: True
chunk size (bytes): 4194304
create container: True
keep mismatched MD5: False
recursive if dir: True
keep root dir on up: False
collate to: disabled

script start time: 2016-01-29 10:00:09

g--.tar.gz md5: lhD55kDeLW9uh4PXtJ7LhQ==
detected 0 empty files to upload
performing 25600 put blocks/blobs and 1 put block lists
xfer progress: [ ] 0.00% 0.00 blocks/min The request body is too large and exceeds the maximum permissible limit.

RequestBodyTooLarge: The request body is too large and exceeds the maximum permissible limit.

RequestId:a6af7e74-0001-00f6-0474-5ae0d5000000
Time:2016-01-29T09:06:25.4964043Z100000

ls -lah
-rw-r--r-- 1 root root 100G Jan 26 15:46 g--.tar.gz

alfpark (Contributor) commented Jan 29, 2016

Can you upgrade to 0.9.9.10 and reproduce the issue?

jmlero (Author) commented Feb 1, 2016

I tried with version 0.9.9.9 (the latest available at this moment).

The structure of the folder I would like to upload is the following:

/data/groups/folder/archive/file1
/data/groups/folder/archive/file2
...
/data/groups/folder/archive/fileN

Each file is 100 GB.

And the layout I would like on the storage in this case is the following:
file1
file2
...
fileN

Using --strip-components=3

azure blobxfer parameters [v0.9.9.9]

platform: Linux-2.6.32-431.20.3.el6.x86_64-x86_64-with-redhat-6.5-Carbon
python interpreter: CPython 2.7.11
package versions: az.common=1.0.0 az.sml=0.20.1 az.stor=0.20.2 req=2.9.1
subscription id: None
management cert: None
transfer direction: local->Azure
local resource: /data/groups/folder/archive/
include pattern: None
remote resource: None
max num of workers: 48
timeout: None
storage account: XXX
use SAS: False
upload as page blob: False
auto vhd->page blob: False
container: folder
blob container URI: https://XXX.blob.core.windows.net/folder
compute file MD5: True
skip on MD5 match: True
chunk size (bytes): 4194304
create container: True
keep mismatched MD5: False
recursive if dir: True
component strip on up: 3
remote delete: False
collate to: disabled
local overwrite: True
encryption mode: disabled
RSA key file: disabled
RSA key type: disabled

script start time: 2016-02-01 10:03:40
computing file md5 on: /data/groups/folder/archive/file.tar.gz.split

md5: lhD55kDeLW9uh4PQ==
detected 0 empty files to upload
performing 25600 put blocks/blobs and 1 put block lists
spawning 48 worker threads

xfer progress: [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 100.00% 793.60 blocks/min

102400.0 MiB transfered, elapsed 1935.4741931 sec. Throughput = 423.255449709 Mbit/sec

It works fine, at least for one file, but the resulting structure on the storage is:
archive/file1

Using --strip-components=4, I obtain the following:

azure blobxfer parameters [v0.9.9.9]

platform: Linux-2.6.32-431.20.3.el6.x86_64-x86_64-with-redhat-6.5-Carbon
python interpreter: CPython 2.7.11
package versions: az.common=1.0.0 az.sml=0.20.1 az.stor=0.20.2 req=2.9.1
subscription id: None
management cert: None
transfer direction: local->Azure
local resource: /data/groups/folder/archive/
include pattern: None
remote resource: None
max num of workers: 48
timeout: None
storage account: XXX
use SAS: False
upload as page blob: False
auto vhd->page blob: False
container: folder
blob container URI: https://XXX.blob.core.windows.net/folder
compute file MD5: True
skip on MD5 match: True
chunk size (bytes): 4194304
create container: True
keep mismatched MD5: False
recursive if dir: True
component strip on up: 4
remote delete: False
collate to: disabled
local overwrite: True
encryption mode: disabled
RSA key file: disabled
RSA key type: disabled

script start time: 2016-02-01 09:58:24
computing file md5 on: /data/groups/folder/archive/file.tar.gz.split
md5: lhD55kDeLW9uh4PXtJ7LhQ==
detected 0 empty files to upload
performing 25600 put blocks/blobs and 1 put block lists
spawning 48 worker threads
xfer progress: [ ] 0.00% 0.00 blocks/min Traceback (most recent call last):
File "blobxfer_0.9.9.9.py", line 880, in run
offset, bytestoxfer, encparam, flock, filedesc)
File "blobxfer_0.9.9.9.py", line 1002, in putblobdata
content_md5=contentmd5, timeout=self.timeout)
File "blobxfer_0.9.9.9.py", line 1309, in azure_request
return req(*args, **kwargs)
File "/cm/local/apps/python/2.7.11_azure/lib/python2.7/site-packages/azure/storage/blob/blobservice.py", line 2366, in put_block
self._perform_request(request)
File "/cm/local/apps/python/2.7.11_azure/lib/python2.7/site-packages/azure/storage/storageclient.py", line 178, in _perform_request
_storage_error_handler(ex)
File "/cm/local/apps/python/2.7.11_azure/lib/python2.7/site-packages/azure/storage/_serialization.py", line 25, in _storage_error_handler
return _general_error_handler(http_error)
File "/cm/local/apps/python/2.7.11_azure/lib/python2.7/site-packages/azure/storage/_common_error.py", line 82, in _general_error_handler
raise AzureHttpError(message, http_error.status)
AzureHttpError: The request body is too large and exceeds the maximum permissible limit.

RequestBodyTooLarge: The request body is too large and exceeds the maximum permissible limit.

RequestId:c7662108-0001-0111-2ccf-5cb68d000000
Time:2016-02-01T09:02:45.8204723Z100000

Thanks and regards

alfpark (Contributor) commented Feb 1, 2016

Unfortunately, I cannot reproduce this error.

Can you place the statement print(offset, bytestoxfer, len(data)) just before the azure_request function call on line 999? Please re-run your scenario and paste the output.
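
For reference, a minimal sketch of where that debug line goes; the exact surrounding code is assumed from the tracebacks above rather than copied from the script:

# blobxfer_0.9.9.9.py, inside putblobdata(), immediately before the azure_request(...) call
print(offset, bytestoxfer, len(data))  # file offset, bytes intended for this block, actual length of the buffer handed to the SDK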

jmlero (Author) commented Feb 2, 2016

The output:

detected 0 empty files to upload
performing 25600 put blocks/blobs and 1 put block lists
spawning 48 worker threads
xfer progress: [ ] 0.00% 0.00 blocks/min 0 4194304 4194304
4194304 4194304 4194304
8388608 4194304 4194304
12582912 4194304 4194304
16777216 4194304 4194304
20971520 4194304 4194304
25165824 4194304 4194304
29360128 4194304 4194304
33554432 4194304 4194304
37748736 4194304 4194304
41943040 4194304 4194304
46137344 4194304 4194304
50331648 4194304 4194304
54525952 4194304 4194304
58720256 4194304 4194304
62914560 4194304 4194304
67108864 4194304 4194304
71303168 4194304 4194304
75497472 4194304 4194304
79691776 4194304 4194304
83886080 4194304 4194304
88080384 4194304 4194304
92274688 4194304 4194304
96468992 4194304 4194304
100663296 4194304 4194304
104857600 4194304 4194304
109051904 4194304 4194304
113246208 4194304 4194304
117440512 4194304 4194304
121634816 4194304 4194304
125829120 4194304 4194304
130023424 4194304 4194304
134217728 4194304 4194304
138412032 4194304 4194304
142606336 4194304 4194304
146800640 4194304 4194304
150994944 4194304 4194304
155189248 4194304 4194304
159383552 4194304 4194304
163577856 4194304 4194304
167772160 4194304 4194304
171966464 4194304 4194304
176160768 4194304 4194304
180355072 4194304 4194304
184549376 4194304 4194304
188743680 4194304 4194304
192937984 4194304 4194304
197132288 4194304 4194304
Traceback (most recent call last):
File "/data/groups/adm_informatics/prod/archive2azure/blobxfer_0.9.9.9.py", line 880, in run
offset, bytestoxfer, encparam, flock, filedesc)
File "/data/groups/adm_informatics/prod/archive2azure/blobxfer_0.9.9.9.py", line 1003, in putblobdata
content_md5=contentmd5, timeout=self.timeout)
File "/data/groups/adm_informatics/prod/archive2azure/blobxfer_0.9.9.9.py", line 1310, in azure_request
return req(*args, **kwargs)
File "/cm/local/apps/python/2.7.11_azure/lib/python2.7/site-packages/azure/storage/blob/blobservice.py", line 2366, in put_block
self._perform_request(request)
File "/cm/local/apps/python/2.7.11_azure/lib/python2.7/site-packages/azure/storage/storageclient.py", line 178, in _perform_request
_storage_error_handler(ex)
File "/cm/local/apps/python/2.7.11_azure/lib/python2.7/site-packages/azure/storage/_serialization.py", line 25, in _storage_error_handler
return _general_error_handler(http_error)
File "/cm/local/apps/python/2.7.11_azure/lib/python2.7/site-packages/azure/storage/_common_error.py", line 82, in _general_error_handler
raise AzureHttpError(message, http_error.status)
AzureHttpError: The request body is too large and exceeds the maximum permissible limit.

RequestBodyTooLarge: The request body is too large and exceeds the maximum permissible limit.

RequestId:3309da30-0001-0000-78be-5dc7c3000000
Time:2016-02-02T13:33:05.8171503Z100000

alfpark (Contributor) commented Feb 2, 2016

Thanks for running the script with the modification. According to the new debug lines, the data being sent to the Azure Python Storage SDK is consistent with the maximum allowable block size of 4 MiB (4,194,304 bytes).
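
(As a quick check on those numbers: 25600 put blocks × 4,194,304 bytes per block = 102,400 MiB, i.e. the full 100 GB file, and no debug line above shows a buffer larger than 4,194,304 bytes.)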

I have two suggestions:

  1. Retry the upload using a SAS key so the request is a direct REST call via requests
  2. Pass --chunksizebytes 4194296 as a parameter (an example invocation is sketched below)
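
For the second suggestion, an illustrative invocation; the storage account (XXX), container (folder), and local path are placeholders taken from the parameter dump above, and the positional argument order is assumed from typical blobxfer v0.9.9.x usage, so adjust for your setup:

python blobxfer_0.9.9.9.py XXX folder /data/groups/folder/archive/ --strip-components=4 --chunksizebytes 4194296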

jmlero (Author) commented Feb 3, 2016

Using the second option, passing --chunksizebytes 4194296 as a parameter, works fine. I will also try using a SAS key, but I think the best solution for me is to change the value of _MAX_BLOB_CHUNK_SIZE_BYTES to 4194296, as you suggested.
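
A minimal sketch of that change, assuming the constant sits near the top of blobxfer.py with the default 4 MiB value:

# blobxfer.py -- lower the maximum block chunk size by 8 bytes (default assumed to be 4194304)
_MAX_BLOB_CHUNK_SIZE_BYTES = 4194296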

If this error is related to the SDK, I will also try the next release of the SDK as soon as it is available.

Thanks and regards

alfpark (Contributor) commented Feb 3, 2016

Thanks for working through the issue. If you want, you can raise this issue directly on the Azure Python Storage SDK GitHub repo. I will close this issue.
