rgw: connection reset/crashed when download large zero object with compression enable #15334

Closed

fangyuxiangGL

A large all-zero object has a very high compression ratio:
even 4M of compressed data can decompress to several GB.
Handling that much data in one go leads to unexpected failures.

Fixed: http://tracker.ceph.com/issues/20098

Signed-off-by: fang yuxiang <fang.yuxiang@eisoo.com>

@fangyuxiangGL

A large all-zero object has a very high compression ratio: even 4M of compressed data can decompress to several GB. Handling that much data in one go causes two failures:
a) If you upload a 4G all-zero object, the connection is reset when you download it (civetweb returns an error while sending the response to the client).
b) If you upload a 10G all-zero object, radosgw crashes with the output below:

/home/fyx/github/ceph/src/common/buffer.cc: 1010: FAILED assert(o+l <= _len)

ceph version 12.0.2-1605-g2f9ee0b (2f9ee0b)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f79c8784030]
2: (ceph::buffer::ptr::copy_in(unsigned int, unsigned int, char const*, bool)+0x235) [0x7f79d1e76295]
3: (ceph::buffer::list::rebuild(ceph::buffer::ptr&)+0x4c) [0x7f79d1e76b9c]
4: (ceph::buffer::list::rebuild()+0x8b) [0x7f79d1e7754b]
5: (ceph::buffer::list::c_str()+0x19) [0x7f79d1e775e9]
6: (RGWGetObj_ObjStore_S3::send_response_data(ceph::buffer::list&, long, long)+0x4c2) [0x7f79d26cf5b2]
7: (RGWGetObj::get_data_cb(ceph::buffer::list&, long, long)+0x5b) [0x7f79d25c0f5b]
8: (RGWGetObj_Decompress::handle_data(ceph::buffer::list&, long, long)+0x3b2) [0x7f79d274ec12]
9: (RGWRados::flush_read_list(get_obj_data*)+0xae) [0x7f79d260dc0e]
10: (RGWRados::Object::Read::iterate(long, long, RGWGetDataCB*)+0x345) [0x7f79d2652685]
11: (RGWGetObj::execute()+0xd18) [0x7f79d25e4da8]
12: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, bool)+0x162) [0x7f79d25fb422]
13: (process_request(RGWRados*, RGWREST*, RGWRequest*, std::string const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSocket*)+0xb66) [0x7f79d25fc1b6]
14: (RGWCivetWebFrontend::process(mg_connection*)+0x31c) [0x7f79d24e141c]
15: (()+0x1e0c6f) [0x7f79d251ac6f]
16: (()+0x1e25fb) [0x7f79d251c5fb]
17: (()+0x7df5) [0x7f79c80dadf5]
18: (clone()+0x6d) [0x7f79c5d6a1ad]
NOTE: a copy of the executable, or objdump -rdS <executable> is needed to interpret this.

The cause is the sheer amount of data handled in one go: many small decompressed chunks (4M each)
are appended to the output buffer. The object holds 10G of data, but bufferlist->_len is a 32-bit value,
so it overflows and radosgw crashes.
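
The overflow described above can be reproduced in isolation. The following is a minimal sketch, not Ceph's buffer code: `total_len_32` and `total_len_64` are hypothetical helpers that only add up chunk sizes, assuming 4M means 4 MiB and 10G means 10 GiB.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not Ceph code): adds fixed-size chunk lengths into a
// 32-bit length field, as a buffer with a 32-bit length would.
// Unsigned arithmetic wraps modulo 2^32 instead of reporting an error.
std::uint32_t total_len_32(std::uint64_t object_bytes, std::uint32_t chunk_bytes) {
    assert(chunk_bytes > 0);
    std::uint32_t len = 0;
    for (std::uint64_t done = 0; done < object_bytes; done += chunk_bytes) {
        len += chunk_bytes;  // silently wraps once the sum passes 4 GiB
    }
    return len;
}

// Same accumulation with a 64-bit length field, which holds the true total.
std::uint64_t total_len_64(std::uint64_t object_bytes, std::uint32_t chunk_bytes) {
    assert(chunk_bytes > 0);
    std::uint64_t len = 0;
    for (std::uint64_t done = 0; done < object_bytes; done += chunk_bytes) {
        len += chunk_bytes;
    }
    return len;
}
```

With these assumptions, `total_len_32(10ull << 30, 4u << 20)` returns 2147483648 (2 GiB) rather than 10 GiB, so the recorded length no longer matches the data actually appended. This kind of mismatch would be consistent with the `o+l <= _len` assertion in the trace.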
