Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rgw: implement GetObjectAttributes #55259

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

mattbenjamin
Copy link
Contributor

Fixes: https://tracker.ceph.com/issues/64109

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

<Checksum>
<ChecksumCRC32>string</ChecksumCRC32>
<ChecksumCRC32C>string</ChecksumCRC32C>
<ChecksumSHA1>string</ChecksumSHA1>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these for future implementation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I do have a whole checksum PR, I wasn't sure if I wanted this one to go before or after it; the checksum part also explicitly requires part enumeration

Comment on lines +5970 to +5865
/* XXXX the following conjunction should be &&--but iam_action2 is currently not
* hooked up and always fails (but should succeed if the requestor has READ
* acess to the object) */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try adding the new s3GetObjectAttributes/s3GetObjectVersionAttributes values to the mapping in

static const actpair actpairs[] =
{{ "s3:AbortMultipartUpload", s3AbortMultipartUpload },

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, that sounds right

void RGWGetObjAttrs::execute(optional_yield y)
{
RGWGetObj::execute(y);
/* XXXX now we need logic from list parts */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#50148 added some stuff that should be useful for this

in rgw_rados.cc, get_part_obj_state() interprets the manifest for the total parts_count, seeks to the requested part, then reads the part's head object

this functionality is exposed by sal::Object::ReadOp with an optional part_num input param, and parts_count output param

for your purposes here, i'd start with a ReadOp::prepare() for the part_num corresponding to x-amz-part-number-marker + 1. that'll give you the first part's size and xattrs for stuff like checksums. it also returns the total parts_count that you can continue to loop over

just note that there's a special case for part_num=1 on non-multipart uploads described in #50148 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, casey

Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@mattbenjamin
Copy link
Contributor Author

(will rebase this next week)

@github-actions github-actions bot removed the stale label Apr 12, 2024
mattbenjamin and others added 14 commits April 29, 2024 11:54
Among other things, provides XXH128 and XXH3.  Includes
compile fixes for gcc-13.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Fast, cryptographic hash functions suitable for block checksums.
BLAKE3 is the faster, more portable successor to Blake2(b,s),
now with 4x throughput.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Adds new Blake3 digest format (native), a concrete type to
represent digests, and static_visitor machinery to unify varying
checksum computations.

This framework, together with new trailing checksum header support,
is used to implement S3 additional checksum verification.  Parts of
the AWS content checksum API work build on a prior contribution from
imtzw <tongzhiwei_yewu@cmss.chinamobile.com>.
Thank you!

Fixes: https://tracker.ceph.com/issues/42080
Fixes: https://tracker.ceph.com/issues/63951

squashed commits:
* rgw_cksum: add trival test vectors for sha-format digests
      Computed digests match those produced by sha1sum, sha256sum,
      and sha512sum utilities.
* rgw_cksum: add test vectors for blake3
      Tests the same input strings with digests validated by
      b3sum (https://crates.io/crates/b3sum).
* rgw_ckum: switch to accel crc32c
      The internal Ceph convention appears to be to omit a final
      xor where ceph_crc32c is used, but it's required for compatibility
      with AWS implementations.
* rgw_cksum: add XXH3 digest
* rgw_cksum: write class encoder for rgw::digest::Cksum
* rgw_cksum: also reverse crc32c (REBASEME)
      Mark noticed that the crc32c output was being tested against a
      byteswapped value (crc32c also needs byteswap on LE).
* rgw_cksum: add digest::Cksum serde tests
* rgw_cksum: fix main(...) linkage
      (so we run our main unit and not the one in gmock
* rgw_cksum: convenience extensions for integration with RGW/S3
* introduce rgw_cksum unique_ptr factory
* rgw_cksum: mark string transform accessors const
* rgw_cksum: fixup unittest_rgw_chksum compilation--all existing tests pass
* rgw_cksum: hook up put-object checksum workflow
* tweaks to report on content checksum mismatch
* rgw_cksum: match SDK as well as general checksum header
* make it more efficient
* initialize RGWPutObj_Cksum::digest
* rgw_cksum:  write parse_cksum_type w/const char* arg
* initialize _type correctly; doing armored wrong
* fix expected checksum header name, clean up verify
* fix output on checksum verify fail, cleanup
* introduce Cksum::to_armor();  all AWS cases pass
* oops, extra 0-byte at end of to_armor() result
* use to_armor() with decoded checksums (i.e., for all S3 presentation)
* remove unnecessary finalize() in RGWPutObj_Cksum dtor
* RGWPutObj_Cksum::Factory fixes
* fixes test_object_checksum_sha256
* choose preferred checksum algorithm header if both are present
* log verified checksums in RGWPutObj::execute at 16
* checksum not needed in policy condition
* fix checksum trailing header format

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
I.e., prove cksum == Cksum(cksum.to_armor().c_str())

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
includes commits:

* fixes init-multipart header return
* introduce checksum to SAL MultipartPart interface
* thread optional checksum through DataProcessor
* code complete multipart checksum verify
* fix formatter
* fix ckecksum format for multipart objects in GET/HEAD ops
* always return parts_count from ReadOp::prepare() if applicable
      This behavior is used when returning the checksum of a multipart
      upload object.
* tweak conditional multipart_parts_count
* add checksum output to ListMultipart
* fix nil-return from GetHeaderCksumResult
* re-arm truncated if re-entering list-parts
* complete-multipart w/list-parts
* validate supplied checksum in CompleteMultipart
* verify checksum type against initial checksum algorithm
* rgw_op: suppress more x-amz headers
* final fixes and cleanups

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
* fix to deal with parts_count == 1 asymmetry

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
* properly transform pseudo headers in PostObj
* enable cksum verify in PostObj
* match checksum headers in match_policy_vars
* fixup add POST headers to environment

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Fixes: https://tracker.ceph.com/issues/64109

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
also, now working w/o obvious error for non-mpu

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
…e over the parts and read their cksum attr

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
…by default

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants