Skip to content

Latest commit

 

History

History
64 lines (56 loc) · 3.27 KB

20230203-minutes.md

File metadata and controls

64 lines (56 loc) · 3.27 KB

S3 Working Group Minutes

Friday, 2023/02/03

3:00 pm ET

Attendees:

Terrell Russell, Kory Draughn, Alan King, Violet White, Tim Cutts (Amazon), Illyoung Choi (CyVerse), Tony Edgin (CyVerse), Nirav Merchant (CyVerse)

Minutes

DISCUSSION

  • Native C++ S3 Implementation
    • Working endpoints: CopyObject, ListObjectsV2, GetObject, DeleteObject, HeadObject, PutObject
    • Two plugin interfaces defined:
      • Authentication - user mapping
      • Bucket Resolution - bucket mapping
    • Skipping implementation of multipartUpload for now
    • Focused on testing and packaging
  • S3 frontend to SFTPGo
    • Author doesn't currently have bandwidth for this project
  • Multipart Options
    • a) Multiobject - write all parts individually to irods, then complete triggers copy/concatenate/whatever
      • pro - relatively simple
      • con - lots of extra policy, could trigger replication to multiple continents (just a config option)
        • requires API plugin for concatenate()
    • b) Store-and-forward - write it all down in the bridge, then send it to irods
      • pro - simple, no extra policy
      • con - slow/delayed, need POTENTIALLY HUGE disk
    • c) Efficient store-and-forward - write down / hold non-contiguous parts in bridge - send contiguous parts to irods when ready
      • pro - elegant, single write
      • con - more complexity, need biggish disk
      • maybe off the table because a client can re-send the same numbered part and it should overwrite the earlier same part
        • OR... new thread! offset, overwrite, who cares, same size, magic/perfect, just works, don't look at me…
    • d) Store-and-register - write it all down where irods can see it, then just register it in irods
      • pro - simple, fastest
      • con - just reg policy?, adds dependency on co-visibility of bridge and irods
        • cannot continue on failure (incomplete writes)
          • iRODS doesn't know what happened
          • Client has no way to recover
  • Illyoung - HDF5 file format? Because we don't know the sizes ahead of time
    • Just write them down as they come, document the parts, and then register the whole thing in iRODS as hdf5 type… would require a server-translation thinger to give the file back to the user - same as option a) concatenate()
      • Possible polling/delay option - glacier-like until it's ready
    • Feels that (a) is probably the best path forward
      • Tim also agrees that this approach is closer to how S3 works
      • Core idea is similar to iput/ibun
      • Violet - plans on writing to a secret collection before concatenating all parts
      • Kory - What happens if policy exists and it modifies permissions on the parts?
    • Is multipart upload required?
    • No. Minio doesn't implement it
  • Possible 'option' flag to pick a/b/c/d mode… we could implement more than one of these
  • Nirav - people will use well known tooling to communicate with this
    • We can try to support a wide range of tools, but we may hit a wall
    • OR, we can list a set of tools which are supported
    • Common tools appear to be the AWS client and Boto3
  • Nirav - What versions of iRODS does this target?
    • Kory - We're targeting versions of iRODS that support the necessary features
      • Meaning 4.2.11+ possibly
    • We need people who are able to test it out
  • Next Meeting
    • Mar 2023