
Split >5GB Files automatically #21

Closed · wants to merge 19 commits

Conversation

@djalova (Contributor) commented Mar 31, 2016

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

@djalova (Contributor, Author) commented Mar 31, 2016

We cannot guarantee that a single partition will always be under Swift's 5 GB object size limit. When a partition reaches that limit, this patch splits the file itself: it opens a new stream and writes the remaining contents to another object named "FileName/part-xxxx-split-xxxx-attempt..."

@gilv Can you review this?
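For readers following along, here is a minimal sketch of the rollover idea described above. It is not the patch code: the class, the helper names, and the exact position of the split index in the object name are assumptions.

```java
import java.io.IOException;
import java.io.OutputStream;

// Sketch: once the current object reaches the size limit, close its stream and
// continue writing into a new object whose name carries a "split" index.
public class SplittingOutputStream extends OutputStream {

  private final long maxObjectSize;   // e.g. 5 GB by default
  private final String baseName;      // e.g. "FileName/part-xxxx-attempt..."
  private long bytesInCurrentPart = 0;
  private int splitIndex = 0;
  private OutputStream current;

  public SplittingOutputStream(String baseName, long maxObjectSize) throws IOException {
    this.baseName = baseName;
    this.maxObjectSize = maxObjectSize;
    this.current = openSwiftObject(baseName);   // hypothetical helper
  }

  @Override
  public void write(int b) throws IOException {
    if (bytesInCurrentPart >= maxObjectSize) {
      current.close();                          // finish the current object
      splitIndex++;
      current = openSwiftObject(splitName(splitIndex));
      bytesInCurrentPart = 0;
    }
    current.write(b);
    bytesInCurrentPart++;
  }

  @Override
  public void close() throws IOException {
    current.close();
  }

  // In the actual patch the split index sits before the attempt suffix,
  // i.e. "FileName/part-xxxx-split-xxxx-attempt...". This is simplified.
  private String splitName(int idx) {
    return String.format("%s-split-%05d", baseName, idx);
  }

  private static OutputStream openSwiftObject(String name) throws IOException {
    // Placeholder: in Stocator this would be a streaming PUT to Swift.
    throw new UnsupportedOperationException("not part of this sketch: " + name);
  }
}
```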

@gilv (Contributor) commented Apr 1, 2016

@djalova Sure, I will check it. Thanks. It's an interesting idea to split files this way.

@fraPace (Contributor) commented Apr 1, 2016

@djalova
I believe hard-coding the limit is not the right approach.
The default limit in Swift is indeed 5 GB, but that might not always be the case. The Swift admin might have chosen a lower limit for some reason, and your code would then fail.

The correct way, in my opinion, would be to query the Swift cluster capabilities when the Stocator instance is created (http://developer.openstack.org/api-ref-objectstorage-v1.html#infoDiscoverability) and thus learn the maximum size allowed per object.

@gilv What do you think?
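For reference, a capability query like the one @fraPace mentions could look roughly like the sketch below. The /info endpoint is the one linked above; the swift.max_file_size field is what a stock Swift deployment reports, but the parsing and error handling here are assumptions, not Stocator code.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

import org.json.JSONObject;

public class SwiftCapabilities {

  // Ask the cluster for its capabilities and return the advertised maximum
  // object size, falling back to a default when /info is not available.
  public static long queryMaxObjectSize(String swiftBaseUrl, long fallback) {
    try {
      URL url = new URL(swiftBaseUrl + "/info");
      HttpURLConnection conn = (HttpURLConnection) url.openConnection();
      conn.setRequestMethod("GET");
      StringBuilder body = new StringBuilder();
      try (BufferedReader in = new BufferedReader(
          new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
        String line;
        while ((line = in.readLine()) != null) {
          body.append(line);
        }
      }
      // A stock response contains {"swift": {"max_file_size": 5368709120, ...}}
      JSONObject info = new JSONObject(body.toString());
      return info.getJSONObject("swift").getLong("max_file_size");
    } catch (Exception e) {
      // Discoverability disabled, unreachable endpoint, or unexpected payload
      return fallback;
    }
  }
}
```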

@gilv (Contributor) commented Apr 1, 2016

@Nosfe @djalova I agree, but I think we should also make it configurable via configuration.
More importantly, we should decide whether to use the approach suggested by @djalova or the standard approach of an SLO with a manifest. I have to admit, my first impression is that @djalova's approach looks much better than using a manifest for SLO or DLO.

@fraPace (Contributor) commented Apr 4, 2016

@gilv @djalova
I agree with you, but I would go a little further. The logical flow I propose is the following:

  1. Stocator sets a default value of 5 GB.
  2. Stocator queries the maximum object size allowed by the Swift cluster that will be used.
    2.1) If the query returns a result, it overrides the default value.
  3. Stocator looks for the configurable parameter.
    3.1) If it exists, check that it is not greater than the previous value (if it is, report an error), and override the previous value.

This way, the possibility of errors due to misconfiguration is almost eliminated (see the sketch below).

What do you think?
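A minimal sketch of that precedence follows. The property key fs.swift.max.object.size and the method shape are assumptions for illustration; the real patch may use a different key.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

public class MaxObjectSizeResolver {

  // Step 1: built-in default of 5 GB.
  static final long DEFAULT_MAX_OBJECT_SIZE = 5L * 1024 * 1024 * 1024;

  // clusterLimitOrZero is the value reported by the Swift cluster,
  // or 0 if the capability query returned nothing.
  public static long resolve(Configuration conf, long clusterLimitOrZero) throws IOException {
    long limit = DEFAULT_MAX_OBJECT_SIZE;
    if (clusterLimitOrZero > 0) {
      limit = clusterLimitOrZero;               // step 2: cluster capability overrides default
    }
    long configured = conf.getLong("fs.swift.max.object.size", 0);  // step 3: user setting
    if (configured > 0) {
      if (configured > limit) {
        // step 3.1: the configured value must not exceed the allowed limit
        throw new IOException("Configured max object size " + configured
            + " exceeds the allowed limit " + limit);
      }
      limit = configured;
    }
    return limit;
  }
}
```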

@djalova (Contributor, Author) commented Apr 4, 2016

@Nosfe @gilv
I agree with your points. I already added a commit to make the object size configurable. I just added a commit to check if the value from the configuration is valid.

@gilv (Contributor) commented Apr 4, 2016

@djalova I think it's a very good patch :) Much better than my original idea of using SLO and a manifest.
I haven't had time to test it yet. Did you by any chance test it with a real object where a single part is more than 5 GB?

@djalova (Contributor, Author) commented Apr 4, 2016

@gilv Yes, @jasoncl and I have tested writing single files of 6 GB and 10 GB with a max object size of 5 GB. We've also tested writing 100 MB files with a 10 MB max size. In all cases we read back from the split files, and the reads work without any loss of data.

@djalova (Contributor, Author) commented Apr 7, 2016

@gilv Have you had time to test this out yet? I think I'm done with the changes I want to add. Let me know if you think there needs to be more done to get this merged. Thanks.

@gilv (Contributor) commented Apr 9, 2016

@djalova It looks good. I just need to perform a couple of additional tests and haven't had time so far. I'll try to do it during the next week.

}

@Override
public void close() throws IOException {
LOG.info("{} bytes written", totalBytesWritten);
@gilv (Contributor) commented on this line:
@djalova I guess we don't want this at info level, otherwise it will be printed all the time. Can you remove it, please? I've recently been trying to reduce the number of debug prints. If you like, you can log it at trace level.

@djalova (Contributor, Author) replied:
Sure, that makes sense.
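For reference, the agreed change amounts to something like the following (a sketch, not the exact diff):

```java
// Log the byte count at trace level only, so it does not show up in normal runs.
if (LOG.isTraceEnabled()) {
  LOG.trace("{} bytes written", totalBytesWritten);
}
```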

@gilv (Contributor) commented Apr 15, 2016

@djalova It looks good; I just reviewed the code and left one comment. I will also try to run the code later. Can you please add a unit test or a functional test? It's not mandatory, but it would be good to have.

@gilv (Contributor) commented Apr 15, 2016

@djalova Can you please rebase this branch? It seems to have conflicts with the master branch.

@djalova (Contributor, Author) commented Apr 15, 2016

@gilv I rebased and added your suggestion. I'll work on the test and push when it's done.

@gilv (Contributor) commented Apr 17, 2016

@djalova I think it's very good, but we need to resolve the resiliency issues that this patch adds.

As an example, assume SF311.csv/part-00000-attempt_201604171048_0000_m_000000_0 is written. If the task fails there will be additional attempts, like
SF311.csv/part-00000-attempt_201604171048_0000_m_000000_1
SF311.csv/part-00000-attempt_201604171048_0000_m_000000_2
SF311.csv/part-00000-attempt_201604171048_0000_m_000000_3
etc.
Assume that the job completed successfully.

The list() method will pick up the correct part-00000 based on size (the largest size is the winner). For example, it may choose SF311.csv/part-00000-attempt_201604171048_0000_m_000000_2 and ignore attempts 0 and 3. The resolution relies on the fact that all attempts share the same object name "part-00000".

Adding this patch will affect the way list() works, since it modifies part-ID to part-ID-split-ID.
For example, if the task fails after writing split-00008:

SF311.csv/part-00000-attempt_201604171048_0000_m_000000_0
SF311.csv/part-00000-split-00001-attempt_201604171048_0000_m_000000_0
SF311.csv/part-00000-split-00002-attempt_201604171048_0000_m_000000_0
SF311.csv/part-00000-split-00003-attempt_201604171048_0000_m_000000_0
SF311.csv/part-00000-split-00004-attempt_201604171048_0000_m_000000_0
SF311.csv/part-00000-split-00005-attempt_201604171048_0000_m_000000_0
SF311.csv/part-00000-split-00006-attempt_201604171048_0000_m_000000_0
SF311.csv/part-00000-split-00007-attempt_201604171048_0000_m_000000_0
SF311.csv/part-00000-split-00008-attempt_201604171048_0000_m_000000_0

and there will be a replacement task "1":

SF311.csv/part-00000-attempt_201604171048_0000_m_000000_1
SF311.csv/part-00000-split-00001-attempt_201604171048_0000_m_000000_1
SF311.csv/part-00000-split-00002-attempt_201604171048_0000_m_000000_1
SF311.csv/part-00000-split-00003-attempt_201604171048_0000_m_000000_1
SF311.csv/part-00000-split-00004-attempt_201604171048_0000_m_000000_1
SF311.csv/part-00000-split-00005-attempt_201604171048_0000_m_000000_1
SF311.csv/part-00000-split-00006-attempt_201604171048_0000_m_000000_1
SF311.csv/part-00000-split-00007-attempt_201604171048_0000_m_000000_1
SF311.csv/part-00000-split-00008-attempt_201604171048_0000_m_000000_1

then the current code will fail to identify the correct attempt, since part-00000 has been modified to part-00000-split-NUMBER.

I think to resolve this we just need to modify the list() method so it keys on "part-NUMBER" and not "part-NUMBER-split-NUMBER".
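A hedged sketch of that adaptation: group candidates by the part ID with the "-split-NNNNN" piece removed, then keep only the objects of the winning attempt. The helper names and the regex are illustrative, not the actual Stocator list() code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class AttemptResolutionSketch {

  private static final Pattern SPLIT_SUFFIX = Pattern.compile("-split-\\d+");

  // "SF311.csv/part-00000-split-00003-attempt_..._0" -> "SF311.csv/part-00000"
  static String partKey(String objectName) {
    String withoutAttempt = objectName.substring(0, objectName.lastIndexOf("-attempt"));
    return SPLIT_SUFFIX.matcher(withoutAttempt).replaceAll("");
  }

  // "..._m_000000_1" -> "1"
  static String attemptId(String objectName) {
    return objectName.substring(objectName.lastIndexOf('_') + 1);
  }

  // Per part, pick the attempt whose splits add up to the largest total size,
  // mirroring the "largest size wins" rule used for unsplit parts.
  static Map<String, String> winningAttemptPerPart(Map<String, Long> objectSizes) {
    Map<String, Map<String, Long>> totals = new HashMap<>();
    for (Map.Entry<String, Long> e : objectSizes.entrySet()) {
      totals.computeIfAbsent(partKey(e.getKey()), k -> new HashMap<>())
            .merge(attemptId(e.getKey()), e.getValue(), Long::sum);
    }
    Map<String, String> winners = new HashMap<>();
    totals.forEach((part, byAttempt) -> winners.put(part,
        Collections.max(byAttempt.entrySet(), Map.Entry.comparingByValue()).getKey()));
    return winners;
  }
}
```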

@djalova (Contributor, Author) commented Apr 18, 2016

@gilv Are you referring to the String returned by nameWithoutTaskID()? I added some debug messages, and the returned name already includes the "split-xxxxx" part without any extra code.

@gilv (Contributor) commented Apr 19, 2016

@djalova I think that is exactly the issue here: the name should not contain "split".

@gilv (Contributor) commented Apr 19, 2016

@djalova The algorithm in the list() method should be adapted as I described in my previous remark; otherwise it will not work. I can try to adapt it.

@djalova (Contributor, Author) commented Apr 19, 2016

@gilv Is this so that if any of the split uploads fails, we start over and look for a part-00000 with a different attempt number? I assumed that since the listing is alphabetical this wouldn't be a problem. For example, with two attempts A and B the listing would be:
part-0000-attemptA
part-0000-attemptB
part-0000-split-0001-attemptA
part-0000-split-0001-attemptB
Then if "part-0000-split-0001-attemptA" fails, we will catch it when it is compared to "part-0000-split-0001-attemptB" for collisions.

@djalova (Contributor, Author) commented May 18, 2016

@gilv I don't think there's an issue with the listing, because the split number is included when it checks for collisions. Also, since we go through the list alphabetically, the collision check compares objects with the same part and split number when there is a failed attempt.

@gilv (Contributor) commented May 20, 2016

@djalova Split logic is internal, and the Spark task is not aware of it. Here is an example:
Task 1 writes data that is split internally into part1-attempt-1-split-1, part1-attempt-1-split-2, part1-attempt-1-split-3.
Consider a replacement task that generates part1-attempt-2-split-1, part1-attempt-2-split-2, part1-attempt-2-split-3.

The list algorithm will not work in this case.

@djalova (Contributor, Author) commented May 20, 2016

@gilv The naming scheme is part-#-split-#-attempt#. In the list logic, everything after the last '-' is stripped, so when objects are compared we compare part-#-split-#. Do we want to make sure that we grab parts and splits from the same attempt, or do we just care about the part and split number?
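For context, the stripping described here amounts to something like this (illustrative, not the exact code in the patch):

```java
// Drop everything after the last '-' (the attempt suffix), so
// "part-00000-split-00003-attempt_..._0" compares as "part-00000-split-00003"
// and "part-00000-attempt_..._0" compares as "part-00000".
static String nameWithoutAttempt(String objectName) {
  int idx = objectName.lastIndexOf('-');
  return idx >= 0 ? objectName.substring(0, idx) : objectName;
}
```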

@khanderao commented:

Hello @djalova: when is this fix planned to be merged?

@djalova (Contributor, Author) commented Aug 17, 2016

Hi @khanderao,
The code in master has changed a bit since I opened this PR. I'll have this updated by tomorrow.

@gilv (Contributor) commented Aug 17, 2016

@khanderao Do you have a use case where a single Spark task writes more than 4 GB of data?

@djalova closed this Aug 29, 2018