[HUDI-5298] Optimize WriteStatus storing HoodieRecord#8472

Closed
xiaochen-zhou wants to merge 11 commits into apache:master from xiaochen-zhou:HUDI-5298

Conversation

@xiaochen-zhou (Contributor)

Change Logs

WriteStatus stores the entire HoodieRecord. We can optimize it to store just the required info (record key, partition path, location).

Impact

Optimize WriteStatus to store just the required info (record key, partition path, location).
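As a rough illustration of the change, a slimmed-down status entry might look like the following minimal sketch (class and field names here are illustrative, not the PR's actual API):

```java
import java.io.Serializable;

// Illustrative sketch: instead of retaining the full HoodieRecord (key,
// location, plus the potentially large payload), WriteStatus would keep
// only the fields needed downstream.
public class RecordInfoSketch implements Serializable {
  private final String recordKey;     // record key
  private final String partitionPath; // partition path
  private final String fileId;        // location: file group id
  private final String instantTime;   // location: commit instant

  public RecordInfoSketch(String recordKey, String partitionPath,
                          String fileId, String instantTime) {
    this.recordKey = recordKey;
    this.partitionPath = partitionPath;
    this.fileId = fileId;
    this.instantTime = instantTime;
  }

  public String getRecordKey() { return recordKey; }
  public String getPartitionPath() { return partitionPath; }
  public String getFileId() { return fileId; }
  public String getInstantTime() { return instantTime; }
}
```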

Risk level (write none, low medium or high below)

low

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed


public class HoodieRecordStatus implements Serializable, KryoSerializable {


@danny0405 (Contributor) Apr 17, 2023

key + location are actually an index item, just rename it to IndexItem ?

@xiaochen-zhou (Contributor, Author)

> key + location are actually an index item, just rename it to IndexItem ?

Thank you very much for your review. I have modified the code; could you re-review it when you have time and leave some comments?

@danny0405 (Contributor)

It would be great if we could have numbers to illustrate the gains after the patch, like the cost reduction for memory or something.

@xiaochen-zhou (Contributor, Author)

> It would be great if we could have numbers to illustrate the gains after the patch, like the cost reduction for memory or something.

I would be happy to do it.

# Conflicts:
#	hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/inmemory/HoodieInMemoryHashIndex.java
@xiaochen-zhou (Contributor, Author) commented Apr 22, 2023

> It would be great if we could have numbers to illustrate the gains after the patch, like the cost reduction for memory or something.

I did a test based on your suggestion:
The number of HoodieRecords is 1000 * 100 * 2

WriteStatus status = new WriteStatus(false, 1.0);
Throwable t = new Exception("some error in writing");
for (int i = 0; i < 1000 * 100; i++) {
  status.markSuccess(mock(HoodieRecord.class), Option.empty());
  status.markFailure(mock(HoodieRecord.class), t, Option.empty());
}
System.out.println("status memory: " + ObjectSizeCalculator.getObjectSize(status));

The memory occupied by WriteStatus before the optimization is 125512336 bytes:

private final List<HoodieRecord> writtenRecords = new ArrayList<>();
private final List<HoodieRecord> failedRecords = new ArrayList<>();
status memory: 125512336

The memory occupied by WriteStatus after the optimization is 427408 bytes:

private final List<IndexItem> writtenRecordIndexes = new ArrayList<>();
private final List<IndexItem> failedRecordIndexes = new ArrayList<>();
status memory: 427408

@xiaochen-zhou (Contributor, Author) commented Apr 22, 2023

> It would be great if we could have numbers to illustrate the gains after the patch, like the cost reduction for memory or something.

The memory occupied by WriteStatus after the optimization is about 1/300 of that before the optimization!
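The claimed ratio follows directly from the two measurements above:

```java
// Checking the reported numbers: 125512336 / 427408 is roughly 294,
// i.e. the optimized WriteStatus uses about 1/300 of the memory.
public class RatioCheck {
  public static void main(String[] args) {
    long before = 125512336L; // bytes before optimization
    long after = 427408L;     // bytes after optimization
    System.out.println("reduction factor: " + (double) before / after);
  }
}
```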

@bvaradar (Contributor) left a comment

We would need to keep failed records as is for logging the complete record elsewhere cc @nsivabalan

@vinothchandar (Member)

@prashantwason @nbalajee @suryaprasanna would this break you all in any way? Do we need the record data anywhere for successful writes?

cc @rmahindra123 as well. same question.


import java.io.Serializable;

public class IndexItem implements Serializable, KryoSerializable {
Contributor:

Give some doc to the class.

/**
 * Identifies the record across the table.
 */
protected HoodieKey key;

Contributor:

Can we make all these members private and final?

@xiaochen-zhou (Contributor, Author)

> Can we make all these members private and final?

Thank you very much for the review, and sorry for the late response. I will try to modify the code according to your suggestions.

@xiaochen-zhou (Contributor, Author) May 6, 2023

> Can we make all these members private and final?

We may not be able to make all these members final because they need to be reassigned:

  @Override
  public final void read(Kryo kryo, Input input) {
    this.key = kryo.readObjectOrNull(input, HoodieKey.class);
    this.currentLocation = (HoodieRecordLocation) kryo.readClassAndObject(input);
    this.newLocation = (HoodieRecordLocation) kryo.readClassAndObject(input);
  }
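The constraint is not Kryo-specific: any deserialization hook that repopulates an already-constructed instance rules out final fields. A minimal stdlib analogy using java.io.Externalizable (class and field names here are illustrative):

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

// Illustrative analogy: like KryoSerializable#read above, readExternal
// must reassign fields on an already-constructed instance, so the
// fields cannot be final.
public class IndexItemLike implements Externalizable {
  private String key;      // cannot be final: reassigned in readExternal
  private String location; // cannot be final: reassigned in readExternal

  public IndexItemLike() {}  // public no-arg constructor required by Externalizable

  public IndexItemLike(String key, String location) {
    this.key = key;
    this.location = location;
  }

  @Override
  public void writeExternal(ObjectOutput out) throws IOException {
    out.writeObject(key);
    out.writeObject(location);
  }

  @Override
  public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
    this.key = (String) in.readObject();      // reassignment forbids final
    this.location = (String) in.readObject();
  }

  public String getKey() { return key; }
  public String getLocation() { return location; }
}
```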

@prashantwason (Member)

> @prashantwason @nbalajee @suryaprasanna would this break you all in any way? Do we need the record data anywhere for successful writes?

The record index implementation requires the record key and the location to create the mapping in the index. This is a similar requirement to other non-implicit indexes like HBaseIndex.

@prashantwason (Member)

@clownxc If I understand correctly, the memory savings come from dropping the "data" part of the HoodieRecord? I noticed that HoodieRecord has only 2 additional members - sealed (boolean) and data (T). Are the savings due to the usage of the mock class (which may be bloated compared to the original HoodieRecord)?

But hoodie write handles deflate the HoodieRecord after writing, so the data portion should go away, reducing the amount of savings possible.

Can you run the test again with these changes:

  1. WriteStatus status = new WriteStatus(true, 1.0); // enable success record tracking as errors should be rare
  2. Create an actual HoodieRecord and use that in the for loop instead of the mock(HoodieRecord.class)
  3. Call deflate on the created HoodieRecord to remove the data, as the write handles do.

I feel the above may give a more realistic view of savings.

Also, how did you find this interesting optimization? I am interested as there may be other avenues for such savings within HUDI, so it would be good to know how you track these.

@xiaochen-zhou (Contributor, Author)

> @clownxc If I understand correctly, the memory savings come from dropping the "data" part of the HoodieRecord? I noticed that HoodieRecord has only 2 additional members - sealed (boolean) and data (T). Are the savings due to the usage of the mock class (which may be bloated compared to the original HoodieRecord)?
>
> But hoodie write handles deflate the HoodieRecord after writing, so the data portion should go away, reducing the amount of savings possible.
>
> Can you run the test again with these changes:
>
>   1. WriteStatus status = new WriteStatus(true, 1.0); // enable success record tracking as errors should be rare
>   2. Create an actual HoodieRecord and use that in the for loop instead of the mock(HoodieRecord.class)
>   3. Call deflate on the created HoodieRecord to remove the data, as the write handles do.
>
> I feel the above may give a more realistic view of savings.
>
> Also, how did you find this interesting optimization? I am interested as there may be other avenues for such savings within HUDI, so it would be good to know how you track these.

I would be happy to do it.

@xiaochen-zhou (Contributor, Author) commented May 6, 2023

> Also, how did you find this interesting optimization? I am interested as there may be other avenues for such savings within HUDI, so it would be good to know how you track these.

This interesting optimization was reported by @nsivabalan in HUDI-5298 and had not been implemented for a long time, so I am trying to complete it.

@hudi-bot (Collaborator) commented May 6, 2023

CI report:

Bot commands: @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@xiaochen-zhou (Contributor, Author)

According to the suggestion provided by @prashantwason, I did a test as follows:

    WriteStatus status = new WriteStatus(true, 1.0);
    String partitionPath = HoodieTestDataGenerator.DEFAULT_PARTITION_PATHS[0];
    dataGen = new HoodieTestDataGenerator(new String[] {partitionPath});
    String newCommitTime = "001";
    List<HoodieRecord> records = dataGen.generateInserts(newCommitTime, 1000);
    Throwable t = new Exception("some error in writing");
    // Mark alternating records as success and failure,
    // deflating each record afterwards as the write handles do.
    for (int i = 0; i + 1 < 1000; i += 2) {
      HoodieRecord data1 = records.get(i);
      status.markSuccess(data1, Option.empty());
      data1.deflate();
      HoodieRecord data2 = records.get(i + 1);
      status.markFailure(data2, t, Option.empty());
      data2.deflate();
    }
    System.out.println("status memory: " + ObjectSizeCalculator.getObjectSize(status));

It was found that the memory occupied before the optimization (status memory: 113048) and after it (status memory: 117032) basically did not change. The main reasons are that the hoodie write handles deflate the HoodieRecord after writing, and that the mock class may be bloated compared to the real HoodieRecord (I'm sorry I didn't take these two factors into account in the previous test).
@prashantwason @danny0405 @vinothchandar

I do have a doubt: is some optimization needed for writeStatus.markFailure when an exception occurs before record.deflate()?

      writeStatus.markSuccess(hoodieRecord, recordMetadata);
      // deflate record payload after recording success. This will help users access payload as a
      // part of marking
      // record successful.
      hoodieRecord.deflate();
      return finalRecordOpt;
    } catch (Exception e) {
      LOG.error("Error writing record  " + hoodieRecord, e);
      writeStatus.markFailure(hoodieRecord, e, recordMetadata);
    }

Or, in some places, there is no deflate operation when writeStatus.markFailure is called:

    if (indexedRecord.isPresent()) {
      // Skip the ignored record.
      try {
        if (!indexedRecord.get().shouldIgnore(writeSchema, recordProperties)) {
          recordList.add(indexedRecord.get());
        }
      } catch (IOException e) {
        writeStatus.markFailure(record, e, record.getMetadata());
        LOG.error("Error writing record  " + indexedRecord.get(), e);
      }
    }

Although the benefit of optimizing this may not be large:

  public void markFailure(HoodieRecord record, Throwable t, Option<Map<String, String>> optionalRecordMetadata) {
    if (failedRecords.isEmpty() || (random.nextDouble() <= failureFraction)) {
      // Guaranteed to have at-least one error
      failedRecords.add(record);
      errors.put(record.getKey(), t);
    }
    totalRecords++;
    totalErrorRecords++;
  }
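For context, the sampling in markFailure above can be illustrated in isolation (a simplified, illustrative sketch, not Hudi's actual API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Simplified sketch of the sampling in markFailure: keep roughly
// failureFraction of failed records, but always retain at least one.
public class FailureSampler {
  private final List<String> failedKeys = new ArrayList<>();
  private final Random random;
  private final double failureFraction;
  private long totalErrorRecords = 0;

  public FailureSampler(double failureFraction, long seed) {
    this.failureFraction = failureFraction;
    this.random = new Random(seed); // seeded for reproducibility
  }

  public void markFailure(String recordKey) {
    // Guaranteed to keep at least one error, then sample the rest.
    if (failedKeys.isEmpty() || random.nextDouble() <= failureFraction) {
      failedKeys.add(recordKey);
    }
    totalErrorRecords++;
  }

  public int sampledCount() { return failedKeys.size(); }
  public long totalErrorRecords() { return totalErrorRecords; }
}
```

With failureFraction = 0.01 and 10,000 failures, only roughly 1% of the failed keys are retained in memory, while the error counter still reflects the full total.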

I hope you can leave some comments when you have time. @prashantwason @danny0405 @vinothchandar

@bvaradar (Contributor) commented May 7, 2023

@clownxc: For failed records, we need to have them logged elsewhere, so there is no need to deflate. For exception cases, the write status should be marked as failure. So I don't see any reason to change this.

@xiaochen-zhou (Contributor, Author)

> @clownxc: For failed records, we need to have them logged elsewhere, so there is no need to deflate. For exception cases, the write status should be marked as failure. So I don't see any reason to change this.

I see. Thank you very much for the review.

@danny0405 (Contributor)

Cool, I think we are good to close this issue.
