
[CARBONDATA-2749][dataload] In HDFS, empty tablestatus file is written during dataload, IUD or compaction when disk is full. #2517

Closed

Conversation

@mohammadshahidkhan (Contributor) commented on Jul 17, 2018

Problem:
When a failure occurs because the disk is full during load, IUD, or compaction, the tablestatus.tmp file written during the atomic file operation remains empty, and in the finally block this empty tablestatus.tmp file is renamed over the actual file. This results in an empty tablestatus file. Once this happens, the tablestatus file cannot be recovered and the already loaded data cannot be used.
Solution:
If the failure happens during the write, the rename in the finally block must be skipped.
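
For illustration, here is a minimal, self-contained sketch of the failure mode and the fix. The class and method names (AtomicStatusWriter, write, close) are hypothetical, modeled on the atomic-file-operation flow described above, not the project's actual code:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Writes tablestatus content to a temp file first, then installs it by rename.
class AtomicStatusWriter {
  private final File target;
  private final File temp;
  private boolean failed = false;

  AtomicStatusWriter(String path) {
    this.target = new File(path);
    this.temp = new File(path + ".tmp");
  }

  void write(String content) throws IOException {
    try (OutputStream out = new FileOutputStream(temp)) {
      // If the disk fills up here, temp is left empty or truncated.
      out.write(content.getBytes(StandardCharsets.UTF_8));
    } catch (IOException ioe) {
      failed = true; // the fix: remember that the write did not complete
      throw ioe;
    }
  }

  // Before the fix, the rename ran unconditionally in the finally block,
  // so a failed write replaced a good tablestatus file with an empty one.
  void close() throws IOException {
    if (!failed) {
      if (!temp.renameTo(target)) {
        throw new IOException("Rename failed for " + temp);
      }
    } else if (temp.exists() && !temp.delete()) {
      throw new IOException("Could not delete temp file " + temp);
    }
  }
}
```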

Be sure to complete all of the following checklist items to help us incorporate
your contribution quickly and easily:

  • Any interfaces changed?
    NA
  • Any backward compatibility impacted?
    NA
  • Document update required?
    NA
  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    - How it is tested? Please attach test report.
    - Is it a performance related change? Please attach the performance test report.
    - Any additional information to help reviewers in testing this change.
    NA
  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
    NA

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6024/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7260/

@mohammadshahidkhan (Contributor Author)

retest SDV please

@ravipesala (Contributor)

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5903/

@mohammadshahidkhan (Contributor Author)

The failed test case is unrelated and flaky.
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5903/

}

}

@Override public void setFailed() {
A reviewer (Contributor) commented:

setFailed needs to be called in the exception handling of the following usages of AtomicFileOperations:

  • org.apache.carbondata.core.datamap.status.DiskBasedDataMapStatusProvider#writeLoadDetailsIntoFile
  • org.apache.carbondata.core.metadata.SegmentFileStore#writeSegmentFile(org.apache.carbondata.core.metadata.SegmentFileStore.SegmentFile, java.lang.String)
  • org.apache.carbondata.core.statusmanager.SegmentStatusManager#writeLoadMetadata

Please also handle any other references I missed.
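
A sketch of the pattern being requested at those call sites; the names fileWrite and setFailed come from this review, while the minimal interface and the surrounding names are assumptions for illustration only:

```java
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical minimal interface mirroring the discussion above.
interface AtomicFileOperations {
  DataOutputStream openForWrite() throws IOException;
  void setFailed();                // mark the write as failed
  void close() throws IOException; // rename on success, delete temp on failure
}

class LoadMetadataWriter {
  void writeLoadMetadata(AtomicFileOperations fileWrite, byte[] content)
      throws IOException {
    DataOutputStream out = null;
    try {
      out = fileWrite.openForWrite();
      out.write(content);
    } catch (IOException ioe) {
      fileWrite.setFailed(); // the change requested above, at each call site
      throw ioe;
    } finally {
      if (out != null) {
        out.close();
      }
      fileWrite.close(); // skips the temp-file rename when setFailed() was called
    }
  }
}
```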

The author (Contributor) replied:

Fixed

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7349/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6113/

@brijoobopanna (Contributor)

retest sdv please

@ravipesala (Contributor)

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5955/

@mohammadshahidkhan (Contributor Author)

retest sdv please

}
} else {
LOGGER.warn("The temporary file renaming skipped due to I/O error, deleting file "
KanakaKumar (Contributor) commented:

The delete code is actually added here. AtomicFileOperationsImpl already takes care of overwriting the temp file even if it exists, so please correct the message accordingly.

The author (Contributor) replied:

@KanakaKumar Thanks for the review.
Actually, when the write fails the temp file persists, and it is better to delete it at the same time to avoid problems in a secure environment.

KanakaKumar (Contributor) replied:

Ok fine.

@ravipesala (Contributor)

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5966/

@KanakaKumar (Contributor)

LGTM

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7450/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6207/

@ravipesala (Contributor)

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5982/

@mohammadshahidkhan force-pushed the tablestatus_fix branch 2 times, most recently from 69c253f to cd03a27 on July 25, 2018 at 07:03
@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7489/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6247/

@mohammadshahidkhan force-pushed the tablestatus_fix branch 2 times, most recently from 678f529 to e49c4ca on July 25, 2018 at 09:55
@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7499/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7504/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6261/

@brijoobopanna (Contributor)

retest sdv please

1 similar comment
@brijoobopanna (Contributor)

retest sdv please

@brijoobopanna (Contributor)

retest sdv please

@ravipesala (Contributor)

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6001/

@brijoobopanna (Contributor)

retest sdv please

@ravipesala (Contributor)

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6012/

@mohammadshahidkhan (Contributor Author)

retest sdv please

@@ -724,6 +724,7 @@ public void writeLoadDetailsIntoFile(List<SegmentUpdateDetails> listOfSegmentUpd
brWriter.write(metadataInstance);
} catch (IOException ioe) {
LOG.error("Error message: " + ioe.getLocalizedMessage());
fileWrite.setFailed();
manishgupta88 (Contributor) commented:

At some places the IOException is thrown back after setting the flag, and at other places the exception is not thrown. Is there any specific reason for this? Can we keep the behavior the same? It is better to throw the exception back at all places.

The author (Contributor) replied:

@manishgupta88 Thanks for the review.
Agreed with your point; swallowing the exception is not a good idea. Handled by throwing it back at all the places.
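
For clarity, a before/after sketch of this change. The fragments reuse brWriter, metadataInstance, LOG, and fileWrite from the diff quoted above; the "after" form illustrates the agreed behavior rather than the exact committed code:

```java
// Before: the IOException is logged and swallowed, so callers cannot react.
try {
  brWriter.write(metadataInstance);
} catch (IOException ioe) {
  LOG.error("Error message: " + ioe.getLocalizedMessage());
  fileWrite.setFailed();
}

// After: the failure flag is set and the exception is rethrown,
// so every caller of the write path sees the failure consistently.
try {
  brWriter.write(metadataInstance);
} catch (IOException ioe) {
  LOG.error("Error message: " + ioe.getLocalizedMessage());
  fileWrite.setFailed();
  throw ioe;
}
```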

… during dataload, IUD or compaction when disk is full.
@ravipesala (Contributor)

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6025/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7572/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6330/

@ravipesala (Contributor)

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6038/

@kunal642 (Contributor)

LGTM. The failures in the 2.2.1 build are not related to this PR; they are failing on master as well and need to be fixed.

@asfgit closed this in 7628571 on Jul 30, 2018
asfgit pushed a commit that referenced this pull request Jul 30, 2018
[CARBONDATA-2749] In HDFS, empty tablestatus file is written during
dataload, IUD or compaction when disk is full.

Problem:
When a failure occurs because the disk is full during load, IUD, or
compaction, the tablestatus.tmp file written during the atomic file
operation remains empty, and in the finally block this empty
tablestatus.tmp file is renamed over the actual file. This results in
an empty tablestatus file. Once this happens, the tablestatus file
cannot be recovered and the already loaded data cannot be used.

Solution:
If the failure happens during the write, the rename in the finally
block must be skipped.

This closes #2517
sgururajshetty pushed a commit to sgururajshetty/carbondata that referenced this pull request on Aug 2, 2018 (same commit message as above; that commit closes apache#2517).