
[CARBONDATA-1900][Core,processing] Modify loadmetadata to store timestamp long value(in ms) instead of formatted date string for fields "loadStartTime" and "timestamp" #1666

Conversation

@mohammadshahidkhan (Contributor) commented Dec 15, 2017

Problem:

If the table is moved to an environment with a different timezone, or the system's current timezone is changed, some of the blocks are no longer treated as valid blocks after an IUD operation.
[{"timestamp":"15-12-2017 16:50:31:703","loadStatus":"Success","loadName":"0","partitionCount":"0","isDeleted":"FALSE","dataSize":"912","indexSize":"700","updateDeltaEndTimestamp":"","updateDeltaStartTimestamp":"","updateStatusFileName":"","loadStartTime":"15-12-2017 16:50:27:493","visibility":"true","fileFormat":"COLUMNAR_V3"}]

part-0-0_batchno0-0-1513336827493.carbondata

If the current timezone differs from the one in effect when the load was done, the value calculated from loadStartTime "15-12-2017 16:50:27:493" will not match the timestamp extracted from the block file name.
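The mismatch can be reproduced outside CarbonData with a few lines of Java (a hypothetical demo, not project code): the same formatted string parses to different epoch values under different timezones.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class TimezoneDemo {
    // Parse the tablestatus-style date string under an explicit timezone.
    static long parseIn(String ts, String zone) throws ParseException {
        SimpleDateFormat fmt = new SimpleDateFormat("dd-MM-yyyy HH:mm:ss:SSS");
        fmt.setTimeZone(TimeZone.getTimeZone(zone));
        return fmt.parse(ts).getTime();
    }

    public static void main(String[] args) throws ParseException {
        String loadStartTime = "15-12-2017 16:50:27:493";
        long inIst = parseIn(loadStartTime, "Asia/Kolkata");
        long inUtc = parseIn(loadStartTime, "UTC");
        // Same string, two different instants: they differ by the zone
        // offset (5h30m = 19,800,000 ms), so a comparison against the
        // epoch millis embedded in the block file name fails in one zone.
        System.out.println(inUtc - inIst); // 19800000
    }
}
```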

Solution:

We should stop writing loadStartTime and timestamp in the "dd-MM-yyyy HH:mm:ss:SSS" format.
We should instead write the long value of the timestamp (epoch millis), as shown below.
[{"timestamp":"1513336827593","loadStatus":"Success","loadName":"0","partitionCount":"0","isDeleted":"FALSE","dataSize":"912","indexSize":"700","updateDeltaEndTimestamp":"","updateDeltaStartTimestamp":"","updateStatusFileName":"","loadStartTime":"1513336827493", "visibility":"true","fileFormat":"COLUMNAR_V3"}]

  • Any interfaces changed?

  • Any backward compatibility impacted?
For old loads, if the string-to-long parse fails, the code falls back to date parsing.

  • Document update required?
For old loaded tables we can document the limitation that table movement must be done
across environments in the same timezone.

  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    - How it is tested? Please attach test report.
    - Is it a performance related change? Please attach the performance test report.
    - Any additional information to help reviewers in testing this change.
    Manually tested

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
    NA
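The backward-compatibility rule above can be sketched as follows (the method name `toMillis` is illustrative, not the actual CarbonData API): try the new long format first, and only fall back to date parsing for tablestatus files written before this change.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

public class LoadStartTimeParser {
    // New tablestatus files store epoch millis as a plain long string;
    // old ones store a "dd-MM-yyyy HH:mm:ss:SSS" date string, so a failed
    // long parse falls back to date parsing (timezone-dependent, which is
    // exactly why the old format is being retired).
    static long toMillis(String loadStartTime) throws ParseException {
        try {
            return Long.parseLong(loadStartTime);             // new format
        } catch (NumberFormatException e) {
            SimpleDateFormat parser =
                new SimpleDateFormat("dd-MM-yyyy HH:mm:ss:SSS");
            return parser.parse(loadStartTime).getTime();     // old format
        }
    }
}
```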

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2005/

@CarbonDataQA

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/779/

@mohammadshahidkhan mohammadshahidkhan changed the title table status timezone problem fix [CARBONDATA-1900][Core,processing] Modify loadmetadata to store timestamp long value(in ms) instead of formatted date string for fields "loadStartTime" and "timestamp" Dec 15, 2017
@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2325/

@mohammadshahidkhan
Contributor Author

This PR is OK: there is no code change after the first push, only the commit message was amended.
@ravipesala Please review

@CarbonDataQA

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/796/

@ravipesala
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2338/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2029/

@mohammadshahidkhan
Contributor Author

mohammadshahidkhan commented Dec 16, 2017

The failed test in the SDV build is unrelated; it fails randomly on other PRs as well, but passes on re-trigger.

@mohammadshahidkhan
Contributor Author

retest this please

@CarbonDataQA

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/801/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2032/

} catch (ParseException e) {
  LOGGER.error("Cannot convert " + loadStartTime + " to Time/Long type value. " + e.getMessage());
  return null;
// for new loads the factTimeStamp will be a long string
@jackylk (Contributor) commented Dec 17, 2017

In the future, for maintenance, it will not be easy to understand what a "new load" is; can you add more background information? You can state what was stored before Carbon 1.3 and what changed since 1.3.

Contributor Author:

Fixed. Added a detailed comment at the class level.

Contributor:

Do not mention "new loads" in line 275. Add a comment in line 279 mentioning that it is the processing for existing tables created before Carbon 1.3.

Contributor Author:

fixed

@mohammadshahidkhan force-pushed the tablestatus_timezone_fix branch 2 times, most recently from c15fc73 to a2687b1, December 18, 2017 07:02
@CarbonDataQA

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/860/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2085/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2382/

} catch (ParseException e) {
  LOGGER.error("Cannot convert " + factTimeStamp + " to Time/Long type value. " + e.getMessage());
  parser = new SimpleDateFormat(CarbonCommonConstants.CARBON_TIMESTAMP);
// for new loads the factTimeStamp will be a long string
Contributor:

Do not mention "new loads" in line 275. Add a comment in line 279 mentioning that it is the processing for existing tables created before Carbon 1.3.

Contributor Author:

Fixed

@@ -211,25 +234,32 @@ public long getLoadStartTimeAsLong() {
* @return
*/
private long convertTimeStampToLong(String factTimeStamp) {
Contributor:

What is the difference between this function and getTimeStamp? They seem to be doing the same thing.

Contributor Author:

getTimeStamp returns the value in nanoseconds, while this function (convertTimeStampToLong) returns it in milliseconds.
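The unit distinction the author describes can be sketched as follows (hypothetical helper names and a nanosecond conversion factor assumed for illustration; CarbonData's actual getTimeStamp internals may differ):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

public class TimestampPrecision {
    private static final String FORMAT = "dd-MM-yyyy HH:mm:ss:SSS";

    // Returns the timestamp in epoch milliseconds, the unit the new
    // tablestatus format stores (like convertTimeStampToLong).
    static long toMillis(String date) throws ParseException {
        return new SimpleDateFormat(FORMAT).parse(date).getTime();
    }

    // Returns the same instant at a finer scale (nanoseconds here, as
    // an assumed factor), analogous to the getTimeStamp distinction.
    static long toNanos(String date) throws ParseException {
        return toMillis(date) * 1_000_000L;
    }
}
```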

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2383/

…tamp long value (in ms) instead of formatted date string for fields "loadStartTime" and "timestamp"
@CarbonDataQA

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/864/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2089/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2387/

@mohammadshahidkhan
Contributor Author

retest this please

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2117/

@CarbonDataQA

Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/893/

@jackylk
Contributor

jackylk commented Dec 19, 2017

LGTM

@manishgupta88
Contributor

LGTM

@asfgit asfgit closed this in 804ddb7 Dec 19, 2017
jatin9896 pushed a commit to jatin9896/incubator-carbondata that referenced this pull request Jan 5, 2018
…tamp long value(in ms) instead of formatted date string for fields "loadStartTime" and "timestamp"

If the table is moved to environment having different timezone or we change the system current timezone, after IUD operation some of the blocks are not treated as valid blocks.

We should stop writing the loadStartTime and timestamp in dd-MM-yyyy HH:mm:ss:SSS format. We should write the long value of the timestamp

This closes apache#1666
anubhav100 pushed a commit to anubhav100/incubator-carbondata that referenced this pull request Jun 22, 2018
…tamp long value(in ms) instead of formatted date string for fields "loadStartTime" and "timestamp"

If the table is moved to environment having different timezone or we change the system current timezone, after IUD operation some of the blocks are not treated as valid blocks.

We should stop writing the loadStartTime and timestamp in dd-MM-yyyy HH:mm:ss:SSS format. We should write the long value of the timestamp

This closes apache#1666