[HOTFIX] Fix task id in FileFormat write #3324
Conversation
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/3810/ |
Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/12083/ |
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/4015/ |
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/3811/ |
/**
 * counter used for generating unique task numbers.
 */
val counter = new AtomicLong()
Can we use UUID instead of this shared variable to generate the id? In practice, UUID is better than a timestamp.
@jackylk: ok, I will check.
@ravipesala: I have used logic similar to SparkCarbonTableFormat. Is there any issue that kept UUID from being used there?
@jackylk: in carbon, the task id is treated as a number, not a string, so queries might fail. Changing the number to a string would need more changes. So here the unique id is generated from the task id and the counter.
understood.
@@ -49,7 +49,7 @@ public CarbonThreadFactory(String name, boolean withTime) {
   @Override public Thread newThread(Runnable r) {
     final Thread thread = defaultFactory.newThread(r);
     if (withTime) {
-      thread.setName(name + "_" + System.currentTimeMillis());
+      thread.setName(name + "_" + System.nanoTime());
Here it would be better to use the generated task number as the thread id, so that debugging is easier.
I just made it more accurate by changing from milliseconds to nanoseconds.
This is not directly linked with the Spark task number, so we cannot use it here; this thread pool is used internally in many places.
I will analyse and handle it in another PR if required.
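For reference, a self-contained factory in the shape of the patched `CarbonThreadFactory` (a simplified sketch with hypothetical names, not the actual CarbonData class) that appends `System.nanoTime()` to the pool name, as the diff above does:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

// Sketch of a named thread factory; names and structure are illustrative,
// not copied from CarbonData.
public class NamedThreadFactory implements ThreadFactory {
    private final ThreadFactory defaultFactory = Executors.defaultThreadFactory();
    private final String name;
    private final boolean withTime;

    public NamedThreadFactory(String name, boolean withTime) {
        this.name = name;
        this.withTime = withTime;
    }

    @Override
    public Thread newThread(Runnable r) {
        final Thread thread = defaultFactory.newThread(r);
        // nanoTime gives a finer-grained suffix than currentTimeMillis when
        // many pools start in the same millisecond, though uniqueness is
        // still not guaranteed by the JVM.
        thread.setName(withTime ? name + "_" + System.nanoTime() : name);
        return thread;
    }

    public static void main(String[] args) {
        Thread t = new NamedThreadFactory("CarbonWriter", true).newThread(() -> {});
        System.out.println(t.getName().startsWith("CarbonWriter_")); // prints "true"
    }
}
```

Such a factory would typically be passed to `Executors.newFixedThreadPool(n, factory)` so every worker thread in the pool carries the pool's name plus the timestamp suffix.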
Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/12084/ |
LGTM |
retest this please |
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/3812/ |
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/4017/ |
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/12085/ |
@ajantha-bhat I feel concurrent loading may still have the issue, as we don't have the segment id in it. It would be safer to use UUID, but the changes required are a little large. |
Scenario is handled in #3325 |
Problem: in FileFormat write, carbon uses System.nanoTime() as the task id.
Cause: when multiple tasks are launched concurrently, there is a rare chance that two tasks get the same id. When that happens, two Spark tasks launched for one insert produce the same carbondata file name, and when both tasks write to that one file it is likely to be corrupted, which leads to query failures.
Solution: use our own unique task id instead of nanoseconds.
Here, the Spark task id combined with a global counter generates a task id that is unique across jobs.
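A minimal sketch of that scheme (hypothetical class and constant; not the actual CarbonData implementation): combine the Spark task id with a process-wide `AtomicLong` so that even two tasks started in the same nanosecond get distinct numeric ids.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of "spark task id + global counter"; the scaling
// factor and names are assumptions, not taken from the PR.
public class UniqueTaskId {
    // Process-wide counter shared by all write tasks in this JVM.
    private static final AtomicLong COUNTER = new AtomicLong();

    /**
     * Builds a numeric id: the Spark task id occupies the high digits and
     * the monotonically increasing counter the low digits, so the result
     * stays a number (carbon treats task ids as numbers, not strings).
     */
    public static long generate(long sparkTaskId) {
        return sparkTaskId * 100000L + COUNTER.incrementAndGet();
    }

    public static void main(String[] args) {
        long a = generate(7);
        long b = generate(7); // same Spark task id, still a distinct id
        System.out.println(a != b); // prints "true"
    }
}
```

Because `AtomicLong.incrementAndGet()` is atomic, concurrent tasks on the same executor can never observe the same counter value, which is exactly the collision `System.nanoTime()` could not rule out.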
Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:
Any interfaces changed? NA
Any backward compatibility impacted? NA
Document update required? NA
Testing done
Done. Attached the report:
testReport.txt
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. [NA]