[CARBONDATA-3560] Fixed issues for Add Segment #3426
633553d
to
7f032f1
Compare
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/713/
Build Failed with Spark 2.3.2, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/721/
Build Failed with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/720/
7f032f1
to
09d2e5c
Compare
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/714/
Build Success with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/721/
Build Success with Spark 2.3.2, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/722/
@ravipesala Please review.
 * @param segmentPath
 * @return
 */
public static CarbonFile[] getListOfCarbonIndexFiles(String segmentPath) {
Does the input parameter point to the segment folder of a transactional table?
Yes, it's the path to the segment folder, used to get the carbon files.
CarbonFile segmentFolder = FileFactory.getCarbonFile(segmentPath);
CarbonFile[] indexFiles = segmentFolder.listFiles(new CarbonFileFilter() {
  @Override public boolean accept(CarbonFile file) {
    return (file.getName().endsWith(CarbonTablePath.INDEX_FILE_EXT) || file.getName()
Move file.getName() to the next line.
Done.
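For readers without the CarbonData sources at hand, the filter discussed above can be sketched with plain java.io.File. This is only an illustration, not the PR's code: the class name IndexFileLister is hypothetical, the extension strings are assumed values for the CarbonTablePath constants, and the merge index extension is an assumption because the second half of the condition is cut off in the diff.

```java
import java.io.File;

final class IndexFileLister {
  // Assumed values; the PR reads the extension from CarbonTablePath constants.
  static final String INDEX_FILE_EXT = ".carbonindex";
  // Assumption: the truncated half of the condition checks the merge index extension.
  static final String MERGE_INDEX_FILE_EXT = ".carbonindexmerge";

  // True when the file name looks like an index or merge index file.
  static boolean isIndexFile(String name) {
    return name.endsWith(INDEX_FILE_EXT)
        || name.endsWith(MERGE_INDEX_FILE_EXT);
  }

  // Lists index files directly under the segment folder.
  // Returns an empty array when the folder is missing or unreadable.
  static File[] listIndexFiles(String segmentPath) {
    File[] files = new File(segmentPath).listFiles((dir, name) -> isIndexFile(name));
    return files == null ? new File[0] : files;
  }
}
```

An empty result from listIndexFiles corresponds to the "no carbon index files found" case that the add segment command rejects later in this review.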
val allSegments = SegmentStatusManager.readLoadMetadata(carbonTable.getMetadataPath)

for (currSegment <- allSegments) {
Use allSegments.contains or .exists instead of a for loop.
Add a comment to explain this validation.
Done.
.../src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAddLoadCommand.scala
@@ -92,6 +92,22 @@ case class CarbonAddLoadCommand(
val segmentPath = options.getOrElse(
  "path", throw new UnsupportedOperationException("PATH is manadatory"))

val format = options.getOrElse("format", "carbondata")
val isCarbonFormat = format.equals("carbondata") || format.equals("carbon")
move to line 107
Done.
@@ -542,7 +541,7 @@ class AddSegmentTestCase extends QueryTest with BeforeAndAfterAll {
copy(path.toString, newPath)
checkAnswer(sql("select count(*) from addsegment1"), Seq(Row(30)))

sql(s"alter table addsegment1 add segment options('path'='$newPath', 'format'='parquet')").show()
sql(s"alter table addsegment1 add segment options('path'='$newPath', 'format'='PARQUET')").show()
why is this change required?
This change is only there to test an uppercase format value, so that I do not need to add a new test case just for case insensitivity.
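The case-insensitive format handling that this test exercises can be sketched as follows. It is a minimal illustration under assumptions: FormatOption is a hypothetical helper, the options map stands in for the command's options, and the default of "carbondata" is taken from the diff above.

```java
import java.util.Map;

final class FormatOption {
  // Resolves the 'format' option (defaulting to carbondata, as in the diff)
  // and compares it without regard to case, so 'CARBON' and 'carbon' match.
  static boolean isCarbonFormat(Map<String, String> options) {
    String format = options.getOrDefault("format", "carbondata");
    return format.equalsIgnoreCase("carbondata")
        || format.equalsIgnoreCase("carbon");
  }
}
```

With String.equals, a value such as 'CARBONDATA' would not be recognised as the carbon format; equalsIgnoreCase removes that dependence on how the user typed the option.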
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/725/
Build Success with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/732/
Build Success with Spark 2.3.2, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/733/
6a84526
to
44f9d68
Compare
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/730/
val isCarbonFormat = format.equalsIgnoreCase("carbondata") || format.equalsIgnoreCase("carbon")

// If in the given location no carbon files are found then we can throw an exception
if (isCarbonFormat && SegmentFileStore.getListOfCarbonIndexFiles(segmentPath).isEmpty) {
I think this validation should come from SparkFiileFormat, so ideally there should not be any changes in this class for format-level validation. Better to move it to the format level.
@ravipesala Here we are only checking whether carbon files are present in the given folder location. Since there was already an 'isCarbonFormat' check further below, I just moved it up and reused it.
If we need to check this, it is better to check whether the index files and data files match:
- If there is a merge index file, all data files should be present in the merge index file
- Otherwise, there should be one index file for each data file
Please move this validation to another function and invoke it here.
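The two rules suggested above could be captured in a separate function along these lines. This is a rough sketch, not code from the PR: SegmentFileValidator and its parameters are hypothetical, and reading the entries out of a real merge index file is left out entirely. The caller is assumed to pass null for mergeIndexEntries when the segment has no merge index file.

```java
import java.util.Collection;
import java.util.Set;

final class SegmentFileValidator {
  // dataFiles: names of the data files found in the segment folder.
  // indexFileCount: number of plain index files found in the folder.
  // mergeIndexEntries: data file names referenced by the merge index,
  //   or null when no merge index file exists (assumed convention).
  static boolean isConsistent(Collection<String> dataFiles,
      int indexFileCount, Set<String> mergeIndexEntries) {
    if (mergeIndexEntries != null) {
      // Rule 1: every data file must be present in the merge index.
      return mergeIndexEntries.containsAll(dataFiles);
    }
    // Rule 2: one index file for each data file.
    return indexFileCount == dataFiles.size();
  }
}
```

Keeping the check in its own function, as the reviewer asks, means the add segment command only needs a single call and the rules can be tested without a Spark session.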
@manishnalla1994 The earlier condition is not for validation; it is only there to update the metadata. How does SparkFiileFormat schema inference behave if there are no carbondata or carbonindex files? I feel it is better to fix this in the lower layer than to add validations in the top layer.
Build Failed with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/737/
Build Failed with Spark 2.3.2, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/738/
44f9d68
to
1b3f616
Compare
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/734/
Build Success with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/740/
Build Success with Spark 2.3.2, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/742/
1b3f616
to
66c584f
Compare
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/746/
if (allSegments.exists(a =>
  a.getPath != null && a.getPath.equalsIgnoreCase(segmentPath)
)) {
  throw new AnalysisException(s"Cannot add the segment. This path is already in use by " +
Change the message to: path already exists in table status file, can not add same segment path repeatedly: $path
Done
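The duplicate-path check with the reworded message can be sketched like this. It is an illustration under assumptions: DuplicateSegmentPathCheck is a hypothetical class, existingPaths stands in for the paths read from the table status file, and IllegalArgumentException is used in place of Spark's AnalysisException so the sketch has no Spark dependency.

```java
import java.util.List;

final class DuplicateSegmentPathCheck {
  // True when some existing segment already uses this path.
  // Entries may be null (segments without an external path), so guard first.
  static boolean pathAlreadyAdded(List<String> existingPaths, String segmentPath) {
    return existingPaths.stream()
        .anyMatch(p -> p != null && p.equalsIgnoreCase(segmentPath));
  }

  // Rejects the add segment request when the path is already registered.
  static void validate(List<String> existingPaths, String segmentPath) {
    if (pathAlreadyAdded(existingPaths, segmentPath)) {
      throw new IllegalArgumentException(
          "path already exists in table status file, "
              + "can not add same segment path repeatedly: " + segmentPath);
    }
  }
}
```

The anyMatch call plays the same role as the Scala exists used in the diff: it stops at the first matching segment instead of looping over all of them.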
val format = options.getOrElse("format", "carbondata")
val isCarbonFormat = format.equalsIgnoreCase("carbondata") || format.equalsIgnoreCase("carbon")

// If in the given location no carbon files are found then we can throw an exception
// If in the given location no carbon files are found then we can throw an exception
// If in the given location no carbon index files are found then we should throw an exception
Done.
66c584f
to
10c2a5f
Compare
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/747/
Build Success with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/753/
Build Success with Spark 2.3.2, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/755/
10c2a5f
to
b0389d5
Compare
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/769/
Build Success with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/775/
Build Success with Spark 2.3.2, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/777/
LGTM
retest this please
Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/785/
Build Success with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/791/
Build Success with Spark 2.3.2, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/793/
LGTM
Issue1 : When the format is given in uppercase, add segment fails with an unknown format error.
Solution1 : Made the format case-insensitive.
Issue2 : The same path could be added repeatedly; this operation is now blocked.
Issue3 : Added validation for a folder that does not contain carbon files.