Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CARBONDATA-2388][SDK]Avro Record Complex Type Implementation #2209

Closed
wants to merge 1 commit into from

Conversation

sounakr
Copy link
Contributor

@sounakr sounakr commented Apr 23, 2018

Avro Complex DataType Support.

  • AVRO Complex type. Supported Datatype - ARRAYS and RECORDS.

  • Carbon Supported DataType - ARRAYS and STRUCT

  • SDK support to handle complex datatype.

  • Carbon Complex Type Support
    - Support for NonDictionary Fields.
    - Existing Complex type bug fixes.

  • Any interfaces changed?

  • Any backward compatibility impacted?

  • Document update required?

  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    - How it is tested? Please attach test report.
    - Is it a performance related change? Please attach the performance test report.
    - Any additional information to help reviewers in testing this change.

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4138/

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5319/

@jackylk
Copy link
Contributor

jackylk commented Apr 23, 2018

Please modify the PR title, it is for SDK, change to [SDK]

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5510/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4345/

@ravipesala
Copy link
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4611/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4354/

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5518/

@ravipesala
Copy link
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4620/

@ravipesala
Copy link
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4621/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4355/

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5519/

//
// checkAnswer(sql("select * from sdkOutputTable"), Seq(Row("robot0", 0, 0.0),
// Row("robot1", 1, 0.5),
// Row("robot2", 2, 1.0)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please revert back this test case and add new one

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@sounakr sounakr changed the title [WIP][Non Transactional Table]Avro Record Complex Type Implementation [CARBONDATA-2388][SDK]Avro Record Complex Type Implementation Apr 30, 2018
// This routine is going to infer schema from the carbondata file footer
// Convert the ColumnSchema -> TableSchema -> TableInfo.
// Return the TableInfo.
if (tableInfoFromCache != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If tableInfoFromCache is found, No need to call inferschema itself, please handle this outside infer schema

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@ravipesala
Copy link
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4622/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4357/

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5521/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4623/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4624/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4358/

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5522/

}


public PrimitiveQueryType(String name, String parentname, int blockIndex,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more extra constructor is not required. If dictionary is not null then isDictionary=true

Copy link
Contributor Author

@sounakr sounakr Apr 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

KeyGenerator[] generator) throws IOException, KeyGenException {
@Override public void parseAndBitPack(ByteBuffer byteArrayInput,
DataOutputStream dataOutputStream, KeyGenerator[] generator)
throws IOException, KeyGenException {
int dataLength = byteArrayInput.getInt();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove extra space

@@ -43,6 +43,9 @@
*/
private static final long serialVersionUID = 7676766554874863763L;

public void columnSchema() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the use of this method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. No Use.

@Override
public void parseAndBitPack(ByteBuffer byteArrayInput, DataOutputStream dataOutputStream,
KeyGenerator[] generator) throws IOException, KeyGenException {
@Override public void parseAndBitPack(ByteBuffer byteArrayInput,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this method from interface as below new method will be used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

.getDataBasedOnDataTypeFromNoDictionary(
ByteBuffer.wrap(complexTypeKeyArray[complexTypeColumnIndex++]));
} else
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the braces up

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -348,6 +356,33 @@ public void setCardinalityFinder(DictionaryCardinalityFinder cardinalityFinder)
return complexKeyGenerators;
}

public Boolean[][] createComplexDictionaryFieldIdentification() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not required, please remove this method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@@ -222,8 +222,9 @@ private void addComplexColumn(int index, int rowId, byte[] complexColumns) {
ByteBuffer byteArrayInput = ByteBuffer.wrap(complexColumns);
ByteArrayOutputStream byteArrayOutput = new ByteArrayOutputStream();
DataOutputStream dataOutputStream = new DataOutputStream(byteArrayOutput);
complexDataType.parseAndBitPack(byteArrayInput, dataOutputStream,
model.getComplexDimensionKeyGenerator());
int startOffset = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the use of startOffset when not used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed


int parseComplexValue(ByteBuffer byteArrayInput, DataOutputStream dataOutputStream,
KeyGenerator[] generator, Boolean[][] complexDictionaryIndentification, int startOffset)
throws IOException, KeyGenException;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove complexDictionaryIndentification and startOffset if not used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

complexDictionaryIndentification is removed but StartOffset is used.

@ravipesala
Copy link
Contributor

@sounakr Please add testcases for complextype also not only in AVRO.

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5542/

@ravipesala
Copy link
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4637/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4379/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4638/

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5558/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4395/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4654/

@CarbonDataQA
Copy link

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5560/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4656/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5563/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4401/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4398/

@ravipesala
Copy link
Contributor

@sounakr Please fix the build

  a) Complex Type suport for AVRO Record and Array datatype.
  b) Complex Type No Dictionary Implementation.
@sounakr
Copy link
Contributor Author

sounakr commented May 1, 2018

Retest this please

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5565/

@ravipesala
Copy link
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4659/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4403/

@ravipesala
Copy link
Contributor

LGTM

@asfgit asfgit closed this in 3202cf5 May 1, 2018
@ravipesala
Copy link
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4661/

anubhav100 pushed a commit to anubhav100/incubator-carbondata that referenced this pull request Jun 22, 2018
Avro Complex DataType Support.

AVRO Complex type. Supported Datatype - ARRAYS and RECORDS.

Carbon Supported DataType - ARRAYS and STRUCT

SDK support to handle complex datatype.

Carbon Complex Type Support
- Support for NonDictionary Fields.
- Existing Complex type bug fixes.

This closes apache#2209
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants