[CARBONDATA-4122] Use CarbonFile API instead of java File API for Flink CarbonLocalWriter #4090
Conversation
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5438/
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3677/
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5439/
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3678/
Force-pushed from 5171f27 to 42567df
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5440/ |
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3679/ |
docs/flink-integration-guide.md (Outdated)

```diff
@@ -78,7 +78,7 @@ limitations under the License.
 val carbonProperties = new Properties
 // Set the carbon properties here, such as date format, store location, etc.

-// Create carbon bulk writer factory. Two writer types are supported: 'Local' and 'S3'.
+// Create carbon bulk writer factory. Three writer types are supported: 'Local', 'Hdfs' and 'S3'.
```
If we use the file factory API everywhere and support HDFS conf input, one writer should be enough, right? Do we really need three writers?
In the carbon table code and the SDK we don't create multiple writer types to handle this kind of scenario.
Yes, I thought about the same. But since writers for the LOCAL and S3 types were already implemented, I implemented one for HDFS. However, I can see some differences specific to the S3 writer: it needs some extra configuration, and it does not create directories when writing stage directories in S3. You can check CarbonS3Writer.commit.
Changed the code to use CarbonLocalWriter itself to handle both the Local and HDFS file systems. Please review.
Force-pushed from 42567df to 8a6d987
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5450/
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3689/
LGTM
Why is this PR needed?
Currently, only two writer types (Local & S3) are supported for Flink carbon streaming. If a user wants to ingest data from Flink in carbon format directly into an HDFS carbon table, there is no writer type that supports it.
What changes were proposed in this PR?
Since the code for writing Flink stage data is the same for the Local and HDFS file systems, the existing CarbonLocalWriter can also write data into HDFS by using the CarbonFile API instead of the java File API.
Changed the code to use the CarbonFile API instead of java.io.File.
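The idea behind the change can be sketched as follows (in Python for brevity; the class and method names here are illustrative stand-ins, not CarbonData's actual CarbonFile/FileFactory signatures): a writer that talks only to a file abstraction, rather than to java.io.File directly, works unchanged whether the backing store is the local disk or HDFS.

```python
import os
from abc import ABC, abstractmethod

class StageFile(ABC):
    """Illustrative stand-in for a CarbonFile-style abstraction."""
    @abstractmethod
    def exists(self) -> bool: ...
    @abstractmethod
    def mkdirs(self) -> None: ...

class LocalStageFile(StageFile):
    """Local-disk backend. An HDFS-backed class would implement the same
    interface, which is why a separate HDFS writer type is unnecessary."""
    def __init__(self, path: str):
        self.path = path
    def exists(self) -> bool:
        return os.path.exists(self.path)
    def mkdirs(self) -> None:
        os.makedirs(self.path, exist_ok=True)

def commit_stage(stage_dir: StageFile) -> None:
    # The writer's commit step only uses the abstraction, mirroring how
    # this PR routes CarbonLocalWriter through the CarbonFile API.
    if not stage_dir.exists():
        stage_dir.mkdirs()
```

With this shape, supporting HDFS is a matter of supplying an HDFS-backed file implementation, rather than cloning the writer.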
Does this PR introduce any user interface change?
Is any new testcase added?