[CARBONDATA-4122] Use CarbonFile API instead of java File API for Flink CarbonLocalWriter #4090

Closed
@Indhumathi27 Indhumathi27 commented Feb 7, 2021

Why is this PR needed?

Currently, only two writers (Local and S3) are supported for Flink carbon streaming. If a user wants to ingest data from Flink in carbon format directly into an HDFS carbon table, there is no writer type that supports it.

What changes were proposed in this PR?

Since the code for writing Flink stage data is the same for the local and HDFS file systems, the existing CarbonLocalWriter can write data into HDFS if it uses the CarbonFile API instead of the Java File API.

Changed code to use CarbonFile API instead of java.io.File.
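To illustrate the idea behind this change, here is a minimal sketch of writer logic that depends on a filesystem-neutral file handle rather than on java.io.File. It does not reproduce CarbonData's CarbonFile API: the names `StageFile`, `LocalStageFile`, and `writeStage` are hypothetical and exist only for this example. An HDFS-backed implementation is assumed to be possible by wrapping Hadoop's FileSystem behind the same interface.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StageWriterSketch {

    /** Hypothetical filesystem-neutral handle; a stand-in, not CarbonData's CarbonFile. */
    interface StageFile {
        boolean exists();
        void mkdirs() throws IOException;
        OutputStream create(String childName) throws IOException;
    }

    /** Local-disk implementation. An HDFS one would implement the same interface. */
    static final class LocalStageFile implements StageFile {
        private final Path dir;

        LocalStageFile(String path) {
            this.dir = Paths.get(path);
        }

        @Override
        public boolean exists() {
            return Files.exists(dir);
        }

        @Override
        public void mkdirs() throws IOException {
            Files.createDirectories(dir);
        }

        @Override
        public OutputStream create(String childName) throws IOException {
            return Files.newOutputStream(dir.resolve(childName));
        }
    }

    /** Writer logic sees only StageFile, so it is identical for every backend. */
    static void writeStage(StageFile stageDir, String fileName, String content)
            throws IOException {
        if (!stageDir.exists()) {
            stageDir.mkdirs();
        }
        try (OutputStream out = stageDir.create(fileName)) {
            out.write(content.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("stage-sketch");
        String stagePath = tmp.resolve("stage").toString();
        writeStage(new LocalStageFile(stagePath), "part-0", "hello");
        byte[] bytes = Files.readAllBytes(Paths.get(stagePath, "part-0"));
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

Because `writeStage` never touches java.io.File directly, supporting another file system in this sketch means adding an implementation of the interface, not a new writer.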

Does this PR introduce any user interface change?

  • No

Is any new testcase added?

  • No

@CarbonDataQA2

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5438/

@CarbonDataQA2

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3677/

@CarbonDataQA2

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5439/

@CarbonDataQA2

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3678/

@Indhumathi27 changed the title from "[WIP] Use CarbonFile API instead of java File API while writing Flink data" to "[WIP] Support HDFS Carbon writer for Flink Carbon Streaming" on Feb 8, 2021
@Indhumathi27 changed the title from "[WIP] Support HDFS Carbon writer for Flink Carbon Streaming" to "[CARBONDATA-4122] Support HDFS Carbon writer for Flink Carbon Streaming" on Feb 8, 2021
@CarbonDataQA2

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5440/

@CarbonDataQA2

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3679/

@@ -78,7 +78,7 @@ limitations under the License.
val carbonProperties = new Properties
// Set the carbon properties here, such as date format, store location, etc.

- // Create carbon bulk writer factory. Two writer types are supported: 'Local' and 'S3'.
+ // Create carbon bulk writer factory. Three writer types are supported: 'Local', 'Hdfs' and 'S3'.
Member

If we use the FileFactory API everywhere and support HDFS conf input, one writer is enough, right? Do we need three writers?
In the carbon table or SDK we don't create multiple types of writers to handle this kind of scenario.
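The single-writer approach suggested here relies on resolving the storage backend from the path itself. The following is a rough sketch of that dispatch, assuming the backend can be inferred from the URI scheme. It is not CarbonData's FileFactory: `StoreTypeSketch`, `StoreType`, and `resolve` are hypothetical names, and the set of recognized schemes is an assumption made for the example.

```java
import java.net.URI;

public class StoreTypeSketch {

    /** Hypothetical backend kinds, used only in this example. */
    enum StoreType { LOCAL, HDFS, S3 }

    /**
     * Infers the backend from the path's URI scheme.
     * A path with no scheme, or with the "file" scheme, is treated as local.
     * Paths that are not valid URIs (for example, containing spaces) make
     * URI.create throw IllegalArgumentException.
     */
    static StoreType resolve(String path) {
        String scheme = URI.create(path).getScheme();
        if (scheme == null || scheme.equalsIgnoreCase("file")) {
            return StoreType.LOCAL;
        }
        switch (scheme.toLowerCase()) {
            case "hdfs":
                return StoreType.HDFS;
            case "s3":
            case "s3a":
            case "s3n":
                return StoreType.S3;
            default:
                throw new IllegalArgumentException("Unsupported scheme: " + scheme);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("/tmp/carbon/t1"));
        System.out.println(resolve("hdfs://namenode:8020/warehouse/t1"));
        System.out.println(resolve("s3a://bucket/warehouse/t1"));
    }
}
```

With dispatch of this kind, the caller passes only a table path and one writer can serve several file systems.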

Contributor Author

Yes, I thought the same. But since writers were already implemented for the Local and S3 types, I implemented one for HDFS. As far as I can see, only the S3 writer differs: it needs some extra configuration, and it does not create a directory when writing stage directories in S3. You can check CarbonS3Writer.commit.

Contributor Author

Changed the code so that CarbonLocalWriter itself handles both the local and HDFS file systems. Please review.

@Indhumathi27 changed the title from "[CARBONDATA-4122] Support HDFS Carbon writer for Flink Carbon Streaming" to "[CARBONDATA-4122] Use CarbonFile API instead of java File API for Flink CarbonLocalWriter" on Feb 10, 2021
@CarbonDataQA2

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5450/

@CarbonDataQA2

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3689/

@ajantha-bhat
Member

LGTM

@asfgit asfgit closed this in 115182d Feb 10, 2021