[SUPPORT]HoodieSnapshotCopier throw exception when deal with a nonpartitioned hudi table #2244

Closed
xqy179 opened this issue Nov 11, 2020 · 2 comments


xqy179 commented Nov 11, 2020


Describe the problem you faced

There may be a bug in HoodieSnapshotCopier. When I use HoodieSnapshotCopier to back up a non-partitioned Hudi table, it throws an exception. My application code:

    val copier = new HoodieSnapshotCopier();
    copier.snapshot(spark.sparkContext, srcBasePath, desBasePath, false);

Code trace, from public class HoodieSnapshotCopier implements Serializable { ... }:

    // also need to copy over partition metadata
    Path partitionMetaFile =
        new Path(new Path(baseDir, partition), HoodiePartitionMetadata.HOODIE_PARTITION_METAFILE);
    ...
    Path toPartitionPath = new Path(outputDir, partition);

When snapshotting a non-partitioned table, the partition variable is an empty string, so new Path(baseDir, partition) throws an exception. After I changed the code as follows, it works well:

    Path partitionMetaFile;
    if (partition.isEmpty()) {
      partitionMetaFile = new Path(new Path(baseDir), HoodiePartitionMetadata.HOODIE_PARTITION_METAFILE);
    } else {
      partitionMetaFile = new Path(new Path(baseDir, partition), HoodiePartitionMetadata.HOODIE_PARTITION_METAFILE);
    }

    ...

    Path toPartitionPath;
    if (partition.isEmpty()) {
      toPartitionPath = new Path(outputDir);
    } else {
      toPartitionPath = new Path(outputDir, partition);
    }
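
The two if/else blocks above repeat the same guard, so it could be factored into one helper. The following is only a minimal sketch of that idea using plain strings so it runs without Hadoop on the classpath; the class name PartitionPaths and the method resolve are hypothetical and are not part of Hudi. In the real code the returned string would presumably be wrapped in a Hadoop Path.

```java
// Hypothetical helper (not part of Hudi): resolves a partition directory
// without ever building a path from an empty partition string.
final class PartitionPaths {
    private PartitionPaths() {}

    /**
     * Joins a base directory and a relative partition path.
     * A null or empty partition (non-partitioned table) yields the base directory itself.
     */
    static String resolve(String baseDir, String partition) {
        if (partition == null || partition.isEmpty()) {
            // Non-partitioned table: data and metadata live directly under baseDir.
            return baseDir;
        }
        // Drop a single trailing slash so the join does not produce "//".
        String base = baseDir.endsWith("/")
            ? baseDir.substring(0, baseDir.length() - 1)
            : baseDir;
        return base + "/" + partition;
    }
}
```

With this sketch, PartitionPaths.resolve("/data/hudi/src", "") returns "/data/hudi/src", and PartitionPaths.resolve("/data/hudi/src", "2020/11/11") returns "/data/hudi/src/2020/11/11"; the paths here are made-up examples. Both partitionMetaFile and toPartitionPath could then be built from the resolved directory.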

Expected behavior

Environment Description

  • Hudi version : 0.6.0; 0.6.1 still behaves the same.

  • Spark version :

  • Hive version :

  • Hadoop version :

  • Storage (HDFS/S3/GCS..) :

  • Running on Docker? (yes/no) :


@bvaradar
Contributor

@xqy179: I have created PR #2250 to fix this. Can you give it a shot and let me know if it works?

Thanks,
Balaji.V

@bvaradar
Contributor

The fix was merged. Closing this.
