[SPARK-28912][BRANCH-2.4] Fixed MatchError in getCheckpointFiles()
### What changes were proposed in this pull request?

This change fixes issue SPARK-28912.

### Why are the changes needed?

If the checkpoint directory is set to a name that matches the regex pattern used for checkpoint files, the logs are flooded with `MatchError` exceptions and old checkpoint files are not removed.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manually.

1. Start Hadoop in pseudo-distributed mode.
2. In another terminal, run the command `nc -lk 9999`.
3. In the Spark shell, execute the following statements:

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

// The checkpoint directory name itself matches the checkpoint file pattern.
val ssc = new StreamingContext(sc, Seconds(30))
ssc.checkpoint("hdfs://localhost:9000/checkpoint-01")

// Word count over the socket fed by `nc -lk 9999`.
val lines = ssc.socketTextStream("localhost", 9999)
val words = lines.flatMap(_.split(" "))
val pairs = words.map(word => (word, 1))
val wordCounts = pairs.reduceByKey(_ + _)
wordCounts.print()

ssc.start()
ssc.awaitTermination()
```

Closes apache#25719 from avkgh/SPARK-28912-branch-2.4.

Authored-by: avk <nullp7r@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
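
For readers unfamiliar with this failure mode, the following is a minimal sketch of how a directory name that matches a file-name pattern can lead to a `MatchError`. It is not the code from this patch: the pattern `checkpoint-<digits><suffix>`, the object `CheckpointNameSketch`, and all helper names are assumptions made for illustration. The sketch assumes that files are selected by a substring search over the full path, while the timestamp is later extracted by a full match on the file name alone.

```scala
import scala.util.Try

// Hypothetical illustration; names and pattern are assumptions, not Spark's API.
object CheckpointNameSketch {
  // Assumed pattern: "checkpoint-" + digits + optional word/dot suffix.
  val Pattern = """checkpoint-([\d]+)([\w\.]*)""".r

  // Substring search over the full path: true if the pattern occurs anywhere,
  // including inside a parent directory name.
  def matchesAnywhere(fullPath: String): Boolean =
    Pattern.findFirstIn(fullPath).nonEmpty

  // Full match on the file name only: true only if the whole name fits.
  def matchesFileName(fullPath: String): Boolean =
    Pattern.pattern.matcher(fileNameOf(fullPath)).matches()

  // Extracts (time, hasSuffix); throws scala.MatchError if the name does not fit.
  def parseName(fileName: String): (Long, Boolean) = fileName match {
    case Pattern(time, suffix) => (time.toLong, suffix.nonEmpty)
  }

  // Last path component, without relying on any filesystem library.
  def fileNameOf(fullPath: String): String =
    fullPath.substring(fullPath.lastIndexOf('/') + 1)

  def main(args: Array[String]): Unit = {
    // A non-checkpoint file inside a directory named like a checkpoint file.
    val stray = "hdfs://localhost:9000/checkpoint-01/temp"
    println(matchesAnywhere(stray))                      // selected by the path-wide search
    println(Try(parseName(fileNameOf(stray))).isFailure) // parsing "temp" throws MatchError
    println(matchesFileName(stray))                      // a name-only filter rejects it
  }
}
```

Under these assumptions, filtering on the file name rather than the full path keeps stray files away from the parsing step, so the directory name no longer matters.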