[SPARK-40034][SQL] PathOutputCommitters to support dynamic partitions
### What changes were proposed in this pull request?

Uses the StreamCapabilities probe in MAPREDUCE-7403 to identify when a
PathOutputCommitter is compatible with dynamic partition overwrite.
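For illustration, here is a minimal sketch of what such a probe looks like on the Spark side. The capability string and the helper class are assumptions of this sketch, not the exact code in the patch:

```java
import org.apache.hadoop.fs.StreamCapabilities;
import org.apache.hadoop.mapreduce.lib.output.PathOutputCommitter;

final class DynamicPartitionProbe {
  // Capability string defined by MAPREDUCE-7403 (name assumed here for illustration).
  static final String CAPABILITY_DYNAMIC_PARTITIONING =
      "mapreduce.job.committer.dynamic.partitioning";

  // A committer is treated as compatible only if it opts in through StreamCapabilities.
  static boolean supportsDynamicPartitions(PathOutputCommitter committer) {
    return committer instanceof StreamCapabilities
        && ((StreamCapabilities) committer).hasCapability(CAPABILITY_DYNAMIC_PARTITIONING);
  }
}
```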

This patch has unit tests but not integration tests; it really needs
to test the SQL commands through the manifest committer into GCS/ABFS,
or at least the local filesystem. That will be possible once hadoop 3.3.5 is out...

### Why are the changes needed?

Hadoop 3.3.5 adds a new committer in mapreduce-core which works fast and correctly on Azure and GCS. (It would also work on HDFS, but it is optimised for the cloud stores.)

The stores and the committer do meet the requirements of Spark SQL dynamic partition overwrite, so it is safe for Spark to work through it.
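For context, "dynamic partition overwrite" is the `partitionOverwriteMode=dynamic` behaviour, where only the partitions present in the incoming data are replaced. A hedged sketch of a job that exercises it (session setup and paths are illustrative):

```java
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.lit;
import static org.apache.spark.sql.functions.pmod;

public class DynamicOverwriteExample {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("dpo-example")
        // Replace only the partitions present in the incoming data.
        .config("spark.sql.sources.partitionOverwriteMode", "dynamic")
        .getOrCreate();

    spark.range(100)
        .withColumn("part", pmod(col("id"), lit(10)))
        .write()
        .partitionBy("part")
        .mode("overwrite")
        .parquet("/tmp/dpo-example");  // illustrative output path
  }
}
```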

Spark does not know this; MAPREDUCE-7403 adds a way for any PathOutputCommitter to declare that it is compatible, and the IntermediateManifestCommitter will do so.
(apache/hadoop#4728)
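On the committer side, opting in is just a matter of implementing `StreamCapabilities`. A sketch of a hypothetical committer, not the actual manifest committer code (`FileOutputCommitter` is used as a concrete base purely to keep the sketch short):

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.StreamCapabilities;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter;

// Hypothetical committer that declares dynamic-partition compatibility.
public class CapableCommitter extends FileOutputCommitter implements StreamCapabilities {

  public CapableCommitter(Path outputPath, TaskAttemptContext context) throws IOException {
    super(outputPath, context);
  }

  @Override
  public boolean hasCapability(String capability) {
    // Assumed capability name from MAPREDUCE-7403.
    return "mapreduce.job.committer.dynamic.partitioning".equals(capability);
  }
}
```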

### Does this PR introduce _any_ user-facing change?

No.

There is documentation on the feature in the hadoop [manifest committer](https://github.com/apache/hadoop/blob/82372d0d22e696643ad97490bc902fb6d17a6382/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/manifest_committer.md) docs.

### How was this patch tested?

1. Unit tests in hadoop-cloud which work with Hadoop versions with and without the matching change.
2. New integration tests in https://github.com/hortonworks-spark/cloud-integration, which require Spark to be built against a Hadoop version in which the manifest committer declares compatibility.

Those new integration tests include

* Spark SQL test derived from Spark's own [CloudRelationBasicSuite.scala#L212](https://github.com/hortonworks-spark/cloud-integration/blob/master/cloud-examples/src/test/scala/org/apache/spark/sql/sources/CloudRelationBasicSuite.scala#L212)
* Dataset tests extended to verify support for/rejection of dynamic partition overwrite [AbstractCommitDataframeSuite.scala#L151](https://github.com/hortonworks-spark/cloud-integration/blob/master/cloud-examples/src/test/scala/com/cloudera/spark/cloud/committers/AbstractCommitDataframeSuite.scala#L151)

Tested against Azure Cardiff with the manifest committer, and against S3 London (the S3A committers reject dynamic partition overwrite).

Closes #37468 from steveloughran/SPARK-40034-MAPREDUCE-7403-manifest-committer-partitioning.

Authored-by: Steve Loughran <stevel@cloudera.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
a0x8o committed Sep 9, 2022
1 parent b94ddad commit b70340e
Showing 19 changed files with 798 additions and 61 deletions.
4 changes: 4 additions & 0 deletions common/network-common/pom.xml
@@ -82,6 +82,10 @@
       <artifactId>leveldbjni-all</artifactId>
       <version>1.8</version>
     </dependency>
+    <dependency>
+      <groupId>org.rocksdb</groupId>
+      <artifactId>rocksdbjni</artifactId>
+    </dependency>
 
     <dependency>
       <groupId>com.fasterxml.jackson.core</groupId>

common/network-common/src/main/java/org/apache/spark/network/shuffledb/DBBackend.java
@@ -21,10 +21,10 @@
 
 /**
  * The enum `DBBackend` use to specify a disk-based store used in shuffle service local db.
- * Only LEVELDB is supported now.
+ * Support the use of LevelDB and RocksDB.
  */
 public enum DBBackend {
-  LEVELDB(".ldb");
+  LEVELDB(".ldb"), ROCKSDB(".rdb");
 
   private final String fileSuffix;
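As a usage sketch, the shuffle service can now resolve its backend from configuration. The config key below is an assumption of this sketch; `LEVELDB` remains the default:

```java
import java.util.Locale;

import org.apache.spark.network.shuffledb.DBBackend;

public class BackendSelection {
  public static void main(String[] args) {
    // The key name "spark.shuffle.service.db.backend" is an assumption of this
    // sketch; LEVELDB remains the default when nothing is configured.
    String configured = System.getProperty("spark.shuffle.service.db.backend", "LEVELDB");
    DBBackend backend = DBBackend.valueOf(configured.toUpperCase(Locale.ROOT));
    System.out.println("shuffle service DB backend: " + backend);
  }
}
```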

common/network-common/src/main/java/org/apache/spark/network/shuffledb/RocksDB.java (new file)
@@ -0,0 +1,71 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.network.shuffledb;

import java.io.IOException;

import com.google.common.base.Throwables;
import org.rocksdb.RocksDBException;

/**
 * RocksDB implementation of the local KV storage used to persist the shuffle state.
 */
public class RocksDB implements DB {
  private final org.rocksdb.RocksDB db;

  public RocksDB(org.rocksdb.RocksDB db) {
    this.db = db;
  }

  @Override
  public void put(byte[] key, byte[] value) {
    try {
      db.put(key, value);
    } catch (RocksDBException e) {
      throw Throwables.propagate(e);
    }
  }

  @Override
  public byte[] get(byte[] key) {
    try {
      return db.get(key);
    } catch (RocksDBException e) {
      throw Throwables.propagate(e);
    }
  }

  @Override
  public void delete(byte[] key) {
    try {
      db.delete(key);
    } catch (RocksDBException e) {
      throw Throwables.propagate(e);
    }
  }

  @Override
  public DBIterator iterator() {
    return new RocksDBIterator(db.newIterator());
  }

  @Override
  public void close() throws IOException {
    db.close();
  }
}
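A hedged usage sketch of this wrapper; the open/options boilerplate is illustrative and not Spark's actual `RocksDBProvider` logic:

```java
import org.apache.spark.network.shuffledb.DB;
import org.apache.spark.network.shuffledb.RocksDB;
import org.rocksdb.Options;

public class RocksDBWrapperExample {
  public static void main(String[] args) throws Exception {
    org.rocksdb.RocksDB.loadLibrary();  // load the native library once per JVM
    try (Options options = new Options().setCreateIfMissing(true);
         org.rocksdb.RocksDB raw = org.rocksdb.RocksDB.open(options, "/tmp/shuffle-db")) {
      DB db = new RocksDB(raw);  // adapt to the shuffle service's DB interface
      db.put("app-1".getBytes(), "registered".getBytes());
      System.out.println(new String(db.get("app-1".getBytes())));
      db.delete("app-1".getBytes());
      // try-with-resources closes the underlying RocksDB handle
    }
  }
}
```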

common/network-common/src/main/java/org/apache/spark/network/shuffledb/RocksDBIterator.java (new file)
@@ -0,0 +1,95 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.network.shuffledb;

import java.io.IOException;
import java.util.AbstractMap;
import java.util.Map;
import java.util.NoSuchElementException;

import com.google.common.base.Throwables;
import org.rocksdb.RocksIterator;

/**
 * RocksDB implementation of `DBIterator`.
 */
public class RocksDBIterator implements DBIterator {

  private final RocksIterator it;

  private boolean checkedNext;

  private boolean closed;

  private Map.Entry<byte[], byte[]> next;

  public RocksDBIterator(RocksIterator it) {
    this.it = it;
  }

  @Override
  public boolean hasNext() {
    if (!checkedNext && !closed) {
      next = loadNext();
      checkedNext = true;
    }
    if (!closed && next == null) {
      try {
        close();
      } catch (IOException ioe) {
        throw Throwables.propagate(ioe);
      }
    }
    return next != null;
  }

  @Override
  public Map.Entry<byte[], byte[]> next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    checkedNext = false;
    Map.Entry<byte[], byte[]> ret = next;
    next = null;
    return ret;
  }

  @Override
  public void close() throws IOException {
    if (!closed) {
      it.close();
      closed = true;
      next = null;
    }
  }

  @Override
  public void seek(byte[] key) {
    it.seek(key);
  }

  private Map.Entry<byte[], byte[]> loadNext() {
    if (it.isValid()) {
      Map.Entry<byte[], byte[]> nextEntry =
        new AbstractMap.SimpleEntry<>(it.key(), it.value());
      it.next();
      return nextEntry;
    }
    return null;
  }
}
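A sketch of scanning entries through this iterator via the `DB` interface; the key prefix and helper are assumptions, and it relies on `DBIterator` being `Closeable`, as the `close()` override above suggests:

```java
import java.util.Map;

import org.apache.spark.network.shuffledb.DB;
import org.apache.spark.network.shuffledb.DBIterator;

public final class ScanExample {
  // Prints every entry whose key is at or after startKey.
  static void dump(DB db, byte[] startKey) throws Exception {
    try (DBIterator it = db.iterator()) {
      it.seek(startKey);  // position on the first key >= startKey
      while (it.hasNext()) {
        Map.Entry<byte[], byte[]> e = it.next();
        System.out.println(new String(e.getKey()) + " -> " + e.getValue().length + " bytes");
      }
    }
  }
}
```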

common/network-common/src/main/java/org/apache/spark/network/util/DBProvider.java
@@ -22,9 +22,10 @@
 import com.fasterxml.jackson.databind.ObjectMapper;
 import com.google.common.annotations.VisibleForTesting;
 
+import org.apache.spark.network.shuffledb.DB;
 import org.apache.spark.network.shuffledb.DBBackend;
 import org.apache.spark.network.shuffledb.LevelDB;
-import org.apache.spark.network.shuffledb.DB;
+import org.apache.spark.network.shuffledb.RocksDB;
 import org.apache.spark.network.shuffledb.StoreVersion;
 
 public class DBProvider {
@@ -34,11 +35,13 @@ public static DB initDB(
       StoreVersion version,
       ObjectMapper mapper) throws IOException {
     if (dbFile != null) {
-      // TODO: SPARK-38888, add rocksdb implementation.
       switch (dbBackend) {
         case LEVELDB:
           org.iq80.leveldb.DB levelDB = LevelDBProvider.initLevelDB(dbFile, version, mapper);
           return levelDB != null ? new LevelDB(levelDB) : null;
+        case ROCKSDB:
+          org.rocksdb.RocksDB rocksDB = RocksDBProvider.initRockDB(dbFile, version, mapper);
+          return rocksDB != null ? new RocksDB(rocksDB) : null;
         default:
           throw new IllegalArgumentException("Unsupported DBBackend: " + dbBackend);
       }
@@ -49,11 +52,11 @@ public static DB initDB(
   @VisibleForTesting
   public static DB initDB(DBBackend dbBackend, File file) throws IOException {
     if (file != null) {
-      // TODO: SPARK-38888, add rocksdb implementation.
       switch (dbBackend) {
         case LEVELDB: return new LevelDB(LevelDBProvider.initLevelDB(file));
-        default:
-          throw new IllegalArgumentException("Unsupported DBBackend: " + dbBackend);
+        case ROCKSDB: return new RocksDB(RocksDBProvider.initRocksDB(file));
+        default:
+          throw new IllegalArgumentException("Unsupported DBBackend: " + dbBackend);
       }
     }
     return null;
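Putting it together, a sketch of obtaining a `DB` through the provider; the file path and the `org.apache.spark.network.util` package for `DBProvider` are assumptions of this sketch:

```java
import java.io.File;

import org.apache.spark.network.shuffledb.DB;
import org.apache.spark.network.shuffledb.DBBackend;
import org.apache.spark.network.util.DBProvider;  // package assumed

public class ProviderExample {
  public static void main(String[] args) throws Exception {
    File stateFile = new File("/tmp/shuffle-state.rdb");  // illustrative path
    DB db = DBProvider.initDB(DBBackend.ROCKSDB, stateFile);
    try {
      db.put("key".getBytes(), "value".getBytes());
    } finally {
      db.close();
    }
  }
}
```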
