Merged
92 changes: 92 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,92 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Contributing to Apache DataFusion Java

Bug reports, design discussion, and patches are welcome. This project follows
the Apache DataFusion contribution model.

## Filing issues and discussing changes

- File bugs and feature requests on [GitHub issues](https://github.com/apache/datafusion-java/issues).
- For larger or design-level discussion, the mailing list is
[dev@datafusion.apache.org](mailto:dev@datafusion.apache.org).
- Please open an issue before sending a PR for any significant change so the
approach can be agreed on first.

## Development workflow

Branch from `main`, write changes with [conventional commit](https://www.conventionalcommits.org/)
messages in the imperative mood (e.g. `feat: add foo`, `fix(native): handle bar`),
and open a pull request targeting `main`.
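
As a rough illustration of the message format, the sketch below checks a commit subject line against a conventional-commit pattern. The function name and the list of accepted types are assumptions made for illustration; the type list follows common convention and is not taken from this project's tooling.

```sh
# Hypothetical helper: succeed if the subject line looks like a conventional commit.
# The accepted types below are an assumed, commonly used set; adjust as needed.
is_conventional_subject() {
  printf '%s\n' "$1" |
    grep -qE '^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([A-Za-z0-9_-]+\))?!?: .+'
}

is_conventional_subject "fix(native): handle bar" && echo "looks conventional"
```

A check like this could run from a local `commit-msg` hook, though nothing in this repository requires it.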

The first build in a fresh checkout reaches out to `raw.githubusercontent.com`
to fetch the DataFusion protobuf schemas (see *Updating the DataFusion /
protobuf schema version* below). Subsequent builds do not re-download the
schemas; the `download-maven-plugin` cache under `~/.m2/repository/.cache/`
satisfies them.
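
To see whether a machine is still in the "first build" state, one can look for the plugin cache directory named above. This is only a sketch: the function name is hypothetical, and a non-empty cache directory does not guarantee that the cached files match the pinned checksums.

```sh
# Hypothetical helper: report whether the download-maven-plugin cache looks populated.
# $1 optionally overrides the cache directory (default: the path described above).
proto_cache_status() {
  cache_dir="${1:-$HOME/.m2/repository/.cache/download-maven-plugin}"
  if [ -d "$cache_dir" ] && [ -n "$(ls -A "$cache_dir" 2>/dev/null)" ]; then
    echo "cached"
  else
    echo "empty"
  fi
}

proto_cache_status
```

If this prints `cached`, Maven's offline flag (`./mvnw -o verify`) is worth trying, assuming every other dependency is already in the local repository.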

## Code style

- Java: run `./mvnw spotless:apply` before committing. CI fails the build if
formatting drifts.
- Rust: run `cargo fmt` and `cargo clippy --all-targets -- -D warnings` inside
`native/`.
- New source files need the Apache 2.0 license header. Apache RAT enforces this
during `verify`.
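
The checks above can be bundled into one local script. The sketch below is an assumption-laden example, not project tooling: the `run` and `style_checks` names and the `DRY_RUN` switch are hypothetical, and it passes `--manifest-path` to cargo so it can be started from the repository root rather than from inside `native/`.

```sh
# Hypothetical wrapper: with DRY_RUN=1, print each command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run the formatters and linters listed above, stopping at the first failure.
# Assumption: --manifest-path is used so the script works from the repository
# root; the list above runs the cargo commands from inside native/ instead.
style_checks() {
  run ./mvnw spotless:apply || return 1
  run cargo fmt --manifest-path native/Cargo.toml || return 1
  run cargo clippy --manifest-path native/Cargo.toml --all-targets -- -D warnings
}

# Show what would be executed without touching the working tree.
DRY_RUN=1 style_checks
```

Calling `style_checks` without `DRY_RUN=1` runs the real commands and returns non-zero as soon as one of them fails.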

## Updating the DataFusion / protobuf schema version

Three things must move together when bumping DataFusion:

1. `native/Cargo.toml` — the `datafusion` crate dependency.
2. `pom.xml` — the `<datafusion.version>` Maven property. **Must equal the
Cargo version**; on a mismatch, JVM-built protobuf plans may fail to
deserialize on the native side.
3. `pom.xml` — the `<sha512>` checksums on the two `download-maven-plugin`
executions. These pin the downloaded `.proto` files; the build fails if
upstream silently re-tags them, which is the desired behavior.
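
A quick consistency check for the first two items might look like the sketch below. The function names are hypothetical, and the Cargo parsing assumes the dependency is declared either as `datafusion = "X"` or as `datafusion = { version = "X", ... }` on a single line; a git or path dependency would need different handling. It compares the declared versions only, not what `Cargo.lock` resolves to.

```sh
# Extract the <datafusion.version> Maven property from a pom.xml.
pom_datafusion_version() {
  grep -m1 -oE '<datafusion\.version>[^<]+' "$1" | cut -d'>' -f2
}

# Extract the declared datafusion crate version from a Cargo.toml.
# Assumes a single-line dependency entry (see the note above).
cargo_datafusion_version() {
  grep -m1 -E '^datafusion[[:space:]]*=' "$1" |
    grep -oE '[0-9]+(\.[0-9]+)*' | head -n1
}

# Print "ok: <version>" when both files agree; otherwise print the mismatch
# and return non-zero. $1 = pom.xml path, $2 = Cargo.toml path.
check_datafusion_versions() {
  pom_v=$(pom_datafusion_version "$1")
  cargo_v=$(cargo_datafusion_version "$2")
  if [ -n "$pom_v" ] && [ "$pom_v" = "$cargo_v" ]; then
    echo "ok: $pom_v"
  else
    echo "mismatch: pom=$pom_v cargo=$cargo_v"
    return 1
  fi
}
```

From the repository root this would be invoked as `check_datafusion_versions pom.xml native/Cargo.toml`.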

Recipe:

```sh
# 1. Bump the Cargo dep
$EDITOR native/Cargo.toml # set datafusion = "<new>"
(cd native && cargo update -p datafusion)

# 2. Bump the Maven property to match
$EDITOR pom.xml # set <datafusion.version>

# 3. Compute the new SHA-512 hashes for both `.proto` files from the upstream
# tag you just set in step 2, then paste them into the two <sha512> elements
# in pom.xml. `curl -f` reports HTTP errors (e.g. a 404 for a mistyped tag)
# instead of letting the error page be hashed; discard the hash if curl
# prints an error. Where `shasum` is missing, `sha512sum` works the same here.
NEW=$(grep -m1 -oE '<datafusion\.version>[^<]+' pom.xml | cut -d'>' -f2)
curl -fsSL "https://raw.githubusercontent.com/apache/datafusion/$NEW/datafusion/proto-common/proto/datafusion_common.proto" | shasum -a 512 | awk '{print $1}'
curl -fsSL "https://raw.githubusercontent.com/apache/datafusion/$NEW/datafusion/proto/proto/datafusion.proto" | shasum -a 512 | awk '{print $1}'
$EDITOR pom.xml # paste the two hashes into the <sha512> elements

# Drop the local download cache so the next build re-downloads against the new hashes.
rm -rf ~/.m2/repository/.cache/download-maven-plugin target/proto

# 4. Verify
make && make test
```
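
The hashing step of the recipe can also be wrapped in small helpers. This is a sketch under stated assumptions, not part of the build: `sha512_hex` and `proto_sha512` are hypothetical names, and the download goes to a temporary file first so that a failed fetch returns an error instead of a digest.

```sh
# Print the SHA-512 hex digest of stdin, using whichever tool is installed.
sha512_hex() {
  if command -v sha512sum >/dev/null 2>&1; then
    sha512sum | awk '{print $1}'
  else
    shasum -a 512 | awk '{print $1}'
  fi
}

# Fetch one file from the apache/datafusion repository at a given tag and
# print its SHA-512. $1 = tag (e.g. the <datafusion.version> value),
# $2 = path inside the repository. Returns non-zero if the download fails.
proto_sha512() {
  tmp_file=$(mktemp) || return 1
  if curl -fsSL "https://raw.githubusercontent.com/apache/datafusion/$1/$2" -o "$tmp_file"; then
    sha512_hex < "$tmp_file"
    rc=$?
  else
    rc=1
  fi
  rm -f "$tmp_file"
  return "$rc"
}
```

With these in scope, `proto_sha512 "$NEW" datafusion/proto/proto/datafusion.proto` prints the value for the matching `<sha512>` element, and prints nothing if the tag or path is wrong.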

The protobuf runtime version (`<protobuf.version>` in `pom.xml`) tracks the
Java ecosystem (security and JDK compatibility), not DataFusion. Bump it
independently when there is a reason.
58 changes: 58 additions & 0 deletions pom.xml
@@ -33,6 +33,8 @@ under the License.
<maven.compiler.target>17</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<junit.version>5.11.3</junit.version>
<datafusion.version>53.1.0</datafusion.version>
<protobuf.version>3.25.5</protobuf.version>
</properties>

<dependencies>
@@ -58,9 +60,21 @@ under the License.
<version>19.0.0</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
<version>${protobuf.version}</version>
</dependency>
</dependencies>

<build>
<extensions>
<extension>
<groupId>kr.motd.maven</groupId>
<artifactId>os-maven-plugin</artifactId>
<version>1.7.1</version>
</extension>
</extensions>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
@@ -118,6 +132,7 @@ under the License.
<exclude>NOTICE.txt</exclude>
<!-- Project documentation that does not require headers -->
<exclude>README.md</exclude>
<exclude>CONTRIBUTING.md</exclude>
<exclude>docs/**</exclude>
<!-- VCS and editor metadata -->
<exclude>.gitignore</exclude>
@@ -138,6 +153,49 @@ under the License.
</excludes>
</configuration>
</plugin>
<plugin>
<groupId>com.googlecode.maven-download-plugin</groupId>
<artifactId>download-maven-plugin</artifactId>
<version>1.9.0</version>
<executions>
<execution>
<id>fetch-datafusion-common-proto</id>
<phase>generate-sources</phase>
<goals><goal>wget</goal></goals>
<configuration>
<url>https://raw.githubusercontent.com/apache/datafusion/${datafusion.version}/datafusion/proto-common/proto/datafusion_common.proto</url>
<outputDirectory>${project.build.directory}/proto/datafusion/proto-common/proto</outputDirectory>
<outputFileName>datafusion_common.proto</outputFileName>
<sha512>d6f3368372ea277cc23e26f196994b81616d38599357bb374cbd7eb1760e649a789e4c133d86a395ac701049a500348da2ec039d3f978ac5d8112c2876dded1f</sha512>
</configuration>
</execution>
<execution>
<id>fetch-datafusion-proto</id>
<phase>generate-sources</phase>
<goals><goal>wget</goal></goals>
<configuration>
<url>https://raw.githubusercontent.com/apache/datafusion/${datafusion.version}/datafusion/proto/proto/datafusion.proto</url>
<outputDirectory>${project.build.directory}/proto/datafusion/proto/proto</outputDirectory>
<outputFileName>datafusion.proto</outputFileName>
<sha512>c3d162b8e2a418e03f74caceaccfd934af89bb95a12ede13d4cc1701d24c734d74b1e96372142b173db05938dab7f965ad60d476363308c441677a63ea5fbcf7</sha512>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.xolstice.maven.plugins</groupId>
<artifactId>protobuf-maven-plugin</artifactId>
<version>0.6.1</version>
<configuration>
<protocArtifact>com.google.protobuf:protoc:${protobuf.version}:exe:${os.detected.classifier}</protocArtifact>
<protoSourceRoot>${project.build.directory}/proto</protoSourceRoot>
</configuration>
<executions>
<execution>
<goals><goal>compile</goal></goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
41 changes: 41 additions & 0 deletions src/test/java/org/apache/datafusion/proto/ProtoGenerationTest.java
@@ -0,0 +1,41 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.apache.datafusion.proto;

import static org.junit.jupiter.api.Assertions.assertNotNull;

import org.apache.datafusion.protobuf.LogicalPlanNode;
import org.junit.jupiter.api.Test;

/**
* Smoke test: confirms the datafusion-proto schema was downloaded, protoc generated Java sources,
* those sources landed on the compile classpath, and the {@code protobuf-java} runtime resolves.
*
* <p>Does not exercise any DataFusion plan semantics — those tests arrive with JVM-side plan
* construction.
*/
class ProtoGenerationTest {

  @Test
  void generatedClassIsLoadableAndConstructible() {
    LogicalPlanNode node = LogicalPlanNode.newBuilder().build();
    assertNotNull(node);
  }
}