[CI][Integration] Spark jobs are failing with problem on org.apache.arrow.flatbuf #41571

Closed
raulcd opened this issue May 7, 2024 · 15 comments

@raulcd
Member

raulcd commented May 7, 2024

Describe the bug, including details regarding any error messages, version, and platform.

The following Spark jobs:
test-conda-python-3.10-spark-v3.5.0
test-conda-python-3.11-spark-master
test-conda-python-3.8-spark-v3.5.0

are currently failing with:

 Error: ] /spark/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala:26: object flatbuf is not a member of package org.apache.arrow
Error: ] /spark/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala:456: Class org.apache.arrow.flatbuf.Message not found - continuing with a stub.
Error: [ERROR] two errors found
Error:  Failed to execute goal net.alchim31.maven:scala-maven-plugin:4.8.0:compile (scala-compile-first) on project spark-sql_2.12: Execution scala-compile-first of goal net.alchim31.maven:scala-maven-plugin:4.8.0:compile failed: org.apache.commons.exec.ExecuteException: Process exited with an error: 255 (Exit value: 255) -> [Help 1]
Error:  
Error:  To see the full stack trace of the errors, re-run Maven with the -e switch.
Error:  Re-run Maven using the -X switch to enable full debug logging.
Error:  
Error:  For more information about the errors and possible solutions, please read the following articles:
Error:  [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException
Error:  
Error:  After correcting the problems, you can resume the build with the command
Error:    mvn <args> -rf :spark-sql_2.12

Component(s)

Continuous Integration, Integration

@raulcd
Member Author

raulcd commented May 7, 2024

This is also reproducible for 16.1.0, but I don't think it is related to any change on our side, and it doesn't seem to be a blocker.
@danepitkin @vibhatha @kiszk any thoughts?

@kiszk
Member

kiszk commented May 7, 2024

The error message is the same as #40549.

@kiszk
Member

kiszk commented May 9, 2024

2024-04-28, success
test-conda-python-3.10-spark-v3.5.0

2024-04-29, failure
test-conda-python-3.10-spark-v3.5.0

2024-05-07?, failure
test-conda-python-3.10-spark-v3.5.0

2024-05-09, failure
test-conda-python-3.10-spark-v3.5.0

2024-05-12, failure
test-conda-python-3.10-spark-v3.5.0

@kiszk
Member

kiszk commented May 12, 2024

In the previous issue, the following PR caused the failure:

MINOR: [Java] Bump org.apache.maven.plugins:maven-shade-plugin from 3.2.4 to 3.5.2 in /java

This PR, which was merged on April 29, includes the following change:

Update maven-shade-plugin to 3.5.1 (via org.apache:parent)

@raulcd
Member Author

raulcd commented May 12, 2024

@danepitkin @lidavidm what should we do with this one?

@kiszk
Member

kiszk commented May 12, 2024

#41309 (comment) causes this failure.

@kiszk
Member

kiszk commented May 12, 2024

I am just surprised that this comment was completely ignored.

@lidavidm
Member

We can revert. Sorry about that @kiszk

@lidavidm
Member

It would be good to know why maven-shade-plugin is causing issues, though

lidavidm added a commit to lidavidm/arrow that referenced this issue May 12, 2024
lidavidm added a commit to lidavidm/arrow that referenced this issue May 21, 2024
lidavidm added a commit that referenced this issue May 21, 2024
### Rationale for this change

The commit in question caused a lot of CI issues

### Are these changes tested?

N/A

### Are there any user-facing changes?

N/A
* GitHub Issue: #41571

Authored-by: David Li <li.davidm96@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
@lidavidm lidavidm added this to the 17.0.0 milestone May 21, 2024
@lidavidm
Member

Issue resolved by pull request #41628.

@laurentgo
Collaborator

I guess that's my bad, but it's quite hard to work on improving the build while also tracking tests that are not part of the CI. I will repost the change, hopefully with a fix, although nobody really understands what the exact issue is?

@lidavidm
Member

Yes, I've asked Vibhatha if we can get more of those tests into proper CI...

@laurentgo
Collaborator

The issue seems to be that maven-shade-plugin generates and attaches a new pom file containing the following dependency:

    <dependency>
      <groupId>org.apache.arrow</groupId>
      <artifactId>arrow-format</artifactId>
      <version>17.0.0-SNAPSHOT</version>
      <scope>provided</scope>
    </dependency>

which is exactly what the plugin is supposed to be doing.

I get the same resulting pom with maven-shade-plugin 3.3.0, and going through the changes, I found this change, which is a bugfix to the plugin for this issue.

As for how to address this within the Arrow project, I'd need some perspective. The main issue is that a Maven module has a single pom.xml definition for all the artifacts attached to it. But when shaded artifacts are attached, their list of dependencies is not quite the same as the main artifact's, and the two definitely conflict.
I have no idea which project actually uses the shaded version of arrow-vector, but Spark doesn't. So if we agree that the pom.xml should reflect the dependencies of the main artifact, then we should tell maven-shade-plugin not to generate a new pom.xml file. If we need both sets of dependencies, then we are in quite a predicament, I'd guess (and the same would apply to the other plugins using maven-shade-plugin).
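
For illustration, telling maven-shade-plugin not to generate a new pom.xml could look roughly like the sketch below. `createDependencyReducedPom`, `shadedArtifactAttached` and `shadedClassifierName` are existing parameters of the plugin's `shade` goal, but the placement, the execution layout and the classifier value here are assumptions, not the actual arrow-vector configuration:

```xml
<!-- Sketch only: assumed to sit under <build><plugins> of the module that shades -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <!-- Do not generate a dependency-reduced pom, so the published
             pom.xml keeps the dependencies declared for the main artifact -->
        <createDependencyReducedPom>false</createDependencyReducedPom>
        <!-- Attach the shaded jar as a secondary artifact next to the main jar -->
        <shadedArtifactAttached>true</shadedArtifactAttached>
        <!-- Hypothetical classifier name, chosen for this example -->
        <shadedClassifierName>shaded</shadedClassifierName>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With such a setup, consumers of the shaded jar would presumably still see the dependencies that were shaded in listed in the pom, which is the conflict described above.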

@lidavidm
Member

Hmm, the POM should reflect the main file. The shaded version might need to be split into a separate artifact...

vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
@kiszk
Member

kiszk commented May 28, 2024

FYI: This is my trial. In the current configuration, we cannot upgrade to 3.3 or later. This is consistent with @laurentgo's observation.
