cannot build gluten (velox backend) in Amazon Linux 2 #489

Closed
CodingCat opened this issue Oct 30, 2022 · 18 comments
Labels
bug Something isn't working velox backend works for Velox backend

Comments

@CodingCat
Contributor

CodingCat commented Oct 30, 2022

Describe the bug

I am trying to build Gluten on Amazon Linux 2 (for an EMR environment). I have upgraded cmake to 3.16.9 and am using clang/clang++ 14.0.6 as compilers.

But I always get this error when trying to compile Velox: No SOURCES given to target: velox_all_link

To Reproduce
Steps to reproduce the behavior:

I am using the EMR 6.6 image (https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-release-6x.html), but you do need to install tons of dependencies manually since it is based on a very old Linux distribution

 mvn package -Pbackends-velox -Pspark-3.2 -Pfull-scala-compiler -DskipTests -Dcheckstyle.skip -Dbuild_cpp=ON -Dbuild_velox=ON -Dbuild_velox_from_source=ON -Dbuild_arrow=OFF -Darrow_home=/home/hadoop/arrow_install/

I used the above command to build (Arrow is already built successfully), hopefully it is an easy fix

(I did turn off the LIBHDFS3 build in Velox, which is another mess, but I don't think it is relevant?)

Expected behavior

can build Gluten

@CodingCat CodingCat added the bug Something isn't working label Oct 30, 2022
@FelixYBW
Contributor

Which OS and version do you use? Velox currently only supports Ubuntu 20.04+. We are adding support for CentOS 8. On older OSes there are many dependency issues to solve. The Velox community is adding conda env support; once that's done, supporting older OSes should be much easier.

@CodingCat
Contributor Author

it's Amazon Linux 2, basically AWS's own distro....

I have installed all the corresponding dependencies based on https://github.com/oap-project/velox/blob/main/scripts/setup-ubuntu.sh (either using the Amazon Linux 2 equivalent package or building from source)

I am wondering what an error like No SOURCES given to target: velox_all_link usually means?

@FelixYBW
Contributor

@zhejiangxiaomai Do you know what the issue is? It looks like it's used in Velox's experimental codegen.

@FelixYBW FelixYBW assigned FelixYBW and unassigned FelixYBW Oct 31, 2022
@zhejiangxiaomai
Contributor

Hi @CodingCat, what are your build options for Velox? In fact, we do not use libvelox_all_link.a in Gluten.
Here are the Velox cmake build options: https://github.com/oap-project/gluten/blob/main/tools/build_velox.sh#L63

@CodingCat
Contributor Author

https://github.com/oap-project/gluten/blob/main/tools/build_velox.sh#L63

yeah, I copied the command from this line

 make release EXTRA_CMAKE_FLAGS=" -DVELOX_ENABLE_PARQUET=ON -DVELOX_ENABLE_ARROW=ON -DVELOX_ENABLE_HDFS=ON"

except that I turned off HDFS support

@zhejiangxiaomai
Contributor

@CodingCat would you give me the commit id of your Velox checkout? I will try to reproduce it. In fact, we use these build options to build Velox nightly.

@CodingCat
Contributor Author

Thanks! it's

commit c8dc516e36b157d14baaf7869682920eab82732d
Author: Rui Mo <rui.mo@intel.com>
Date:   Thu Oct 27 15:55:25 2022 +0800

    Add support for regular anti join (#2906) (#59)

@CodingCat
Contributor Author

hmmmm.....

I think I found the issue

# Turn on Codegen only for Clang and non Mac systems.
if((NOT DEFINED VELOX_CODEGEN_SUPPORT)
   AND (CMAKE_CXX_COMPILER_ID MATCHES "Clang")
   AND NOT (${CMAKE_SYSTEM_NAME} MATCHES "Darwin"))
  message(STATUS "Enabling Codegen")
  set(VELOX_CODEGEN_SUPPORT True)
else()
  message(STATUS "Disabling Codegen")
  set(VELOX_CODEGEN_SUPPORT False)
endif()

I saw this in the Velox cmake file.... I do use clang on this system, so it builds codegen even though we don't use it....

are you using g++?

@zhejiangxiaomai
Contributor

zhejiangxiaomai commented Nov 1, 2022

yes, you need to disable Codegen.
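
One way to do that, going only by the CMake snippet quoted above: its if() tests NOT DEFINED VELOX_CODEGEN_SUPPORT, so defining the variable on the command line should send CMake into the else() branch, which sets it to False. Building with g++ instead of clang would reach the same branch. The sketch below assembles the flags from the make command quoted earlier in this thread and only prints the resulting command. The function name velox_cmake_flags is hypothetical, and this has not been tested against the Velox commit mentioned above.

```shell
#!/bin/sh
# Sketch: assemble EXTRA_CMAKE_FLAGS with codegen explicitly switched off.
# velox_cmake_flags is a hypothetical helper, not part of Velox or Gluten.
velox_cmake_flags() {
  # Defining VELOX_CODEGEN_SUPPORT at all makes the quoted if() fall through
  # to else(), which sets it to False.
  flags="-DVELOX_CODEGEN_SUPPORT=OFF -DVELOX_ENABLE_PARQUET=ON -DVELOX_ENABLE_ARROW=ON"
  # HDFS was turned off in this thread; pass "hdfs" to add it back.
  if [ "${1:-}" = "hdfs" ]; then
    flags="$flags -DVELOX_ENABLE_HDFS=ON"
  fi
  printf '%s\n' "$flags"
}

# Print the command instead of running it, so the sketch is safe to paste.
echo "make release EXTRA_CMAKE_FLAGS=\"$(velox_cmake_flags)\""
```

Run the printed command from the Velox source directory, then check the configure output for "Disabling Codegen" to confirm the branch taken.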

@CodingCat
Contributor Author

thanks, I will give it a try, and hopefully I can contribute back with a doc or something about how to run Gluten on Amazon Linux 2, which is what EMR relies on (well.... if S3 can be supported soon, that would make it really valuable)

@zhejiangxiaomai
Contributor

That's really good.

@CodingCat
Contributor Author

I have made more progress on this, now I am hitting

/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:877:2: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
        register yy_state_type yy_current_state;
        ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:878:2: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
        register char *yy_cp, *yy_bp;
        ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:878:2: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
        register char *yy_cp, *yy_bp;
        ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:879:2: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
        register int yy_act;
        ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:1248:6: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
        register char *dest = YY_CURRENT_BUFFER_LVALUE->yy_ch_buf;
        ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:1249:2: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
        register char *source = (yytext_ptr);
        ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:1250:2: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
        register int number_to_move, i;
        ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:1250:2: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
        register int number_to_move, i;
        ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:1382:2: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
        register yy_state_type yy_current_state;
        ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:1383:2: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
        register char *yy_cp;
        ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:1407:2: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
        register int yy_is_jam;
        ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:1415:39: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
    void yyFlexLexer::yyunput( int c, register char* yy_bp)
                                      ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:1417:2: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
        register char *yy_cp;
        ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:1427:3: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
                register yy_size_t number_to_move = (yy_n_chars) + 2;
                ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:1428:3: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
                register char *dest = &YY_CURRENT_BUFFER_LVALUE->yy_ch_buf[
                ^~~~~~~~~
/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp:1430:3: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
                register char *source =
                ^~~~~~~~~
16 errors generated.
gmake[4]: *** [velox/expression/type_calculation/CMakeFiles/velox_type_calculation.dir/Scanner.cpp.o] Error 1
gmake[4]: *** Waiting for unfinished jobs....
In file included from /src/velox-intel/_build/release/velox/expression/type_calculation/TypeCalculation.yy.cc:50:
/src/velox-intel/_build/release/velox/expression/type_calculation/TypeCalculation.yy.h:650:26: error: definition of implicit copy constructor for 'stack_symbol_type' is deprecated because it has a user-provided copy assignment operator [-Werror,-Wdeprecated-copy-with-user-provided-copy]

I am using clang++ 14.0.6 and specified std=c++17, as the Velox docs say it requires that....

any idea on what might be happening here?
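
A possible reading of the log, offered as an assumption since the thread never confirms the cause: Scanner.cpp is generated by flex, and the 'register' keyword was removed from the language in C++17, so a flex release that still emits it produces code clang rejects in that mode. The sketch below only diagnoses: it counts the lines in a generated scanner that use 'register' and prints the flex version on PATH. The helper name count_register_uses is hypothetical, and the path is copied from the error output above.

```shell
#!/bin/sh
# Sketch: check whether a flex-generated scanner still uses 'register'.
# count_register_uses is a hypothetical helper for illustration only.
count_register_uses() {
  # -w matches 'register' as a whole word; -c prints the count of matching lines.
  grep -cw 'register' "$1"
}

# Path copied from the error output above; adjust it to your build tree.
scanner=/src/velox-intel/_build/release/velox/expression/type_calculation/Scanner.cpp
if [ -f "$scanner" ]; then
  echo "lines using 'register': $(count_register_uses "$scanner")"
fi

# Show which flex is on PATH, if any.
if command -v flex >/dev/null 2>&1; then
  flex --version
fi
```

If the count is non-zero, the options worth trying are a newer flex or downgrading the diagnostic named in the log with clang's -Wno-register. Neither is confirmed as the fix used here.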

@FelixYBW
Contributor

FelixYBW commented Nov 9, 2022

Looks like it's a Velox compile issue. We can ask in the Velox channel. Are you in it?

@CodingCat
Contributor Author

not yet, today is tough for Meta folks, I will ask to join tomorrow

@zhejiangxiaomai
Contributor

@CodingCat Is there still a problem now?

@weiting-chen weiting-chen added the velox backend works for Velox backend label Mar 1, 2023
@CodingCat
Contributor Author

we can close this now, I have been able to run it in AL2

@sagarlakshmipathy

sagarlakshmipathy commented Mar 11, 2024

@CodingCat Did you end up writing that doc you mentioned above? I'm trying to run this on AL2, and I'm facing the below error.

java.lang.ClassNotFoundException: org.apache.spark.shuffle.sort.ColumnarShuffleManager

#!/bin/bash

# Install git-core
sudo yum install -y git-core

# Clone the gluten repo
git clone https://github.com/oap-project/gluten.git

# Install Maven (assuming you have Maven installation steps)
# Please follow the link you provided: https://devopscube.com/install-maven-guide/
# Install Maven
wget https://dlcdn.apache.org/maven/maven-3/3.9.6/binaries/apache-maven-3.9.6-bin.tar.gz 
sudo tar xvf apache-maven-3.9.6-bin.tar.gz -C /opt 
sudo ln -s /opt/apache-maven-3.9.6 /opt/maven

# Update Maven environment variables
echo "export M2_HOME=/opt/maven" | sudo tee /etc/profile.d/maven.sh
echo "export PATH=\${M2_HOME}/bin:\${PATH}" | sudo tee -a /etc/profile.d/maven.sh
sudo chmod +x /etc/profile.d/maven.sh
source /etc/profile.d/maven.sh


# Build gluten with Velox
cd gluten
mvn package -Pbackends-velox -Pspark-3.3 -Pfull-scala-compiler -DskipTests -Dcheckstyle.skip -Dbuild_cpp=ON -Dbuild_velox=ON -Dbuild_velox_from_source=ON -Dbuild_arrow=ON

#export gluten_jar
export gluten_jar=/home/hadoop/gluten/backends-velox/target/backends-velox-1.2.0-SNAPSHOT-3.3.jar 

$SPARK_HOME/bin/spark-shell \
  --master yarn --deploy-mode client \
  --conf spark.plugins=io.glutenproject.GlutenPlugin \
  --conf spark.memory.offHeap.enabled=true \
  --conf spark.memory.offHeap.size=20g \
  --conf spark.driver.extraClassPath=${gluten_jar} \
  --conf spark.executor.extraClassPath=${gluten_jar} \
  --conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager
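
This question was not answered in the thread. A ClassNotFoundException for the shuffle manager usually means the class is not in any jar on the classpath, so a first check is whether the jar passed as gluten_jar actually contains it. The sketch below is one way to test that, assuming unzip is installed. The helper names class_to_entry and jar_has_class are hypothetical. Whether a jar from a different module is the right one to use is an assumption this thread does not confirm.

```shell
#!/bin/sh
# Sketch: check whether a jar contains a given class.
# class_to_entry and jar_has_class are hypothetical helpers for illustration.

# Convert a fully qualified class name to its entry path inside a jar.
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Return 0 if the jar listing mentions the class entry, non-zero otherwise.
jar_has_class() {
  unzip -l "$1" | grep -qF "$(class_to_entry "$2")"
}

# Jar path and class name copied from the script and error above.
gluten_jar=/home/hadoop/gluten/backends-velox/target/backends-velox-1.2.0-SNAPSHOT-3.3.jar
cls=org.apache.spark.shuffle.sort.ColumnarShuffleManager

if [ -f "$gluten_jar" ]; then
  if jar_has_class "$gluten_jar" "$cls"; then
    echo "found $cls in $gluten_jar"
  else
    echo "missing $cls: this jar alone cannot satisfy spark.shuffle.manager"
  fi
fi
```

If the class is missing, running the same check over the other jars produced by the build shows which one should go on spark.driver.extraClassPath and spark.executor.extraClassPath.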
