Support generic Armel builds #38798

Closed
wants to merge 35 commits into from
Conversation

smx-smx
Contributor

@smx-smx smx-smx commented Jul 5, 2020

The armel (VFP) target is currently quite obscure.

It exists, but in practice it seems to be dedicated to Tizen only.
I'm experimenting with some Linux ARM (VFP) devices and I wanted to run coreclr on them.

To my surprise, no build for linux-armel was available, so I tried to get one going for non-Tizen platforms.

Here is a list of issues discovered:

  • Tizen specific toolchain rules are hardcoded in the toolchain file
    WORKAROUND: add support for a custom toolchain file (see PR)

  • Missing armel/armv7-a handling in a few places (see PR for fixes)

  • Build errors in coreclr with old 3.x kernel (incorrect ptrace header for the older kernel, see PR for a possible fix)

  • Build errors in System.Native due to -Wimplicit-int-conversion (this happens due to getnameinfo, htons, and a few others having unsigned prototypes)
    WORKAROUND (uncommitted): drop -Wimplicit-int-conversion (dirty, needs a better fix)

  • Armel cross build is Tizen specific and does not support Portable RID build

    if [ "$buildArch" = "armel" ]; then

(ignore the prerelease commit as i initially based these fixes on the latest preview. i can remove it later)

Do you think it's possible to introduce a specific "tizen" target or convert it into a generic linux-armel target?

Thank you

@dnfadmin

dnfadmin commented Jul 5, 2020

CLA assistant check
All CLA requirements met.

@smx-smx smx-smx changed the title WIP: Support generic Armel builds Support generic Armel builds Aug 9, 2020
@ViktorHofer
Member

// Auto-generated message

69e114c which was merged 12/7 removed the intermediate src/coreclr/src/ folder. This PR needs to be updated as it touches files in that directory which causes conflicts.

To update your commits you can use this bash script: https://gist.github.com/ViktorHofer/6d24f62abdcddb518b4966ead5ef3783. Feel free to use the comment section of the gist to improve the script for others.

@danmoseley
Member

@dotnet/runtime-infrastructure could someone please assign themselves this PR to help resolve it one way or another? It is 164 days old and I am trying to get to a 90th percentile of 90 days for community PRs.

@smx-smx
Contributor Author

smx-smx commented Dec 17, 2020

While we're here, i'll try to explain the situation with this PR

The original intent was to run dotnet core on a VFP-less machine.

That turned out to be impossible but i think some of the changes that i made could be useful for others
Even if core itself won't work without a VFP, some tools and libraries can be built and used.
In particular, mono does run VFP-less

With this PR I made some adjustments to be able to build the codebase while targeting an embedded Arm machine (my modem/router combo) running Linux 3.4 with uClibc (non -ng).
For some reason my changes broke something on some CI runners, and I had trouble fixing it

Now it's been a while since i looked into this, but maybe if you can give me some guidance i can try to fix the remaining issues

Thanks

@@ -74,7 +74,7 @@

<!-- Indicates this is not an officially supported release. Release branches should set this to false. -->
<!-- Keep it in sync with PRERELEASE in eng/native/configureplatform.cmake -->
-<IsPrerelease>true</IsPrerelease>
+<IsPrerelease>false</IsPrerelease>
Member

This should be set to true on non release branches. It seems like you have extra commits?

Contributor Author

I think i started from the last prerelease branch, which included that commit.
It should be easily reversible

Member

Yeah, we should put your commits on top of a branch created from master to make it more reviewable.

Member

I agree with Santi, in general your changes look reasonable and valuable for certain scenarios, we just need to figure out the necessary logistics. I would be also grateful if @janvorli could take a look at the cmake changes as he's one of our biggest linux / cmake experts.

@trylek trylek requested a review from janvorli December 18, 2020 00:34
eng/native/functions.cmake (resolved)
eng/common/cross/toolchain.cmake (resolved)
eng/native/gen-buildsys.sh (resolved)
src/coreclr/src/pal/src/configure.cmake (outdated, resolved)
src/coreclr/src/pal/src/thread/thread.cpp (outdated, resolved)
@@ -26,6 +26,9 @@ add_compile_options(-Wno-cast-align)
add_compile_options(-Wno-typedef-redefinition)
add_compile_options(-Wno-c11-extensions)
add_compile_options(-Wno-unknown-pragmas)
+add_compile_options(-Wno-atomic-alignment)
Member

Where do we get the warning if it is not disabled?

Contributor Author

I don't remember off hand, i'll need to rebuild without it to check

Member

I have tried to remove this change and the build succeeded fine without it and without any warnings.

Contributor Author

@smx-smx smx-smx Dec 23, 2020

I can confirm this. I might have unknowingly fixed it in the toolchain or it was related to something else.

Actually, i wonder if it's related to the GCC version used (i might have changed the gcc version used in the toolchain along the way).
According to my original commit message, it was emitting an error similar to this

__atomic_compare_exchange_n (large atomic operation may incur significant performance penalty)

A simple grep shows this is part of

  • src/libraries/Native/Unix/Common/pal_atomic.h
  • src/libraries/Native/Unix/System.Native/pal_random.c

@@ -447,10 +447,11 @@ check_symbol_exists(
HAVE_DISCONNECTX)

set(PREVIOUS_CMAKE_REQUIRED_FLAGS ${CMAKE_REQUIRED_FLAGS})
-set(CMAKE_REQUIRED_FLAGS "-Werror -Wsign-conversion")
+set(CMAKE_REQUIRED_FLAGS "-Werror=sign-conversion")
Member

Can you please explain this change? We want error on all warnings.

Contributor Author

It is related to other warnings caused by uClibc headers with new compilers, so the change is needed to catch only the warning relevant for the check

Member

We want to error on all warnings that are enabled. Our policy w.r.t. warnings has been to fix them if reasonably feasible, or to disable the specific ones where fixing is not possible or too cumbersome. It would be good to get a complete list of the warnings we are getting, to reason about what to do here. It seems that the warnings are uClibc-specific rather than armel-specific, so we could start by disabling them all just when compiling against uClibc. Is there a reasonable way to detect that we are compiling against uClibc at build time?
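
One possible answer to the detection question, as a rough sketch rather than anything taken from this PR: uClibc headers define the `__UCLIBC__` macro, so dumping the preprocessor macros of the target compiler is enough to tell. The `CC` default and the `detect_libc` helper are illustrative names and not part of dotnet/runtime.

```shell
#!/bin/sh
# Sketch: classify the C library a compiler targets by dumping its macros.
# CC and detect_libc are illustrative names, not part of dotnet/runtime.
CC="${CC:-cc}"

detect_libc() {
    if ! command -v "$CC" >/dev/null 2>&1; then
        echo unknown
        return 0
    fi
    # -E -dM prints every macro defined after preprocessing the input.
    if printf '#include <stdlib.h>\n' | "$CC" -x c -E -dM - 2>/dev/null \
            | grep -q '__UCLIBC__'; then
        echo uclibc
    else
        echo other
    fi
}

libc_kind="$(detect_libc)"
echo "libc: $libc_kind"
```

For the cross build discussed here, `CC` would point at the buildroot compiler, for example `CC=arm-buildroot-linux-uclibcgnueabi-gcc`. The same macro could presumably be tested from a CMake symbol check, which is left out here.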

Member

I have tried to remove this change too and the build succeeded without any warnings or errors.

Contributor Author

@smx-smx smx-smx Dec 23, 2020

It might depend on the gcc version i used back then, similarly to -Wno-atomic-alignment
I can revert this. However, i believe that the check should only care about sign-conversion related errors since it's checking whether or not the flags are signed

@janvorli
Member

@smx-smx I was wondering if you could share details on how to create a rootfs for the OS you are targeting. I'd like to try it locally, as it would give me more insight into the warnings etc.

@smx-smx
Contributor Author

smx-smx commented Dec 18, 2020

@janvorli thanks for the interest, first of all.
You might use this repository to create a rootfs: https://github.com/smx-smx/bcm63138-buildroot

We are talking about Broadcom modem/router combos.
They ship with a variety of BSPs, and a common one is Linux 3.4 (with the rt patchset) and uClibc (the old one, non-ng).

Many vendors modify uClibc to fit with their idea of "embedded", but don't actually share the modifications or the configuration required to build a binary compatible toolchain (which is needed to run programs on the device without chroot)

I created a custom fork of uClibc that aims to be compatible with said BSP, and it's already built as part of the rootfs i linked

NOTE: modifications have to do with which symbols are available in the final libc and which are not (uClibc is quite modular and, for functions that aren't already modular, it's quite easy to make them as such).
There are no changes done in the code itself. Any warning present in the headers is likely due to the header files being much older than the compiler used

You should be able to build the toolchain+rootfs combo by doing

make bcm63138_defconfig
make

The toolchain will be saved in output/host and the rootfs in output/images

My original goal was to run dotnet core on these devices, but it turns out that the CPU cannot do SMP and VFP at the same time (and SMP was preferred).

in-depth explanation:

The CPU in this device is a Cortex A9, a dual core CPU with a NEON VFPv3 FPU.
However, the designers added a VFP only to one core, and Linux cannot do SMP while differentiating between VFP and non-VFP cores.
So to take advantage of SMP the kernel must completely disable the NEON VFPv3 FPU (what a waste...)

However, even without VFP (i.e. without softfp), it's possible to build and run mono with the netcore profile.

This is how far i got, with a working mono + netcore setup (i had to package a test assembly with dotnet publish and replace files manually from my custom build of dotnet/runtime)

#!/bin/sh
scriptDir="$(readlink -f $(dirname "$0"))"
assembly="$(readlink -f "$1")"
assemblyDir="$(dirname "$assembly")"

cd "$assemblyDir"

#MONO_LOG_LEVEL=debug \
DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=1 \
MONOPATH="$assemblyDir" \
LD_LIBRARY_PATH="$assemblyDir" $scriptDir/mono-sgen "$assembly"

root@dlinkrouter:/netcore# ./run.sh app/bcm63138.dll
TESTING

Hope everything is clear.

@smx-smx
Contributor Author

smx-smx commented Dec 19, 2020

Ok, i found the toolchain file and script i made to build dotnet/runtime for the target: https://gist.github.com/smx-smx/0c5738e219145996912d5e289445c9cc

@akoeplinger
Member

the MONOPATH="$assemblyDir" in your example should probably be MONO_PATH="$assemblyDir".

@@ -26,6 +26,9 @@ add_compile_options(-Wno-cast-align)
add_compile_options(-Wno-typedef-redefinition)
add_compile_options(-Wno-c11-extensions)
add_compile_options(-Wno-unknown-pragmas)
+add_compile_options(-Wno-atomic-alignment)
+# Required for clang versions that don't understand -Wno-atomic-alignment
+add_compile_options(-Wno-unknown-warning-option)
Member

rather than disabling this you can also check whether the compiler accepts the -Wno-atomic-alignment option like this:

check_c_compiler_flag(-Wimplicit-fallthrough COMPILER_SUPPORTS_W_IMPLICIT_FALLTHROUGH)
if (COMPILER_SUPPORTS_W_IMPLICIT_FALLTHROUGH)
  add_compile_options(-Wimplicit-fallthrough)
endif()

@janvorli
Member

@smx-smx the bcm63138-buildroot's make menuconfig seems to have zillions of options to select. I can figure out the target options, but I am not sure what to select for the others. Could you please share the .config file generated by the menuconfig too?

@smx-smx
Contributor Author

smx-smx commented Dec 21, 2020

It should be done for you if you do "make bcm63138_defconfig", which installs the default config.
You then can just run "make"
I can otherwise share my .config in a couple hours

@janvorli
Member

Great, thank you!

@smx-smx
Contributor Author

smx-smx commented Dec 21, 2020

Thank you too! If it can help you can also reach me on the dotnet Discord and/or Gitter rooms for quicker replies

@smx-smx
Contributor Author

smx-smx commented Dec 21, 2020

the MONOPATH="$assemblyDir" in your example should probably be MONO_PATH="$assemblyDir".

Hmmm maybe it wasn't needed then, since the variable name was wrong but the POC runs fine without it anyways
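
A small sketch of why the misspelling is harmless: a child process only sees the exact variable name it looks up, so `MONOPATH` is just an unrelated variable and `MONO_PATH` stays unset. The `show_mono_path` helper and the `/netcore/app` path are made up for illustration, and the inner `sh` stands in for the runtime reading its environment.

```shell
#!/bin/sh
# Sketch: a child process only sees the exact variable name it looks up.
# show_mono_path is an illustrative helper; /netcore/app is a placeholder.
show_mono_path() {
    # "$@" are NAME=value pairs placed in an otherwise empty environment.
    env -i "$@" /bin/sh -c 'printf "%s\n" "${MONO_PATH:-<unset>}"'
}

misspelled="$(show_mono_path MONOPATH=/netcore/app)"
correct="$(show_mono_path MONO_PATH=/netcore/app)"

echo "MONOPATH=...  -> child sees MONO_PATH=$misspelled"
echo "MONO_PATH=... -> child sees MONO_PATH=$correct"
```

That the POC still runs suggests the search path was not needed in this setup, possibly because the script already changes into the assembly directory, though that is a guess.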

@janvorli
Member

@smx-smx I have built the rootfs and attempted to build stuff using your scripts. However, the build fails with:
__DistroRid: linux-armel
__RuntimeId: linux-armel
Determining projects to restore...
Tool 'coverlet.console' (version '1.7.2') was restored. Available commands: coverlet
Tool 'dotnet-reportgenerator-globaltool' (version '4.5.8') was restored. Available commands: reportgenerator
Tool 'microsoft.dotnet.xharness.cli' (version '1.0.0-prerelease.20403.2') was restored. Available commands: xharness

Restore was successful.
/home/janvorli/git/runtime3/.dotnet/sdk/5.0.100-rc.2.20479.15/NuGet.targets(131,5): error : Invalid restore input. Invalid target framework 'unsupported'. Input files: /home/janvorli/.nuget/packages/microsoft.dotnet.arcade.sdk/5.0.0-beta.20374.1/tools/Tools.proj. [/home/janvorli/.nuget/packages/microsoft.dotnet.arcade.sdk/5.0.0-beta.20374.1/tools/Tools.proj]

I have modified your scripts to point TOOLCHAIN_DIR to my local directory where I've built the rootfs.
Are the scripts up to date?

@smx-smx
Contributor Author

smx-smx commented Dec 21, 2020

Starting from a clean state (my armel-fixes branch HEAD), pasting the build script and the toolchain file, this is what i get
https://gist.github.com/smx-smx/523a60c8b88a6367745169b03b46f466

I definitely get past the point where your build fails.
I don't remember hitting the failure on cli/apphost/static/singlefilehost but it's likely that i skipped it back then, since it wasn't needed

The scripts are updated, i used make_bcm.sh to generate the log above
Make sure you point the toolchain directory to your output/host.
You can try adding (exporting) output/host/bin to your system PATH to see if it's any different
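
A minimal sketch of the environment setup described above. `TOOLCHAIN_DIR` is the variable mentioned earlier in the thread; `BUILDROOT_DIR` and its default location are hypothetical and need to match wherever the buildroot checkout actually lives.

```shell
#!/bin/sh
# BUILDROOT_DIR is a hypothetical location of the bcm63138-buildroot checkout.
BUILDROOT_DIR="${BUILDROOT_DIR:-$HOME/bcm63138-buildroot}"
TOOLCHAIN_DIR="$BUILDROOT_DIR/output/host"

# Prepend the toolchain bin directory to PATH only if it is not there yet.
case ":$PATH:" in
    *":$TOOLCHAIN_DIR/bin:"*) ;;
    *) PATH="$TOOLCHAIN_DIR/bin:$PATH" ;;
esac
export TOOLCHAIN_DIR PATH

echo "TOOLCHAIN_DIR=$TOOLCHAIN_DIR"
```

Sourcing this (`. ./env.sh`, file name assumed) before running the build script keeps the variables in the calling shell.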

Tomorrow i can try looking into why singlefilehost fails, and if it can be fixed or skipped

Edit: I probably used the Libs+Mono subset back then instead of Libs+CoreHost+Mono

@janvorli
Member

Ah, now it is running, I had a stale .dotnet directory in the root of the repo.

@janvorli
Member

The Mono part of the build passed, but the libraries build failed because it was unable to find objcopy:

  /home/janvorli/git/runtime3/src/libraries/Native/build-native.sh armel Release outconfig net5.0-Linux-Release-armel -os Linux -cross
  __DistroRid: linux-armel
  __RuntimeId: linux-armel
  Set CrossBuild for armel build
  Setting up directories for build
  Checking prerequisites...
  Commencing build of "native libraries component" for Linux.armel.Release in /home/janvorli/git/runtime3/artifacts/obj/native/net5.0-Linux-Release-armel
    Determining projects to restore...
    Restored /home/janvorli/git/runtime3/eng/empty.csproj (in 254 ms).
  Invoking "/home/janvorli/git/runtime3/eng/native/gen-buildsys.sh" "/home/janvorli/git/runtime3/src/libraries/Native/Unix" "/home/janvorli/git/runtime3/src/libraries/Native/Unix" "/home/janvorli/git/runtime3/artifacts/obj/native/net5.0-Linux-Release-armel" armel clang "" "" Release ""  -DCLR_ENG_NATIVE_DIR="/home/janvorli/git/runtime3/eng/native" -DCMAKE_STATIC_LIB_LINK=0 -DFEATURE_DISTRO_AGNOSTIC_SSL=1
  ~/git/runtime3/artifacts/obj/native/net5.0-Linux-Release-armel ~/git/runtime3/src/libraries/Native
  loading initial cache file /home/janvorli/git/runtime3/src/libraries/Native/Unix/tryrun.cmake
  -- The C compiler identification is Clang 9.0.1
  -- Check for working C compiler: /usr/bin/clang-9
  -- Check for working C compiler: /usr/bin/clang-9 -- works
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Detecting C compile features
  -- Detecting C compile features - done
  CMake Error at /home/janvorli/git/runtime3/eng/native/configuretools.cmake:37 (message):
    Unable to find toolchain executable.  Name: objcopy, Prefix:
    arm-buildroot-linux-uclibcgnueabi-.
  Call Stack (most recent call first):
    /home/janvorli/git/runtime3/eng/native/configuretools.cmake:64 (locate_toolchain_exec)
    CMakeLists.txt:10 (include)


  -- Configuring incomplete, errors occurred!
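
A quick way to check this kind of failure before starting a build, as a sketch under assumptions: it only looks on `PATH`, which may not match exactly how `locate_toolchain_exec` searches, and `missing_tools` is an illustrative helper. The prefix is the one from the error above; any tool names beyond `objcopy` would have to be added by hand.

```shell
#!/bin/sh
# Print the name of every prefixed tool that cannot be found on PATH.
# missing_tools is an illustrative helper, not part of the dotnet build.
missing_tools() {
    prefix="$1"
    shift
    for tool in "$@"; do
        command -v "${prefix}${tool}" >/dev/null 2>&1 \
            || printf '%s\n' "${prefix}${tool}"
    done
}

missing="$(missing_tools "arm-buildroot-linux-uclibcgnueabi-" objcopy)"
if [ -n "$missing" ]; then
    echo "not on PATH: $missing"
else
    echo "all requested tools found"
fi
```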

@trylek
Member

trylek commented May 17, 2021

OK, I think I've fixed all remaining CoreCLR and library test failures. @akoeplinger, would you be able to look at the Mono failures or find someone who would? @smx-smx, a double-check whether this still builds / works for you would be certainly beneficial to verify we haven't inadvertently nullified the gist of your change.

@@ -10,7 +10,7 @@ elseif(CLR_CMAKE_TARGET_MACCATALYST)
endif()
cmake_policy(SET CMP0042 NEW)

project(CoreFX C)
Member

I don't think we should remove the "C". All corefx sources are intentionally C code (because of Mono)

Member

In such case we need to figure out a different way to encode the HAVE_SIGACTION_ULONG_FLAGS check, I haven't found any way to do that with plain C due to its less stringent type conversion checks.

Member

Hmm, I don't understand then why the C compiler complains when compiling the actual code, but doesn't fail on the same thing in the check. I am starting to wonder whether the changes related to HAVE_SIGACTION_ULONG_FLAGS originated from the time when the libraries were still being compiled as C++ code. @smx-smx could it be the case?

Contributor Author

@smx-smx smx-smx May 18, 2021

Hmm the test works for me in plain C

$ arm-buildroot-linux-uclibcgnueabi-gcc -Werror=incompatible-pointer-types t.c -o t
$ gcc -Werror=incompatible-pointer-types t.c -o t
t.c: In function ‘main’:
t.c:5:32: error: initialization of ‘long unsigned int *’ from incompatible pointer type ‘int *’ [-Werror=incompatible-pointer-types]
    5 |         unsigned long *flags = &action.sa_flags;
      |                                ^
cc1: some warnings being treated as errors

Member

Ah, so maybe setting

CMAKE_REQUIRED_FLAGS=-Werror=incompatible-pointer-types

before the test code and clearing it after could do the trick.
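
The idea can be tried by hand before wiring it into CMake. The sketch below is an assumption-heavy stand-in: `t.c` is reconstructed from the compiler error quoted earlier in this thread and may differ from the actual test file, and `CC` defaults to the host compiler. A compiler too old to know `-Werror=incompatible-pointer-types` would be misreported as 0.

```shell
#!/bin/sh
# Sketch of a manual HAVE_SIGACTION_ULONG_FLAGS probe. t.c is reconstructed
# from the compiler error quoted above; it is an assumption, not the PR's file.
CC="${CC:-cc}"
tmp="$(mktemp -d)"

cat > "$tmp/t.c" <<'EOF'
#include <signal.h>

int main(void) {
    struct sigaction action;
    unsigned long *flags = &action.sa_flags;
    (void)flags;
    return 0;
}
EOF

if ! command -v "$CC" >/dev/null 2>&1; then
    have_ulong_flags=unknown
elif "$CC" -Werror=incompatible-pointer-types -c "$tmp/t.c" \
        -o "$tmp/t.o" 2>/dev/null; then
    have_ulong_flags=1   # sa_flags is unsigned long (the uClibc case here)
else
    have_ulong_flags=0   # sa_flags has another type, e.g. int on the host
fi
rm -rf "$tmp"

echo "HAVE_SIGACTION_ULONG_FLAGS=$have_ulong_flags"
```

Running it once with the host compiler and once with `CC=arm-buildroot-linux-uclibcgnueabi-gcc` should reproduce the two outcomes shown in the comment above.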

@smx-smx
Contributor Author

smx-smx commented May 18, 2021

Hello, I was thinking about setting up a CI pipeline for the environment and I made some progress in that regard. I'll see if I can find some time to get it finalized

@smx-smx
Contributor Author

smx-smx commented May 19, 2021

CI Environment is ready: https://cirrus-ci.com/task/5243984923590656
Downloaded and tried the build with my hello world assembly, all good

# ./run.sh app/bcm63138.dll
TESTING

@@ -30,7 +30,7 @@ set(OS_LIBS "-framework CoreFoundation" "-lobjc" "-lc++")
elseif(HOST_ANDROID)
set(OS_LIBS m dl log)
elseif(HOST_LINUX)
-set(OS_LIBS pthread m dl)
+set(OS_LIBS pthread m dl atomic)
Contributor Author

@smx-smx smx-smx May 19, 2021

@trylek
Member

trylek commented Jun 25, 2021

Closing for now as there's no traffic on this PR and it regularly shows up in our stale PR reports. @smx-smx, please reopen if you get back to working on this change. Thanks.

@trylek trylek closed this Jun 25, 2021
Infrastructure Backlog automation moved this from In Progress to Done Jun 25, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Jul 25, 2021