Incorrect CBC solver build used with Linux on ARM 64 #672

Closed
6 of 14 tasks
connor-makowski opened this issue Jul 28, 2023 · 25 comments

Comments

@connor-makowski
Contributor

connor-makowski commented Jul 28, 2023

Details for the issue

What did you do?

When using pulp with the standard CBC solver on a 64-bit ARM Linux system, the CBC solver fails with the following error:

qemu-x86_64: Could not open '/lib64/ld-linux-x86-64.so.2': No such file or directory
Traceback (most recent call last):
  File "//test.py", line 7, in <module>
    model.solve(pulp.PULP_CBC_CMD(msg=True))
  File "/usr/local/lib/python3.11/site-packages/pulp/pulp.py", line 1913, in solve
    status = solver.actualSolve(self, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pulp/apis/coin_api.py", line 137, in actualSolve
    return self.solve_CBC(lp, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pulp/apis/coin_api.py", line 206, in solve_CBC
    raise PulpSolverError(
pulp.apis.core.PulpSolverError: Pulp: Error while trying to execute, use msg=True for more details/usr/local/lib/python3.11/site-packages/pulp/solverdir/cbc/linux/64/cbc

In my case this happened when running a Debian-based Docker image on Apple silicon (M1 Max). I was able to reproduce similar issues on AWS EC2 instances running on ARM64 (e.g. T4g).

This can be reproduced on any M1/M2 machine using Docker and the following files:

Dockerfile:

FROM python:3.11.3-bullseye
RUN pip install pulp==2.7.0
COPY test.py /test.py

test.py:

import pulp

x = pulp.LpVariable('x')
model = pulp.LpProblem('test', pulp.LpMinimize)
model += x
model += x >= 0
model.solve(pulp.PULP_CBC_CMD(msg=True))
print(x.value())

Then run:

docker build . --tag coin_m1_docker_test
docker run coin_m1_docker_test python test.py

This appears to be caused by selecting the AMD 64 CBC solver in ./pulp/solverdir/cbc/linux/64
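
One way to check this diagnosis is to read the machine field from the ELF header of the bundled binary. The sketch below is not part of PuLP; `elf_machine` and `binary_arch` are hypothetical helper names, and only the two architectures discussed in this issue are mapped.

```python
import struct

# e_machine values from the ELF specification for the two architectures
# discussed in this issue; anything else is reported as unknown.
ELF_MACHINES = {0x3E: "x86-64", 0xB7: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Return the target architecture encoded in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 5 (EI_DATA) is 1 for little-endian and 2 for big-endian files.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18 in both 32- and 64-bit ELF.
    (e_machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(e_machine, f"unknown (0x{e_machine:x})")


def binary_arch(path: str) -> str:
    """Read just enough of the file at `path` to identify its architecture."""
    with open(path, "rb") as fh:
        return elf_machine(fh.read(20))
```

Calling `binary_arch` on the `solverdir/cbc/linux/64/cbc` path from the traceback would presumably report `x86-64`, which would be consistent with the `qemu-x86_64` message above.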

It appears that there is no ARM 64 Bit supported CBC solver in the pulp package, so an easy fix would be to supplement it with a working version (I was able to build it for ARM 64 and it worked, but an apt install does the trick too).

For example, modifying the Dockerfile to the following produces working results:

FROM python:3.11.3-bullseye
RUN pip install pulp==2.7.0

# Replace CBC with an appropriate system version
# Note: This solves a PULP issue with ARM64 Linux Builds
# Not needed for AMD64, but needed for ARM64
RUN apt update
RUN apt install -y coinor-cbc coinor-libcbc-dev
RUN CBCLOC="$(whereis cbc | sed 's/cbc: //g' | sed 's/\s.*$//')" && \
    PULPLOC="$(pip show pulp | grep Location | sed 's/Location: //g')" && \
    PULPCBCLOC="${PULPLOC}/pulp/solverdir/cbc/linux/64" && \
    mv ${PULPCBCLOC}/cbc ${PULPCBCLOC}/cbc_bck && \
    ln -s $CBCLOC $PULPCBCLOC

COPY test.py /test.py

My suggested fix would be to add this CBC build to the solverdir/cbc folder and update your system identification and path resolution to support Linux 64 bit ARM machines.
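
A minimal sketch of what architecture-aware path resolution could look like is shown here. It is not the code in PuLP or in any linked commit: `cbc_subdir` and `current_cbc_subdir` are hypothetical names, the `linux/64` and `linux/arm64` directories are taken from the paths that appear in this thread, and `linux/32` is an assumption.

```python
import platform
import sys


def cbc_subdir(system: str, machine: str, is_64bit: bool) -> str:
    """Map an OS / CPU pair to a bundled-solver subdirectory (illustrative layout)."""
    machine = machine.lower()
    if system == "Linux":
        # ARM must be checked first: it is also 64-bit, so a bare
        # "64-bit Linux" test would pick the x86-64 binary.
        if machine in ("aarch64", "arm64"):
            return "linux/arm64"
        return "linux/64" if is_64bit else "linux/32"  # linux/32 is assumed
    raise NotImplementedError(f"no mapping sketched for {system}/{machine}")


def current_cbc_subdir() -> str:
    """Resolve the subdirectory for the interpreter running this code."""
    return cbc_subdir(platform.system(), platform.machine(), sys.maxsize > 2**32)
```

The key point is the order of the checks: testing only for 64-bit Linux cannot tell x86-64 and ARM64 apart, which matches the failure described above.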

What did you expect to see?

No error

What did you see instead?

qemu-x86_64: Could not open '/lib64/ld-linux-x86-64.so.2': No such file or directory
Traceback (most recent call last):
  File "//test.py", line 7, in <module>
    model.solve(pulp.PULP_CBC_CMD(msg=True))
  File "/usr/local/lib/python3.11/site-packages/pulp/pulp.py", line 1913, in solve
    status = solver.actualSolve(self, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pulp/apis/coin_api.py", line 137, in actualSolve
    return self.solve_CBC(lp, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pulp/apis/coin_api.py", line 206, in solve_CBC
    raise PulpSolverError(
pulp.apis.core.PulpSolverError: Pulp: Error while trying to execute, use msg=True for more details/usr/local/lib/python3.11/site-packages/pulp/solverdir/cbc/linux/64/cbc

Useful extra information

The info below often helps, please fill it out if you're able to. :)

What operating system are you using?

  • Windows: ( version: ___ )
  • Linux: ( distro: ___ )
  • Mac OS: ( version: ___ )
  • Other: ___

I'm using python version:

  • 2.7
  • 3.4
  • 3.5
  • 3.6
  • Other: 3.11

I installed PuLP via:

Did you also

@connor-makowski
Contributor Author

Working on a PR here:

c8dd2c1

I am however having trouble finding documentation on getting a monolithic build.

How do you normally get your monolithic builds in solverdir?

For example just using the cbc build yields an error (unless you uncomment the line in the Dockerfile to install coinor-libcbc-dev):

Dockerfile

FROM python:3.11.3-bullseye
RUN pip install git+https://github.com/connor-makowski/pulp.git

RUN apt update
# Uncomment the following command to get all needed deps
#RUN apt install -y coinor-libcbc-dev

COPY test.py /test.py

test.py

import pulp

x = pulp.LpVariable('x')
model = pulp.LpProblem('test', pulp.LpMinimize)
model += x
model += x >= 0
model.solve(pulp.PULP_CBC_CMD(msg=True))
print(x.value())

Terminal Commands:

docker build . --tag coin_m1_docker_test
docker run coin_m1_docker_test python test.py

Outputs

/usr/local/lib/python3.11/site-packages/pulp/solverdir/cbc/linux/arm64/cbc: error while loading shared libraries: libCbcSolver.so.3: cannot open shared object file: No such file or directory
Traceback (most recent call last):
  File "//test.py", line 7, in <module>
    model.solve(pulp.PULP_CBC_CMD(msg=True))
  File "/usr/local/lib/python3.11/site-packages/pulp/pulp.py", line 1920, in solve
    status = solver.actualSolve(self, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pulp/apis/coin_api.py", line 137, in actualSolve
    return self.solve_CBC(lp, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pulp/apis/coin_api.py", line 206, in solve_CBC
    raise PulpSolverError(
pulp.apis.core.PulpSolverError: Pulp: Error while trying to execute, use msg=True for more details/usr/local/lib/python3.11/site-packages/pulp/solverdir/cbc/linux/arm64/cbc

@connor-makowski
Contributor Author

connor-makowski commented Aug 3, 2023

@pchtsp The link you mentioned in #426 from AMPL does not include any static builds for Linux on ARM 64. Any other ideas here?
@tkralphs You mentioned that you had some new harness for this in #196, but you were hosting the builds on the now-sunset Bintray. Any updates on where I might find a Linux ARM64 static monolithic build for CBC (or how to build one)? I am able to get builds working locally using coinbrew following your docs, or even by configuring and making it myself, but I cannot figure out how to get a monolithic build.

@pchtsp
Collaborator

pchtsp commented Sep 24, 2023

An alternative is to just use the COIN_CMD solver in pulp, which requires passing the path to the CBC build you have. You do not need to add the build to pulp to be able to use it.

https://coin-or.github.io/pulp/technical/solvers.html#pulp.apis.COIN_CMD
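
As a rough illustration of this suggestion, the sketch below looks for a `cbc` executable on `PATH` (for example one installed with `apt install coinor-cbc`, as in the Dockerfile earlier in this thread) and passes it to `COIN_CMD` through its `path` argument. The helper names `find_system_cbc` and `solve_with_system_cbc` are hypothetical, and PuLP is imported lazily only so that the lookup helper can be used on its own.

```python
import shutil
from typing import Optional


def find_system_cbc(name: str = "cbc") -> Optional[str]:
    """Return the absolute path of a CBC executable found on PATH, or None."""
    return shutil.which(name)


def solve_with_system_cbc(problem, cbc_path: str):
    """Solve a PuLP problem with an externally provided CBC binary."""
    import pulp  # lazy import: only needed when actually solving

    # COIN_CMD runs the CBC executable at `path` instead of the bundled one.
    solver = pulp.COIN_CMD(path=cbc_path, msg=True)
    return problem.solve(solver)
```

With the `test.py` model from the original report, the final call would become something like `solve_with_system_cbc(model, find_system_cbc())`, after checking that the lookup did not return `None`.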

@tkralphs
Member

@connor-makowski Sorry I missed this at the time. All automated builds have now moved to Github Actions. You can find pre-built binaries here: https://github.com/coin-or/Cbc/releases. Unfortunately, it seems there aren't any runners for building on ARM. As @pchtsp pointed out, though, the easiest is just to build yourself using coinbrew and then use COIN_CMD.

@connor-makowski
Contributor Author

@pchtsp and @tkralphs thanks for the follow-up. While those solutions do work well for me, and I am able to complete them, the use case I keep running into affects a much larger audience than just me.

I teach and conduct various research around optimization at MIT. To help streamline our processes, we oftentimes turn to containers (mostly Docker). While this normally reduces cross-system friction for most of our research / teaching content, one glaring exception is PuLP on ARM systems (when using Docker). I have run into this problem myself and have helped 3 different research teams, as well as a number of students, solve it.

Outside of docker, on all systems (except ARM Linux - which is the real problem), pulp just works in its most simple form.

In my mind, adding a similar Linux ARM monolithic binary would solve this for many of the other groups that also fight this war with only a few helpful hints from Stack Overflow.

For example, assuming that I can get a proper monolithic binary in place, the following code is all that would need to change to get this working:

c8dd2c1

@tkralphs
Member

I guess the main problem is having an automated way to produce the binary. I suppose that with the slow evolution of Cbc these days, building a one-off binary every now and then would be fine. It looks like you are providing the binary in c8dd2c1 so to me, this makes sense, but I leave it up to PuLP's development team whether they want to do this or not.

@connor-makowski
Contributor Author

Yeah, that makes sense to me.

Unfortunately, the binary I am providing is not a monolithic binary... It has some other system requirements, and I cannot seem to produce a monolithic build. I am still working on it on the side, but in any case, someone on the pulp side should build any deployed binary anyway (as a security measure).

Is there someone on the pulp maintainers team I should tag to bring this to their attention?

@tkralphs
Member

tkralphs commented Sep 25, 2023

Building a fully static binary is a real pain and if the process fails, the reasons are not easy to discern. Gcc will silently link to shared libraries even when you request static if no static library is available. There were some lengthy discussions about exactly how to build a static executable with cbc a while back and we eventually got it working. The Github Actions workflow for Cbc does actually produce a static binary, so you can look at the recipe there. coinbrew also has a --static flag that tries to do the right things to get a static binary, but it may or may not work on a given machine.
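
Because the linker can fall back to shared libraries without complaint, it may help to inspect the finished binary rather than trust the build flags. The sketch below parses the text printed by `ldd <binary>`; it assumes glibc-style `ldd` output, the helper names are hypothetical, and the sample text in the usage note is illustrative rather than captured from a real run.

```python
from typing import List


def missing_shared_libs(ldd_output: str) -> List[str]:
    """Return the names of shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Dependency lines look like: "libfoo.so.1 => /path/libfoo.so.1 (0x...)"
        name, sep, target = line.strip().partition(" => ")
        if sep and target.startswith("not found"):
            missing.append(name)
    return missing


def looks_static(ldd_output: str) -> bool:
    """Heuristic: glibc's ldd prints one of these messages for static binaries."""
    text = ldd_output.lower()
    return "not a dynamic executable" in text or "statically linked" in text
```

For output containing a line such as `libCbcSolver.so.3 => not found`, `missing_shared_libs` would return `["libCbcSolver.so.3"]`, which corresponds to the shared-library error reported earlier in this thread.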

@tkralphs
Member

tkralphs commented Sep 25, 2023

Just for background, I think the original issue that sparked the conversation about building static binaries was #196. Later, it was raised again in coin-or/Cbc#252 and finally resolved in coin-or/Cbc#256.

@connor-makowski
Contributor Author

connor-makowski commented Sep 25, 2023

I had bumped into 196 as I was researching this previously but not 252. That looks helpful. Let me see if I can get a static build working from a Linux ARM machine on my side and post back here if I can get it working.

@tkralphs
Member

Oops, just a heads up, I just corrected the link to the issue where this got resolved.

@connor-makowski
Contributor Author

Thanks for that correction by the way. That first issue threw me for a loop once I dug into it. Luckily I only lost about a minute before you followed up.

@connor-makowski
Contributor Author

Okay, so I was able to get a working build up that appears to be passing all tests on ARM Linux.

@tkralphs you were right about some of those challenges related to getting the build to work. I ended up trying a swath of various Linux OS and kernel options. Debugging the build process is painful. Most of the time the failure was system or package related, but I often had to confirm that by going through the coinbrew script and building the individual repos manually. I would love to see some more extensive docs with examples for coinbrew.

At one point, I accidentally ran a build script in a Docker container running the Alpine Python image and it worked. I am not sure why it worked there when I struggled so painfully with Ubuntu and the other Debian-based systems I was working with. I was able to simplify the process quite a bit once I had one build working. Most of this simplification came down to using an undocumented command pass-through option for coinbrew (essentially passing --build to the sub-project configure steps). The --static flag was super useful because it let me skip a lot of the other args I had been manually modifying for other OS tests. The only other key fix I needed was a Fortran compiler in the system to clear out a few build bugs, and now there should be a simple way to build Cbc for ARM...

Steps to produce a monolithic build (on an M1 Mac):

Create a new folder and put in the following:

Dockerfile:

FROM python:3.11.3-bullseye
WORKDIR /app
RUN apt-get update && apt-get install -y gfortran
RUN wget https://raw.githubusercontent.com/coin-or/coinbrew/master/coinbrew
RUN chmod u+x coinbrew

dist (empty folder for volume mounting)

Run the following commands while in your new folder:

Commands:

docker build . --tag coin_m1_docker_build
docker run --volume "./dist:/app/dist" -it coin_m1_docker_build bash
./coinbrew build Cbc --static --build=aarch64-unknown-linux-gnu --no-prompt

Copy your build over to PULP.

The output should be found at: ./dist/bin/cbc

Notes:

  • This build appears to be passing all tests on my side as well as: pulp.pulpTestAll() although I am not sure what that test output should look like.

@connor-makowski
Contributor Author

I created a PR to master, which I assume is not the correct process, but I was not able to find docs on how to contribute. Happy to follow up with next steps to get this in.

@pchtsp
Collaborator

pchtsp commented Sep 26, 2023 via email

@connor-makowski
Contributor Author

Looks like that is pretty standard, I actually did everything there except the black linting (which was irrelevant when applied). One thing I noticed is that your dev requirements do not include black so I had to install it separately in my venv as version 23.9.1.

Regarding tests, these would be operating system / architecture specific, so you would ideally run all of your standard tests on a Linux-based ARM system. This looks like it would require using an ARM-based self-hosted runner. Raspberry Pis run on ARM, so you could probably set up a minimal system for that, but I do not think a self-hosted runner would be a great long-term solution. You could probably automate an AWS EC2 Graviton-based instance (which runs on ARM, e.g. T4g) to spin up and run the tests, but that would also come with incremental costs for each test run. It would also be subject to the flaky nature of AWS EC2 instances, and creating a self-healing EC2 architecture just to run a unit test seems like overkill.

It is not immediately clear how you currently test your cbc builds across Windows 32/64 vs Mac vs Linux 32/64. Do you have a process in place for cross-system tests?

@pchtsp
Collaborator

pchtsp commented Sep 26, 2023 via email

@tkralphs
Member

tkralphs commented Sep 26, 2023

We should separate testing of Cbc from testing of PuLP. I guess that it would be more important to test Cbc on ARM than PuLP. Currently, testing of Cbc is done with Github Actions and you can see the matrix. We have a very small team and a budget of $0 so there is a limit to how professional we can be, but that link that @pchtsp sent is interesting. I would be happy to add testing on ARM to what we're doing if you could help. In fact, if this really makes it possible to build and test on Github Actions, then it might provide a way of automatically producing binaries, rather than doing it one-off, as we had talked about. But this is all probably something we should talk about over in the Cbc project.

@stumitchell
Contributor

stumitchell commented Sep 26, 2023 via email

@connor-makowski
Contributor Author

connor-makowski commented Sep 26, 2023

You could probably modify the docker command to run a script instead that runs the build and also applies some tests to the built binary before moving the binaries to some mounted volume. Then you could guarantee that the binaries pass before they ever exit the build process. This would keep you in the $0 budget zone at least.
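
A minimal sketch of such a gate script is shown below, assuming it runs as the container's command. `build_test_publish` is a hypothetical helper, and the build and test commands are supplied by the caller; nothing here is taken from coinbrew or PuLP.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Sequence


def build_test_publish(
    build_cmd: Sequence[str],
    test_cmd: Sequence[str],
    artifact: Path,
    dest_dir: Path,
) -> Path:
    """Run the build, then the tests, and copy the artifact only if both succeed."""
    # check=True raises CalledProcessError on a non-zero exit code, so a
    # failed build or a failed test stops the script before anything is copied.
    subprocess.run(list(build_cmd), check=True)
    subprocess.run(list(test_cmd), check=True)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # copy2 returns the path of the newly created file inside dest.
    return Path(shutil.copy2(str(artifact), str(dest)))
```

In the setup described earlier in this thread, `build_cmd` would be the `./coinbrew build Cbc ...` command, `artifact` the resulting `dist/bin/cbc`, and `dest_dir` a separately mounted output volume. An uncaught `CalledProcessError` makes the Python process exit with a non-zero status, and `docker run` exits with the exit code of the container's command, so a failure should be visible to whatever invoked the container.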

@tkralphs
Member

coinbrew already runs the same set of unit tests that we use on Github Actions by default while doing the build. But I really don't know what happens with a cross-compile, which is what this is doing, right? I'm also not sure how the workflow goes when running something on Github Actions inside a Docker container. With a regular workflow run, if a process returns with an error code, this gets surfaced automatically. I'm not sure how that happens if the tests are running in a container.

@stumitchell
Contributor

stumitchell commented Sep 27, 2023 via email

@pchtsp
Collaborator

pchtsp commented Sep 30, 2023

I'm closing this issue as the PR was merged. I still think it may be interesting to add an ARM runner in pulp CI to be sure the binary in the project is built correctly. Right now we're just running on faith.

@connor-makowski
Contributor Author

> coinbrew already runs the same set of unit tests that we use on Github Actions by default while doing the build. But I really don't know what happens with a cross-compile, which is what this is doing, right? I'm also not sure how the workflow goes when running something on Github Actions inside a Docker container. With a regular workflow run, if a process returns with an error code, this gets surfaced automatically. I'm not sure how that happens if the tests are running in a container.

Technically this is not a cross compile as I understand cross compile. This is being built directly in an ARM Linux environment (as a docker container running on Mac OS), but the process should hold on any ARM based processor running the docker image.

@tkralphs
Member

tkralphs commented Oct 2, 2023

> Technically this is not a cross compile as I understand cross compile. This is being built directly in an ARM Linux environment (as a docker container running on Mac OS), but the process should hold on any ARM based processor running the docker image.

Yeah, makes sense. I realized this later. I guess the ARM-based runners are not actually running on ARM-based hardware, but they are also not cross-compiling. They are using some kind of Raspberry Pi emulator.


4 participants