Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GCC 14: Remove inline for rdist_csr and rdist #28541

Closed
wants to merge 1 commit into from

Conversation

hswong3i
Copy link

@hswong3i hswong3i commented Feb 27, 2024

Reference Issues/PRs

Fixes #28530

Relates to cython/cython#2747

Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=2261602

What does this implement/fix? Explain your changes.

This problem was mentioned in a bug open to keep track of the effort to remove all compilation warnings #24875 (search "Incompatible pointers types casts")

With gcc < 14 the code compiles with incompatible-pointer warning. But with gcc 14 that warning is an error.

A possible workaround is to remove the inline in the definition of the rdist_csr and rdist members.

With that change, cython generates correct code (I don't know why). But removing the inline can impact the performance.

Any other comments?

This is related with cython/cython#2747

This problem was mentioned in a bug open to keep track of the effort to remove all compilation warnings scikit-learn#24875 (search "Incompatible pointers types casts")

With gcc < 14 the code compiles with incompatible-pointer warning. But with gcc 14 that warning is an error.

Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=2261602

A posible workaround is to remove the inline in the definition of the rdist_csr and rdistmembers.

With that change, cython generates correct code (I don't know why). But removing the inline can impact the performance.

Fixes scikit-learn#28530

Signed-off-by: Wong Hoi Sing Edison <hswong3i@pantarei-design.com>
Copy link

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: d13fd0b. Link to the linter CI: here

@hswong3i
Copy link
Author

Testing with podman + Fedora Rawhide:

sudo -E podman run -ti --rm fedora:rawhide
yum update -y
yum install -y git gcc gcc-c++ python3-devel python3-pip python3-setuptools python3-wheel python3-Cython python3-numpy python3-scipy python3-joblib python3-threadpoolctl
git clone https://github.com/scikit-learn/scikit-learn.git
cd scikit-learn
git checkout main
curl -skL https://github.com/scikit-learn/scikit-learn/pull/28541.patch | patch -p1
python3 setup.py build

hswong3i added a commit to alvistack/scikit-learn-scikit-learn that referenced this pull request Feb 27, 2024
    git clean -xdf
    git submodule sync
    git submodule update --init
    tar zcvf ../python-scikit-learn_1.3.2.orig.tar.gz --exclude=.git .
    debuild -uc -us
    cp python-scikit-learn.spec ../python-scikit-learn_1.3.2-1.spec
    cp ../python*-scikit-learn*1.3.2*.{gz,xz,spec,dsc} /osc/home\:alvistack/scikit-learn-scikit-learn-1.3.2/
    rm -rf ../python*-scikit-learn*1.3.2*.* ../python*-sklearn*1.3.2*.*

See scikit-learn#28541

Signed-off-by: Wong Hoi Sing Edison <hswong3i@pantarei-design.com>
hswong3i added a commit to alvistack/scikit-learn-scikit-learn that referenced this pull request Feb 27, 2024
    git clean -xdf
    git submodule sync
    git submodule update --init
    tar zcvf ../python-scikit-learn_1.3.2.orig.tar.gz --exclude=.git .
    debuild -uc -us
    cp python-scikit-learn.spec ../python-scikit-learn_1.3.2-1.spec
    cp ../python*-scikit-learn*1.3.2*.{gz,xz,spec,dsc} /osc/home\:alvistack/scikit-learn-scikit-learn-1.3.2/
    rm -rf ../python*-scikit-learn*1.3.2*.* ../python*-sklearn*1.3.2*.*

See scikit-learn#28541

Signed-off-by: Wong Hoi Sing Edison <hswong3i@pantarei-design.com>
hswong3i added a commit to alvistack/scikit-learn-scikit-learn that referenced this pull request Feb 27, 2024
    git clean -xdf
    git submodule sync
    git submodule update --init
    tar zcvf ../python-scikit-learn_1.4.0.orig.tar.gz --exclude=.git .
    debuild -uc -us
    cp python-scikit-learn.spec ../python-scikit-learn_1.4.0-1.spec
    cp ../python*-scikit-learn*1.4.0*.{gz,xz,spec,dsc} /osc/home\:alvistack/scikit-learn-scikit-learn-1.4.0/
    rm -rf ../python*-scikit-learn*1.4.0*.* ../python*-sklearn*1.4.0*.*

See scikit-learn#28541

Signed-off-by: Wong Hoi Sing Edison <hswong3i@pantarei-design.com>
hswong3i added a commit to alvistack/scikit-learn-scikit-learn that referenced this pull request Feb 27, 2024
    git clean -xdf
    git submodule sync
    git submodule update --init
    tar zcvf ../python-scikit-learn_1.4.1.post1.orig.tar.gz --exclude=.git .
    debuild -uc -us
    cp python-scikit-learn.spec ../python-scikit-learn_1.4.1.post1-1.spec
    cp ../python*-scikit-learn*1.4.1.post1*.{gz,xz,spec,dsc} /osc/home\:alvistack/scikit-learn-scikit-learn-1.4.1.post1/
    rm -rf ../python*-scikit-learn*1.4.1.post1*.* ../python*-sklearn*1.4.1.post1*.*

See scikit-learn#28541

Signed-off-by: Wong Hoi Sing Edison <hswong3i@pantarei-design.com>
@betatim
Copy link
Member

betatim commented Feb 27, 2024

Thanks for sending this patch! Let's see if we can get people who are experts in this code interested.

@Micky774 and @jjerphan could you make some time to think about this? In particular performance related things. For example is there a benchmark that we can run to measure the difference?

@jjerphan
Copy link
Member

I would run benchmarks from scikit-learn/pairwise-distances-reductions-asv-suite potentially updating the parameter list to only test for a single metric which is not "euclidean" because EuclideanDistance is not used in this case (e.g. "minkowski" would test MinkowskiDistance).

I do not have time to have a look at that for now. :/

@glemaitre
Copy link
Member

But removing the inline can impact the performance.

Indeed, we should pay some additional overhead. Since this is impacting the rdist, we should run a benchmark with a radius neighbors to see the impact of this regression. It is clear that the best solution is to have a fix in Cython but currently, we don't allow some stack to package scikit-learn and I don't see any other workaround.

@jjerphan
Copy link
Member

jjerphan commented Feb 27, 2024

Another solution would be to have the presence of inline be compiler-dependent.

I think we should only consider this second solution if the present one causes regressions.

@betatim
Copy link
Member

betatim commented Feb 27, 2024

I'm not sure about all compilers (and all versions of them) but some compilers will consider any function, with or without inline for inlining. For example in GCC there is -finline-functions which is part of -O2and -O3. I don't know if the heuristic it uses is different from the one used for functions marked as inline (which is just a hint to the compiler, it doesn't mean the function will actually be inlined)?

So I think it is possible that we find that removing the inline definition does not reduce performance. And if that is the case I think we should remove them instead of building a more complicated system/workaround.

@jjerphan
Copy link
Member

If cython/cython#6039 is merged and a new version of Cython released, then we may skip this PR.

@betatim
Copy link
Member

betatim commented Feb 28, 2024

If cython/cython#6039 is merged and a new version of Cython released, then we may skip this PR.

From a quick look at the PR and the discussion that lead to it: it seems like this just suppresses the warning? Is that right? I tried to find out more about the warning and why gcc warns you about this. But didn't find out much :( My question is: is this warning a false positive for scikit-learn's code or is it a true positive? If the latter, shouldn't we fix the source of the warning instead of turning off the warning? The "fix instead of suppress" attitude comes from the fact that usually gcc has a good reason to warn about something, even if it is annoying ;)

@jjerphan
Copy link
Member

jjerphan commented Feb 28, 2024

If the latter, shouldn't we fix the source of the warning instead of turning off the warning?

We can fix the problem in scikit-learn with this PR, but to me the issue is Cython's.

I do not know sufficiently about Cython to be able to tell if the original problem is addressable or not to properly choose a solution.

What do you think?

@jjerphan
Copy link
Member

jjerphan commented Feb 28, 2024

I cannot run the tests suite because Cython fails to compile scikit-learn for other reasons.

Here are the changes which I think need to be made to scikit-learn/pairwise-distances-reductions-asv-suite to test for regressions:

diff --git i/asv.conf.json w/asv.conf.json
index e647afb..1469459 100644
--- i/asv.conf.json
+++ w/asv.conf.json
@@ -11,7 +11,7 @@
 
     // The URL or local path of the source code repository for the
     // project being benchmarked
-    "repo": "https://github.com/scikit-learn/scikit-learn.git",
+    "repo": "https://github.com/alvistack/scikit-learn-scikit-learn.git",
 
     // The Python project's subdirectory in your repo.  If missing or
     // the empty string, the project is assumed to be located at the root
diff --git i/benchmarks/pairwise_distances_reductions.py w/benchmarks/pairwise_distances_reductions.py
index 62b952c..7b63433 100644
--- i/benchmarks/pairwise_distances_reductions.py
+++ w/benchmarks/pairwise_distances_reductions.py
@@ -23,12 +23,12 @@ class PairwiseDistancesReductionsBenchmark(Benchmark):
         "X_test",
     ]
     params = [
-        [1000, 10_000, int(1e7)],
-        [1000, 10_000, 100_000],
-        [100],
-        ["euclidean", "manhattan"],
-        ["auto", "parallel_on_X", "parallel_on_Y"],
-        [np.float32, np.float64],
+        [1000, 10_000],
+        [10_000, 100_000],
+        [10],
+        ["manhattan"],
+        ["auto"],
+        [np.float32],
         ["dense", "csr"],
         ["dense", "csr"],
     ]

and then one can run:

asv continuous main main-gcc14

@lorentzenchr
Copy link
Member

@da-woods Could you help out here? (In the past you did quite many times!)

@da-woods
Copy link

The warning's a "false" positive (i.e. it isn't hiding any real bugs in this case, but the C compiler is right to warn). So I wouldn't be worried about just suppressing it.

but to me the issue is Cython's.

Yes - this is right.

I think the plan is to get the warning at least suppressed in the next Cython release (which I think will probably be today or tomorrow). If I were you I'd just wait for that unless you're in a huge rush.

It is something that Cython should follow up and fix properly in the future though.

@lorentzenchr
Copy link
Member

@da-woods Thank you so much.

@ogrisel
Copy link
Member

ogrisel commented Mar 20, 2024

I believe we don't need this PR anymore with Cython 3.0.9 as said here:

#28530 (comment)

So let me close. We can reopen if I am wrong.

@ogrisel ogrisel closed this Mar 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUILD gcc14 cannot compile scikit-learn
7 participants