[GH-2485] Implement minimum_bounding_circle #2488

chay0112 · 2025-11-10T22:08:21Z

Did you read the Contributor Guide?

Yes, I have read the Contributor Rules and Contributor Development Guide

Is this PR related to a ticket?

Yes, and the PR name follows the format [GH-XXX] my subject. Closes Geopandas: Implement minimum_bounding_circle #2485

What changes were proposed in this PR?

Implemented minimum_bounding_circle

How was this patch tested?

Included unit and parity tests

Did this PR include necessary documentation updates?

Yes, I have updated the documentation.

petern48

When you follow existing patterns, and things fail unexpectedly, it is totally fine to ask for help. Just be a bit more patient waiting for a reply because I (and most people) have very busy lives.

Having AI agent mode go crazy making super odd changes is more likely to make your PR take longer to merge. I (and most reviewers) don't just care about tests passing, but also about code readability. Rule of thumb: if the generated code is harder for you to understand than the original, then it probably isn't a good change.

In all honesty, I think we should revert this entire last commit; you're farther off now than you originally were. Your initial commit looked great. It just needed a bit of tweaking, and I'll point you in the right direction for that.

petern48 · 2025-11-11T03:57:34Z

python/sedona/spark/geopandas/geoseries.py

-    @property
-    def minimum_bounding_circle(self) -> "GeoSeries":
-        spark_expr = stf.ST_MinimumBoundingCircle(self.spark.column)
-        return self._query_geometry_column(
-            spark_expr,
-            returns_geom=True,
-        )
+    def minimum_bounding_circle(self, quadrant_segments: int = None):
+        if quadrant_segments is None:
+            spark_expr = stf.ST_MinimumBoundingCircle(self.spark.column)
+        else:
+            spark_expr = stf.ST_MinimumBoundingCircle(
+                self.spark.column, quadrant_segments
+            )
+        return self._query_geometry_column(spark_expr, returns_geom=True)


We want to follow the Geopandas API. Here's the docs for .minimum_bounding_circle(). You'll see that it doesn't have a quadrant_segments parameter, so we don't want to add it to our python API either.
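As a rough illustration of the "match the reference API" rule, here is a minimal sketch that compares method signatures with the standard-library `inspect` module. The classes `ReferenceSeries`, `OurSeries`, and `WithExtraParam` and the helper `extra_parameters` are hypothetical stand-ins invented for this example; they are not part of Sedona or geopandas.

```python
import inspect


class ReferenceSeries:
    """Hypothetical stand-in for the reference API we want to mirror."""

    def minimum_bounding_circle(self):
        return "circle"


class OurSeries:
    """Hypothetical stand-in for an implementation that matches the reference."""

    def minimum_bounding_circle(self):
        return "circle"


class WithExtraParam:
    """Hypothetical stand-in for an implementation that adds a parameter."""

    def minimum_bounding_circle(self, quadrant_segments=None):
        return "circle"


def extra_parameters(ours, reference, method_name):
    """Return parameter names our method accepts that the reference does not."""
    ours_params = inspect.signature(getattr(ours, method_name)).parameters
    ref_params = inspect.signature(getattr(reference, method_name)).parameters
    return sorted(set(ours_params) - set(ref_params))


matching = extra_parameters(OurSeries, ReferenceSeries, "minimum_bounding_circle")
diverging = extra_parameters(WithExtraParam, ReferenceSeries, "minimum_bounding_circle")
```

In this toy setup `matching` is empty, while `diverging` names the parameter that the reference API does not expose.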

petern48 · 2025-11-11T04:00:59Z

python/tests/geopandas/test_geoseries.py

-        df_result = s.to_geoframe().minimum_bounding_circle
-        self.check_sgpd_equals_gpd(df_result, gpd_res)
+        tg = getattr(s, "to_geoframe")
+        gdf = tg() if callable(tg) else tg
+        mbc = getattr(gdf, "minimum_bounding_circle")
+        df_result = mbc() if callable(mbc) else mbc
+        self.check_sgpd_equals_gpd(df_result, expected)


This unnecessarily complicates the code and hurts readability. Let's stick with existing patterns. When things don't behave the way we want them to, we should adjust things a bit, not rewrite completely different parts of the code.

Totally get it. Sorry for making these a little complicated. I'll try my best to make it simpler.

What you had before is what we're looking for. Just need to add parentheses () to the end of the call to minimum_bounding_circle.

    df_result = s.to_geoframe().minimum_bounding_circle()
    self.check_sgpd_equals_gpd(df_result, gpd_res)
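The difference between accessing a property and calling a method can be sketched with a toy class. `Series`, `total`, and `scaled` below are hypothetical names made up for this illustration and do not come from Sedona or geopandas.

```python
class Series:
    """Hypothetical toy class; not the Sedona or geopandas GeoSeries."""

    def __init__(self, values):
        self.values = values

    @property
    def total(self):
        # Property: accessed without parentheses.
        return sum(self.values)

    def scaled(self):
        # Method: must be called with parentheses.
        return [v * 2 for v in self.values]


s = Series([1, 2, 3])
as_property = s.total        # the computed value
as_method = s.scaled()       # the computed value
forgot_parens = s.scaled     # a bound method object, not a result
```

Once an attribute changes from a property to a method, call sites only need the added `()`; there is no need for `getattr` or `callable` checks.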

chay0112 · 2025-11-11T04:37:05Z

Thank you for the feedback and for taking the time to review this. I completely understand your point and appreciate the guidance. I’ll revert the last commit and go back to the earlier version so we can refine it from there.

petern48 · 2025-11-11T04:41:13Z

Now, let's address the CI failure of your original commit. The reality is that the issue was in the test itself. I could explain why it was causing problems, but honestly it's a little complicated.

           # 2) Coverage parity — compare only where inputs are valid for both backends
           nonnull_nonempty_ps = (gs_in.isna() == False) & (gs_in.is_empty == False)

Instead, I'd like you to take a step back and rewrite the test. Don't overthink it. You really don't need to understand much of Sedona as a project to contribute these Geopandas functions. A lot of it is copy-paste and following patterns. Here's what I'd like you to do:

1. Go to test_match_geopandas_series.py and delete the entire contents of the test_minimum_bounding_circle test.
2. Take a look at the following function in test_match_geopandas_series.py. Then guess how you should implement test_minimum_bounding_circle. It's very simple, much simpler than the original test in this PR. You shouldn't need to ask AI at all. You're welcome to copy and paste.

sedona/python/tests/geopandas/test_match_geopandas_series.py

Lines 721 to 725 in 762d6f8

    
    def test_centroid(self):
        for geom in self.geoms:
            sgpd_result = GeoSeries(geom).centroid
            gpd_result = gpd.GeoSeries(geom).centroid
            self.check_sgpd_equals_gpd(sgpd_result, gpd_result)

If you're still confused, you're welcome to ask for help. We might need to make some small adjustments to make things pass, but this is a lot closer to what our desired end result is.

chay0112 · 2025-11-11T04:46:07Z

Excellent, thanks for the direction. I'll check this.

chay0112 · 2025-11-11T06:04:50Z

@petern48 Thanks for the great suggestion! I tried a similar approach to what’s used in the centroid test.

def test_minimum_bounding_circle(self):
        for geom in self.geoms:
            sgpd_result = GeoSeries(geom).minimum_bounding_circle()
            gpd_result = gpd.GeoSeries(geom).minimum_bounding_circle()
            self.check_sgpd_equals_gpd(sgpd_result, gpd_result)

The above test fails because the tolerance in the check_sgpd_equals_gpd method is currently set to 1e-2.
However, increasing the tolerance to 0.5 allows the test to pass. Would you prefer that I update the tolerance to 0.5 in the check_sgpd_equals_gpd method, or handle this case-specific adjustment directly within my test, as shown below? Or am I completely off track here?

def test_minimum_bounding_circle(self):
        for geom in self.geoms:
            sgpd_result = GeoSeries(geom).minimum_bounding_circle()
            gpd_result = gpd.GeoSeries(geom).minimum_bounding_circle()
            for s, g in zip(sgpd_result.to_pandas(), gpd_result):
                assert s.is_valid and g.is_valid
                assert abs(s.area - g.area) < 0.5

No rush! Please take your time to reply.

petern48 · 2025-11-11T06:45:31Z

def test_minimum_bounding_circle(self):
       for geom in self.geoms:
           sgpd_result = GeoSeries(geom).minimum_bounding_circle()
           gpd_result = gpd.GeoSeries(geom).minimum_bounding_circle()
           self.check_sgpd_equals_gpd(sgpd_result, gpd_result)

Yes! Now we're on the right track. I'll explain the code a little more, just for your knowledge. test_match_geopandas_series.py is for directly comparing our results (GeoSeries()) with the original geopandas results (gpd.GeoSeries). This code does exactly that: it loops through a ton of different geometries (self.geoms) and checks that our results (sgpd_result) are equal, within a tolerance, to the original geopandas results (gpd_result).
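The parity-test pattern described here can be sketched without Sedona or geopandas. In this minimal sketch, `reference_length`, `our_length`, `check_ours_equals_reference`, and `inputs` are hypothetical stand-ins for gpd.GeoSeries, GeoSeries, check_sgpd_equals_gpd, and self.geoms; they only mirror the shape of the real test.

```python
import math


def reference_length(x, y):
    """Stand-in for the reference library's result."""
    return math.hypot(x, y)


def our_length(x, y):
    """Stand-in for our own implementation of the same operation."""
    return math.sqrt(x * x + y * y)


def check_ours_equals_reference(actual, expected, tolerance=1e-2):
    """Hypothetical helper: fail if the results differ by more than tolerance."""
    if abs(actual - expected) > tolerance:
        raise AssertionError(f"{actual} != {expected} within {tolerance}")


# Plays the role of self.geoms: a list of varied inputs.
inputs = [(0.0, 0.0), (3.0, 4.0), (1.5, -2.5), (1e3, 1e-3)]

checked = 0
for x, y in inputs:
    check_ours_equals_reference(our_length(x, y), reference_length(x, y))
    checked += 1
```

Each input runs through both implementations, and the loop fails on the first input where the two results disagree by more than the tolerance.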

However, increasing the tolerance to 0.5 allows the test to pass. Would you prefer that I update the tolerance to 0.5 in the check_sgpd_equals_gpd method

Nice job figuring this out. I think the best way forward is to add a tolerance parameter to the check function (check_sgpd_equals_gpd) and override it with 0.5 just for test_minimum_bounding_circle. Something like this:

    @classmethod
    def check_sgpd_equals_gpd(
        cls,
        actual: GeoSeries,
        expected: gpd.GeoSeries,
+        tolerance: float = 1e-2
    ):
        ...
        for a, e in zip(sgpd_result, expected):
            ...
            cls.assert_geometry_almost_equal(
-                a, e, tolerance=1e-2 # 1e-2
+                a, e, tolerance
            )
            ...

Then you can call it as self.check_sgpd_equals_gpd(sgpd_result, gpd_result, tolerance=0.5) in your test function. This loosens the check enough for your test to pass without loosening it for other functions: if they still pass with the stricter tolerance, there's no need to relax it for them too. The definition of check_sgpd_equals_gpd is in test_geopandas_base.py.
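A minimal, self-contained sketch of the same idea: a default tolerance that a single caller can override. `check_almost_equal` and `passes` are hypothetical helpers, and the numbers are made up purely to show the behaviour; none of this is the actual Sedona test code.

```python
def check_almost_equal(actual, expected, tolerance=1e-2):
    """Hypothetical helper with an overridable tolerance (default 1e-2)."""
    if abs(actual - expected) > tolerance:
        raise AssertionError(
            f"{actual} vs {expected} differ by more than {tolerance}"
        )


def passes(actual, expected, **kwargs):
    """Return True if the check passes, False if it raises AssertionError."""
    try:
        check_almost_equal(actual, expected, **kwargs)
    except AssertionError:
        return False
    return True


# Made-up values chosen only to illustrate the behaviour.
strict_default = passes(10.0, 10.3)                 # uses the default 1e-2
loosened_here = passes(10.0, 10.3, tolerance=0.5)   # overridden for one caller
other_caller = passes(2.0, 2.001)                   # still uses the default 1e-2
```

Because the default stays at 1e-2, only the caller that passes `tolerance=0.5` gets the looser comparison.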

chay0112 · 2025-11-11T07:24:30Z

Awesome, that’s one of the best explanations I’ve ever received!

petern48

Looks great! Just one last minor nit to address.

Now that we've slowed down and spent a bit more time for you to understand a few aspects of the code, you'll (hopefully) be better prepared to contribute more functions 😉

python/tests/geopandas/test_match_geopandas_series.py

Co-authored-by: Peter Nguyen <petern0408@gmail.com>

chay0112 · 2025-11-11T15:52:30Z

Absolutely! I really appreciate you taking the time to walk me through the details. It definitely helped me understand things better and feel more confident about contributing further.

petern48 · 2025-11-11T17:26:53Z

Thanks @chay0112!

Implemented the minimum bounding circle

272dba6

chay0112 requested a review from jiayuasu as a code owner November 10, 2025 22:08

github-actions bot added the sedona-python label Nov 10, 2025

adjusted test cases

12e1168

petern48 reviewed Nov 11, 2025

Refactored test cases and added tolerance as param to check_sgpd_equals_gpd

96b5362

revert make valid test

1899319

petern48 approved these changes Nov 11, 2025

Remove unused import

bf2f390

Co-authored-by: Peter Nguyen <petern0408@gmail.com>

petern48 merged commit 8de7008 into apache:master Nov 11, 2025
30 of 31 checks passed
