A collection of geospatial bug fixes #4444

mistercrunch · 2018-02-16T08:28:57Z

Also uses geopy to parse single-column spatial info, which supports many different lat/lng formats.

betodealmeida · 2018-02-17T01:03:24Z

superset/viz.py

+                p = Point(s)
+                return (p.latitude, p.longitude)
+
+            df[key] = df[spatial.get('lonlatCol')].apply(tupleify)


We should definitely benchmark this with our current data before deploying, since it's using regular expressions to do the job: https://github.com/geopy/geopy/blob/master/geopy/point.py#L310-L339

    In [2]: from geopy.point import Point

    In [3]: Point('234,239')
    Out[3]: Point(54.0, -121.0, 0.0)

    In [4]: %timeit Point('234,239')
    The slowest run took 4.29 times longer than the fastest. This could mean that an intermediate result is being cached.
    100000 loops, best of 3: 11.7 µs per loop

    In [5]: %timeit (float(v) for v in '234,239'.split(','))
    The slowest run took 6.84 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 593 ns per loop
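One caveat on the second measurement: a generator expression is lazy, so timing `(float(v) for v in ...)` measures building the generator object, not the `float()` conversions themselves. A minimal sketch of a fairer baseline using the standard `timeit` module could look like this. The helper names are made up for illustration, and no timings are quoted because they depend on the machine.

```python
import timeit

SAMPLE = "234,239"


def split_lazy(s):
    # Mirrors the expression timed in the session: builds a generator,
    # but float() is never called unless something iterates over it.
    return (float(v) for v in s.split(","))


def split_eager(s):
    # Consumes the generator, so both float() conversions really run.
    return tuple(float(v) for v in s.split(","))


def per_call_seconds(func, arg, number=100_000):
    """Best-of-3 average seconds per call, similar in spirit to %timeit."""
    runs = timeit.repeat(lambda: func(arg), number=number, repeat=3)
    return min(runs) / number


if __name__ == "__main__":
    for func in (split_lazy, split_eager):
        ns = per_call_seconds(func, SAMPLE) * 1e9
        print(f"{func.__name__}: {ns:.0f} ns per call")
```

The eager variant is the one to compare against `Point`, since it does the same amount of useful work per row.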

mistercrunch · Feb 17, 2018

So roughly double the time. On 1M points, that means 0.5 seconds, which to me is fine, as it's almost negligible compared to the network time it takes to bring that over. Note that there's probably a numpy way of doing this that would be much faster.

betodealmeida · Feb 20, 2018

Wait, you mean 20x the time. Calling Point on 1M points would take ~12 seconds (11.7e-6 s * 1e6).
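For reference, the arithmetic behind the 20x and ~12 second figures can be checked in a few lines. The two per-call times are the ones reported in the `%timeit` session; the variable names are only illustrative.

```python
point_s = 11.7e-6    # per-call time reported for Point('234,239')
split_s = 593e-9     # per-call time reported for the split/float expression
n_points = 1_000_000

slowdown = point_s / split_s
total_point_s = point_s * n_points
total_split_s = split_s * n_points

print(f"slowdown: {slowdown:.1f}x")                  # slowdown: 19.7x
print(f"Point over 1M rows: {total_point_s:.1f} s")  # Point over 1M rows: 11.7 s
print(f"split over 1M rows: {total_split_s:.2f} s")  # split over 1M rows: 0.59 s
```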

I looked at the code and I'm not sure how to optimize this with Numpy.
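This is not a NumPy solution, but one possible direction is a plain-Python fast path: try the cheap `split`/`float` parse first and only hand the string to a slower parser when that fails. The sketch below is an assumption about how that could be structured, not code from this PR. `parse_lat_lng` and its `fallback` parameter are hypothetical names. Out-of-range values are deliberately sent to the fallback, because the session shows `Point('234,239')` returning `Point(54.0, -121.0, 0.0)`, so a naive float parse would not give the same result for them.

```python
from typing import Callable, Optional, Tuple

LatLng = Tuple[float, float]


def parse_lat_lng(
    s: str,
    fallback: Optional[Callable[[str], LatLng]] = None,
) -> LatLng:
    """Parse a plain 'lat,lng' string cheaply; defer anything else to `fallback`."""
    parts = s.split(",")
    if len(parts) == 2:
        try:
            lat, lng = float(parts[0]), float(parts[1])
        except ValueError:
            pass  # not two plain numbers, fall through to the slow path
        else:
            # Only accept values that need no normalization.
            if -90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0:
                return (lat, lng)
    if fallback is None:
        raise ValueError(f"cannot parse {s!r} without a fallback parser")
    return fallback(s)


# Hypothetical wiring with geopy as the slow path, mirroring the diff:
#
#     from geopy.point import Point
#
#     def geopy_fallback(s):
#         p = Point(s)
#         return (p.latitude, p.longitude)
#
#     df[key] = df[col].apply(lambda s: parse_lat_lng(s, geopy_fallback))
```

Whether this helps depends on how much of the real data is already in the simple decimal form, so it would still need to be benchmarked against current data.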

mistercrunch · Feb 20, 2018

Oh snap. 12 seconds is long.

betodealmeida

LGTM, and we can look at improving the perf of Point later if we need.

mistercrunch · 2018-02-20T22:41:33Z

Let's optimize Point later on.

mistercrunch force-pushed the geofixes branch from fd1ef6b to 6afed9f Compare February 16, 2018 21:55

betodealmeida reviewed Feb 17, 2018


mistercrunch added the v0.23 label Feb 19, 2018

mistercrunch force-pushed the geofixes branch from 171ad11 to 3c97a53 Compare February 20, 2018 16:42

A collection of bug fixes

034d31b

mistercrunch force-pushed the geofixes branch from 3c97a53 to 034d31b Compare February 20, 2018 17:13

betodealmeida approved these changes Feb 20, 2018


mistercrunch merged commit 5c35a2d into apache:master Feb 20, 2018

mistercrunch deleted the geofixes branch February 20, 2018 22:41

hughhhh pushed a commit to lyft/incubator-superset that referenced this pull request Apr 1, 2018

Cherry pick apache#4444 (apache#134)

d6e86e3

michellethomas pushed a commit to michellethomas/panoramix that referenced this pull request May 24, 2018

A collection of bug fixes (apache#4444)

c16bb6a

wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018

A collection of bug fixes (apache#4444)

1118597

mistercrunch added the 🏷️ bot and 🚢 0.24.0 labels Feb 27, 2024

