Skip to content

dkogan/culdesacs

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
 
 
 
 
 
 
 
 

Reference

This is all described better (with pictures!) in a blog post:

http://notes.secretsauce.net/notes/2015/08/16_least-convenient-location-in-los-angeles-from-koreatown.html

Overview

Talking to a friend, a question came up about finding the point in LA’s road network that’s most inconvenient to get to, with inconvenient being a vague notion describing a closed residential neighborhood full of dead ends; the furthest of these dead ends would be most inconvenient indeed. This repository attempts to answer that question.

I want inconvenient to mean

Furthest to reach via the road network, but nearest as-the-crow-flies.

Note that this type of metric is not a universal one, but is relative to a particular starting point. This makes sense, however: a location that’s inconvenient from one location could be very convenient from another.

This metric could be expressed in many ways. I keep it simple, and compute a relative inefficiency coefficient:

(d_road - d_direct) / d_direct

Thus the goal is to find a location within a given radius of the starting point that maximizes this relative inefficiency.

Approach

I use OpenStreetMap for the road data. This is all aimed at bicycling, so I’m looking at all roads except freeways and ones marked private. I am looking at footpaths, trails, etc.

Once I have the road network, I run Dijkstra’s Algorithm to compute the shortest path from my starting point to every other point on the map. Then I can easily compute the inefficiency for each such point, and pick the point with the highest inefficiency. I use OSM nodes as the “points”. It is possible that the location I’m looking for is inbetween a pair of nodes, but the nodes will really be close enough. Also, the “distance” between adjacent nodes can take into account terrain type, elevation, road type and so on. I ignore all that, and simply look at the distance.

Implementation

Each step in the process lives in its own program. This simplifies implementation and makes it easy to work on each piece separately.

Data import

First I query OSM. This is done with the query.pl script. It takes in the center point and the query radius. The query uses the OSM Overpass query language. I use this simple query, filling in the center point and radius:

[out:json];

way
 ["highway"]
 ["highway" !~ "motorway|motorway_link" ]
 ["access" !~ "private" ]
 ["access" !~ "no" ]
 (around:$rad,$lat,$lon);

(._;>;);

out;

Sample invocation:

$ ./query.pl --center 34.0690448,-118.292924 --rad 20miles
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  .....

$ ls -lhrt *(om[1])
-rw-r--r-- 1 dima dima 81M Aug 14 00:44 query_34.0690448_-118.292924_20miles.json

Data massaging

Now I need to take the OSM query results, and manipulate them into a form readable by the Dijkstra’s algorithm solver. This is done by the massage_input.pl script. This script does nothing interesting, but it doesn it inefficiently, so it’s CPU and RAM-hungry and takes a few minutes. Sample invocation:

$ ./massage_input.pl query_34.0690448_-118.292924_20miles.json > query_34.0690448_-118.292924_20miles.net

Neighbor list representation

An implementation choice here was how to represent the neighbor list for a node. I want the main computation (next section) to be able to query this very quickly, and I don’t want the list to take much space, and I don’t want to fragment my memory with many small allocations. Thus I have a single contiguous array of integers neighbor_pool. Each node has a single integer index into this pool. At this index the neighbor_pool contains a list of node indices that are neighbors of the node in question. A special node index of -1 signifies the end of the neighbor list for that node.

Inefficiency coefficient computation

I now feed the massaged data to Dijkstra’s algorithm implemented in compute.c. I need a priority queue where elements can be inserted, removed and updated. Apparently most heap implementations don’t have an ‘update’ mechanism, so it took a little while to find a working one. I ended up using phk’s b-heap implementation from the varnish source tree. It stores arbitrary pointers (64-bit on my box); 32-bit indices into a pool would be more efficient, but this is fast enough.

Sample invocation:

$ ./compute < query_34.0690448_-118.292924_20miles.net > query_34.0690448_-118.292924_20miles.out

$ head -n 2 query_34.0690448_-118.292924_20miles.out
34.069046 -118.292923 0.000000 0.000000
34.070034 -118.292931 109.863564 109.863564

The output is all nodes, sorted by the road distance to the node. The columns are lat,lon,d_road,d_direct.

Distance from latitude/longitude pairs

One implementation note here is how to compute the distance between two latitude/longitude pairs. The most direct way is to convert each latitude/longitude pair into a unit vector, compute the dot product, take the arccos and multiply by the radius of the Earth. This requires 9 trigonometric operations and relies on the arccos of a number close to 1, which is inaccurate. One could instead compute the arcsin of the magnitude of the cross-product, but this requires even more computation. I want something simpler:

dist = Rearth * angle

cos(angle) = dot(v0,v1) = dot( (cos(lon0)*cos(lat0), sin(lon0)*cos(lat0), sin(lat0)),
                               (cos(lon1)*cos(lat1), sin(lon1)*cos(lat1), sin(lat1)) ) =

           = cos(lat0)*cos(lat1) * ( cos(lon0)*cos(lon1) + sin(lon0)*sin(lon1) ) +
             sin(lat0)*sin(lat1) =

           = cos(lat0)*cos(lat1) * cos(diff_lon) + sin(lat0)*sin(lat1)

cos(diff_lon) ~ 1 - diff_lon^2/2 so

cos(angle) = cos(lat0)*cos(lat1) + sin(lat0)*sin(lat1) - diff_lon^2/2*cos(lat0)*cos(lat1) =
           = cos(diff_lat) - cos(lat0)*cos(lat1)*diff_lon^2/2 ~
           ~ 1 - diff_lat^2/2 - diff_lon^2/2*cos(lat0)*cos(lat1)

cos(angle) ~ 1 - angle^2/2, so

angle^2 ~ diff_lat^2 + diff_lon^2*cos(lat0)*cos(lat1)

angle ~ sqrt(diff_lat^2 + diff_lon^2 * cos(lat0)*cos(lat1))

This is nice and simple. Is it sufficiently accurate? This python script tests it:

import numpy as np
lat0,lon0 = 34.0690448,-118.292924  # 3rd/New Hampshire
lat1,lon1 = 33.93,-118.4314         # LAX

lat0,lon0,lat1,lon1 = [x * np.pi/180.0 for x in lat0,lon0,lat1,lon1]

Rearth = 6371000

v0 = np.array((np.cos(lat0)*np.cos(lon0), np.cos(lat0)*np.sin(lon0),np.sin(lat0)))
v1 = np.array((np.cos(lat1)*np.cos(lon1), np.cos(lat1)*np.sin(lon1),np.sin(lat1)))

dist_accurate = np.sqrt( (lat0-lat1)**2 + (lon0-lon1)**2 * np.cos(lat0)*np.cos(lat1) ) * Rearth
dist_approx   = np.arccos(np.inner(v0,v1)) * Rearth

print dist_accurate
print dist_approx
print dist_accurate - dist_approx

Between Koreatown and LAX there’s quite a bit of difference in both latitude and longitude. Both methods say the distance is about 20km, with a disagreement of 3mm. This is plenty good enough.

Results

I want to find the least convenient location from the intersection of New Hampshire and 3rd street in Los Angeles within 20 miles or so.

The output of compute is sorted by road distance from the start. I prepend the coefficient of inconvenience, re-sort the list and take 50 most inconvenient locations by invoking

<query_34.0690448_-118.292924_20miles.out
   awk '$4 {printf "%f %f %f %f %f\n",($3-$4)/$4,$1,$2,$3,$4}' |
   sort -n -k1 -r | head -n 50

The output is this:

InconvenienceLatitudeLongitudeRoad distance (m)Direct distance (m)
1.14205234.068104-118.290382549.216980256.397583
1.13983934.071629-118.288956994.499390464.754242
1.13914734.068066-118.290436542.721497253.709305
1.13679934.068130-118.290329554.962891259.716919
1.12763134.068031-118.290466537.980652252.854279
1.12053734.068153-118.290253562.437012265.233337
1.10677134.067982-118.290504531.442017252.254257
1.10351834.068169-118.290184568.985352270.492218
1.08334434.067940-118.290527526.321899252.633179
1.07902734.068176-118.290100576.762024277.419189
1.04181634.067883-118.290543519.805908254.580200
1.03425234.070259-118.291237418.454498205.704315
1.01909634.071594-118.2878881097.392212543.506653
0.97473134.068214-118.289680617.407532312.654022
0.97009534.068176-118.289719611.899475310.593842
0.91759834.068111-118.289383656.267517342.234131
0.91004834.068165-118.289383650.329041340.477783
0.90249134.068214-118.289383644.814758338.931915
0.77080934.067570-118.290543485.023560273.899414
0.76071134.068214-118.288643712.981384404.939484
0.75334434.068214-118.288597717.197876409.045654
0.75054134.033188-118.2797167297.5698244168.751465
0.74734934.031826-118.2799687526.4150394307.333008
0.74335734.067772-118.289474606.347107347.804382
0.74190234.067787-118.289436610.249084350.334900
0.74002434.067749-118.289505602.555115346.291290
0.73994434.031769-118.2798237511.6191414317.161621
0.73838834.031582-118.2807467499.8027344314.228516
0.73788934.067795-118.289398613.863831353.223755
0.73771634.031742-118.2798007507.9770514320.601562
0.73629734.031372-118.2802587550.4868164348.613770
0.73508334.068108-118.288734693.459473399.669403
0.73460734.067730-118.289520600.010803345.905945
0.73285134.031685-118.2797477499.9331054328.088379
0.73081734.067795-118.289352618.080322357.103241
0.73054334.031654-118.2797327496.2597664331.739746
0.72862234.031628-118.2797247493.2084964334.787109
0.72712334.067707-118.289536597.103455345.721344
0.72680234.031601-118.2797247490.2392584337.637207
0.72430934.031563-118.2797397485.7705084341.315430
0.72313834.067791-118.289307622.318115361.153992
0.72282634.031540-118.2797557482.8627934343.366211
0.72238434.094849-118.23614510273.0322275964.425293
0.72197934.094719-118.23577910309.7080085987.128906
0.72101134.094639-118.23547410339.1875006007.625977
0.72081234.094620-118.23540510345.8564456012.193359
0.72010534.031498-118.2797787477.7421884347.258789
0.72007834.094543-118.23513810371.8671886029.880859
0.71978934.031509-118.2797557475.2788094346.624512
0.71961634.095020-118.23632010248.0234385959.484863

There are 3 clusters of data. All the stuff < 500m away from the start is mostly degenerate and uninteresting. Most of the points are in walkways in Shatto Recreation Center. They’re all so close to the start that any inefficiency is exaggerated by the small d_direct. I make the rules, so I claim these aren’t the least convenient point.

Next we have the points about 4.2km away as the crow flies. These all appear in an improperly-mapped group of sidewalks around Saint James park: http://www.openstreetmap.org/#map=18/34.03173/-118.27892.

Here the sidewalks appear as separate ways that don’t connect with the roads they abut. So according to the data, connecting to the network of sidewalks can only happen in one location, making these appear less convenient than they actually are. (I think these should be removed entirely, but it looks like the OSM committee people think both ways are fine. OK; it’ll be fixed eventually in some way).

The next cluster of data is about 6km away as the crow flies. These are all at the road connecting to the Metrolink maintenance facility at Taylor Yard: http://www.openstreetmap.org/#map=17/34.09371/-118.23463. This makes sense! This location is on the other side of the LA river from Koreatown, so getting here requires a lengthy detour to the nearest bikeable bridge. The nearest one (Riverside Drive) is 2.5km by road away, but this is in the opposite direction from Koreatown. The nearest one in the other direction is Fletcher Drive, 3.8km by road.

So the least convenient point from New Hampshire / 3rd is at lat/lon 34.094849,-118.236145. This location is 10.3km away by road, but only 6.0km as the crow flies, for an inconvenience coefficient of 0.72.

License

All code I wrote is Copyright 2015 Dima Kogan, released under the terms of the Lesser GNU Public License (any version). This includes everything except the files from Varnish, whose copyright and licensing appear in the specific files. These are binary_heap.c, binary_heap.h, miniobj.h

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published