Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tree shapefile dump fails because of a null geometry [Part of 1714] #1728

Closed
kdeloach opened this issue May 23, 2015 · 10 comments
Closed

Tree shapefile dump fails because of a null geometry [Part of 1714] #1728

kdeloach opened this issue May 23, 2015 · 10 comments
Assignees
Milestone

Comments

@kdeloach
Copy link
Contributor

Occurred in production on May 23 2015 10:30 AM EST.

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/celery/app/trace.py", line 240, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/celery/app/trace.py", line 438, in __protected_call__
    return self.run(*args, **kwargs)
  File "/opt/app/apps/census_admin/tasks.py", line 117, in dump_trees_to_shapefile
    geom=0, geom_type='Point')
  File "/opt/app/apps/census_admin/tasks.py", line 194, in _dump_to_shapefile
    'geometry': json.loads(get_val(row, geom)),
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer
@dboyer dboyer added this to the 2015-06-04 milestone May 25, 2015
@jwalgran jwalgran changed the title Exception in generating data dump shapefile Exception in generating data dump shapefile [Part of 1714] May 26, 2015
@jwalgran jwalgran self-assigned this May 26, 2015
@jwalgran
Copy link
Member

Using a production data dump, I ran trees.sql with an additional geom IS NULL clause and it returned a single row for which a geometry was not generated.

 geom |  id  | survey_id | species_id | distance_to_tree | distance_to_end | circumference | stump_diameter | curb_location  | status | species_certainty | health | stewardship | guards | sidewalk_damage | problems |          created_at           |          updated_at
------+------+-----------+------------+------------------+-----------------+---------------+----------------+----------------+--------+-------------------+--------+-------------+--------+-----------------+----------+-------------------------------+------------------------------
      | 2200 |       414 |        126 |               24 |            15.2 |            27 |                | OffsetFromCurb | Alive  | Yes               | Good   | 3or4        | None   | NoDamage        | None     | 2015-05-22 22:04:26.153112+00 | 2015-05-22 22:04:26.15314+00

Survey details

screen shot 2015-05-26 at 9 10 23 am

screen shot 2015-05-26 at 9 10 13 am

@jwalgran
Copy link
Member

https://treescount.nycgovparks.org/survey/detail/414/ showing the short median block edge with a single tree instead of 2

screen shot 2015-05-26 at 9 27 40 am

@jwalgran
Copy link
Member

Incrementally stepping through the subqueries in trees.sql. layoutBoxes returns NULL for the second tree.

SELECT
    id,
    survey_id,
    1.000*(CASE WHEN curb_location='OnCurb' THEN 2.5 ELSE 12 END) curb_offset,
    1.000*distance_to_tree dist
  FROM survey_tree
  WHERE survey_id = 414

  id  | survey_id | curb_offset | dist
------+-----------+-------------+------
 2199 |       414 |      2.5000 | 25.6
 2200 |       414 |      12.000 |   24
(2 rows)
WITH trees_for_this_survey AS(
  SELECT
    id,
    survey_id,
    1.000*(CASE WHEN curb_location='OnCurb' THEN 2.5 ELSE 12 END) curb_offset,
    1.000*distance_to_tree dist
  FROM survey_tree
  WHERE survey_id = 414
)

SELECT survey_id,
       array_agg(curb_offset ORDER BY id) width,
       array_agg(0) length,
       array_agg(dist ORDER BY id) dist,
       array_agg(id ORDER BY id) tree_ids
FROM trees_for_this_survey
GROUP BY survey_id

 survey_id |      width      | length |   dist    |  tree_ids
-----------+-----------------+--------+-----------+-------------
       414 | {2.5000,12.000} | {0,0}  | {25.6,24} | {2199,2200}
WITH trees_for_this_survey AS(
  SELECT
    id,
    survey_id,
    1.000*(CASE WHEN curb_location='OnCurb' THEN 2.5 ELSE 12 END) curb_offset,
    1.000*distance_to_tree dist
  FROM survey_tree
  WHERE survey_id = 414
),
aggregated_trees AS (
  SELECT survey_id,
         array_agg(curb_offset ORDER BY id) width,
         array_agg(0) length,
         array_agg(dist ORDER BY id) dist,
         array_agg(id ORDER BY id) tree_ids
  FROM trees_for_this_survey
  GROUP BY survey_id
)
SELECT
  s.blockface_id,
  s.is_mapped_in_blockface_polyline_direction,
  s.is_left_side as left_side,
  s.user_id, st_astext(b.geom),
  r.survey_id, width, length, dist,
  r.tree_ids
FROM
  aggregated_trees r, survey_survey s, survey_blockface b
WHERE
  r.survey_id = s.id AND b.id = s.blockface_id

 blockface_id | is_mapped_in_blockface_polyline_direction | left_side | user_id |                                       st_astext                                        | survey_id |      width      | length |   dist    |  tree_ids
--------------+-------------------------------------------+-----------+---------+----------------------------------------------------------------------------------------+-----------+-----------------+--------+-----------+-------------
       340213 | f                                         | f         |    2303 | MULTILINESTRING((-73.8513201733406 40.743059186748,-73.851234653018 40.7428932342286)) |       414 | {2.5000,12.000} | {0,0}  | {25.6,24} | {2199,2200}
(1 row)
WITH trees_for_this_survey AS(
  SELECT
    id,
    survey_id,
    1.000*(CASE WHEN curb_location='OnCurb' THEN 2.5 ELSE 12 END) curb_offset,
    1.000*distance_to_tree dist
  FROM survey_tree
  WHERE survey_id = 414
),
aggregated_trees AS (
  SELECT survey_id,
         array_agg(curb_offset ORDER BY id) width,
         array_agg(0) length,
         array_agg(dist ORDER BY id) dist,
         array_agg(id ORDER BY id) tree_ids
  FROM trees_for_this_survey
  GROUP BY survey_id
),
aggs AS (
  SELECT
    s.blockface_id,
    s.is_mapped_in_blockface_polyline_direction,
    s.is_left_side as left_side,
    s.user_id, b.geom,
    r.survey_id, width, length, dist,
    r.tree_ids
  FROM
    aggregated_trees r, survey_survey s, survey_blockface b
  WHERE
    r.survey_id = s.id AND b.id = s.blockface_id
)

SELECT survey_id,
       tree_ids,
       layoutBoxes(ST_Transform(
           CASE WHEN is_mapped_in_blockface_polyline_direction
           THEN st_geometryn(geom,1)
           ELSE ST_Reverse(st_geometryn(geom,1)) END,
           102718), -- state plane, for improved accuracy, feet units
       left_side, dist, length, width) as tbeds
FROM aggs

 survey_id |  tree_ids   |                           tbeds
-----------+-------------+-----------------------------------------------------------
       414 | {2199,2200} | {01010000203E910100608C5153744B2F417387A63A43A20941:NULL}
(1 row)

@jwalgran jwalgran changed the title Exception in generating data dump shapefile [Part of 1714] Tree shapefile dump fails because of a null geometry [Part of 1714] May 26, 2015
@jwalgran
Copy link
Member

layoutBoxes.sql contains the following logic:

-- This is an arbitrarily small number used
-- to obtain another point but on the same segment
IF curdst >= roadrec.len THEN
  distfrac := distfrac - 0.0000001;
ELSE
  distfrac := distfrac + 0.0000001;
END IF;

I added several RAISE statements to the layoutBoxes function to inspect the values of the intermediate variables and got the idea that, for short blocks, this "arbitrarily small number" may be too small. (ST_Length(line) reports block edge 340213 as 64.94 ft)

When I update the statement to

IF curdst >= roadrec.len THEN
  distfrac := distfrac - 0.0000002;
ELSE
  distfrac := distfrac + 0.0000002;
END IF;

The query produces geometry for both trees

 survey_id |  tree_ids   |                                                  tbeds
-----------+-------------+---------------------------------------------------------------------------------------------------------
       414 | {2199,2200} | {01010000203E91010020735153744B2F41D787A73A43A20941:01010000203E91010076405469744B2F41AD043EB911A30941}

strk pushed a commit to treekit/treekit that referenced this issue May 28, 2015
Fixes point placement on short segments, as reported on
azavea/nyc-trees#1728

Also fixes slighly wrong distance on offsetted points (#53)
@strk
Copy link
Contributor

strk commented May 28, 2015

It boils down to this issue:
select ST_OffsetCurve('LINESTRING(24 0,24.0000025 0)'::geometry, -3);

Filed upstream here: https://trac.osgeo.org/postgis/ticket/3138

I committed a workaround in the treekit code that also fixes another issue of slight misplacement for offsetted points:

treekit/treekit@1ef16d7

@strk
Copy link
Contributor

strk commented May 28, 2015

NOTE: the new code should also be faster (in addition to more robust)

@strk
Copy link
Contributor

strk commented May 28, 2015

Even better, upgrade to treekit/treekit@efb8e37

@kdeloach
Copy link
Contributor Author

Happened again on May 31st 2015, 01:30:10.566.

@strk
Copy link
Contributor

strk commented Jun 2, 2015

Can you provide data to reproduce the problem ? I need the input road line and the trees data array.

@jwalgran
Copy link
Member

jwalgran commented Jun 2, 2015

I think this last report was from production. #1738 fixes the issue, but is still in review and not yet deployed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants