Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CGImap release 0.7.0 #293

Closed
mmd-osm opened this issue Mar 28, 2019 · 39 comments

Comments

Projects
None yet
5 participants
@mmd-osm
Copy link

commented Mar 28, 2019

@mmd-osm

This comment has been minimized.

Copy link
Author

commented Apr 27, 2019

Adding to my previous post, here's a bit more detail as to what's included in this version of CGImap:

(1) Changeset metadata download to include number of changes

(was already part of CGImap 0.6.2)

Relevant issue: openstreetmap/openstreetmap-website#1947

Comparison:

(2) Fix confusing error message for /map download

Issue: zerebubuth/openstreetmap-cgimap#172

Comparison:

(3) JSON output

--> again disabled (no changes)

(4) Changeset upload

A quick recap of what I've written in #236 (comment)

One deployment option is to create a dedicated user with restricted rights, and use this one instead of the normal user for any database update operations. This way you could even configure all reads to go to the replication db, while all writes are sent to the main db.

  "CGIMAP_UPDATE_HOST" => "127.0.0.1",
  "CGIMAP_UPDATE_DBNAME" => "openstreetmap",
  "CGIMAP_UPDATE_USERNAME" => "api",
  "CGIMAP_UPDATE_PASSWORD" => "pass"

Outlook

Deployment wise, I would suggest the same launchpad PPA copy operation as we did for 0.6.1 last year.

(1) - (2) should be good to go.

For the changeset upload (4), an A/B deployment based on IP address ranges might be an option, where x % of the people would still use the Rails based code, and the remaining y % the new code. I leave it to the sysops to decide on a suitable deployment approach. In case of questions, etc. I'm happy to help where I can.

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 22, 2019

@tomhughes : it's been a while since I published the PPA on launchpad, and I know there's always lack of time and other priorities, so that's totally ok. If there are some open questions I should be aware of, just let me know. Thanks!

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 23, 2019

This is now live for all existing request types.

I need to think about the best way to set things up for write access before turning on uploads...

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 23, 2019

In terms of the permissions listed in #236 (comment) does it not also need read access to the user table and the oauth tables?

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 24, 2019

Thanks for the update!

Re permissions:

So bottom line, we don't need additional permissions for the user and oauth tables for a dedicated "update user", as the respective queries are sent via a different db connection with different host/user/password settings.

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 24, 2019

Sure. What I was thinking though was to create one user for everything cgimap needs to do (currently it uses the same user as the rails code which has select, insert, update, delete on everything) rather then trying to break it down into the separate roles.

Currently we only set CGIMAP_OAUTH_HOST so that it has a writable connection.

BTW does this mean cgimap will be opening an extra database connection? I guess it does so I should check how we're doing on the limits...

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 24, 2019

I think for a dedicated user covering everything needed by cgimap, we need at least the following additional permissions:

SELECT:

  • user_blocks
  • users
  • oauth_tokens
  • client_applications
  • user_roles

SELECT, INSERT:

  • oauth_nonces

This list is based on queries in:

BTW does this mean cgimap will be opening an extra database connection? I guess it does so I should check how we're doing on the limits...

Yes, that's correct. On top of the two existing database connections, there will be a third one solely used for the changeset update. I hope that's not a major issue in terms of overall number of database connections.

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 24, 2019

I’m seeing a constant number of http 401 errors since this morning, which is somewhat unexpected. If this is triggered by the same ip, wouldn’t that be blocked by fail2ban? Last thing we want is people brute forcing accounts.

https://munin.openstreetmap.org/openstreetmap.org/thorn-01.openstreetmap.org/api_calls_status.html

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 24, 2019

Huh? We don't use fail2ban on the api?

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 24, 2019

I can’t see where those requests are coming from and more importantly what endpoint they’re trying to call.

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 24, 2019

There are actually two separate sources of 401s - the early one which was a bit higher was trying to scrape ways but with incorrect credentials and then the second one is scraping the map endpoint again with incorrect credentials.

Both are almost certainly just misconfigurations.

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 24, 2019

Ok, that’s good. I guess they show up now because of Basic auth being handled by cgimap as well.

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 25, 2019

For the record SELECT permission is also needed on changeset_tags and changeset_comments.

I guess we don't need anything else on those even with upload as those aren't changed by the diff upload?

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 25, 2019

Ah, yes. That’s what’s happening if you only look at the code rather than testing cgimap with the respective user permissions.

Changeset tags and comments aren’t needed for the diff upload.
If we decide one day to also create new changesets via cgimap (PUT /api/0.6/changeset/create), those permissions would have to be added then.

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 25, 2019

Interestingly it did't return any HTTP error or log anything in the cgimap log by the looks of it but the XML response to thinks like /api/0.6/node/NNN had the permission denied message embedded in it.

So the API error graphs didn't show any change and I only discovered it when somebody complained on IRC that browsing a node on the web site didn't show the node on the map.

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 25, 2019

Hmm, that’s not too good. I never ran any tests with one cgimap role for everything, though my expectation was that this would show up as an error when starting up cgimap.
I think it would be good to have an issue for this on the cgimap repo for further investigation.

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 25, 2019

So I've routed my IP address to cgimap for uploads and it is failing with the following in the cgimap log:

[2019-05-25T14:40:03 #15126] Started request for changeset/upload 70612621 from 2001:8b0:bd:1:1881:14ff:fe46:3cc7
[2019-05-25T14:40:03 #15126] Executed prepared statement lock_current_ways in 1 ms, returning 2 rows, 2 affected rows
[2019-05-25T14:40:03 #15126] Executed prepared statement lock_future_nodes_in_ways in 5 ms, returning 13 rows, 13 affected rows
[2019-05-25T14:40:03 #15126] Executed prepared statement check_current_way_versions in 0 ms, returning 0 rows, 0 affected rows
[2019-05-25T14:40:03 #15126] Executed prepared statement calc_way_bbox in 6 ms, returning 1 rows, 1 affected rows
[2019-05-25T14:40:03 #15126] Executed prepared statement delete_current_way_tags in 2 ms, returning 0 rows, 14 affected rows
[2019-05-25T14:40:03 #15126] Executed prepared statement delete_current_way_nodes in 1 ms, returning 0 rows, 15 affected rows
[2019-05-25T14:40:03 #15126] Executed prepared statement update_current_ways in 1 ms, returning 2 rows, 2 affected rows
[2019-05-25T14:40:03 #15126] Executed prepared statement insert_new_current_way_tags in 1 ms, returning 0 rows, 24 affected rows
[2019-05-25T14:40:03 #15126] Executed prepared statement insert_new_current_way_nodes in 1 ms, returning 0 rows, 15 affected rows
[2019-05-25T14:40:03 #15126] Returning with http error 500 with reason ERROR:  column "timestamp" is of type timestamp without time zone but expression is of type bigint

Curiously what apache returns to my browser is an apache generated 503 rather than the 500 cgimap claims to have returned.

The full error from the postgres log is:

2019-05-25 14:36:42 GMT ERROR:  column "timestamp" is of type timestamp without time zone but expression is of type bigint at character 61
2019-05-25 14:36:42 GMT HINT:  You will need to rewrite or cast the expression.
2019-05-25 14:36:42 GMT STATEMENT:
               INSERT INTO ways
                ( SELECT id AS way_id, changeset_id, timestamp, version, visible
                  FROM current_ways
                  WHERE id = ANY($1)
                )
@tomhughes

This comment has been minimized.

Copy link
Member

commented May 25, 2019

Ah, that SQL is relying on the column ordering in the table which is very dangerous. The production database appears to have the ways table ordered differently:

                        Table "public.ways"
    Column    |            Type             |       Modifiers       
--------------+-----------------------------+-----------------------
 way_id       | bigint                      | not null default 0
 timestamp    | timestamp without time zone | not null
 version      | bigint                      | not null
 visible      | boolean                     | not null default true
 changeset_id | bigint                      | not null
 redaction_id | integer                     | 
@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 25, 2019

Thanks a lot for digging through the log files and providing a PR for this issue! It's really weird that I wasn't using explicit column names consistently for all INSERTs.

I already merged both PRs and will release 0.7.1 shortly.

@GerdP

This comment has been minimized.

Copy link

commented May 26, 2019

No idea if this is the right place to mention it: Something is broken, the API returns wrong results for
History: GET /api/0.6/[node|way|relation]/#id/history
See https://forum.openstreetmap.org/viewtopic.php?id=66347
The page https://www.openstreetmap.org/way/158601617/history still shows reasonable data.
The trigger for my research was this issue https://josm.openstreetmap.de/ticket/17760

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 26, 2019

@GerdP : like the JOSM ticket suggests, I could only reproduce this on way/NNN/history. It doesn't occur on nodes and relations.

@GerdP

This comment has been minimized.

Copy link

commented May 26, 2019

Dunno about relations or ways. The result looks like the tags are a join of all versions. For the given way the repated tag description appears 12 times for each version, and there are 12 versions.

@GerdP

This comment has been minimized.

Copy link

commented May 26, 2019

Yes, seems that only ways are affected. My guess: A Join is missing the version number

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 26, 2019

We switched from read only to write backend yesterday and this seems to have uncovered this old bug...

@GerdP

This comment has been minimized.

Copy link

commented May 26, 2019

Not sure if this is related:
https://www.openstreetmap.org/api/0.6/node/324302111/history
shows the versions in a strange order:
<?xml version="1.0" encoding="UTF-8"?> <osm version="0.6" generator="CGImap 0.7.1 (20026 thorn-01.openstreetmap.org)" copyright="OpenStreetMap and contributors" attribution="http://www.openstreetmap.org/copyright" license="http://opendatacommons.org/licenses/odbl/1-0/"> <node id="324302111" visible="true" version="1" changeset="703578" timestamp="2009-01-01T16:29:18Z" user="hoermen" uid="65871" lat="48.0045842" lon="8.1161105"> <tag k="created_by" v="Merkaartor 0.13"/> </node> <node id="324302111" visible="true" version="3" changeset="21013588" timestamp="2014-03-09T20:26:40Z" user="EdAllen" uid="559935" lat="48.0046174" lon="8.1160160"/> <node id="324302111" visible="true" version="2" changeset="845425" timestamp="2009-01-27T02:43:00Z" user="hoermen" uid="65871" lat="48.0045871" lon="8.1160952"> <tag k="created_by" v="Merkaartor 0.13"/> </node> </osm>

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 26, 2019

Please if you have a problem open a new ticket don't just post a comment on an existing ticket as it makes it impossible to determine what problems are dealt with and what is still an issue.

Also please provide full details, don't just say "returns wrong results for History" and expect us to somehow guess what exactly is wrong about them.

So please, new issues (probably at https://github.com/zerebubuth/openstreetmap-cgimap/issues) for both the relation history issue and the node history version ordering.

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 26, 2019

I wonder if we should switch back to the read only backend for the time being until those history issues are fixed.

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 26, 2019

If you think that is the cause then sure.

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 26, 2019

Done - seems to have fixed the version ordering at least.

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 26, 2019

Thanks! Way history looks sane again, too: https://www.openstreetmap.org/api/0.6/way/158601617/history

@GerdP

This comment has been minimized.

Copy link

commented May 26, 2019

So please, new issues (probably at https://github.com/zerebubuth/openstreetmap-cgimap/issues) for both the relation history issue and the node history version ordering.
Done. If I got that right they should stay open until the problem is fixed in the write backend?

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 26, 2019

If I got that right they should stay open until the problem is fixed in the write backend?

Thanks for opening the issues. Yes, please leave them open, those issues in write backend need to be fixed, even though we're back on the read only backend at the moment.

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 26, 2019

Originally, I wanted to stay with the read-only backend, and have the separate database connection for the update only (=write enabled). The other thing I had in mind was that from time to time, we would have to set the database in read only mode for maintenance, requiring all write activities from cgimap to be disabled.

I think, there's currently no suitable command line parameter to distinguish between
(a) backend end in read only mode, changeset update enabled
(b) everything has to be read only, changeset update disabled.

One option could be to introduce an additional parameter, which is different from readonly, to disable the changeset update for maintenance purposes.

Today, the readonly parameter works as described in the command line help, meaning that the changeset update would be blocked in readonly mode:

--readonly                use the database in read-only mode
@tomhughes

This comment has been minimized.

Copy link
Member

commented May 26, 2019

The readonly mode was specifically designed for use when we are running against a replica and can't even create temporary tables - it was an accident that we were still using it when we didn't need to and as I understand it using the original backend should give better performance.

@don-vip

This comment has been minimized.

Copy link

commented May 26, 2019

So please, new issues (probably at https://github.com/zerebubuth/openstreetmap-cgimap/issues)

It's not always easy to know where to create issues, especially for components that are not hosted under @openstreetmap organization. I was about to create a new issue at https://github.com/openstreetmap/openstreetmap-website if Gerd hadn't linked this ticket to the JOSM bug tracker.

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 27, 2019

@don-vip : in any case, please do report it on OSM Github repos, even it's the wrong one. I've just seen your tweet https://twitter.com/VincentPrivat/status/1132600691850194944 today, which is like 2 days too late.

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 27, 2019

@mmd-osm are you likely to do a new build with the latest fixes soon?

@mmd-osm

This comment has been minimized.

Copy link
Author

commented May 27, 2019

Yes, new release 0.7.2 is out since a few minutes.

Now that the API write feature is no longer disabled when using the readonly backend, there's no immediate need to switch to the write backend at the same time. I think that's better as it avoids too many changes at the same time.

I was thinking to do a more thorough review of the write backend queries, but due to this change, we could do this as a separate activity.

@tomhughes

This comment has been minimized.

Copy link
Member

commented May 28, 2019

This now fully deployed and all uploads are now being processed with cgimap.

@tomhughes tomhughes closed this May 28, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.