Reduce attic query memory consumption #429

Open
mmd-osm opened this Issue Sep 3, 2017 · 1 comment

mmd-osm commented Sep 3, 2017

Test case

[date:"2017-01-01T00:00:00"][out:json][timeout:4000];
area["type"="boundary"]["ISO3166-2"="DE-NW"];
foreach(
    node(area)[~"^(opening_hours|opening_hours:kitchen|opening_hours:warm_kitchen|happy_hours|delivery_hours|opening_hours:delivery|lit|smoking_hours|collection_times|service_times|fee)$"~"."][fee!=no][fee!=yes][lit!=no][lit!=yes]->.t; .t out tags;
    way(area)[~"^(opening_hours|opening_hours:kitchen|opening_hours:warm_kitchen|happy_hours|delivery_hours|opening_hours:delivery|lit|smoking_hours|collection_times|service_times|fee)$"~"."][fee!=no][fee!=yes][lit!=no][lit!=yes]->.t; .t out tags;
);

Issue

The query uses too much memory and aborts.

Analysis

collect_items.h:

template < class Index, class Object, class Current_Iterator, class Attic_Iterator, class Predicate >
void collect_items_by_timestamp(const Statement* stmt, Resource_Manager& rman,
                   Current_Iterator current_begin, Current_Iterator current_end,
                   Attic_Iterator attic_begin, Attic_Iterator attic_end,
                   const Predicate& predicate, uint64 timestamp,
                   std::map< Index, std::vector< Object > >& result,
                   std::map< Index, std::vector< Attic< Object > > >& attic_result)
{
  std::vector< std::pair< typename Object::Id_Type, uint64 > > timestamp_by_id;

  reconstruct_items(stmt, rman, current_begin, current_end, predicate, result, timestamp_by_id, timestamp);
  reconstruct_items(stmt, rman, attic_begin, attic_end, predicate, attic_result, timestamp_by_id, timestamp);

  std::sort(timestamp_by_id.begin(), timestamp_by_id.end());

Line 205: reconstruct_items(stmt, rman, current_begin, current_end, predicate, result, timestamp_by_id, timestamp);

Result: timestamp_by_id is a std::vector of length 72374789

Line 206: reconstruct_items(stmt, rman, attic_begin, attic_end, predicate, attic_result, timestamp_by_id, timestamp);

Result: timestamp_by_id is a std::vector of length 75855450

Most entries in timestamp_by_id are stored with timestamp NOW. Splitting the entries into two separate vectors could save almost half of the memory:

  • Vector 1: timestamp_by_id_attic: all entries where timestamp != NOW, in the same format as today (object id plus timestamp)
  • Vector 2: timestamp_by_id_now: only entries where timestamp == NOW. Every object id in this vector implicitly has timestamp NOW, so we only have to store the object id and no timestamp.
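
The split described in the two bullets above could look roughly like the following sketch. It is not the code from collect_items.h: the name Split_Timestamps, its members, and the definition of NOW as the largest 64-bit value are assumptions made only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

// Assumption: NOW is modelled as the largest 64-bit value; the real constant may differ.
const std::uint64_t NOW = std::numeric_limits<std::uint64_t>::max();

// Hypothetical container, for illustration only.
template <class Id>
struct Split_Timestamps
{
  // Entries with timestamp != NOW, same layout as today's timestamp_by_id.
  std::vector<std::pair<Id, std::uint64_t> > attic;
  // Entries with timestamp == NOW: only the id is stored, the timestamp is implied.
  std::vector<Id> now;

  void add(Id id, std::uint64_t timestamp)
  {
    if (timestamp == NOW)
      now.push_back(id);
    else
      attic.push_back(std::make_pair(id, timestamp));
    sorted = false;
  }

  // Replaces the single std::sort over timestamp_by_id.
  void sort_all()
  {
    std::sort(attic.begin(), attic.end());
    std::sort(now.begin(), now.end());
    sorted = true;
  }

  // True if the id was recorded with timestamp NOW. Requires sort_all() first.
  bool is_now(Id id) const
  {
    assert(sorted);
    return std::binary_search(now.begin(), now.end(), id);
  }

  // Payload size of both vectors, ignoring allocator overhead and spare capacity.
  std::size_t payload_bytes() const
  {
    return attic.size() * sizeof(std::pair<Id, std::uint64_t>)
        + now.size() * sizeof(Id);
  }

  bool sorted = true;
};
```

After filling the container via add() and calling sort_all() once, is_now() answers the "does this id have timestamp NOW" question by binary search, and payload_bytes() can be compared against the size of the single-vector layout.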

Calculation:

(size(timestamp_by_id_now) * 8 bytes + size(timestamp_by_id_attic) * 16 bytes) / (size(timestamp_by_id) * 16 bytes)

(72374789 * 8 + (75855450 - 72374789) * 16) / (75855450 * 16)
= 0.5229427219797654

(This is based on 64-bit node ids; the percentage saved will be larger for way/relation ids.)
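
To re-run the calculation above with other entry counts or id widths, a small helper along these lines may be handy. The function name split_memory_ratio and its parameters are made up for this sketch; it only restates the formula and ignores allocator overhead and vector capacity.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: ratio of memory used by the two-vector layout
// to memory used by the current single vector of (id, timestamp) pairs.
//   now_count   - entries with timestamp == NOW
//   total_count - all entries in timestamp_by_id
//   id_bytes    - bytes per entry in the ids-only vector
//   pair_bytes  - bytes per (id, timestamp) entry
inline double split_memory_ratio(std::uint64_t now_count, std::uint64_t total_count,
                                 std::uint64_t id_bytes, std::uint64_t pair_bytes)
{
  assert(now_count <= total_count);
  assert(total_count > 0 && pair_bytes > 0);

  const std::uint64_t attic_count = total_count - now_count;
  const double split_bytes =
      static_cast<double>(now_count * id_bytes + attic_count * pair_bytes);
  const double single_bytes = static_cast<double>(total_count * pair_bytes);
  return split_bytes / single_bytes;
}
```

Calling split_memory_ratio(72374789, 75855450, 8, 16) reproduces the figure from the calculation; a smaller id_bytes models a narrower id type.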

Test case 2

[timeout:1200][adiff:"2017-08-06T22:39:23Z","2017-08-06T22:43:13Z"];
(node(36.3181693,5.5767073,47.8357181,18.9969694)(changed);
way(36.3181693,5.5767073,47.8357181,18.9969694)(changed););
out meta geom(36.3181693,5.5767073,47.8357181,18.9969694);

Current:

	User time (seconds): 126.50
	System time (seconds): 9.21
	Percent of CPU this job got: 85%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 2:38.45
	Maximum resident set size (kbytes): 8573240               <<<<<<
	Minor (reclaiming a frame) page faults: 7166159
	Page size (bytes): 4096
	Exit status: 0

Improved:

	User time (seconds): 110.13
	System time (seconds): 5.31
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 1:55.71
	Maximum resident set size (kbytes): 4379196               <<<<<<
	Minor (reclaiming a frame) page faults: 3409217
	Page size (bytes): 4096
	Exit status: 0

mmd-osm added a commit to mmd-osm/Overpass-API that referenced this issue Sep 5, 2017

mmd-osm Sep 30, 2017

Contributor

See mmd-osm@eac8cad for a test implementation.

(reminder: also #174 needs revisiting).