Con: More contention; mutex also used for marking labels in use. Faster to add another?
Con: This puts t->clinched_segments.push_back(hs); within the lock. Possible things to do include:
Leave it there and accept the performance hit. Is there a net gain?
Create an array of bools to track insertion success, iterate it, then push? Prolly not; sounds almost as expensive as just pushing within the lock.
Allow redundant data to be added to t->clinched_segments, creating a performance hit when augmenting concurrencies. Either...
Iterate again after the lock, pushing everything
Have TravelerList::clinched_segments store pairs of waypoints. Push beginning & end once per call.
Pro: Less pushing, less reallocation.
Meh: Less iteration thru vector is replaced by more iteration from 1st to last segments.
Slightly more complex overall? For example, single-segment entries with 1st & last segments equal.
Con: Potential for redundant data. How much will this bog down concurrency augments?
Pro: Less stuff to pull in from memory.
For concurrency augments, multiple successive segments are likely to have the same route due to being added in series by Route::store_traveled_segments. We could...
Detect whether the route has changed since previous segment; unlock & lock accordingly.
Con: A little overhead in managing variables for that. Branching.
If using begin/end pairs as suggested above, optionally skip the check for route change.
Pro: We already have groups of segments from one route. Lose the overhead.
Con: There will often be multiple non-contiguous entries from same route; we lose whatever advantage that'd bring. Fewer checks necessary -- between groups of, rather than individual, segments.
Nope. The source route has all those successive segments. The target segment needs the lock. We don't know what our concurrent route is until iterating the concurrency list. Concurrencies of 3+ routes guarantee different successive routes. A previous route check may still benefit 2-plexes, but there's added complexity. Would it save time?
Or alternately, any of a number of more esoteric, incompletely-thought-out ideas to go mutex-free, involving 2D arrays, a separate threaded task for storing traveler with segments, etc. How about this?
For each ~~HighwaySystem~~ Region, clinched_segments = new std::vector<std::pair<HighwaySegment*,size_t>>[args::numthreads];
Convert TravelerList::clinched_segments to a TMBitset.
Nope. Not all stored in contiguous memory. Would have to do one of...
Use an unordered_set. Yecch.
Move all segments into contiguous memory after creation, then set up a TMSlice. Would the benefit be worth the cost?
LOL pointers to pointers. Admittedly a half-baked idea. How would I get these, let alone inexpensively?
Accept some redundancy. Would it be so bad? There's already redundancy in the .list files, and it's taken care of. Redundant segments comprise only 0.34% of segments .listed. Plus, this opens the door to using begin/end pairs.
Get threadnum into Route::store_traveled_segments, however I wanna do it.
When insertion succeeds, system->clinched_segments[threadnum].emplace_back(hs, index);
New threaded task iterates thru ~~HighwaySystems~~ Regions. For each,
Iterate through the vectors in clinched_segments, then thru the pairs in each vector. si.first->clinched_by.add_index(si.second);
delete[] clinched_segments;
Why regions?
For initial storage, a segment can be in only one region & thus only one thread. No mutex needed.
For concurrency augments, concurrencies across different systems are commonplace. In theory, a concurrency list should never span multiple regions outside of testing a region split. But it can happen: By accident, like that one Israel/Palestine route that one time. Or in theory, routes along a border like AB/SK or Texarkana. Sure, these cases have the border segments in just one region, but it could happen. For robustness' sake, best to still use a lock. So while we'll still have the overhead, this will minimize contention.
This all comes with its own overhead, and may or may not be faster than whatever the fastest option above is.
For siteupdateST, skip this extra step and do things more or less as done now.
Con: More mutex contention; multiple travelers could be trying to add the same route even if not the same segment.
No noticeable difference. With so many routes in the system, the odds of 2+ travelers trying to add the same one at the same time are already vanishingly small. Performs better.
Con: Indirection.
No, scratch that. Same amount.
Con: More contention; mutex also used for marking labels in use. Faster to add another?
Still not enough contention to degrade performance.
On the contrary, 1 mutex performs better than 2, whether it's smaller sizeof, just noise in the data, or whatever.
So let's keep the code simple & save RAM.
t->clinched_segments.push_back(hs); yadda yadda
Leaving as is & accepting "the performance hit" is already a big performance boost.
Plus, ConcAugThread & TravelerList::clinched_segments can be removed entirely.
See TravelMapping#151 & 0b6711d on BiggaTomato.
Deleted branch on BiggaTomato
y253 @ e7dad8c, more locks/unlocks, fewer ternaries
Benchmark. Does this perform better or worse?
Appears to perform ever-so-slightly worse. Plus, parent had cleaner code.
Undid a few things from parent, 90f08a1:
Removed 1 ternary each from beginning & ending chopped route
Restored 5 equivalent lines (incl. else block) to the "forward or backward?" block just above.
Plus mutexes: r1->mtx.lock(); r1->mark_label_in_use(fields[5]); r1->mtx.unlock(); etc.
Carved out of #221:

Eliminating HighwaySegment::clin_mtx would get sizeof(HighwaySegment) down to 72 B, with potential to get even smaller. This could potentially speed up multiple tasks. Calls to add_clinched_by could be removed, instead calling TMBitset::add_index directly and locking/unlocking Route::mtx before/after, however's most efficient.

Overall pros & cons:
Pro: Smaller sizeof(HighwaySegment).
Pro: Route::store_traveled_segments could handle all the segments it needs to with one lock/unlock per call.
See also TravelMapping#151.