Region mapping advanced matching #924

Merged
merged 105 commits into from Oct 15, 2015

@stevage
Collaborator

stevage commented Sep 21, 2015

Significant refactoring of the region mapping code. Major benefits:

  • Region mapping code split out from CsvCatalogItem into RegionProvider, RegionProviderList, DataVariable, DataTable and TableDataSource
  • Added functionality: matching by regex (enabling matching of LGA names) and disambiguation using a second column
  • Dozens of new unit tests
  • Fixed many edge cases, particularly with drag-and-drop CSV files (which lack explicit styling)
  • Temporal datasets work much better: by default, each data point is shown until the next time point
  • New architecture will make it much easier to add new region provider types
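
To illustrate the regex matching and second-column disambiguation described above, here is a minimal sketch. It is not the code from this PR: the rule list, the `regions` records, and the `normalise` and `findRegion` helpers are all hypothetical names chosen for the example, and the real RegionProvider may work quite differently.

```javascript
// Hypothetical replacement rules: each pair is [pattern, replacement].
// These are illustrative, not the rules shipped in the PR.
const replacements = [
  [/^city of\s+(.*)$/, "$1"], // "city of blah" -> "blah"
  [/\s*\(c\)$/, ""],          // "blah (c)"     -> "blah"
];

// Normalise a name: lowercase, trim, then apply each rule in order.
function normalise(name) {
  let result = String(name).toLowerCase().trim();
  for (const [pattern, replacement] of replacements) {
    result = result.replace(pattern, replacement);
  }
  return result;
}

// Hypothetical region records: an LGA name plus the state it belongs to.
const regions = [
  { id: 1, name: "Blah (C)", state: "VIC" },
  { id: 2, name: "Blah (C)", state: "NSW" },
  { id: 3, name: "Other (C)", state: "VIC" },
];

// Find the region matching a CSV row. The optional second column (state)
// is only consulted when the name alone matches more than one region.
function findRegion(rowName, rowState) {
  const key = normalise(rowName);
  const candidates = regions.filter((r) => normalise(r.name) === key);
  if (candidates.length <= 1 || rowState === undefined) {
    return candidates[0];
  }
  return candidates.find((r) => r.state === rowState);
}
```

With these assumed records, `findRegion("City of Blah", "NSW")` resolves the ambiguous name to the NSW region, while `findRegion("City of Other")` needs no second column at all.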
stevage added 30 commits Jul 1, 2015
discard CSV if no rows match.
LGAs using this method though.
- fuzzy matches ('City of Blah' = 'Blah (C)'), through regexes
- multi-column matches (State+LGA for disambiguation)
Conflicts:
  lib/Models/CsvCatalogItem.js
  lib/Styles/PopupMessage.less
issue with "lone_person" column classification as longitude.
@stevage
Collaborator Author

stevage commented Sep 29, 2015

Hmm I'd argue we shouldn't even be trying to serialize RegionProvider and RegionProviderList for sharing. Is there a public property somewhere that shouldn't be public?

Fixed.

stevage added 6 commits Sep 29, 2015
@kring
Member

kring commented Sep 30, 2015

There are still performance problems here. Enable the Age layer under ABS, and then switch to SA2. On my system the app freezes for about 5 seconds. On nationalmap.gov.au there is no freezing.

@kring
Member

kring commented Sep 30, 2015

Here's where the time is spent:
[image not preserved]

@kring
Member

kring commented Oct 1, 2015

Some ideas for making codeMatchesRegionID much faster:

  • Don't do trim, toLowerCase, etc. any more than necessary. Those operations add up because they make a copy of the string.
  • When IDs are numeric (as they are in the SAx case), don't convert them to strings. Comparing numbers is much faster than comparing strings.
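
A rough sketch of both ideas, assuming the codes arrive as a mix of strings and numbers. The function names `normaliseCode` and `matchRows` are hypothetical and do not come from the PR; the point is only that each value is normalised once, and numbers are never turned into strings.

```javascript
// Normalise a code once. Numbers are returned untouched so they can be
// compared as numbers; only strings pay for trim/toLowerCase.
function normaliseCode(code) {
  return typeof code === "number" ? code : code.trim().toLowerCase();
}

// Normalise both lists up front (one pass each), instead of inside the
// comparison that runs for every row/region pair.
function matchRows(rowCodes, regionIds) {
  const rows = rowCodes.map(normaliseCode);
  const ids = regionIds.map(normaliseCode);
  // indexOf uses strict equality, so numbers are compared as numbers.
  return rows.map((code) => ids.indexOf(code)); // -1 when nothing matches
}
```

For example, `matchRows([" ABC ", 201], ["abc", 201, "xyz"])` matches the first row to index 0 and the second to index 1.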
@kring
Member

kring commented Oct 1, 2015

And one more:

  • Construct RegExp instances once. Currently a new instance is constructed for each regex string for every attempted match. Unless the browser is being clever, this should result in a huge speedup. Even if the browser is clever, it'll still be noticeably faster.

This one won't help with the ABS case, though, because there are no replacements.
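
As a sketch of the "construct once" idea, assuming the rules arrive as `[regexSource, replacement]` string pairs (the `rules` format and the `applyRules` helper are hypothetical, not taken from the PR):

```javascript
// Hypothetical rule format: [regexSource, replacement] string pairs,
// as they might arrive from a JSON config.
const rules = [
  ["^city of\\s+", ""],
  ["\\s+\\(c\\)$", ""],
];

// Compile each pattern exactly once, outside any matching loop.
const compiled = rules.map(([source, replacement]) => ({
  regex: new RegExp(source, "i"),
  replacement,
}));

// Reuse the precompiled RegExp objects for every name that is checked.
function applyRules(name) {
  return compiled.reduce(
    (s, rule) => s.replace(rule.regex, rule.replacement),
    name
  );
}
```

The patterns deliberately omit the `g` flag, so the shared RegExp objects carry no `lastIndex` state between calls.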

stevage added 7 commits Oct 1, 2015
…nced-matching

Conflicts:
	lib/Map/DataTable.js
* cesiumUpgrade:
  Removed unused var.
  Make handleInitialMessage() actually call the callback if no message is to be shown.
  Update CHANGES.md.
  Use terriajs-cesium 1.13.0.
@stevage
Collaborator Author

stevage commented Oct 5, 2015

> There are still performance problems here. Enable the Age layer under ABS, and then switch to SA2. On my system the app freezes for about 5 seconds.

This is intriguing to me - I don't get this on my MacBook Pro. I enable the layer, and immediately grab the map and start panning around. It takes maybe 2-3 seconds for all the SA2s to display, but the app stays responsive and I can pan during that time. There are two brief glitches (<0.5 seconds) when you could say it's "frozen", but nothing like what you're seeing.

I'll still try to fix it. :)

@stevage
Collaborator Author

stevage commented Oct 6, 2015

Ok, the really inefficient bit was applying all the replacements to each region code every time it was checked against each server-side ID. That was dumb.
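
One way to avoid that repeated work, sketched under assumptions: apply the replacements to every server-side ID once, store the results in a lookup, and then transform each row code a single time. This is not necessarily how the PR fixed it; `buildIndex`, `matchCodes`, and the `applyReplacements` callback are hypothetical names.

```javascript
// Build the lookup once: apply replacements to each server-side ID a
// single time and remember its index.
function buildIndex(serverIds, applyReplacements) {
  const index = new Map();
  serverIds.forEach((id, i) => {
    const key = applyReplacements(id);
    if (!index.has(key)) index.set(key, i); // keep the first on collision
  });
  return index;
}

// Each row code is transformed once, then resolved with one Map lookup.
function matchCodes(rowCodes, index, applyReplacements) {
  return rowCodes.map((code) => {
    const hit = index.get(applyReplacements(code));
    return hit === undefined ? -1 : hit;
  });
}
```

With n row codes and m server-side IDs, the replacements run n + m times rather than once per pair.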

Currently on my MacBook, the whole call to updateRegionMapping takes 150ms for the ABS SA2 case. And the whole block of loadRegionsFromXML + updateRegionMapping + finishTableLoad (everything under loadWithXhr.load.xhr.onload) takes 215ms.

The equivalent running on a fresh master build on my machine is actually slower: 250ms. Significantly faster running on nationalmap.research.nicta.com.au, which suggests that the gulp release process is actually doing something good :)

@stevage
Collaborator Author

stevage commented Oct 6, 2015

> Don't do trim, toLowerCase, etc. any more than necessary. Those operations add up because they make a copy of the string.

Removed a few of these.

> When IDs are numeric (as they are in the SAx case), don't convert them to strings. Comparing numbers is much faster than comparing strings.

"Numeric" IDs come out of xml2json as strings. It's possible to convert them there to integers, but there are messy edge cases (leading zeroes) to deal with. Leaving this out for now.
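
To make the leading-zero edge case concrete, here is a small sketch of a conservative conversion that only turns a string into a number when nothing is lost. The helper name `toNumberIfSafe` is hypothetical; this is an illustration of the problem, not something the PR implements.

```javascript
// Convert a string ID to a number only when the conversion is lossless,
// i.e. converting back yields the identical string.
function toNumberIfSafe(id) {
  const n = Number(id);
  return Number.isSafeInteger(n) && String(n) === id ? n : id;
}
```

Under this rule `"201011001"` becomes a number, while `"0123"` stays a string because converting it would silently drop the leading zero. A column would then mix numbers and strings, which is part of why the conversion is messy.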

> Construct RegExp instances once.

Done.

Thanks very much for these tips btw, I'm definitely learning a lot about writing less sucky JavaScript :)

kring added 2 commits Oct 15, 2015
kring added a commit that referenced this pull request Oct 15, 2015
Region mapping advanced matching
@kring kring merged commit a63619c into master Oct 15, 2015
2 of 4 checks passed
continuous-integration/travis-ci/pr The Travis CI build is in progress
continuous-integration/travis-ci/push The Travis CI build is in progress
clahub All contributors have signed the Contributor License Agreement.
licence/cla Contributor License Agreement is signed.
@kring kring deleted the region-mapping-advanced-matching branch Oct 15, 2015
2 participants