Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perform lat long matching respecting case #161

Merged
merged 1 commit into from
Jul 10, 2018
Merged

perform lat long matching respecting case #161

merged 1 commit into from
Jul 10, 2018

Conversation

jameshadfield
Copy link
Member

Previously, the geo demes (in the metatdata.tsv) were converted into lowercase for matching against the parsed lat_longs file. The demes were then saved to the meta.JSON in lowercase, which meant that they may not match the demes as saved on the tree nodes (which wern't converted to lowercase).

This PR parses the geographic resolution & demes from the lat_longs file in lowercase, and then matches the metadata.tsv resolution & demes in lowercase, however the demes are exported as they appeared in the metadata.tsv. This allows (e.g.) WA to be specified in the metadata.tsv, get the lat_longs from wa in the lat_longs file, but still be exported on both the nodes & meta JSONs as WA

Resolves #160

@trvrb
Copy link
Member

trvrb commented Jul 10, 2018

Thanks for fixing this @jameshadfield.

One request: I'd strongly encourage you to keep casing matched between metadata in the build, ie WA and casing provided in WNV/config/lat_longs.tsv:

state WA 47.572116 -120.602696

for reasons of clarity.

Also, I'd remove this change: https://github.com/nextstrain/augur/pull/161/files#diff-b5772e65b5a8175ae1a542a1c7ad0e38R249

Don't want this side-effect when reading in a file. You're already lowercase when doing the match, so this shouldn't matter. Basically, I'd want a policy of not mutating values that come in.

@jameshadfield
Copy link
Member Author

After speaking with @trvrb this PR has been amended to require exact case matching between the lat_long names and the metadata names

@jameshadfield jameshadfield changed the title perform lat long matching in lowercase perform lat long matching respecting case Jul 10, 2018
@trvrb
Copy link
Member

trvrb commented Jul 10, 2018

Thank you for humoring me here @jameshadfield.

@trvrb trvrb merged commit 60fce84 into master Jul 10, 2018
@trvrb trvrb deleted the 160 branch July 10, 2018 23:10
@trvrb
Copy link
Member

trvrb commented Jul 13, 2018

@rneher: I was worried that lowercase matching of metadata to lat/longs would result in people doing things like naming their metadata as Washington and their lat/long file as washington and have it work, but then when they include NorthDakota as metadata and north_dakota as lat/long it will not work for North Dakota mysteriously.

If we do go with lowercase matching, then I think the change should also apply to colors.

@rneher
Copy link
Member

rneher commented Jul 13, 2018

Given that we supply a lat-long file as source data, I would expect it to work with Mexico and mexico. I certainly was surprised when using metadata from VIPR complained about canonical country names not being found despite them being in the file. A more generous matching here would reduce barrier of entry in my opinion. same for colors.

@trvrb
Copy link
Member

trvrb commented Jul 13, 2018

@rneher: Okay. This makes sense. Let's do matching in lowercase, but preserve casing when exporting metadata to the tree. (I know this was your original proposal @jameshadfield, sorry about that)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants