Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use geoname_id as the only lookup method to update records #144

Merged
merged 14 commits into from
Feb 22, 2017

Conversation

max-arnold
Copy link
Collaborator

This was suggested some time ago in the mailing list: https://groups.google.com/forum/#!topic/yourlabs/aYwK3UgbAd8

Changelog:

  • Geoname_id is the only lookup method during row update procedure (complex lookup logic is gone)
  • All fields are updated, including slugs (use --keep-slugs to prevent that). Previously, updates in geonames.org's source files weren't propagated to the database. Also fixed bizarre behaviour when alternate_names/search_names accumulated after each update
  • Geoname_id field is visible/searchable in Django admin
  • Several new tests for update procedure, to check that records are correctly updated or added, including --noinsert option. Import/update tests now inherit from the same base class.
  • New (failing) test to check that records are removed. The test is disabled, because removal is not yet implemented
  • Speed improvements

Unfortunately, this didn't fix #110 and the improvement in number of imported cities is very low. As is turned out, lots of cities are skipped due to database constraints unique_together = (('region', 'name'), ('region', 'slug')). Removal of these constraints is out of scope of this branch.

@codecov-io
Copy link

codecov-io commented Feb 20, 2017

Codecov Report

Merging #144 into master will increase coverage by 1.59%.
The diff coverage is 96.64%.

@@            Coverage Diff            @@
##           master    #144      +/-   ##
=========================================
+ Coverage    92.1%   93.7%   +1.59%     
=========================================
  Files          20      22       +2     
  Lines        1140    1191      +51     
=========================================
+ Hits         1050    1116      +66     
+ Misses         90      75      -15
Impacted Files Coverage Δ
cities_light/admin.py 85.71% <ø> (ø)
cities_light/tests/test_import.py 100% <100%> (ø)
cities_light/tests/test_fixtures.py 100% <100%> (ø)
cities_light/receivers.py 88.73% <100%> (+1.06%)
cities_light/tests/test_form.py 100% <100%> (ø)
cities_light/abstract_models.py 95.89% <100%> (-0.11%)
cities_light/settings.py 100% <100%> (ø)
cities_light/tests/base.py 100% <100%> (ø)
cities_light/tests/test_unicode.py 96% <100%> (-1.15%)
cities_light/tests/test_update.py 90% <90%> (ø)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1c4cc1d...e7f5ab6. Read the comment docs.

@max-arnold
Copy link
Collaborator Author

max-arnold commented Feb 21, 2017

@jpic If you have some time, review/approval would be nice.

@jpic
Copy link
Member

jpic commented Feb 21, 2017

Well, I've read it and you know the story: give a developer 10 lines of code it'll find 10 problems, give him 1000 lines and it'll say "no problem !". But if we take time I'd be happy to talk about it ;)

I've seen some patching feature, is this enabling you to patch geonames data before import like we discussed at europycon ?

Perhaps a good improvement to this PR could be done in the docs/ directory ?

@max-arnold
Copy link
Collaborator Author

max-arnold commented Feb 21, 2017

No, there is no patching yet. Just more robust update procedure to actually update the data. The only field which does not change during update is geoname_id. The main commit in this branch is 4448ee2

The docs have very minor update regarding new --keep-slugs option.

@max-arnold
Copy link
Collaborator Author

Patcher will be developed in a separate branch, but there is a long way to go: https://github.com/max-arnold/django-cities-light/commits/rewrite

@jpic
Copy link
Member

jpic commented Feb 21, 2017

Awesome, you've proven your dedication by supporting our userbase, you've got my approval to deploy this and move on, if you want me to actually try the code and make a proper line by line review I can try this week end.

@max-arnold
Copy link
Collaborator Author

Ok, I'll merge it to the master tomorrow. Also I think it will be wise to wait some time before making a new release, giving people a chance to spot possible problems in the master branch.

@max-arnold max-arnold merged commit 01d13a5 into yourlabs:master Feb 22, 2017
@max-arnold max-arnold deleted the update-by-geoname-id branch April 27, 2017 09:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Not importing all cities
4 participants