Zip #2012

lainsworth8801 · 2019-11-12T17:37:29Z

Issue #1480
Added leading zeros for postal codes under 5 digits;
Added leading zeros for postal codes with 4 digits extension;
Added test excel sheet with postal codes added (fake column name for user mapping purpose).
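For readers skimming the thread, the padding rule described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in seed/lib/mcm/mapper.py; the function name `pad_postal_code` is hypothetical.

```python
def pad_postal_code(value):
    """Left-pad a ZIP or ZIP+4 value with zeros (hypothetical helper)."""
    text = str(value)
    if '-' in text:
        # Split once so only the first hyphen separates code and extension
        code, ext = text.split('-', 1)
        return code.zfill(5) + '-' + ext.zfill(4)
    return text.zfill(5)


print(pad_postal_code('5005'))    # 05005
print(pad_postal_code('501-12'))  # 00501-0012
```

Values that already have 5 digits (and a 4-digit extension, if present) pass through unchanged.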

original uploaded:

Map Data	Property Saved

nllong · 2019-11-12T17:39:59Z

@lainsworth8801 -- thanks for the PR. Can you add the issue ID into the PR description? That way we know what issue to test. Thanks!

nllong

Looking good. This seems like it should work. A few changes:

Add tests
Update this branch with current develop

Then re-tag me to review.

Thanks!

coveralls · 2019-11-13T18:43:20Z

Coverage increased (+0.03%) to 75.724% when pulling 86af8b5 on zip into 321745c on develop.

nllong

Thanks for the updates! I made a small change to support the owner postal code as well.

Is a zip code of 0 supposed to stay 0 or resolve to 00000?

Also, note that the mapping screen does not pad the zip codes; however, I think this is right because the user is seeing the raw data from the spreadsheet. Once they choose to map to postal code then it will get converted.

Also, do you think we should go back and update all the zip codes in the database with a migration (and rehash)? I think this is mostly a question for @adrian-lara, @axelstudios and/or @ClearlyEnergy.

nllong · 2019-11-18T16:14:07Z

seed/data_importer/tests/integration/test_data_import.py

+                if '-' in ts.postal_code:
+                    self.assertEqual(len(ts.postal_code.split('-')[0]), 5)
+                    self.assertEqual(len(ts.postal_code.split('-')[1]), 4)
+                    self.assertEqual(ts.postal_code.split('-')[0].lstrip('0'),


Making these explicit checks would be nice (i.e., something like self.assertEqual(ts.postal_code.split('-')[0], '05005')).

I can change that. Was thinking about doing a general format check instead of spot checking.

As for whether a '0' zip should stay '0' or be packed with 0s: my reason for packing it is to show users that this is supposed to be a zip code whose value is 0. That said, I'm not sure this is better than leaving it as a single digit 0.

Yeah, I think we want this to be packed, that is 0 -> 00000. Right now it doesn't appear to do that.

nllong · 2019-11-18T16:14:53Z

seed/lib/mcm/mapper.py

        table_name, mapped_column_name, display_name, is_extra_data = mapping.get(raw_column_name)

+        # special postal case:
+        if mapped_column_name == 'postal_code':


looks good.

nllong · 2019-11-18T16:35:14Z

That would be nice. Thank you. It allows the developer to see what it is expecting instead of having to .split and .zfill.
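The two testing styles discussed here (a general format check versus explicit spot checks) might look something like the sketch below. The `IMPORTED` list is a made-up stand-in for values read back after an import; it is not taken from the PR's test spreadsheet.

```python
import re
import unittest

# Hypothetical values standing in for postal codes read back after import
IMPORTED = ['05005', '00501-0012', '12345-6789']

# Five digits, optionally followed by a hyphen and four digits
ZIP_FORMAT = re.compile(r'^\d{5}(-\d{4})?$')


class PostalCodeFormatTest(unittest.TestCase):
    def test_general_format(self):
        # Format check: every value has the padded shape
        for value in IMPORTED:
            self.assertRegex(value, ZIP_FORMAT)

    def test_explicit_values(self):
        # Spot check: the expected strings are visible in the test itself
        self.assertEqual(IMPORTED[0], '05005')
        self.assertEqual(IMPORTED[1].split('-'), ['00501', '0012'])
```

The format check covers every row, while the explicit check documents exactly which output is expected for a known input; the two can coexist.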

lainsworth8801 · 2019-11-18T16:36:36Z

seed/lib/mcm/mapper.py

+        if mapped_column_name in ['postal_code', 'owner_postal_code']:
            if column_value:
                if '-' in str(column_value):
                    postal = str(column_value).split('-')[0].zfill(5)


Thanks for catching the owner_postal_code.

@nllong I recalled wrong. The reason it's not packing a 0 zip with 0s is that I check column_value here before I convert it. Will change that.
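A small sketch of the pitfall described above, assuming the value arrives as the integer 0 (for example from a numeric spreadsheet cell). Both function names are hypothetical and only mirror the shape of the guard in the diff.

```python
def pad_if_truthy(column_value):
    # Mirrors the truthiness guard: integer 0 is falsy, so it is never padded
    if column_value:
        return str(column_value).zfill(5)
    return column_value


def pad_if_present(column_value):
    # Hypothetical alternative: skip only values that are actually missing
    if column_value is None or column_value == '':
        return column_value
    return str(column_value).zfill(5)


print(pad_if_truthy(0))   # 0
print(pad_if_present(0))  # 00000
```

Note that the string '0' is truthy, so only a numeric zero slips past the first guard.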

adrian-lara · 2019-11-18T17:50:27Z

Also, do you think we should go back and update all the zip codes in the database with a migration (and rehash)? I think this is mostly a question for @adrian-lara, @axelstudios and/or @ClearlyEnergy.

I think a migration to update zip codes on pre-existing records and recalculating hash_object would probably be needed.

lainsworth8801 · 2019-11-18T18:08:40Z

seed/lib/mcm/mapper.py

-                    ext = str(column_value).split('-')[1].zfill(4)
-                    column_value = postal + '-' + ext
-                column_value = str(column_value).zfill(5)
+            if '-' in str(column_value):


axelstudios · 2019-11-18T18:23:02Z

Yeah, I like the idea of a migration to fix existing data and then rehash

RDmitchell · 2019-11-18T19:05:35Z

@axelstudios -- I would like testing the migration on existing data to be something that is in the Test column of the Dec 19 Github project. Not sure how to do that, but if a PR comes through for that migration, I will add it to the Test column.

nllong · 2019-11-19T16:12:37Z

@lainsworth8801 -- can you add in a migration for this? See this as an example: https://github.com/SEED-platform/seed/blob/develop/seed/migrations/0111_rehash.py
Thanks!

nllong · 2019-12-06T18:09:03Z

@lainsworth8801 -- can you add in a migration for this? See this as an example: https://github.com/SEED-platform/seed/blob/develop/seed/migrations/0111_rehash.py
Thanks!

Hey @lainsworth8801 -- can you implement this rehashing migration before we merge this PR?

lainsworth8801 · 2019-12-06T18:19:56Z

@nllong yes sorry was gonna work on this next

nllong

@lainsworth8801 Thanks! Did you test running this migration on a dump of the existing production database?

lainsworth8801 · 2019-12-10T22:57:32Z

I only ran it on my local db, which is very small. I'll add a run on production.

nllong · 2019-12-10T23:01:02Z

I only ran it on my local db, which is very small. I'll add a run on production.

Thanks. I'm mainly concerned because you are updating the postal codes here, and I want to make sure that it works. What if the postal code is a number/string/double/etc.?
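One way to address the number/string/double concern is to coerce the value to text before padding. The helper below is a hedged sketch with a hypothetical name; it is not what the migration does, and it does not attempt to handle NaN or other non-numeric floats.

```python
def postal_to_text(value):
    """Coerce a stored postal code to text before padding (hypothetical helper)."""
    if value is None:
        return None
    if isinstance(value, float) and value.is_integer():
        # A numeric cell such as 5005.0 should read as '5005', not '5005.0'
        value = int(value)
    return str(value).strip()


for raw in (5005, 5005.0, ' 5005 ', '5005-12'):
    print(repr(raw), '->', repr(postal_to_text(raw)))
```

Once every value is a clean string, the same zero-padding logic can be applied regardless of how the value was originally stored.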

adrian-lara · 2019-12-11T22:46:28Z

I think a migration to update zip codes on pre-existing records and recalculating hash_object would probably be needed.

Backtracking on this previous comment after conversations with Lin and a closer look at the proposed migration: I don't think an additional rehash is needed after updating the postal_code values, since .save() is being run for each -State object. SEED already triggers an update of hashes when .save() is called.

@nllong @axelstudios What do you guys think?
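To illustrate the reasoning, here is a toy model in which save() recomputes a hash from the current field values. `ToyState` and its hashing scheme are assumptions made purely for illustration; SEED's actual hash_object calculation lives in its own model code and is not reproduced here.

```python
import hashlib


class ToyState:
    """Hypothetical stand-in for a -State record; not SEED's model code."""

    def __init__(self, postal_code):
        self.postal_code = postal_code
        self.hash_object = None

    def save(self):
        # Recompute the hash from the current field value on every save
        digest = hashlib.sha256(self.postal_code.encode('utf-8'))
        self.hash_object = digest.hexdigest()


state = ToyState('5005')
state.save()
old_hash = state.hash_object

state.postal_code = state.postal_code.zfill(5)
state.save()  # the hash now reflects '05005' with no separate rehash step
print(old_hash != state.hash_object)  # True
```

If saving already refreshes the hash, the migration only has to pad the values and save each affected record.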

nllong · 2019-12-11T23:50:09Z

I think you are right @adrian-lara. Since we are calling .save(), then we only need to find the records that need to be updated. I think we should run with SQL (or Django)... something like:

from django.db.models import Q
PropertyState.objects.filter(Q(postal_code__iregex=r'\b\d{4,5}-\d{1,4}\b') | Q(postal_code__iregex=r'\b\d{4,5}\b'))

I did not test the above command... but this would save iterating over all the records.
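As a rough, database-free sketch of the same idea, the pattern below picks out only the values that still need padding. It is written for Python's `re` module and uses `^`/`$` anchors rather than `\b`; Django passes regex lookups to the database backend, whose syntax can differ, so treat the pattern as an assumption to verify before using it in a queryset. The sample `codes` list is made up.

```python
import re

# Matches codes that still need zero-padding: fewer than 5 leading digits,
# or a 5-digit code whose extension has fewer than 4 digits
NEEDS_PADDING = re.compile(r'^(\d{1,4}(-\d{1,4})?|\d{5}-\d{1,3})$')

# Hypothetical sample of stored postal codes
codes = ['5005', '05005', '501-12', '12345-678', '12345-6789', 'N/A']

to_fix = [code for code in codes if NEEDS_PADDING.match(code)]
print(to_fix)  # ['5005', '501-12', '12345-678']
```

Filtering first keeps already-correct and non-numeric values out of the update loop, which is the saving mentioned above.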

adrian-lara · 2019-12-12T00:10:01Z

seed/migrations/0114_rehash_postal_code.py

+
+    property_sql = (
+        "UPDATE seed_propertystate " +
+        "SET created = seed_propertyauditlog.created, updated = seed_propertyauditlog.created " +


I think we'd only need to restore the updated field and not the created field here and in the tax lot query.

I actually think we don't need to SET either the created or the updated field in these... just the postal_code, right?

I updated my comment (meant to switch created and updated). I'd think we'd need to "revert" the updated field since changing postal_codes one at a time using save() will change the updated time for each record.

I guess I'm just assuming we wouldn't actually want the updated times to change, but I could definitely understand wanting those values to change.

Ah, that makes sense. I think it is okay if the updated times change as it really is a change.

adrian-lara · 2019-12-12T00:10:51Z

I think you are right @adrian-lara. Since we are calling .save(), then we only need to find the records that need to be updated. I think we should run with SQL (or Django)... something like:
from django.db.models import Q
PropertyState.objects.filter(Q(postal_code__iregex=r'\b\d{4,5}-\d{1,4}\b') | Q(postal_code__iregex=r'\b\d{4,5}\b'))
I did not test the above command... but this would save iterating over all the records.

Yeah something like that would help decrease the number of records getting updated.

We need to handle the rehash better. See comments in the PR.

nllong · 2019-12-14T03:12:12Z

@lainsworth8801 -- did you have a chance to update this rehash method to filter down to only the fields that are going to be updated? Let me know if you can get to it otherwise I'll work on it.

lainsworth8801 · 2019-12-14T03:59:00Z

@nllong I was trying to fix the Xcode version issue for timescales on Friday so I could work on this, but didn't get far. I'll be back on Monday if it can wait through the weekend. My apologies.

nllong · 2019-12-16T13:42:05Z

@lainsworth8801 I just updated the migration to fix the dependency error. I think this PR is really close. Can you update the migration to find only the records whose zip codes have 4 characters and update only those records? As Adrian points out, you only need to run .save() to have the hash updated.

Note that you probably will need to recreate your local database since I merged in some other db migrations.

lainsworth8801 · 2019-12-16T13:53:01Z

@nllong definitely. Thanks, Nick.

lainsworth8801 added 4 commits November 12, 2019 10:11

Added leading zeros to postal code for w/ and w/o extension cases.

065c7a1

Modify data properties excel with postal codes for testing

c5c2d0b

clean up

212bcbf

mapper cleanup

2283bf4

lainsworth8801 requested a review from nllong November 12, 2019 17:37

nllong requested changes Nov 12, 2019

lainsworth8801 added 2 commits November 12, 2019 14:02

Merge branch 'develop' of https://github.com/SEED-platform/seed into zip

691844b

Modify postal excel file and util.py for testing; added test cases fo…

4c5f944

…r property and taxlot postal code.

lainsworth8801 requested a review from nllong November 13, 2019 17:58

nllong previously approved these changes Nov 18, 2019

lainsworth8801 commented Nov 18, 2019

Added leading zeros to 0 zip; modified test to do explicit check

d2c39ec

lainsworth8801 force-pushed the zip branch from e3e3fe3 to d2c39ec Compare November 18, 2019 17:46

Added owner postal for filling zeros

a9bc1d7

lainsworth8801 commented Nov 18, 2019

lainsworth8801 requested a review from nllong November 18, 2019 18:09

nllong added the DO NOT MERGE label Dec 6, 2019

Db rehash with zero-packed zips

e970d47

lainsworth8801 force-pushed the zip branch from 99bcffa to 53c1583 Compare December 8, 2019 00:49

Merge branch 'develop' of https://github.com/SEED-platform/seed into zip

a8a52cd

lainsworth8801 force-pushed the zip branch from 53c1583 to a8a52cd Compare December 9, 2019 20:15

lainsworth8801 added 2 commits December 9, 2019 14:25

rerun travis

8da33ee

Merge branch 'develop' of https://github.com/SEED-platform/seed into zip

3f76064

nllong reviewed Dec 10, 2019

lainsworth8801 added 3 commits December 10, 2019 16:44

Merge branch 'develop' of https://github.com/SEED-platform/seed into zip

a751163

Minor tweak

1b7b06f

Merge branch 'develop' of https://github.com/SEED-platform/seed into zip

88993b3

lainsworth8801 force-pushed the zip branch from 7f1a8d3 to 88993b3 Compare December 11, 2019 17:13

adrian-lara reviewed Dec 12, 2019

Merge branch 'develop' into zip

1dd389c

Merge branch 'develop' into zip

024c3eb

fix migration order

5e8115f

fix migration to only find postal codes of length 4 to fix

1864338

nllong removed the DO NOT MERGE label Dec 16, 2019

Merge branch 'develop' into zip

86af8b5

nllong merged commit 1746131 into develop Dec 19, 2019

nllong deleted the zip branch December 19, 2019 18:34
