Special postal codes not handled #4

Closed
kinow opened this issue Dec 1, 2020 · 8 comments
Labels
enhancement New feature or request

Comments

@kinow

kinow commented Dec 1, 2020

Hi,

I saw that version 0.2.0 was out and that it had migrated from JSON to SQLite. I've never used a Python library that does that (not that I'm aware of), nor packaged one, so I decided to try it and see whether it worked, whether it'd be slow, etc.

Installation was super smooth 👍 no issues found.

Then decided to test with a random address. Picked Tamana/Kumamoto (my distant family hometown), then googled a random address, and found this website: https://www.town.nagasu.lg.jp/default.html

The footer of the page contains: " 〒869-0198 熊本県玉名郡長洲町大字長洲2766番地 Tel:0968-78-3111 Fax:0968-78-1092"

I got the postal code, and tried the following code:

>>> import posuto
>>> posuto.get('〒869-0198')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/venv/lib/python3.8/site-packages/posuto/posuto.py", line 52, in get
    base = dict(_fetch_code(code))
  File "/tmp/venv/lib/python3.8/site-packages/posuto/posuto.py", line 21, in _fetch_code
    raise KeyError("No such postal code: " + code)
KeyError: 'No such postal code: 8690198'
>>> 

Searching the same postal code on Google.co.jp returns the right location on the map.

[Image: Untitled]

Not sure how to provide a pull request, but thought it could be useful to report this missing postal code?

Anyway, great library, and nice trick of including an sqlite DB, might come in handy some day.

Thanks!
Bruno

@polm
Owner

polm commented Dec 1, 2020

Glad you had no trouble with the library except for the missing code!

This had me puzzled for a while, but it turns out this is a special postal code, known as a 大口事業所個別番号 (an individual code assigned to a large-volume office), so it seems it's only used for that one building. You can read more about these codes here:

https://www.post.japanpost.jp/zipcode/dl/jigyosyo/readme.html

If you use the general JP Post postal code search you'll see there's no result.

https://www.post.japanpost.jp/cgi-zip/zipcode.php?zip=869-0198

These special postal codes are provided in a separate CSV file that I haven't added to posuto. I guess I should work on doing that...
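
To picture the gap described here, below is a minimal sketch, not posuto's actual code. The traceback earlier in the thread reports the key as 8690198, so the 〒 mark and hyphen are evidently stripped before the lookup; the sketch does the same and then consults two tables in turn. The names `normalize`, `lookup`, `REGULAR_CODES` and `OFFICE_CODES` are hypothetical, and the two dictionaries are toy stand-ins for the regular data and the separate office CSV.

```python
import re
import unicodedata

# Toy stand-ins for the two Japan Post data sets; not posuto's real storage.
REGULAR_CODES = {"1000001": "東京都千代田区千代田"}
OFFICE_CODES = {"8690198": "長洲町役場"}


def normalize(raw: str) -> str:
    """Reduce input such as '〒869-0198' to seven ASCII digits."""
    # NFKC folds full-width digits (e.g. '８６９') into ASCII digits.
    folded = unicodedata.normalize("NFKC", raw)
    # Drop everything else: the 〒 mark, hyphens, spaces.
    digits = re.sub(r"[^0-9]", "", folded)
    if len(digits) != 7:
        raise ValueError(f"expected 7 digits, got {digits!r}")
    return digits


def lookup(raw: str) -> str:
    """Try the regular table first, then the office table."""
    code = normalize(raw)
    for table in (REGULAR_CODES, OFFICE_CODES):
        if code in table:
            return table[code]
    raise KeyError("No such postal code: " + code)


print(lookup("〒869-0198"))  # 長洲町役場
```

With only `REGULAR_CODES` consulted, the same call would raise the `KeyError` shown in the original report.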

@kinow
Author

kinow commented Dec 1, 2020

Ah, makes sense. I searched for the Tokyo Skytree (〒131-8634) and it also didn't return anything (posuto or Japan Post). #TodayILearned.

@kristate

Yes, it would be great if you could parse jigyosyo.csv [0] and add this information to the library.

[0] https://www.post.japanpost.jp/zipcode/dl/jigyosyo/readme.html

@polm polm added the enhancement label Jan 29, 2021
@polm
Owner

polm commented Jan 29, 2021

Did not work on this this month, I'll look at it again next month.

@polm polm changed the title from "Address not found in Nagasu town, Tamana District, Kumamoto" to "Special postal codes not handled" Feb 8, 2021
polm added a commit that referenced this issue Feb 26, 2021
This could use more testing, but it adds initial support for the office
codes (大口事業所の個別番号).

The data attached to these codes has little in common with normal
postal codes, so it's saved using a different data structure.
@polm
Owner

polm commented Feb 26, 2021

OK, I think this is working in the latest release, v0.4.0.

It turns out the data for these codes is different enough that converting it into the same format as normal postal codes doesn't make sense, so I just return them with a completely different structure. The JSON data is saved in a separate file.

Some other things about these codes:

  • It's possible for the same company to have up to three codes at the same address. The meaning of this is unclear.
  • It's common for multiple companies to share the same postal code (in which case they always have the same physical address).
  • Because companies can request that their code be unlisted, posuto cannot tell for sure whether a code is valid.
  • The JP Post page isn't explicit, but the JIS codes are presumably from jisx0402.

I have used the term "company" above, but technically the organizations that get codes can be government offices or other organizations.
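
As a rough illustration of why these records get their own structure, here is a minimal sketch that reads office rows into a named tuple and groups them by postal code, so several organizations can sit under one code. It is not posuto's implementation. The column order is an assumption taken from the `OfficeCode` fields that posuto prints; `OfficeRow`, `load_offices` and the three `*_flag` names are hypothetical; and the second sample row is a made-up organization added only to show a shared code.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, NamedTuple


class OfficeRow(NamedTuple):
    # Field order assumed from the OfficeCode output, not from a spec.
    jis: str
    kana: str
    name: str
    prefecture: str
    city: str
    neighborhood: str
    banchi: str
    postal_code: str
    old_code: str
    post_office: str
    type_flag: str      # raw flag, left uninterpreted here
    multiple_flag: str  # raw flag, left uninterpreted here
    change_flag: str    # raw flag, left uninterpreted here


# Row 1 uses the values from this thread; row 2 is fictional.
SAMPLE = (
    '43368,"ナガスマチヤクバ","長洲町役場","熊本県","玉名郡長洲町",'
    '"大字長洲","2766","8690198","86901","長洲",0,0,0\n'
    '43368,"カクウノダンタイ","架空の団体","熊本県","玉名郡長洲町",'
    '"大字長洲","2766","8690198","86901","長洲",0,0,0\n'
)


def load_offices(text: str) -> Dict[str, List[OfficeRow]]:
    """Group office rows by postal code; one code may map to many rows."""
    by_code = defaultdict(list)
    for fields in csv.reader(io.StringIO(text)):
        row = OfficeRow(*fields)
        by_code[row.postal_code].append(row)
    return dict(by_code)


offices = load_offices(SAMPLE)
print([row.name for row in offices["8690198"]])
# ['長洲町役場', '架空の団体']
```

Because codes can be unlisted, a miss such as `offices.get("0000000")` returning `None` is better read as "not in the published file" than as "invalid".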

Also noting this here because it was hard to understand, but while the postal data uses five-digit JIS codes, the reference page uses six-digit codes everywhere. Turns out the sixth digit is a check digit that's calculated in an odd way.

@kinow
Author

kinow commented Feb 26, 2021

> It turns out the data for these codes is different enough that converting it into the same format as normal postal codes doesn't make sense, so I just return them with a completely different structure. The JSON data is saved in a separate file.

Sounds really tricky to handle these codes.

> I have used the term "company" above, but technically the organizations that get codes can be government offices or other organizations.

👍

> Also noting this here because it was hard to understand, but while the postal data uses five-digit JIS codes, the reference page uses six-digit codes everywhere. Turns out the sixth digit is a check digit that's calculated in an odd way.

How did you figure it out? Had a look at that page (with my broken Japanese) and couldn't see an explanation of how to parse that code in the page or PDF files. 🤓

It's working for me now 👍

(venv) kinow@ranma:/tmp$ pip install -U posuto
Collecting posuto
  Downloading https://files.pythonhosted.org/packages/af/97/8626d71e45e3f38bec91dd7558acd9b40246892aa66ce16531296a58e708/posuto-0.4.0.tar.gz (6.7MB)
     |████████████████████████████████| 6.7MB 1.3MB/s 
Installing collected packages: posuto
  Running setup.py install for posuto ... done
Successfully installed posuto-0.4.0
WARNING: You are using pip version 19.2.3, however version 21.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
(venv) kinow@ranma:/tmp$ python
Python 3.8.3 (default, May 19 2020, 18:47:26) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import posuto
>>> posuto.get('〒869-0198')
OfficeCode(jis='43368', kana='ナガスマチヤクバ', name='長洲町役場', prefecture='熊本県', city='玉名郡長洲町', neighborhood='大字長洲', banchi='2766', postal_code='8690198', old_code='86901', post_office='長洲', type='office', multiple=False, new=False, alternates=[])
>>> 

@polm
Owner

polm commented Feb 27, 2021

> How did you figure it out? Had a look at that page (with my broken Japanese) and couldn't see an explanation of how to parse that code in the page or PDF files.

Well, first I checked what jisx0402 was. That led me to the reference page, which doesn't use the term jisx0402, and I saw it was all six digit codes. Then I checked the Wikipedia article and that mentioned the check digit, but when I calculated the check digit for the first entry in the offices file it didn't match up. Then I found the README on the reference page, which has some special rules about the check digit buried in it, and then it matched up and I was able to confirm the codes were the same.

It is not well organized data. :/
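
For readers who want to try the five-to-six-digit conversion mentioned above, here is a minimal sketch. It assumes the rule as it is commonly described for Japan's local government codes: weight the five digits by 6, 5, 4, 3, 2, sum them, take the remainder modulo 11, subtract it from 11, and keep the last digit. The thread does not say which special rules the README contains, so this may not match it exactly; `check_digit` and `to_six_digit` are hypothetical names.

```python
WEIGHTS = (6, 5, 4, 3, 2)


def check_digit(jis5: str) -> int:
    """Check digit for a five-digit code, under the assumed mod-11 rule."""
    if len(jis5) != 5 or not (jis5.isascii() and jis5.isdigit()):
        raise ValueError(f"expected five ASCII digits, got {jis5!r}")
    total = sum(int(d) * w for d, w in zip(jis5, WEIGHTS))
    # 11 minus the remainder can be 10 or 11, so keep only the last digit.
    return (11 - total % 11) % 10


def to_six_digit(jis5: str) -> str:
    """Append the check digit to a five-digit code."""
    return jis5 + str(check_digit(jis5))


print(to_six_digit("43368"))  # 433683
print(to_six_digit("01000"))  # 010006
```

The first input is the `jis` value from the `OfficeCode` output in this thread; the printed results are what this function computes, and are worth comparing against the reference page before relying on them.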

Closing since this seems to work for now.

@polm polm closed this as completed Feb 27, 2021
@kinow
Author

kinow commented Feb 27, 2021

Thanks for fixing and for the explanation. Kudos on the detective work!!!

3 participants