Offline ZIP/postal derivation #539
Replies: 5 comments
-
@SamarthTechKing That sounds like a duplicate of #438.
-
Hi! I'd like to take a stab at this. I'm working on a [...] Happy to adjust the approach or library choice if it doesn't fit the project direction. Will open a PR shortly.
-
Hey guys @vharkins1 @abhishek-8081 @chetanr25, how are you feeling about this? Since the beginning of this project we've been thinking about enriching the AI input in different ways, and this one was clearly among the most feasible to start with. There are multiple approaches to ZIP codes. Let me lay out my concerns:
I'd like to hear your opinions, guys.
-
1. US-only vs global from the beginning

Yes, we should go global from day one. I did some quick research, and here is what I think. The main problem is that there's no such thing as one universal "zip code" format (as far as I have researched; please let me know if I'm wrong here). Every country has its own postal code system, and they're all over the place: some are alphanumeric, like Canada and the UK; some are purely numeric, like the US and India; and a few places, like the UAE, Hong Kong, and Ireland (before 2015), don't even use postal codes for general addressing. This is what just came to my mind: we can use something like [...]

2. Automatic in pipeline vs show it to user

I went through packages on pip and found many India-specific PIN code packages. I'm not sure whether there are PIN code packages for other countries that I'm missing, or whether it's showing me Indian ones because of my current location. After some browsing I came across geopy; we can test it and see if it works for our case. If it doesn't fit well, we look for other libraries, and external APIs would be the last option.

I strongly feel we shouldn't rely on AI prompts to get zip codes. This needs to be a proper deterministic lookup, not something we ask an LLM to guess. Since this kind of data won't always be fully accurate, we should decide early on what happens when there's no match. My take is that we leave the field empty and let the user fill it in manually instead of guessing a wrong value.

I think we can add the zipcode automatically; users can update it later if it is wrong, and we will have a human in the loop to confirm the data at the end anyway. We have to see whether it's possible to achieve this "offline". If we can't with this package, we can look into the tradeoffs of choosing another library that can handle this offline, or assume that the server will have an internet connection.

3. Visual maps

This should definitely be doable. I have worked with the Google Maps APIs in a few of my apps before (yes, I was an app developer before this), but those were personal or hackathon projects without much traffic, so Google Maps worked fine there. For this, we could look at an alternative since it needs to scale a bit more. We could use OpenStreetMap, since it's open source and free to use. I read that geopy works well with OpenStreetMap; correct me if I'm wrong on this.
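To make the "no universal format" point concrete, here is a minimal sketch of per-country format checks. Everything in it is an assumption for illustration: the names `POSTAL_PATTERNS`, `NO_POSTAL_CODE`, and `is_plausible_postal_code` are hypothetical, and the patterns are deliberately simplified (the real UK and Canadian rules have more exceptions). It only checks shape; it does not say whether a code actually exists.

```python
import re

# Simplified, illustrative patterns keyed by ISO 3166-1 alpha-2 country code.
# These are assumptions for the example, not authoritative postal rules.
POSTAL_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),                    # 94203 or 94203-1234
    "IN": re.compile(r"[1-9]\d{5}"),                        # six digits, no leading 0
    "CA": re.compile(r"[A-Z]\d[A-Z] ?\d[A-Z]\d"),           # K1A 0B1
    "GB": re.compile(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}"),  # SW1A 1AA
}

# Places mentioned above that do not use postal codes for general addressing.
NO_POSTAL_CODE = {"AE", "HK"}


def is_plausible_postal_code(country_code, value):
    """Return True or False for a known format, None when we cannot judge."""
    country = country_code.upper()
    if country in NO_POSTAL_CODE:
        return None  # nothing to validate against
    pattern = POSTAL_PATTERNS.get(country)
    if pattern is None:
        return None  # unknown country: do not guess
    return pattern.fullmatch(value.strip().upper()) is not None
```

Returning `None` rather than `False` for unknown or code-less countries keeps "we cannot tell" separate from "this looks wrong", which matters if the field is later shown to a human for confirmation.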
-
Thanks @marcvergees for moving this into a discussion, and thanks to @Aryama-srivastav and @CrepuscularIRIS for already putting up working implementations, and to @chetanr25 for the research write-up. Sharing some thoughts to help us converge; please take these as suggestions, happy to be corrected on any of them.

I think the cleanest way to settle most of the open questions is to anchor on the one requirement we all seem to agree on: this should work offline. That's the field-reporting rationale in the original issue, and it lines up with FireForm's local-first positioning. A lot of the choices fall out of that.

Library: I'd lean toward pgeocode

Both current PRs are solving the right problem; my suggestion would be to converge on the pgeocode approach rather than carry two implementations. The reason: pgeocode does place-name → postal lookup against a locally cached GeoNames dataset (downloaded on first use, fully offline after that), and it already covers ~83 countries keyed by ISO country code (per the docs). That actually gives us both things at once, @marcvergees's wish to be global-ready from day one and the offline guarantee, without us having to maintain anything country-specific. It also speaks directly to @chetanr25's point about a [...], and a place-name lookup returns structured rows we can map straight onto form fields:

```python
>>> import pgeocode
>>> nomi = pgeocode.Nominatim('us')
>>> nomi.query_location('Sacramento', top_k=1)
# country_code postal_code place_name state_name state_code county_name
# US 94203 Sacramento California CA Sacramento
```

On geopy / online services

@chetanr25 raised exactly the right question, whether this can be done offline, and noted that external APIs should be a last resort. Building on that: it's worth knowing that geopy isn't itself an offline database. It's a client that calls remote geocoding services, and the one it is usually paired with (OpenStreetMap's Nominatim) is an online server. So it wouldn't meet the offline bar we're aiming for. There's also a policy angle: the Nominatim Usage Policy caps the public endpoint at 1 request/second, explicitly disallows systematic queries like looking up lists of postcodes, and states that applications whose core function is geocoding should run their own instance. On top of that, sending reporters' addresses to an outside server would cut against our "data stays local" stance. Self-hosting a Nominatim instance is possible, but it is heavy infrastructure for what we need here. So I'd suggest we keep the lookup deterministic and offline (pgeocode/GeoNames), which I think matches what @chetanr25 was already leaning toward, and I fully agree with him that we should not ask the LLM to guess ZIPs. (Minor note to avoid confusion later: pgeocode has its own class also named Nominatim, as in the snippet above; it is unrelated to the OpenStreetMap service.)

Accuracy & no-match handling

I agree with @chetanr25's approach: leave the field empty when there's no confident match, never overwrite a ZIP the user already entered, and keep the human-in-the-loop confirmation at the end. Two small additions from testing the library live:
Visual maps

I think the map idea is genuinely valuable, and OpenStreetMap plus a library like Leaflet/MapLibre is a sound open choice, as @chetanr25 suggested. My only suggestion is to spin it into its own discussion: interactive maps and reverse geocoding generally need online tiles/services, which is a different connectivity and privacy profile from offline ZIP fill, and I'd hate for it to entangle this (smaller, shippable) piece. Happy to open that thread if it's useful.

Suggested path

Land this as an offline-only ZIP derivation built on pgeocode, ship US-first for the Cal Fire use case using a global-ready (country_code + postal_code) shape, require state context, and move maps + any online enrichment into separate discussions. Keen to hear what others think, especially on whether we enable additional country datasets now or after the US path is solid.
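As a rough illustration of the suggested path, the sketch below wires together the pieces discussed in this thread: a deterministic lookup, required state context, an empty result when there is no single confident match, and no overwriting of a user-entered ZIP. `derive_postal`, `fill_zip`, `pgeocode_lookup`, and the `city`/`state`/`zip` form keys are hypothetical names, and treating "exactly one distinct postal code" as the definition of "confident" is an assumption made here, not a project decision. Only `pgeocode.Nominatim(...).query_location(...)` comes from the library. The lookup is injected as a plain callable so the policy can be exercised without any dataset installed.

```python
def pgeocode_lookup(country_code):
    """Build a lookup backed by pgeocode (imported lazily; needs its dataset)."""
    import pgeocode

    nomi = pgeocode.Nominatim(country_code)

    def lookup(place_name):
        # query_location returns a pandas DataFrame (empty when nothing matches).
        return nomi.query_location(place_name).to_dict("records")

    return lookup


def derive_postal(place_name, state_code, lookup, country_code="US"):
    """Return {'country_code', 'postal_code'}, or None when not confident."""
    if not place_name or not state_code:
        return None  # state context is required
    wanted_place = place_name.strip().lower()
    wanted_state = state_code.strip().upper()
    # Keep only rows whose place name and state both match exactly.
    codes = {
        str(row["postal_code"])
        for row in lookup(place_name.strip())
        if str(row.get("place_name", "")).strip().lower() == wanted_place
        and str(row.get("state_code", "")).strip().upper() == wanted_state
    }
    if len(codes) != 1:
        return None  # no match, or ambiguous: leave the field empty
    return {"country_code": country_code.upper(), "postal_code": codes.pop()}


def fill_zip(form, lookup):
    """Return a form whose 'zip' is filled only if it was empty and we are confident."""
    if str(form.get("zip") or "").strip():
        return form  # never overwrite a user-entered ZIP
    match = derive_postal(form.get("city"), form.get("state"), lookup)
    filled = dict(form)
    filled["zip"] = match["postal_code"] if match else ""
    return filled
```

In production the callable would be something like `pgeocode_lookup("us")`; in tests it can be a small function over a list of dicts. Note that under this rule a city with several ZIP codes comes back empty. Whether to take the first row instead, as `top_k=1` does, is exactly the kind of accuracy trade-off worth settling before merging.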
-
name: 🚀 Feature Request
about: Suggest an idea or a new capability for FireForm.
title: "[FEAT]: Offline ZIP Code Derivation for Missing Postal Fields"
labels: enhancement
assignees: ''
📝 Description
Add an offline ZIP/postal code derivation capability so FireForm can auto-fill missing ZIP fields when users provide a district, town, city, county, state, or full address but do not recall the ZIP code.
💡 Rationale
Cal Fire and partner agencies often require complete address blocks including ZIP code. During field reporting, responders may remember location context (town/district/address) but not ZIP. FireForm should infer ZIP offline to reduce manual correction and speed report completion in low-connectivity environments.
📌 Additional Context
Scope is US-first (Cal Fire use case).
Resolver should handle common input patterns such as:
"Pine Valley, CA"
"Sacramento County, California"
"123 Main St, Sacramento, CA"
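A minimal sketch of how the three example inputs from the issue could be split into components before any lookup. The parsing rules, the `parse_location` name, and the tiny `STATE_NAMES` table are assumptions for illustration only; a real resolver would need a full state table and more robust address handling.

```python
# Tiny illustrative table (assumption); a real resolver would cover all states.
STATE_NAMES = {"california": "CA", "nevada": "NV", "oregon": "OR"}
STATE_CODES = set(STATE_NAMES.values())


def parse_location(text):
    """Split comma-separated input into street / place / county / state parts."""
    parts = [p.strip() for p in text.split(",") if p.strip()]
    result = {"street": None, "place": None, "county": None, "state": None}
    if not parts:
        return result

    # Treat the last component as the state if we recognise it.
    last = parts[-1]
    if last.upper() in STATE_CODES:
        result["state"] = last.upper()
        parts = parts[:-1]
    elif last.lower() in STATE_NAMES:
        result["state"] = STATE_NAMES[last.lower()]
        parts = parts[:-1]
    if not parts:
        return result

    # The component before the state is either a county or a place name.
    locality = parts[-1]
    if locality.lower().endswith(" county"):
        result["county"] = locality[: -len(" county")].strip()
    else:
        result["place"] = locality

    # Anything earlier is kept verbatim as the street part.
    if len(parts) > 1:
        result["street"] = ", ".join(parts[:-1])
    return result
```

For example, `parse_location("123 Main St, Sacramento, CA")` yields a street of `"123 Main St"`, a place of `"Sacramento"`, and a state of `"CA"`. County-only inputs such as "Sacramento County, California" produce no place name, so the resolver would need a county-based lookup or would have to leave the ZIP empty.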