Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update CZ countrywide #7080

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

StephanGeorg
Copy link
Contributor

@StephanGeorg StephanGeorg commented Feb 19, 2024

Update outdated CZ countrywide source with:

ETL script for DuckDB:

LOAD spatial;

CREATE TEMP TABLE ruian AS FROM (
SELECT
  "Kód ADM"
  , "Kód obce"
  , "Název ulice"
  , "Typ SO"
  , "Číslo domovní"
  , "Číslo orientační"
  , "Znak čísla orientačního"
  , "PSČ"
  , "Název obce"
  , "Název MOMC"
  , "Název části obce"
  , coalesce("Název MOMC", "Název obce") as "city"
  , coalesce("Název ulice", "Název části obce") as "street"
  , '' as "number"
  , '' as "postal_code"
  , 0.0::double AS "lat"
  , 0.0::double AS "lon"
  ,  ST_FlipCoordinates(ST_Transform(ST_POINT("Souřadnice X", "Souřadnice Y"), 'EPSG:5513', 'EPSG:4326')) as "geometry"
FROM 
  read_csv('*_ADR.csv', AUTO_DETECT=TRUE)
);

-- Update coordinates
UPDATE ruian SET "lat" = ST_Y("geometry"), "lon" = ST_X("geometry");
-- Update postal code
UPDATE ruian SET "postal_code" = CONCAT(regexp_extract("PSČ", '(\d{3})(\d{2})', 1), ' ', regexp_extract("PSČ", '(\d{3})(\d{2})', 2)) WHERE "PSČ" IS NOT NULL;
-- Update house number
UPDATE ruian SET "number" = NULL;
UPDATE ruian SET "number" = "Číslo domovní" WHERE "Číslo orientační" IS NULL;
UPDATE ruian SET "number" = "Číslo domovní" || '/' || "Číslo orientační" WHERE "Číslo orientační" IS NOT NULL;
UPDATE ruian SET "number" = "Číslo domovní" || '/' || "Číslo orientační" || "Znak čísla orientačního" WHERE "Znak čísla orientačního" IS NOT NULL;
UPDATE ruian SET "number" = CONCAT("Typ SO", ' ', "Číslo domovní") WHERE "Číslo orientační" IS NULL AND "street" = "city" AND "Typ SO" = 'č.ev.';


-- Export data
COPY(SELECT "Kód ADM", street, "number", postal_code, city, lat, lon FROM ruian) TO 'CZ-addresses.csv';

Pls upload data from tmp Google Drive folder to S3 bucket.

@data-please
Copy link

data-please bot commented Feb 19, 2024

@StephanGeorg
Copy link
Contributor Author

@sbma44
I don't know if you're still interested, but you added the source originally. Maybe you'd like to take a look at the update? If not, totally fine, I just wanted to ask.

UPDATE ruian SET "number" = "Číslo domovní" WHERE "Číslo orientační" IS NULL;
UPDATE ruian SET "number" = "Číslo domovní" || '/' || "Číslo orientační" WHERE "Číslo orientační" IS NOT NULL;
UPDATE ruian SET "number" = "Číslo domovní" || '/' || "Číslo orientační" || "Znak čísla orientačního" WHERE "Znak čísla orientačního" IS NOT NULL;
UPDATE ruian SET "number" = CONCAT("Typ SO", ' ', "Číslo domovní") WHERE "Číslo orientační" IS NULL AND "street" = "city" AND "Typ SO" = 'č.ev.';

@StephanGeorg StephanGeorg reopened this Feb 20, 2024
@data-please
Copy link

data-please bot commented Feb 20, 2024

@data-please
Copy link

data-please bot commented Feb 20, 2024

@data-please
Copy link

data-please bot commented Feb 20, 2024

@bertday bertday mentioned this pull request Mar 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant