# Wissensaggregator Mittelalter und frühe Neuzeit

## Strukturdaten einlesen/aktualisieren
*2022-11-07* Prüfe die feature class für Orte in der Klosterdatenbank, verwende `place_all`

Inhalt

- [Länder einlesen](#Länder-einlesen)
- [Verwaltungsebenen einlesen](#Verwaltungsebenen-einlesen)
- [Orte aus GeoNames einlesen](#Orte-aus-GeoNames-einlesen)
  - [Fremsprachliche Namen](#Fremdsprachliche-Namen)
  - [Deutsche Namen eintragen](#Deutsche-Namen-eintragen)

Die Funktionen, welche Daten laden und umwandeln sind im Modul `WiagDataSetup` enthalten. Dieses verwendet seinerseits folgende Module:

- MySQL
- DataFrames
- JSON
- Infiltrator

Diese werden am besten vorab direkt in einem Julia-Terminal installiert.
``` julia
cd("C:\\Users\\Georg\\Documents\\projekte\\WiagDataSetup.jl")
Pkg.activate(".")
Pkg.add("MySQL")
Pkg.add("DataFrames")
Pkg.add("JSON")
Pkg.add("Infiltrator")
```

Pfad zum Modul `WiagDataSetup`

In [1]:
wds_path="../.."

"../.."

In [2]:
cd(wds_path)

In [3]:
using Pkg

In [4]:
Pkg.activate(".")

[32m[1m  Activating[22m[39m project at `C:\Users\georg\Documents\projekte\WiagDataSetup.jl`


Nur für die Entwicklung des Moduls relevant.

In [5]:
using Revise

Modul laden

In [6]:
using WiagDataSetup; Wds=WiagDataSetup

┌ Info: Precompiling WiagDataSetup [522c5ebb-a018-4020-8ed4-420cb1a9f084]
└ @ Base loading.jl:1664


WiagDataSetup

Verbinde die Datenbank.

Im Verlauf der Einleseschritte kann die folgende Fehlermeldung erscheinen:
*Commands out of sync; you can't run this command now*.
In diesem Fall diesen Befehl nochmal absetzen.

In [7]:
Wds.setDBWIAG(user="georg", db="wiag_in")

Passwort für User georg: ········


MySQL.Connection(host="127.0.0.1", user="georg", port="3306", db="wiag_in")

Verzeichnis für Basisdaten, z.b. SKOS-Schemes, Länder

In [8]:
data_path="C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup"

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup"

## Länder einlesen
Die Quelle enthält im Wesentlichen Deutschland und seine Nachbarn. Die numerische ID ist der numerische ISO-Code.

Quelle: https://download.geonames.org/export/dump/countryInfo.txt

In [None]:
using CSV, MySQL, DataFrames

In [None]:
country_file=joinpath(data_path, "GeoNames", "countryInfo.txt")

In [None]:
df_country = CSV.read(country_file, DataFrame);

In [None]:
first(df_country, 5)

In [None]:
df_country[end-5:end, [Symbol("ISO-Numeric"), :Country]]

In [None]:
select_cols = [
    Symbol("ISO-Numeric") => :id,
    :Country => :name,
    :ISO => :country_code,
    :Capital => :capital
]

In [None]:
df_country_db = select(df_country, select_cols...);

In [None]:
Wds.filltable!("country", df_country_db, clear_table = true)

GeoNames-IDs ergänzen

Quelle: `wiag_bundeslaender_normdaten_lhofman.xls` übertragen in `country.csv`

In [None]:
using MySQL, DataFrames, CSV

In [None]:
cy_file=joinpath(data_path, "csv", "country.csv")

In [None]:
df_cy=CSV.read(cy_file, DataFrame)

In [None]:
sql="DROP TABLE IF EXISTS country_gn_id"
DBInterface.execute(Wds.dbwiag, sql);

In [None]:
sql="CREATE TEMPORARY TABLE country_gn_id (" *
"id INT," *
"country_code VARCHAR(31)," *
"country_code_3 VARCHAR(31)," *
"geonames_id INT)"
DBInterface.execute(Wds.dbwiag, sql);

In [None]:
Wds.filltable!("country_gn_id", df_cy[!, [:id, :country_code, :country_code_3, :geonames_id]])

In [None]:
sql="UPDATE country AS cy, (SELECT id, country_code, country_code_3, geonames_id FROM country_gn_id) as gn " *
"SET cy.country_code_3 = gn.country_code_3, cy.geonames_id = gn.geonames_id " *
"WHERE cy.id = gn.id"
DBInterface.execute(Wds.dbwiag, sql);

externe IDs ergänzen  
2022-04-08: Es liegen nur externe IDs für Länder vor. Diese werden nicht übernommen; bzw. wieder gelöscht.

In [None]:
cei_file=joinpath(data_path, "csv", "country_id_external.csv")

In [None]:
df_cei=CSV.read(cei_file, DataFrame, types=[Int, String, String, String, Int, String]);

In [None]:
size(df_cei)

In [None]:
df_cei[1:7, :]

In [None]:
Wds.filltable!("place_id_external", df_cei[!, [:geonames_id, :authority_id, :value]])

externe IDs für Bundesländer ergänzen

Quelle: Quelle: `wiag_bundeslaender_normdaten_lhofman.xls` übertragen in `country_level_1_id_external.csv`

In [None]:
cei_l1_file=joinpath(data_path, "csv", "country_level_1_id_external.csv")

In [None]:
df_cei_l1=CSV.read(cei_l1_file, DataFrame, types=[Int, String, String, String, Int, String]);

In [None]:
df_cei_l1[1:7, :]

In [None]:
Wds.filltable!("place_id_external", df_cei_l1[!, [:geonames_id, :authority_id, :value]])

Kontrolle

``` sql
SELECT name, country_code, authority_id, value 
FROM place_id_external AS pei
JOIN country_level_1 AS cl1 ON pei.geonames_id = cl1.geonames_id;
```

Es fehlen Hamburg, Bremen

## Verwaltungsebenen einlesen
Die Quelle enthält die oberste Verwaltungsebene nach dem Land selbst, also für Deutschland die Bundesländer, für Frankreich die Regionen. Die Quelle umfasst alle Daten in GeoNames und wird daher gefiltert.

In [None]:
using DataFrames; using CSV

In [None]:
cl1_file=joinpath(data_path, "Geonames", "admin1CodesASCII.txt")

In [None]:
df_cl1=CSV.read(cl1_file, DataFrame, header=["code", "name", "ascii_name", "geonames_id"]);

In [None]:
first(df_cl1, 7)

Länder Code extrahieren

In [None]:
get_country_code(code)=split(code::AbstractString, ".")[1]

In [None]:
df_cl1[!, :country_code] .= get_country_code.(df_cl1[!, :code]);

In [None]:
first(df_cl1, 7)

Code der Verwaltungsebene extrahieren

In [None]:
get_admin1_code(code)=split(code::AbstractString, ".")[2]

In [None]:
df_cl1[!, :admin1_code] .= get_admin1_code.(df_cl1[!, :code]);

Länder auslesen

In [None]:
using MySQL

In [None]:
sql = "SELECT id as country_id, country_code FROM country " * 
"WHERE country_code in (SELECT distinct(country_code) FROM place)"
df_country_m = DBInterface.execute(Wds.dbwiag, sql) |> DataFrame;

In [None]:
delete!(df_country_m, 16)

In [None]:
df_ccl1 = leftjoin(df_country_m, df_cl1, on = :country_code);

Index ergänzen

In [None]:
df_ccl1[!, :id] .= 1:size(df_ccl1, 1);

In [None]:
first(df_ccl1[!, [:id, :country_id, :country_code, :name]], 5)

In [None]:
size(df_ccl1)

In [None]:
import_cols = [:id, :country_id, :country_code, :admin1_code, :name, :ascii_name, :geonames_id]

In [None]:
names(df_ccl1)

In [None]:
Wds.filltable!("country_level_1", select(df_ccl1, import_cols), clear_table=true)

## Orte aus GeoNames einlesen
Geonames stellt eine Sammlung aller Objekte (features) zur Verfügung. Alle Objekte eines Landes sind in jeweils einer Datei enthalten.
https://download.geonames.org/export/dump/

Lies Daten zu 
- Deutschland
- Nachbarländer
- Baltikum
- Kroatien
- Kaliningrad/Königsberg

Ländercodes: DE, NL, BE, FR, IT, CH, AT, DK, PL, LU, CZ, LI, LV, LT, HR, EE, RU

Pfad zu den Daten

In [9]:
gn_path = "C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames"

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames"

Wähle aus den Features in den Daten von GeoNames nur Orte (P) aus, sowie Klöster (MSTY) und Konvente (CVNT).  
Wähle wegen der Überprüfung der Orte in der Klosterdatenbank auch Administrative Einheiten aus.  
Siehe http://www.geonames.org/export/codes.html.

Filterfunktion definieren

In [10]:
function filter_places(feature_class, feature_code)
    return ((!ismissing(feature_class) && feature_class in ("P", "A"))
            || (!ismissing(feature_code) && feature_code in ("MSTY", "CVNT")))
end

filter_places (generic function with 1 method)

Spaltennamen und Spaltentypen

Länder auslesen

In [11]:
using MySQL
using DataFrames
using CSV

In [12]:
df_country_m = DBInterface.execute(Wds.dbwiag, "SELECT id as country_id, country_code FROM country") |> DataFrame;

In [13]:
first(df_country_m, 5)

Row,country_id,country_code
Unnamed: 0_level_1,Int32,String?
1,0,XK
2,4,AF
3,8,AL
4,10,AQ
5,12,DZ


In [14]:
gn_header = ["geonames_id", "name", "asciiname", "alternatenames",
             "latitude", "longitude", "feature_class", "feature_code",
             "country_code", "cc2", "admin1_code", "admin2_code", "admin3_code", "admin4_code",
             "population", "elevation", "dem", "timezone", "modification_date"];

gn_types = [Int, String, String, String, Float64, Float64,
            String, String, String, String,
            String, String, String, String,
            Int, Int, Int, String, String];

In [29]:
place_cols = [
    :country_id => :country_id, 
    :name => :name, 
    :asciiname => :ascii_name, 
    :latitude => :latitude, 
    :longitude => :longitude, 
    :feature_class => :feature_class, 
    :feature_code => :feature_code, 
    :country_code => :country_code, 
    :cc2 => :cc2, 
    :admin1_code => :admin1_code,
    :population => :population,
    :elevation => :elevation, 
    :dem => :dem, 
    :timezone => :timezone, 
    :modification_date => :modification_date, 
    :geonames_id => :geonames_id,
    :geonames_id => :id_in_source,
];

### Niederlande

In [16]:
gn_filename = joinpath(gn_path, "NL", "NL.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\NL\\NL.txt"

In [17]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [18]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [19]:
size(gn_dff)

(8233, 19)

In [20]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [21]:
gn_dff[1:5, [:country_id, :name, :latitude, :longitude, :elevation, :dem, :population]]

Row,country_id,name,latitude,longitude,elevation,dem,population
Unnamed: 0_level_1,Int32?,String,Float64,Float64,Int64?,Int64,Int64
1,528,Den Oord,51.9708,5.27083,missing,3,0
2,528,Gemeente Zwolle,52.5126,6.09359,missing,9,126116
3,528,Zwolle,52.5125,6.09444,missing,9,111805
4,528,Zwolle,52.0317,6.65556,missing,30,65
5,528,Zwingelspaan,51.66,4.49583,missing,-2,120


In [30]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: Rows inserted: 8233
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


8233

### Belgien

In [31]:
gn_filename = joinpath(gn_path, "BE", "BE.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\BE\\BE.txt"

In [32]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [33]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [34]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,2783081,Zwijndrecht,BE,18249
2,2783082,Zwijndrecht,BE,19056
3,2783083,Zwijnaardse Dries,BE,0
4,2783084,Zwijnaarde,BE,7379
5,2783086,Zwijn,BE,0


In [35]:
size(gn_dff)

(13326, 19)

Spalte für die Länder_ID einfügen

In [36]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [37]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: 10000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: Rows inserted: 13326
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


13326

### Frankreich

In [38]:
gn_filename = joinpath(gn_path, "FR", "FR.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\FR\\FR.txt"

In [39]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [40]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [41]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,2967103,Peyrat-le-Château,FR,1140
2,2967107,Domecy-sur-le-Vault,FR,107
3,2967108,Blaye,FR,5277
4,2967109,Zuytpeene,FR,483
5,2967110,Zuydcoote,FR,1660


In [42]:
size(gn_dff)

(118825, 19)

Spalte für die Länder_ID einfügen

In [43]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [44]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: 10000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 20000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 30000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 40000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 50000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 60000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 70000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 80000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 90000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 100000
└ @ 

118825

### Italien

In [45]:
gn_filename = joinpath(gn_path, "IT", "IT.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\IT\\IT.txt"

In [46]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [47]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [48]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,781059,Colognole,IT,128
2,781060,Casale Sant'Antonio,IT,59
3,2522676,Zungti,IT,0
4,2522677,Zumpano,IT,343
5,2522679,Zona,IT,0


In [49]:
size(gn_dff)

(71378, 19)

Spalte für die Länder_ID einfügen

In [50]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [51]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: 10000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 20000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 30000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 40000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 50000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 60000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 70000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: Rows inserted: 71378
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


71378

### Schweiz

In [52]:
gn_filename = joinpath(gn_path, "CH", "CH.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\CH\\CH.txt"

In [53]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [54]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [55]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,2657885,Zwischbergen,CH,127
2,2657886,Zwingen,CH,2162
3,2657887,Zweisimmen,CH,2813
4,2657888,Zweilütschinen,CH,0
5,2657889,Zuzwil,CH,4226


In [56]:
size(gn_dff)

(16062, 19)

Spalte für die Länder_ID einfügen

In [57]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [58]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: 10000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: Rows inserted: 16062
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


16062

### Österreich

In [59]:
gn_filename = joinpath(gn_path, "AT", "AT.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\AT\\AT.txt"

In [60]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [61]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [62]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,2598250,Mooshöhe,AT,0
2,2598294,Muttling,AT,0
3,2598296,Kleiner,AT,0
4,2598303,Dörfl,AT,0
5,2598321,Zeitschen,AT,0


In [63]:
size(gn_dff)

(23597, 19)

Spalte für die Länder_ID einfügen

In [64]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [65]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: 10000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 20000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: Rows inserted: 23597
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


23597

### Dänemark

In [66]:
gn_filename = joinpath(gn_path, "DK", "DK.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\DK\\DK.txt"

In [67]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [68]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [69]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,2609911,Yttrup,DK,0
2,2609915,Yppenbjerg,DK,0
3,2609916,Ypnested,DK,0
4,2609922,Yding,DK,0
5,2609926,Yderik,DK,0


In [70]:
size(gn_dff)

(7368, 19)

Spalte für die Länder_ID einfügen

In [71]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [72]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: Rows inserted: 7368
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


7368

### Polen

In [73]:
gn_filename = joinpath(gn_path, "PL", "PL.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\PL\\PL.txt"

In [74]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [75]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [76]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,462259,Zodenen,PL,0
2,620115,Włodawka,PL,0
3,688812,Vul’ka Ugruska,PL,0
4,696099,Pshedmes’tse Vel’ke,PL,0
5,698000,Paportno,PL,0


In [77]:
size(gn_dff)

(50395, 19)

Spalte für die Länder_ID einfügen

In [78]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [79]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: 10000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 20000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 30000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 40000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 50000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: Rows inserted: 50395
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


50395

### Deutschland

In [80]:
gn_filename = joinpath(gn_path, "DE", "DE.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\DE\\DE.txt"

In [81]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [82]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [83]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,2657946,Wyhlen,DE,0
2,2803460,Märkischer Kreis,DE,410222
3,2803461,Landkreis Hildesheim,DE,275817
4,2803463,Landkreis Aichach-Friedberg,DE,134655
5,2803468,Zyfflich,DE,0


In [84]:
size(gn_dff)

(92536, 19)

Spalte für die Länder_ID einfügen

In [85]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [86]:
gn_dff[1:5, :]

Row,country_id,country_code,geonames_id,name,asciiname,alternatenames,latitude,longitude,feature_class,feature_code,cc2,admin1_code,admin2_code,admin3_code,admin4_code,population,elevation,dem,timezone,modification_date
Unnamed: 0_level_1,Int32?,String,Int64,String,String,String?,Float64,Float64,String,String?,String?,String?,String?,String?,String?,Int64,Int64?,Int64,String?,String
1,276,DE,2657946,Wyhlen,Wyhlen,Wyhlen,47.5473,7.69331,P,PPLX,missing,1,83,8336,08336105,0,missing,269,Europe/Berlin,2020-11-12
2,276,DE,2803460,Märkischer Kreis,Maerkischer Kreis,"Iserlohn,Markischer Kreis,Märkischer Kreis",51.2639,7.74167,A,ADM3,missing,7,59,5962,missing,410222,missing,186,Europe/Berlin,2021-04-18
3,276,DE,2803461,Landkreis Hildesheim,Landkreis Hildesheim,"Arrondissement de Hildesheim,Circondario di Hildesheim,Distrikto Hildesheim,Distrito de Hildesheim,HI,Hildesheim,Landkreis Hildesheim,Landkreis Hildesheim-Marienburg,Loundkring Hildesheim,Powiat Hildesheim,xi er de si hai mu xian,希尔德斯海姆县",52.1267,9.965,A,ADM3,missing,6,0,3254,missing,275817,missing,80,Europe/Berlin,2021-04-18
4,276,DE,2803463,Landkreis Aichach-Friedberg,Landkreis Aichach-Friedberg,"Aichach,Aichach-Friedberg,Augsburg-Ost,Friedberg,Landkreis Aichach-Friedberg",48.4172,11.0547,A,ADM3,missing,2,97,9771,missing,134655,missing,483,Europe/Berlin,2021-04-18
5,276,DE,2803468,Zyfflich,Zyfflich,Zyfflich,51.8234,5.97297,P,PPL,missing,7,51,5154,05154040,0,missing,14,Europe/Berlin,2016-03-10


In [87]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: 10000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 20000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 30000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 40000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 50000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 60000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 70000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 80000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 90000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: Rows insert

92536

### Luxemburg

In [88]:
gn_filename = joinpath(gn_path, "LU", "LU.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\LU\\LU.txt"

In [89]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [90]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [91]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,2959958,Zittig,LU,38
2,2959959,Wormeldange,LU,786
3,2959960,Wues,LU,0
4,2959961,Wolwelange,LU,358
5,2959964,Wolpert,LU,0


In [92]:
size(gn_dff)

(832, 19)

Spalte für die Länder_ID einfügen

In [93]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [95]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: Rows inserted: 832
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


832

### Tschechien

In [96]:
gn_filename = joinpath(gn_path, "CZ", "CZ.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\CZ\\CZ.txt"

In [97]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [98]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [99]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,3059873,Hrčava,CZ,0
2,3061283,Janská Hut,CZ,0
3,3061284,Dvůr Králové nad Labem,CZ,16150
4,3061285,Zvůle,CZ,0
5,3061286,Zvotoky,CZ,69


In [100]:
size(gn_dff)

(22876, 19)

Spalte für die Länder_ID einfügen

In [101]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [102]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: 10000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 20000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: Rows inserted: 22876
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


22876

### Liechtenstein

In [103]:
gn_filename = joinpath(gn_path, "LI", "LI.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\LI\\LI.txt"

In [104]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [105]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [106]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,3042029,Vorderer Schellenberg,LI,0
2,3042030,Vaduz,LI,5197
3,3042031,Vaduz,LI,5197
4,3042033,Triesenberg,LI,2689
5,3042034,Triesenberg,LI,2689


In [107]:
size(gn_dff)

(378, 19)

Spalte für die Länder_ID einfügen

In [108]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [109]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: Rows inserted: 378
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


378

### Estland

Die erste Zeile der Daten kann nicht gelesen werden. Sie wird manuell aus den Quelldaten gelöscht. Das ist unerheblich, weil es sich nicht um ein relevantes Feature handelt.

In [110]:
gn_filename = joinpath(gn_path, "EE", "EE-x1.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\EE\\EE-x1.txt"

In [111]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [112]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [113]:
size(gn_dff)

(12098, 19)

In [114]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,453733,Republic of Estonia,EE,1320884
2,456463,Puijas,EE,0
3,587436,Nehatu,EE,0
4,587437,Mereküla,EE,0
5,587438,Krundiküla,EE,0


In [115]:
size(gn_dff)

(12098, 19)

Spalte für die Länder_ID einfügen

In [116]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [117]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: 10000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: Rows inserted: 12098
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


12098

### Lettland

In [118]:
gn_filename = joinpath(gn_path, "LV", "LV.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\LV\\LV.txt"

In [119]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [120]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [121]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,453754,Valmiera,LV,26963
2,453756,Jaunzemji,LV,0
3,453758,Grīžukrogs,LV,0
4,453759,Čolēni,LV,0
5,453764,Zvirgzdene,LV,0


In [122]:
size(gn_dff)

(4802, 19)

Spalte für die Länder_ID einfügen

In [123]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [124]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: Rows inserted: 4802
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


4802

### Litauen

In [125]:
gn_filename = joinpath(gn_path, "LT", "LT.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\LT\\LT.txt"

In [126]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [127]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [128]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,592647,Bileišiai,LT,0
2,592648,Abakai,LT,0
3,592650,Zypliai,LT,0
4,592651,Žyniai,LT,0
5,592652,Žyniai,LT,0


In [129]:
size(gn_dff)

(20471, 19)

Spalte für die Länder_ID einfügen

In [130]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [131]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: 10000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: 20000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: Rows inserted: 20471
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


20471

### Kroatien

In [132]:
gn_filename = joinpath(gn_path, "HR", "HR.txt")

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\HR\\HR.txt"

In [133]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [134]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [135]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

Row,geonames_id,name,country_code,population
Unnamed: 0_level_1,Int64,String,String,Int64
1,3186233,Vranjic,HR,1110
2,3186247,Zvonik,HR,87
3,3186248,Zvoneća,HR,0
4,3186263,Zverinac,HR,43
5,3186265,Zvekovac,HR,193


In [136]:
size(gn_dff)

(10510, 19)

Spalte für die Länder_ID einfügen

In [137]:
gn_dff = rightjoin(df_country_m, gn_dff, on = :country_code);

In [138]:
Wds.filltable!("place_all", select(gn_dff, place_cols), clear_table = false)

┌ Info: 10000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1193
┌ Info: Rows inserted: 10510
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1216


10510

### Orte mit `feature_class` = "A"

In [141]:
sql = "select gs.id_places, gs.place_name, gs.longitude, gs.latitude,
    gs.geonames_id, p.id, feature_class, feature_code, p.country_id
    from gs_klosterdatenbank.gs_places as gs
    left join wiag_in.place_all as p on p.geonames_id = gs.geonames_id
    where feature_class = 'A'";
df_fca = Wds.sql_df(sql);

In [142]:
size(df_fca)

(221, 9)

In [146]:
fca_file = joinpath(data_path, "Kloester", "places_feature_code_A.csv");
CSV.write(fca_file, df_fca)

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\Kloester\\places_feature_code_A.csv"

### Königsberg - Russland

In [None]:
gn_filename = joinpath(gn_path, "RU", "RU.txt")

In [None]:
gn_df = CSV.read(gn_filename, DataFrame, header=gn_header, types=gn_types);

filtern nach Art des Features

In [None]:
gn_dff = filter([:feature_class, :feature_code] => filter_places, gn_df);

In [None]:
gn_dff[1:5, [:geonames_id, :name, :country_code, :population]]

In [None]:
size(gn_dff)

Extrahiere Königsberg

In [None]:
gn_dff_kb = filter(:name => isequal("Kaliningrad"), gn_dff)

Spalte für die Länder_ID einfügen

In [None]:
gn_dff_kb = rightjoin(df_country_m, gn_dff_kb, on = :country_code);

In [None]:
Wds.filltable!("place", gn_dff_kb[!, place_cols], clear_table = false)

In [None]:
DBInterface.execute(Wds.dbwiag, "SELECT COUNT(*) FROM place") |> DataFrame

### Fremdsprachliche Namen

In [17]:
using CSV, DataFrames, MySQL

In [18]:
gnl_path = "C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\alternatenames"

"C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\GeoNames\\alternatenames"

In [19]:
gnl_header = ["id", "geonames_id", "lang", "label", 
              "is_preferred", "isShort", "isColloquial", "is_historic", "from", "to"];

gnl_types = [Int, Int, String, String,
             Int, Int, Int, Int, String, String];

In [20]:
lang_codes = ["la", "fr", "cz", "de", "pl", "en", "nl", "it"];

In [21]:
filter_lang(lc) = !ismissing(lc) && lc in lang_codes

filter_lang (generic function with 1 method)

In [22]:
country_codes = ["DE", "NL", "BE", "FR", "IT", "CH", "AT", "DK",
    "PL", "LU", "CZ", "LI", "LV", "LT", "HR", "EE", "RU"];

### Schleife über die Länder

Orte einlesen, um nur relevante Namen zu übernehmen

In [287]:
sql = "SELECT id as place_id, name, geonames_id " *
      "FROM place WHERE place_type_id = 1"
p_df = DBInterface.execute(Wds.dbwiag, sql) |> DataFrame;

In [290]:
size(p_df)

(391240, 3)

In [291]:
function labels_by_country(cc)
    gnl_filename = joinpath(gnl_path, cc, cc * ".txt")
    gnl_df = CSV.read(gnl_filename, DataFrame, header=gnl_header, types=gnl_types);
    gnl_df = filter(:lang => filter_lang, gnl_df);
    gnl_p_df = innerjoin(gnl_df, p_df, on = :geonames_id);
    @info cc
    n_cc = Wds.filltable!("place_label", select(gnl_p_df, Not([:isShort, :isColloquial, :from, :to, :name])))    
end
    

labels_by_country (generic function with 1 method)

In [294]:
labels_by_country.(country_codes)

┌ Info: DE
└ @ Main In[291]:6
┌ Info: Rows inserted: 4364
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1209
┌ Info: NL
└ @ Main In[291]:6
┌ Info: Rows inserted: 1848
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1209
┌ Info: BE
└ @ Main In[291]:6
┌ Info: Rows inserted: 862
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1209
┌ Info: FR
└ @ Main In[291]:6
┌ Info: 10000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1186
┌ Info: 20000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1186
┌ Info: 30000
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1186
┌ Info: Rows inserted: 33108
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1209
┌ Info: IT
└ @ Main In[291]:6
┌ Info: 10000
└ @ WiagDataSetup

17-element Vector{Int64}:
  4364
  1848
   862
 33108
 32162
  6277
   728
   669
  3859
   203
  1240
    72
   529
   787
   481
   663
    11

### Deutsche Namen eintragen

Trage in `place` für Orte in Deutschland, Österreich und der Schweiz den deutschen Namen ein.

In [None]:
using MySQL, DataFrames

In [None]:
db_exec(sql) = DBInterface.execute(Wds.dbwiag, sql) |> DataFrame

In [None]:
sql = "SELECT label, p.name FROM place_label AS pl " *
"JOIN place AS p ON pl.geonames_id = p.geonames_id " *
"WHERE pl.lang = 'de' AND p.country_code = 'DE' " *
"AND pl.label <> p.name " *
"LIMIT 12"
df_name_udt = db_exec(sql)

In [None]:
sql = "SELECT count(*) FROM place_label AS pl " *
"JOIN place AS p ON pl.geonames_id = p.geonames_id " *
"WHERE pl.lang = 'de' AND p.country_code = 'DE' " *
"AND pl.label <> p.name "
n = db_exec(sql)

Es scheint im Allgemeinen keine gute Idee zu sein, generell den deutschen Eintrag aus `place_label` zu übernehmen. Man wird einzelne Orte evtl. redaktionell bearbeiten müssen, indem man einen deutschen Namen in `place_label` als bevorzugten Namen auszeichnet.

## Neue Organisation der Orte
Unterscheide Orte nach ihren Quellen (analog zu Items) (Tabelle `place_type`)

### Tabelle `place_type`
über DbSchema

In [None]:
out_path = "C:\\Users\\georg\\Documents\\projekte-doc\\WiagDataSetup\\data_sql"

In [11]:
using DataFrames, Dates

In [12]:
df_place_type = DataFrame();

In [13]:
insertcols!(df_place_type,
    :id => [1],
    :name => ["Ort GN"],
    :note => ["Orte aus GeoNames"],
    :created_by => 7,
    :date_created => now(),
    :changed_by => 7,
    :date_changed => now(),
    :table_name => "place",
    :name_app => "place",
)

Unnamed: 0_level_0,id,name,note,created_by,date_created,changed_by
Unnamed: 0_level_1,Int64,String,String,Int64,DateTime,Int64
1,1,Ort GN,Orte aus GeoNames,7,2022-04-11T09:06:26.483,7


In [14]:
rec_place_ut = (
    id = 2,
    name = "Ort Utrecht",
    note = "Orte der Priester aus Utrecht",
    created_by = 7,
    date_created = now(),
    changed_by = 7,
    date_changed = now(),
    table_name = "place",
    name_app = "place_ut",
)

(id = 2, name = "Ort Utrecht", note = "Orte der Priester aus Utrecht", created_by = 7, date_created = DateTime("2022-04-11T09:06:29.865"), changed_by = 7, date_changed = DateTime("2022-04-11T09:06:29.865"), table_name = "place", name_app = "place_ut")

In [15]:
push!(df_place_type, rec_place_ut)

Unnamed: 0_level_0,id,name,note,created_by,date_created
Unnamed: 0_level_1,Int64,String,String,Int64,DateTime
1,1,Ort GN,Orte aus GeoNames,7,2022-04-11T09:06:26.483
2,2,Ort Utrecht,Orte der Priester aus Utrecht,7,2022-04-11T09:06:29.865


In [16]:
table_name = "place_type";
Wds.filltable!(table_name, df_place_type)

┌ Info: Rows inserted: 2
└ @ WiagDataSetup C:\Users\georg\Documents\projekte\WiagDataSetup.jl\src\WiagDataSetup.jl:1209


2

`place_type_id` nachtragen
```sql
UPDATE place SET place_type_id = 1;
```

`id_in_source` nachtragen
```sql
UPDATE place SET id_in_source = geonames_id;
```

`geonames_id` als Index ersetzen durch `place_id`

```sql
UPDATE place_label AS pll, (SELECT id, geonames_id FROM place) as p
SET pll.place_id = p.id
WHERE pll.geonames_id = p.geonames_id;
```

Einträge in `place_label` löschen, die sich auf Länder, Kantone, Bundesländer und also nicht auf Orte beziehen

```sql
DELETE FROM place_label WHERE place_id IS NULL;
```

In `place_label` die Namen aus der GeoNames Ortetabelle übernehmen, wie von bk vorgeschlagen.  
Die Einträge haben dann keine Angabe für die Sprache, weil die Angaben in der Quelle fehlt.

```sql
UPDATE place_label SET is_geonames_name = false;
```

```sql
INSERT INTO place_label (SELECT NULL, geonames_id, name, NULL, 0, 0, id, 1 FROM place where place_type_id = 1);
```