Austria monuments #64

Vesihiisi · 2017-05-20T17:56:33Z

Base class for processing Austria in German.

Preview: https://www.wikidata.org/wiki/Wikidata:WikiProject_WLM/Mapping_tables/at_(de)/preview

Task: https://phabricator.wikimedia.org/T166771

Add an option in the report generator to also include the P-id to which the value should be matched. It's optional so that it can be omitted, e.g. if it's data for a label/description. Example of report: https://tools.wmflabs.org/coh/_total_se-ship_new.json Task: https://phabricator.wikimedia.org/T166648

lokal-profil

Could you also add a bad address example to the preview?

lokal-profil · 2017-06-29T08:28:05Z

tests/test_importer_utils.py

@@ -64,6 +64,12 @@ def test_last_char_is_vowel_pass(self):
    def test_last_char_is_vowel_fail(self):
        self.assertFalse(utils.last_char_is_vowel("bar"))

+    def test_first_char_is_number_true(self):
+        self.assertTrue(utils.first_char_is_number("2 foo"))


Change to 2foo to make it closer to the failing example and ensure it is not related to "2" being a single word.
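A rough sketch of what the adjusted test would exercise. The body of `first_char_is_number` is not shown in this diff, so the implementation below is an assumption (a plain digit check on the first character), not the code in `importer_utils.py`:

```python
def first_char_is_number(text):
    """Return True if the first character of text is a digit.

    Assumed behaviour; the real helper is not visible in this hunk.
    """
    return bool(text) and text[0].isdigit()


# "2foo" has no space after the digit, so a pass here cannot be
# explained by "2" being a separate word.
print(first_char_is_number("2foo"))
print(first_char_is_number("2 foo"))
```

Keeping both inputs as separate test cases would cover the original example as well as the one closer to the failing data.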

lokal-profil · 2017-06-29T08:30:42Z

importer/importer_utils.py

+    Get the longest string(s) in a list.
+
+    :param in_list: list of strings
+    :return: single string if there's only one with the max length,


Wouldn't it be easier to always return a list with zero, one or many entries? That way you could always iterate over the results (or use len()).
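Something along these lines, as a sketch only. The function name is guessed from the docstring in the hunk, and the list-only return value is the suggestion above rather than the current behaviour:

```python
def get_longest_string(in_list):
    """
    Get the longest string(s) in a list.

    :param in_list: list of strings
    :return: list of all strings with the max length
             (empty list if in_list is empty)
    """
    if not in_list:
        return []
    max_len = max(len(x) for x in in_list)
    return [x for x in in_list if len(x) == max_len]


# Callers can always iterate or check len(), whatever the number of hits.
for longest in get_longest_string(["bro", "kyrka", "järnvägsbro"]):
    print(longest)
```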

lokal-profil · 2017-06-29T08:32:58Z

importer/importer_utils.py

+    contains both a simple word ("bro") and its compound ("järnvägsbro"),
+    we only get the more specific one:
+        * "götaälvsbron" -> "bro"
+        * "en stor järnvägsbro" -> "järnvägsbro"


Please define the parameters in the docstring.
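For illustration, a docstring with the parameters spelled out might look like the sketch below. The function name, its signature and the list return value are assumptions, since only part of the docstring is visible in this hunk:

```python
def get_longest_match(text, keywords):
    """
    Get the longest keyword(s) that occur in a text.

    If the keyword list contains both a simple word ("bro") and its
    compound ("järnvägsbro"), only the more specific one is returned.

    :param text: string to search in, e.g. a lowercased monument name
    :param keywords: iterable of lowercased keyword strings
    :return: list of the longest keyword(s) found in text
    """
    matches = [k for k in keywords if k in text]
    if not matches:
        return []
    max_len = max(len(k) for k in matches)
    return [k for k in matches if len(k) == max_len]


print(get_longest_match("en stor järnvägsbro", ["bro", "järnvägsbro"]))
```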

lokal-profil · 2017-06-29T08:34:31Z

importer/importer.py

                   "monuments_za_(en)": {"class": ZaEn, "data_files": {}},
                   "monuments_cm_(fr)": {"class": CmFr, "data_files": {}},
+                   "monuments_at_(de)": {"class": AtDe, "data_files": {"municipalities":
+                                                                       "austria_municipalities.json",
+                                                                       "types": "at_(de)_types.json"}},


The at_(de)_types.json should be added to the repo.

Or changed to use "lookup_downloads"

lokal-profil · 2017-06-29T08:39:40Z

importer/AtDe.py

+        raw_name = self.name.lower()
+        types = self.data_files["types"]["mappings"]
+        keywords = list(types.keys())
+        keywords = [x.lower() for x in keywords]


No need to convert keys() to a list since you can iterate over them anyway and the end result will give a list.
keywords = [x.lower() for x in types.keys()]

lokal-profil · 2017-06-29T08:47:58Z

importer/AtDe.py

+        """
+        types = self.data_files["types"]["mappings"]
+        type_match = self.get_type_keyword()
+        if type_match:


If get_longest_match were changed to always return a list, the logic here could be simplified to:

    if len(type_match) == 1:
        ...  # do stuff
    else:
        self.add_to_report("type", self.name, "is")

since you are not making a reporting distinction between no matches and many matches.

lokal-profil · 2017-06-29T08:53:25Z

importer/AtDe.py

+        """
+        type_match = self.get_type_keyword()
+        municipality_name = self.get_municipality_name()
+        base_desc_german_english = "{} in {}"


maybe do these as small dicts instead

    base_desc = {
        "en": "{} in {}",
        "de": "{} in {}",
        "sv": "{} i {}"
    }
    place_desc = {
        "en": "Austria",
        "de": "Österreich",
        "sv": "Österike"
    }
    type_desc = {
        "en": "cultural property",
        "de": "Denkmalgeschütztes Objekt",
        "sv": "kulturarv"
    }

That way you should quite easily be able to loop through them in the end.
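A sketch of what that loop could look like, limited to two languages for brevity. The helper name `build_descriptions` and the fallback from municipality to country are assumptions made for the example, not the logic in `AtDe.py`:

```python
base_desc = {"en": "{} in {}", "de": "{} in {}"}
place_desc = {"en": "Austria", "de": "Österreich"}
type_desc = {"en": "cultural property", "de": "Denkmalgeschütztes Objekt"}


def build_descriptions(municipality_name=None):
    """Build one description per language from the lookup dicts.

    Assumption: use the municipality name when known, otherwise
    fall back to the country name for that language.
    """
    descriptions = {}
    for lang, template in base_desc.items():
        place = municipality_name or place_desc[lang]
        descriptions[lang] = template.format(type_desc[lang], place)
    return descriptions


for lang, desc in build_descriptions("Graz").items():
    print(lang, desc)
```

Adding another language then only means adding one key to each dict.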

lokal-profil · 2017-06-29T08:54:14Z

importer/AtDe.py

+        place_english = "Austria"
+        place_swedish = "Österike"
+        type_german = "Denkmalgeschütztes Objekt"
+        type_english = "cultural property"


Where do the English and Swedish names come from? Wondering since I would expect the English one to be "cultural heritage property"

Note: remove the English and Swedish descriptions.

lokal-profil · 2017-06-29T09:01:29Z

importer/AtDe.py

+        * 76
+          Filtering by containing only digits.
+        """
+        bad_address = None


Would it be clearer to use bad_address = False as you don't make use of the later contents of bad_address?

lokal-profil · 2017-06-29T09:03:47Z

importer/AtDe.py

+            elif self.adresse.isdigit():
+                bad_address = self.adresse
+            else:
+                address = self.adresse


Would it be clearer to have this as the else for if bad_address? Separating the finding-bad-ones logic from the report-or-add logic.

Also not sure address = self.adresse is needed since you only use it on the line below.
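Roughly what is meant, as a standalone sketch. The names `is_bad_address` and `process_address`, and the plain list and dict standing in for the report and the item's statements, are hypothetical. Only the digits-only rule is taken from the docstring in this hunk:

```python
def is_bad_address(address):
    """Detection only: decide whether an address is unusable.

    Only the digits-only rule ("76") is known from the diff;
    further checks would be added here.
    """
    return address.strip().isdigit()


def process_address(address, report, statements):
    """Report-or-add only: no detection logic in this function."""
    if is_bad_address(address):
        report.append(("address", address))
    else:
        statements["address"] = address


report, statements = [], {}
process_address("76", report, statements)
process_address("Hauptstraße 76", report, statements)
print(report, statements)
```

With a boolean check there is also no need for an intermediate `address` variable.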

lokal-profil · 2017-06-29T09:37:12Z

importer/AtDe.py

+            municip_code = str(self.gemeindekennzahl)
+            match = [x["item"]
+                     for x in all_codes if x["value"] == municip_code]
+            if len(match) == 0:


Would it be safer to do the following?

    if len(match) == 1:
        ...  # add stuff
    else:
        ...  # report
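As a sketch of why the `== 1` check is the safer one: it sends both zero matches and ambiguous multiple matches to the report. The helper name and the sample Q-ids below are made up for the example:

```python
def match_municipality(all_codes, gemeindekennzahl):
    """Return the item for a municipality code, or None if the
    code matches zero or more than one entry."""
    municip_code = str(gemeindekennzahl)
    match = [x["item"]
             for x in all_codes if x["value"] == municip_code]
    if len(match) == 1:
        return match[0]
    return None  # caller reports this instead of adding a statement


# Made-up mapping data, including a duplicated code.
all_codes = [
    {"item": "Q1", "value": "10101"},
    {"item": "Q2", "value": "20202"},
    {"item": "Q3", "value": "20202"},
]
print(match_municipality(all_codes, 10101))
print(match_municipality(all_codes, 20202))
```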

Vesihiisi added 18 commits May 20, 2017 19:55

start austria branch!

ae14f15

municip mappings

f2d8170

Merge branch 'master' into austria

9529fae

process street addr

f999a84

Merge branch 'austria' of https://github.com/Vesihiisi/COH-tools into…

7941bb3

… austria

more stuff

86bc14f

utilities

5fa297c

stuff

4a79e5d

austr mapping

0867a96

utils

c870293

broken setuptools hack

ed204e3

some horrible string matching logic

2a64a8d

extract p31 from name

4963bf3

find existing

f7b6029

descriptions

944567a

desc

6bda01a

some document

ddbda24

Vesihiisi requested a review from lokal-profil June 1, 2017 11:00

Vesihiisi added 2 commits June 7, 2017 08:44

newest from master

11b8cbd

Merge remote-tracking branch 'origin/master' into austria

84fbce5

lokal-profil requested changes Jun 29, 2017

lokal-profil reviewed Jun 29, 2017

Vesihiisi added 3 commits August 10, 2017 14:38

tox.ini, monument types

0663c6f

logic

b37c754

typo

c6171b0

lokal-profil approved these changes Aug 10, 2017

Vesihiisi merged commit 1b1e556 into master Aug 10, 2017

Vesihiisi deleted the austria branch September 7, 2017 08:50
