Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Austria monuments #64

Merged
merged 23 commits into from Aug 10, 2017
Merged

Austria monuments #64

merged 23 commits into from Aug 10, 2017

Conversation

Vesihiisi
Copy link
Owner

@Vesihiisi Vesihiisi commented May 20, 2017

Copy link
Collaborator

@lokal-profil lokal-profil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add a bad address example to the preview?

@@ -64,6 +64,12 @@ def test_last_char_is_vowel_pass(self):
def test_last_char_is_vowel_fail(self):
self.assertFalse(utils.last_char_is_vowel("bar"))

def test_first_char_is_number_true(self):
self.assertTrue(utils.first_char_is_number("2 foo"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to 2foo to make it closer to the failing example and ensure it is not related to "2" being a single word.

Get the longest string(s) in a list.

:param in_list: list of strings
:return: single string if there's only one with the max length,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't it be easier to always return a list with either zero, one or many entries? That way you could always make iterate over the results (or use len()).

contains both a simple word ("bro") and its compound ("järnvägsbro"),
we only get the more specific one:
* "götaälvsbron" -> "bro"
* "en stor järnvägsbro" -> "järnvägsbro"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

define the parameters

"monuments_za_(en)": {"class": ZaEn, "data_files": {}},
"monuments_cm_(fr)": {"class": CmFr, "data_files": {}},
"monuments_at_(de)": {"class": AtDe, "data_files": {"municipalities":
"austria_municipalities.json",
"types": "at_(de)_types.json"}},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The at_(de)_types.json should be added to the repo.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or changed to use "lookup_downloads"

raw_name = self.name.lower()
types = self.data_files["types"]["mappings"]
keywords = list(types.keys())
keywords = [x.lower() for x in keywords]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to convert keys() to a list since you can iterate over them anyway and the end result will give a list.
keywords = [x.lower() for x in types.keys()]

"""
types = self.data_files["types"]["mappings"]
type_match = self.get_type_keyword()
if type_match:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if get_longest_match was changed to always return a list the logic could here be simplified to:

if len(type_match) == 1:
   do stuff
else:
  self.add_to_report("type", self.name, "is")

since you are not making a reporting distinction between none and many matches

importer/AtDe.py Outdated
"""
type_match = self.get_type_keyword()
municipality_name = self.get_municipality_name()
base_desc_german_english = "{} in {}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe do these as small dicts instead

base_desc = {
  "en": "{} in {}",
  "de": "{} in {}",
  "sv": "{} i {}"
}
place_desc = {
  "en": "Austria",
  "de": "Österreich",
  "sv": "Österike"
}
type_desc = {
  "en": "cultural property",
  "de": "Denkmalgeschütztes Objekt",
  "sv": "kulturarv"
}

That way you should quite easily be able to loop through them in the end.

importer/AtDe.py Outdated
place_english = "Austria"
place_swedish = "Österike"
type_german = "Denkmalgeschütztes Objekt"
type_english = "cultural property"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do the English and Swedish names come from? Wondering since I would expect the English one to be "cultural heritage property"

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: rm english and swedish.

* 76
Filtering by containing only digits.
"""
bad_address = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be clearer to use bad_address = False as you don't make use of the later contents of bad_address?

elif self.adresse.isdigit():
bad_address = self.adresse
else:
address = self.adresse
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be clearer to have this as the else for if bad_address? Separating the finding-bad-ones logic from the report-or-add logic.

Also not sure address = self.adresse is needed since you only use it on the line below.

importer/AtDe.py Outdated
municip_code = str(self.gemeindekennzahl)
match = [x["item"]
for x in all_codes if x["value"] == municip_code]
if len(match) == 0:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it safer to do

if len(match) == 1:
  add stuff
else:
   report

@Vesihiisi Vesihiisi merged commit 1b1e556 into master Aug 10, 2017
@Vesihiisi Vesihiisi deleted the austria branch September 7, 2017 08:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants