Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preview #14

Merged
merged 13 commits into from Jun 13, 2017
Merged

Preview #14

merged 13 commits into from Jun 13, 2017

Conversation

lokal-profil
Copy link
Owner

@lokal-profil lokal-profil commented Jun 8, 2017

Adds the -preview_file argument which, if provided, outputs a
preview of the changes to that file rather than making the actual
changes. Can be run together with the -simulate argument for
paranoid users.

Also:

  • Fix broken import in RBD
  • Fix novalue/somevalue handling in impact type
  • Fix single value in impact type
  • Implement set_common_values fully for RBD
  • Merge process_all_rbd and process_country_rbd in RBD
  • Fix minor bugs/typos

Task: T166673

Adds the `-preview_file` argument which, if provided, outputs a
preview of the changes to that file rather than making the actual
changes. Can be run together with the `-simulate` argument for
paranoid users.

Also:
* Fix broken import in RBD
* Fix novalue/somevalue handling in impact type
* Fix single value in impact type
* Fix minor bugs/typos

@todo:
* Add tests for PreviewItem
* Handle/tasks for any @todos

Task: [T166673](https://phabricator.wikimedia.org/T166673)
This allows us to drop the country parameter which was being passed
around and to merge process_all and process_country


class PreviewItem(object):
"""Base bot to enrich Wikidata with info from WFD."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update this.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

WFD/RBD.py Outdated
# @todo: T167662
def process_single_rbd(self, data, item):
"""
Process a rbd (whether item exists or not).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an rbd; capitalize consistently as RBD since that's what's used in other places?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

standardised RBD, SWB and WFD throughout.

@@ -76,11 +80,11 @@ def set_common_values(self, data):
"""
Set and validate values shared by every SWB in the dataset.

:param data: dict of all the SWB:s in the RBD
:param data: SWB component of thexml data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the xml

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -95,8 +99,9 @@ def process_all_swb(self, data):

Only increments counter when an swb is updated.

:param data: dict of all the swb:s in the RBD
:param data: list of all the swb:s in the RBD
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But since you then run helpers.listify on it, it means that something else than a list can be passed here?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is due to the source data changing one item lists to just the item. Clarifying.

else:
raise ValueError(
'Sorry but "{}" is not a recognized special '
'value/snacktype.'.format(wd_id))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

snaktype

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -43,9 +47,11 @@
class RbdBot(WfdBot):
"""Bot to enrich/create info on Wikidata for RBD objects."""

def __init__(self, mappings, year, new=False, cutoff=None):
def __init__(self, mappings, year, new=False, cutoff=None,
preview_file=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use some default path if none given? At least for me it was the most comfortable option, and could be used after moving this to Wikidatastuff, e.g. using the format name of dataset + timestamp.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue here is that I use -preview_file to control if a preview is generated rather than an upload. If I set a default then I would need to separate out the trigger mechanism into one more commandline argument.

@@ -311,6 +335,8 @@ def validate_indata(data, mappings):
"""
# validate mapping of each <surfaceWaterBodyCategory> and
# <swSignificantImpactType>
pywikibot.output('Target file containes {} entries, validating...'.format(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

contains

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

"""
Create a label(s) preview.

The first label is italicised (to indicate it is the preferred label
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the presentation of labels / aliases -- the current format works, but I think putting every label / alias on a separate line would make it even more visually clear. I'm saying this with the future generalization in Wikidatastuff in mind (multiple or long labels), but even here when the labels are long it can get hard to read:

en: 2. Bothnian Sea (International drainage basin Trondelagsfylkene - Sweden) | SE1102

If you happen to have an alias that's equally long, or two aliases, they might get hard to parse when they're all on one line.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I broke this out as T167791. Separating per line is probably the best solution but I want to ensure it works (without swamping the preview page) even when there are multiple languages.

if self.item:
return PreviewItem.make_wikidata_template(self.item)
else:
return 'New item created'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this isn't visual noise. Leaving this blank works as well, and is easier to parse by the reader.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be right but I'm loath to leave it completely blank.Changed it to . Would that work?

"""
Make a wikidata template for items and properties.

:param item: a Q/P prefixed item/property id or a ItemPage/PropertyPage
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter is called wd_entry here.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@lokal-profil
Copy link
Owner Author

@lokal-profil lokal-profil merged commit 002c103 into master Jun 13, 2017
@lokal-profil lokal-profil deleted the preview branch June 13, 2017 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants