New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Preview #14
Preview #14
Conversation
Adds the `-preview_file` argument which, if provided, outputs a preview of the changes to that file rather than making the actual changes. Can be run together with the `-simulate` argument for paranoid users. Also: * Fix broken import in RBD * Fix novalue/somevalue handling in impact type * Fix single value in impact type * Fix minor bugs/typos @todo: * Add tests for PreviewItem * Handle/tasks for any @todos Task: [T166673](https://phabricator.wikimedia.org/T166673)
This allows us to drop the country parameter which was being passed around and to merge process_all and process_country
WFD/PreviewItem.py
Outdated
|
||
|
||
class PreviewItem(object): | ||
"""Base bot to enrich Wikidata with info from WFD.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
WFD/RBD.py
Outdated
# @todo: T167662 | ||
def process_single_rbd(self, data, item): | ||
""" | ||
Process a rbd (whether item exists or not). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an rbd; capitalize consistently as RBD since that's what's used in other places?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
standardised RBD, SWB and WFD throughout.
WFD/swb_import.py
Outdated
@@ -76,11 +80,11 @@ def set_common_values(self, data): | |||
""" | |||
Set and validate values shared by every SWB in the dataset. | |||
|
|||
:param data: dict of all the SWB:s in the RBD | |||
:param data: SWB component of thexml data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the xml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
WFD/swb_import.py
Outdated
@@ -95,8 +99,9 @@ def process_all_swb(self, data): | |||
|
|||
Only increments counter when an swb is updated. | |||
|
|||
:param data: dict of all the swb:s in the RBD | |||
:param data: list of all the swb:s in the RBD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But since you then run helpers.listify on it, it means that something else than a list can be passed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is due to the source data changing one item lists to just the item. Clarifying.
WFD/PreviewItem.py
Outdated
else: | ||
raise ValueError( | ||
'Sorry but "{}" is not a recognized special ' | ||
'value/snacktype.'.format(wd_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
snaktype
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -43,9 +47,11 @@ | |||
class RbdBot(WfdBot): | |||
"""Bot to enrich/create info on Wikidata for RBD objects.""" | |||
|
|||
def __init__(self, mappings, year, new=False, cutoff=None): | |||
def __init__(self, mappings, year, new=False, cutoff=None, | |||
preview_file=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use some default path if none given? At least for me it was the most comfortable option, and could be used after moving this to Wikidatastuff, e.g. using the format name of dataset + timestamp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue here is that I use -preview_file
to control if a preview is generated rather than an upload. If I set a default then I would need to separate out the trigger mechanism into one more commandline argument.
WFD/swb_import.py
Outdated
@@ -311,6 +335,8 @@ def validate_indata(data, mappings): | |||
""" | |||
# validate mapping of each <surfaceWaterBodyCategory> and | |||
# <swSignificantImpactType> | |||
pywikibot.output('Target file containes {} entries, validating...'.format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
contains
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
""" | ||
Create a label(s) preview. | ||
|
||
The first label is italicised (to indicate it is the preferred label |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for the presentation of labels / aliases -- the current format works, but I think putting every label / alias on a separate line would make it even more visually clear. I'm saying this with the future generalization in Wikidatastuff in mind (multiple or long labels), but even here when the labels are long it can get hard to read:
en: 2. Bothnian Sea (International drainage basin Trondelagsfylkene - Sweden) | SE1102
If you happen to have an alias that's equally long, or two aliases, they might get hard to parse when they're all on one line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I broke this out as T167791. Separating per line is probably the best solution but I want to ensure it works (without swamping the preview page) even when there are multiple languages.
WFD/PreviewItem.py
Outdated
if self.item: | ||
return PreviewItem.make_wikidata_template(self.item) | ||
else: | ||
return 'New item created' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this isn't visual noise. Leaving this blank works as well, and is easier to parse by the reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might be right but I'm loath to leave it completely blank.Changed it to –
. Would that work?
WFD/PreviewItem.py
Outdated
""" | ||
Make a wikidata template for items and properties. | ||
|
||
:param item: a Q/P prefixed item/property id or a ItemPage/PropertyPage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter is called wd_entry
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Adds the
-preview_file
argument which, if provided, outputs apreview of the changes to that file rather than making the actual
changes. Can be run together with the
-simulate
argument forparanoid users.
Also:
novalue
/somevalue
handling in impact typeset_common_value
s fully for RBDprocess_all_rbd
andprocess_country_rbd
in RBDTask: T166673