# Mix'n'match Mismatch Generation

This notebook generates mismatches for [Mismatch Finder](https://www.wikidata.org/wiki/Wikidata:Mismatch_Finder) from data requested from the [Mix'n'match](https://meta.wikimedia.org/wiki/Mix%27n%27match) data stores. The data is then formatted for upload following the [directions for creating a mismatch file](https://github.com/wmde/wikidata-mismatch-finder/blob/main/docs/UserGuide.md#creating-a-mismatches-import-file).

In [1]:
# pip install jupyter-black

In [2]:
%load_ext jupyter_black

In [3]:
import json
import sys
import urllib.request

PATH_TO_UTILS = "../"  # change based on your directory structure
sys.path.append(PATH_TO_UTILS)

from utils import check_mf_formatting

## Get Data

In [4]:
mnm_mismatch_request_url = (
    "https://mix-n-match.toolforge.org/api.php?query=all_issues&mode=time_mismatch"
)

In [5]:
with urllib.request.urlopen(mnm_mismatch_request_url) as url:
    mnm_mismatch_data = json.load(url)
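The request above has no timeout and assumes the response always carries a `data` list. A minimal sketch of a more defensive variant follows; the helper name `fetch_mnm_issues` and the 60-second default are assumptions for illustration, not part of the Mix'n'match API.

```python
import json
import urllib.error
import urllib.request


def fetch_mnm_issues(url, timeout=60):
    """Fetch a JSON response and return its "data" list (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            payload = json.load(response)
    except urllib.error.URLError as err:
        # Covers connection problems and HTTP error statuses.
        raise RuntimeError(f"Request to {url} failed: {err}") from err

    # Assumption: a usable response is a JSON object with a "data" list,
    # as in the sample output shown in this notebook.
    if not isinstance(payload, dict) or not isinstance(payload.get("data"), list):
        raise ValueError("Response does not contain a 'data' list")
    return payload["data"]
```

It could be called as `fetch_mnm_issues(mnm_mismatch_request_url)`, which returns the list of issues directly rather than the enclosing dictionary.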

In [6]:
print(f"{len(mnm_mismatch_data['data']):,}")

82,996


In [7]:
mnm_mismatch_data["data"][:2]

[{'issue_id': '85584',
  'entry_id': '44032422',
  'time_mismatch': {'prop': 'P569',
   'wd_time': '+1925-01-01T00:00:00Z',
   'mnm_time': '+1926-07-04T00:00:00Z',
   'q': 'Q329124'}},
 {'issue_id': '564195',
  'entry_id': '115714460',
  'time_mismatch': {'prop': 'P569',
   'wd_time': '+1998-09-19T00:00:00Z',
   'mnm_time': '+1987-04-17T00:00:00Z',
   'q': 'Q107654539'}}]
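To get a feel for how the two dates in each issue disagree, the time strings can be split into their parts. This is only a sketch: it assumes every value follows the `+YYYY-MM-DDT00:00:00Z` shape seen in the sample above, and the helper names are hypothetical. The response shown here does not include a precision, so a difference in month or day may only reflect one side being known to the year.

```python
import re

# Sign, year, month and day of strings such as "+1925-01-01T00:00:00Z".
_WD_TIME = re.compile(r"^([+-])(\d+)-(\d{2})-(\d{2})T")


def split_wd_time(value):
    """Return (year, month, day) as integers from a time string."""
    match = _WD_TIME.match(value)
    if match is None:
        raise ValueError(f"Unrecognised time string: {value!r}")
    sign, year, month, day = match.groups()
    signed_year = int(year) if sign == "+" else -int(year)
    return signed_year, int(month), int(day)


def first_differing_part(wd_time, mnm_time):
    """Name the coarsest date part that differs, or None if the dates agree."""
    parts = zip(
        ("year", "month", "day"),
        split_wd_time(wd_time),
        split_wd_time(mnm_time),
    )
    for name, wd_part, mnm_part in parts:
        if wd_part != mnm_part:
            return name
    return None
```

For the first issue above, `first_differing_part("+1925-01-01T00:00:00Z", "+1926-07-04T00:00:00Z")` reports that the years already differ.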

In [8]:
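# Reshape each issue: add a source URL, drop the issue ID, and rename the
# property and item keys to "pid" and "qid".
# Note: the dictionaries are modified in place, so re-run the request cell
# before re-running this one.
mnm_mismatch_data_expanded = []
for d in mnm_mismatch_data["data"]:
    # Link back to the Mix'n'match entry that the mismatch comes from.
    d["source"] = f"https://mix-n-match.toolforge.org/#/entry/{d['entry_id']}"
    d.pop("issue_id", None)
    d["time_mismatch"]["pid"] = d["time_mismatch"].pop("prop")
    d["time_mismatch"]["qid"] = d["time_mismatch"].pop("q")

    mnm_mismatch_data_expanded.append(d)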

In [9]:
mnm_mismatch_data_expanded[:2]

[{'entry_id': '44032422',
  'time_mismatch': {'wd_time': '+1925-01-01T00:00:00Z',
   'mnm_time': '+1926-07-04T00:00:00Z',
   'pid': 'P569',
   'qid': 'Q329124'},
  'source': 'https://mix-n-match.toolforge.org/#/entry/44032422'},
 {'entry_id': '115714460',
  'time_mismatch': {'wd_time': '+1998-09-19T00:00:00Z',
   'mnm_time': '+1987-04-17T00:00:00Z',
   'pid': 'P569',
   'qid': 'Q107654539'},
  'source': 'https://mix-n-match.toolforge.org/#/entry/115714460'}]
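One possible next step is to flatten each expanded entry into a row for the mismatch import file. The sketch below is an assumption-laden illustration: the column names are taken from memory of the Mismatch Finder user guide linked at the top and should be checked against it, `statement_guid` and `meta_wikidata_value` are left blank because the Mix'n'match response shown here does not contain them, the `"statement"` value for `type` is likewise an assumption, and the helper names are hypothetical.

```python
import csv
import io

# Assumed column order; verify against the Mismatch Finder user guide.
MF_COLUMNS = [
    "item_id",
    "statement_guid",
    "property_id",
    "wikidata_value",
    "meta_wikidata_value",
    "external_value",
    "external_url",
    "type",
]


def to_mismatch_row(entry):
    """Flatten one expanded Mix'n'match entry into a flat row dictionary."""
    time_mismatch = entry["time_mismatch"]
    return {
        "item_id": time_mismatch["qid"],
        "statement_guid": "",  # not in the response; would need a lookup
        "property_id": time_mismatch["pid"],
        "wikidata_value": time_mismatch["wd_time"],
        "meta_wikidata_value": "",  # not in the response; left blank here
        "external_value": time_mismatch["mnm_time"],
        "external_url": entry["source"],
        "type": "statement",  # assumption
    }


def rows_to_csv(entries):
    """Serialise expanded entries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=MF_COLUMNS)
    writer.writeheader()
    for entry in entries:
        writer.writerow(to_mismatch_row(entry))
    return buffer.getvalue()
```

Calling `rows_to_csv(mnm_mismatch_data_expanded[:2])` would give a quick preview of the flattened rows before any further checks are applied.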