# Compare Alma and HM Journal Collection Lists

## 1. Upload the Title Lists to OpenRefine
### 1. Upload the Alma title list
1. Find the electronic collection in the institution zone
2. Select the portfolio list
3. Download an extended export of the portfolio list
4. Upload the downloaded Excel file to OpenRefine
5. Enter the name of the project in the cell below and run the cell

In [6]:
Alma_project = "EAN-Alma"

### 2. Upload the HM title list
1. Find the package in HM
2. Go to the downloads page, select a package download, and select the package
3. When the status is completed, download the title list
4. Upload the downloaded Excel file to OpenRefine
5. Enter the name of the project in the cell below and run the cell

In [7]:
HM_project = "EAN-HM"

## 2. Reformat the Title Lists
### 1. Reformat the Alma title list
1. Apply "Prepare_Alma_Title_List.json" to the Alma project in OpenRefine
### 2. Reformat the HM title list
1. Apply "Prepare_HM_Title_List.json" to the HM project in OpenRefine

## 3. Match Titles in the Two Projects
### 1. Find ISSN matches
1. Run the cell below, then apply the output to the HM project

In [8]:
JSON_Step = f"""
[
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "ISSN",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"All ISSN\\\").cells.Portfolio_ID.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Portfolio_ID",
        "columnInsertIndex": 1,
        "description": "Create column ``Portfolio_ID`` with the portfolio ID for the ISSN in that row"
    }}
]
"""

print(JSON_Step)


[
    {
        "op": "core/column-addition",
        "engineConfig": {
            "facets": [],
            "mode": "record-based"
        },
        "baseColumnName": "ISSN",
        "expression": "grel:cell.cross(\"EAN-Alma\",\"All ISSN\").cells.Portfolio_ID.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Portfolio_ID",
        "columnInsertIndex": 1,
        "description": "Create column ``Portfolio_ID`` with the portfolio ID for the ISSN in that row"
    }
]
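
The f-string above needs doubled braces and several layers of escaped quotes. As a minimal sketch of an alternative, assuming the same operation fields that appear in the printed output, the step could be assembled as a Python dictionary and serialized with `json.dumps`, which handles the escaping. The function name `cross_lookup_step` and its parameters are hypothetical and not part of this notebook.

```python
import json


def cross_lookup_step(project, key_column, value_column,
                      base_column, new_column, description):
    """Return operation-history JSON for one cell.cross lookup (hypothetical helper)."""
    # Plain quotes are fine here; json.dumps escapes them when serializing.
    expression = (
        f'grel:cell.cross("{project}","{key_column}")'
        f'.cells.{value_column}.value[0]'
    )
    step = {
        "op": "core/column-addition",
        "engineConfig": {"facets": [], "mode": "record-based"},
        "baseColumnName": base_column,
        "expression": expression,
        "onError": "set-to-blank",
        "newColumnName": new_column,
        "columnInsertIndex": 1,
        "description": description,
    }
    return json.dumps([step], indent=4)


issn_step = cross_lookup_step(
    "EAN-Alma", "All ISSN", "Portfolio_ID", "ISSN", "Portfolio_ID",
    "Create column ``Portfolio_ID`` with the portfolio ID for the ISSN in that row",
)
json.loads(issn_step)  # raises ValueError if the text is not valid JSON
print(issn_step)
```

Parsing the text back with `json.loads` before pasting it into OpenRefine is a cheap way to catch a malformed step.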



2. Apply "Find_ISSN_Matches.json" to the HM project in OpenRefine
3. Set a duplicates filter on `Portfolio_ID` to confirm that there aren't any duplicate portfolio IDs
> If there are, then... (what to do when ISSNs of different titles match a single portfolio is TBD)
4. Switch back to the Alma project
5. Run the cell below, then apply the output to the Alma project

In [9]:
JSON_Step = f"""
[
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{HM_project}\\\",\\\"Portfolio_ID\\\").cells.KBID.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Matched",
        "columnInsertIndex": 1,
        "description": "Create column ``Matched`` with the KBID matched to that portfolio ID"
    }}
]
"""

print(JSON_Step)


[
    {
        "op": "core/column-addition",
        "engineConfig": {
            "facets": [],
            "mode": "record-based"
        },
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\"EAN-HM\",\"Portfolio_ID\").cells.KBID.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Matched",
        "columnInsertIndex": 1,
        "description": "Create column ``Matched`` with the KBID matched to that portfolio ID"
    }
]



6. In rows mode, blank down `Portfolio_ID`
7. In records mode, set a blanks filter on `Matched` to true to check if any titles haven't been matched
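
To illustrate what the `cell.cross` lookup and the duplicates filter are checking, here is a small Python sketch on made-up rows. It is not OpenRefine code: the sample data, the function names `match_first` and `duplicated`, and the identifiers `P1`, `K1`, and so on are all assumptions for illustration.

```python
from collections import Counter


def match_first(rows, key, lookup_rows, lookup_key, lookup_value):
    """Roughly mimic cell.cross(...).cells.<column>.value[0]:
    the first matching value from the other project, or None."""
    index = {}
    for lookup_row in lookup_rows:
        index.setdefault(lookup_row[lookup_key], []).append(lookup_row[lookup_value])
    matches = []
    for row in rows:
        found = index.get(row[key]) if row[key] else None
        matches.append(found[0] if found else None)
    return matches


def duplicated(values):
    """Return the non-blank values that occur more than once."""
    counts = Counter(v for v in values if v is not None)
    return sorted(v for v, n in counts.items() if n > 1)


# Hypothetical rows: one Alma portfolio carries two ISSNs.
alma_rows = [
    {"All ISSN": "1234-5678", "Portfolio_ID": "P1"},
    {"All ISSN": "8765-4321", "Portfolio_ID": "P1"},
    {"All ISSN": "2345-6789", "Portfolio_ID": "P2"},
]
hm_rows = [
    {"ISSN": "1234-5678", "KBID": "K1"},
    {"ISSN": "8765-4321", "KBID": "K2"},
    {"ISSN": "0000-0000", "KBID": "K3"},
]

portfolio_ids = match_first(hm_rows, "ISSN", alma_rows, "All ISSN", "Portfolio_ID")
print(portfolio_ids)              # ['P1', 'P1', None]
print(duplicated(portfolio_ids))  # ['P1']
```

In this made-up data, two HM titles land on the same portfolio, which is the situation the duplicates filter on `Portfolio_ID` is meant to surface. The third HM title has no ISSN match and would be left for title matching.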
### 2. Find remaining matches via title matches
1. Apply "Find_Title_Matches_1.json" to the Alma project
2. Switch to the HM project
3. Apply "Find_Title_Matches_2.json" to the HM project
4. Run the cell below, then apply the output to the HM project

In [19]:
JSON_Step = f"""
[
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "record-based"
        }},
        "baseColumnName": "All_Titles",
        "expression": "grel:split(if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt1\\\").cells.ID.value),\\\"\\\",cell.cross(\\\"{Alma_project}\\\",\\\"Alt1\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt2\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt2\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt3\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt3\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt5\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt5\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt7\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt7\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value.join(\\\"|\\\")),\\\"|\\\").uniques().join(\\\"|\\\")",
        "onError": "set-to-blank",
        "newColumnName": "ID1",
        "columnInsertIndex": 6,
        "description": "Create column ``ID1`` with all the portfolio IDs in the rows with titles matching the values in column ``All_Titles`` separated by pipes"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "record-based"
        }},
        "baseColumnName": "Alt_Titles",
        "expression": "grel:split(if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value),\\\"\\\",cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value.join(\\\"|\\\")),\\\"|\\\").uniques().join(\\\"|\\\")",
        "onError": "set-to-blank",
        "newColumnName": "ID2",
        "columnInsertIndex": 6,
        "description": "Create column ``ID2`` with all the portfolio IDs in the rows with titles matching the values in column ``Alt_Titles`` separated by pipes"
    }},
    {{
        "op": "core/text-transform",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "columnName": "All_Titles",
        "expression": "value.replace(/[‘’‚‛‹›‚]/,\\\"\\\\\'\\\").replace(/[“”«»„]/,\\\"\\\\\\"\\\")",
        "onError": "keep-original",
        "repeat": false,
        "repeatCount": 10,
        "description": "Replace smart quotes in column ``All_Titles``"
    }},
    {{
        "op": "core/text-transform",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "columnName": "Alt_Titles",
        "expression": "value.replace(/[‘’‚‛‹›‚]/,\\\"\\\\\'\\\").replace(/[“”«»„]/,\\\"\\\\\\"\\\")",
        "onError": "keep-original",
        "repeat": false,
        "repeatCount": 10,
        "description": "Replace smart quotes in column ``Alt_Titles``"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "record-based"
        }},
        "baseColumnName": "All_Titles",
        "expression": "grel:split(if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt1\\\").cells.ID.value),\\\"\\\",cell.cross(\\\"{Alma_project}\\\",\\\"Alt1\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt2\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt2\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt3\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt3\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt5\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt5\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt7\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt7\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value.join(\\\"|\\\")),\\\"|\\\").uniques().join(\\\"|\\\")",
        "onError": "set-to-blank",
        "newColumnName": "ID3",
        "columnInsertIndex": 6,
        "description": "Create column ``ID3`` with all the portfolio IDs in the rows with titles matching the values in column ``All_Titles`` separated by pipes"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "record-based"
        }},
        "baseColumnName": "Alt_Titles",
        "expression": "grel:split(if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value),\\\"\\\",cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value.join(\\\"|\\\")),\\\"|\\\").uniques().join(\\\"|\\\")",
        "onError": "set-to-blank",
        "newColumnName": "ID4",
        "columnInsertIndex": 6,
        "description": "Create column ``ID4`` with all the portfolio IDs in the rows with titles matching the values in column ``Alt_Titles`` separated by pipes"
    }},
    {{
        "op": "core/text-transform",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "columnName": "All_Titles",
        "expression": "value.replace(\\\"-\\\",\\\" \\\").replace(\\\"&\\\",\\\"And\\\").toTitlecase()",
        "onError": "keep-original",
        "repeat": false,
        "repeatCount": 10,
        "description": "Change column ``All_Titles`` so the titles are in titlecase with hyphens replaced by spaces and ampersands replaced by `And`"
    }},
    {{
        "op": "core/text-transform",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "columnName": "Alt_Titles",
        "expression": "value.replace(\\\"-\\\",\\\" \\\").replace(\\\"&\\\",\\\"And\\\").toTitlecase()",
        "onError": "keep-original",
        "repeat": false,
        "repeatCount": 10,
        "description": "Change column ``Alt_Titles`` so the titles are in titlecase with hyphens replaced by spaces and ampersands replaced by `And`"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "record-based"
        }},
        "baseColumnName": "All_Titles",
        "expression": "grel:split(if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt1\\\").cells.ID.value),\\\"\\\",cell.cross(\\\"{Alma_project}\\\",\\\"Alt1\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt2\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt2\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt3\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt3\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt5\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt5\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt7\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt7\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value.join(\\\"|\\\")),\\\"|\\\").uniques().join(\\\"|\\\")",
        "onError": "set-to-blank",
        "newColumnName": "ID5",
        "columnInsertIndex": 6,
        "description": "Create column ``ID5`` with all the portfolio IDs in the rows with titles matching the values in column ``All_Titles`` separated by pipes"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "record-based"
        }},
        "baseColumnName": "Alt_Titles",
        "expression": "grel:split(if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value),\\\"\\\",cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value.join(\\\"|\\\")),\\\"|\\\").uniques().join(\\\"|\\\")",
        "onError": "set-to-blank",
        "newColumnName": "ID6",
        "columnInsertIndex": 6,
        "description": "Create column ``ID6`` with all the portfolio IDs in the rows with titles matching the values in column ``Alt_Titles`` separated by pipes"
    }},
    {{
        "op": "core/text-transform",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "columnName": "All_Titles",
        "expression": "grel:value.replace(\\\".\\\",\\\"\\\").replace(\\\",\\\",\\\"\\\").trim()",
        "onError": "keep-original",
        "repeat": false,
        "repeatCount": 10,
        "description": "Change column ``All_Titles`` to remove periods and commas"
    }},
    {{
        "op": "core/text-transform",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "columnName": "Alt_Titles",
        "expression": "grel:value.replace(\\\".\\\",\\\"\\\").replace(\\\",\\\",\\\"\\\").trim()",
        "onError": "keep-original",
        "repeat": false,
        "repeatCount": 10,
        "description": "Change column ``Alt_Titles`` to remove periods and commas"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "record-based"
        }},
        "baseColumnName": "All_Titles",
        "expression": "grel:split(if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt1\\\").cells.ID.value),\\\"\\\",cell.cross(\\\"{Alma_project}\\\",\\\"Alt1\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt2\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt2\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt3\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt3\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt5\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt5\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt7\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt7\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value.join(\\\"|\\\")),\\\"|\\\").uniques().join(\\\"|\\\")",
        "onError": "set-to-blank",
        "newColumnName": "ID7",
        "columnInsertIndex": 6,
        "description": "Create column ``ID7`` with all the portfolio IDs in the rows with titles matching the values in column ``All_Titles`` separated by pipes"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "record-based"
        }},
        "baseColumnName": "Alt_Titles",
        "expression": "grel:split(if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value),\\\"\\\",cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value.join(\\\"|\\\")),\\\"|\\\").uniques().join(\\\"|\\\")",
        "onError": "set-to-blank",
        "newColumnName": "ID8",
        "columnInsertIndex": 6,
        "description": "Create column ``ID8`` with all the portfolio IDs in the rows with titles matching the values in column ``Alt_Titles`` separated by pipes"
    }},
    {{
        "op": "core/text-transform",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "columnName": "All_Titles",
        "expression": "grel:if(isNull(value.match(/^(The|the|An|an|A|a)\\\\s(.*)/)),value,value.match(/^(The|the|An|an|A|a)\\\\s(.*)/)[1])",
        "onError": "keep-original",
        "repeat": false,
        "repeatCount": 10,
        "description": "Remove `A`, `An`, and `The` from the beginning of ``All_Titles``"
    }},
    {{
        "op": "core/text-transform",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "columnName": "Alt_Titles",
        "expression": "grel:if(isNull(value.match(/^(The|the|An|an|A|a)\\\\s(.*)/)),value,value.match(/^(The|the|An|an|A|a)\\\\s(.*)/)[1])",
        "onError": "keep-original",
        "repeat": false,
        "repeatCount": 10,
        "description": "Remove `A`, `An`, and `The` from the beginning of ``Alt_Titles``"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "record-based"
        }},
        "baseColumnName": "All_Titles",
        "expression": "grel:split(if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt1\\\").cells.ID.value),\\\"\\\",cell.cross(\\\"{Alma_project}\\\",\\\"Alt1\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt2\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt2\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt3\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt3\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt5\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt5\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt7\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt7\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value.join(\\\"|\\\")),\\\"|\\\").uniques().join(\\\"|\\\")",
        "onError": "set-to-blank",
        "newColumnName": "ID9",
        "columnInsertIndex": 6,
        "description": "Create column ``ID9`` with all the portfolio IDs in the rows with titles matching the values in column ``All_Titles`` separated by pipes"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "record-based"
        }},
        "baseColumnName": "Alt_Titles",
        "expression": "grel:split(if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value),\\\"\\\",cell.cross(\\\"{Alma_project}\\\",\\\"Alt4\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt6\\\").cells.ID.value.join(\\\"|\\\"))+if(isNull(cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value),\\\"\\\",\\\"|\\\"+cell.cross(\\\"{Alma_project}\\\",\\\"Alt8\\\").cells.ID.value.join(\\\"|\\\")),\\\"|\\\").uniques().join(\\\"|\\\")",
        "onError": "set-to-blank",
        "newColumnName": "ID10",
        "columnInsertIndex": 6,
        "description": "Create column ``ID10`` with all the portfolio IDs in the rows with titles matching the values in column ``Alt_Titles`` separated by pipes"
    }}
]
"""

print(JSON_Step)


[
    {
        "op": "core/column-addition",
        "engineConfig": {
            "facets": [
                {
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {
                            "v": {
                                "v": false,
                                "l": "false"
                            }
                        }
                    ],
                    "selectBlank": false,
                    "selectError": false
                }
            ],
            "mode": "record-based"
        },
        "baseColumnName": "All_Titles",
        "expression": "grel:split(if(isNull(cell.cross(\"EAN-Alma\",\"Alt1\").cells.ID.value),\"\",cell.cross(\"EAN-Alma\

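The long `cell.cross` expressions in the cell above follow one repeating pattern: look up each `Alt` column in the Alma project, join any IDs found with pipes, then split, deduplicate, and rejoin. As a sketch under that reading, the expression text could be generated rather than written out by hand. The function name `cross_ids_expression` is hypothetical, and the result is the unescaped GREL string, so it would still need to go through `json.dumps` before being used in an operation.

```python
import json


def cross_ids_expression(project, key_columns, value_column="ID"):
    """Build the pipe-joined lookup expression for the given key columns
    (hypothetical helper mirroring the pattern used in the cell above)."""
    parts = []
    for position, column in enumerate(key_columns):
        lookup = f'cell.cross("{project}","{column}").cells.{value_column}.value'
        # Every lookup after the first is prefixed with a pipe separator.
        separator = "" if position == 0 else '"|"+'
        parts.append(f'if(isNull({lookup}),"",{separator}{lookup}.join("|"))')
    return 'grel:split(' + "+".join(parts) + ',"|").uniques().join("|")'


all_titles_expression = cross_ids_expression(
    "EAN-Alma", [f"Alt{n}" for n in range(1, 9)]
)
alt_titles_expression = cross_ids_expression("EAN-Alma", ["Alt4", "Alt6", "Alt8"])

# json.dumps adds the backslash escaping needed inside the JSON step.
print(json.dumps(alt_titles_expression))
```

Generating the text this way keeps the `All_Titles` and `Alt_Titles` variants in sync, since they differ only in which `Alt` columns are listed.
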
5. Apply "Find_Title_Matches_3.json" to the HM project
6. Run the cell below, then apply the output to the HM project

In [21]:
JSON_Step = f"""
[
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }},
                {{
                    "type": "list",
                    "name": "Without_Subtitles",
                    "expression": "isBlank(value)",
                    "columnName": "Without_Subtitles",
                    "invert": false,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "record-based"
        }},
        "baseColumnName": "With_Subtitles",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Title.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_Title_1",
        "columnInsertIndex": 4,
        "description": "Create column ``Alma_Title_1`` with the Alma titles for the portfolio IDs in ``With_Subtitles`` for the records that don't already have a portfolio ID"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }},
                {{
                    "type": "list",
                    "name": "Without_Subtitles",
                    "expression": "isBlank(value)",
                    "columnName": "Without_Subtitles",
                    "invert": false,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }},
                {{
                    "type": "list",
                    "name": "With_Subtitles",
                    "expression": "isBlank(value)",
                    "columnName": "With_Subtitles",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "record-based"
        }},
        "baseColumnName": "Without_Subtitles",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Title.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_Title_2",
        "columnInsertIndex": 5,
        "description": "Create column ``Alma_Title_2`` with the Alma titles for the portfolio IDs in ``Without_Subtitles`` for the records that don't already have a portfolio ID and where no portfolio IDs were matched based on the title with the subtitle/parenthetical"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": false,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": false,
                                "l": "false"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }},
                {{
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "facetCount(value, 'value', 'Portfolio_ID') > 1",
                    "columnName": "Portfolio_ID",
                    "invert": false,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {{
                            "v": {{
                                "v": true,
                                "l": "true"
                            }}
                        }}
                    ],
                    "selectBlank": false,
                    "selectError": false
                }}
            ],
            "mode": "row-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Title.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "temp",
        "columnInsertIndex": 2,
        "description": "Create column ``temp`` with the Alma title values of the duplicate portfolio IDs in ``Portfolio_ID``"
    }}
]
"""

print(JSON_Step)


[
    {
        "op": "core/column-addition",
        "engineConfig": {
            "facets": [
                {
                    "type": "list",
                    "name": "Portfolio_ID",
                    "expression": "isBlank(value)",
                    "columnName": "Portfolio_ID",
                    "invert": true,
                    "omitBlank": false,
                    "omitError": false,
                    "selection": [
                        {
                            "v": {
                                "v": false,
                                "l": "false"
                            }
                        }
                    ],
                    "selectBlank": false,
                    "selectError": false
                },
                {
                    "type": "list",
                    "name": "Without_Subtitles",
                    "expression": "isBlank(value)",
                    "columnName": "Without_Subtitles",
           

7. Apply "Find_Title_Matches_4.json" to the HM project
8. Manually remove all but the relevant portfolio ID from the `Alma_Title_1` and `Alma_Title_2` columns
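
Because the generated steps are pasted into OpenRefine as JSON, it can help to confirm that a `JSON_Step` string parses before applying it; a stray trailing comma inside a facet list, for example, makes the whole array invalid. The following is a minimal sketch using only Python's standard `json` module. The helper name `check_operations` and the two example strings are hypothetical and are not part of the original notebook.

```python
import json


def check_operations(json_text):
    """Parse an OpenRefine operations string and return the list of operations.

    Raises ValueError naming the line and column of the first syntax error.
    """
    try:
        operations = json.loads(json_text)
    except json.JSONDecodeError as error:
        raise ValueError(
            f"Invalid JSON at line {error.lineno}, column {error.colno}: {error.msg}"
        ) from error
    if not isinstance(operations, list):
        raise ValueError("Expected a JSON array of operations")
    return operations


# Hypothetical example strings, not taken from the notebook's real output
valid_step = '[{"op": "core/column-addition", "newColumnName": "temp"}]'
invalid_step = '[{"op": "core/column-addition", "newColumnName": "temp"},]'

print(len(check_operations(valid_step)))  # number of operations parsed

try:
    check_operations(invalid_step)
except ValueError as error:
    print(error)  # reports where the trailing comma breaks the array
```

In the notebook, calling `check_operations(JSON_Step)` right after the f-string is built would surface such errors before the output is copied into OpenRefine.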

## 4. Compare Title Data
### 1. Move Alma data to the HM project
1. Join all multi-valued cells in `All ISSN` by pipes
2. Switch to the HM project
3. Apply "Move_Alma_Data_to_HM.json" to the HM project
> This nulls the `Portfolio_ID` column for any records with multiple matches that haven't been resolved
4. Run the cell below, then apply the output to the HM project

In [5]:
JSON_Step = f"""
[
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells[\\\"MMS (Title ID)\\\"].value[0]",
        "onError": "set-to-blank",
        "newColumnName": "MMS (Title ID)",
        "columnInsertIndex": 2,
        "description": "Create column ``MMS (Title ID)`` with the MMS for the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells[\\\"All ISSN\\\"].value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_ISSN",
        "columnInsertIndex": 4,
        "description": "Create column ``Alma_ISSN`` with the ISSN for the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Title.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_Title",
        "columnInsertIndex": 6,
        "description": "Create column ``Alma_Title`` with the title for the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.URL.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_URL",
        "columnInsertIndex": 11,
        "description": "Create column ``Alma_URL`` with the URL for the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.URL_Identifier.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "URL_Identifier",
        "columnInsertIndex": 12,
        "description": "Create column ``URL_Identifier`` with the URL parameter Alma uses to identify the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Publisher.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_Publisher",
        "columnInsertIndex": 14,
        "description": "Create column ``Alma_Publisher`` with the publisher for the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Selected_Coverage_Statement.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_Dates_Source",
        "columnInsertIndex": 15,
        "description": "Create column ``Alma_Dates_Source`` with the source of holdings dates selected for the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Default_Start_Date.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_Default_Start_Date",
        "columnInsertIndex": 16,
        "description": "Create column ``Alma_Default_Start_Date`` with the default/global start date for the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Default_End_Date.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_Default_End_Date",
        "columnInsertIndex": 18,
        "description": "Create column ``Alma_Default_End_Date`` with the default/global end date for the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Local_Start_Date.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_Local_Start_Date",
        "columnInsertIndex": 20,
        "description": "Create column ``Alma_Local_Start_Date`` with the local start date for the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Local_End_Date.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_Local_End_Date",
        "columnInsertIndex": 22,
        "description": "Create column ``Alma_Local_End_Date`` with the local end date for the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Default_Embargo.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_Default_Embargo",
        "columnInsertIndex": 24,
        "description": "Create column ``Alma_Default_Embargo`` with the default/global embargo for the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Local_Embargo.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_Local_Embargo",
        "columnInsertIndex": 26,
        "description": "Create column ``Alma_Local_Embargo`` with the local embargo for the portfolio"
    }},
    {{
        "op": "core/column-addition",
        "engineConfig": {{
            "facets": [],
            "mode": "record-based"
        }},
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\\\"{Alma_project}\\\",\\\"Portfolio_ID\\\").cells.Resource_Type.value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_Resource_Type",
        "columnInsertIndex": 28,
        "description": "Create column ``Alma_Resource_Type`` with the resource type for the portfolio"
    }}
]
"""

print(JSON_Step)


[
    {
        "op": "core/column-addition",
        "engineConfig": {
            "facets": [],
            "mode": "record-based"
        },
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\"From-Alma\",\"Portfolio_ID\").cells[\"MMS (Title ID)\"].value[0]",
        "onError": "set-to-blank",
        "newColumnName": "MMS (Title ID)",
        "columnInsertIndex": 2,
        "description": "Create column ``MMS (Title ID)`` with the MMS for the portfolio"
    },
    {
        "op": "core/column-addition",
        "engineConfig": {
            "facets": [],
            "mode": "record-based"
        },
        "baseColumnName": "Portfolio_ID",
        "expression": "grel:cell.cross(\"From-Alma\",\"Portfolio_ID\").cells[\"All ISSN\"].value[0]",
        "onError": "set-to-blank",
        "newColumnName": "Alma_ISSN",
        "columnInsertIndex": 4,
        "description": "Create column ``Alma_ISSN`` with the ISSN for the portfolio"
    },
    {
        "o
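
The operations in the cell above differ only in the Alma column read, the new column name, the insert index, and the description. As an alternative to a hand-written f-string, the same kind of list can be built from a small table and serialized with `json.dumps`, which handles the quoting and never emits a trailing comma. This is a sketch under assumptions: the helper `column_addition` is hypothetical, only the first three columns from the cell are shown, and it uses the bracket form `cells["..."]` for every column on the assumption that GREL treats it the same as the dotted form used for names without spaces.

```python
import json

Alma_project = "EAN-Alma"  # same value as set in section 1 of the notebook


def column_addition(alma_column, new_column, insert_index, description):
    """Build one core/column-addition operation that looks up ``alma_column``
    in the Alma project by portfolio ID."""
    # json.dumps adds the double quotes that GREL string literals need
    cross = f"cell.cross({json.dumps(Alma_project)},{json.dumps('Portfolio_ID')})"
    expression = f"grel:{cross}.cells[{json.dumps(alma_column)}].value[0]"
    return {
        "op": "core/column-addition",
        "engineConfig": {"facets": [], "mode": "record-based"},
        "baseColumnName": "Portfolio_ID",
        "expression": expression,
        "onError": "set-to-blank",
        "newColumnName": new_column,
        "columnInsertIndex": insert_index,
        "description": description,
    }


# (Alma column, new column, insert index, description), copied from the cell above
columns = [
    ("MMS (Title ID)", "MMS (Title ID)", 2,
     "Create column ``MMS (Title ID)`` with the MMS for the portfolio"),
    ("All ISSN", "Alma_ISSN", 4,
     "Create column ``Alma_ISSN`` with the ISSN for the portfolio"),
    ("Title", "Alma_Title", 6,
     "Create column ``Alma_Title`` with the title for the portfolio"),
]

operations = [column_addition(*row) for row in columns]
JSON_Step = json.dumps(operations, indent=4)
print(JSON_Step)
```

Extending `columns` with the remaining rows from the cell would reproduce the full set of operations, and changing `Alma_project` updates every expression at once.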

### 2. Compare data for individual holdings
1. Apply "Compare_Holdings_Data.json" to the HM project