# Extract Records from .mrc Through Matching
Given a list of unique identifiers (ISBNs, DOIs, URLs, etc.), extract the records containing those identifiers.

## 1. Open the MARC file the records are being extracted from and the file the selected records are going into

In [1]:
from pathlib import Path
MARC_FILE_LOCATION = Path('C:\\', 'Users', 'ereiskind', 'Downloads', 'MyMarcRecords.mrc')
NEW_MARC_FILE_LOCATION = Path('C:\\', 'Users', 'ereiskind', 'OneDrive - Florida State University', 'Attachments', 'OxfordScholarship_20210430.mrc')

starting_MARC_file = open(MARC_FILE_LOCATION, 'rb')
new_MARC_file = open(NEW_MARC_FILE_LOCATION, 'wb')

## 2. Create the list of identifiers
1. Add the unique identifiers to the cell below as a string with one identifier per line
2. Set `identifier_field` to the field, and the subfield if there is one, that the identifiers are compared against; when a subfield is included, use "$" as the delimiter

In [2]:
identifiers = """
"""
identifiers = identifiers.split("\n")[1:-1] # The slice removes the empty elements created by having the opening and closing quotes on their own lines

identifier_field = "856$u"
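
The "$" convention for `identifier_field` can be split back into its tag and subfield code before any comparison is made. The following is a minimal sketch of that step; `parse_identifier_field` is a hypothetical helper name and is not part of this notebook.

```python
def parse_identifier_field(identifier_field):
    """Split a string such as "856$u" into a (tag, subfield) pair.

    Hypothetical helper for illustration; the subfield is None when no "$" is present.
    """
    tag, delimiter, subfield = identifier_field.partition("$")
    return tag, (subfield if delimiter else None)


print(parse_identifier_field("856$u"))  # ('856', 'u')
print(parse_identifier_field("020"))    # ('020', None)
```

Returning `None` for a missing subfield lets the calling code tell a whole-field comparison apart from a subfield comparison.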

## 3. Check the identifiers against the records
Each record containing one of the identifiers is written to the new file, and a list of the matched identifiers is kept.

In [3]:
from pymarc import MARCReader
import sys

# Only 856$u matching is implemented, so stop before reading any records if another field was requested
if identifier_field != "856$u":
    print(f"The MARC tag {identifier_field} hasn't been matched to a PyMARC tag in this program. The program is exiting.")
    starting_MARC_file.close()
    new_MARC_file.close()
    sys.exit()

MARCfile = MARCReader(starting_MARC_file)
matched_identifiers = []

for record in MARCfile: # MARC records must be the outer loop: the reader is an iterator that is exhausted after one pass, so as the inner loop it would only be checked against the first item in "identifiers"
    if record is None: # Recent PyMARC versions yield None for a record that cannot be parsed
        continue
    # Collect every 856$u value, since a record may have no 856 field or several of them
    urls = [url for field in record.get_fields('856') for url in field.get_subfields('u')]
    for identifier in identifiers:
        if any(identifier in url for url in urls):
            new_MARC_file.write(record.as_marc())
            matched_identifiers.append(identifier)
            break
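
A record can carry several 856 fields or none at all, so the comparison is safest when made against a list of every $u value in the record. With PyMARC, such a list might be gathered with `record.get_fields('856')` and `field.get_subfields('u')`. The sketch below shows only the comparison itself on plain strings; `first_matching_identifier` is a hypothetical helper name and the sample values are made up.

```python
def first_matching_identifier(identifiers, subfield_values):
    """Return the first identifier found inside any subfield value, or None.

    Hypothetical helper: `subfield_values` stands in for the strings pulled from one record.
    """
    for identifier in identifiers:
        # Skip empty or missing values so the substring test never sees None
        if any(identifier in value for value in subfield_values if value):
            return identifier
    return None


# Made-up sample data for illustration only
sample_identifiers = ["9780190000001", "9780190000002"]
record_with_two_urls = ["https://example.com/view/9780190000002", "https://example.com/cover.jpg"]
record_without_urls = []

print(first_matching_identifier(sample_identifiers, record_with_two_urls))  # 9780190000002
print(first_matching_identifier(sample_identifiers, record_without_urls))   # None
```

An empty list simply produces no match, so records lacking the field are skipped rather than raising an error.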

## 4. Output the identifiers not matched

In [None]:
# Build the list unconditionally so it exists (as an empty list) even when every identifier was matched
identifiers_not_matched = [identifier for identifier in identifiers if identifier not in matched_identifiers]

print(identifiers_not_matched)
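
For long identifier lists, the membership test can be run against a set rather than a list. The sketch below assumes the original order of the identifiers should be preserved in the output; `unmatched` is a hypothetical helper name and the sample values are made up.

```python
def unmatched(identifiers, matched_identifiers):
    """Return the identifiers that never matched, in their original order."""
    matched = set(matched_identifiers)  # Set membership tests take constant time on average
    return [identifier for identifier in identifiers if identifier not in matched]


# Made-up sample data for illustration only
print(unmatched(["a", "b", "c"], ["b"]))  # ['a', 'c']
print(unmatched(["a", "b"], ["a", "b"]))  # []
```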

## 5. Close the MARC files
To create a .mrk file from these MARC files, this notebook needs to be closed.

In [4]:
starting_MARC_file.close()
new_MARC_file.close()
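
If the steps above were combined into a single cell or script, a `with` statement could close both files automatically, even when an error interrupts the run. The sketch below uses temporary stand-in files and placeholder bytes rather than real MARC data; the names `source_path` and `target_path` are assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Temporary stand-in paths so the sketch runs anywhere; the notebook would use its own locations
with tempfile.TemporaryDirectory() as folder:
    source_path = Path(folder, "source.mrc")
    target_path = Path(folder, "target.mrc")
    source_path.write_bytes(b"placeholder bytes, not a real MARC record")

    with open(source_path, "rb") as source_file, open(target_path, "wb") as target_file:
        target_file.write(source_file.read())

    # Both handles are closed at this point, even if the copy above had raised an exception
    files_closed = source_file.closed and target_file.closed
    copied_bytes = target_path.read_bytes()

print(files_closed)
```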