# Checking bibliographic records for required fields
This script checks .mrc files for the fields required for transfer to VSF. It is the first step, intended to skim off records with known problems so that a cataloger can address them before the remaining records are manually inspected for errors.

## Importing libraries

In [1]:
import pymarc
from pymarc import MARCReader
import csv

## Importing .mrc file name
Because we are working with relatively few files, I've set this up to run on one file at a time. This could be improved with a batching process: identify all .mrc files in a given directory and run through them.
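
As a rough idea of what that batching step might look like, here is a minimal sketch using only the standard library. The helper names `find_marc_files` and `collection_name_for` are hypothetical, and naming each collection after its file is an assumption rather than something this notebook does.

```python
from pathlib import Path


def find_marc_files(directory):
    """Return the .mrc files directly inside `directory`, sorted by name."""
    return sorted(Path(directory).glob('*.mrc'))


def collection_name_for(marc_path):
    """Hypothetical naming rule: use the file name without its extension."""
    return Path(marc_path).stem


# Possible use in place of the two input() cells (assumed workflow):
# for marc_path in find_marc_files('.'):
#     marcFile = str(marc_path)
#     remedySetName = collection_name_for(marc_path)
#     ... run the checking and output cells for this file ...
```

Sorting the paths keeps the processing order predictable from run to run, which makes it easier to compare output files between runs.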

The next two cells take user input to identify the .mrc file and provide a collection name to be used in naming output files.

In [20]:
marcFile = input('Enter file name with file extension:')

Enter file name BIBLIOGRAPHIC_50295103360002122_1.mrc


In [24]:
remedySetName = input('Enter collection name')

Enter collection name childrens2


## Checking for MARC fields
This cell creates a reader object using `pymarc` and sets up lists to store identifiers for records that need to be corrected by a cataloger. The logic is designed to match the bibliographic verification requirements for VSF: each record needs an author or contributor entry (100, 110, 111, or 700), a date in subfield c of 260 or 264, and a title in 245. Running this cell will create lists that will be the basis of our .csv output files in the next cell.

In [46]:
# Lists of MMSIDs (field 001) for records a cataloger needs to correct
setAsideAuthor = []
setAsideDate = []
setAsideTitle = []
remedySetList = []

with open(marcFile, 'rb') as marcHandle:
    reader = MARCReader(marcHandle)
    for record in reader:
        mmsid = record['001'].value()

        # Author: at least one of 100, 110, 111, or 700 must be present
        if not record.get_fields('100', '110', '111', '700'):
            setAsideAuthor.append(mmsid)
            remedySetList.append(mmsid)

        # Date: subfield c must be present in a 260 or 264 field
        dateFields = record.get_fields('260', '264')
        if not any(field.get_subfields('c') for field in dateFields):
            setAsideDate.append(mmsid)
            remedySetList.append(mmsid)

        # Title: a 245 field must be present and non-empty.
        # get_fields() avoids an error when the record has no 245 at all.
        if not any(field.value() for field in record.get_fields('245')):
            setAsideTitle.append(mmsid)
            remedySetList.append(mmsid)
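
The three checks can also be written down as a small rule table, which may make them easier to compare against the VSF requirements. The sketch below is not how this notebook works: it assumes each record has already been reduced to a set of plain strings, either a bare tag (`'245'`) or a tag plus subfield code (`'260$c'`). That representation, `REQUIRED_ANY`, and `check_record_keys` are all hypothetical, and the sketch only tests presence, not whether a field is empty.

```python
# Hypothetical rule table: output column -> keys, any one of which satisfies the rule.
# A key is a bare tag ('245') or 'tag$code' for a subfield ('260$c').
REQUIRED_ANY = {
    '100|110|111|700': ('100', '110', '111', '700'),
    '245': ('245',),
    '260|264': ('260$c', '264$c'),
}


def check_record_keys(keys_present):
    """Return {column: 'okay' or 'fix'} for one record's set of keys."""
    present = set(keys_present)
    return {
        column: 'okay' if present.intersection(alternatives) else 'fix'
        for column, alternatives in REQUIRED_ANY.items()
    }


# Example: a record with a 700, a 245, and a 264 that has no subfield c
result = check_record_keys({'001', '245', '264', '700'})
```

Keeping the rules in one dictionary means a change in requirements would only touch the table, not the checking logic.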

## Outputting verification results
In this cell, we create a list of all MMSIDs (identifiers for records exported from our Alma instance). Then, we check each MMSID against the lists we formed in the cell above. For each MMSID, conditional logic sets the value to 'okay' for fields that are present in the record and 'fix' for fields that are missing. The result is a .csv file summarizing the needed actions for each record.

After that, we create a second file of just MMSID numbers for all items that need cataloging actions.

In [50]:
mmsidList = []

with open('remedySheet_%s.csv' % remedySetName, 'w', newline='') as remedySheet, \
        open(marcFile, 'rb') as marcHandle:
    fieldnames = ['MMSID', '100|110|111|700', '245', '260|264']
    writer = csv.DictWriter(remedySheet, fieldnames=fieldnames)
    writer.writeheader()

    # Re-read the .mrc file so every record gets a row, flagged or not
    reader = MARCReader(marcHandle)
    for record in reader:
        marc001 = record['001'].value()
        # 'fix' if the record was set aside for that field group, otherwise 'okay'
        fixDate = 'fix' if marc001 in setAsideDate else 'okay'
        fixAuthor = 'fix' if marc001 in setAsideAuthor else 'okay'
        fixTitle = 'fix' if marc001 in setAsideTitle else 'okay'
        writer.writerow({'MMSID': marc001, '100|110|111|700': fixAuthor,
                         '245': fixTitle, '260|264': fixDate})


with open('remedySet_%s.csv' %(remedySetName),'w',newline='') as remedySet:
    fieldnames = ['MMSID']
    remedySetWriter = csv.DictWriter(remedySet,fieldnames=fieldnames)
    remedySetWriter.writeheader()
    
    # Write each flagged MMSID (not marc001, which only holds the last record read)
    for problem in remedySetList:
        remedySetWriter.writerow({'MMSID': problem})
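
A record that fails more than one check is appended to `remedySetList` once per failed check, so the second file can list the same MMSID more than once. If repeats are not wanted, one option is to drop them while keeping the original order. This is a minimal sketch under that assumption: `unique_in_order` is a hypothetical helper, and the identifiers in the example are made up.

```python
def unique_in_order(identifiers):
    """Drop repeated identifiers, keeping the first occurrence of each."""
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return list(dict.fromkeys(identifiers))


# Made-up identifiers standing in for MMSIDs
example = ['991', '992', '991', '993', '992']
deduped = unique_in_order(example)
```

Looping over `unique_in_order(remedySetList)` instead of `remedySetList` when writing the file would give one row per record.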