**A note on deprecation**

Deprecation typically should not occur; it is a symptom of a failure in planning or design. If at all possible, term labels or definitions should be modified without rendering the term's URI useless. However, there may be circumstances where deprecation is necessary. For example, a temporary term might be introduced for testing or utility and then be replaced by another term in a stable vocabulary.

![term lifespan](https://raw.githubusercontent.com/tdwg/vocab/master/graphics/version-model.png)

The diagram above represents the lifespan of a term. In the case of deprecation, the tip of the arrow at the right represents the date of death of the term (the date it's deprecated).  Note that the end of the term's life is not attended by the birth of a new version of the same term.  Optimally, a deprecated term would be replaced by a different term, and that deprecation date would be the date of issue of the first version of the new term.  

The implications of this are that the current term will have a last modified date that is the date of its deprecation.  That makes sense because the current term metadata will be changed on that date to give it an owl:deprecated value of true.  The value of tdwgutility:status of the most recent term will be changed from "recommended" to "deprecated".  
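
As a minimal sketch of the metadata change described above, the following marks one row of an in-memory current terms table as deprecated. The function name `mark_term_deprecated`, the sample rows, and the literal value `'true'` written to the `term_deprecated` column are assumptions for illustration; check the existing CSV for the value convention actually in use.

```python
def mark_term_deprecated(table, local_name, deprecation_date):
    """Set term_deprecated and term_modified for one term.

    table is a list of rows whose first row is the header.
    Returns True if a matching row was changed, otherwise False.
    """
    header = table[0]
    name_col = header.index('term_localName')
    deprecated_col = header.index('term_deprecated')
    modified_col = header.index('term_modified')
    for row in table[1:]:
        if row[name_col] == local_name:
            row[deprecated_col] = 'true'  # assumed value convention
            row[modified_col] = deprecation_date
            return True
    return False

# Hypothetical sample data, not taken from any real term list
example_table = [
    ['term_localName', 'term_modified', 'term_deprecated'],
    ['p001', '2020-06-29', ''],
    ['p002', '2020-06-29', ''],
]
changed = mark_term_deprecated(example_table, 'p002', '2021-01-15')
```

Only the matching row is touched, so the last modified date of the deprecated term becomes its deprecation date while other terms are left alone.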

The other metadata change related to the deprecation is a record of the replacement. The new term that replaces the deprecated term should have a value manually added in the "replaces_term" column that is the deprecated term's URI. The "x-replacements.csv" file in the deprecated term's term list folder should have a link between the replacing term URI and the local name of the replaced term.

The "x-version-replacements.csv" file in the "x-versions" directory for the deprecated term's term list should link the deprecated term version's local name to the version URI of the replacing version of the new term. The "x-versions.csv" table for the new term should have a value in the replaces_version column of the term URI of the term that was replaced. The status of the replaced term should also be manually changed to `deprecated`.
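
The replacement records described above are simple two-column rows, so the manual edit could be sketched as an append to a CSV file. Everything specific here is hypothetical: the helper name `append_replacement`, the demonstration file, the `replacing_term` header, and the example URI. The column order (replacing URI first, replaced local name second) is an assumption that mirrors how the script builds version replacement rows in section 3.4.

```python
import csv
import os
import tempfile

def append_replacement(csv_path, replacing_uri, replaced_local_name):
    """Append one [replacing URI, replaced local name] row to a replacements CSV."""
    with open(csv_path, 'a', newline='', encoding='utf-8') as file_object:
        csv.writer(file_object).writerow([replacing_uri, replaced_local_name])

# Demonstration against a throwaway file; the header names are hypothetical.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo-replacements.csv')
with open(demo_path, 'w', newline='', encoding='utf-8') as file_object:
    csv.writer(file_object).writerow(['replacing_term', 'replaced_term_localName'])

append_replacement(demo_path, 'http://example.org/terms/newTerm', 'oldTerm')

with open(demo_path, 'r', newline='', encoding='utf-8') as file_object:
    demo_rows = list(csv.reader(file_object))
```

Before using anything like this on a real term list, compare the header row of the target file with the row being appended.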

Because these changes do not fit into the normal term modification workflow, most of them will need to be made manually. However, changes to the term list metadata can be made using this script. If other terms on the term list are going to be modified, the cells in section 5 will be run anyway. If the term deprecation is the only change to be made to the term list, create a changes file that contains only headers (no data rows), then run sections 1 and 2, then skip to section 5. Do the steps in section 5 except for the ones noted as requiring manual editing for deprecation.  Whether sections 6 and 7 need to be run depends on whether they were already run for another term list in the vocabulary or if the term list isn't part of a vocabulary (e.g. tdwgutility: terms).

# 1 Initial setup

Import modules

In [None]:
# Written by Steve Baskauf 2020-06-29 CC0

import csv
import sys
import os
import shutil
import datetime
import copy
import pandas as pd

Set variables for this run

In [None]:
namespaceUri = 'http://rs.tdwg.org/dwcpw/values/'
database = 'pathway'
date_issued = '2020-06-29'
local_offset_from_utc = '-05:00'
versions = database + '-versions'
modifications_filename = 'pathway-revised.csv'
vocab_type = 3 # 1 is simple vocabulary, 2 is simple controlled vocabulary, 3 is c.v. with broader hierarchy
version_namespace = namespaceUri + 'version/'

# For borrowed terms, specify the termlist URI here:

termlist_uri = ''

# For terms minted by TDWG that follow URI pattern conventions, leave this the empty string and the 
# termlist_uri will be set to the namespace URI.
# In both cases the termlist version URI will be constructed from the namespace URI
if termlist_uri == '':
    termlist_uri = namespaceUri

# namespace is actually the last component of the termlist URI, not of the namespace URI
pieces = termlist_uri.split('/')
namespace = pieces[len(pieces)-2]
vocabulary = pieces[len(pieces)-3]

Define utility functions

In [None]:
def readCsv(filename):
    fileObject = open(filename, 'r', newline='', encoding='utf-8')
    readerObject = csv.reader(fileObject)
    array = []
    for row in readerObject:
        array.append(row)
    fileObject.close()
    return array

def writeCsv(fileName, array):
    fileObject = open(fileName, 'w', newline='', encoding='utf-8')
    writerObject = csv.writer(fileObject)
    for row in array:
        writerObject.writerow(row)
    fileObject.close()

# returns a list with first item Boolean and second item the index
def findColumnWithHeader(header_row_list, header_label):
    found = False
    for column_number in range(0, len(header_row_list)):
        if header_row_list[column_number] == header_label:
            found = True
            found_column = column_number
    if found:
        return [True, found_column]
    else:
        return [False, 0]
    
def isoTime(offset):
    currentTime = datetime.datetime.now()
    return currentTime.strftime("%Y-%m-%dT%H:%M:%S") + offset

## 1.1 Generate files in database directories (new term lists only)

There are a number of files that need to exist in the current terms database directory and the versions database directory. This part of the script generates them from templates.

In [None]:
# get the mutable column headers from the modifications file
modifications_metadata = readCsv(modifications_filename)
mutable_header = modifications_metadata[0][1:len(modifications_metadata[0])]

# create database directories; exist_ok avoids an error if they already exist
# (each directory is handled separately, so one existing does not skip the other)
os.makedirs('../' + database, exist_ok=True)
os.makedirs('../' + database + '-versions', exist_ok=True)

# copy files needed in the current terms database directory
endings = ['-classes.csv', '-replacements-classes.csv', '-replacements-column-mappings.csv', '-replacements.csv', '-versions-classes.csv', '-versions-column-mappings.csv', '-versions.csv']
source_path = 'files_for_new/current_terms/'
for file_ending in endings:
    source = source_path + 'template' + file_ending
    destination = '../' + database + '/' + database + file_ending
    dest_path = shutil.copyfile(source, destination)
dest_path = shutil.copyfile(source_path + 'namespace.csv', '../' + database + '/namespace.csv')

# select current terms column mapping file appropriate for modifications spreadsheet
if vocab_type == 1: # simple vocabulary
    in_file = 'simple-vocabulary-column-mappings.csv'
elif vocab_type == 2: # simple controlled vocabulary
    in_file = 'simple-cv-column-mappings.csv'
elif vocab_type == 3: # c.v. with skos:broader hierarchy
    in_file = 'cv-hierarchy-column-mappings.csv'
else: # This should not happen
    in_file = 'simple-vocabulary-column-mappings.csv'

frame = pd.read_csv(source_path + in_file, na_filter=False)
for index,row in frame.iterrows():
    # replace the placeholder IRIs with the namespace IRI
    if row['header'] == 'skos_inScheme':
        frame.at[index,'value'] = namespaceUri
    if row['header'] == 'skos_broader':
        frame.at[index,'value'] = namespaceUri
frame.to_csv('../' + database + '/' + database + '-column-mappings.csv', index=False)
    
# set the core class file and domain root in the constants.csv configuration file
frame = pd.read_csv(source_path + 'constants.csv', na_filter=False)
frame.at[0,'domainRoot'] = namespaceUri
frame.at[0,'coreClassFile'] = database + '.csv'
frame.to_csv('../' + database + '/constants.csv', index=False)
    
# set the versions and replacements filenames in the linked-classes.csv file
frame = pd.read_csv(source_path + 'linked-classes.csv', na_filter=False)
for index,row in frame.iterrows():
    # replace the placeholder filenames with the actual linked file names
    if row['link_column'] == 'term_localName':
        frame.at[index,'filename'] = database + '-versions.csv'
    if row['link_column'] == 'replaced_term_localName':
        frame.at[index,'filename'] = database + '-replacements.csv'
frame.to_csv('../' + database + '/linked-classes.csv', index=False)
    
# create header row for current terms metadata CSV
current_terms_header = ['document_modified', 'term_localName', 'term_isDefinedBy', 'term_created', 'term_modified', 'term_deprecated', 'replaces_term', 'replaces1_term', 'replaces2_term'] + mutable_header
current_terms_table = [current_terms_header]
file_path = '../' + database + '/' + database + '.csv'
writeCsv(file_path, current_terms_table)


# copy files needed in the versions database directory
endings = ['-versions-classes.csv', '-versions-replacements-classes.csv', '-versions-replacements-column-mappings.csv', '-versions-replacements.csv']
source_path = 'files_for_new/versions/'
for file_ending in endings:
    source = source_path + 'template' + file_ending
    destination = '../' + database + '-versions/' + database + file_ending
    dest_path = shutil.copyfile(source, destination)
#dest_path = shutil.copyfile(source_path + 'linked-classes.csv', '../' + database + '-versions/linked-classes.csv')
dest_path = shutil.copyfile(source_path + 'namespace.csv', '../' + database + '-versions/namespace.csv')

# select versions column mapping file appropriate for modifications spreadsheet
if vocab_type == 1: # simple vocabulary
    in_file = 'simple-vocabulary-versions-column-mappings.csv'
elif vocab_type == 2: # simple controlled vocabulary
    in_file = 'simple-cv-versions-column-mappings.csv'
elif vocab_type == 3: # c.v. with skos:broader hierarchy
    in_file = 'cv-hierarchy-versions-column-mappings.csv'
else: # This should not happen
    in_file = 'simple-vocabulary-versions-column-mappings.csv'

frame = pd.read_csv(source_path + in_file, na_filter=False)
for index,row in frame.iterrows():
    # replace the placeholder IRIs with the namespace IRI
    if row['header'] == 'skos_inScheme':
        frame.at[index,'value'] = namespaceUri
    if row['header'] == 'skos_broader':
        frame.at[index,'value'] = namespaceUri
    if row['header'] == 'term_localName':
        frame.at[index,'value'] = namespaceUri
frame.to_csv('../' + database + '-versions/' + database + '-versions-column-mappings.csv', index=False)

# set the core class file and domain root in the constants.csv configuration file
frame = pd.read_csv(source_path + 'constants.csv', na_filter=False)
frame.at[0,'domainRoot'] = namespaceUri + 'version/'
frame.at[0,'coreClassFile'] = database + '-versions.csv'
frame.to_csv('../' + database + '-versions/constants.csv', index=False)

# set the versions and replacements filenames in the linked-classes.csv file
frame = pd.read_csv(source_path + 'linked-classes.csv', na_filter=False)
for index,row in frame.iterrows():
    # replace the placeholder filename with the actual linked file name
    if row['link_column'] == 'replaced_version_localName':
        frame.at[index,'filename'] = database + '-versions-replacements.csv'
frame.to_csv('../' + database + '-versions/linked-classes.csv', index=False)

# create header row for versions metadata CSV
versions_header = ['document_modified', 'version', 'versionLocalName', 'version_isDefinedBy', 'version_issued', 'version_status', 'replaces_version', 'replaces1_version', 'replaces2_version'] + mutable_header + ['term_localName']
versions_table = [versions_header]
file_path = '../' + database + '-versions/' + database + '-versions.csv'
writeCsv(file_path, versions_table)


# 2 Extract information from metadata files

**Notes for all tables:** row 0 (the first row) is a header row. The columns whose headers contain `replaces1_` and `replaces2_` are used in the uncommon situation where a term or version replaces more than one other term or version.  In such cases, the tables would need to be modified manually.  

The table containing the **modifications** (term additions and changes) looks like this:

![](images/mods-table.png)

The `term_localName` column is the primary key for this table.

The table containing **current terms metadata** looks like this:

![](images/current-terms-table1.png)
![](images/current-terms-table2.png)

Notice that all of the column headers from `label` onwards to the right are the same as those in the modifications table. The column headers to the left of `label` are idiosyncratic for the current terms table type and must be handled specially. The `term_localName` column is the primary key for this table. The `term_deprecated` and `replaces_term` columns are for unusual situations and aren't managed by this script. They would need to be edited manually when those situations occur.

The table containing **versions metadata** looks like this:

![](images/versions-table1.png)
![](images/versions-table2.png)

As with the current terms table, all of the column headers from `label` onwards to the right are the same as in the previous two tables. The column headers to the left of `label` are idiosyncratic for the table type and must be handled specially. The `versionLocalName` column is the primary key for this table. The `term_localName` column is a foreign key that relates rows in this table to rows in the other two tables.
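
To illustrate the key relationship, here is a small sketch that groups version rows by the `term_localName` foreign key. The function name `versions_by_term` and the sample rows are made up for this illustration; the real tables have many more columns.

```python
from collections import defaultdict

def versions_by_term(versions_table):
    """Group version local names (primary key) by term local name (foreign key)."""
    header = versions_table[0]
    key_col = header.index('versionLocalName')
    foreign_key_col = header.index('term_localName')
    grouped = defaultdict(list)
    for row in versions_table[1:]:
        grouped[row[foreign_key_col]].append(row[key_col])
    return dict(grouped)

# Hypothetical sample rows
example_versions = [
    ['versionLocalName', 'term_localName'],
    ['p001-2020-06-29', 'p001'],
    ['p001-2021-01-15', 'p001'],
    ['p002-2020-06-29', 'p002'],
]
lookup = versions_by_term(example_versions)
```

Each term local name maps to one or more version local names, which is the one-to-many relationship between a term and its versions.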

## 2.1 Read in the tables of current terms and of modifications

In [None]:
terms_metadata_filename = '../' + database + '/' + database + '.csv'
terms_metadata = readCsv(terms_metadata_filename)

modifications_metadata = readCsv(modifications_filename)
print('Current terms table headers: ', terms_metadata[0])
print()
print('Modifications table headers: ', modifications_metadata[0])

The script loads data only on the basis of the column names and not their position.  So a number of variables are defined that hold the column numbers for various fields.  
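
As a rough illustration of name-based lookup, the sketch below builds a mapping from header label to column index once, then uses it to read a cell. The helper `column_index_map` and the sample rows are assumptions for this example; the notebook itself uses `findColumnWithHeader` for the same purpose.

```python
def column_index_map(header_row):
    """Map each header label to its column index (the last occurrence wins)."""
    return {label: index for index, label in enumerate(header_row)}

# Hypothetical sample rows
example_header = ['document_modified', 'term_localName', 'label']
example_row = ['2020-06-29T12:00:00-05:00', 'p001', 'example label']

columns = column_index_map(example_header)
local_name = example_row[columns['term_localName']]
```

Because the lookup is by label, reordering the columns in the CSV would not change which value is read.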

Find which column numbers in the modifications file and the metadata file hold the term local name.  This column is the primary key for the table and therefore is used to match rows for terms that need to be modified with the corresponding rows in the current term metadata table.  The `term_localName` column is the only one in the modifications table that is not potentially a vocabulary-specific metadata field.

In [None]:
result = findColumnWithHeader(modifications_metadata[0], 'term_localName')
if result[0] == False:
    print('The modifications file does not have a term_localName column')
    sys.exit()
else:
    mods_local_name = result[1]

# don't error trap here because all existing files should have a local name column header
result = findColumnWithHeader(terms_metadata[0], 'term_localName')
metadata_localname_column = result[1]
print('Modifications table local name column: ', mods_local_name)
print('Term metadata table local name column: ', metadata_localname_column)

Create a list of the local names of terms to be added or modified

In [None]:
mods_term_localName = []
for term_number in range(1, len(modifications_metadata)):
    mods_term_localName.append(modifications_metadata[term_number][mods_local_name])
print(mods_term_localName)

Find out which terms are new terms and which are modified old terms.  Create a list for each.
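
The same classification can be sketched with a set lookup. This is an alternative formulation under the assumption that local names are unique within each table; `partition_terms` and the sample names are hypothetical.

```python
def partition_terms(candidate_names, existing_names):
    """Split candidate local names into (new, modified) lists, preserving order."""
    existing = set(existing_names)
    new = [name for name in candidate_names if name not in existing]
    modified = [name for name in candidate_names if name in existing]
    return new, modified

# Hypothetical sample names
example_new, example_modified = partition_terms(['p001', 'p099'], ['p001', 'p002'])
```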

In [None]:
new_terms = []
modified_terms = []
for test_term in mods_term_localName:
    found = False
    for term in terms_metadata:
        if test_term == term[metadata_localname_column]:
            found = True
            modified_terms.append(test_term)
    if not found:
        new_terms.append(test_term)
print('New terms: ', new_terms)
print('Modified terms: ', modified_terms)

# 3 Update versions-related metadata (non-borrowed only)

**Note:** Skip the entire section 3 for borrowed terms that do not have versions.

The master versions metadata table (in the `xx-versions` directory) is used to generate metadata about versions.  The versions join table (in the current terms directory) is a minimal joins table used to generate the `hasVersion` links for current terms.  The versions replacements table is used to generate `dcterms:replaces` and `dcterms:isReplacedBy` links for version metadata records.

Dublin Core terms are a special case. They have versions, but those versions are managed by DCMI. So updating things like examples in our metadata cannot trigger new term version IRIs. The term version metadata must just be edited manually to reflect the changes in TDWG-specific metadata rather than using parts 3.x of this script.

## 3.1 Load master versions metadata and determine positions of special columns

Read in the term versions metadata file

In [None]:
term_versions_metadata_filename = '../' + versions + '/' + versions + '.csv'
term_versions_metadata = readCsv(term_versions_metadata_filename)
print('Version headers: ', term_versions_metadata[0])

Find the positions of the idiosyncratic version columns.  Recall:

![](images/versions-table1.png)
![](images/versions-table2.png)

In [None]:
version_modified = findColumnWithHeader(term_versions_metadata[0], 'document_modified')[1]
version_column = findColumnWithHeader(term_versions_metadata[0], 'version')[1]
version_local_name = findColumnWithHeader(term_versions_metadata[0], 'versionLocalName')[1]
version_isDefinedBy = findColumnWithHeader(term_versions_metadata[0], 'version_isDefinedBy')[1]
version_issued = findColumnWithHeader(term_versions_metadata[0], 'version_issued')[1]
version_status = findColumnWithHeader(term_versions_metadata[0], 'version_status')[1]
replaces_version = findColumnWithHeader(term_versions_metadata[0], 'replaces_version')[1]
version_term_local_name_column = findColumnWithHeader(term_versions_metadata[0], 'term_localName')[1]

## 3.2 Supersede old versions of the modified terms

Go through each version and supersede any that match the local names of the modified terms

In [None]:
print('rows in versions table to be superseded:')
for term in modified_terms:
    for version_row in range(1, len(term_versions_metadata)):
        if term_versions_metadata[version_row][version_term_local_name_column] == term and term_versions_metadata[version_row][version_status] == 'recommended':
            print(version_row, term)
            term_versions_metadata[version_row][version_status] = 'superseded'
            term_versions_metadata[version_row][version_modified] = isoTime(local_offset_from_utc)

## 3.3 Create new versions of new and modified terms

Make sure that all columns in modified terms file are in the term versions file

In [None]:
for column in modifications_metadata[0]:
    result = findColumnWithHeader(term_versions_metadata[0], column)
    if result[0] == False:
        print('The versions file is missing the ', column, ' column.')
        sys.exit()

Open the versions join table, which looks like this:

![](images/version-joins-table.png)

In [None]:
versions_join_table_filename = '../' + database + '/' + versions + '.csv'
versions_join_table = readCsv(versions_join_table_filename)
print('Versions join table headers: ', versions_join_table[0])

Create new versions and new version joins lists

In [None]:
newVersions = []
newVersionJoins = []

Create a row in the new term versions list for the added or modified terms

In [None]:
for row_number in range(1, len(modifications_metadata)):
    newVersion = []
    # create a column for every column in the term version file
    for column in term_versions_metadata[0]:
        # find the column in the modifications file that matches the version column and add its value
        result = findColumnWithHeader(modifications_metadata[0], column)
        if result[0] == True:
            newVersion.append(modifications_metadata[row_number][result[1]])
        else:
            newVersion.append('')
    # set the modification dateTime for the newly created version
    newVersion[version_modified] = isoTime(local_offset_from_utc)
    newVersions.append(newVersion)

Insert metadata specific to the new versions

In [None]:
for rowNumber in range(0, len(newVersions)):
    # need to add one to the row of modifications_metadata because it includes a header row
    currentTermLocalName = modifications_metadata[rowNumber + 1][mods_local_name]
    newVersions[rowNumber][version_issued] = date_issued
    newVersions[rowNumber][version_status] = 'recommended'
    newVersions[rowNumber][version_local_name] = currentTermLocalName + '-' + date_issued
    newVersions[rowNumber][version_isDefinedBy] = version_namespace
    newVersions[rowNumber][version_column] = version_namespace + currentTermLocalName + '-' + date_issued

    # if the new version replaces an older one for the term, we need to provide a value for the `replaces_version` column
    if currentTermLocalName in modified_terms:
        # look through metadata for old versions to find the most recent version of the term
        mostRecent = 'a' # start with a string value earlier in alphabetization than any term version URI
        for version_row in range(1, len(term_versions_metadata)):
            if term_versions_metadata[version_row][version_term_local_name_column] == currentTermLocalName:
                # Make it the mostRecent if it's later than the previous mostRecent
                if term_versions_metadata[version_row][version_column] > mostRecent:
                    mostRecent = term_versions_metadata[version_row][version_column]
        # insert the most recent version found into the appropriate column
        newVersions[rowNumber][replaces_version] = mostRecent
    
    # create a join record for each new version and add it to the list of new joins
    newVersionJoin =[ newVersions[rowNumber][version_column], modifications_metadata[rowNumber + 1][mods_local_name] ]
    newVersionJoins.append(newVersionJoin)

print(newVersions)
print()
print(newVersionJoins)

Append the new versions to the old version tables and save the file

In [None]:
revised_term_versions_metadata = term_versions_metadata + newVersions
writeCsv('../' + versions + '/' + versions + '.csv', revised_term_versions_metadata)

revised_term_versions_joins = versions_join_table + newVersionJoins
writeCsv('../' + database + '/' + versions + '.csv', revised_term_versions_joins)

## 3.4 Update the versions replacements table

The versions replacements table links a newly created version of a modified term to the previous version it is replacing. 

Since new terms do not have any previous versions, they will not have entries in this table. Thus replacement entries only need to be generated for terms on the `modified_terms` list.

The script only handles cases where there is a new version of a term with the same local name.  In cases where an old term is deprecated and is replaced by a new term with a different local name, the replacement table will need to be updated manually.
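
The automated case relies on version URIs ending in the term local name followed by an ISO 8601 date, so for a single term, plain string comparison orders the URIs chronologically. A minimal sketch of that idea follows; the helper names and the second example URI are hypothetical, and taking the last path segment is a variation on the fixed-position split used in the notebook.

```python
def most_recent_version(version_uris):
    """Return the lexicographically greatest URI, or None for an empty list."""
    return max(version_uris) if version_uris else None

def version_local_name(version_uri):
    """Return the last path segment of a version URI."""
    return version_uri.rstrip('/').split('/')[-1]

# Two versions of the same term; the later date is invented for illustration
example_uris = [
    'http://rs.tdwg.org/dwcpw/values/version/p001-2020-06-29',
    'http://rs.tdwg.org/dwcpw/values/version/p001-2021-01-15',
]
latest = most_recent_version(example_uris)
latest_local = version_local_name(latest)
```

The comparison is only meaningful among versions of one term, which is why the candidate list should be filtered by term local name first.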

Here's an example of what a replacements table looks like:

![](images/version-replacements-table.png)

Open it:

In [None]:
versions_replacements_table_filename = '../' + versions + '/' + versions + '-replacements.csv'
versions_replacements_table = readCsv(versions_replacements_table_filename)
print('Versions replacements table headers: ', versions_replacements_table[0])

Find the most recent previous version for each term on the `modified_terms` list and generate a replacement record for it.

In [None]:
# create a list to hold the newly generated replacements rows
newReplacements = []

for modifiedTerm in modified_terms:
    # generate the newly created version URI for the modified term
    newVersion = version_namespace + modifiedTerm  + '-' + date_issued
    # step through the list of previous versions and find the one with the most recent issued date
    mostRecent = 'a'
    count = 0
    for oldVersion in versions_join_table:
        if count > 0: # skip the header row
            # the second column in the join table is the term local name
            if oldVersion[1] == modifiedTerm:
                # the first column in the join table is the full version URI
                if oldVersion[0] > mostRecent:
                    mostRecent = oldVersion[0]
        count +=1
    # once the most recent version URI is found, we need to extract the local name
    mostRecentLocal = mostRecent.split('/')[6]
    newReplacements.append([newVersion, mostRecentLocal])
print(newReplacements)

Append the new replacement list to the existing table and save as a file

In [None]:
revised_versions_replacements_table = versions_replacements_table + newReplacements
writeCsv('../' + versions + '/' + versions + '-replacements.csv', revised_versions_replacements_table)

# 4 Process current terms metadata

## 4.1 Determine positions of special columns

Find the positions of the idiosyncratic current terms columns (table already loaded). Recall:

![](images/current-terms-table1.png)
![](images/current-terms-table2.png)

In [None]:
term_modified_dateTime = findColumnWithHeader(terms_metadata[0], 'document_modified')[1]
term_localName = findColumnWithHeader(terms_metadata[0], 'term_localName')[1]
term_modified = findColumnWithHeader(terms_metadata[0], 'term_modified')[1]
term_created = findColumnWithHeader(terms_metadata[0], 'term_created')[1]
term_isDefinedBy = findColumnWithHeader(terms_metadata[0], 'term_isDefinedBy')[1]

## 4.2 Modify current terms metadata table

Each item in the term modifications list will either modify existing term metadata or add new term metadata

In [None]:
print('Changed current terms rows: ')
# step through each row in the modification metadata table and modify existing current terms when applicable
for mods_rownumber in range(1, len(modifications_metadata)):
    mods_localname_string = modifications_metadata[mods_rownumber][mods_local_name]
    modified = False
    for term_name in modified_terms:
        # only make a modification if it's on the list of terms to be modified
        if mods_localname_string == term_name:
            modified = True
    # this section of code modifies existing terms
    if modified:
        # find the row in the terms metadata file for the term to be modified
        for term_rownumber in range(1, len(terms_metadata)):
            if mods_localname_string == terms_metadata[term_rownumber][term_localName]:
                terms_metadata[term_rownumber][term_modified_dateTime] = isoTime(local_offset_from_utc)
                terms_metadata[term_rownumber][term_modified] = date_issued
                # replace every column that's in the modifications metadata
                for column_number in range(0, len(modifications_metadata[0])):
                    # find the column in the current terms metadata table that matches the modifications column and replace the current term's value
                    result = findColumnWithHeader(terms_metadata[0], modifications_metadata[0][column_number])
                    if result[0] == True:
                        terms_metadata[term_rownumber][result[1]] = modifications_metadata[mods_rownumber][column_number]
                    else:
                        pass # this shouldn't really happen since there already was a check that all columns existed in the versions table
                print(terms_metadata[term_rownumber])
    # this section of code adds new term metadata
    else: 
        newTermRow = []
        for column in range(0, len(terms_metadata[0])):
            newTermRow.append('')
        newTermRow[term_modified_dateTime] = isoTime(local_offset_from_utc)
        newTermRow[term_modified] = date_issued
        newTermRow[term_created] = date_issued
        newTermRow[term_isDefinedBy] = namespaceUri
        # replace every column that's in the modifications metadata
        for column_number in range(0, len(modifications_metadata[0])):
            # find the column in the current terms metadata table that matches the modifications column and replace the current term's value
            result = findColumnWithHeader(terms_metadata[0], modifications_metadata[0][column_number])
            if result[0] == True:
                newTermRow[result[1]] = modifications_metadata[mods_rownumber][column_number]
            else:
                pass # this shouldn't really happen since there already was a check that all columns existed in the versions table
        print(newTermRow)
        terms_metadata.append(newTermRow)
writeCsv('../' + database + '/' + database + '.csv', terms_metadata)

# 5 Generate new term list version

These changes to terms trigger a new version of the term list.  The term lists are in the special directories `term-lists` and `term-lists-versions`.  

## 5.1 Open tables

The master term lists and term list versions joins tables are similar to the terms metadata tables and their joins tables:

In [None]:
term_lists_table_filename = '../term-lists/term-lists.csv'
term_lists_table = readCsv(term_lists_table_filename)
print('Term lists table headers and first row: ', term_lists_table[0:2])

In [None]:
term_lists_versions_joins_filename = '../term-lists/term-lists-versions.csv'
term_lists_versions_joins = readCsv(term_lists_versions_joins_filename)
print('Term lists versions joins table headers and first row: ', term_lists_versions_joins[0:2])

The `term-lists-members.csv` file is used to create the one:many links between the term list and the terms that are members of it.

In [None]:
term_lists_members_filename = '../term-lists/term-lists-members.csv'
term_lists_members = readCsv(term_lists_members_filename)
print('Term lists members table headers and first row: ', term_lists_members[0:2])

The master term list versions metadata table is used to generate records of the versions.

In [None]:
term_lists_versions_metadata_filename = '../term-lists-versions/term-lists-versions.csv'
term_lists_versions_metadata = readCsv(term_lists_versions_metadata_filename)
print('Term lists versions metadata table headers and first row: ', term_lists_versions_metadata[0:2])

The term lists versions members table tracks all of the term versions that are part of a particular version of a term list. It's used to generate `dcterms:hasPart` and `dcterms:isPartOf` links.

In [None]:
term_lists_versions_members_filename = '../term-lists-versions/term-lists-versions-members.csv'
term_lists_versions_members = readCsv(term_lists_versions_members_filename)
print('Term lists versions members table headers and first row: ', term_lists_versions_members[0:2])

The term lists versions replacements table is used to generate the `dcterms:replaces` and `dcterms:isReplacedBy` links between term list versions.

In [None]:
term_lists_versions_replacements_filename = '../term-lists-versions/term-lists-versions-replacements.csv'
term_lists_versions_replacements = readCsv(term_lists_versions_replacements_filename)
print('Term lists versions replacements table headers and first row: ', term_lists_versions_replacements[0:2])

The index of datasets is used to enable a dump of the entire TDWG metadata dataset. The datasets are all of the directories in the rs.tdwg.org repo. Generally there are two datasets for every namespace in every vocabulary, one for the current terms and one for the term versions. There are also datasets for special resources like decisions, tdwgutility terms, term lists, vocabularies, and standards. The last modified dates need to be updated for these, and any new datasets must be added to the list.

In [None]:
datasets_index_filename = '../index/index-datasets.csv'
datasets_index = readCsv(datasets_index_filename)
print('Dataset index table headers and first row: ', datasets_index[0:2])

## 5.2 Update tables

Generate the URI for the new term list version.
**Note:** The idiosyncratic URI pattern of `http://rs.tdwg.org/dwc/terms/attributes/` must be handled as a special case.
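
For the general pattern, the version URI is formed by inserting `version/` between the vocabulary and term list subpaths and appending the issue date. The sketch below wraps that construction in a function so the result can be checked on one example. The function name `termlist_version_uri` is hypothetical, and the special case for the attributes namespace is deliberately left out.

```python
def termlist_version_uri(termlist_uri, date_issued):
    """Build a term list version URI from a term list URI of the form
    http://host/vocabulary/termlist/ and an ISO 8601 date."""
    pieces = termlist_uri.split('/')
    # pieces: ['http:', '', host, vocabulary, termlist, '']
    return (pieces[0] + '//' + pieces[2] + '/' + pieces[3]
            + '/version/' + pieces[4] + '/' + date_issued)

example_uri = termlist_version_uri('http://rs.tdwg.org/dwcpw/values/', '2020-06-29')
# example_uri is 'http://rs.tdwg.org/dwcpw/version/values/2020-06-29'
```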

In [None]:
if namespaceUri == 'http://rs.tdwg.org/dwc/terms/attributes/':
    termlistVersionUri = 'http://rs.tdwg.org/dwc/version/terms/attributes/' + date_issued
else:
    uriPieces = termlist_uri.split('/')
    # split the URI between the vocabulary and term list subpaths
    termlistVersionUri = uriPieces[0] + '//' + uriPieces[2] + '/' + uriPieces[3] + '/version/' + uriPieces[4] + '/' + date_issued
    print(uriPieces)
    print(termlistVersionUri)
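
The URI construction above can be sketched as a standalone function. This is only an illustration: the function name `term_list_version_uri` and the example date are hypothetical, and the function tests the term list URI directly, relying on the convention that it equals the namespace URI.

```python
def term_list_version_uri(termlist_uri, date_issued):
    """Insert 'version/' between the vocabulary and term list subpaths, then append the issue date."""
    # special case: the Darwin Core attributes namespace does not follow the usual pattern
    if termlist_uri == 'http://rs.tdwg.org/dwc/terms/attributes/':
        return 'http://rs.tdwg.org/dwc/version/terms/attributes/' + date_issued
    pieces = termlist_uri.split('/')
    # pieces looks like ['http:', '', 'rs.tdwg.org', vocabulary subpath, term list subpath, '']
    return pieces[0] + '//' + pieces[2] + '/' + pieces[3] + '/version/' + pieces[4] + '/' + date_issued

# hypothetical issue date, used only for illustration
example_uri = term_list_version_uri('http://rs.tdwg.org/dwc/terms/', '2020-01-01')
print(example_uri)
```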

Update the `list_modified` value for the focal term list. **Note:** Any modifications to the term list label or description need to be made manually in the CSV file.

In [None]:
list_uri = findColumnWithHeader(term_lists_table[0], 'list')[1]
list_created = findColumnWithHeader(term_lists_table[0], 'list_created')[1]
list_modified = findColumnWithHeader(term_lists_table[0], 'list_modified')[1]
modified_datetime = findColumnWithHeader(term_lists_table[0], 'document_modified')[1]
standard_column = findColumnWithHeader(term_lists_table[0], 'standard')[1]
list_description = findColumnWithHeader(term_lists_table[0], 'description')[1]

aNewTermList = True
for rowNumber in range(1, len(term_lists_table)):
    # by convention, the namespace URI used for the terms is the same as the URI of the term list
    if termlist_uri == term_lists_table[rowNumber][list_uri]:
        aNewTermList = False
        term_list_rowNumber = rowNumber
        term_lists_table[rowNumber][list_modified] = date_issued
        term_lists_table[rowNumber][modified_datetime] = isoTime(local_offset_from_utc)
        # here is the opportunity to find out the standard URI for the modified term list
        standardUri = term_lists_table[rowNumber][standard_column]
        print(term_lists_table[rowNumber])
if aNewTermList:  # this will happen if the term list did not previously exist
    try:
        new_term_list = readCsv('files_for_new/new_term_list.csv')
    except:
        print('The term list was not found and there was no new_term_list.csv file.')
        sys.exit()
    # Note: no error trapping is done here, so make sure that the new_term_list columns are the same as term_lists_table
    new_term_list[1][modified_datetime] = isoTime(local_offset_from_utc)
    new_term_list[1][list_created] = date_issued
    new_term_list[1][list_modified] = date_issued
    standardUri = new_term_list[1][standard_column]
    # the length of the table (including header row) will be one more than the last row number
    term_list_rowNumber = len(term_lists_table)
    term_lists_table.append(new_term_list[1])
    # after the new row is appended, its row number will be one more than the previous last row number
    print('added a new term list row')
    print(term_lists_table[term_list_rowNumber])
    
    # The new term list's dataset directory must be added to the dataset list. 
    row_for_current_terms = [isoTime(local_offset_from_utc), # document_modified
                             database, # term_localName
                             'http://rs.tdwg.org/index', # dcterms_isPartOf
                             'http://rs.tdwg.org/index/' + database, # dataset_iri
                             date_issued, # dcterms_modified
                             new_term_list[1][list_description], # label
                             ''] # rdfs_comment
    datasets_index.append(row_for_current_terms)
    
    # New term lists will always have a new version dataset directory, so add it, too.
    row_for_versions = [isoTime(local_offset_from_utc), # document_modified
                        versions, # term_localName
                        'http://rs.tdwg.org/index', # dcterms_isPartOf
                        'http://rs.tdwg.org/index/' + versions, # dataset_iri
                        date_issued, # dcterms_modified
                        new_term_list[1][list_description] + ' versions', # label
                        ''] # rdfs_comment
    datasets_index.append(row_for_versions)
    
else: # If the term list isn't new, then its modified date needs to be updated.
    # find the row in the datasets index file for the dataset being modified
    for dataset_rownumber in range(1, len(datasets_index)):
        # update current terms modified date
        if database == datasets_index[dataset_rownumber][1]: # the name is in column 1
            datasets_index[dataset_rownumber][0] = isoTime(local_offset_from_utc)
            datasets_index[dataset_rownumber][4] = date_issued # the date modified is in column 4
        # update versions modified date
        if versions == datasets_index[dataset_rownumber][1]:
            datasets_index[dataset_rownumber][0] = isoTime(local_offset_from_utc)
            datasets_index[dataset_rownumber][4] = date_issued

# Update the date for the term lists, term list versions, and all higher levels regardless of whether it's new or not
higher_level_datasets = ['term-lists', 'term-lists-versions', 'vocabularies', 'vocabularies-versions', 'standards', 'standards-versions']
for dataset_rownumber in range(1, len(datasets_index)):
    if datasets_index[dataset_rownumber][1] in higher_level_datasets: # the name is in column 1
        datasets_index[dataset_rownumber][0] = isoTime(local_offset_from_utc)
        datasets_index[dataset_rownumber][4] = date_issued # the date modified is in column 4

print()
print('standard URI: ', standardUri)
writeCsv('../term-lists/term-lists.csv', term_lists_table)
writeCsv('../index/index-datasets.csv', datasets_index)

Add a row for the new term list version in the term list versions joins file.

In [None]:
term_lists_versions_joins.append([termlistVersionUri, termlist_uri])
writeCsv('../term-lists/term-lists-versions.csv', term_lists_versions_joins)

Add only the new terms to the term list members table. **When only deprecations are being made, skip this step and manually remove the deprecated term from the "term-lists-members.csv" file.** 

In [None]:
for newTerm in new_terms:
    term_lists_members.append([termlist_uri, namespaceUri + newTerm])
writeCsv('../term-lists/term-lists-members.csv', term_lists_members)

Add new term list version to the master term list version metadata file. The script assumes that all of the metadata remains the same as the previous version. If any values are different, change the CSV manually.

In [None]:
# find the columns that contain needed information
list_uri = findColumnWithHeader(term_lists_versions_metadata[0], 'list')[1]
document_modified = findColumnWithHeader(term_lists_versions_metadata[0], 'document_modified')[1]
version_uri = findColumnWithHeader(term_lists_versions_metadata[0], 'version')[1]
version_modified = findColumnWithHeader(term_lists_versions_metadata[0], 'version_modified')[1]
status_column = findColumnWithHeader(term_lists_versions_metadata[0], 'status')[1]

if aNewTermList:
    # get the template for the term list version from first data row in the new_term_list_version.csv file
    try:
        new_term_list_version = readCsv('files_for_new/new_term_list_version.csv')
    except:
        print('The term list version was not found and there was no new_term_list_version.csv file.')
        sys.exit()
    newListRow = new_term_list_version[1]
else:
    # find the most recent previous version of the term list
    mostRecent = 'a' # start the value of mostRecent as something earlier alphabetically than all of the list URIs
    mostRecentListNumber = 0 # dummy list number to be replaced when most recent list is found
    for termListRowNumber in range(1, len(term_lists_versions_metadata)):
        # the row is one of the versions of the list
        if term_lists_versions_metadata[termListRowNumber][list_uri] == termlist_uri:
            # Make the version of the row the mostRecent if it's later than the previous mostRecent
            if term_lists_versions_metadata[termListRowNumber][version_uri] > mostRecent:
                mostRecent = term_lists_versions_metadata[termListRowNumber][version_uri]
                mostRecentListNumber = termListRowNumber

    # change the status of the most recent list to superseded
    term_lists_versions_metadata[mostRecentListNumber][status_column] = 'superseded'
    term_lists_versions_metadata[mostRecentListNumber][document_modified] = isoTime(local_offset_from_utc)

    # start the new list row with the metadata from the most recent list
    newListRow = copy.deepcopy(term_lists_versions_metadata[mostRecentListNumber])

# substitute metadata so that the new row carries the URI, dates, and status of the new list version
newListRow[document_modified] = isoTime(local_offset_from_utc)
newListRow[version_uri] = termlistVersionUri
newListRow[version_modified] = date_issued
newListRow[status_column] = 'recommended'

# append the new term list row to the old list of term lists
term_lists_versions_metadata.append(newListRow)

# save as a file
writeCsv('../term-lists-versions/term-lists-versions.csv', term_lists_versions_metadata)
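
The search for the most recent version relies on the fact that ISO 8601 dates (`YYYY-MM-DD`) sort chronologically when compared as strings, so version URIs that differ only in their trailing date can be compared directly. A minimal sketch of the idea, using a made-up table (the `example.org` URIs, the column positions, and the function name `most_recent_row` are assumptions for illustration only):

```python
import copy

# toy metadata table; column names and values are illustrative only
versions = [
    ['list', 'version', 'status'],
    ['http://example.org/ex/terms/', 'http://example.org/ex/version/terms/2018-06-01', 'superseded'],
    ['http://example.org/ex/terms/', 'http://example.org/ex/version/terms/2019-03-15', 'recommended'],
    ['http://example.org/ex/other/', 'http://example.org/ex/version/other/2021-01-01', 'recommended'],
]
LIST, VERSION, STATUS = 0, 1, 2

def most_recent_row(table, list_uri):
    """Return the row index of the latest version of list_uri, or None if there is none."""
    best = None
    for row_number in range(1, len(table)):
        if table[row_number][LIST] == list_uri:
            # string comparison works because the URIs differ only in the ISO 8601 date
            if best is None or table[row_number][VERSION] > table[best][VERSION]:
                best = row_number
    return best

latest = most_recent_row(versions, 'http://example.org/ex/terms/')

# copy the latest row as a template, then mark the old row as superseded
new_row = copy.deepcopy(versions[latest])
versions[latest][STATUS] = 'superseded'
new_row[VERSION] = 'http://example.org/ex/version/terms/2020-01-01'
new_row[STATUS] = 'recommended'
versions.append(new_row)
```

Returning `None` when no row matches, rather than a dummy row number of 0, makes it harder to overwrite the header row by accident. That is a design choice of this sketch, not something the notebook code does.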

### TDWG minted terms ONLY

**Do not run this code block for borrowed terms that do not have versions!** The placeholder term versions will have to be manually added under the new term list version URI.  See the various Audubon Core borrowed terms in the file for examples.

Create the term list version members list for the new version of the term list.  This includes every pre-existing term version that hasn't changed, the new versions of terms that changed, and term versions for new terms that weren't previously on the list.

**For term deprecations, run this step, then manually remove the last version of the deprecated term from the most recent term version list.**

In [None]:
# create a list of every term version that was in the most recent previous list version
newTermVersionMembersList = []
# create a corresponding list of local names for those versions
termLocalNameList = []

if not aNewTermList:
    for termVersion in term_lists_versions_members:
        # the first column contains the term list version
        if term_lists_versions_metadata[mostRecentListNumber][version_uri] == termVersion[0]:
            newTermVersionMembersList.append(termVersion[1])

            # dissect the term version URI to pull out the local name of the term version
            pieces = termVersion[1].split('/')
            versionLocalNamePiece = pieces[len(pieces)-1]
            # split off the local name string from the issue date part of the version local name
            termLocalNameList.append(versionLocalNamePiece.split('-')[0])

    # For each modified term, find its previous version and replace it with the new version.
    for modified_term in modified_terms:
        for termVersionRowNumber in range(0, len(newTermVersionMembersList)):
            if modified_term == termLocalNameList[termVersionRowNumber]:
                # change the version on the list to the new one
                newTermVersionMembersList[termVersionRowNumber] = namespaceUri + 'version/' + termLocalNameList[termVersionRowNumber] + '-' + date_issued

# For each newly added term, add its new version to the list.
for new_term in new_terms:
    newTermVersionMembersList.append(namespaceUri + 'version/' + new_term + '-' + date_issued)

# Now that the list is created of new term versions that are part of the new term version list,
# add a record for each one to the term list versions members table
for termVersionMember in newTermVersionMembersList:
    term_lists_versions_members.append([termlistVersionUri, termVersionMember])

# Write the updated term list versions members table to a file
writeCsv('../term-lists-versions/term-lists-versions-members.csv', term_lists_versions_members)
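
The logic for assembling the new version's member list can be sketched as a single function. The function name `next_version_members` and the `example.org` URIs are hypothetical; the sketch merges the replace step into the copy loop, so its structure differs slightly from the cell above.

```python
def next_version_members(previous_members, namespace_uri, modified_terms, new_terms, date_issued):
    """Build the member list for a new term list version from the previous version's members."""
    members = []
    for version_uri in previous_members:
        # the last path segment is the version local name, e.g. 'color-2018-06-01'
        version_local_name = version_uri.split('/')[-1]
        # the part before the first hyphen is the term local name
        local_name = version_local_name.split('-')[0]
        if local_name in modified_terms:
            # a modified term gets a new version carrying the new issue date
            members.append(namespace_uri + 'version/' + local_name + '-' + date_issued)
        else:
            # an unchanged term keeps its existing version
            members.append(version_uri)
    # terms that are new to the list get their first version
    for new_term in new_terms:
        members.append(namespace_uri + 'version/' + new_term + '-' + date_issued)
    return members

previous = ['http://example.org/ex/terms/version/color-2018-06-01',
            'http://example.org/ex/terms/version/shape-2018-06-01']
result = next_version_members(previous, 'http://example.org/ex/terms/', ['color'], ['size'], '2020-01-01')
for member in result:
    print(member)
```

Splitting on the first hyphen assumes that term local names never contain a hyphen themselves. If they could, `version_local_name.rsplit('-', 3)[0]` would strip only the three date components.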

Add a record to the term list versions replacements table showing that the new term list version has replaced the previous one (unless the term list is new and not replacing anything).

In [None]:
if not aNewTermList:
    term_lists_versions_replacements.append([termlistVersionUri, term_lists_versions_metadata[mostRecentListNumber][version_uri]])
    writeCsv('../term-lists-versions/term-lists-versions-replacements.csv', term_lists_versions_replacements)

# 6 Generate new vocabulary version

**Note:** The code has been designed to avoid duplication of the same new vocabulary version in the case where the script has already been run for a different term list that was updated for the same vocabulary version.

The update process is very similar to that of the term lists.

## 6.1 Open tables

The types of tables related to vocabularies and their versions are very similar to those of term lists.  They are located in the directories `vocabularies` and `vocabularies-versions`. 

In [None]:
vocabularies_table_filename = '../vocabularies/vocabularies.csv'
vocabularies_table = readCsv(vocabularies_table_filename)
print('Vocabularies table headers and first row: ', vocabularies_table[0:2])

vocabularies_versions_joins_filename = '../vocabularies/vocabularies-versions.csv'
vocabularies_versions_joins = readCsv(vocabularies_versions_joins_filename)
print('Vocabularies versions joins table headers and first row: ', vocabularies_versions_joins[0:2])

vocabularies_members_filename = '../vocabularies/vocabularies-members.csv'
vocabularies_members = readCsv(vocabularies_members_filename)
print('Vocabularies members table headers and first row: ', vocabularies_members[0:2])

vocabularies_versions_metadata_filename = '../vocabularies-versions/vocabularies-versions.csv'
vocabularies_versions_metadata = readCsv(vocabularies_versions_metadata_filename)
print('Vocabularies versions metadata table headers and first row: ', vocabularies_versions_metadata[0:2])

vocabularies_versions_members_filename = '../vocabularies-versions/vocabularies-versions-members.csv'
vocabularies_versions_members = readCsv(vocabularies_versions_members_filename)
print('Vocabularies versions members table headers and first row: ', vocabularies_versions_members[0:2])

vocabularies_versions_replacements_filename = '../vocabularies-versions/vocabularies-versions-replacements.csv'
vocabularies_versions_replacements = readCsv(vocabularies_versions_replacements_filename)
print('Vocabularies versions replacements table headers and first row: ', vocabularies_versions_replacements[0:2])

## 6.2 Update tables

**Note:** The steps here are analogous to updating term lists.

Generate the URIs for the vocabulary and the new vocabulary version.

In [None]:
# find the vocabulary subpath for the updated term list
list_localName_column = findColumnWithHeader(term_lists_table[0], 'list_localName')[1]
list_localName = term_lists_table[term_list_rowNumber][list_localName_column]
print('List local name: ', list_localName)
# the vocabulary subpath is the first part of the list local name
vocab_subpath = list_localName.split('/')[0]
print('Vocabulary subpath: ', vocab_subpath)
termList_subpath = list_localName.split('/')[1]
print('Term List subpath: ', termList_subpath)

# generate the vocabulary URI
vocabularyUri = 'http://rs.tdwg.org/' + vocab_subpath + '/'

# generate the vocabulary version URI
vocabularyVersionUri = 'http://rs.tdwg.org/version/' + vocab_subpath + '/' + date_issued
print('New version URI: ', vocabularyVersionUri)

# check for the case where the script was previously run to update a different term list in the same new vocabulary version
temp = findColumnWithHeader(vocabularies_versions_metadata[0], 'version')[1]
alreadyAddedVocab = False
for versionRow in vocabularies_versions_metadata:
    if versionRow[temp] == vocabularyVersionUri:
        alreadyAddedVocab = True

print('Already added : ', alreadyAddedVocab)
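
As a rough illustration of the step above, the URI derivation and the duplicate check can be written as two small functions. The function names, the example `list_localName` value `'dwc/terms'`, and the toy table are assumptions made for this sketch.

```python
def vocabulary_uris(list_local_name, date_issued):
    """Derive the vocabulary URI and the vocabulary version URI from a term list local name."""
    # the vocabulary subpath is the first part of the list local name
    vocab_subpath = list_local_name.split('/')[0]
    vocabulary_uri = 'http://rs.tdwg.org/' + vocab_subpath + '/'
    vocabulary_version_uri = 'http://rs.tdwg.org/version/' + vocab_subpath + '/' + date_issued
    return vocabulary_uri, vocabulary_version_uri

def version_already_added(metadata_table, version_column, version_uri):
    """Return True if any data row (header skipped) already records this version URI."""
    return any(row[version_column] == version_uri for row in metadata_table[1:])

# assumed example values, for illustration only
vocab_uri, vocab_version_uri = vocabulary_uris('dwc/terms', '2020-01-01')
toy_metadata = [['version', 'vocabulary'],
                ['http://rs.tdwg.org/version/dwc/2020-01-01', 'http://rs.tdwg.org/dwc/']]
print(vocab_uri, vocab_version_uri)
print(version_already_added(toy_metadata, 0, vocab_version_uri))
```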

Update the `vocabulary_modified` value for the focal vocabulary. **Note:** Any modifications to the vocabulary label or description need to be made manually in the CSV file.  

This code block doesn't really do anything in the case where a second term list is being added in the creation of a new vocabulary version, but it doesn't hurt to run it.

In [None]:
vocabulary_uri = findColumnWithHeader(vocabularies_table[0], 'vocabulary')[1]
vocabulary_created = findColumnWithHeader(vocabularies_table[0], 'vocabulary_created')[1]
vocabulary_modified = findColumnWithHeader(vocabularies_table[0], 'vocabulary_modified')[1]
modified_datetime = findColumnWithHeader(vocabularies_table[0], 'document_modified')[1]

aNewVocabulary = True
for rowNumber in range(1, len(vocabularies_table)):
    if vocabularyUri == vocabularies_table[rowNumber][vocabulary_uri]:
        aNewVocabulary = False
        vocabulary_rowNumber = rowNumber
        # In the case where changes are made to a second term list of a new vocabulary, the new modified date will be the same as before
        vocabularies_table[rowNumber][vocabulary_modified] = date_issued
        vocabularies_table[rowNumber][modified_datetime] = isoTime(local_offset_from_utc)
        print(vocabularies_table[rowNumber])

if aNewVocabulary: # this will happen if the vocabulary did not previously exist 
    try:
        new_vocabulary_row = readCsv('files_for_new/new_vocabulary.csv')[1]
    except:
        print('The vocabulary was not found and there was no new_vocabulary.csv file.')
        sys.exit()
    new_vocabulary_row[vocabulary_created] = date_issued
    new_vocabulary_row[vocabulary_modified] = date_issued
    new_vocabulary_row[modified_datetime] = isoTime(local_offset_from_utc)
    vocabularies_table.append(new_vocabulary_row)

writeCsv('../vocabularies/vocabularies.csv', vocabularies_table)

Add a row for the new vocabulary version in the vocabulary versions joins file. In cases where a second term list is being updated in the same new version of the vocabulary, the first term list will already have added the vocabulary version to the list, so it won't need to be added again.

In [None]:
if not alreadyAddedVocab:
    vocabularies_versions_joins.append([vocabularyVersionUri, vocabularyUri])
    writeCsv('../vocabularies/vocabularies-versions.csv', vocabularies_versions_joins)

Add the term list to the vocabulary members table if it is a new term list.

In [None]:
if aNewTermList:
    vocabularies_members.append([vocabularyUri, termlist_uri])
    writeCsv('../vocabularies/vocabularies-members.csv', vocabularies_members)

Add vocabulary version to the master vocabulary version metadata file. The script assumes that all of the metadata remains the same as the previous version. If any values are different, change the CSV manually.

In cases where a second term list is being updated in the same new version of the vocabulary, the first term list will already have added the vocabulary version to the list, so it won't need to be added again.  That is, this code block won't really do anything new, but it doesn't hurt to run it.

In [None]:
# find the columns that contain needed information
vocabulary_uri = findColumnWithHeader(vocabularies_versions_metadata[0], 'vocabulary')[1]
document_modified = findColumnWithHeader(vocabularies_versions_metadata[0], 'document_modified')[1]
version_uri = findColumnWithHeader(vocabularies_versions_metadata[0], 'version')[1]
version_issued = findColumnWithHeader(vocabularies_versions_metadata[0], 'version_issued')[1]
status_column = findColumnWithHeader(vocabularies_versions_metadata[0], 'vocabulary_status')[1]

if not alreadyAddedVocab:
    if aNewVocabulary: # this will happen if the vocabulary did not previously exist 
        try:
            newVocabularyRow = readCsv('files_for_new/new_vocabulary_version.csv')[1]
        except:
            print('The vocabulary version was not found and there was no new_vocabulary_version.csv file.')
            sys.exit()
        # the new row will be appended to the end, so its index will equal the number of rows before appending
        mostRecentVocabularyNumber = len(vocabularies_versions_metadata)
    else:
        # find the most recent previous version of the vocabulary
        mostRecent = 'a' # start the value of mostRecent as something earlier alphabetically than all of the vocabulary version URIs
        mostRecentVocabularyNumber = 0 # dummy vocabulary number to be replaced when most recent vocabulary version is found
        for vocabularyRowNumber in range(1, len(vocabularies_versions_metadata)):
            # the row is one of the versions of the vocabulary
            if vocabularies_versions_metadata[vocabularyRowNumber][vocabulary_uri] == vocabularyUri:
                # Make the version of the row the mostRecent if it's later than the previous mostRecent
                if vocabularies_versions_metadata[vocabularyRowNumber][version_uri] > mostRecent:
                    mostRecent = vocabularies_versions_metadata[vocabularyRowNumber][version_uri]
                    mostRecentVocabularyNumber = vocabularyRowNumber

        # change the status of the most recent vocabulary to superseded
        vocabularies_versions_metadata[mostRecentVocabularyNumber][status_column] = 'superseded'
        vocabularies_versions_metadata[mostRecentVocabularyNumber][document_modified] = isoTime(local_offset_from_utc)

        # start the new vocabulary row with the metadata from the most recent vocabulary
        newVocabularyRow = copy.deepcopy(vocabularies_versions_metadata[mostRecentVocabularyNumber])

    # substitute metadata to make the most recent vocabulary have the modified dates for the new vocabulary
    newVocabularyRow[document_modified] = isoTime(local_offset_from_utc)
    newVocabularyRow[version_uri] = vocabularyVersionUri
    newVocabularyRow[version_issued] = date_issued
    newVocabularyRow[status_column] = 'recommended'

    # append the new vocabulary version row to the vocabulary versions metadata table
    vocabularies_versions_metadata.append(newVocabularyRow)

    # save as a file
    writeCsv('../vocabularies-versions/vocabularies-versions.csv', vocabularies_versions_metadata)

**This is the code block in the vocabulary section that MUST be run when more than one term list is being updated per version of the vocabulary!**

Create the vocabularies versions members list for the new version of the vocabulary. This includes every pre-existing term list version that hasn't changed and the new version of the term list that changed.

In [None]:
# If this is the second term list change for a new vocabulary version, the previous term list versions will already have been added.
# So they don't need to be added to the list.  
if not alreadyAddedVocab:
    # create a list of every term list version that was in the most recent previous vocabulary version
    newVocabularyMembersList = []
    # create a corresponding list of local names for those term list versions
    termListLocalNameList = []

    if aNewVocabulary:
        # the new term list version should be added to the list
        newVocabularyMembersList.append(termlistVersionUri)
    else:
        # find all of the term list versions for the most recent vocabulary version
        for termListVersion in vocabularies_versions_members:
            # the first column contains the vocabulary version
            if vocabularies_versions_metadata[mostRecentVocabularyNumber][version_uri] == termListVersion[0]:
                newVocabularyMembersList.append(termListVersion[1])

                # dissect the term list version URI to pull out the local name of the term list version
                pieces = termListVersion[1].split('/')
                versionLocalNamePiece = pieces[len(pieces)-2]
                termListLocalNameList.append(versionLocalNamePiece)
        if aNewTermList:
            # the new term list version needs to be added to the list
            newVocabularyMembersList.append(termlistVersionUri)
        else:
            # For the modified term list, find its previous version and replace it with the new version.
            for termListVersionRowNumber in range(0, len(newVocabularyMembersList)):
                if termList_subpath == termListLocalNameList[termListVersionRowNumber]:
                    # change the term list version on the list to the new one
                    newVocabularyMembersList[termListVersionRowNumber] = termlistVersionUri
    
    # Now that the list of new term list versions that are part of the new vocabulary version list is created,
    # add a record for each one to the vocabulary versions members table
    for termListVersionMember in newVocabularyMembersList:
        vocabularies_versions_members.append([vocabularyVersionUri, termListVersionMember])

# In the case where previous term list versions have already been added and a new vocabulary version already generated, 
# we only need to update the new term list version.
else: 
    if aNewTermList:
        # the new term list version needs to be added to the list
        vocabularies_versions_members.append([vocabularyVersionUri, termlistVersionUri])
    else:
        # For a modified term list, find its previous version and replace it with the new version.
        for termListVersionRowNumber in range(1, len(vocabularies_versions_members)):
            # consider only term lists that match the vocabulary version URI
            if vocabularies_versions_members[termListVersionRowNumber][0] == vocabularyVersionUri:
                # dissect the term list version URI to pull out the local name of the term list version
                pieces = vocabularies_versions_members[termListVersionRowNumber][1].split('/')
                versionLocalNamePiece = pieces[len(pieces)-2]
                # check for a match of the term list version local name with the namespace string
                if versionLocalNamePiece == namespace:
                    # change the term list version on the list to the new one
                    vocabularies_versions_members[termListVersionRowNumber][1] = termlistVersionUri
    
# Write the updated vocabularies versions members table to a file
writeCsv('../vocabularies-versions/vocabularies-versions-members.csv', vocabularies_versions_members)
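
The matching step in this cell depends on the second-to-last path segment of a term list version URI being the term list subpath. A minimal sketch of that idea follows. The function name and the `example.org` URIs are hypothetical, and unlike the cell above, the sketch appends the new version whenever no earlier version is found instead of consulting `aNewTermList`.

```python
def next_vocabulary_members(previous_members, term_list_subpath, new_term_list_version_uri):
    """Swap in the new term list version, or append it if the term list was not present before."""
    members = []
    replaced = False
    for uri in previous_members:
        # e.g. 'http://example.org/ex/version/terms/2018-06-01' -> second-to-last segment 'terms'
        if uri.split('/')[-2] == term_list_subpath:
            members.append(new_term_list_version_uri)
            replaced = True
        else:
            members.append(uri)
    if not replaced:
        # no earlier version of this term list was found, so it must be new
        members.append(new_term_list_version_uri)
    return members

previous = ['http://example.org/ex/version/terms/2018-06-01',
            'http://example.org/ex/version/iri/2018-06-01']
updated = next_vocabulary_members(previous, 'terms', 'http://example.org/ex/version/terms/2020-01-01')
added = next_vocabulary_members(previous, 'extra', 'http://example.org/ex/version/extra/2020-01-01')
print(updated)
print(added)
```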

If there was a previous vocabulary version, add a record to the vocabulary versions replacements table showing that the new vocabulary version has replaced the previous one.  Don't repeat this if the new vocabulary version was already added when a different term list was processed for this vocabulary version.

So this code block doesn't need to be run if this is the second term list to be updated for a new vocabulary version. But it doesn't hurt to run it because it won't do anything.

In [None]:
if not(aNewVocabulary) and not(alreadyAddedVocab):
    vocabularies_versions_replacements.append([vocabularyVersionUri, vocabularies_versions_metadata[mostRecentVocabularyNumber][version_uri]])
    writeCsv('../vocabularies-versions/vocabularies-versions-replacements.csv', vocabularies_versions_replacements)

# 7 Generate new standard version

**Note:** The script tries to avoid generating duplicate versions of a standard if the script was previously run on a different term list for some vocabulary in the standard.  There isn't really any reason to run any part of section 7 if this is the second term list added to a vocabulary.  However, running the cells shouldn't hurt anything.

The update process is very similar to that of the vocabularies.

## 7.1 Open tables

The types of tables related to standards and their versions are very similar to those of vocabularies.  They are located in the directories `standards` and `standards-versions`. 

In [None]:
standards_table_filename = '../standards/standards.csv'
standards_table = readCsv(standards_table_filename)
print('Standards table headers and first row: ', standards_table[0:2])

In [None]:
standards_versions_joins_filename = '../standards/standards-versions.csv'
standards_versions_joins = readCsv(standards_versions_joins_filename)
print('Standards versions joins table headers and first row: ', standards_versions_joins[0:2])

In [None]:
standards_parts_filename = '../standards/standards-parts.csv'
standards_parts = readCsv(standards_parts_filename)
print('Standards parts table headers and first row: ', standards_parts[0:2])

In [None]:
standards_versions_metadata_filename = '../standards-versions/standards-versions.csv'
standards_versions_metadata = readCsv(standards_versions_metadata_filename)
print('Standards versions metadata table headers and first row: ', standards_versions_metadata[0:2])

In [None]:
standards_versions_parts_filename = '../standards-versions/standards-versions-parts.csv'
standards_versions_parts = readCsv(standards_versions_parts_filename)
print('Standards versions parts table headers and first row: ', standards_versions_parts[0:2])

In [None]:
standards_versions_replacements_filename = '../standards-versions/standards-versions-replacements.csv'
standards_versions_replacements = readCsv(standards_versions_replacements_filename)
print('Standards versions replacements table headers and first row: ', standards_versions_replacements[0:2])

## 7.2 Update tables

**Note:** The steps here are analogous to updating vocabularies.

Generate the URIs for the standard and the new standard version.

In [None]:
# the standard URI (variable: standardUri) was already found in section 5.2 above
print('Standard URI: ', standardUri)

# find the standard number for the standard
standard_number = standardUri.split('/')[4]
print('Standard number: ', standard_number)

# generate the standard version URI
standardVersionUri = standardUri + '/version/' + date_issued
print('New standard version URI: ', standardVersionUri)

# check for the case where the script was previously run to update a different term list in the same new standard version
temp = findColumnWithHeader(standards_versions_metadata[0], 'version')[1]
alreadyAddedStandard = False
for versionRow in standards_versions_metadata:
    if versionRow[temp] == standardVersionUri:
        alreadyAddedStandard = True

print('Already added: ', alreadyAddedStandard)
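
For illustration, the derivation of the standard number and the standard version URI can be sketched as below. The function name is hypothetical, and the example standard URI is an assumed value showing the expected shape (the standard number as the fifth element after splitting on `/`).

```python
def standard_version_info(standard_uri, date_issued):
    """Return the standard number and the standard version URI for a given issue date."""
    # for 'http://www.tdwg.org/standards/450' the split gives
    # ['http:', '', 'www.tdwg.org', 'standards', '450'], so element 4 is the number
    standard_number = standard_uri.split('/')[4]
    standard_version_uri = standard_uri + '/version/' + date_issued
    return standard_number, standard_version_uri

# assumed example values, for illustration only
number, version = standard_version_info('http://www.tdwg.org/standards/450', '2020-01-01')
print(number, version)
```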

Update the `standard_modified` value for the focal standard. **Note:** Any modifications to the standard label or description need to be made manually in the CSV file.

In [None]:
standard_uri = findColumnWithHeader(standards_table[0], 'standard')[1]
standard_created = findColumnWithHeader(standards_table[0], 'standard_created')[1]
standard_modified = findColumnWithHeader(standards_table[0], 'standard_modified')[1]
modified_datetime = findColumnWithHeader(standards_table[0], 'document_modified')[1]

aNewStandard = True
for rowNumber in range(1, len(standards_table)):
    if standardUri == standards_table[rowNumber][standard_uri]:
        aNewStandard = False
        standard_rowNumber = rowNumber
        # in cases where changes are made to a second term list of a new standard, the new modified date will be the same as before
        standards_table[rowNumber][standard_modified] = date_issued
        standards_table[rowNumber][modified_datetime] = isoTime(local_offset_from_utc)
        print(standards_table[rowNumber])

if aNewStandard: # this will happen if the standard did not previously exist 
    try:
        new_standard_row = readCsv('files_for_new/new_standard.csv')[1]
    except:
        print('The standard was not found and there was no new_standard.csv file.')
        sys.exit()
    new_standard_row[standard_created] = date_issued
    new_standard_row[standard_modified] = date_issued
    new_standard_row[modified_datetime] = isoTime(local_offset_from_utc)
    # the row is set to what the last row will be after appending
    standard_rowNumber = len(standards_table)
    print('New standard added:')
    print(new_standard_row)
    standards_table.append(new_standard_row)

writeCsv('../standards/standards.csv', standards_table)

Add a row for the new standard version in the standard versions joins file.  In cases where a second term list is being updated in the standard, the first term list will already have added the standard version to the list, so don't add it again.

In [None]:
if not alreadyAddedStandard:
    standards_versions_joins.append([standardVersionUri, standardUri])
    writeCsv('../standards/standards-versions.csv', standards_versions_joins)

If there is a new vocabulary, add it to the standards parts table.

In [None]:
if aNewVocabulary:
    standards_parts.append([standardUri, vocabularyUri, 'tdwgutility:Vocabulary'])
    writeCsv('../standards/standards-parts.csv', standards_parts)

Add the new standard version to the master standard versions metadata file. The script assumes that all of the metadata remains the same as in the previous version. If any values are different, change the CSV manually.

In cases where a second term list is being updated in the same new version of the standard, the first term list will have already added the standard version to the list, so it won't need to be added again.

In [None]:
# find the columns that contain needed information
standard_uri = findColumnWithHeader(standards_versions_metadata[0], 'standard')[1]
document_modified = findColumnWithHeader(standards_versions_metadata[0], 'document_modified')[1]
version_uri = findColumnWithHeader(standards_versions_metadata[0], 'version')[1]
version_issued = findColumnWithHeader(standards_versions_metadata[0], 'version_issued')[1]
status_column = findColumnWithHeader(standards_versions_metadata[0], 'standard_status')[1]

if not alreadyAddedStandard:
    if aNewStandard: # this will happen if the standard did not previously exist
        try:
            newStandardRow = readCsv('files_for_new/new_standard_version.csv')[1]
        except Exception: # e.g. the file is missing or has no data row
            print('The standard version was not found and there was no new_standard_version.csv file.')
            sys.exit()
        # the new row will be appended, so its index equals the number of rows before appending
        mostRecentStandardNumber = len(standards_versions_metadata)
    else:
        # find the most recent previous version of the standard
        mostRecent = 'a' # start the value of mostRecent as something earlier alphabetically than all of the standard version URIs
        mostRecentStandardNumber = 0 # dummy standard number to be replaced when most recent standard version is found
        for standardRowNumber in range(1, len(standards_versions_metadata)):
            # the row is one of the versions of the standard
            if standards_versions_metadata[standardRowNumber][standard_uri] == standardUri:
                # Make the version of the row the mostRecent if it's later than the previous mostRecent
                if standards_versions_metadata[standardRowNumber][version_uri] > mostRecent:
                    mostRecent = standards_versions_metadata[standardRowNumber][version_uri]
                    mostRecentStandardNumber = standardRowNumber

        # change the status of the most recent standard to superseded
        standards_versions_metadata[mostRecentStandardNumber][status_column] = 'superseded'
        standards_versions_metadata[mostRecentStandardNumber][document_modified] = isoTime(local_offset_from_utc)

        # start the new standard version row with the metadata from the most recent standard version
        newStandardRow = copy.deepcopy(standards_versions_metadata[mostRecentStandardNumber])

    # substitute the metadata values that are specific to the new standard version
    newStandardRow[document_modified] = isoTime(local_offset_from_utc)
    newStandardRow[version_uri] = standardVersionUri
    newStandardRow[version_issued] = date_issued
    newStandardRow[status_column] = 'recommended'

    # append the new standard version row to the existing list of standard versions
    standards_versions_metadata.append(newStandardRow)

    # save as a file
    writeCsv('../standards-versions/standards-versions.csv', standards_versions_metadata)
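
The search for the most recent version in the cell above relies on plain string comparison of version URIs. That works when the version URIs of a standard differ only in a trailing ISO 8601 date, because such dates sort chronologically as text. Below is a minimal, self-contained sketch of the idea. The function name `find_most_recent_row`, the column positions, and the `example.org` URIs are hypothetical and are not part of the script.

```python
def find_most_recent_row(table, standard_col, version_col, standard_uri):
    """Return the index of the row with the greatest version URI for standard_uri.

    table is a list of lists whose first row is the header.
    Returns None when no row matches the standard.
    """
    most_recent_index = None
    for index in range(1, len(table)):  # skip the header row
        row = table[index]
        if row[standard_col] != standard_uri:
            continue
        # string comparison; assumes URIs differ only in a trailing ISO 8601 date
        if most_recent_index is None or row[version_col] > table[most_recent_index][version_col]:
            most_recent_index = index
    return most_recent_index


# hypothetical table: version URI, standard URI
example_versions = [
    ['version', 'standard'],
    ['http://example.org/version/abc/2018-01-15', 'http://example.org/abc'],
    ['http://example.org/version/xyz/2021-03-01', 'http://example.org/xyz'],
    ['http://example.org/version/abc/2019-11-02', 'http://example.org/abc'],
]
latest = find_most_recent_row(example_versions, 1, 0, 'http://example.org/abc')
missing = find_most_recent_row(example_versions, 1, 0, 'http://example.org/none')
print(latest, missing)
```

Unlike the cell above, which falls back to row 0 when nothing matches, this sketch returns `None` in that case so that a caller can tell "not found" apart from the header row.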

Create the standards versions members list for the new version of the standard. This includes every pre-existing vocabulary version that hasn't changed and the new version of the vocabulary that changed.

In [None]:
# If this is the second term list change for a new standard version, the previous vocabulary version will have 
# been added.  So in that case the vocabulary versions need to be checked to prevent duplication.

if not alreadyAddedStandard:
    # create a list of every vocabulary version that was in the most recent previous standard version
    newStandardMembersList = []
    # create a corresponding list of local names for those vocabulary versions
    vocabularyLocalNameList = []

    if aNewStandard:
        # the new vocabulary version needs to be added to the list
        newStandardMembersList.append(vocabularyVersionUri)
    else:
        # find the vocabulary versions for the most recent standard version
        for vocabularyVersion in standards_versions_parts:
            # the first column contains the standard version
            if standards_versions_metadata[mostRecentStandardNumber][version_uri] == vocabularyVersion[0]:
                newStandardMembersList.append(vocabularyVersion[1])

                # dissect the vocabulary version URI to pull out the local name of the vocabulary version
                # (the second-to-last piece of the path)
                pieces = vocabularyVersion[1].split('/')
                versionLocalNamePiece = pieces[-2]
                vocabularyLocalNameList.append(versionLocalNamePiece)

        if aNewVocabulary:
            # the new vocabulary version needs to be added to the list
            newStandardMembersList.append(vocabularyVersionUri)
        else:
            # For the modified vocabulary, find its previous version and replace it with the new version.
            for vocabularyVersionRowNumber in range(0, len(newStandardMembersList)):
                if vocab_subpath == vocabularyLocalNameList[vocabularyVersionRowNumber]:
                    # change the vocabulary version on the list to the new one
                    newStandardMembersList[vocabularyVersionRowNumber] = vocabularyVersionUri

    # Now that the list of new vocabulary versions that are part of the new standard version list is created,
    # add a record for each one to the standard versions members table
    for vocabularyVersionMember in newStandardMembersList:
        standards_versions_parts.append([standardVersionUri, vocabularyVersionMember])
        
# In the case where previous vocabulary versions have already been added and a new standard version already generated
# we only need to update the new vocabulary version
else:
    if aNewVocabulary:
        # the new vocabulary version needs to be added to the list
        standards_versions_parts.append([standardVersionUri, vocabularyVersionUri])
    else:
        # in this case a vocabulary is modified rather than new. So find its version under the current standard version
        # and replace it with the new vocabulary version.  If the change was to a different term list but in the same
        # standard, that's fine - the vocabulary version will be replaced with the same one and duplication will still
        # be prevented
        for vocabularyVersionRowNumber in range(0, len(standards_versions_parts)):
            # consider only vocabularies that match the standard version URI
            if standards_versions_parts[vocabularyVersionRowNumber][0] == standardVersionUri:
                # dissect the vocabulary version URI to pull out the local name of the vocabulary version
                # (the second-to-last piece of the path)
                pieces = standards_versions_parts[vocabularyVersionRowNumber][1].split('/')
                versionLocalNamePiece = pieces[-2]
                # check for a match of the vocabulary version local name with the vocabulary string
                if versionLocalNamePiece == vocabulary:
                    # change the vocabulary version on the list to the new one
                    standards_versions_parts[vocabularyVersionRowNumber][1] = vocabularyVersionUri

# Write the updated standards versions members table to a file
writeCsv('../standards-versions/standards-versions-parts.csv', standards_versions_parts)
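
The central step of the cell above is carrying the previous members forward and swapping in the new vocabulary version wherever the local name matches. Here is a minimal sketch of that step in isolation. It assumes that a vocabulary version URI ends with the local name followed by the issue date, with no trailing slash. The function names and the `example.org` URIs are hypothetical and are not part of the script.

```python
def local_name_of(version_uri):
    """Return the second-to-last path segment of a version URI."""
    return version_uri.split('/')[-2]


def members_for_new_version(previous_members, new_member):
    """Copy previous_members, replacing the entry that shares new_member's local name.

    If no entry shares the local name, new_member is appended instead.
    """
    target = local_name_of(new_member)
    updated = []
    replaced = False
    for member in previous_members:
        if local_name_of(member) == target:
            updated.append(new_member)
            replaced = True
        else:
            updated.append(member)
    if not replaced:
        updated.append(new_member)
    return updated


# hypothetical members of the previous standard version
previous = [
    'http://example.org/version/abc/2018-01-15',
    'http://example.org/version/xyz/2018-01-15',
]
modified = members_for_new_version(previous, 'http://example.org/version/abc/2020-06-30')
extended = members_for_new_version(previous, 'http://example.org/version/new/2020-06-30')
print(modified)
print(extended)
```

The sketch folds the "modified vocabulary" and "new vocabulary" cases into one function, whereas the cell keeps them as separate branches controlled by `aNewVocabulary`.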

Add a record to the standard versions replacements table showing that the new standard version has replaced the previous one.

In [None]:
if not aNewStandard and not alreadyAddedStandard:
    standards_versions_replacements.append([standardVersionUri, standards_versions_metadata[mostRecentStandardNumber][version_uri]])
    writeCsv('../standards-versions/standards-versions-replacements.csv', standards_versions_replacements)