<a href="https://colab.research.google.com/github/mkernik/drum_tools/blob/colab-stringtotext/DRUM_Curator_Tools.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Automation Tools for DRUM Curators


---

**Notebook created by:**

*   Melinda Kernik, University of Minnesota Libraries
*   Valerie Collins, University of Minnesota Libraries

**Contact**: datarepo@umn.edu

The code in this notebook is intended for data curators working with records associated with the [Data Repository for the University of Minnesota](https://conservancy.umn.edu/drum). More information about this code can be found in the main [GitHub repository](https://github.com/mkernik/drum_tools).

# Table of Contents

<small>*Only the "Start here" section is a mandatory step in using this notebook. After this step is completed, any of the "Create" sections can be run in any order.*</small>

1.   Start here
  -   Create Curator Log
  -   Create Readme File
  -   Create XML File

<small>*External resources related to these tools are linked from the following sections:*</small>
2.   Known Issues and Limitations
3.   Download All Files from Record

## Start Here


---
Activate this notebook by running the cell below. You must have this notebook open in Colab to do this.

> An input box for text will appear once the notebook has activated. Paste the **handle** URL for a DRUM record into this input box, then press Enter. The notebook stores this handle and uses it whenever you run any of the code blocks below.

> If you enter an incorrect value for a handle, the code below will not run, but you can enter a new value by running this starting code block again.
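
The cells below turn the handle URL into a REST endpoint by keeping its last two path segments. The sketch below illustrates that step with a little extra checking, so a stray space or trailing slash does not produce a bad endpoint. `handle_to_rest_url` is a hypothetical helper written for this illustration; it is not part of the notebook's scripts, and only the base URL is taken from them.

```python
from urllib.parse import urlparse

# Same base URL that the code cells in this notebook use
REST_BASE = "https://conservancy.umn.edu/rest/handle/"


def handle_to_rest_url(handle_url):
    """Return the REST endpoint for a handle URL (hypothetical helper)."""
    # Drop surrounding whitespace and any trailing slash before splitting the path
    path = urlparse(handle_url.strip()).path.rstrip("/")
    parts = path.split("/")
    # A usable handle ends in <prefix>/<suffix>, for example 11299/226188
    if len(parts) < 2 or not parts[-1] or not parts[-2]:
        raise ValueError("Expected a URL ending in <prefix>/<suffix>, got: " + handle_url)
    return REST_BASE + parts[-2] + "/" + parts[-1]


print(handle_to_rest_url("https://hdl.handle.net/11299/226188"))
print(handle_to_rest_url("https://conservancy.umn.edu/handle/11299/226188/ "))
```

Both example URLs resolve to the same endpoint. A value with no prefix, such as `226188` on its own, raises a `ValueError` instead of reaching the API.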



In [None]:
handle_url = input()

## CREATE CURATOR LOG


---
Run this block of code to create a curator log that is pre-populated with the existing information on the record. By default, this file will be saved to your Downloads folder.
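
The script reads the API response as text, swaps `null` for `"null"`, and passes the result to `eval`. As a possible alternative, the sketch below parses the same kind of text with the standard `json` module, which only reads data and never executes it. The sample response is made up for illustration, and `parse_records` is a hypothetical helper that handles list responses only; the `key` and `value` field names are the ones the script uses.

```python
import json

# Made-up sample shaped like the key/value records the metadata loop expects
sample_response = (
    '[{"key": "dc.title", "value": "Example dataset", "language": null},'
    ' {"key": "dc.description", "value": null, "language": "en"}]'
)


def parse_records(raw_text):
    """Parse a JSON list and turn null values into the string "null" (hypothetical helper)."""
    records = json.loads(raw_text)
    # Keep the script's convention of showing missing values as the text "null"
    return [
        {field: ("null" if value is None else value) for field, value in record.items()}
        for record in records
    ]


for record in parse_records(sample_response):
    print(record["key"] + " : " + record["value"])
```

Because missing values are still turned into the text `"null"`, the string concatenation used in the log template would keep working unchanged.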

In [None]:
"""
script name: metadata_log.py

inputs: DRUM handle URL (e.g. https://hdl.handle.net/11299/226188 or https://conservancy.umn.edu/handle/11299/226188)
output: Text file with curator log template filled out with metadata from the submission

description: This script generates a metadata log based on the information
entered by researchers during the submission process to the Data Repository
for the University of Minnesota (DRUM) and the original files uploaded.

last modified: July 2022
author: Melinda Kernik
"""

import urllib.request
import math
from bs4 import BeautifulSoup
from datetime import datetime
from google.colab import files


def convert_size(size_bytes):
    """Convert file size in bytes to a more human readable format"""

    if size_bytes == 0:
        return "0B"
    size_name = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
    i = int(math.floor(math.log(size_bytes, 1024)))
    p = math.pow(1024, i)
    s = round(size_bytes / p, 2)
    return "%s %s" % (s, size_name[i])



#Use the handle URL to construct a URL to get to the Dspace endpoint for the item
handle_split = handle_url.strip().rstrip("/").split ("/") [-2:]  #Ignore stray whitespace or a trailing slash
handle = str(handle_split[0]) + "/" + str(handle_split[1])
url = "https://conservancy.umn.edu/rest/handle/" + handle

def validate_input(url):
  """Test whether the API for the DRUM item can be opened"""
  valid_url = False
  try:
    response = urllib.request.urlopen(url)
    valid_url = True
    return valid_url
  except Exception as e:
    print (url + " could not be opened. (" + str(e) + ")")

#If the API can be reached, continue generating a log file
valid = validate_input(url)
if valid:
  response = urllib.request.urlopen(url)
  #Read in the content at the endpoint and get the internal id for the item
  presoup = BeautifulSoup(response, 'lxml')
  item_info = presoup.p.text
  item_dict = eval(item_info.replace('null', '"null"'))

  internal_id = item_dict["id"]


  ###Get item bitstream information from the submission

  #Read in the content at the bitstream endpoint. Default limit is 20 items per page.
  #Extended to 250 to account for larger data submissions.
  url = "https://conservancy.umn.edu/rest/items/" + str(internal_id) + "/bitstreams?limit=250"
  response = urllib.request.urlopen(url)
  item_soup = BeautifulSoup(response, 'lxml')
  bitstream = item_soup.p.text
  list_bitstream = eval(bitstream.replace('null', '"null"'))

  #Create the item bitstream section of the log.
  #Note: Embargoed files are not visible through the API
  bitstream_string = ""
  for x in list_bitstream:
      if x['bundleName'] == "ORIGINAL":
          bitstream_string += (x['name'] + " (" + convert_size(x['sizeBytes']) + ")\n")


  ###Get metadata from the submission
  #Use the internal id to create the URL endpoint needed to access the metadata
  url = "https://conservancy.umn.edu/rest/items/" + str(internal_id) + "/metadata"

  #Read in the content at the metadata endpoint
  response = urllib.request.urlopen(url)
  soup = BeautifulSoup(response, 'lxml')
  metadata = soup.p.text
  list_metadata = eval(metadata.replace('null', '"null"'))

  #Create the original metadata section of the log
  metadata_string = ""
  for x in range(len(list_metadata)):
      metadata_string += list_metadata[x]['key'] + " : " + list_metadata[x]['value'] +"\n"
      #Create variables for a few specific metadata elements (title and handle) to use in the log header
      if list_metadata[x]['key']=='dc.title':
          title = list_metadata[x]['value']
      if list_metadata[x]['key']=='dc.identifier.uri':
          handle_uri = list_metadata[x]['value']
      if list_metadata[x]['key'] == 'dc.date.available':
          date_split = list_metadata[x]['value'].split("T")
          date_available = date_split[0]

  ###Add the bitstream and metadata lists to the template metadata log text
  metadata_log_template = "Curation log for: " + title + """
Handle: """ + handle_uri + """
Corresponding researcher:
Curator:
Metadata log created: """ + str(datetime.now().strftime("%Y-%m-%d")) + " (Dataset published: " + date_available + ")" + """
\n*************************************************
Files received:
*************************************************\n""" + bitstream_string + """
*************************************************
Changes made to files:
*************************************************

**************************************************
Metadata Changes
**************************************************

**************************************************
Correspondence Notes
**************************************************

*************************************************
Other issues
*************************************************

*************************************************
Original Metadata from Author:
*************************************************\n"""  + metadata_string

  metadata_filename = (str(handle_split[1]) + "_CuratorLog_" + str(datetime.now().strftime("%Y%m%d")) + ".txt")
  with open(metadata_filename, 'w') as f:
    f.write(metadata_log_template)

  try:
    files.download(metadata_filename)
    print("###################################")
    print("            SUCCESS!               ")
    print("     Check your downloads folder   ")
    print("###################################")

  except Exception as e:
    print(str(e))


## CREATE README FILE


---
Run this block of code to create a readme file that is pre-populated with the existing information on the record. By default, this file will be saved to your Downloads folder.
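
The readme lists each author as "First Last", while DRUM records store names as "Last, First". The sketch below shows one way to do that rearrangement while tolerating names that have no comma, such as organizations. `first_last` is a hypothetical helper and the names are invented examples.

```python
def first_last(name):
    """Turn "Last, First" into "First Last" (hypothetical helper)."""
    # partition splits on the first comma only and reports whether one was found
    last, comma, first = name.partition(",")
    if not comma:
        # No comma: leave the name as entered, apart from surrounding spaces
        return name.strip()
    return (first.strip() + " " + last.strip()).strip()


for name in ["Doe, Jane", "Roe,Richard", "Example Research Group"]:
    print(first_last(name))
```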

In [None]:
"""
script name: automated_readme.py

inputs: DRUM handle URL (e.g. https://hdl.handle.net/11299/226188 or https://conservancy.umn.edu/handle/11299/226188)
output: A readme file in TXT format

description: This script generates a readme from the information entered by
researchers during the submission process to the Data Repository for the
University of Minnesota (DRUM). It is based on a readme template originally
developed at Cornell University.

last modified: September 2022
author: Melinda Kernik
"""

import urllib.request
from bs4 import BeautifulSoup
from string import Template
from datetime import datetime
from google.colab import files


#Use the handle URL to construct a URL to get to the Dspace endpoint for the item
handle_split = handle_url.strip().rstrip("/").split ("/") [-2:]  #Ignore stray whitespace or a trailing slash
handle = str(handle_split[0]) + "/" + str(handle_split[1])
url = "https://conservancy.umn.edu/rest/handle/" + handle

def validate_input(url):
  """Test whether the API for the DRUM item can be opened"""
  valid_url = False
  try:
    response = urllib.request.urlopen(url)
    valid_url = True
    return valid_url
  except Exception as e:
    print (url + " could not be opened. (" + str(e) + ")")

#If the API can be reached, continue generating a Readme
valid = validate_input(url)
if valid:
  response = urllib.request.urlopen(url)
  #Read in the content at the endpoint and get the internal id for the item
  presoup = BeautifulSoup(response, 'lxml')
  item_info = presoup.p.text
  item_dict = eval(item_info.replace('null', '"null"'))

  internal_id = item_dict["id"]

  #Use the internal id to create the URL endpoint needed to access the metadata
  url = "https://conservancy.umn.edu/rest/items/" + str(internal_id) + "/metadata"


  ###Get metadata from the submission

  #Read in the content at the metadata endpoint
  response = urllib.request.urlopen(url)
  soup = BeautifulSoup(response, 'lxml')
  metadata = soup.p.text
  list_metadata = eval(metadata.replace('null', '"null"'))

  #Create a dictionary to be filled with metadata values from the submission
  metadata_dict = {'readme_date': str(datetime.now().strftime("%Y-%m-%d")),
                   'author_citation':"", 'year_published':"", 'url':"",
                   'title':"",'date_published':"", 'authors':"", 'contact_author': "", 'date_collected':"",
                   'spatial':"", 'abstract': "", 'license_info':"", 'publications':"",
                   'funding':"", 'file_list':""}

  #Create lists and dictionaries to hold multi-valued metadata elements or values
  #that need to be edited before being added to the dictionary
  authors_list = []
  referenceby = []
  funders = []
  rights_dict = {}
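  date_collected_dict = {}
  #Default to empty contact details so records without a contact name or email do not raise a NameError below
  contact_name = ""
  contact_email = ""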


  #For each metadata field in Dspace, check if it is something to be included in the readme.
  #If it is, add it to the metadata dictionary or to a list for further processing.
  for x in range(len(list_metadata)):

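      #If the record has been assigned a DOI, use that for the recommended citation. Otherwise, use the handle.
      #The handle is only a fallback: it must not overwrite a DOI found on an earlier pass through the loop.
      if list_metadata[x]['key'] == 'dc.identifier.doi':
        metadata_dict ['url'] = list_metadata[x]['value']
      elif not metadata_dict ['url']:
        metadata_dict ['url'] = handle_url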

      ##General information
      if list_metadata[x]['key'] == 'dc.title':
          metadata_dict ['title'] = list_metadata[x]['value']

      if list_metadata[x]['key'] == 'dc.contributor.author':
          authors_list.append(list_metadata[x]['value'])

      #Retrieve the last name of the contact person to be used in the filename
      if list_metadata[x]['key'] == 'dc.contributor.contactname':
          contact_name = list_metadata[x]['value']
          contact_split = contact_name.split (",") [:]
          contact_lastname = contact_split[0].replace(" ", "_")

      if list_metadata[x]['key'] == 'dc.contributor.contactemail':
          contact_email = list_metadata[x]['value']

      #Split the date field and use only the date (YYYY-MM-DD), not the exact time
      if list_metadata[x]['key'] == 'dc.date.available':
          date_split = list_metadata[x]['value'].split("T")
          metadata_dict ['date_published'] = date_split[0]
          #Isolate the year published to use in the Readme filename
          year_split = date_split[0].split("-")
          year_published = year_split[0]
          metadata_dict ['year_published'] = year_published

      if list_metadata[x]['key'] == 'dc.date.collectedbegin':
          date_collected_dict['begin'] = list_metadata[x]['value']
      if list_metadata[x]['key'] == 'dc.date.collectedend':
          date_collected_dict['end'] = list_metadata[x]['value']

      if list_metadata[x]['key'] == 'dc.coverage.spatial':
          metadata_dict ['spatial'] = list_metadata[x]['value']

      if list_metadata[x]['key'] == 'dc.description.sponsorship':
          funders.append(list_metadata[x]['value'])

      if list_metadata[x]['key'] == 'dc.description.abstract':
          metadata_dict ['abstract'] = list_metadata[x]['value']

      #Sharing/Access Information
      #Remove formatting from dc.rights field before adding it to the metadata dictionary
      if list_metadata[x]['key'] == 'dc.rights':
          rights_dict['rights'] = list_metadata[x]['value'].replace('\r\n', " ")
      if list_metadata[x]['key'] == 'dc.rights.uri':
          rights_dict['rights_url'] = list_metadata[x]['value']

      if list_metadata[x]['key'] == 'dc.relation.isreferencedby':
          referenceby.append(list_metadata[x]['value'])


  ###Format multi-valued metadata elements to be added to the metadata dictionary

  #Format author information for use in the recommended citation
  author_citation = ""
  author_count = 0
  for author in authors_list:
    #Count while going through the list of authors.  When you reach the last author,
    #use a period, otherwise separate using a semicolon.
    author_count += 1
    if author_count == len(authors_list):
      author_citation += author + "."
    else:
      author_citation += author + "; "
  metadata_dict ['author_citation'] = author_citation

  #Format author information for use in the author contact section
  author_string = ""
  for author in authors_list:
      #Rearrange author name to be First Last instead of Last, First.
      #Names without a comma (e.g. organizations) are kept as entered.
      author_split = [part.strip() for part in author.split (",", 1)]
      author_firstLast = " ".join(reversed(author_split))
      #If the author is the contact person, add their email address. If not, leave email blank.
      if author == contact_name:
          author_string += "\n\tName: " + author_firstLast + "\n\tInstitution:\n\tEmail: " + contact_email + "\n\tORCID:\n\n"
      else:
          author_string += "\n\tName: " + author_firstLast + "\n\tInstitution:\n\tEmail:\n\tORCID:\n\n"
  metadata_dict ['authors'] = author_string

  try:
    contact_author_string = "\tAuthor Contact: " + contact_split[1] + " " + contact_split[0] + " (" + contact_email + ")"
  except:
    contact_author_string = "\tAuthor Contact: " + contact_name + " (" + contact_email + ")"
  metadata_dict ['contact_author'] = contact_author_string

  funders_string = ""
  for funder in funders:
      funders_string += "\t" + funder + "\n"
  metadata_dict ['funding'] = funders_string

  publications_string = ""
  for item in referenceby:
      publications_string += item + "\n\n"
  metadata_dict ['publications'] = publications_string


  ## Add together multiple Dspace fields to be used in one section of the readme
  if date_collected_dict:
      #Use .get so a record with only a begin date or only an end date does not raise a KeyError
      metadata_dict ['date_collected'] = str(date_collected_dict.get('begin', "")) + " to " + str(date_collected_dict.get('end', ""))

  if rights_dict:
      #Add the license URL in parentheses only when the record provides one
      rights_string = rights_dict.get('rights', "")
      if 'rights_url' in rights_dict:
          rights_string += " (" + rights_dict['rights_url'] + ")"
      metadata_dict ['license_info'] = rights_string.strip()


  ###Get item bitstream information from the submission

  #Read in the content at the bitstream endpoint. Default limit is 20 items per page.
  #Extended to 250 to account for larger data submissions.
  url = "https://conservancy.umn.edu/rest/items/" + str(internal_id) + "/bitstreams?limit=250"
  response = urllib.request.urlopen(url)
  item_soup = BeautifulSoup(response, 'lxml')
  bitstream = item_soup.p.text
  list_bitstream = eval(bitstream.replace('null', '"null"'))

  #Create the "File List" section of the readme and add it to the metadata dictionary
  file_list_string = "File List\n\n"
  for x in list_bitstream:
      if x['bundleName'] == "ORIGINAL":
          file_list_string += ("\tFilename: " + x['name'] +" \n\tShort description: " + x['description'] + "\n\n")
  metadata_dict ['file_list'] = file_list_string



  ###Insert metadata elements from the submission into the template readme text
  readme_template = Template(
"""This readme.txt file was generated on ${readme_date} by <Name>
Recommended citation for the data: ${author_citation} (${year_published}). ${title}. Retrieved from the Data Repository for the University of Minnesota. ${url}.\n
-------------------
GENERAL INFORMATION
-------------------\n
1. Title of Dataset: ${title}\n
2. Author Information\n\n${contact_author}\n${authors}
3. Date published or finalized for release: ${date_published}\n\n
4. Date of data collection (single date, range, approximate date): ${date_collected}\n\n
5. Geographic location of data collection (where was data collected?): ${spatial}\n\n
6. Information about funding sources that supported the collection of the data:\n${funding}\n
7. Overview of the data (abstract):\n${abstract}\n\n\n\n
--------------------------
SHARING/ACCESS INFORMATION
--------------------------\n
1. Licenses/restrictions placed on the data: ${license_info}\n
2. Links to publications that cite or use the data:\n${publications}
3. Was data derived from another source?
\tIf yes, list source(s):\n
4. Terms of Use: Data Repository for the U of Minnesota (DRUM) By using these files, users agree to the Terms of Use. https://conservancy.umn.edu/pages/policies/#drum=terms-of-use\n\n\n\n
---------------------
DATA & FILE OVERVIEW
---------------------\n
${file_list}\n
2. Relationship between files:\n\n
--------------------------
METHODOLOGICAL INFORMATION
--------------------------\n
1. Description of methods used for collection/generation of data:\n\n
2. Methods for processing the data: <describe how the submitted data were generated from the raw or collected data>\n\n
3. Instrument- or software-specific information needed to interpret the data:\n\n
4. Standards and calibration information, if appropriate:\n\n
5. Environmental/experimental conditions:\n\n
6. Describe any quality-assurance procedures performed on the data:\n\n
7. People involved with sample collection, processing, analysis and/or submission:\n\n\n\n""")

  #Replace variables in the template with the information from the metadata dictionary
  readme_string = readme_template.substitute(metadata_dict)


  ###Add a data_specific section to the readme for each spreadsheet file
  #Make a list of all "Original" bitstream items with ".csv" or ".xlsx" in the name
  spreadsheets = []
  data_specific_string = ""
  for x in list_bitstream:
      if x['bundleName'] == "ORIGINAL":
          if ".csv" in x['name']:
              spreadsheets.append(x['name'])
          #Will pick up a range of Excel formats including .xls, .xlsx, and .xlsm
          if ".xls" in x['name']:
              spreadsheets.append(x['name'])

  #If there are no files with .csv or .xls extensions in the submission, add a
  #placeholder "[FILENAME]" so that there will be one example section
  if not spreadsheets:
      spreadsheets.append("[FILENAME]")

  for item in spreadsheets:
      data_specific_string += """-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: """ + item + """\n-----------------------------------------\n
1. Number of variables:\n
2. Number of cases/rows:\n
3. Missing data codes:\n
\tCode/symbol\tDefinition
\tCode/symbol\tDefinition\n
4. Variable List\n
\tA. Name: <variable name>
\t   Description: <description of the variable>
\t\tValue labels if appropriate\n
\tB. Name: <variable name>
\t   Description: <description of the variable>
\t\tValue labels if appropriate\n\n\n\n"""

  #Add the data-specific section(s) onto the end of the readme
  readme_full_string = readme_string + data_specific_string

  #Create the file name using contact person's last name and the year the submission was published
  #If contact person has not been identified create a file name with just the year published
  try:
    readme_filename = ("Readme_" + contact_lastname + "_" + year_published + ".txt")
  except NameError:
    readme_filename = ("Readme_" + year_published + ".txt")
    print ("No contact name was found in the record's metadata. The Readme file name uses only the year published, and the contact information will need to be added to the Readme manually.")

  #Generate the Readme
  with open(readme_filename, 'w') as f:
    f.write(readme_full_string)

  try:
    files.download(readme_filename)
    print("###################################")
    print("            SUCCESS!               ")
    print("     Check your downloads folder   ")
    print("###################################")
  except Exception as e:
    print(str(e))

## CREATE XML FILE


---
Run this block of code to create an XML file that is formatted in the DataCite metadata schema, based on the information on the record. This file will be saved in an XML format in your Downloads folder, and will need to be uploaded to DataCite to create a DOI for the record.

> [Instructions for uploading the file to DataCite](https://docs.google.com/document/d/16CVkUWrRRStqErDS_L5DRoAaLOEZlAoJBtiiOKFCirE/edit#)
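
The script assembles the XML by joining strings, so characters such as `&` or `<` in a title or license URL would end up in the file unescaped. The sketch below shows one way such values could be escaped with the standard library before being placed in the template. `title_element` and `rights_element` are hypothetical helpers, and the sample values are invented.

```python
from xml.sax.saxutils import escape, quoteattr


def title_element(title):
    """Build a <title> element with &, < and > escaped (hypothetical helper)."""
    return "<title>" + escape(title) + "</title>"


def rights_element(license_url, license_text):
    """Build a <rights> element; quoteattr escapes the URL and adds the quotes."""
    return "<rights rightsURI=" + quoteattr(license_url) + ">" + escape(license_text) + "</rights>"


print(title_element("Soil & water samples, sites <A> and <B>"))
print(rights_element("https://example.org/license?a=1&b=2", "Example license"))
```

Escaping each value at the point where it enters the template leaves the template text itself untouched.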


In [None]:
"""
script name: datacite_xml.py

inputs: DRUM handle URL (e.g. https://hdl.handle.net/11299/226188 or https://conservancy.umn.edu/handle/11299/226188)
output: An XML file formatted in the DataCite Metadata Schema

description: This script generates an XML file using the information from the DRUM record.
The XML file is formatted in the DataCite Metadata Schema 4.4, and can generate a DOI when
uploaded to DataCite.

last modified: July 2022
author: Valerie Collins & Melinda Kernik
"""

import urllib.request
from bs4 import BeautifulSoup
from string import Template
from datetime import datetime
from google.colab import files


#Use the handle URL to construct a URL to get to the Dspace endpoint for the item
handle_split = handle_url.strip().rstrip("/").split ("/") [-2:]  #Ignore stray whitespace or a trailing slash
handle = str(handle_split[0]) + "/" + str(handle_split[1])
url = "https://conservancy.umn.edu/rest/handle/" + handle

def validate_input(url):
  """Test whether the API for the DRUM item can be opened"""
  valid_url = False
  try:
    response = urllib.request.urlopen(url)
    valid_url = True
    return valid_url
  except Exception as e:
    print (url + " could not be opened. (" + str(e) + ")")

#If the API can be reached, continue generating an XML file
valid = validate_input(url)
if valid:

  #Read in the content at the item API endpoint and get the internal id for the item
  response = urllib.request.urlopen(url)
  presoup = BeautifulSoup(response, 'lxml')
  item_info = presoup.p.text
  item_dict = eval(item_info.replace('null', '"null"'))

  internal_id = item_dict["id"]

  metadata_url = "https://conservancy.umn.edu/rest/items/" + str(internal_id) + "/metadata"

  #Read in the content at the metadata endpoint
  response = urllib.request.urlopen(metadata_url)
  soup = BeautifulSoup(response, 'lxml')
  metadata = soup.p.text
  list_metadata = eval(metadata.replace('null', '"null"'))


  #Create lists to hold the multi-valued metadata elements & empty strings for values that might not exist
  authors_list = []
  subjects_list = []
  rights_string = ""
  description_string = ""
  technical_desc_string = ""
  abstract_string = ""
  license_text = ""
  license_url = ""

  #For each metadata field in Dspace, check if it is something to be included in the XML.
  #If it is, save it to a variable.
  for x in range(len(list_metadata)):
      ##General information
      if list_metadata[x]['key'] == 'dc.title':
          title = list_metadata[x]['value']

      if list_metadata[x]['key'] == 'dc.contributor.author':
          authors_list.append(list_metadata[x]['value'])

      if list_metadata[x]['key'] == 'dc.subject':
          subjects_list.append(list_metadata[x]['value'])

      if list_metadata[x]['key'] == 'dc.date.available':
          date_available = list_metadata[x]['value']

      if list_metadata[x]['key'] == 'dc.identifier.uri':
          alt_id = list_metadata[x]['value']

      ##checking for variables that may not exist
      if list_metadata[x]['key'] == "dc.description":
          technical_description = list_metadata[x]['value']
          technical_desc_string = """
<description descriptionType="TechnicalInfo">"""+technical_description+"""</description>"""

      if list_metadata[x]['key'] == 'dc.description.abstract':
          abstract = list_metadata[x]['value']
          abstract_string = """
<description descriptionType="Abstract">""" + abstract + """</description>"""

      #if abstract or description element exists, then build the description block
      if technical_desc_string != "" or abstract_string != "":
          description_string = """
<descriptions>""" + abstract_string + technical_desc_string + """
</descriptions>"""


      if list_metadata[x]['key'] == 'dc.rights':
          license_text = list_metadata[x]['value']

      if list_metadata[x]['key'] == 'dc.rights.uri':
          license_url = list_metadata[x]['value']

      #license text and URI must both be present to build the rights block
      if license_text != "" and license_url != "":
          rights_string = """
<rightsList>
  <rights rightsURI=\""""+license_url+"""\">"""+license_text+"""</rights>
</rightsList>"""


  ### Format multi-valued metadata element "author" to be added to the XML
  author_string = ""
  for author in authors_list:
      #Split up author name ("Last, First"). The given name stays empty if there is no comma.
      author_split = author.split (",", 1)
      author_last = author_split[0].strip()
      author_first = author_split[1].strip() if len(author_split) > 1 else ""
      #loop through authors and append each new XML <creator> block to author_string
      #Closing tags must match the opening tags exactly because XML is case-sensitive
      author_string += """
<creator>
  <creatorName nameType="Personal">""" + author + """</creatorName>
  <givenName>""" + author_first + """</givenName>
  <familyName>""" + author_last + """</familyName>
</creator>"""

  #format <subject> block if subjects exist
  subject_string = ""
  if bool(subjects_list):
      subjects = ""
      #loop through and build out subject block
      for subject in subjects_list:
          subjects += """
  <subject>""" + subject + """</subject> """
      #add subject blocks to outer tags
      subject_string = """
<subjects>""" + subjects + """\n</subjects>"""

  #schema template
  datacite_schema = """<?xml version="1.0" encoding="UTF-8"?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 https://schema.datacite.org/meta/kernel-4.4/metadata.xsd">
<identifier identifierType="DOI"></identifier>
<creators> """ + author_string + """
</creators>
<titles>
  <title>""" + title + """</title>
</titles>
<publisher>Data Repository for the University of Minnesota (DRUM)</publisher>
<publicationYear>""" + str(datetime.now().strftime("%Y")) + """</publicationYear>
<resourceType resourceTypeGeneral="Dataset"/>""" + subject_string + """
<dates>
  <date dateType="Available">""" + date_available + """</date>
</dates>
<alternateIdentifiers>
  <alternateIdentifier alternateIdentifierType="Handle">""" + alt_id + """</alternateIdentifier>
</alternateIdentifiers>
<sizes/>
<formats/>
<version/>""" + rights_string + description_string + """
</resource>"""


  schema_file_name = (str(handle_split[1]) + "_doi_xml.xml")

  with open(schema_file_name, 'w') as f:
    f.write(datacite_schema)

  try:
    files.download(schema_file_name)
    print("###################################")
    print("            SUCCESS!               ")
    print("     Check your downloads folder   ")
    print("###################################")
  except Exception as e:
    print(str(e))

# Known Issues and Limitations
---

DRUM curators can find a full list of known issues and limitations with these tools for our workflows in [this Google Drive document](https://docs.google.com/document/d/1CHwFyh4679QYuL7nouN5F9SxGK7I1-3W2B1i83lxm-o/edit#heading=h.3aytv8au0zxl).

# Download All Files from Record


---

We have created a separate, standalone tool that can be used to download all files from a record to a directory on your computer. Files will be downloaded as they are listed on the DRUM record, and will not be zipped into one package. This tool works on Windows environments only.

You can access the .exe file [here](https://z.umn.edu/DRUM_downloadFiles). Download and extract the folder, then click to open "DRUM_downloadFiles.exe". The first time you open the file, you may need to allow it through your antivirus software.

The Python script version of the tool is also available on [GitHub](https://github.com/mkernik/drum_tools/blob/main/DRUM_downloadFiles.py).