# Zenodo Toolbox: Upload Files and Publish Record (Sandbox)

This notebook demonstrates how to use the Zenodo Toolbox to upload files to a draft record and then publish it on the Zenodo Sandbox. This process simulates the complete data submission workflow in a safe testing environment.

The publishing process in the Zenodo Sandbox involves several key steps:

1. **Creating a Draft**: This is what we did in the previous notebook. It establishes a new deposition with your metadata in the Sandbox environment.

2. **Uploading Files**: In this step, you add your test files to the draft deposition in the Sandbox.

3. **Publishing**: Once you're satisfied with your metadata and uploaded files, you publish the deposition in the Sandbox. This action:
   - Finalizes the record
   - Assigns a test DOI (Digital Object Identifier)
   - Simulates making the record publicly accessible

While records published in the Sandbox are not persistent, this helps in understanding the process and outcome of publishing records on the main Zenodo platform.
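
For orientation, the three steps above correspond roughly to the following calls in Zenodo's REST API. This is a minimal sketch that only builds the HTTP method and URL for each step and sends nothing. The helper names are illustrative and not part of the Zenodo Toolbox, whose functions may issue these requests differently; the deposition ID and bucket URL are made up.

```python
from urllib.parse import quote

SANDBOX_BASE_URL = "https://sandbox.zenodo.org"


def create_draft_request(base_url: str) -> tuple[str, str]:
    """Step 1: a new deposition is created with a POST to the depositions endpoint."""
    return "POST", f"{base_url}/api/deposit/depositions"


def upload_file_request(bucket_url: str, filename: str) -> tuple[str, str]:
    """Step 2: each file is sent with a PUT to <bucket_url>/<filename>."""
    return "PUT", f"{bucket_url.rstrip('/')}/{quote(filename)}"


def publish_request(base_url: str, deposition_id: int) -> tuple[str, str]:
    """Step 3: publishing is a POST to the deposition's publish action."""
    return "POST", f"{base_url}/api/deposit/depositions/{deposition_id}/actions/publish"


# Illustrative values only: the ID and bucket URL below are invented.
example_bucket = f"{SANDBOX_BASE_URL}/api/files/00000000-0000-0000-0000-000000000000"
for method, url in (
    create_draft_request(SANDBOX_BASE_URL),
    upload_file_request(example_bucket, "test_image.png"),
    publish_request(SANDBOX_BASE_URL, 12345),
):
    print(method, url)
```

The Toolbox functions used in this notebook take care of these requests, so the sketch is only meant to show what happens at each step.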

### What We'll Cover

In this notebook, we will:

1. **Connect to an Existing Draft**: We'll use the response data from our previously created draft on the Zenodo Sandbox.

2. **Upload Files**: We'll demonstrate how to upload test images to your draft record.

3. **Publish the Record**: We'll publish the record in the Sandbox, simulating the official publication process.

4. **Review the Published Record**: We'll examine the response to confirm successful publication and retrieve important information about the published record in the Sandbox.


Before proceeding, ensure that you have completed the "Create a Record" notebook using the Zenodo Sandbox and have saved its response data as a JSON file. Let's begin by setting up our environment and loading the previous response data.

In [None]:
import os
from pathlib import Path

# Switch to the repository root when the notebook is started from the Tutorials folder
if Path().absolute().name == "Tutorials":
    os.chdir(Path().absolute().parent)
from main_functions import delete_files_in_draft, publish_record, upload_files_into_deposition
from utilities import append_to_json, load_json, printJSON, write_json

# Initial Configuration, see Notebook #01 for other configurations
ZENODO_BASE_URL = "https://sandbox.zenodo.org"
ZENODO_API_KEY = os.environ.get("ZENODO_SANDBOX_API_KEY")
HEADERS = {"Content-Type": "application/json"}
PARAMS = {"access_token": ZENODO_API_KEY}

# Load the draft data from the JSON file
draft_data = load_json("Tutorials/Output/sandbox_drafts.json")[-1]  # [-1] selects the latest response data

# Extract necessary link from the draft data to upload files
bucket_url = draft_data['links']['bucket']

# Define filepaths to the files that shall be uploaded
filepaths = ["Tutorials/Images/test_image.png", "Tutorials/Images/test_image_2.png"]

# Upload Files and retrieve Response
fileupload_msg, fileupload_data = upload_files_into_deposition(draft_data, filepaths, replace_existing=True)

# Print the resulting response
print("\nResponse of Fileupload to Zenodo Sandbox:")
printJSON(fileupload_data)

if fileupload_msg["success"] and fileupload_data:
    append_to_json(fileupload_data, "Tutorials/Output/sandbox_files.json")
    print("\nFiles successfully uploaded! Response Data saved to: ./Tutorials/Output/sandbox_files.json")
    for i in fileupload_data:
        print(f"\nDirect Link to {i['filename']}: {i['links']['download'].replace('/files', '/draft/files')}")
else:
    print("\nFailed to upload Files. Please check the error message above or in fileupload_msg['text']:")
    print(fileupload_msg["text"])

### Delete Files in Draft

If you repeat the upload above, you will receive a 400 response code because files with these names already exist in the deposit. To resolve this, pass the previously retrieved file upload response to `delete_files_in_draft` to delete the uploaded files.

<small>

Note: If you create a new version of an existing record rather than a completely new record, the files response is available as `response_data["files"]`.

</small>
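
Before uploading again, it can help to check which local files would collide with files already in the draft. A minimal sketch, assuming the file upload response is a list of dictionaries that each carry a `filename` key, as in the cells of this notebook. `find_conflicting_files` is a hypothetical helper, not a Toolbox function.

```python
from pathlib import Path


def find_conflicting_files(filepaths, uploaded_files):
    """Return the local paths whose file name already exists in the draft."""
    existing_names = {entry["filename"] for entry in uploaded_files}
    return [path for path in filepaths if Path(path).name in existing_names]


# Illustrative data shaped like the upload response used in this notebook
uploaded = [{"filename": "test_image.png"}]
local_files = ["Tutorials/Images/test_image.png", "Tutorials/Images/test_image_2.png"]

conflicts = find_conflicting_files(local_files, uploaded)
print("Already in draft:", conflicts)
```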

In [None]:
fileupload_data = load_json("Tutorials/Output/sandbox_files.json")[-1] # load latest fileupload data
delete_msg, delete_data = delete_files_in_draft(fileupload_data) # delete files in draft
if delete_msg["success"]:
    print(f"Files successfully deleted from draft: {' | '.join(file['filename'] for file in fileupload_data)}")
else:
    print("\nFailed to delete files from draft. Please check the error messages:")
    print(delete_msg["text"])

After executing the above operation, the deposit should be empty again. A record cannot be published without any files in the deposit, so upload them again:

In [None]:
fileupload_msg, fileupload_data = upload_files_into_deposition(draft_data, filepaths, replace_existing=True)

print("\nResponse of Fileupload to Zenodo Sandbox:")
printJSON(fileupload_data)

if fileupload_msg["success"] and fileupload_data:
    append_to_json(fileupload_data, "Tutorials/Output/sandbox_files.json")
    print("\nFiles successfully uploaded! Response Data saved to: ./Tutorials/Output/sandbox_files.json")
    for i in fileupload_data:
        print(f"\nDirect Link to {i['filename']}: {i['links']['download'].replace('/files', '/draft/files')}")
else:
    print("\nFailed to upload Files. Please check the error message above or in fileupload_msg['text']:")
    print(fileupload_msg["text"])

If everything went fine, you should now be able to view the resulting file response in [sandbox_files.json](Output/sandbox_files.json).
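
The `[-1]` index used throughout this notebook suggests that each output file holds a list of responses, with the newest one at the end. The following sketch illustrates that layout with plain `json`. `append_response` and `latest_response` are hypothetical stand-ins and not the Toolbox's actual `append_to_json` and `load_json` implementations, which may behave differently.

```python
import json
import tempfile
from pathlib import Path


def append_response(response, path):
    """Append one response to a JSON file that holds a list of responses."""
    path = Path(path)
    history = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    history.append(response)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(history, indent=2), encoding="utf-8")


def latest_response(path):
    """Return the most recently appended response."""
    return json.loads(Path(path).read_text(encoding="utf-8"))[-1]


# Demonstration in a temporary directory with invented responses
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "Output" / "sandbox_files.json"
    append_response([{"filename": "first.png"}], target)
    append_response([{"filename": "second.png"}], target)
    latest = latest_response(target)

print(latest)
```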

### Publish Record

Now that we have uploaded our files to the draft record, we're ready to publish it on the Zenodo Sandbox. Publishing the record will finalize it, assign a test DOI, and simulate making the record publicly accessible.

Let's use the `publish_record` function to publish our draft:

In [None]:
# Publish the record
publish_msg, publish_data = publish_record(draft_data)

if publish_msg["success"]:
    print("Record successfully published!")
    print(f"DOI: {publish_data['doi']}")
    print(f"Record URL: {publish_data['links']['record_html']}")
    
    # Save the published record data
    append_to_json(publish_data, "Tutorials/Output/sandbox_published.json")
    print("Published record data saved to: Tutorials/Output/sandbox_published.json")
else:
    print("Failed to publish record. Error message:")
    print(publish_msg["text"])

### Review Published Record

After successfully publishing the record, you should be able to view the result in [sandbox_published.json](Output/sandbox_published.json). Let's review some key information about our newly published record in the Zenodo Sandbox:

<small>

Note: To retrieve the correct direct link to the files, '/draft' must be removed from `published_data['files'][n]['links']['download']`. The response contains only the draft link, even after the record has been published.

</small>
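
The string replacements used in this notebook can be wrapped in two small helpers so that applying them twice does no harm. A minimal sketch, assuming download links contain a single `/files` path segment as in the responses handled here. The helper names and the example URL are hypothetical.

```python
def to_draft_link(download_url: str) -> str:
    """Insert '/draft' before the first '/files' segment unless it is already there."""
    if "/draft/files" in download_url:
        return download_url
    return download_url.replace("/files", "/draft/files", 1)


def to_published_link(download_url: str) -> str:
    """Remove the '/draft' segment that precedes '/files', if present."""
    return download_url.replace("/draft/files", "/files", 1)


# Made-up URL, used only to show the transformation
example = "https://sandbox.zenodo.org/api/records/12345/files/test_image.png/content"
draft = to_draft_link(example)
print(draft)
print(to_published_link(draft))
```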

In [None]:
# Load the published record data
published_data = load_json("Tutorials/Output/sandbox_published.json")[-1] # Note: Index was set to the latest response here.

print("Published Record Information:")
print(f"Title: {published_data['metadata']['title']}")
print(f"DOI: {published_data['doi']}")
print(f"Record URL: {published_data['links']['record_html']}")
print("\nFiles in the published record:")
for file in published_data['files']:
    print(f"- {file['filename']} (Size: {int(file['filesize']) / (1024 * 1024):.2f} MB): {file['links']['download'].replace('/draft', '')}")

print("\nMetadata:")
for key, value in published_data['metadata'].items():
    if key not in ['title', 'doi']:
        print(f"- {key}: {value}")


This completes the process of creating a draft, uploading files, and publishing a record on the Zenodo Sandbox. You can now view your record published in the Sandbox using the printed link. Remember that records published in the Sandbox are not persistent and are meant for testing purposes only. When you're ready to publish real data, you'll use the main Zenodo platform with a similar workflow.
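
Switching between the Sandbox and the main platform mostly comes down to the base URL and the API token. A minimal sketch, assuming the production token lives in an environment variable named `ZENODO_API_KEY` (a hypothetical name; only `ZENODO_SANDBOX_API_KEY` appears in this notebook). `get_zenodo_config` is likewise a hypothetical helper.

```python
import os


def get_zenodo_config(use_sandbox: bool = True, env=None) -> dict:
    """Return base URL and request parameters for the chosen Zenodo environment."""
    env = os.environ if env is None else env
    if use_sandbox:
        base_url = "https://sandbox.zenodo.org"
        token_variable = "ZENODO_SANDBOX_API_KEY"
    else:
        base_url = "https://zenodo.org"
        token_variable = "ZENODO_API_KEY"  # assumed name, not taken from the Toolbox

    token = env.get(token_variable)
    if not token:
        raise RuntimeError(f"Environment variable {token_variable} is not set")
    return {"base_url": base_url, "params": {"access_token": token}}


# Demonstration with a stand-in environment so no real token is needed
config = get_zenodo_config(use_sandbox=True, env={"ZENODO_SANDBOX_API_KEY": "dummy-token"})
print(config["base_url"])
```

Keeping the Sandbox as the default makes it harder to publish test data on the main platform by accident.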


## Understanding Zenodo Record Concepts

When working with Zenodo, it's important to understand several key concepts related to record identification, versioning, and persistent identifiers:

### Record Identifiers

1. **Record ID**: A unique numerical identifier assigned to each individual version of a record in Zenodo. It changes with each new version.

2. **Concept Record ID**: A persistent identifier that represents all versions of a record. It remains constant across different versions.

### Digital Object Identifiers (DOIs)

3. **DOI (Digital Object Identifier)**: A persistent identifier assigned to each specific version of a record. Each new version receives its own DOI, while the DOIs of earlier versions remain valid.

4. **Concept DOI**: A persistent identifier that represents all versions of a record. It remains constant and always resolves to the latest version.

### Versioning

5. **Versions**: Zenodo allows you to create new versions of a record while maintaining links between different versions. Each version gets a new Record ID and DOI, but shares the same Concept Record ID and Concept DOI. The aim is to keep research data available in exactly the version in which it was cited.
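
The relationship between these identifiers can be illustrated with a small data model. A minimal sketch with invented IDs. The field name `conceptrecid`, the DOI pattern `<prefix>/zenodo.<id>`, and the Sandbox prefix `10.5072` reflect what Zenodo responses typically look like, but should be treated as assumptions here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordVersion:
    record_id: int      # changes with every version
    conceptrecid: int   # shared by all versions of the record
    doi_prefix: str = "10.5072"  # assumed Sandbox test prefix

    @property
    def doi(self) -> str:
        """Version DOI: identifies exactly this version."""
        return f"{self.doi_prefix}/zenodo.{self.record_id}"

    @property
    def conceptdoi(self) -> str:
        """Concept DOI: identifies the record across all versions."""
        return f"{self.doi_prefix}/zenodo.{self.conceptrecid}"


# Two versions of the same record (IDs are invented for illustration)
v1 = RecordVersion(record_id=1001, conceptrecid=1000)
v2 = RecordVersion(record_id=1002, conceptrecid=1000)

print(v1.doi, v2.doi)
print(v1.conceptdoi == v2.conceptdoi)
```

Citing `v1.doi` always points to the first version, whereas the shared concept DOI follows the record as new versions appear.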

### Other Important Concepts

6. **Communities**: Collections of records in Zenodo, often organized around specific topics or projects. Records can belong to multiple communities.

7. **Embargo**: A feature that allows you to restrict access to files in a record for a specified period.

8. **Restricted Access**: The ability to limit access to certain files or entire records to specific users or groups.

9. **Metadata**: Descriptive information about the record, including title, authors, description, and more. It's crucial for discoverability and proper citation.

10. **License**: The terms under which the record's content is made available. Zenodo supports various open licenses and rights statements.

11. **Sandbox**: A testing environment that mimics the main Zenodo platform, allowing users to experiment with uploads and workflows without creating permanent records.
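
Embargo, restricted access, and license are expressed through the record's metadata. A minimal sketch, assuming the field names of Zenodo's deposit metadata (`access_right`, `embargo_date`, `access_conditions`, `license`) and the license identifier `cc-by-4.0`. `build_access_metadata` is a hypothetical helper, and the exact fields the Toolbox expects may differ.

```python
from datetime import date
from typing import Optional


def build_access_metadata(
    access_right: str = "open",
    license_id: Optional[str] = "cc-by-4.0",
    embargo_until: Optional[date] = None,
    access_conditions: Optional[str] = None,
) -> dict:
    """Build the access-related part of a metadata dictionary."""
    allowed = {"open", "embargoed", "restricted", "closed"}
    if access_right not in allowed:
        raise ValueError(f"access_right must be one of {sorted(allowed)}")

    metadata = {"access_right": access_right}
    # A license only applies when the files are, or will become, openly available
    if access_right in {"open", "embargoed"} and license_id:
        metadata["license"] = license_id
    if access_right == "embargoed":
        if embargo_until is None:
            raise ValueError("An embargoed record needs an embargo end date")
        metadata["embargo_date"] = embargo_until.isoformat()
    if access_right == "restricted":
        if not access_conditions:
            raise ValueError("A restricted record needs access conditions")
        metadata["access_conditions"] = access_conditions
    return metadata


print(build_access_metadata("embargoed", embargo_until=date(2030, 1, 1)))
```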

Understanding these concepts is crucial for effectively managing and sharing research outputs on Zenodo, ensuring proper versioning, citation, and access control for your records.


## What Next?

Congratulations on completing the basic workflow for creating, uploading, and publishing a record on Zenodo Sandbox! In the upcoming notebooks, we'll explore more advanced features and operations to enhance your Zenodo workflow:

1. **Database Integration**: Learn how to keep track of your Zenodo records using local or remote databases for improved management and retrieval.

2. **Versioning, Communities and Advanced Descriptions**: Discover techniques for updating existing records and creating new versions to maintain a clear history of your datasets.

3. **Batch Operations**: Explore how to process Excel files for efficient batch uploads, streamlining the submission of multiple records.

4. **Image Processing**:
   - Implement person masking using detector and segmentation models for privacy protection.
   - Apply image scaling techniques for consistent file sizes.
   - Extract and utilize EXIF metadata from images.

5. **3D Model Handling**:
   - Generate thumbnails for 3D models to provide quick visual references.
   - Convert 3D models to the GLB format for better compatibility.
   - Apply model reduction techniques for various retrieval cases.

These advanced topics will help you create more sophisticated and efficient workflows for managing your research data on Zenodo.
