feat(ingest-upload): create new ingest xmlupload cli command (DEV-3019) #670

Merged
merged 91 commits into from
Dec 22, 2023
Changes from 87 commits
af7ce45
add test data
Nora-Olivia-Ammann Dec 5, 2023
a4dc4d0
Create ingest user message
Nora-Olivia-Ammann Dec 5, 2023
54bb073
change shortcode
Nora-Olivia-Ammann Dec 5, 2023
0fff791
apply ingest name mapping
Nora-Olivia-Ammann Dec 5, 2023
aadde21
user information
Nora-Olivia-Ammann Dec 5, 2023
f4ea24f
Update user_information.py
Nora-Olivia-Ammann Dec 5, 2023
bb17f55
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 6, 2023
903f5c4
Merge branch 'wip/dev-3019-create-new-ingest-cli' of https://github.c…
Nora-Olivia-Ammann Dec 6, 2023
250c5c6
name changes
Nora-Olivia-Ammann Dec 6, 2023
3fbacca
change user information
Nora-Olivia-Ammann Dec 6, 2023
5e7760d
Update upload_config.py
Nora-Olivia-Ammann Dec 6, 2023
5f0142e
sourcery
Nora-Olivia-Ammann Dec 6, 2023
33112fe
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 6, 2023
8f8d923
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 6, 2023
bdfe354
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 7, 2023
e4fa44f
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 7, 2023
793840b
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 7, 2023
529ffc3
rename
Nora-Olivia-Ammann Dec 7, 2023
4d8a408
change max prints
Nora-Olivia-Ammann Dec 7, 2023
d0d3b54
fix type hints
Nora-Olivia-Ammann Dec 8, 2023
29935fb
change upload config
Nora-Olivia-Ammann Dec 8, 2023
5444167
Change paths
Nora-Olivia-Ammann Dec 8, 2023
2be09ff
simplify test data
Nora-Olivia-Ammann Dec 8, 2023
3d1b4a7
integrate ingest in upload resource
Nora-Olivia-Ammann Dec 8, 2023
122bdb2
add ingest parser
Nora-Olivia-Ammann Dec 8, 2023
b5e886d
rename do fast upload variable
Nora-Olivia-Ammann Dec 8, 2023
8920ad8
add ingest call
Nora-Olivia-Ammann Dec 8, 2023
ad9c5f4
update
Nora-Olivia-Ammann Dec 8, 2023
08687b1
add parser
Nora-Olivia-Ammann Dec 8, 2023
bdc2358
add permissible file endings
Nora-Olivia-Ammann Dec 8, 2023
fbc01a4
file not found
Nora-Olivia-Ammann Dec 8, 2023
f00df0b
user info
Nora-Olivia-Ammann Dec 8, 2023
2797524
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 8, 2023
eae6e00
Update src/dsp_tools/commands/ingest_xmlupload/apply_ingest.py
Nora-Olivia-Ammann Dec 11, 2023
509af34
Update test_user_information.py
Nora-Olivia-Ammann Dec 11, 2023
6f14c16
docstring
Nora-Olivia-Ammann Dec 11, 2023
2823360
simplify data
Nora-Olivia-Ammann Dec 11, 2023
e1c491c
Merge branch 'wip/dev-3019-create-new-ingest-cli' of https://github.c…
Nora-Olivia-Ammann Dec 11, 2023
f105ac7
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 11, 2023
bbcdd85
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 12, 2023
34f794e
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 12, 2023
f2ed9e9
fixes from previous merge
Nora-Olivia-Ammann Dec 12, 2023
b2877a0
Update src/dsp_tools/commands/ingest_xmlupload/upload_xml.py
Nora-Olivia-Ammann Dec 12, 2023
71992d5
Update src/dsp_tools/commands/ingest_xmlupload/upload_xml.py
Nora-Olivia-Ammann Dec 12, 2023
a6d5c38
renaming folder test data
Nora-Olivia-Ammann Dec 12, 2023
28a82bb
renaming folder test data
Nora-Olivia-Ammann Dec 12, 2023
48ec620
renaming file
Nora-Olivia-Ammann Dec 12, 2023
0ddc47e
logg message
Nora-Olivia-Ammann Dec 12, 2023
37911e1
cosmetic changes
Nora-Olivia-Ammann Dec 12, 2023
b5c99a5
change to class variable
Nora-Olivia-Ammann Dec 12, 2023
4135098
change default variables
Nora-Olivia-Ammann Dec 12, 2023
c76e313
Update src/dsp_tools/commands/ingest_xmlupload/user_information.py
Nora-Olivia-Ammann Dec 13, 2023
573de91
Update test_user_information.py
Nora-Olivia-Ammann Dec 13, 2023
0c7daf9
Merge branch 'wip/dev-3019-create-new-ingest-cli' of https://github.c…
Nora-Olivia-Ammann Dec 13, 2023
69600a7
cosmetic changes
Nora-Olivia-Ammann Dec 13, 2023
c395082
renaming media
Nora-Olivia-Ammann Dec 13, 2023
d5dff67
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 13, 2023
bcea3e5
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 13, 2023
fae1f2f
formulation user info
Nora-Olivia-Ammann Dec 13, 2023
5c67af4
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 13, 2023
a99796e
header
Nora-Olivia-Ammann Dec 13, 2023
6523063
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 14, 2023
77f5c1f
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 14, 2023
dc47f10
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 15, 2023
faffc6b
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 18, 2023
716bc8e
fix xml parser
Nora-Olivia-Ammann Dec 18, 2023
48fd006
xml parser
Nora-Olivia-Ammann Dec 18, 2023
7cc706e
linting
Nora-Olivia-Ammann Dec 18, 2023
012d42a
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 20, 2023
48cd76d
update user_information
Nora-Olivia-Ammann Dec 20, 2023
6e46710
Change xmlfile to path from string
Nora-Olivia-Ammann Dec 20, 2023
f042ec2
add test data
Nora-Olivia-Ammann Dec 20, 2023
d573533
change path
Nora-Olivia-Ammann Dec 20, 2023
800c7c4
add shortcode to read mapping csv
Nora-Olivia-Ammann Dec 20, 2023
ab1b711
Create test_ingest_xmlupload.py
Nora-Olivia-Ammann Dec 20, 2023
397c090
linting
Nora-Olivia-Ammann Dec 20, 2023
0fa05c5
change misleading uuid to id
Nora-Olivia-Ammann Dec 20, 2023
45ef10f
proposed changes
Nora-Olivia-Ammann Dec 20, 2023
2cc2ce3
Update test_ingest_xmlupload.py
Nora-Olivia-Ammann Dec 20, 2023
aae8917
change test
Nora-Olivia-Ammann Dec 21, 2023
89ec625
Merge branch 'main' into wip/dev-3019-create-new-ingest-cli
Nora-Olivia-Ammann Dec 21, 2023
ae205fd
Cosmetic changes
Nora-Olivia-Ammann Dec 22, 2023
b957c25
change headers variable
Nora-Olivia-Ammann Dec 22, 2023
92a90cb
fix ducktyping
Nora-Olivia-Ammann Dec 22, 2023
babdc8f
cosmetic changes
Nora-Olivia-Ammann Dec 22, 2023
3b447da
cosmetic changes
Nora-Olivia-Ammann Dec 22, 2023
0c98cab
cosmetic changes
Nora-Olivia-Ammann Dec 22, 2023
5a2b4f0
delete files
Nora-Olivia-Ammann Dec 22, 2023
ae00433
cosmetic changes
Nora-Olivia-Ammann Dec 22, 2023
533b505
delete readme
Nora-Olivia-Ammann Dec 22, 2023
f7113a8
undo mistaken delete
Nora-Olivia-Ammann Dec 22, 2023
15 changes: 15 additions & 0 deletions src/dsp_tools/cli/call_action.py
@@ -1,4 +1,5 @@
import argparse
from pathlib import Path

from dsp_tools.commands.excel2json.lists import excel2lists, validate_lists_section_with_schema
from dsp_tools.commands.excel2json.project import excel2json
@@ -9,6 +10,7 @@
from dsp_tools.commands.fast_xmlupload.upload_files import upload_files
from dsp_tools.commands.fast_xmlupload.upload_xml import fast_xmlupload
from dsp_tools.commands.id2iri import id2iri
from dsp_tools.commands.ingest_xmlupload.upload_xml import ingest_xmlupload
from dsp_tools.commands.project.create.project_create import create_project
from dsp_tools.commands.project.create.project_create_lists import create_lists
from dsp_tools.commands.project.create.project_validate import validate_project
@@ -68,6 +70,8 @@ def call_requested_action(args: argparse.Namespace) -> bool: # noqa: PLR0911 (T
            return _call_upload_files(args)
        case "fast-xmlupload":
            return _call_fast_xmlupload(args)
        case "ingest-xmlupload":
            return _call_ingest_xmlupload(args)
        case "template":
            return generate_template_repo()
        case "rosetta":
@@ -144,6 +148,17 @@ def _call_excel2json(args: argparse.Namespace) -> bool:
    )


def _call_ingest_xmlupload(args: argparse.Namespace) -> bool:
    ingest_xmlupload(
        xml_file=Path(args.xml_file),
        user=args.user,
        password=args.password,
        dsp_url=args.server,
        sipi_url=args.sipi_url,
    )
    return True


def _call_fast_xmlupload(args: argparse.Namespace) -> bool:
    return fast_xmlupload(
        xml_file=args.xml_file,
19 changes: 19 additions & 0 deletions src/dsp_tools/cli/create_parsers.py
@@ -45,6 +45,8 @@ def make_parser(

    _add_xmlupload(subparsers, default_dsp_api_url, root_user_email, root_user_pw)

    _add_ingest_xmlupload(subparsers, default_dsp_api_url, root_user_email, root_user_pw)

    _add_fast_xmlupload(subparsers, default_dsp_api_url, root_user_email, root_user_pw)

    _add_process_files(subparsers)
@@ -178,6 +180,23 @@ def _add_excel2json(subparsers: _SubParsersAction[ArgumentParser]) -> None:
    subparser.add_argument("project_definition", help="path to the output JSON file")


def _add_ingest_xmlupload(
    subparsers: _SubParsersAction[ArgumentParser],
    default_dsp_api_url: str,
    root_user_email: str,
    root_user_pw: str,
) -> None:
    subparser = subparsers.add_parser(
        name="ingest-xmlupload",
        help="For internal use only: create resources with files already uploaded through dsp-ingest",
    )
    subparser.set_defaults(action="ingest-xmlupload")
    subparser.add_argument("-s", "--server", default=default_dsp_api_url, help=dsp_server_text)
    subparser.add_argument("-u", "--user", default=root_user_email, help=username_text)
    subparser.add_argument("-p", "--password", default=root_user_pw, help=password_text)
    subparser.add_argument("xml_file", help="path to XML file containing the data")
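
The parser registration and the dispatch in `call_requested_action` follow a common argparse pattern: `set_defaults(action=...)` tags the namespace with the subcommand name, and the dispatcher branches on that tag. The following is a minimal standalone sketch of that pattern, not the dsp-tools code itself. The program name, the default values, and the `dispatch` helper are hypothetical stand-ins.

```python
import argparse
from pathlib import Path


def make_parser() -> argparse.ArgumentParser:
    """Build a parser with a single 'ingest-xmlupload' subcommand."""
    parser = argparse.ArgumentParser(prog="demo-tools")  # hypothetical program name
    subparsers = parser.add_subparsers()
    sub = subparsers.add_parser("ingest-xmlupload", help="create resources for already ingested files")
    # set_defaults stores the subcommand name so the dispatcher can branch on it
    sub.set_defaults(action="ingest-xmlupload")
    sub.add_argument("-s", "--server", default="http://0.0.0.0:3333")  # placeholder default
    sub.add_argument("-u", "--user", default="root@example.com")  # placeholder default
    sub.add_argument("-p", "--password", default="test")  # placeholder default
    sub.add_argument("xml_file", help="path to XML file containing the data")
    return parser


def dispatch(args: argparse.Namespace) -> str:
    """Route the parsed namespace to a handler; here the handler only builds a description."""
    if args.action == "ingest-xmlupload":
        return f"ingest-xmlupload of {Path(args.xml_file).name} on {args.server} as {args.user}"
    raise ValueError(f"unknown action: {args.action}")


args = make_parser().parse_args(["ingest-xmlupload", "-u", "me@example.com", "data.xml"])
result = dispatch(args)
print(result)
```

Options that are not passed on the command line fall back to the defaults given at registration, which is why the server value in the result comes from the placeholder default.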


def _add_fast_xmlupload(
    subparsers: _SubParsersAction[ArgumentParser],
    default_dsp_api_url: str,
2 changes: 1 addition & 1 deletion src/dsp_tools/commands/fast_xmlupload/upload_xml.py
@@ -118,7 +118,7 @@ def fast_xmlupload(
         password=password,
         imgdir=".",
         sipi=sipi_url,
-        config=UploadConfig(preprocessing_done=True),
+        config=UploadConfig(media_previously_uploaded=True),
     )

     end_time = datetime.now()
Empty file.
66 changes: 66 additions & 0 deletions src/dsp_tools/commands/ingest_xmlupload/apply_ingest_id.py
@@ -0,0 +1,66 @@
from __future__ import annotations

from copy import deepcopy
from pathlib import Path
from typing import cast

import pandas as pd
from lxml import etree

from dsp_tools.commands.ingest_xmlupload.user_information import IngestInformation
from dsp_tools.models.exceptions import InputError
from dsp_tools.utils.create_logger import get_logger

logger = get_logger(__name__)


def get_mapping_dict_from_file(shortcode: str) -> dict[str, str]:
    """
    This function returns the information needed to replace the original filepaths
    with the identifiers from dsp-ingest.

    Args:
        shortcode: Shortcode of the project

    Returns:
        dictionary with original: identifier from dsp-ingest

    Raises:
        InputError: if no file was found
    """
    filepath = Path(f"mapping-{shortcode}.csv")
    if not filepath.is_file():
        raise InputError(f"No mapping CSV file was found at {filepath}.")
    df = pd.read_csv(filepath)
    msg = f"The file '{filepath}' is used to map the internal SIPI image IDs to the original filepaths."
    print(msg)
    logger.info(msg)
    return dict(zip(df["original"].tolist(), df["derivative"].tolist()))
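
To illustrate the shape of the mapping that `get_mapping_dict_from_file` returns, here is a small sketch that builds the same kind of dictionary from CSV text. It uses the standard-library `csv` module instead of pandas so that it runs without extra dependencies. The column names `original` and `derivative` come from the function above; the file contents and the `read_mapping` helper are made up for illustration.

```python
from __future__ import annotations

import csv
import io

# Hypothetical file contents; the real mapping CSV is produced by dsp-ingest
MAPPING_CSV = """original,derivative
images/cat.jpg,4kZPLhNqRnS-dTpkQ2vJcE3.jpx
audio/talk.mp3,7bWxYsUvTqA-eFgHiJkLmN0.mp3
"""


def read_mapping(csv_text: str) -> dict[str, str]:
    """Map each value of the 'original' column to the 'derivative' value of the same row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["original"]: row["derivative"] for row in reader}


mapping = read_mapping(MAPPING_CSV)
print(mapping["images/cat.jpg"])
```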


def replace_filepath_with_sipi_id(
    xml_tree: etree._ElementTree[etree._Element],
    orig_path_2_id_filename: dict[str, str],
) -> tuple[etree._ElementTree[etree._Element], IngestInformation]:
    """
    Replace the original filepaths in the <bitstream> tags by the id filenames of the uploaded files.

    Args:
        xml_tree: The parsed original XML tree
        orig_path_2_id_filename: Mapping from original filenames to id filenames from the mapping.csv

    Returns:
        A copy of the XML tree, with the replaced filepaths.
        An IngestInformation object recording which uploaded files are unused
        and which referenced files have no id.
    """
    no_id_found = []
    used_media_file_paths = []
    new_tree = deepcopy(xml_tree)
    for elem in new_tree.iter():
        if etree.QName(elem).localname.endswith("bitstream"):
            if (img_path := elem.text) in orig_path_2_id_filename:
                elem.text = orig_path_2_id_filename[img_path]
                used_media_file_paths.append(img_path)
            else:
                no_id_found.append((cast("etree._Element", elem.getparent()).attrib["id"], str(elem.text)))
    unused_media_paths = [x for x in orig_path_2_id_filename if x not in used_media_file_paths]
    return new_tree, IngestInformation(unused_mediafiles=unused_media_paths, mediafiles_no_id=no_id_found)
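
The replacement logic above can be tried out on a tiny document. The sketch below is a simplified variant, under several assumptions: it uses the standard-library `xml.etree.ElementTree` instead of lxml, ignores XML namespaces, modifies the tree in place rather than copying it, and returns two plain lists instead of an `IngestInformation` object. The sample XML, the ids, and the `replace_paths` helper are invented for the example.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

XML = """<knora shortcode="0001">
  <resource id="res_1"><bitstream>images/cat.jpg</bitstream></resource>
  <resource id="res_2"><bitstream>images/dog.jpg</bitstream></resource>
</knora>"""

mapping = {"images/cat.jpg": "id-cat.jpx", "images/unused.jpg": "id-unused.jpx"}


def replace_paths(root: ET.Element, mapping: dict[str, str]) -> tuple[list[str], list[tuple[str, str]]]:
    """Swap bitstream paths for ids in place; return (unused paths, (resource id, path) pairs without id)."""
    used: set[str] = set()
    no_id: list[tuple[str, str]] = []
    for resource in root.iter("resource"):
        for bitstream in resource.findall("bitstream"):
            path = bitstream.text or ""
            if path in mapping:
                bitstream.text = mapping[path]
                used.add(path)
            else:
                # the referenced file has no entry in the mapping, so it was never ingested
                no_id.append((resource.attrib["id"], path))
    unused = [p for p in mapping if p not in used]
    return unused, no_id


root = ET.fromstring(XML)
unused, no_id = replace_paths(root, mapping)
print(unused)
print(no_id)
```

Both result lists must be empty for the upload to proceed; in this sample, one uploaded file is never referenced and one referenced file was never uploaded, so both problems are reported.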
70 changes: 70 additions & 0 deletions src/dsp_tools/commands/ingest_xmlupload/upload_xml.py
@@ -0,0 +1,70 @@
from __future__ import annotations

from pathlib import Path

from lxml import etree

from dsp_tools.commands.ingest_xmlupload.apply_ingest_id import (
    get_mapping_dict_from_file,
    replace_filepath_with_sipi_id,
)
from dsp_tools.commands.xmlupload.upload_config import UploadConfig
from dsp_tools.commands.xmlupload.xmlupload import xmlupload
from dsp_tools.models.exceptions import InputError
from dsp_tools.utils.create_logger import get_logger
from dsp_tools.utils.xml_utils import remove_comments_from_element_tree

logger = get_logger(__name__)


def ingest_xmlupload(
    xml_file: Path,
    user: str,
    password: str,
    dsp_url: str,
    sipi_url: str,
) -> None:
    """
    This function reads an XML file
    and imports the data described in it onto the DSP server,
    using the ingest XML upload method.
    Before using this function,
    the multimedia files must be ingested on the DSP server.
    A mapping file with the internal IDs of the multimedia files must also be provided.

    Args:
        xml_file: path to XML file containing the resources
        user: the user's e-mail for login into DSP
        password: the user's password for login into DSP
        dsp_url: URL to the DSP server
        sipi_url: URL to the Sipi server

    Raises:
        InputError: if any media was not uploaded or uploaded media was not referenced.
    """
    xml_tree_orig = etree.parse(xml_file)
    xml_tree_orig = remove_comments_from_element_tree(xml_tree_orig)

    shortcode = xml_tree_orig.getroot().attrib["shortcode"]
    orig_path_2_id_filename = get_mapping_dict_from_file(shortcode)
    xml_tree_replaced, ingest_info = replace_filepath_with_sipi_id(
        xml_tree=xml_tree_orig,
        orig_path_2_id_filename=orig_path_2_id_filename,
    )
    if ok := ingest_info.ok_msg():
        print(ok)
        logger.info(ok)
    else:
        err_msg = ingest_info.execute_error_protocol()
        logger.error(err_msg)
        raise InputError(err_msg)

    xmlupload(
        input_file=xml_tree_replaced,
        server=dsp_url,
        user=user,
        password=password,
        imgdir=".",
        sipi=sipi_url,
        config=UploadConfig(media_previously_uploaded=True),
    )
120 changes: 120 additions & 0 deletions src/dsp_tools/commands/ingest_xmlupload/user_information.py
@@ -0,0 +1,120 @@
from dataclasses import dataclass, field
from pathlib import Path

import pandas as pd

separator = "\n "
list_separator = "\n - "


@dataclass(frozen=True)
class IngestInformation:
    """
    This class stores the information about the mapping of ids provided by the dsp-ingest service
    and the filepaths used in the XML file.
    """

    unused_mediafiles: list[str]
    mediafiles_no_id: list[tuple[str, str]]
    maximum_prints: int = 20
    csv_directory_path: Path = field(default=Path.cwd())
    unused_mediafiles_csv: str = "UnusedMediaUploadedInSipi.csv"
    mediafiles_no_id_csv: str = "FilesNotUploadedToSipi.csv"

    def ok_msg(self) -> str | None:
        """
        This function checks whether every multimedia file was both uploaded and referenced.

        Returns:
            A success message if no media file is unused or missing, otherwise None.
        """
        if not self.unused_mediafiles and not self.mediafiles_no_id:
            return (
                "All multimedia files referenced in the XML file were uploaded through dsp-ingest.\n"
                "No multimedia files were uploaded through dsp-ingest that were not referenced in the XML file."
            )
        return None

    def execute_error_protocol(self) -> str:
        """
        This function generates the user message.
        If more than `maximum_prints` entries of one kind are affected,
        it also saves them to a CSV file.

        Returns:
            User message
        """
        self._save_csv_if_applicable()
        return self._get_error_msg()

    def _get_error_msg(self) -> str:
        msg_list = [
            "The upload cannot continue as there are problems with the multimedia files referenced in the XML.",
        ]
        if has_msg := self._get_unused_mediafiles_msg():
            msg_list.append(has_msg)
        if has_msg := self._get_mediafiles_no_id_msg():
            msg_list.append(has_msg)
        return separator.join(msg_list)

    def _get_mediafiles_no_id_msg(self) -> str | None:
        if 0 < len(self.mediafiles_no_id) <= self.maximum_prints:
            return (
                "The data XML file contains references to the following multimedia files "
                "which were not previously uploaded through dsp-ingest:"
                + list_separator
                + list_separator.join([f"Resource ID: '{x[0]}' | Filepath: '{x[1]}'" for x in self.mediafiles_no_id])
            )
        elif len(self.mediafiles_no_id) > self.maximum_prints:
            return (
                "The data XML file contains references to multimedia files "
                "which were not previously uploaded through dsp-ingest:\n"
                f" The file with the resource IDs and problematic filenames was saved at "
                f"'{Path(self.csv_directory_path/self.mediafiles_no_id_csv)}'."
            )
        return None

    def _get_unused_mediafiles_msg(self) -> str | None:
        if 0 < len(self.unused_mediafiles) <= self.maximum_prints:
            return (
                "The data XML file does not reference the following multimedia files which were previously "
                "uploaded through dsp-ingest:" + list_separator + list_separator.join(self.unused_mediafiles)
            )
        elif len(self.unused_mediafiles) > self.maximum_prints:
            return (
                "The data XML file does not reference all the multimedia files which were previously "
                "uploaded through dsp-ingest.\n"
                f" The file with the unused filenames was saved at "
                f"'{Path(self.csv_directory_path/self.unused_mediafiles_csv)}'."
            )
        return None

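    def _save_csv_if_applicable(self) -> None:
        # The truth value of a DataFrame is ambiguous (pandas raises ValueError),
        # so the optional DataFrames are compared against None explicitly.
        if (unused_mediafiles_df := self._unused_mediafiles_to_df()) is not None:
            _save_as_csv(unused_mediafiles_df, self.csv_directory_path, self.unused_mediafiles_csv)
        if (no_id_df := self._mediafiles_no_id_to_df()) is not None:
            _save_as_csv(no_id_df, self.csv_directory_path, self.mediafiles_no_id_csv)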

    def _unused_mediafiles_to_df(self) -> pd.DataFrame | None:
        return (
            pd.DataFrame({"Multimedia Filenames": self.unused_mediafiles})
            if len(self.unused_mediafiles) > self.maximum_prints
            else None
        )

    def _mediafiles_no_id_to_df(self) -> pd.DataFrame | None:
        return (
            pd.DataFrame(
                {
                    "Resource ID": [x[0] for x in self.mediafiles_no_id],
                    "Filepath": [x[1] for x in self.mediafiles_no_id],
                }
            )
            if len(self.mediafiles_no_id) > self.maximum_prints
            else None
        )


def _save_as_csv(df: pd.DataFrame, directory_path: Path, filename: str) -> None:
    df.to_csv(Path(directory_path, filename), index=False)
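
`IngestInformation` reports problems in two modes: up to `maximum_prints` entries are listed directly in the message, while longer lists are written to a CSV file and the message only points to that file. The sketch below shows this thresholding idea in isolation. It is a simplification under stated assumptions: it uses the standard-library `csv` module instead of pandas, and the `report` helper, its message wording, and the file name are hypothetical.

```python
from __future__ import annotations

import csv
import tempfile
from pathlib import Path


def report(items: list[str], max_prints: int, csv_path: Path) -> str | None:
    """Return None for no items, an inline list for few items, else write a CSV and point to it."""
    if not items:
        return None
    if len(items) <= max_prints:
        return "Problematic files:" + "".join(f"\n    - {x}" for x in items)
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Multimedia Filenames"])
        writer.writerows([x] for x in items)
    return f"{len(items)} problematic files, saved at '{csv_path}'."


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "problems.csv"
    short_msg = report(["a.jpg", "b.jpg"], max_prints=20, csv_path=out)
    long_msg = report([f"{i}.jpg" for i in range(25)], max_prints=20, csv_path=out)
    # header row plus one row per item
    n_rows = len(out.read_text(encoding="utf-8").splitlines())
print(short_msg)
print(long_msg)
```

Keeping the console output bounded this way avoids flooding the terminal when hundreds of files are affected, while the CSV still preserves the complete list.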
3 changes: 2 additions & 1 deletion src/dsp_tools/commands/xmlupload/resource_create_client.py
@@ -156,7 +156,8 @@ def _make_bitstream_file_value(bitstream_info: BitstreamInfo) -> dict[str, Any]:
         case "mp4":
             prop = "knora-api:hasMovingImageFileValue"
             value_type = "MovingImageFileValue"
-        case "jpg" | "jpeg" | "jp2" | "png" | "tif" | "tiff":
+        # jpx is the extension of the files returned by dsp-ingest
+        case "jpg" | "jpeg" | "jp2" | "png" | "tif" | "tiff" | "jpx":
             prop = "knora-api:hasStillImageFileValue"
             value_type = "StillImageFileValue"
         case "odd" | "rng" | "txt" | "xml" | "xsd" | "xsl" | "xslt" | "csv":
2 changes: 1 addition & 1 deletion src/dsp_tools/commands/xmlupload/upload_config.py
@@ -49,7 +49,7 @@ class DiagnosticsConfig:
 class UploadConfig:
     """Configuration for the upload process."""

-    preprocessing_done: bool = False
+    media_previously_uploaded: bool = False
     server: str = "unknown"
     shortcode: str = "unknown"
     diagnostics: DiagnosticsConfig = field(default_factory=DiagnosticsConfig)
11 changes: 8 additions & 3 deletions src/dsp_tools/commands/xmlupload/xmlupload.py
@@ -74,7 +74,7 @@ def xmlupload(
     default_ontology, root, shortcode = validate_and_parse_xml_file(
         input_file=input_file,
         imgdir=imgdir,
-        preprocessing_done=config.preprocessing_done,
+        preprocessing_done=config.media_previously_uploaded,
     )

     config = config.with_server_info(
@@ -85,7 +85,12 @@ def xmlupload(

     # establish connection to DSP server
     con = login(server=server, user=user, password=password, dump=config.diagnostics.dump)
-    sipi_con = ConnectionLive(sipi, dump=config.diagnostics.dump, token=con.get_token())
+    if config.media_previously_uploaded:
+        sipi_con = ConnectionLive(
+            sipi, dump=config.diagnostics.dump, token=con.get_token(), headers={"X-Asset-Ingested": "true"}
+        )
+    else:
+        sipi_con = ConnectionLive(sipi, dump=config.diagnostics.dump, token=con.get_token())
     sipi_server = Sipi(sipi_con)

     ontology_client = OntologyClientLive(
@@ -338,7 +343,7 @@ def _upload_resources(

     for i, resource in enumerate(resources):
         success, media_info = handle_media_info(
-            resource, config.preprocessing_done, sipi_server, imgdir, permissions_lookup
+            resource, config.media_previously_uploaded, sipi_server, imgdir, permissions_lookup
         )
         if not success:
             failed_uploads.append(resource.res_id)
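
In the connection setup of `xmlupload()` above, the only difference between the two branches is the extra `X-Asset-Ingested: true` header. One possible way to express that without duplicating the constructor call is to build the header dictionary first. The sketch below shows the idea with plain dictionaries. The `build_headers` helper and the `Authorization: Bearer` format are hypothetical; how `ConnectionLive` actually sends the token is not shown in this diff.

```python
from __future__ import annotations


def build_headers(token: str, media_previously_uploaded: bool) -> dict[str, str]:
    """Assemble request headers, adding the ingest marker only when media was uploaded beforehand."""
    headers = {"Authorization": f"Bearer {token}"}  # assumed token format, for illustration only
    if media_previously_uploaded:
        # header name and value taken from the diff above
        headers["X-Asset-Ingested"] = "true"
    return headers


ingest_headers = build_headers("abc123", media_previously_uploaded=True)
plain_headers = build_headers("abc123", media_previously_uploaded=False)
print(ingest_headers)
print(plain_headers)
```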