Adding ignore_existing: true fails the execution when target doesn't exist #301

Closed
dduong1603 opened this issue May 23, 2024 · 1 comment

dduong1603 commented May 23, 2024

Issue Description

  • Description of the issue: When I add ignore_existing: true to the replication, the execution now fails (instead of issuing a warning and continuing) if the target file doesn't exist. If the target exists, sling skips the file, which is as expected.

  • Sling version (sling --version): 1.2.10

  • Operating System (linux, mac, windows): linux

  • Replication Configuration:

source: S3
target: SFTP
streams:
  {s3_prefix}/{file_prefix}_<redacted>_*.csv:
    object: '{folder_path}/{stream_file_name}.csv'
    target_options:
      format: csv
      ignore_existing: true
env:
  s3_prefix: <redacted>
  file_prefix: <redacted>
  folder_path: <redacted>

  • Log Output (please run command with -d):

2024-05-23 17:43:30 INF [1 / 4] running stream s3://<redacted>/<redacted>.csv
2024-05-23 17:43:30 DBG Sling version: 1.2.10 (linux amd64)
2024-05-23 17:43:30 DBG type is file-file
2024-05-23 17:43:30 DBG using source options: {"trim_space":false,"empty_as_null":true,"header":true,"fields_per_rec":-1,"compression":"auto","null_if":"NULL","datetime_format":"AUTO","skip_blank_lines":false,"max_decimals":-1}
2024-05-23 17:43:30 DBG using target options: {"header":true,"compression":"auto","concurrency":7,"datetime_format":"auto","delimiter":",","file_max_rows":0,"file_max_bytes":0,"format":"csv","max_decimals":-1,"use_bulk":true,"ignore_existing":true,"add_new_columns":true,"adjust_column_type":false,"column_casing":"source"}
2024-05-23 17:43:30 INF reading from source file system (s3)
2024-05-23 17:43:30 DBG reading datastream from s3://<redacted>/<redacted>.csv [format=csv]
2024-05-23 17:43:30 DBG merging csv readers of 1 files [concurrency=10] from s3://<redacted>/<redacted>.csv
2024-05-23 17:43:30 DBG processing reader from s3://<redacted>/<redacted>.csv
2024-05-23 17:43:30 DBG delimiter auto-detected: ","
2024-05-23 17:43:30 INF writing to target file system (sftp)
2024-05-23 17:43:30 INF execution failed
2024-05-23 17:43:30 INF ~ error listing path: "/<redacted>/<redacted>.csv"
file does not exist

Previously, in 1.2.9 (or in 1.2.10 if I remove ignore_existing: true), the run would log a warning and continue:

2024-05-23 17:58:42 WRN could not delete path sftp://<redacted>/<redacted>.csv
~ error listing path: "/<redacted>/<redacted>.csv"
file does not exist
2024-05-23 17:58:42 DBG writing to sftp://<redacted>/<redacted>.csv [fileRowLimit=0 fileBytesLimit=0 compression=auto concurrency=7 useBufferedStream=false fileFormat=csv]
2024-05-23 17:58:43 DBG wrote 14 kB: 67 rows [947 r/s]
2024-05-23 17:58:43  wrote 67 rows to sftp://<redacted>/<redacted>.csv in 0 secs [947 r/s]
2024-05-23 17:58:43  execution succeeded

If the target already exists, sling skips the file and marks the execution as succeeded:

2024-05-23 18:14:37 DBG not writing since file/folder exists at sftp://<redacted>/<redacted>.csv (ignore_existing=true)
2024-05-23 18:14:37 DBG wrote 0 B: 0 rows [0 r/s]
2024-05-23 18:14:37  wrote 0 rows to sftp://<redacted>/<redacted>.csv in 0 secs [0 r/s]
2024-05-23 18:14:37  execution succeeded
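
The expected behavior across the three cases above can be sketched roughly as follows. This is only an illustration in Python, not sling's actual implementation; `should_write`, `list_missing`, and `list_present` are hypothetical names, and the sketch assumes a missing target surfaces as a "file does not exist" error from the listing step, as in the failing log.

```python
from typing import Callable, List


def should_write(list_path: Callable[[str], List[str]], path: str,
                 ignore_existing: bool) -> bool:
    """Decide whether a stream should be written to `path` (hypothetical helper)."""
    if not ignore_existing:
        # Without ignore_existing, the target is always (over)written.
        return True
    try:
        existing = list_path(path)
    except FileNotFoundError:
        # A missing target means there is nothing to skip, so write it
        # rather than failing the execution.
        return True
    # Skip the write only when something is already at the target.
    return len(existing) == 0


def list_missing(path: str) -> List[str]:
    """Stand-in for a listing call on a target that does not exist."""
    raise FileNotFoundError(path)


def list_present(path: str) -> List[str]:
    """Stand-in for a listing call on a target that already exists."""
    return [path]
```

With these stand-ins, `should_write(list_missing, "/out/a.csv", True)` returns `True` (write the new file), while `should_write(list_present, "/out/a.csv", True)` returns `False` (skip, execution still succeeds).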

flarco commented May 23, 2024

Should be fixed in #303 with 4131cb5
Feel free to compile binary and test.
Closing for now.

flarco closed this as completed May 23, 2024