Issue with loading files using LoadFileOperator with native_support=True #890

Closed
phanikumv opened this issue Sep 22, 2022 · 5 comments · Fixed by #1194
Labels: bug (Something isn't working), priority/critical (Critical priority), product/python-sdk (Label describing products)

Comments

@phanikumv
Collaborator

Describe the bug
Check why `use_native_support=True` isn't loading all the files with the LoadFileOperator.
Relevant thread https://astronomer.slack.com/archives/C02B8SPT93K/p1663783869151809


I'm seeing an issue in my job with the LoadFileOperator
The file was a GCS Parquet file - I physically checked and there is data in it
The table was a Snowflake TempTable - I checked and it's empty
It then gets merged with a MergeOperator, but given that the temp table was empty, nothing was added
No errors in the logs as far as I can tell.
I have use_native_support=True - should I turn it off and see if that helps?
This happened only occasionally - I loaded ~80 files each for 4 different tables (as fast as airflow could load them), and only some loads were affected
When I rerun the tasks, though, it does appear to happen consistently, with the same records missing in at least one case



@kaxil kaxil added bug Something isn't working product/python-sdk Label describing products labels Sep 23, 2022
@kaxil kaxil added this to the 1.1.1 milestone Sep 23, 2022
@utkarsharma2
Collaborator

@fritz-astronomer I'm not able to reproduce this locally. Can you share details such as the GCS files and the Snowflake table, or could we get on a call to try this out?

@utkarsharma2 utkarsharma2 modified the milestones: 1.1.1, 1.2.0 Sep 29, 2022
@fritz-astronomer
Contributor

@utkarsharma2 following up in Slack

@sunank200
Contributor

@utkarsharma2 could you please check the status of this issue?

@utkarsharma2
Collaborator

I need to try this on the environment that @fritz-astronomer has shared with me.

@utkarsharma2 utkarsharma2 reopened this Oct 19, 2022
@sunank200 sunank200 modified the milestones: 1.2.0, 1.2.1 Oct 19, 2022
@phanikumv phanikumv added the priority/critical Critical priority label Oct 20, 2022
utkarsharma2 added a commit that referenced this issue Nov 4, 2022
# Description
## What is the current behavior?
Currently, we are not bubbling up the error raised by the `COPY INTO`
command in the Snowflake native path (GCS -> Snowflake).

closes: #890

## What is the new behavior?
Now we pass a handler that checks the status of each query result returned
and, if any has failed, raises a `DatabaseCustomError`.
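
A minimal sketch of what such a handler might look like, not the SDK's actual code. It assumes each `COPY INTO` result row is available as a dict with a `status` key, plus `file` and `first_error` for the message, and that a failed file is reported with the status `LOAD_FAILED`. The function name `check_copy_results` and the local `DatabaseCustomError` class are hypothetical stand-ins used only for illustration.

```python
from typing import Any, Iterable, List, Mapping


class DatabaseCustomError(Exception):
    """Stand-in for the SDK exception of the same name (assumption)."""


# Assumed status string for a file that failed to load; extend this set
# if partially loaded files should also be treated as failures.
FAILING_STATUSES = {"LOAD_FAILED"}


def check_copy_results(rows: Iterable[Mapping[str, Any]]) -> List[Mapping[str, Any]]:
    """Raise if any result row reports a failing status, else return the rows."""
    rows = list(rows)
    failed = [
        row for row in rows
        if str(row.get("status", "")).upper() in FAILING_STATUSES
    ]
    if failed:
        # Include the file name and first error so the task log shows the cause.
        details = "; ".join(
            f"{row.get('file')}: {row.get('status')} ({row.get('first_error')})"
            for row in failed
        )
        raise DatabaseCustomError(f"COPY INTO reported failures: {details}")
    return rows
```

With a check like this, a load that previously left an empty temp table without any log output would fail the task instead, which matches the symptom described in the report.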

## Does this introduce a breaking change?
Nope

### Checklist
- [ ] Created tests which fail without the change (if possible)
- [ ] Extended the README / documentation, if necessary
@fritz-astronomer
Contributor

Thank you!!!!

5 participants