Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Arrow library usage and memory usage cleanup #769

Merged
merged 7 commits into from
Apr 20, 2023

Conversation

zeroshade
Copy link
Contributor

@zeroshade zeroshade commented Apr 10, 2023

Description

Updates to Arrow v11 and leverages the compute functions + significantly cleaning up the usage of the Arrow library to be cleaner and fixing the memory leaks that existed in the incorrect usage of the Arrow library. This includes properly calling Release on record batches and arrays and adding CheckedAllocators to some tests to ensure there are no leaks.

This also unifies the Arrow memory allocator used so that consumers can provide their own Arrow memory.Allocator if they want.

Checklist

  • Code compiles correctly
  • Run make fmt to fix inconsistent formats
  • Run make lint to get lint errors and fix all of them
  • Created tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary

@zeroshade zeroshade requested a review from a team as a code owner April 10, 2023 20:06
@github-actions
Copy link

github-actions bot commented Apr 10, 2023

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

@zeroshade
Copy link
Contributor Author

I have read the CLA Document and I hereby sign the CLA

@zeroshade
Copy link
Contributor Author

While i needed to upgrade to Arrow v11 to do the cleanup and fixes, #764 should get merged first as he went further than I did and also updated other dependencies and removed the replace directives from the go.mod. After that is merged, my change can get merged (or just updated to not do any changes to the vendor dir and only be the changes to the gosnowflake driver itself).

I'm going to second @kenshaw's request to just eliminate the vendoring altogether and just follow the standard Go module handling to make upgrades like this easier in the future.

@yevgenypats
Copy link

@zeroshade nice! I was just looking at what the status of arrow in this driver and saw this PR.

Anything we can help here or we can use this branch?

@zeroshade
Copy link
Contributor Author

@yevgenypats if you have any contacts at snowflake, try to push them to review and pay attention to this PR and the linked #764 PR that's the best help I can get right now.

@sfc-gh-igarish
Copy link
Collaborator

@zeroshade Thanks for the PR. This PR has many files changes mainly for Arrow library, we will take a look before merging it.

@yevgenypats
Copy link

@yevgenypats if you have any contacts at snowflake, try to push them to review and pay attention to this PR and the linked #764 PR that's the best help I can get right now.

Unfortunately don't have any contact.

@sfc-gh-igarish - this just touches the vendor directory as far as I can see so there are not a lot of changes.

@yevgenypats
Copy link

yevgenypats commented Apr 18, 2023

@yevgenypats if you have any contacts at snowflake, try to push them to review and pay attention to this PR and the linked #764 PR that's the best help I can get right now.

@zeroshade Just an FYI on the status of this driver, we are using this driver in production for a while now for our snowflake destination but I had quite a bit of issues due to a lot of limitations on what it can or cannot do. we ended up just using load to snowflake via CSV and using this driver just to run the load from csv/json command (I think this is also the way that most users load things to snowflake from a remote storage with files in csv/parquet/json format and why some of the drivers seems to be in "alpha").

@sfc-gh-igarish
Copy link
Collaborator

I approved #764 first. As I read #764 PR should merge first. Right?

Copy link
Collaborator

@sfc-gh-igarish sfc-gh-igarish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. As mentioned in the PR, merge it after #764 merge.

@sfc-gh-igarish
Copy link
Collaborator

I approved this also.

@sfc-gh-igarish
Copy link
Collaborator

Please resolve the conflicts then I can help to merge this PR. I already merged 764.

@zeroshade
Copy link
Contributor Author

Thanks much @sfc-gh-igarish! I'll resolve the conflicts first thing tomorrow for myself and tag you in a comment when I do.

@zeroshade
Copy link
Contributor Author

@sfc-gh-igarish rebased and conflicts fixed! thanks!

@sfc-gh-igarish
Copy link
Collaborator

Created a linked PR and start running test.

@sfc-gh-igarish sfc-gh-igarish merged commit 5637e86 into snowflakedb:master Apr 20, 2023
23 of 42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants