Fix passing credentials to gcsfs #66

Merged: 13 commits, May 19, 2023

@j-bennet j-bennet commented May 16, 2023

According to the docs here:

https://gcsfs.readthedocs.io/en/latest/index.html#credentials

gcsfs does not accept a bare token string; it needs a Credentials instance or a dict. For dask, a dict works better because it can be pickled.

We also pass the same credentials to to_parquet.
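
A minimal sketch of the idea, not the code from this PR: a plain credentials dict survives a pickle round trip, which is what dask needs when it ships arguments to workers. The field values are placeholders, and `storage_options_for` is a hypothetical helper introduced only for illustration.

```python
import pickle

# Placeholder values in the shape of an authorized-user credentials file.
# A real dict would be loaded from the user's own credentials.
creds = {
    "type": "authorized_user",
    "client_id": "example-client-id",
    "client_secret": "example-secret",
    "refresh_token": "example-refresh-token",
}


def storage_options_for(creds_dict):
    """Hypothetical helper: wrap the dict as fsspec-style storage options."""
    return {"token": creds_dict}


opts = storage_options_for(creds)

# Dask serializes task arguments before sending them to workers,
# so the options must survive a pickle round trip unchanged.
roundtripped = pickle.loads(pickle.dumps(opts))
```

Under these assumptions, the same `opts` could be handed both to `gcsfs.GCSFileSystem(**opts)` and to `to_parquet(..., storage_options=opts)`.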

@j-bennet j-bennet marked this pull request as draft May 16, 2023 01:08
@j-bennet
Contributor Author

@bnaul What do you think is going on here?

https://github.com/coiled/dask-bigquery/actions/runs/4997384122/jobs/8951694334?pr=66#step:6:105

E           	debug_error_string = "UNKNOWN:Error received from peer ipv4:172.217.10.106:443 ***grpc_message:"request failed: Cannot query over table \'***.bddf1d76129e47f8988633536743ba96.partitioned_table_test\' without a filter over column(s) \'timestamp\' that can be used for partition elimination", grpc_status:3, created_time:"2023-05-16T22:58:45.81916+00:00"***"

it's happening on a line that is supposed to raise:

        with pytest.raises(InvalidArgument):
>           read_gbq(
                project_id=project_id,
                dataset_id=dataset_id,
                table_id=table_id,
            ).head()

but apparently it's raising something different.
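
One way to find out what is actually raised is to catch broadly and print the exception's class hierarchy, since `pytest.raises(InvalidArgument)` only passes when the raised type is `InvalidArgument` or a subclass of it. The sketch below uses hypothetical stand-in classes; the real exception type raised by `read_gbq` is not identified in this thread.

```python
# Hypothetical stand-ins for the real exception types.
class InvalidArgument(Exception):
    """Stand-in for the exception the test expects."""


class SomethingElse(Exception):
    """Stand-in for whatever is actually raised."""


def failing_call():
    # Simulates the call under test raising an unexpected type.
    raise SomethingElse("Cannot query over table without a filter")


def describe_exception(func):
    """Call func and return the class names in the raised exception's MRO."""
    try:
        func()
    except Exception as exc:
        return [cls.__name__ for cls in type(exc).__mro__]
    return []


mro_names = describe_exception(failing_call)

# If the expected class is missing from the hierarchy, a raises-style
# check for it cannot match and the exception propagates.
is_expected = "InvalidArgument" in mro_names
```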

@j-bennet j-bennet marked this pull request as ready for review May 16, 2023 23:17
@bnaul
Contributor

bnaul commented May 17, 2023 via email

@j-bennet j-bennet removed the request for review from bnaul May 17, 2023 21:40
@martindurant

I hadn't realised that the gcsfs discussion was the same as the one here. As in the linked issue, it would not be too difficult to have gcsfs accept a raw token, either by working around the current system or by making a small PR to gcsfs. It would, of course, not be able to refresh, and would eventually return permission errors once the token has expired.

@j-bennet
Contributor Author

@martindurant I'm going to try making a change in gcsfs. Might need your help with that one.

Member

@jrbourbeau jrbourbeau left a comment

Thanks @j-bennet

@jrbourbeau jrbourbeau merged commit b9fee8a into main May 19, 2023
@j-bennet j-bennet deleted the j-bennet/gcsfs-fix-credentials branch May 19, 2023 19:31

4 participants