ENH Adding read_* support for Google Cloud Storage gs:// URLs #19454
It would be great if pandas supported reading directly from Google Cloud Storage URLs (gs://), similar to the way it supports AWS S3 URLs (s3://).
Ideally, one could write:
df_csv = pd.read_csv('gs://my-bucket/super-cool-dataset.csv')
df_excel = pd.read_excel('gs://my-bucket/super-cool-dataset.xlsx')
I took a quick stab at it in 4c9196a. Let me know if this makes sense and I can add some documentation/tests around it and open a pull request.
@vision-sbm we're certainly open to this. Making a PR will be the best move as far as code review goes.
FWIW, I think having an optional dependency on gcsfs and then mirroring https://github.com/pandas-dev/pandas/blob/master/pandas/io/s3.py is the easiest way.
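For illustration, here is a rough sketch of what mirroring the s3 module with gcsfs might look like. It is not the code from 4c9196a, and it is not an existing pandas module. The names `is_gcs_url` and `strip_schema` and the `fs` argument (added only so the sketch can be exercised without credentials) are hypothetical. It assumes `gcsfs.GCSFileSystem().open()` accepts a `bucket/key` path, and it imports gcsfs lazily so the dependency stays optional.

```python
from urllib.parse import urlparse


def is_gcs_url(url):
    """Return True if `url` is a string that uses the gs:// scheme."""
    if not isinstance(url, str):
        return False
    return urlparse(url).scheme == "gs"


def strip_schema(url):
    """Return the URL without the gs:// part, e.g. 'my-bucket/data.csv'."""
    result = urlparse(url)
    return result.netloc + result.path


def get_filepath_or_buffer(filepath_or_buffer, encoding=None,
                           compression=None, mode="rb", fs=None):
    """Open a gs:// URL and return (file object, encoding, compression).

    `fs` is a hypothetical hook for passing in a filesystem object;
    when it is None, gcsfs is imported on demand.
    """
    if fs is None:
        try:
            import gcsfs  # optional dependency, only needed for gs:// URLs
        except ImportError:
            raise ImportError(
                "The gcsfs library is required to handle GCS files")
        fs = gcsfs.GCSFileSystem()
    file_obj = fs.open(strip_schema(filepath_or_buffer), mode)
    return file_obj, None, compression
```

The reader functions would then only need to check `is_gcs_url(path)` and route to this helper, in the same way s3:// URLs are handled today.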