BUG: GeoJSON file from URL not recognized as a supported file format #3284

Open
mattijn opened this issue May 13, 2024 · 17 comments · May be fixed by #3311

Comments

@mattijn

mattijn commented May 13, 2024

We start to receive errors within the CI of Vega-Altair related to something we are not sure about. Xref vega/altair#3418

The full traceback we see is the following:

_ ERROR collecting tests/examples_arguments_syntax/interval_selection_map_quakes.py _
fiona/ogrext.pyx:136: in fiona.ogrext.gdal_open_vector
    ???
fiona/_err.pyx:291: in fiona._err.exc_wrap_pointer
    ???
E   fiona._err.CPLE_OpenFailedError: '/vsicurl/https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json' not recognized as a supported file format.

During handling of the above exception, another exception occurred:
tests/examples_arguments_syntax/interval_selection_map_quakes.py:14: in <module>
    gdf_quakies = gpd.read_file(data.earthquakes.url, driver="GeoJSON")
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/geopandas/io/file.py:289: in _read_file
    return _read_file_fiona(
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/geopandas/io/file.py:315: in _read_file_fiona
    with reader(path_or_bytes, **kwargs) as features:
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/fiona/env.py:457: in wrapper
    return f(*args, **kwds)
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/fiona/__init__.py:292: in open
    colxn = Collection(
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/fiona/collection.py:243: in __init__
    self.session.start(self, **kwargs)
fiona/ogrext.pyx:588: in fiona.ogrext.Session.start
    ???
fiona/ogrext.pyx:143: in fiona.ogrext.gdal_open_vector
    ???
E   fiona.errors.DriverError: '/vsicurl/https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json' not recognized as a supported file format.

It basically errors on these lines:

import geopandas as gpd
gpd.read_file('https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json', driver='GeoJSON')

But the issue is that we can't yet reproduce it outside the CI, on either Windows or macOS. According to the CI (ubuntu-latest), it should fail with this combination of related packages:

fiona=1.9.6 geopandas=0.14.4 numpy=1.26.4 pandas=2.2.2 pyproj=3.6.1

We have seen the same error on Python 3.10, 3.11 and 3.12 so far.

Is this a glitch that automagically will disappear in a few days, or is this something that is reproducible by others?

@martinfleis
Member

Could be related to the latest GDAL release from a few days back.

@jorisvandenbossche
Member

The CI is using pip and installing wheels, so it should not be related to the GDAL version (it's installing fiona's latest wheel, which is from March)

@jorisvandenbossche
Member

Comparing the latest working run (https://github.com/vega/altair/actions/runs/8853323212/job/24313906217) vs the failing run (https://github.com/vega/altair/actions/runs/9005770219/job/24853034437) on the main branch, and comparing the versions of packages installed, the most relevant one is geopandas 0.14.3 -> 0.14.4. So we should check if this is not another regression related to the path changes

@m-richards
Member

m-richards commented May 14, 2024

At a quick glance it looks to me like something goes wrong in the block below, which is inside our _read_file. If I run this locally, I get to from_bytes = True:

    if _is_url(filename):
        # if it is a url that supports random access -> pass through to
        # pyogrio/fiona as is (to support downloading only part of the file)
        # otherwise still download manually because pyogrio/fiona don't support
        # all types of urls (https://github.com/geopandas/geopandas/issues/2908)
        with urllib.request.urlopen(filename) as response:
            if not response.headers.get("Accept-Ranges") == "bytes":
                filename = response.read()
                from_bytes = True

and correspondingly in _read_file_fiona we have reader=fiona.BytesCollection. But the stack trace above comes from fiona.open, which suggests the above code path doesn't happen.

Not entirely sure how that behaviour could change given that urllib is from the standard library.

If I change the condition in the snippet to if response.headers.get("Accept-Ranges") == "bytes": (dropping the not), then I can replicate the stack trace from CI. I am not familiar enough with HTTP to know whether that header value might change in a flaky way, though.
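The decision described above can be isolated into a small offline sketch. The names should_download_first and make_headers are hypothetical and not part of geopandas; the header objects are built by hand rather than fetched, so the example only illustrates how the presence or absence of Accept-Ranges flips the code path.

```python
from email.message import Message


def should_download_first(headers) -> bool:
    """Return True when the response does not advertise byte-range support.

    Mirrors the `not headers.get("Accept-Ranges") == "bytes"` check quoted
    above; `headers` is anything with a case-insensitive `.get()`.
    """
    return headers.get("Accept-Ranges") != "bytes"


def make_headers(pairs):
    # Hypothetical helper: build a case-insensitive header object for the demo.
    # urllib responses expose a subclass of this same Message type.
    msg = Message()
    for key, value in pairs:
        msg[key] = value
    return msg


with_ranges = make_headers(
    [("accept-ranges", "bytes"), ("content-type", "application/json")]
)
without_ranges = make_headers(
    [("server", "cloudflare"), ("content-type", "application/json")]
)

# False: the URL would be handed to fiona/GDAL as-is.
print(should_download_first(with_ranges))
# True: the bytes would be downloaded first and read from memory.
print(should_download_first(without_ranges))
```

If the server answers differently from one request to the next, the same URL can end up on either branch.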

As an aside, gpd.read_file('GeoJSON:https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json', engine='pyogrio') seems to work regardless of whether I make that change (with pyogrio installed). Might be something for Vega-Altair to try upstream in the meantime while we try to figure this out?

@jorisvandenbossche
Member

jorisvandenbossche commented May 14, 2024

@mattijn could you maybe restart one of the failing builds to check if it is still failing, in case this was a temporary glitch with the CDN?

I thought that I could actually reproduce the issue this morning with latest main, but I can't anymore right now (and I no longer have the output of the console session to verify). I can reproduce the error by explicitly trying to read with /vsicurl/https://..., which confirms @m-richards' analysis that this error should in theory only happen if geopandas for some reason decides not to download the bytes from the URL.

As an aside, gpd.read_file('GeoJSON:https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json', engine='pyogrio') seem to work regardless of whether I make that change (with pyogrio installed). Might be something for Vega altair to try upstream in the mean time while we try and figure this out?

Something else I noticed (although this is an issue for pyogrio), when leaving out the GeoJSON: part, and passing the url directly to pyogrio's read_dataframe (so not letting geopandas download the bytes from the url), this just hangs for me:

import pyogrio
pyogrio.read_dataframe("https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json")

Although that might also be an issue on the GDAL side (or with the server hosting the file), because AFAIK this just passes that URL (prepended with /vsicurl/) to GDAL.

@mattijn
Author

mattijn commented May 15, 2024

I did a re-run, but that still gives the same error (link to run).
Temporarily pinning geopandas to 0.14.3 did make all tests pass again, so did that for now in vega/altair#3421.

Comparing the changes between 0.14.3 and 0.14.4, I see some potentially related changes that might have been introduced by #3232?

@jorisvandenbossche
Member

jorisvandenbossche commented May 15, 2024

By comparing the changes between 0.14.3 and 0.14.4, I see some potential related changes that might be introduced by #3232?

Yes, but those changes sit behind an if not from_bytes check, and when testing this locally, we never get there because we download the file from the URL and continue with bytes (with from_bytes=True):

from_bytes = False
if _is_url(filename):
    # if it is a url that supports random access -> pass through to
    # pyogrio/fiona as is (to support downloading only part of the file)
    # otherwise still download manually because pyogrio/fiona don't support
    # all types of urls (https://github.com/geopandas/geopandas/issues/2908)
    with urllib.request.urlopen(filename) as response:
        if not response.headers.get("Accept-Ranges") == "bytes":
            filename = response.read()
            from_bytes = True

And so locally I get:

>>> response = urllib.request.urlopen("https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json")
>>> print(response.headers.get("accept-ranges"))
None

resulting in taking that path to read the response.

@mattijn
Author

mattijn commented May 16, 2024

I realized I have access to a Linux machine and had the chance to run some tests.
On Linux the following code:

import urllib.request

url = 'https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json'

# Open the URL and inspect the response headers
with urllib.request.urlopen(url) as response:
    print(response.headers.get("Accept-Ranges"))
    header = response.getheaders()
    # lower/higher case doesn't matter for the `headers.get()`, but it does for sorting.
    sorted_header = sorted([(key.lower(), value) for key, value in header])
    print(sorted_header)
    print(len(sorted_header))

consistently returns:

bytes
[('accept-ranges', 'bytes'), ('access-control-allow-origin', '*'), ('access-control-expose-headers', '*'), ('age', '223460'), ('alt-svc', 'h3=":443";ma=86400,h3-29=":443";ma=86400,h3-27=":443";ma=86400'), ('cache-control', 'public, max-age=31536000, s-maxage=31536000, immutable'), ('connection', 'close'), ('content-length', '1219853'), ('content-type', 'application/json; charset=utf-8'), ('cross-origin-resource-policy', 'cross-origin'), ('date', 'Thu, 16 May 2024 21:14:41 GMT'), ('etag', 'W/"129d0d-nk6KiNV9fUTf5O95Ns8JhUD6yxk"'), ('strict-transport-security', 'max-age=31536000; includeSubDomains; preload'), ('timing-allow-origin', '*'), ('vary', 'Accept-Encoding'), ('x-cache', 'HIT, HIT'), ('x-content-type-options', 'nosniff'), ('x-jsd-version', '1.29.0'), ('x-jsd-version-type', 'version'), ('x-served-by', 'cache-fra-eddf8230110-FRA, cache-ams21061-AMS')]
20

On Windows, however, the result differs. Sometimes it returns the same as above and sometimes it returns the following:

None
[('access-control-allow-origin', '*'), ('access-control-expose-headers', '*'), ('age', '268142'), ('alt-svc', 'h3=":443"; ma=86400'), ('cache-control', 'public, max-age=31536000, s-maxage=31536000, immutable'), ('cf-cache-status', 'HIT'), ('cf-ray', '884e612bb9c866ab-AMS'), ('connection', 'close'), ('content-type', 'application/json; charset=utf-8'), ('cross-origin-resource-policy', 'cross-origin'), ('date', 'Thu, 16 May 2024 21:14:49 GMT'), ('etag', 'W/"129d0d-nk6KiNV9fUTf5O95Ns8JhUD6yxk"'), ('nel', '{"success_fraction":0.01,"report_to":"cf-nel","max_age":604800}'), ('report-to', '{"endpoints":[{"url":"https:\\/\\/a.nel.cloudflare.com\\/report\\/v4?s=ftXz3PUHqi1D5sRZvNku8s9TgOPc7Uc2BjYMWoyymLptxWZNKCfovtFLWLx3TEjTSzVnYjoqHgnXNJm2Zvb67vbgWyu5BPDWj5oPXh1r29ZJs88D8auMLijhgNn52qMnYI8%3D"}],"group":"cf-nel","max_age":604800}'), ('server', 'cloudflare'), ('strict-transport-security', 'max-age=31536000; includeSubDomains; preload'), ('timing-allow-origin', '*'), ('transfer-encoding', 'chunked'), ('vary', 'Accept-Encoding'), ('x-cache', 'MISS, HIT'), ('x-content-type-options', 'nosniff'), ('x-jsd-version', '1.29.0'), ('x-jsd-version-type', 'version'), ('x-served-by', 'cache-fra-eddf8230110-FRA, cache-lga21924-LGA')]
24

By "sometimes" I mean that the result changes after an interval of a few minutes between calls. It is not a fixed interval, and I cannot force it with a User-Agent request header.

Inspecting the differences between the headers, it seems that the request on Windows is sometimes served by a Cloudflare endpoint. Maybe depending on server load? When it is served by the Cloudflare endpoint, the headers do not contain an accept-ranges key.
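A small sketch of that comparison, using abbreviated copies of the two header lists shown above (values shortened, several headers common to both omitted). The function name header_diff is made up for illustration.

```python
def header_diff(first, second):
    """Return the header names that appear in only one of two (name, value) lists."""
    first_keys = {key.lower() for key, _ in first}
    second_keys = {key.lower() for key, _ in second}
    return sorted(first_keys - second_keys), sorted(second_keys - first_keys)


# Abbreviated from the first response above (the one that returned 'bytes').
first_response = [
    ("accept-ranges", "bytes"),
    ("content-length", "1219853"),
    ("content-type", "application/json; charset=utf-8"),
    ("x-cache", "HIT, HIT"),
]
# Abbreviated from the second response above (the one that returned None).
second_response = [
    ("cf-cache-status", "HIT"),
    ("cf-ray", "884e612bb9c866ab-AMS"),
    ("content-type", "application/json; charset=utf-8"),
    ("server", "cloudflare"),
    ("transfer-encoding", "chunked"),
    ("x-cache", "MISS, HIT"),
]

only_first, only_second = header_diff(first_response, second_response)
print(only_first)   # ['accept-ranges', 'content-length']
print(only_second)  # ['cf-cache-status', 'cf-ray', 'server', 'transfer-encoding']
```

Besides accept-ranges, the second response also lacks content-length and is sent with chunked transfer encoding.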

While typing this, I think this might be a side-effect introduced by resolving jsdelivr/jsdelivr#18565 (comment) recently?

But still, if it returns 'bytes', then reading with fiona shouldn't fail, should it?
If I do the following on my linux machine:

import fiona
with fiona.open("/vsicurl/https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json") as features:
    columns = list(features.schema["properties"])
    print(len(columns))

Then it returns output (27) and it doesn't error (fiona version 1.9.5).

If I do the same on Windows (while the header doesn't return an accept-ranges key), then I get the same error as in the OP.

DriverError: '/vsicurl/https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json' not recognized as a supported file format.

One step closer or further... Hope this info helps somebody else finding the culprit.

@jorisvandenbossche
Member

So on my Linux machine (Ubuntu 22.04), I consistently get None with your code above (I tried it with both the system Python and a conda env).

But checking the header items, I am also seeing ('server', 'cloudflare') in there.

But again, still, if it the return 'bytes' then it still shouldn't fail on reading by fiona, isn't it.

Can you reproduce it locally that it fails when reading through geopandas.read_file? (like on CI, instead of your pure fiona snippet above)

Indeed, if it does return "bytes", geopandas will not download it, because we then assume fiona/pyogrio (GDAL/OGR, in fact) can read it.
One possible reason I can think of: by the time GDAL tries to read from it again (although this should only be a few ms later), it sees the Cloudflare endpoint, and then fails?

@jorisvandenbossche jorisvandenbossche changed the title BUG: not recognized as a supported file format BUG: GeoJSON file from URL not recognized as a supported file format May 17, 2024
@jorisvandenbossche
Member

A way to directly test with GDAL:

$ CPL_CURL_VERBOSE=YES ogrinfo -ro -al -so /vsicurl/https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json

This fails for me: the verbose logging shows that it again uses the Cloudflare server, and it basically "hangs" for 6 minutes.
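To put a number on the hang, the two sample lines below are copied from the timestamped debug output in this comment: the directory-listing GET being sent, and the first line logged after the stall. The sketch assumes the second column is elapsed seconds since the start of the run, which is how the values in the log read.

```python
import re

sent = (
    "[Fri May 17 09:35:17 2024].4967, 0.1293: "
    "CURL_INFO_TEXT: Request completely sent off"
)
failed = (
    "[Fri May 17 09:41:57 2024].6567, 400.2893: "
    "CURL_INFO_TEXT: OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 0"
)

# "[<wall clock>].<fraction>, <elapsed seconds>: <message>"
ELAPSED = re.compile(r"^\[[^\]]+\]\.\d+, (\d+\.\d+): ")


def elapsed_seconds(line):
    """Extract the elapsed-seconds column from one timestamped debug line."""
    match = ELAPSED.match(line)
    if match is None:
        raise ValueError(f"not a timestamped debug line: {line!r}")
    return float(match.group(1))


gap = elapsed_seconds(failed) - elapsed_seconds(sent)
print(f"{gap:.2f} s (~{gap / 60:.1f} min)")  # 400.16 s (~6.7 min)
```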

The output I see with some extra debugging output enabled:

$ CPL_CURL_VERBOSE=YES CPL_DEBUG=ON CPL_TIMESTAMP=ON ogrinfo -ro -al -so /vsicurl/https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json
[Fri May 17 09:35:17 2024].3674, 0.0000: HTTP: libcurl/8.7.1 OpenSSL/3.3.0 zlib/1.2.13 zstd/1.5.6 libssh2/1.11.0 nghttp2/1.58.0
[Fri May 17 09:35:17 2024].3675, 0.0001: CURL_INFO_TEXT: Couldn't find host cdn.jsdelivr.net in the .netrc file; using defaults
[Fri May 17 09:35:17 2024].3685, 0.0011: CURL_INFO_TEXT: Host cdn.jsdelivr.net:443 was resolved.
[Fri May 17 09:35:17 2024].3685, 0.0011: CURL_INFO_TEXT: IPv6: 2606:4700::6812:ba1f, 2606:4700::6812:bb1f
[Fri May 17 09:35:17 2024].3685, 0.0011: CURL_INFO_TEXT: IPv4: 104.18.186.31, 104.18.187.31
[Fri May 17 09:35:17 2024].3685, 0.0011: CURL_INFO_TEXT:   Trying [2606:4700::6812:ba1f]:443...
[Fri May 17 09:35:17 2024].3906, 0.0232: CURL_INFO_TEXT: Connected to cdn.jsdelivr.net (2606:4700::6812:ba1f) port 443
[Fri May 17 09:35:17 2024].3934, 0.0260: CURL_INFO_TEXT: ALPN: curl offers h2,http/1.1
[Fri May 17 09:35:17 2024].3936, 0.0263: CURL_INFO_TEXT: TLSv1.3 (OUT), TLS handshake, Client hello (1):
[Fri May 17 09:35:17 2024].4022, 0.0348: CURL_INFO_TEXT:  CAfile: /home/joris/conda/envs/test-read-url/ssl/cacert.pem
[Fri May 17 09:35:17 2024].4022, 0.0348: CURL_INFO_TEXT:  CApath: none
[Fri May 17 09:35:17 2024].4208, 0.0535: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, Server hello (2):
[Fri May 17 09:35:17 2024].4215, 0.0541: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
[Fri May 17 09:35:17 2024].4216, 0.0542: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, Certificate (11):
[Fri May 17 09:35:17 2024].4227, 0.0554: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, CERT verify (15):
[Fri May 17 09:35:17 2024].4229, 0.0555: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, Finished (20):
[Fri May 17 09:35:17 2024].4230, 0.0556: CURL_INFO_TEXT: TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
[Fri May 17 09:35:17 2024].4230, 0.0556: CURL_INFO_TEXT: TLSv1.3 (OUT), TLS handshake, Finished (20):
[Fri May 17 09:35:17 2024].4231, 0.0558: CURL_INFO_TEXT: SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384 / x25519 / RSASSA-PSS
[Fri May 17 09:35:17 2024].4232, 0.0558: CURL_INFO_TEXT: ALPN: server accepted h2
[Fri May 17 09:35:17 2024].4232, 0.0558: CURL_INFO_TEXT: Server certificate:
[Fri May 17 09:35:17 2024].4232, 0.0558: CURL_INFO_TEXT:  subject: CN=*.jsdelivr.net
[Fri May 17 09:35:17 2024].4232, 0.0558: CURL_INFO_TEXT:  start date: May  4 00:00:00 2024 GMT
[Fri May 17 09:35:17 2024].4232, 0.0558: CURL_INFO_TEXT:  expire date: May  4 23:59:59 2025 GMT
[Fri May 17 09:35:17 2024].4232, 0.0558: CURL_INFO_TEXT:  subjectAltName: host "cdn.jsdelivr.net" matched cert's "*.jsdelivr.net"
[Fri May 17 09:35:17 2024].4232, 0.0559: CURL_INFO_TEXT:  issuer: C=GB; ST=Greater Manchester; L=Salford; O=Sectigo Limited; CN=Sectigo RSA Domain Validation Secure Server CA
[Fri May 17 09:35:17 2024].4233, 0.0559: CURL_INFO_TEXT:  SSL certificate verify ok.
[Fri May 17 09:35:17 2024].4233, 0.0559: CURL_INFO_TEXT:   Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
[Fri May 17 09:35:17 2024].4233, 0.0559: CURL_INFO_TEXT:   Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha384WithRSAEncryption
[Fri May 17 09:35:17 2024].4233, 0.0559: CURL_INFO_TEXT:   Certificate level 2: Public key type RSA (4096/152 Bits/secBits), signed using sha384WithRSAEncryption
[Fri May 17 09:35:17 2024].4234, 0.0560: CURL_INFO_TEXT: using HTTP/2
[Fri May 17 09:35:17 2024].4235, 0.0561: CURL_INFO_TEXT: [HTTP/2] [1] OPENED stream for https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json
[Fri May 17 09:35:17 2024].4235, 0.0561: CURL_INFO_TEXT: [HTTP/2] [1] [:method: HEAD]
[Fri May 17 09:35:17 2024].4235, 0.0561: CURL_INFO_TEXT: [HTTP/2] [1] [:scheme: https]
[Fri May 17 09:35:17 2024].4235, 0.0561: CURL_INFO_TEXT: [HTTP/2] [1] [:authority: cdn.jsdelivr.net]
[Fri May 17 09:35:17 2024].4235, 0.0561: CURL_INFO_TEXT: [HTTP/2] [1] [:path: /npm/vega-datasets@v1.29.0/data/earthquakes.json]
[Fri May 17 09:35:17 2024].4235, 0.0561: CURL_INFO_TEXT: [HTTP/2] [1] [user-agent: GDAL/3.8.5]
[Fri May 17 09:35:17 2024].4235, 0.0561: CURL_INFO_TEXT: [HTTP/2] [1] [accept: */*]
[Fri May 17 09:35:17 2024].4235, 0.0562: CURL_INFO_HEADER_OUT: HEAD /npm/vega-datasets@v1.29.0/data/earthquakes.json HTTP/2
Host: cdn.jsdelivr.net
User-Agent: GDAL/3.8.5
Accept: */*

[Fri May 17 09:35:17 2024].4236, 0.0562: CURL_INFO_TEXT: Request completely sent off
[Fri May 17 09:35:17 2024].4416, 0.0742: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
[Fri May 17 09:35:17 2024].4417, 0.0743: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
[Fri May 17 09:35:17 2024].4417, 0.0744: CURL_INFO_TEXT: old SSL session ID is stale, removing
[Fri May 17 09:35:17 2024].4555, 0.0881: CURL_INFO_HEADER_IN: HTTP/2 200 
[Fri May 17 09:35:17 2024].4556, 0.0882: CURL_INFO_HEADER_IN: date: Fri, 17 May 2024 07:35:17 GMT
[Fri May 17 09:35:17 2024].4557, 0.0883: CURL_INFO_HEADER_IN: content-type: application/json; charset=utf-8
[Fri May 17 09:35:17 2024].4557, 0.0883: CURL_INFO_HEADER_IN: access-control-allow-origin: *
[Fri May 17 09:35:17 2024].4557, 0.0884: CURL_INFO_HEADER_IN: access-control-expose-headers: *
[Fri May 17 09:35:17 2024].4558, 0.0884: CURL_INFO_HEADER_IN: timing-allow-origin: *
[Fri May 17 09:35:17 2024].4558, 0.0884: CURL_INFO_HEADER_IN: cache-control: public, max-age=31536000, s-maxage=31536000, immutable
[Fri May 17 09:35:17 2024].4558, 0.0884: CURL_INFO_HEADER_IN: cross-origin-resource-policy: cross-origin
[Fri May 17 09:35:17 2024].4558, 0.0884: CURL_INFO_HEADER_IN: x-content-type-options: nosniff
[Fri May 17 09:35:17 2024].4558, 0.0884: CURL_INFO_HEADER_IN: strict-transport-security: max-age=31536000; includeSubDomains; preload
[Fri May 17 09:35:17 2024].4558, 0.0885: CURL_INFO_HEADER_IN: x-jsd-version: 1.29.0
[Fri May 17 09:35:17 2024].4558, 0.0885: CURL_INFO_HEADER_IN: x-jsd-version-type: version
[Fri May 17 09:35:17 2024].4559, 0.0885: CURL_INFO_HEADER_IN: etag: W/"129d0d-nk6KiNV9fUTf5O95Ns8JhUD6yxk"
[Fri May 17 09:35:17 2024].4559, 0.0885: CURL_INFO_HEADER_IN: x-served-by: cache-fra-eddf8230110-FRA, cache-lga21924-LGA
[Fri May 17 09:35:17 2024].4559, 0.0885: CURL_INFO_HEADER_IN: x-cache: MISS, HIT
[Fri May 17 09:35:17 2024].4559, 0.0885: CURL_INFO_HEADER_IN: vary: Accept-Encoding
[Fri May 17 09:35:17 2024].4559, 0.0886: CURL_INFO_HEADER_IN: alt-svc: h3=":443"; ma=86400
[Fri May 17 09:35:17 2024].4560, 0.0886: CURL_INFO_HEADER_IN: cf-cache-status: HIT
[Fri May 17 09:35:17 2024].4560, 0.0886: CURL_INFO_HEADER_IN: age: 261334
[Fri May 17 09:35:17 2024].4560, 0.0886: CURL_INFO_HEADER_IN: report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=FB06QdK3XmzM8SogksyJCkQsxIhdf0vHNK6Yq%2BsMOSjssORWVFMECrQSx1zT%2BLRPzE9WAHCiUTH2DnqQfrZIL8QhcpaWcxwffSSAqIQWuxUpHeLkivAgmdXJTBiItdbsG%2FM44H37jATMhKDG2Gg%3D"}],"group":"cf-nel","max_age":604800}
[Fri May 17 09:35:17 2024].4560, 0.0886: CURL_INFO_HEADER_IN: nel: {"success_fraction":0.01,"report_to":"cf-nel","max_age":604800}
[Fri May 17 09:35:17 2024].4560, 0.0886: CURL_INFO_HEADER_IN: server: cloudflare
[Fri May 17 09:35:17 2024].4560, 0.0887: CURL_INFO_HEADER_IN: cf-ray: 8851ee0dfa197500-BRU
[Fri May 17 09:35:17 2024].4560, 0.0887: CURL_INFO_HEADER_IN: 
[Fri May 17 09:35:17 2024].4561, 0.0887: CURL_INFO_TEXT: Connection #0 to host cdn.jsdelivr.net left intact
[Fri May 17 09:35:17 2024].4561, 0.0887: VSICURL: HEAD did not provide file size. Retrying with GET
[Fri May 17 09:35:17 2024].4563, 0.0889: CURL_INFO_TEXT: Couldn't find host cdn.jsdelivr.net in the .netrc file; using defaults
[Fri May 17 09:35:17 2024].4563, 0.0889: CURL_INFO_TEXT: Found bundle for host: 0x5635b594f130 [can multiplex]
[Fri May 17 09:35:17 2024].4563, 0.0889: CURL_INFO_TEXT: Re-using existing connection with host cdn.jsdelivr.net
[Fri May 17 09:35:17 2024].4564, 0.0890: CURL_INFO_TEXT: [HTTP/2] [3] OPENED stream for https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json
[Fri May 17 09:35:17 2024].4564, 0.0890: CURL_INFO_TEXT: [HTTP/2] [3] [:method: GET]
[Fri May 17 09:35:17 2024].4564, 0.0890: CURL_INFO_TEXT: [HTTP/2] [3] [:scheme: https]
[Fri May 17 09:35:17 2024].4564, 0.0890: CURL_INFO_TEXT: [HTTP/2] [3] [:authority: cdn.jsdelivr.net]
[Fri May 17 09:35:17 2024].4564, 0.0890: CURL_INFO_TEXT: [HTTP/2] [3] [:path: /npm/vega-datasets@v1.29.0/data/earthquakes.json]
[Fri May 17 09:35:17 2024].4564, 0.0890: CURL_INFO_TEXT: [HTTP/2] [3] [user-agent: GDAL/3.8.5]
[Fri May 17 09:35:17 2024].4564, 0.0890: CURL_INFO_TEXT: [HTTP/2] [3] [accept: */*]
[Fri May 17 09:35:17 2024].4565, 0.0892: CURL_INFO_HEADER_OUT: GET /npm/vega-datasets@v1.29.0/data/earthquakes.json HTTP/2
Host: cdn.jsdelivr.net
User-Agent: GDAL/3.8.5
Accept: */*

[Fri May 17 09:35:17 2024].4566, 0.0892: CURL_INFO_TEXT: Request completely sent off
[Fri May 17 09:35:17 2024].4868, 0.1195: CURL_INFO_HEADER_IN: HTTP/2 200 
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: date: Fri, 17 May 2024 07:35:17 GMT
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: content-type: application/json; charset=utf-8
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: access-control-allow-origin: *
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: access-control-expose-headers: *
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: timing-allow-origin: *
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: cache-control: public, max-age=31536000, s-maxage=31536000, immutable
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: cross-origin-resource-policy: cross-origin
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: x-content-type-options: nosniff
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: strict-transport-security: max-age=31536000; includeSubDomains; preload
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: x-jsd-version: 1.29.0
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: x-jsd-version-type: version
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: etag: W/"129d0d-nk6KiNV9fUTf5O95Ns8JhUD6yxk"
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: x-served-by: cache-fra-eddf8230110-FRA, cache-lga21924-LGA
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: x-cache: MISS, HIT
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: vary: Accept-Encoding
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: alt-svc: h3=":443"; ma=86400
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: cf-cache-status: HIT
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: age: 261334
[Fri May 17 09:35:17 2024].4869, 0.1195: CURL_INFO_HEADER_IN: report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=qD1PukHgr7gs8llbj25dY2%2Bm5j6LJcLJueYOwqPd03Q7RIh0cIZRjnX5xxSghU91tuu0rvDQr64jm7ooy3DrFimWFtCNWVQdgO1sVeLmpbwI652T%2FJeEtMDeMQ1aNPw8mzLPJrSottiR4K5fGaA%3D"}],"group":"cf-nel","max_age":604800}
[Fri May 17 09:35:17 2024].4869, 0.1196: CURL_INFO_HEADER_IN: nel: {"success_fraction":0.01,"report_to":"cf-nel","max_age":604800}
[Fri May 17 09:35:17 2024].4869, 0.1196: CURL_INFO_HEADER_IN: server: cloudflare
[Fri May 17 09:35:17 2024].4869, 0.1196: CURL_INFO_HEADER_IN: cf-ray: 8851ee0e2a4b7500-BRU
[Fri May 17 09:35:17 2024].4869, 0.1196: CURL_INFO_HEADER_IN: 
[Fri May 17 09:35:17 2024].4870, 0.1196: CURL_INFO_TEXT: Failure writing output to destination, passed 1360 returned 0
[Fri May 17 09:35:17 2024].4870, 0.1196: CURL_INFO_TEXT: process_pending_input: nghttp2_session_mem_recv() returned -902:The user callback function failed
[Fri May 17 09:35:17 2024].4870, 0.1196: CURL_INFO_TEXT: Connection #0 to host cdn.jsdelivr.net left intact
[Fri May 17 09:35:17 2024].4870, 0.1196: VSICURL: GetFileSize(https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json): response_code=200, curl error msg=Failure writing output to destination, passed 1360 returned 0
[Fri May 17 09:35:17 2024].4870, 0.1196: VSICURL: Request at offset 0, after end of file
[Fri May 17 09:35:17 2024].4964, 0.1290: VSICURL: Request at offset 0, after end of file
[Fri May 17 09:35:17 2024].4965, 0.1291: VSICURL: GetFileList(/vsicurl/https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data)
[Fri May 17 09:35:17 2024].4965, 0.1292: CURL_INFO_TEXT: Couldn't find host cdn.jsdelivr.net in the .netrc file; using defaults
[Fri May 17 09:35:17 2024].4966, 0.1292: CURL_INFO_TEXT: Found bundle for host: 0x5635b594f130 [can multiplex]
[Fri May 17 09:35:17 2024].4966, 0.1292: CURL_INFO_TEXT: Re-using existing connection with host cdn.jsdelivr.net
[Fri May 17 09:35:17 2024].4966, 0.1292: CURL_INFO_TEXT: [HTTP/2] [5] OPENED stream for https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/
[Fri May 17 09:35:17 2024].4966, 0.1292: CURL_INFO_TEXT: [HTTP/2] [5] [:method: GET]
[Fri May 17 09:35:17 2024].4966, 0.1292: CURL_INFO_TEXT: [HTTP/2] [5] [:scheme: https]
[Fri May 17 09:35:17 2024].4966, 0.1292: CURL_INFO_TEXT: [HTTP/2] [5] [:authority: cdn.jsdelivr.net]
[Fri May 17 09:35:17 2024].4966, 0.1292: CURL_INFO_TEXT: [HTTP/2] [5] [:path: /npm/vega-datasets@v1.29.0/data/]
[Fri May 17 09:35:17 2024].4966, 0.1292: CURL_INFO_TEXT: [HTTP/2] [5] [user-agent: GDAL/3.8.5]
[Fri May 17 09:35:17 2024].4966, 0.1292: CURL_INFO_TEXT: [HTTP/2] [5] [accept: */*]
[Fri May 17 09:35:17 2024].4966, 0.1293: CURL_INFO_HEADER_OUT: GET /npm/vega-datasets@v1.29.0/data/ HTTP/2
Host: cdn.jsdelivr.net
User-Agent: GDAL/3.8.5
Accept: */*

[Fri May 17 09:35:17 2024].4967, 0.1293: CURL_INFO_TEXT: Request completely sent off
[Fri May 17 09:41:57 2024].6567, 400.2893: CURL_INFO_TEXT: OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 0
[Fri May 17 09:41:57 2024].6568, 400.2894: CURL_INFO_TEXT: Failed receiving HTTP2 data: 56(Failure when receiving data from the peer)
[Fri May 17 09:41:57 2024].6568, 400.2894: CURL_INFO_TEXT: Connection died, retrying a fresh connect (retry count: 1)
[Fri May 17 09:41:57 2024].6569, 400.2895: CURL_INFO_TEXT: Closing connection
[Fri May 17 09:41:57 2024].6572, 400.2898: CURL_INFO_TEXT: Issue another request to this URL: 'https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/'
[Fri May 17 09:41:57 2024].6573, 400.2899: CURL_INFO_TEXT: Couldn't find host cdn.jsdelivr.net in the .netrc file; using defaults
[Fri May 17 09:41:57 2024].6890, 400.3217: CURL_INFO_TEXT: Host cdn.jsdelivr.net:443 was resolved.
[Fri May 17 09:41:57 2024].6891, 400.3217: CURL_INFO_TEXT: IPv6: 2606:4700::6812:bb1f, 2606:4700::6812:ba1f
[Fri May 17 09:41:57 2024].6891, 400.3217: CURL_INFO_TEXT: IPv4: 151.101.9.229
[Fri May 17 09:41:57 2024].6891, 400.3217: CURL_INFO_TEXT:   Trying [2606:4700::6812:bb1f]:443...
[Fri May 17 09:41:57 2024].7144, 400.3471: CURL_INFO_TEXT: Connected to cdn.jsdelivr.net (2606:4700::6812:bb1f) port 443
[Fri May 17 09:41:57 2024].7148, 400.3474: CURL_INFO_TEXT: ALPN: curl offers h2,http/1.1
[Fri May 17 09:41:57 2024].7150, 400.3477: CURL_INFO_TEXT: TLSv1.3 (OUT), TLS handshake, Client hello (1):
[Fri May 17 09:41:57 2024].7520, 400.3847: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, Server hello (2):
[Fri May 17 09:41:57 2024].7525, 400.3851: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
[Fri May 17 09:41:57 2024].7526, 400.3852: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, Certificate (11):
[Fri May 17 09:41:57 2024].7535, 400.3861: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, CERT verify (15):
[Fri May 17 09:41:57 2024].7536, 400.3862: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, Finished (20):
[Fri May 17 09:41:57 2024].7537, 400.3863: CURL_INFO_TEXT: TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
[Fri May 17 09:41:57 2024].7537, 400.3863: CURL_INFO_TEXT: TLSv1.3 (OUT), TLS handshake, Finished (20):
[Fri May 17 09:41:57 2024].7538, 400.3864: CURL_INFO_TEXT: SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384 / x25519 / RSASSA-PSS
[Fri May 17 09:41:57 2024].7538, 400.3865: CURL_INFO_TEXT: ALPN: server accepted h2
[Fri May 17 09:41:57 2024].7538, 400.3865: CURL_INFO_TEXT: Server certificate:
[Fri May 17 09:41:57 2024].7539, 400.3865: CURL_INFO_TEXT:  subject: CN=*.jsdelivr.net
[Fri May 17 09:41:57 2024].7539, 400.3865: CURL_INFO_TEXT:  start date: May  4 00:00:00 2024 GMT
[Fri May 17 09:41:57 2024].7539, 400.3865: CURL_INFO_TEXT:  expire date: May  4 23:59:59 2025 GMT
[Fri May 17 09:41:57 2024].7539, 400.3865: CURL_INFO_TEXT:  subjectAltName: host "cdn.jsdelivr.net" matched cert's "*.jsdelivr.net"
[Fri May 17 09:41:57 2024].7539, 400.3866: CURL_INFO_TEXT:  issuer: C=GB; ST=Greater Manchester; L=Salford; O=Sectigo Limited; CN=Sectigo RSA Domain Validation Secure Server CA
[Fri May 17 09:41:57 2024].7539, 400.3866: CURL_INFO_TEXT:  SSL certificate verify ok.
[Fri May 17 09:41:57 2024].7540, 400.3866: CURL_INFO_TEXT:   Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
[Fri May 17 09:41:57 2024].7540, 400.3866: CURL_INFO_TEXT:   Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha384WithRSAEncryption
[Fri May 17 09:41:57 2024].7540, 400.3866: CURL_INFO_TEXT:   Certificate level 2: Public key type RSA (4096/152 Bits/secBits), signed using sha384WithRSAEncryption
[Fri May 17 09:41:57 2024].7540, 400.3867: CURL_INFO_TEXT: using HTTP/2
[Fri May 17 09:41:57 2024].7541, 400.3867: CURL_INFO_TEXT: [HTTP/2] [1] OPENED stream for https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/
[Fri May 17 09:41:57 2024].7541, 400.3867: CURL_INFO_TEXT: [HTTP/2] [1] [:method: GET]
[Fri May 17 09:41:57 2024].7541, 400.3867: CURL_INFO_TEXT: [HTTP/2] [1] [:scheme: https]
[Fri May 17 09:41:57 2024].7541, 400.3867: CURL_INFO_TEXT: [HTTP/2] [1] [:authority: cdn.jsdelivr.net]
[Fri May 17 09:41:57 2024].7541, 400.3867: CURL_INFO_TEXT: [HTTP/2] [1] [:path: /npm/vega-datasets@v1.29.0/data/]
[Fri May 17 09:41:57 2024].7541, 400.3867: CURL_INFO_TEXT: [HTTP/2] [1] [user-agent: GDAL/3.8.5]
[Fri May 17 09:41:57 2024].7541, 400.3867: CURL_INFO_TEXT: [HTTP/2] [1] [accept: */*]
[Fri May 17 09:41:57 2024].7542, 400.3868: CURL_INFO_HEADER_OUT: GET /npm/vega-datasets@v1.29.0/data/ HTTP/2
Host: cdn.jsdelivr.net
User-Agent: GDAL/3.8.5
Accept: */*

[Fri May 17 09:41:57 2024].7542, 400.3868: CURL_INFO_TEXT: Request completely sent off
[Fri May 17 09:41:57 2024].7727, 400.4054: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
[Fri May 17 09:41:57 2024].7728, 400.4054: CURL_INFO_TEXT: TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
[Fri May 17 09:41:57 2024].7728, 400.4054: CURL_INFO_TEXT: old SSL session ID is stale, removing
[Fri May 17 09:41:57 2024].7863, 400.4190: CURL_INFO_HEADER_IN: HTTP/2 200 
[Fri May 17 09:41:57 2024].7864, 400.4190: CURL_INFO_HEADER_IN: date: Fri, 17 May 2024 07:41:57 GMT
[Fri May 17 09:41:57 2024].7864, 400.4190: CURL_INFO_HEADER_IN: content-type: text/html; charset=utf-8
[Fri May 17 09:41:57 2024].7864, 400.4190: CURL_INFO_HEADER_IN: access-control-allow-origin: *
[Fri May 17 09:41:57 2024].7864, 400.4190: CURL_INFO_HEADER_IN: access-control-expose-headers: *
[Fri May 17 09:41:57 2024].7864, 400.4190: CURL_INFO_HEADER_IN: timing-allow-origin: *
[Fri May 17 09:41:57 2024].7864, 400.4190: CURL_INFO_HEADER_IN: cache-control: public, max-age=43200
[Fri May 17 09:41:57 2024].7864, 400.4190: CURL_INFO_HEADER_IN: cross-origin-resource-policy: cross-origin
[Fri May 17 09:41:57 2024].7864, 400.4190: CURL_INFO_HEADER_IN: x-content-type-options: nosniff
[Fri May 17 09:41:57 2024].7864, 400.4191: CURL_INFO_HEADER_IN: strict-transport-security: max-age=31536000; includeSubDomains; preload
[Fri May 17 09:41:57 2024].7864, 400.4191: CURL_INFO_HEADER_IN: etag: W/"dcd6-45XJXoL1/OVruRVpAcMZYaBfuFE"
[Fri May 17 09:41:57 2024].7864, 400.4191: CURL_INFO_HEADER_IN: x-served-by: cache-fra-etou8220075-FRA, cache-lga21957-LGA
[Fri May 17 09:41:57 2024].7865, 400.4191: CURL_INFO_HEADER_IN: x-cache: HIT, HIT
[Fri May 17 09:41:57 2024].7865, 400.4191: CURL_INFO_HEADER_IN: vary: Accept-Encoding
[Fri May 17 09:41:57 2024].7865, 400.4191: CURL_INFO_HEADER_IN: alt-svc: h3=":443"; ma=86400
[Fri May 17 09:41:57 2024].7865, 400.4191: CURL_INFO_HEADER_IN: cf-cache-status: HIT
[Fri May 17 09:41:57 2024].7865, 400.4191: CURL_INFO_HEADER_IN: age: 400
[Fri May 17 09:41:57 2024].7865, 400.4191: CURL_INFO_HEADER_IN: report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=nUxkXC%2Fpzdg%2FGPmS0TYFEbt%2BTXsc6fBjDTFFpVKvOQNtX%2BOjJiO6VvpqCpxzyOaOax3Of5vCZS7%2BlD3oXsYfVWG3T2nN5CkQtDEImAKu%2F3ceSU7yC5OlPTv%2BMQoPfsBm%2Fqg80btHmWGQUxLrLnU%3D"}],"group":"cf-nel","max_age":604800}
[Fri May 17 09:41:57 2024].7865, 400.4191: CURL_INFO_HEADER_IN: nel: {"success_fraction":0.01,"report_to":"cf-nel","max_age":604800}
[Fri May 17 09:41:57 2024].7865, 400.4191: CURL_INFO_HEADER_IN: server: cloudflare
[Fri May 17 09:41:57 2024].7865, 400.4191: CURL_INFO_HEADER_IN: cf-ray: 8851f7d3f9ffb9a8-BRU
[Fri May 17 09:41:57 2024].7865, 400.4191: CURL_INFO_HEADER_IN: 
[Fri May 17 09:41:57 2024].7909, 400.4235: CURL_INFO_TEXT: Connection #1 to host cdn.jsdelivr.net left intact
[Fri May 17 09:41:57 2024].7920, 400.4246: CURL_INFO_TEXT: Couldn't find host cdn.jsdelivr.net in the .netrc file; using defaults
[Fri May 17 09:41:57 2024].7920, 400.4246: CURL_INFO_TEXT: Found bundle for host: 0x5635b594b220 [can multiplex]
[Fri May 17 09:41:57 2024].7920, 400.4246: CURL_INFO_TEXT: Re-using existing connection with host cdn.jsdelivr.net
[Fri May 17 09:41:57 2024].7920, 400.4246: CURL_INFO_TEXT: [HTTP/2] [3] OPENED stream for https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.jso1
[Fri May 17 09:41:57 2024].7920, 400.4246: CURL_INFO_TEXT: [HTTP/2] [3] [:method: HEAD]
[Fri May 17 09:41:57 2024].7920, 400.4246: CURL_INFO_TEXT: [HTTP/2] [3] [:scheme: https]
[Fri May 17 09:41:57 2024].7920, 400.4246: CURL_INFO_TEXT: [HTTP/2] [3] [:authority: cdn.jsdelivr.net]
[Fri May 17 09:41:57 2024].7920, 400.4247: CURL_INFO_TEXT: [HTTP/2] [3] [:path: /npm/vega-datasets@v1.29.0/data/earthquakes.jso1]
[Fri May 17 09:41:57 2024].7920, 400.4247: CURL_INFO_TEXT: [HTTP/2] [3] [user-agent: GDAL/3.8.5]
[Fri May 17 09:41:57 2024].7920, 400.4247: CURL_INFO_TEXT: [HTTP/2] [3] [accept: */*]
[Fri May 17 09:41:57 2024].7921, 400.4247: CURL_INFO_HEADER_OUT: HEAD /npm/vega-datasets@v1.29.0/data/earthquakes.jso1 HTTP/2
Host: cdn.jsdelivr.net
User-Agent: GDAL/3.8.5
Accept: */*

[Fri May 17 09:41:57 2024].7921, 400.4247: CURL_INFO_TEXT: Request completely sent off
[Fri May 17 09:41:57 2024].9411, 400.5737: CURL_INFO_HEADER_IN: HTTP/2 404 
[Fri May 17 09:41:57 2024].9411, 400.5737: CURL_INFO_HEADER_IN: date: Fri, 17 May 2024 07:41:57 GMT
[Fri May 17 09:41:57 2024].9411, 400.5737: CURL_INFO_HEADER_IN: content-type: text/plain; charset=utf-8
[Fri May 17 09:41:57 2024].9411, 400.5737: CURL_INFO_HEADER_IN: access-control-allow-origin: *
[Fri May 17 09:41:57 2024].9411, 400.5737: CURL_INFO_HEADER_IN: access-control-expose-headers: *
[Fri May 17 09:41:57 2024].9411, 400.5737: CURL_INFO_HEADER_IN: timing-allow-origin: *
[Fri May 17 09:41:57 2024].9411, 400.5737: CURL_INFO_HEADER_IN: cache-control: public, max-age=86400, s-maxage=86400
[Fri May 17 09:41:57 2024].9411, 400.5738: CURL_INFO_HEADER_IN: cross-origin-resource-policy: cross-origin
[Fri May 17 09:41:57 2024].9411, 400.5738: CURL_INFO_HEADER_IN: x-content-type-options: nosniff
[Fri May 17 09:41:57 2024].9411, 400.5738: CURL_INFO_HEADER_IN: strict-transport-security: max-age=31536000; includeSubDomains; preload
[Fri May 17 09:41:57 2024].9411, 400.5738: CURL_INFO_HEADER_IN: etag: W/"49-yvgCyxFakZ5saewA0lx92e2d2ns"
[Fri May 17 09:41:57 2024].9411, 400.5738: CURL_INFO_HEADER_IN: x-served-by: cache-fra-eddf8230061-FRA, cache-lga21941-LGA
[Fri May 17 09:41:57 2024].9411, 400.5738: CURL_INFO_HEADER_IN: x-cache: MISS, MISS
[Fri May 17 09:41:57 2024].9412, 400.5738: CURL_INFO_HEADER_IN: vary: Accept-Encoding
[Fri May 17 09:41:57 2024].9412, 400.5738: CURL_INFO_HEADER_IN: alt-svc: h3=":443"; ma=86400
[Fri May 17 09:41:57 2024].9412, 400.5738: CURL_INFO_HEADER_IN: cf-cache-status: HIT
[Fri May 17 09:41:57 2024].9412, 400.5738: CURL_INFO_HEADER_IN: report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=oCBDURNmlMmEe72M897v7yBXGvLlkbjp%2FmvBoUodgpxFE2Qd4H%2BDhH6P4qBL46I%2FhURsYBY3VNpkNjHwKK4a0d2VZsV7APxnWO%2FMh4seEUH%2FwbaZZ0XwSxzMbJO4CoY5KgS78X2BsYV6LHe6aRw%3D"}],"group":"cf-nel","max_age":604800}
[Fri May 17 09:41:57 2024].9412, 400.5738: CURL_INFO_HEADER_IN: nel: {"success_fraction":0.01,"report_to":"cf-nel","max_age":604800}
[Fri May 17 09:41:57 2024].9412, 400.5738: CURL_INFO_HEADER_IN: server: cloudflare
[Fri May 17 09:41:57 2024].9412, 400.5738: CURL_INFO_HEADER_IN: cf-ray: 8851f7d43a4db9a8-BRU
[Fri May 17 09:41:57 2024].9412, 400.5738: CURL_INFO_HEADER_IN: 
[Fri May 17 09:41:57 2024].9412, 400.5738: CURL_INFO_TEXT: Connection #1 to host cdn.jsdelivr.net left intact
[Fri May 17 09:41:57 2024].9412, 400.5738: VSICURL: GetFileSize(https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.jso1): response_code=404, server error msg=HTTP/2 404 
date: Fri, 17 May 2024 07:41:57 GMT
content-type: text/plain; charset=utf-8
access-control-allow-origin: *
access-control-expose-headers: *
timing-allow-origin: *
cache-control: public, max-age=86400, s-maxage=86400
cross-origin-resource-policy: cross-origin
x-content-type-options: nosniff
strict-transport-security: max-age=31536000; includeSubDomains; preload
etag: W/"49-yvgCyxFakZ5saewA0lx92e2d2ns"
x-served-by: cache-fra-eddf8230061-FRA, cache-lga21941-LGA
x-cache: MISS, MISS
vary: Accept-Encoding
alt-svc: h3=":443"; ma=86400
cf-cache-status: HIT
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=oCBDURNmlMmEe72M897v7yBXGvLlkbjp%2FmvBoUodgpxFE2Qd4H%2BDhH6P4qBL46I%2FhURsYBY3VNpkNjHwKK4a0d2VZsV7APxnWO%2FMh4seEUH%2FwbaZZ0XwSxzMbJO4CoY5KgS78X2BsYV6LHe6aRw%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0.01,"report_to":"cf-nel","max_age":604800}
server: cloudflare
cf-ray: 8851f7d43a4db9a8-BRU


ERROR 4: `/vsicurl/https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json' not recognized as a supported file format.
ogrinfo failed - unable to open '/vsicurl/https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json'.
[Fri May 17 09:41:57 2024].9428, 400.5754: GDAL: In GDALDestroy - unloading GDAL shared library.

One potentially interesting bit I see: VSICURL: HEAD did not provide file size. Retrying with GET. Because getting the file size doesn't work (VSICURL: GetFileSize(https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/earthquakes.json): response_code=200, curl error msg=Failure writing output to destination, passed 1360 returned 0), it tries to do a file listing for the "directory" in the url (https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data), and that is the part that hangs.
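For anyone who wants to experiment with the directory-listing step described above, here is a minimal sketch. It sets two existing GDAL configuration options, `GDAL_DISABLE_READDIR_ON_OPEN` and `CPL_VSIL_CURL_ALLOWED_EXTENSIONS`, as environment variables before the first read. That these avoid the hang in this particular case is an assumption and has not been verified here. The helper name `apply_gdal_options` and the chosen extension list are hypothetical.

```python
import os

# Both keys are existing GDAL configuration options; that they help with
# this specific failure is an assumption, not something verified here.
VSICURL_OPTIONS = {
    # Ask GDAL not to list sibling files in the URL's "directory" on open.
    "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
    # Restrict /vsicurl/ to files with these extensions (illustrative list).
    "CPL_VSIL_CURL_ALLOWED_EXTENSIONS": ".json,.geojson",
}


def apply_gdal_options(options, environ=os.environ):
    """Set options as environment variables and return the previous values.

    GDAL falls back to environment variables for its configuration options,
    so this should run before the first dataset is opened.
    """
    previous = {key: environ.get(key) for key in options}
    environ.update(options)
    return previous


if __name__ == "__main__":
    old_values = apply_gdal_options(VSICURL_OPTIONS)
    print(old_values)
```

Returning the previous values makes it easy to restore the environment after a test run.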

cc @rouault in case you have any insight here

@binste

binste commented May 18, 2024

I did a re-run, but it still gives the same error (link to run). Temporarily pinning geopandas to 0.14.3 made all tests pass again, so I did that for now in vega/altair#3421.

Comparing the changes between 0.14.3 and 0.14.4, I see some potentially related changes that might have been introduced by #3232?

We now also get the error with geopandas 0.14.3: https://github.com/vega/altair/actions/runs/9138091817/job/25128832186?pr=3419 -> So it seems unrelated to the changes between 0.14.3 and 0.14.4.

@jorisvandenbossche
Member

That's a pity for you (no easy workaround to have CI green), but at least something that makes sense! ;) (I otherwise really couldn't explain how the changes between 0.14.3 and 0.14.4 would have impacted this)

@binste

binste commented May 18, 2024

:) Good to hear that it's consistent with what you would expect. And we can just disable that test for now to keep going, that's fine.

Btw, great Arrow tutorial at PyData Berlin! Really enjoyed it.

@mattijn
Author

mattijn commented May 18, 2024

@binste, after emptying the cache of that CI run, it is working with 0.14.3. Unfortunately, the issue is still there with 0.14.4. I will have a look next week to see if I can isolate it further.

@mattijn
Author

mattijn commented May 22, 2024

After numerous tests, I think this comment is correct:

One possible reason I can think of: by the time GDAL tries to read from it again (although this should only be a few ms later), it sees the Cloudflare endpoint, and then fails?

Eventually I was able to make most of our CI happy again (from this branch), see https://github.com/vega/altair/actions/runs/9199264680/job/25303690232 with the following changes to geopandas: main...mattijn:geopandas:read_file_adaptation.
Since you are phasing out fiona, I'm not sure if there is interest in having this as a PR.

The failing CI is on python 3.8. Geopandas 1.0 drops support for this python version?

Btw, If I understand right, there is not yet partial data access support from urls using the pyogrio engine?

@m-richards
Member

m-richards commented May 25, 2024

Thanks for the investigation @mattijn! Feel free to open a PR and we can have a look. We are switching the default IO engine from fiona to pyogrio, but there aren't any plans to drop fiona as of now.

The failing CI is on python 3.8. Geopandas 1.0 drops support for this python version?

We actually dropped formal support for Python 3.8 in geopandas 0.14 (on the basis of SPEC 0 timings, rather than us switching to use Python 3.9+ exclusive features).

Btw, If I understand right, there is not yet partial data access support from urls using the pyogrio engine?

I'm not sure about this myself

@jorisvandenbossche
Member

Btw, If I understand right, there is not yet partial data access support from urls using the pyogrio engine?

My understanding is that pyogrio should support that just as much as fiona does, because I thought this support came from passing the URL down to GDAL/OGR.
(but clearly something different is going on if you see this failure with pyogrio and not with fiona, after your fix with _url_supports_random_access)
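To make the idea behind a check like `_url_supports_random_access` concrete, here is a minimal sketch using only the standard library. It is not the code from the linked branch: the function names and the fallback of downloading the whole file are assumptions for illustration. It treats a server as supporting partial reads when the response advertises `Accept-Ranges: bytes`.

```python
import urllib.request


def headers_allow_random_access(headers):
    """Return True if the response headers advertise byte-range support."""
    value = headers.get("Accept-Ranges") or ""
    return value.strip().lower() == "bytes"


def url_supports_random_access(url, timeout=10):
    """Send a HEAD request and inspect the Accept-Ranges header (hypothetical helper)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return headers_allow_random_access(response.headers)


def path_or_bytes_for(url, timeout=10):
    """Return the URL itself if range requests look supported, else the full payload.

    The URL can then be handed to GDAL for partial reads; the bytes can be
    handed to a reader that accepts in-memory data.
    """
    if url_supports_random_access(url, timeout=timeout):
        return url
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

The header check is kept separate from the network call so that it can be tested without an internet connection.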
