Simple download does not work #358

Closed
gigjozsa opened this issue Sep 8, 2022 · 1 comment

gigjozsa commented Sep 8, 2022

The following script (tested on several public datasets, on different computers, and with different Python versions) fails with an error:

#! /usr/bin/env python
import katdal

file = '1629930087_sdp_l0.full.rdb'
d = katdal.open(file)
a = d.vis[0,0,0]

This is Python 3.6.9 on an Ubuntu machine:
Distributor ID: Ubuntu
Description: Ubuntu 18.04.6 LTS
Release: 18.04
Codename: bionic
katdal was installed from PyPI:

pip install katdal

This results in the error reproduced below. I am not sure whether the problem is on the server side, in katdal, or on my side (although I don't think it is the latter). Please help!

WARNING:katdal.dataset:Extending flux density model frequency range of 'J0408-6545' from 1410-8400 MHz to 855-8400 MHz
Traceback (most recent call last):
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 638, in get_chunk
    headers=headers, stream=True)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 594, in complete_request
    with self.request(method, url, chunk_name, **kwargs) as response:
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 543, in request
    raise S3ObjectNotFound(msg)
katdal.chunkstore_s3.S3ObjectNotFound: Chunk '1629930087-sdp-l0/correlator_data/00000_00000_00000': Store responded with HTTP error 404 (Not Found) to request: GET http://archive-gw-1.kat.ac.za/1629930087-sdp-l0/correlator_data/00000_00000_00000.npy
Details of server response: <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchKey</Code><BucketName>1629930087-sdp-l0</BucketName><RequestId>tx00000000000000df9d20a-006319b9e6-da08a2b-default</RequestId><HostId>da08a2b-default-default</HostId></Error>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./rfitest.py", line 6, in <module>
    a = d.vis[0,0,0]
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 558, in __getitem__
    return self.get([self], keep)[0]
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 591, in get
    da.store(kept, out, lock=False)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/array/core.py", line 1041, in store
    result.compute(**kwargs)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/base.py", line 283, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/base.py", line 565, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/threaded.py", line 84, in get
    **kwargs
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/local.py", line 487, in get_async
    raise_exception(exc, tb)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/local.py", line 317, in reraise
    raise exc
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/local.py", line 222, in execute_task
    result = _execute_task(task, data)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore.py", line 145, in __getitem__
    return self.getter(self.array_name, slices, self.dtype, **self.kwargs)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore.py", line 325, in get_chunk_or_placeholder
    return self.get_chunk(array_name, slices, dtype)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 641, in get_chunk
    self._verify_bucket(url, err)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 625, in _verify_bucket
    raise StoreUnavailable(msg) from chunk_error
katdal.chunkstore.StoreUnavailable: S3 bucket http://archive-gw-1.kat.ac.za/1629930087-sdp-l0 is empty - your data is not currently accessible

ludwigschwardt (Contributor) commented

Hi Josh,

The key bit of info is the very last line:

katdal.chunkstore.StoreUnavailable: S3 bucket http://archive-gw-1.kat.ac.za/1629930087-sdp-l0 is empty - your data is not currently accessible

This dataset is more than a year old. We only keep datasets on disk, where katdal can find them, for 200 days. After that, the datasets are shipped off to tape, and you have to ask the archive folks to put specific ones back on disk for you. This restaging process may take up to 30 days.

That is why your data is "not currently accessible".
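
As a rough way to check this before attempting a download, the sketch below estimates a dataset's age from its file name and compares it with the 200-day window mentioned above. It assumes that the leading number in the RDB file name (here 1629930087) is a Unix timestamp for the start of the capture; that is an assumption on my part, not something stated in this thread. The helper names `capture_time` and `probably_on_tape` are hypothetical and are not part of katdal.

```python
import os
from datetime import datetime, timedelta, timezone

# Retention window quoted in the reply above (200 days on disk).
DISK_RETENTION = timedelta(days=200)


def capture_time(rdb_name):
    """Interpret the leading integer of an RDB file name as a UTC time.

    Assumption: the number is a Unix timestamp for the start of the capture.
    """
    stem = os.path.basename(rdb_name)
    seconds = int(stem.split("_", 1)[0])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def probably_on_tape(rdb_name, now=None):
    """Return True if the dataset is older than the disk retention window."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - capture_time(rdb_name) > DISK_RETENTION


name = "1629930087_sdp_l0.full.rdb"
reported = datetime(2022, 9, 8, tzinfo=timezone.utc)  # date this issue was opened

age = reported - capture_time(name)
print("captured:", capture_time(name).date(), "age in days:", age.days)
print("probably on tape:", probably_on_tape(name, now=reported))
```

A True result only suggests that the data may have been moved to tape; a dataset that has been restaged on request would still be accessible, so the archive itself remains the authority.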

If the data has already been restaged and it is still not accessible, please contact the archive folks to sort it out. It might be a bug on the archive side.
