convert platybrowser data to zarr #1

Open
tischi opened this issue Oct 16, 2020 · 45 comments
@tischi
Owner

tischi commented Oct 16, 2020

@joshmoore

All of those XML files point to n5s3 datasets: https://github.com/mobie/platybrowser-datasets/tree/master/data/1.0.1/images/remote
Within each XML file you can see all the information needed to access the object in the bucket.

It would be cool to have those converted to zarr:

@joshmoore
Collaborator

@tischi : want to add me to this repo so I can assign myself?

@joshmoore
Collaborator

Do you have any info on the sizes of all these volumes? I'm running a recursive s3 ls now but it's taking ages.

@tischi
Owner Author

tischi commented Oct 22, 2020

I'm running a recursive s3 ls now but it's taking ages.

That's the issue with these millions of small files...doing anything with them but lazy loading chunks is not much fun.
I will try to check....

@tischi
Owner Author

tischi commented Oct 22, 2020

...running now du -sh sbem-6dpf-1-whole-raw.n5 but that also takes time........................

@tischi
Owner Author

tischi commented Oct 22, 2020

2.0T sbem-6dpf-1-whole-raw.n5
18G sbem-6dpf-1-whole-segmented-cells.n5

and the other one is much smaller.

@joshmoore
Collaborator

joshmoore commented Oct 22, 2020

Doh. My command was done after my 🏃 :

$ aws-embl-public s3 ls --summarize --human-readable --recursive s3://platybrowser/rawdata/sbem-6dpf-1-whole-raw.n5/
...
Total Objects: 4049381
   Total Size: 2.0 TiB

@tischi
Owner Author

tischi commented Oct 22, 2020

I am curious how long it will take to copy and zarrify this. Maybe would be interesting to time it for future reference.

@joshmoore
Collaborator

joshmoore commented Oct 25, 2020

The transfer is progressing extremely slowly. Do you have a small example dataset I could use to write a script and then you could transform locally?

Edit: actually, I'm now getting permission denied when I try to access s3.embl.de!

@tischi
Owner Author

tischi commented Oct 26, 2020

The myosin data set in the list above is small.

Edit: actually, I'm now getting permission denied when I try to access s3.embl.de!

Interesting, this may be related to this:
mobie/mobie#18

@joshmoore
Collaborator

No luck:

$ aws --endpoint-url=https://s3.embl.de --no-sign-request s3 ls s3://platybrowser/

[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)

$ aws --no-verify-ssl --endpoint-url=https://s3.embl.de --no-sign-request s3 ls s3://platybrowser/
/usr/lib/fence-agents/bundled/botocore/vendored/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)

An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied.

@tischi
Owner Author

tischi commented Oct 26, 2020

I will write to IT...

@tischi
Owner Author

tischi commented Oct 26, 2020

We also have the exact same data on a file system.
Should I zip it and then provide you a download link for this?
The alternative is that we zarrify it at EMBL, but then we would still need to zip it and send it to you, I think...

@joshmoore
Collaborator

Yeah, if you can provide me a small- to mid-size download, I'll get started on a script and/or docker you can run.

@tischi
Owner Author

tischi commented Nov 11, 2020

Just to keep track of what I do, in case I have to repeat it:

bash-4.2$ ./aws configure --profile tischi
AWS Access Key ID [None]: tischi
AWS Secret Access Key [None]: xyz
Default region name [None]: 
Default output format [None]:
bash-4.2$ ./aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 ls s3://idr-upload/tischi/
2020-11-10 16:39:33         84 README.txt
 ./aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 cp --recursive /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/0.6.3/images/local/prospr-6dpf-1-whole-mhcl4.n5 s3://idr-upload/tischi/prospr-6dpf-1-whole-mhcl4.n5

Note: it is important to add the root folder to the upload destination.

@joshmoore
I uploaded one file: prospr-6dpf-1-whole-mhcl4.n5
Can you read it?

@tischi
Owner Author

tischi commented Nov 11, 2020

sbem-6dpf-1-whole-segmented-cells.n5

sbatch -c 2 -t 10:00:00 --mem 16000 -e /g/cba/tischer/tmp/err.txt -o /g/cba/tischer/tmp/out.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 sync --quiet /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/1.0.1/images/local/sbem-6dpf-1-whole-segmented-cells.n5 s3://idr-upload/tischi/sbem-6dpf-1-whole-segmented-cells.n5

started...

sacct --format="JobID,State,CPUTime,MaxRSS"

I think it finished:

-bash-4.2$ sacct --format="JobID,State,CPUTime,MaxRSS"
       JobID      State    CPUTime     MaxRSS 
------------ ---------- ---------- ---------- 
6315838       COMPLETED   06:23:16            
6315838.bat+  COMPLETED   06:23:16    119908K 
6315838.ext+  COMPLETED   06:23:24       352K 

Took 6.5 hours, seems to have arrived.

-bash-4.2$ /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 ls s3://idr-upload/tischi/
                           PRE prospr-6dpf-1-whole-mhcl4.n5/
                           PRE sbem-6dpf-1-whole-segmented-cells.n5/
                           PRE test/
                           PRE test2/
                           PRE test3/
                           PRE test5/
2020-11-10 16:39:33         84 README.txt
2020-11-11 12:47:47         22 attributes.json

TODO:

sbatch -c 2 -t ???:00:00 --mem 16000 -e /g/cba/tischer/tmp/err.txt -o /g/cba/tischer/tmp/out.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 sync --quiet /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5

@tischi
Owner Author

tischi commented Nov 12, 2020

@joshmoore

Based on the above experiment, if I extrapolate how long it would take to upload the 3D volume EM raw data using aws sync, I get:

2000 GB / 18 GB * 7 hours / 24 hours = 32 days

Any thoughts?
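
The extrapolation above can be reproduced in a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the upload time scales linearly with the data size, which may not hold given that the raw volume consists of roughly 4 million objects and per-object overhead could dominate. The function name is made up for illustration.

```python
def estimate_upload_days(total_gb, sample_gb, sample_hours):
    """Linearly extrapolate the upload time from a measured sample transfer."""
    hours = total_gb / sample_gb * sample_hours
    return hours / 24.0


# Numbers from this thread: 18 GB took roughly 7 hours, the raw data is about 2000 GB.
days = estimate_upload_days(total_gb=2000, sample_gb=18, sample_hours=7)
print(f"estimated upload time: {days:.1f} days")
```

Plugging in a different measured sample (for example the timing of a single resolution level) gives a quick way to update the estimate as more transfers finish.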

@tischi
Owner Author

tischi commented Nov 12, 2020

@constantinpape
Do you know how long it took to copy /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5 onto our local S3 storage?

@martinschorb
Do you know any tricks to speed up copying to an S3 object store? I think you looked into this a bit, didn't you?

@tischi
Owner Author

tischi commented Nov 12, 2020

One idea could be to start several copy processes, e.g., parallelising over the resolution layers:

-bash-4.2$ ls /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/
attributes.json  s0  s1  s2  s3  s4  s5  s6  s7  s8  s9

I would think both our local file system and Josh's receiving S3 storage should handle 10 parallel processes.
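
A minimal sketch of that idea in Python, assuming one `aws s3 sync` process per resolution level is acceptable for both ends. The paths, profile and endpoint are copied from the commands in this thread; the function names and the `dry_run` switch are made up for illustration, and on the cluster one would more likely submit each command through sbatch instead of running it from a thread pool.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Paths, profile and endpoint are copied from the commands in this thread.
AWS = "/g/cba/tischer/software/aws"
SRC = "/g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0"
DST = "s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0"
ENDPOINT = "https://idr-ftp.openmicroscopy.org"


def sync_command(level):
    """Build the `aws s3 sync` call for one resolution level, e.g. 's3'."""
    return [
        AWS, "--profile", "tischi", f"--endpoint-url={ENDPOINT}",
        "s3", "sync", "--quiet", f"{SRC}/{level}", f"{DST}/{level}",
    ]


def sync_all(levels, max_workers=10, dry_run=True):
    """Run one sync per level in parallel; with dry_run only return the commands."""
    commands = [sync_command(level) for level in levels]
    if dry_run:
        return commands
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each worker thread only waits for its own aws subprocess to finish.
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))


levels = [f"s{i}" for i in range(10)]  # s0 .. s9, as in the listing above
for cmd in sync_all(levels):
    print(" ".join(cmd))
```

Note that syncing level by level skips the `attributes.json` that sits next to the `s0` .. `s9` folders, so that file would need to be copied separately.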

@martinschorb

I found that it was much faster from a 3dcloud VM than from the cluster. But that could be specific to the network connectivity to the S3 machines.

@constantinpape
Contributor

Do you know how long it took to copy /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5 onto our local S3 storage?

I think about a day. I used a cluster node (gpu6 or 7 probably).

@constantinpape
Contributor

@joshmoore
@tischi

I am not sure if this is helpful, but I could also convert the data to zarr on the EMBL side.

@tischi
Owner Author

tischi commented Nov 12, 2020

I am not sure if this is helpful, but I could also convert the data to zarr on the EMBL side.

I think this is very interesting indeed, but @joshmoore should comment, because I don't know whether he needs some specific zarr flavour.

@joshmoore
Collaborator

Any thoughts?

Not immediately, unless you want to also try tar'ing it up.

I am not sure if this is helpful, but I could also convert the data to zarr on the EMBL side.

@constantinpape : if you want to kick off a n5-copy from n5 to zarr then I'll send you a script for the rest of the conversion. That being said, it would still be good to have the files on our servers for testing.

@tischi
Owner Author

tischi commented Nov 12, 2020

I think I'll just start it, resolution layer by resolution layer...

@constantinpape
Contributor

constantinpape commented Nov 12, 2020

Edit: Sorry, I wrote this before Tischi's last comment. If you want to do it, Tischi, go ahead.

@constantinpape : if you want to kick off a n5-copy from n5 to zarr then I'll send you a script for the rest of the conversion. That being said, it would still be good to have the files on our servers for testing.

I would probably use a python script I have set up for this.
If we want to do it on the EMBL side, it would be best to test on one of the smaller volumes first: I do the conversion, then run the script from @joshmoore, and we see if the results match.

Let's start with myosin, I will convert it later.

@tischi
Owner Author

tischi commented Nov 12, 2020

s9

sbatch -c 8 -t 100:00:00 --mem 16000 -e /g/cba/tischer/tmp/err_s9.txt -o /g/cba/tischer/tmp/out_s9.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 sync --quiet /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s9 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s9

This finished instantly...
@constantinpape
Could it be that this level is empty?

-bash-4.2$ ls /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s9/0/0/0
/g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s9/0/0/0

@joshmoore
Collaborator

I would probably use a python script I have set up for this.

👍 for however it happens but the equivalent, yeah. 👍

@constantinpape
Contributor

I would probably use a python script I have set up for this.

+1 for however it happens but the equivalent, yeah. +1

@joshmoore ok, let's see if we can get Tischi's conversion to run first and then have this as a fallback.

This finished instantly...
@constantinpape
Could it be that this level is empty?

I can't log into VPN right now, will check later.
(But I can tell you already that the data is probably very small at s9 ;))

@constantinpape
Contributor

@tischi s9 has exactly one chunk, which is 41kb, so I would expect it to copy almost immediately:

pape@gpu7:/g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s9$ ls -lh 0/0/0 
-rw-r--r-- 1 pape kreshuk 41K 12. Feb 2020  0/0/0

@tischi
Owner Author

tischi commented Nov 12, 2020

sbatch -c 8 -t 100:00:00 --mem 16000 -e /g/cba/tischer/tmp/err_s3.txt -o /g/cba/tischer/tmp/out_s3.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 sync --quiet /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s3 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s3
...level s3 with sync took 1h 48min
// All levels above are done with sync, now proceeding with cp --recursive; maybe it's faster since it does not have to check? We can then add missing chunks with sync later, I guess.
sbatch -c 8 -t 100:00:00 --mem 16000 -e /g/cba/tischer/tmp/err_s2.txt -o /g/cba/tischer/tmp/out_s2.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 cp --quiet --recursive /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s2 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s2
sbatch -c 8 -t 100:00:00 --mem 16000 -e /g/cba/tischer/tmp/err_s1.txt -o /g/cba/tischer/tmp/out_s1.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 cp --quiet --recursive /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1
sbatch -c 8 -t 100:00:00 --mem 16000 -e /g/cba/tischer/tmp/err_s0.txt -o /g/cba/tischer/tmp/out_s0.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 cp --quiet --recursive /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s0 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s0

@joshmoore
Collaborator

Here's a quick script which looks to be working locally. I'm unsure if setups are always channels for this data and if there's ever more than one channel and/or setup.

$ ./convert.py prospr-6dpf-1-whole-nachr.zarr prospr-6dpf-1-whole-nachr.ome.zarr
$ ome_zarr info prospr-6dpf-1-whole-nachr.ome.zarr/
/opt/data/tischi/prospr-6dpf-1-whole-nachr.ome.zarr [zgroup]
 - metadata
   - Multiscales
 - data
   - (1, 1, 519, 471, 500)
   - (1, 1, 260, 236, 250)
   - (1, 1, 130, 118, 125)
   - (1, 1, 65, 59, 63)

#!/usr/bin/env python

# This assumes that n5-copy has already been used

import argparse
import zarr

parser = argparse.ArgumentParser()
parser.add_argument("input")
parser.add_argument("output")
ns = parser.parse_args()

zin = zarr.open(ns.input)

sizes = []

def groups(z):
    rv = sorted(list(z.groups()))
    assert rv
    assert not list(z.arrays())
    return rv

def arrays(z):
    rv = sorted(list(z.arrays()))
    assert rv
    assert not list(z.groups())
    return rv

setups = groups(zin)
assert len(setups) == 1  # TODO: multiple channels?
for sname, setup in setups:
    timepoints = groups(setup)
    for tname, timepoint in timepoints:
        resolutions = arrays(timepoint)
        for idx, rtuple in enumerate(resolutions):
            rname, resolution = rtuple
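            try:
                expected = sizes[idx]
            except IndexError:
                # First time this resolution level is seen: record it.
                sizes.append((rname,
                              resolution.shape,
                              resolution.chunks,
                              resolution.dtype))
            else:
                # Every other setup/timepoint must match the recorded level;
                # a bare "except:" would have swallowed these failures.
                assert expected[0] == rname
                assert expected[1] == resolution.shape
                assert expected[2] == resolution.chunks
                assert expected[3] == resolution.dtype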


datasets = []
out = zarr.open(ns.output, mode="w")

for idx, size in enumerate(sizes):
    name, shape, chunks, dtype = size
    shape = tuple([len(timepoints), len(setups)] + list(shape))
    chunks = tuple([1, 1] + list(chunks))
    a = out.create_dataset(name, shape=shape, chunks=chunks, dtype=dtype)
    datasets.append({"path": name})
    for sidx, stuple in enumerate(groups(zin)):
        for tidx, ttuple in enumerate(groups(stuple[1])):
            resolutions = arrays(ttuple[1])
            a[tidx, sidx, :, :, :] = resolutions[idx][1]
out.attrs["multiscales"] = [
    {
        "version": "0.1",
        "datasets": datasets,
    }
]

@constantinpape
Contributor

I think I'll just start it, resolution layer by resolution l

I'm unsure if setups are always channels for this data and if there's ever more than one channel and/or setup.

For now we always have a single setup, corresponding to a single channel.

@joshmoore
Collaborator

I don't know if this is a problem in zarr.n5.N5Store or in the prospr-6dpf-1-whole-nachr.n5 data I've been looking at, but not having "n5": "2.0.0" in the intermediate groups leads to the exception:

In [43]: list(zarr.hierarchy.Group(store=zarr.n5.N5Store("/opt/data/tischi/prospr-6dpf-1-whole-nachr.n5")).groups())
...
ValueError: group not found at path 'setup0'

whereas if I edit the file I get:

In [44]: list(zarr.hierarchy.Group(store=zarr.n5.N5Store("/opt/data/tischi/prospr-6dpf-1-whole-nachr.n5")).groups())
Out[44]: [('setup0', <zarr.hierarchy.Group '/setup0'>)]

@constantinpape
Contributor

constantinpape commented Nov 12, 2020

I have written these files with z5py, which for n5 only writes the version attribute to the root, as specified in
https://github.com/saalfeldlab/n5#file-system-specification point 3.

Did this maybe change recently to be more in line with the zarr group metadata? (It shouldn't without changing major version because I think this would be a breaking change.)

Or is it just a bug in the zarr.n5store?

Anyway, for now we can work around it by adding the attributes, and then find the underlying issue.
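
A possible sketch of that workaround, assuming it is acceptable to edit the `attributes.json` files in place (ideally on a copy of the data). The version string is the one quoted in this thread. The function name is hypothetical, and datasets are recognised by the `dimensions` entry that the N5 spec requires for them; they are left untouched.

```python
import json
import os

N5_VERSION = "2.0.0"  # value quoted in this thread


def add_group_versions(group_dir, version=N5_VERSION):
    """Add an "n5" entry to attributes.json of group_dir and all nested groups.

    Directories whose attributes.json contains "dimensions" are treated as
    datasets: they are not modified and not descended into.
    Returns the list of attribute files that were written.
    """
    attrs_path = os.path.join(group_dir, "attributes.json")
    attrs = {}
    if os.path.exists(attrs_path):
        with open(attrs_path) as f:
            attrs = json.load(f)
    if "dimensions" in attrs:
        return []  # a dataset, not a group

    written = []
    if "n5" not in attrs:
        attrs["n5"] = version
        with open(attrs_path, "w") as f:
            json.dump(attrs, f)
        written.append(attrs_path)

    for entry in sorted(os.listdir(group_dir)):
        child = os.path.join(group_dir, entry)
        if os.path.isdir(child):
            written.extend(add_group_versions(child, version))
    return written
```

Because the recursion stops at the first dataset it meets, the chunk directories below `s0`, `s1`, ... are never listed, so this stays fast even for the large volumes.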

@constantinpape
Contributor

@joshmoore
I had a closer look at the script you posted now, and I think we can do the same thing directly from our n5s and with much less copying around of the data. I implemented a script that should do this here:
https://github.com/constantinpape/i2k-2020-s3-zarr-workshop/blob/main/data-conversion/to_ome_zarr.py

Note that I am using z5py to read the n5 datasets because of the issue with the group-level attributes; otherwise one could also use zarr.
Also, I am storing the zarr array with a NestedDirectoryStore. I would really prefer it if we can do that, because otherwise the large datasets really overwhelm the FS. But if it's not supported yet, we could also switch to the standard flat hierarchy for the chunks.
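
To illustrate why the flat layout is hard on the file system, here is a small pure-Python sketch that does not use zarr itself. It only mimics the two key conventions: flat chunk keys joined with `.` and nested keys joined with `/`. The chunk grid is a hypothetical example and the function names are made up.

```python
from collections import Counter
from itertools import product


def chunk_key(index, nested):
    """Chunk key for a chunk index: '3/1/2' if nested, '3.1.2' if flat."""
    return ("/" if nested else ".").join(str(i) for i in index)


def entries_per_directory(grid, nested):
    """Count the entries created directly inside each directory.

    A nested key 'a/b/c' creates entry 'a' in the array directory '',
    entry 'b' in 'a' and entry 'c' in 'a/b'; a flat key is one entry in ''.
    """
    entries = set()
    for index in product(*(range(n) for n in grid)):
        parts = chunk_key(index, nested).split("/")
        for depth in range(len(parts)):
            entries.add(("/".join(parts[:depth]), parts[depth]))
    return Counter(parent for parent, _name in entries)


grid = (20, 30, 40)  # hypothetical number of chunks along each axis
print(max(entries_per_directory(grid, nested=False).values()))  # 24000
print(max(entries_per_directory(grid, nested=True).values()))   # 40
```

With the flat layout every chunk lands in a single directory, whereas the nested layout never puts more entries into one directory than there are chunks along one axis.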

@joshmoore
Collaborator

Or is it just a bug in the zarr.n5store?

Yes. zarr-developers/zarr-python#651

I implemented a script that should do this here:

👍 I'll look more tomorrow.

But if it's not supported yet we could also switch to the standard with flat hierarchy for the chunks.

It previously wasn't on the zarr side, so in the ome-zarr spec it's prevented. I agree! I'd very much like to move to nested storage in the next version bump.

@constantinpape
Contributor

Ok, I updated it to support the flat chunk hierarchy.

@tischi
Owner Author

tischi commented Nov 17, 2020

@joshmoore
Feels like the storage is very slow for some reason. Maybe because you copy from it?
This only returns with a timeout for me:

aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 ls s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s0/

Does it work for you?

@joshmoore
Collaborator

Ah, possibly. I've canceled my mirror command. Let me know if it looks to be faster.

...
...point0/s0/120/159/1:  159.27 GiB / 159.27 GiB ┃▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓┃ 1.83 MiB/s 24h44m10s
real    1484m16.708s
user    23m28.571s
sys     23m16.119s

@tischi
Owner Author

tischi commented Nov 17, 2020

@joshmoore
Still slow (see my mail for a theory) for me.
Is it faster for you?

@joshmoore
Collaborator

I think you are right that the lower paths are struggling under the number of subelements. Certainly listing the top .n5 works (--> setup0). Listing it on the server returns fine (50 elements).

@tischi
Owner Author

tischi commented Nov 17, 2020

@joshmoore
If you can, could you please let me know the result of an ls for these three folders?

sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s0/
sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1/
sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s2/

From the results I hope to deduce what has been copied already such that I do not start the sync in more subfolders than necessary.

@joshmoore
Collaborator

@tischi Sure!

output
[jamoore@idrftp-ftp ~]$ mc ls idr-upload/idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s0/
[2020-11-17 17:04:15 UTC]      0B 0/
[2020-11-17 17:04:15 UTC]      0B 1/
[2020-11-17 17:04:15 UTC]      0B 10/
[2020-11-17 17:04:15 UTC]      0B 100/
[2020-11-17 17:04:15 UTC]      0B 101/
[2020-11-17 17:04:15 UTC]      0B 102/
[2020-11-17 17:04:15 UTC]      0B 103/
[2020-11-17 17:04:15 UTC]      0B 104/
[2020-11-17 17:04:15 UTC]      0B 105/
[2020-11-17 17:04:15 UTC]      0B 106/
[2020-11-17 17:04:15 UTC]      0B 107/
[2020-11-17 17:04:15 UTC]      0B 108/
[2020-11-17 17:04:15 UTC]      0B 109/
[2020-11-17 17:04:15 UTC]      0B 11/
[2020-11-17 17:04:15 UTC]      0B 110/
[2020-11-17 17:04:15 UTC]      0B 111/
[2020-11-17 17:04:15 UTC]      0B 112/
[2020-11-17 17:04:15 UTC]      0B 113/
[2020-11-17 17:04:15 UTC]      0B 114/
[2020-11-17 17:04:15 UTC]      0B 115/
[2020-11-17 17:04:15 UTC]      0B 116/
[2020-11-17 17:04:15 UTC]      0B 117/
[2020-11-17 17:04:15 UTC]      0B 118/
[2020-11-17 17:04:15 UTC]      0B 119/
[2020-11-17 17:04:15 UTC]      0B 12/
[2020-11-17 17:04:15 UTC]      0B 120/
[2020-11-17 17:04:15 UTC]      0B 121/
[2020-11-17 17:04:15 UTC]      0B 122/
[2020-11-17 17:04:15 UTC]      0B 123/
[2020-11-17 17:04:15 UTC]      0B 124/
[2020-11-17 17:04:15 UTC]      0B 125/
[2020-11-17 17:04:15 UTC]      0B 126/
[2020-11-17 17:04:15 UTC]      0B 127/
[2020-11-17 17:04:15 UTC]      0B 128/
[2020-11-17 17:04:15 UTC]      0B 129/
[2020-11-17 17:04:15 UTC]      0B 13/
[2020-11-17 17:04:15 UTC]      0B 130/
[2020-11-17 17:04:15 UTC]      0B 131/
[2020-11-17 17:04:15 UTC]      0B 132/
[2020-11-17 17:04:15 UTC]      0B 133/
[2020-11-17 17:04:15 UTC]      0B 134/
[2020-11-17 17:04:15 UTC]      0B 135/
[2020-11-17 17:04:15 UTC]      0B 136/
[2020-11-17 17:04:15 UTC]      0B 137/
[2020-11-17 17:04:15 UTC]      0B 138/
[2020-11-17 17:04:15 UTC]      0B 139/
[2020-11-17 17:04:15 UTC]      0B 14/
[2020-11-17 17:04:15 UTC]      0B 140/
[2020-11-17 17:04:15 UTC]      0B 141/
[2020-11-17 17:04:15 UTC]      0B 142/
[jamoore@idrftp-ftp ~]$ mc ls idr-upload/idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1/
[2020-11-17 17:04:29 UTC]      0B 0/
[2020-11-17 17:04:29 UTC]      0B 1/
[2020-11-17 17:04:29 UTC]      0B 10/
[2020-11-17 17:04:29 UTC]      0B 100/
[2020-11-17 17:04:29 UTC]      0B 101/
[2020-11-17 17:04:29 UTC]      0B 102/
[2020-11-17 17:04:29 UTC]      0B 103/
[2020-11-17 17:04:29 UTC]      0B 104/
[2020-11-17 17:04:29 UTC]      0B 105/
[2020-11-17 17:04:29 UTC]      0B 106/
[2020-11-17 17:04:29 UTC]      0B 107/
[2020-11-17 17:04:29 UTC]      0B 108/
[2020-11-17 17:04:29 UTC]      0B 109/
[2020-11-17 17:04:29 UTC]      0B 11/
[2020-11-17 17:04:29 UTC]      0B 110/
[2020-11-17 17:04:29 UTC]      0B 111/
[2020-11-17 17:04:29 UTC]      0B 112/
[2020-11-17 17:04:29 UTC]      0B 113/
[2020-11-17 17:04:29 UTC]      0B 114/
[2020-11-17 17:04:29 UTC]      0B 115/
[2020-11-17 17:04:29 UTC]      0B 116/
[2020-11-17 17:04:29 UTC]      0B 117/
[2020-11-17 17:04:29 UTC]      0B 118/
[2020-11-17 17:04:29 UTC]      0B 119/
[2020-11-17 17:04:29 UTC]      0B 12/
[2020-11-17 17:04:29 UTC]      0B 120/
[2020-11-17 17:04:29 UTC]      0B 121/
[2020-11-17 17:04:29 UTC]      0B 122/
[2020-11-17 17:04:29 UTC]      0B 123/
[2020-11-17 17:04:29 UTC]      0B 124/
[2020-11-17 17:04:29 UTC]      0B 125/
[2020-11-17 17:04:29 UTC]      0B 126/
[2020-11-17 17:04:29 UTC]      0B 127/
[2020-11-17 17:04:29 UTC]      0B 128/
[2020-11-17 17:04:29 UTC]      0B 129/
[2020-11-17 17:04:29 UTC]      0B 13/
[2020-11-17 17:04:29 UTC]      0B 130/
[2020-11-17 17:04:29 UTC]      0B 131/
[2020-11-17 17:04:29 UTC]      0B 132/
[2020-11-17 17:04:29 UTC]      0B 133/
[2020-11-17 17:04:29 UTC]      0B 134/
[2020-11-17 17:04:29 UTC]      0B 135/
[2020-11-17 17:04:29 UTC]      0B 136/
[2020-11-17 17:04:29 UTC]      0B 137/
[2020-11-17 17:04:29 UTC]      0B 138/
[2020-11-17 17:04:29 UTC]      0B 139/
[2020-11-17 17:04:29 UTC]      0B 14/
[2020-11-17 17:04:29 UTC]      0B 140/
[2020-11-17 17:04:29 UTC]      0B 141/
[2020-11-17 17:04:29 UTC]      0B 142/
[2020-11-17 17:04:29 UTC]      0B 143/
[2020-11-17 17:04:29 UTC]      0B 15/
[2020-11-17 17:04:29 UTC]      0B 16/
[2020-11-17 17:04:29 UTC]      0B 17/
[2020-11-17 17:04:29 UTC]      0B 18/
[2020-11-17 17:04:29 UTC]      0B 19/
[2020-11-17 17:04:29 UTC]      0B 2/
[2020-11-17 17:04:29 UTC]      0B 20/
[2020-11-17 17:04:29 UTC]      0B 21/
[2020-11-17 17:04:29 UTC]      0B 22/
[2020-11-17 17:04:29 UTC]      0B 23/
[2020-11-17 17:04:29 UTC]      0B 24/
[2020-11-17 17:04:29 UTC]      0B 25/
[2020-11-17 17:04:29 UTC]      0B 26/
[2020-11-17 17:04:29 UTC]      0B 27/
[2020-11-17 17:04:29 UTC]      0B 28/
[2020-11-17 17:04:29 UTC]      0B 29/
[2020-11-17 17:04:29 UTC]      0B 3/
[2020-11-17 17:04:29 UTC]      0B 30/
[2020-11-17 17:04:29 UTC]      0B 31/
[2020-11-17 17:04:29 UTC]      0B 32/
[2020-11-17 17:04:29 UTC]      0B 33/
[2020-11-17 17:04:29 UTC]      0B 34/
[2020-11-17 17:04:29 UTC]      0B 35/
[2020-11-17 17:04:29 UTC]      0B 36/
[2020-11-17 17:04:29 UTC]      0B 37/
[2020-11-17 17:04:29 UTC]      0B 38/
[2020-11-17 17:04:29 UTC]      0B 39/
[2020-11-17 17:04:29 UTC]      0B 4/
[2020-11-17 17:04:29 UTC]      0B 40/
[2020-11-17 17:04:29 UTC]      0B 41/
[2020-11-17 17:04:29 UTC]      0B 42/
[2020-11-17 17:04:29 UTC]      0B 43/
[2020-11-17 17:04:29 UTC]      0B 44/
[2020-11-17 17:04:29 UTC]      0B 45/
[2020-11-17 17:04:29 UTC]      0B 46/
[2020-11-17 17:04:29 UTC]      0B 47/
[2020-11-17 17:04:29 UTC]      0B 48/
[2020-11-17 17:04:29 UTC]      0B 49/
[2020-11-17 17:04:29 UTC]      0B 5/
[2020-11-17 17:04:29 UTC]      0B 50/
[2020-11-17 17:04:29 UTC]      0B 51/
[2020-11-17 17:04:29 UTC]      0B 52/
[2020-11-17 17:04:29 UTC]      0B 53/
[2020-11-17 17:04:29 UTC]      0B 54/
[2020-11-17 17:04:29 UTC]      0B 55/
[2020-11-17 17:04:29 UTC]      0B 56/
[2020-11-17 17:04:29 UTC]      0B 57/
[2020-11-17 17:04:29 UTC]      0B 58/
[2020-11-17 17:04:29 UTC]      0B 59/
[2020-11-17 17:04:29 UTC]      0B 6/
[2020-11-17 17:04:29 UTC]      0B 60/
[2020-11-17 17:04:29 UTC]      0B 61/
[2020-11-17 17:04:29 UTC]      0B 62/
[2020-11-17 17:04:29 UTC]      0B 63/
[2020-11-17 17:04:29 UTC]      0B 64/
[2020-11-17 17:04:29 UTC]      0B 65/
[2020-11-17 17:04:29 UTC]      0B 66/
[2020-11-17 17:04:29 UTC]      0B 67/
[2020-11-17 17:04:29 UTC]      0B 68/
[2020-11-17 17:04:29 UTC]      0B 69/
[2020-11-17 17:04:29 UTC]      0B 7/
[2020-11-17 17:04:29 UTC]      0B 70/
[2020-11-17 17:04:29 UTC]      0B 71/
[2020-11-17 17:04:29 UTC]      0B 72/
[2020-11-17 17:04:29 UTC]      0B 73/
[2020-11-17 17:04:29 UTC]      0B 74/
[2020-11-17 17:04:29 UTC]      0B 75/
[2020-11-17 17:04:29 UTC]      0B 76/
[2020-11-17 17:04:29 UTC]      0B 77/
[2020-11-17 17:04:29 UTC]      0B 78/
[2020-11-17 17:04:29 UTC]      0B 79/
[2020-11-17 17:04:29 UTC]      0B 8/
[2020-11-17 17:04:29 UTC]      0B 80/
[2020-11-17 17:04:29 UTC]      0B 81/
[2020-11-17 17:04:29 UTC]      0B 82/
[2020-11-17 17:04:29 UTC]      0B 83/
[2020-11-17 17:04:29 UTC]      0B 84/
[2020-11-17 17:04:29 UTC]      0B 85/
[2020-11-17 17:04:29 UTC]      0B 86/
[2020-11-17 17:04:29 UTC]      0B 87/
[2020-11-17 17:04:29 UTC]      0B 88/
[2020-11-17 17:04:29 UTC]      0B 89/
[2020-11-17 17:04:29 UTC]      0B 9/
[2020-11-17 17:04:29 UTC]      0B 90/
[2020-11-17 17:04:29 UTC]      0B 91/
[2020-11-17 17:04:29 UTC]      0B 92/
[2020-11-17 17:04:29 UTC]      0B 93/
[2020-11-17 17:04:29 UTC]      0B 94/
[jamoore@idrftp-ftp ~]$ mc ls idr-upload/idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s2/
[2020-11-12 16:53:58 UTC]    143B attributes.json
[2020-11-17 17:04:32 UTC]      0B 0/
[2020-11-17 17:04:32 UTC]      0B 1/
[2020-11-17 17:04:32 UTC]      0B 10/
[2020-11-17 17:04:32 UTC]      0B 11/
[2020-11-17 17:04:32 UTC]      0B 12/
[2020-11-17 17:04:32 UTC]      0B 13/
[2020-11-17 17:04:32 UTC]      0B 14/
[2020-11-17 17:04:32 UTC]      0B 15/
[2020-11-17 17:04:32 UTC]      0B 16/
[2020-11-17 17:04:32 UTC]      0B 17/
[2020-11-17 17:04:32 UTC]      0B 18/
[2020-11-17 17:04:32 UTC]      0B 19/
[2020-11-17 17:04:32 UTC]      0B 2/
[2020-11-17 17:04:32 UTC]      0B 20/
[2020-11-17 17:04:32 UTC]      0B 21/
[2020-11-17 17:04:32 UTC]      0B 22/
[2020-11-17 17:04:32 UTC]      0B 23/
[2020-11-17 17:04:32 UTC]      0B 24/
[2020-11-17 17:04:32 UTC]      0B 25/
[2020-11-17 17:04:32 UTC]      0B 26/
[2020-11-17 17:04:32 UTC]      0B 27/
[2020-11-17 17:04:32 UTC]      0B 28/
[2020-11-17 17:04:32 UTC]      0B 29/
[2020-11-17 17:04:32 UTC]      0B 3/
[2020-11-17 17:04:32 UTC]      0B 30/
[2020-11-17 17:04:32 UTC]      0B 31/
[2020-11-17 17:04:32 UTC]      0B 32/
[2020-11-17 17:04:32 UTC]      0B 33/
[2020-11-17 17:04:32 UTC]      0B 34/
[2020-11-17 17:04:32 UTC]      0B 35/
[2020-11-17 17:04:32 UTC]      0B 36/
[2020-11-17 17:04:32 UTC]      0B 37/
[2020-11-17 17:04:32 UTC]      0B 38/
[2020-11-17 17:04:32 UTC]      0B 39/
[2020-11-17 17:04:32 UTC]      0B 4/
[2020-11-17 17:04:32 UTC]      0B 40/
[2020-11-17 17:04:32 UTC]      0B 41/
[2020-11-17 17:04:32 UTC]      0B 42/
[2020-11-17 17:04:32 UTC]      0B 43/
[2020-11-17 17:04:32 UTC]      0B 44/
[2020-11-17 17:04:32 UTC]      0B 45/
[2020-11-17 17:04:32 UTC]      0B 46/
[2020-11-17 17:04:32 UTC]      0B 47/
[2020-11-17 17:04:32 UTC]      0B 48/
[2020-11-17 17:04:32 UTC]      0B 49/
[2020-11-17 17:04:32 UTC]      0B 5/
[2020-11-17 17:04:32 UTC]      0B 50/
[2020-11-17 17:04:32 UTC]      0B 51/
[2020-11-17 17:04:32 UTC]      0B 52/
[2020-11-17 17:04:32 UTC]      0B 53/
[2020-11-17 17:04:32 UTC]      0B 54/
[2020-11-17 17:04:32 UTC]      0B 55/
[2020-11-17 17:04:32 UTC]      0B 56/
[2020-11-17 17:04:32 UTC]      0B 57/
[2020-11-17 17:04:32 UTC]      0B 58/
[2020-11-17 17:04:32 UTC]      0B 59/
[2020-11-17 17:04:32 UTC]      0B 6/
[2020-11-17 17:04:32 UTC]      0B 60/
[2020-11-17 17:04:32 UTC]      0B 61/
[2020-11-17 17:04:32 UTC]      0B 62/
[2020-11-17 17:04:32 UTC]      0B 63/
[2020-11-17 17:04:32 UTC]      0B 64/
[2020-11-17 17:04:32 UTC]      0B 65/
[2020-11-17 17:04:32 UTC]      0B 66/
[2020-11-17 17:04:32 UTC]      0B 67/
[2020-11-17 17:04:32 UTC]      0B 68/
[2020-11-17 17:04:32 UTC]      0B 69/
[2020-11-17 17:04:32 UTC]      0B 7/
[2020-11-17 17:04:32 UTC]      0B 70/
[2020-11-17 17:04:32 UTC]      0B 71/
[2020-11-17 17:04:32 UTC]      0B 8/
[2020-11-17 17:04:32 UTC]      0B 9/
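
Listings like the ones above can be compared against the local folder names with a short script. This is a sketch with made-up function names; it assumes the `mc ls` line format shown here (timestamp in brackets, size, name). A name appearing in the listing only means that at least one object exists under that prefix, not that the subfolder was copied completely, and the listing itself may be truncated.

```python
import re

# Matches lines such as "[2020-11-17 17:04:15 UTC]      0B 0/"
LINE = re.compile(r"^\[[^\]]+\]\s+\S+\s+(?P<name>\S+)$")


def listed_names(mc_ls_output):
    """Extract the entry names (without trailing slash) from `mc ls` output."""
    names = set()
    for line in mc_ls_output.splitlines():
        match = LINE.match(line.strip())
        if match:
            names.add(match.group("name").rstrip("/"))
    return names


def missing_subfolders(local_names, mc_ls_output):
    """Local subfolder names that do not appear in the remote listing."""
    remote = listed_names(mc_ls_output)
    # Sort by length first so that '2' comes before '10'.
    return sorted(set(local_names) - remote, key=lambda n: (len(n), n))


sample = """\
[2020-11-12 16:53:58 UTC]    143B attributes.json
[2020-11-17 17:04:32 UTC]      0B 0/
[2020-11-17 17:04:32 UTC]      0B 1/
[2020-11-17 17:04:32 UTC]      0B 10/
"""
print(missing_subfolders(["0", "1", "2", "10", "11"], sample))  # ['2', '11']
```

In practice `local_names` could come from `os.listdir` on the corresponding local level, and a sync would then only be started for the returned subfolders.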

@tischi
Owner Author

tischi commented Nov 17, 2020

@joshmoore
Using sync I am now getting this error:

bash-4.2$ /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 sync /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1/94 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1/94
fatal error: Read timeout on endpoint URL: "https://idr-ftp.openmicroscopy.org/idr-upload?list-type=2&prefix=tischi%2Fsbem-6dpf-1-whole-raw.n5%2Fsetup0%2Ftimepoint0%2Fs1%2F94%2F&encoding-type=url"

any ideas?

@tischi
Owner Author

tischi commented Nov 17, 2020

@joshmoore
And using cp I am also getting an error:

upload failed: ../../g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1/94/100/0 to s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1/94/100/0 An error occurred (ServiceUnavailable) when calling the PutObject operation (reached max retries: 4): Please reduce your request rate.

Maybe the server is kind of down?
