S3-api permission issue #61

Closed
denvdm opened this issue Jan 15, 2021 · 31 comments

@denvdm

denvdm commented Jan 15, 2021

Getting permission errors when attempting to use s3-API on NIRD to import, export, or perform any other action. This worked fine last month. The exact error is pasted below:

ERROR: Error parsing xml: mismatched tag: line 6, column 2
ERROR: b'<html>\r\n<head><title>401 Authorization Required</title></head>\r\n<body>\r\n<center><h1>401 Authorization Required</h1></center>\r\n<hr><center>nginx/1.16.1</center>\r\n</body>\r\n</html>\r\n'
ERROR: S3 error: 401 (Unauthorized)
@ofrei
Contributor

ofrei commented Jan 18, 2021

@denvdm have you tried tacl api?
https://www.uio.no/english/services/it/research/sensitive-data/use-tsd/import-export/import-data-using-the-tsd-api.html
My feeling is that the TSD team gives low priority to maintaining tsd-s3cmd. I confirmed last week that tsd-s3cmd is officially supported, but they advise users to go for tacl whenever possible, and in the long term TSD may consider deprecating tsd-s3cmd. If there are things that tacl can't do, then we can push for maintaining tsd-s3cmd, but I'd like to check whether there is a real need for this.

Could you please try tacl next time you import/export, and tell us here if it doesn't cover your needs or is a step back compared to tsd-s3cmd?

@ofrei
Contributor

ofrei commented Jan 26, 2021

For me tacl looks promising, but currently it doesn't work. I've submitted RT ticket #4242337.

>tacl p33 --upload CorticalArea.csv
CorticalArea.csv |################################| 100%
401 Client Error: Unauthorized for url: https://api.tsd.usit.no/v1/p33/files/stream/CorticalArea.csv?group=p33-member-group
The request was unsuccesful. Exiting.

@ofrei
Contributor

ofrei commented Jan 29, 2021

I've solved my tacl --upload problem by running tacl p33 --session-delete. Now it works, also for uploads of folders.

The steps I went through: I first upgraded to tacl v3.3.1, but still got the same error. Then I tried uploading to p697: this worked, but for p33 it still failed.
I then registered tacl with the p33 project again, using "tacl --register"; this worked, but did not resolve my problem with uploads. Finally, I ran "tacl p33 --session-delete", and this helped. It's a bit strange: the issue persisted for a few days and I had to re-type my password and OTP many times, so there must be some state that is only cleaned by --session-delete.

@denvdm
Author

denvdm commented Mar 20, 2021 via email

@denvdm
Author

denvdm commented Mar 24, 2021 via email

@ofrei
Contributor

ofrei commented Mar 24, 2021

@denvdm eh, too bad... what was the error? I don't see it attached.
Also, do you use a screen session? It's best to run the sync within screen to make sure it survives a disconnect. But that's no excuse for tacl not being able to resume the session: tacl should resume just fine, so let's investigate why it doesn't.

@denvdm
Author

denvdm commented Mar 24, 2021 via email

@denvdm
Author

denvdm commented Mar 25, 2021

Re. the above: unfortunately, it turns out I did not figure it out. In the end, the error did not seem to be caused by trying to restart a running process. The transfer this time is definitely dead (the number of files in the import dir hasn't increased in hours), and when I try to start the tacl upload again (tacl p33 --upload 20250) I still get the same error, pasted below (btw, that file is definitely present). Any thoughts?

  File "/nird/home/dennisva/.local/bin/tacl", line 11, in <module>
    sys.exit(cli())
  File "/nird/home/dennisva/.local/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/nird/home/dennisva/.local/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/nird/home/dennisva/.local/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/nird/home/dennisva/.local/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/nird/home/dennisva/.local/lib/python3.6/site-packages/tsdapiclient/tacl.py", line 518, in cli
    uploader.sync()
  File "/nird/home/dennisva/.local/lib/python3.6/site-packages/tsdapiclient/sync.py", line 286, in sync
    self._transfer(resource, integrity_reference=integrity_reference)
  File "/nird/home/dennisva/.local/lib/python3.6/site-packages/tsdapiclient/sync.py", line 585, in _transfer
    resource, integrity_reference=integrity_reference
  File "/nird/home/dennisva/.local/lib/python3.6/site-packages/tsdapiclient/sync.py", line 425, in _transfer_local_to_remote
    if os.stat(resource).st_size > CHUNK_THRESHOLD:
FileNotFoundError: [Errno 2] No such file or directory: '20250//3138077_20250_2_0.zip'

@ofrei
Contributor

ofrei commented Mar 25, 2021

@denvdm Yeah, this is weird. The file definitely exists (I know where to look - checked just now), and "chmod" permissions are fine.

I've noticed that the 20250 folder within TSD had exactly 5000 files, which sounds like some sort of limit. Could you please submit a ticket to TSD-drift? And please add a link to this GitHub ticket, since there is a good discussion here.

I've tried removing one file (the text file with field lists), and then re-running the sync with tacl p33 --upload-sync 20250.
First problem: this spent ~15 minutes fetching information about the directory. That is far too slow; --upload-sync shouldn't need ~15 minutes to go over just ~5000 files, when `rsync` can do it in less than a second.

$ tacl p33 --upload-sync 20250
uploading directory 20250
fetching information about directory: 20250
fetching information about directory: 20250
fetching information about directory: 20250
fetching information about directory: 20250

Finally, after ~15 minutes, tacl started copying the same files that you already had in the target folder on TSD, so it still has 4999 files and I couldn't validate my theory about 5000 files. That's another problem: why are the same files synced again even though I'm running --upload-sync, not --upload?

@ofrei
Contributor

ofrei commented Mar 25, 2021

@denvdm btw, for me tsd-s3cmd works - I guess it could be the quickest way to resolve your data transfer.

Long term, let's push for a better tacl; it seems quite handy. If there is a limit of 5000 it's likely a quick fix, but I'm more concerned about the slow performance of --upload-sync.

@leondutoit

What behaviour do you want from the sync here? Do you want files that are removed locally (from NIRD) to be removed remotely (from TSD)?

@leondutoit

The 5000 limit sounds weird, and I cannot imagine where that would come from. I will try to reproduce it.

If this is just a directory upload, and not a sync of a routinely changing directory, then tacl p33 --upload {directory} is more appropriate since there will be no waiting and automatic resume.

If it is a sync (and you want local changes to propagate to the remote) then you need to explicitly enable caching so you get resume:

tacl --guide sync

...

By default, there is no caching for sync, because the normal
use case would be to copy a directory which has many files
in total, but only a few changing ones. If you are in control
of the changes, and you know there will not be any changes while
your transfer is running, then you can enable caching like this:

    tacl p11 --download-sync mydir --cache-sync

This will allow resuming the sync without having to query the API
and the local filesystem for its current state.

You'll be using --upload-sync though.
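
To make the question about sync behaviour concrete, here is a minimal sketch of the decision a sync has to make from two directory listings. It is not tacl's implementation: the `plan_sync` function, the `SyncPlan` type, and the use of modification times to detect changes are all assumptions made for illustration.

```python
from typing import Dict, NamedTuple, Set


class SyncPlan(NamedTuple):
    upload: Set[str]  # new or changed locally
    delete: Set[str]  # present remotely, gone locally


def plan_sync(local: Dict[str, float], remote: Dict[str, float]) -> SyncPlan:
    """Compare {relative_path: mtime} listings and decide what to transfer.

    Hypothetical helper: comparing mtimes is an assumption, not tacl's logic.
    """
    upload = {
        path
        for path, mtime in local.items()
        if path not in remote or mtime > remote[path]
    }
    delete = set(remote) - set(local)
    return SyncPlan(upload=upload, delete=delete)


local = {"d3/1.txt": 100.0, "d3/2.txt": 250.0, "d3/4.txt": 50.0}
remote = {"d3/1.txt": 100.0, "d3/2.txt": 200.0, "d3/3.txt": 80.0}
plan = plan_sync(local, remote)
print(sorted(plan.upload))  # ['d3/2.txt', 'd3/4.txt']
print(sorted(plan.delete))  # ['d3/3.txt']
```

Building both listings is the expensive part for large directories, which is why caching them only pays off when nothing changes during the transfer.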

@denvdm
Author

denvdm commented Mar 27, 2021

@leondutoit, I indeed used tacl p33 --upload dir. I don't know what may be causing the error, but as @ofrei indicated it does seem awfully coincidental that it gets stuck at such a round number. Then again, I already got an error a day earlier, when it hadn't reached this number yet (see earlier messages). Re. the specific error message: as Alex also checked, the file definitely did exist.
By the way (perhaps unfortunately wrt figuring this out), Ivan has moved on with these files and they have now been removed from the import dir. I've started a new run of tacl for the remaining files and that is running smoothly.

@leondutoit

leondutoit commented Mar 27, 2021

Ah ok, --upload {dir} should do the right thing. I tested the 5000 limit like this: mkdir -p d1 && for i in `seq 1 5001`; do mkfile 1k d1/$i.txt; done; time tacl p11 --basic --upload d1 and didn't see any issues. Conceptually each file is an independent transfer, so there should be no issue.
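
Since mkfile is a macOS command, the same test directory can be built portably with a few lines of Python, for example on NIRD. This is only a sketch for reproducing the test; the `make_test_dir` name and its parameters are hypothetical.

```python
import os
import tempfile


def make_test_dir(root: str, count: int, size: int = 1024) -> int:
    """Create `count` files of `size` bytes each under `root`.

    Returns the number of entries found in `root` afterwards.
    """
    os.makedirs(root, exist_ok=True)
    for i in range(1, count + 1):
        with open(os.path.join(root, f"{i}.txt"), "wb") as fh:
            fh.write(b"\0" * size)
    return len(os.listdir(root))


# Demo in a throwaway directory; use a real path such as "d1" before uploading.
with tempfile.TemporaryDirectory() as tmp:
    created = make_test_dir(os.path.join(tmp, "d1"), 5001)
    print(created)  # 5001
```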

@leondutoit

I can reproduce the FileNotFoundError like this:

ldt:~ leondutoit$ mkdir -p d3 && for i in `seq 1 10`; do mkfile 1k d3/$i.txt; done; tacl p11 --basic --upload d3
uploading directory d3
d3/10.txt |################################| 100%
d3/9.txt |################################| 100%
d3/8.txt |################################| 100%
d3/5.txt |################################| 100%
d3/4.txt |################################| 100%
^C
Aborted!
ldt:~ leondutoit$ rm d3/3.txt
ldt:~ leondutoit$ tacl p11 --basic --upload d3
uploading directory d3
resuming directory transfer from cache
d3/4.txt |################################| 100%
d3/6.txt |################################| 100%
d3/7.txt |################################| 100%
Traceback (most recent call last):
  File "/usr/local/bin/tacl", line 33, in <module>
    sys.exit(load_entry_point('tsd-api-client==3.3.1', 'console_scripts', 'tacl')())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/tsd_api_client-3.3.1-py3.9.egg/tsdapiclient/tacl.py", line 518, in cli
  File "/usr/local/lib/python3.9/site-packages/tsd_api_client-3.3.1-py3.9.egg/tsdapiclient/sync.py", line 286, in sync
  File "/usr/local/lib/python3.9/site-packages/tsd_api_client-3.3.1-py3.9.egg/tsdapiclient/sync.py", line 586, in _transfer
  File "/usr/local/lib/python3.9/site-packages/tsd_api_client-3.3.1-py3.9.egg/tsdapiclient/sync.py", line 427, in _transfer_local_to_remote
FileNotFoundError: [Errno 2] No such file or directory: 'd3/3.txt'

What's happening here is that the local cache lists all files in the directory when the upload starts, and removes entries as their uploads succeed. Then I cancel the upload, delete a file that has not been uploaded yet, and restart the upload. The missing local file is still listed in the cache, so the attempt to upload it fails.
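
One way a resume step could tolerate this is to skip cached entries that no longer exist locally instead of raising. The sketch below illustrates that idea only; it is not the fix applied in tacl, and `resume_upload` and its `upload` callback are hypothetical names.

```python
import os
import tempfile
from typing import Callable, List, Tuple


def resume_upload(
    cached: List[str], upload: Callable[[str], None]
) -> Tuple[List[str], List[str]]:
    """Upload every cached path that still exists; report the rest as skipped."""
    done: List[str] = []
    skipped: List[str] = []
    for path in cached:
        if not os.path.isfile(path):
            skipped.append(path)  # deleted locally after the cache was written
            continue
        try:
            upload(path)
        except FileNotFoundError:
            skipped.append(path)  # vanished between the check and the read
            continue
        done.append(path)
    return done, skipped


with tempfile.TemporaryDirectory() as tmp:
    paths = [os.path.join(tmp, f"{i}.txt") for i in range(1, 4)]
    for p in paths:
        with open(p, "wb") as fh:
            fh.write(b"x")
    os.remove(paths[1])  # simulate deleting a file between two runs
    # Stand-in for a real transfer: just open and close the file.
    done, skipped = resume_upload(paths, upload=lambda p: open(p, "rb").close())
    print([os.path.basename(p) for p in done])     # ['1.txt', '3.txt']
    print([os.path.basename(p) for p in skipped])  # ['2.txt']
```

Reporting the skipped paths rather than silently dropping them lets the user notice that the cache and the directory have diverged.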

@denvdm
Author

denvdm commented Mar 27, 2021 via email

@leondutoit

@ofrei I made a configuration change on the server, which sped up the scanning part of my sync of 5001 files from 2min to 9sec.

@leondutoit

I'll assume this is not an issue anymore, but if it still is, just ping me here.

@ofrei
Contributor

ofrei commented Apr 6, 2021

@leondutoit Thank you for fixing this! I'm busy (major grant deadline this Thursday), will re-test tacl sync performance on Friday.
@denvdm are you still running some of the large-scale transfers, or are you already done by now?

@denvdm
Author

denvdm commented Apr 6, 2021 via email

@leondutoit

@ofrei the latest release of tacl, https://pypi.org/project/tsd-api-client/3.4.0/, includes better sync performance, better Windows support, and some other small improvements. Feel free to give it a go.

@ofrei
Contributor

ofrei commented Apr 19, 2021

@leondutoit Upgraded to tacl 3.4.0, and now testing sync of a GitHub repo. From time to time it gives the following error:


DEBUG streaming data to https://api.tsd.usit.no/v1/p697/files/stream/p697-member-group/github/norment/moba_qc_imputation/.git/logs/HEAD?group=p697-member-group

DEBUG reading file: github/norment/moba_qc_imputation/.git/logs/HEAD
github/norment/moba_qc_imputation/.git/logs/HEAD
DEBUG reading chunk
github/norment/moba_qc_imputation/.git/logs/HEAD |################################| 100%
DEBUG chunk read complete

DEBUG reading chunk
github/norment/moba_qc_imputation/.git/logs/HEAD |################################| 100%
DEBUG no more data to read

405 Client Error: Method Not Allowed for url: https://api.tsd.usit.no/v1/p697/files/stream/p697-member-group/github/norment/moba_qc_imputation/.git/logs/HEAD?group=p697-member-group
The request was unsuccesful. Exiting.

@leondutoit

@ofrei Ah yes, I saw that too. It is a config option on the server side which has to allow deletion of files, which I forgot to set. Will do later today.

@ofrei
Contributor

ofrei commented Apr 20, 2021

@leondutoit Thank you! Happy to re-test when this is ready.
I've also noticed that the files are synced in an unpredictable order. I.e., if I sync 3 folders with 10 files each, the order of the files will be completely random, interleaving folders at random. Is there a way to fix this? Perhaps it's as simple as adding "sorted", or changing some data structure to something like OrderedDict?
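
As an illustration of what "adding sorted" could mean, here is a minimal sketch of a directory walk with a stable order. It assumes the traversal is based on os.walk, which may not match how tacl lists files; `walk_sorted` is a hypothetical name.

```python
import os
import tempfile
from typing import Iterator


def walk_sorted(root: str) -> Iterator[str]:
    """Yield file paths under `root` in a stable, lexicographic order."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort steers the order os.walk descends in
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


with tempfile.TemporaryDirectory() as tmp:
    # Create folders and files in reverse order on purpose.
    for folder in ("b", "a"):
        os.makedirs(os.path.join(tmp, folder))
        for name in ("2.txt", "1.txt"):
            open(os.path.join(tmp, folder, name), "w").close()
    ordered = [os.path.relpath(p, tmp) for p in walk_sorted(tmp)]
    print(ordered)
```

The trade-off is that each directory listing has to be held in memory and sorted before the first file from it can be transferred.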

@leondutoit

The order of the upload? Why is this a problem?

@ofrei
Contributor

ofrei commented Apr 21, 2021

It's not a big problem, but any non-deterministic behaviour is less user-friendly than when things happen in a well-determined order.
Take the error that I reported as an example. tacl worked fine syncing a few files, then it encountered a delete operation and failed. I restarted, and it started syncing some other files until the next delete operation. So the failure looked quite unpredictable to me; I couldn't even see whether it's related to a specific file.

@leondutoit

I already explained the delete issue. As for the "non-deterministic order", this is how Python returns directory entries:

In [81]: os.listdir?
Signature: os.listdir(path=None)
Docstring:
Return a list containing the names of the files in the directory.

path can be specified as either str, bytes, or a path-like object.  If path is bytes,
  the filenames returned will also be bytes; in all other circumstances
  the filenames returned will be str.
If path is None, uses the path='.'.
On some platforms, path may also be specified as an open file descriptor;\
  the file descriptor must refer to a directory.
  If this functionality is unavailable, using it raises NotImplementedError.

The list is in arbitrary order.  It does not include the special
entries '.' and '..' even if they are present in the directory.
Type:      builtin_function_or_method

Note: "The list is in arbitrary order." I'm not going to sacrifice performance and memory usage to force a certain order.

@ofrei
Contributor

ofrei commented Apr 21, 2021 via email

@leondutoit

The delete issue should be fixed now.

@ofrei
Contributor

ofrei commented Apr 26, 2021

@leondutoit I upgraded tacl, re-ran --upload-sync, and still have this issue:

405 Client Error: Method Not Allowed for url: https://api.tsd.usit.no/v1/p697/files/stream/p697-member-group/tsd_monitoring/.git/logs/refs/heads/master?group=p697-member-group

However, I'm closing this ticket: the original question from @denvdm is solved and we're back to using tsd-s3cmd.

@ofrei ofrei closed this as completed Apr 26, 2021
@leondutoit

Can't reproduce btw:

ldt:~ leondutoit$ rm d3/10.txt
ldt:~ leondutoit$ tacl  p11 --upload-sync d3
uploading directory d3
fetching information about directory: d3
deleting: d3/10.txt
