Cannot get file with Datalad #3513

Closed
mathdugre opened this issue Jul 4, 2019 · 3 comments

@mathdugre

What is the problem?

I installed corr/RawDataBIDS from datasets.datalad.org, but I cannot get the files using datalad, while git-annex works fine. For instance, trying to get the file participants.tsv in NYU_2 fails silently:

(dask-dist) [centos@scheduler NYU_2]$ datalad get -v participants.tsv 
(dask-dist) [centos@scheduler NYU_2]$ file participants.tsv 
participants.tsv: broken symbolic link to .git/annex/objects/qF/8V/MD5E-s1896--439b33f3e4d39bb3502c0e7d2fc2c9b7.tsv/MD5E-s1896--439b33f3e4d39bb3502c0e7d2fc2c9b7.tsv

... while with git-annex it works fine:

(dask-dist) [centos@scheduler NYU_2]$ git annex get participants.tsv 
get participants.tsv (from web...) 
(checksum...) ok
(recording state in git...)
(dask-dist) [centos@scheduler NYU_2]$ file participants.tsv 
participants.tsv: symbolic link to .git/annex/objects/qF/8V/MD5E-s1896--439b33f3e4d39bb3502c0e7d2fc2c9b7.tsv/MD5E-s1896--439b33f3e4d39bb3502c0e7d2fc2c9b7.tsv

What steps will reproduce the problem?

  • datalad install ///corr/RawDataBIDS -r -g -J8
  • cd corr/RawDataBIDS/NYU_2
  • datalad get participants.tsv # fails silently
  • git annex get participants.tsv # downloads the file
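
The `file` output above distinguishes the two outcomes: a broken symbolic link means the annexed content is still absent. A minimal Python sketch of the same check, useful for confirming whether a `get` actually fetched anything; the helper name `is_broken_symlink` is hypothetical and not part of DataLad or git-annex:

```python
import os


def is_broken_symlink(path):
    """Return True if path is a symlink whose target does not exist.

    os.path.islink() inspects the link itself, while os.path.exists()
    follows it, so the combination detects a dangling link.
    """
    return os.path.islink(path) and not os.path.exists(path)


if __name__ == "__main__":
    # Run from inside the dataset directory, e.g. NYU_2.
    print(is_broken_symlink("participants.tsv"))
```

If this reports True after `datalad get` returned without an error, the content was not retrieved.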

What version of DataLad are you using (run datalad --version)? On what operating system (consider running datalad wtf)?

(dask-dist) [centos@scheduler NYU_2]$ datalad wtf
# WTF
## configuration <SENSITIVE, report disabled by configuration>
## datalad 
  - version: 0.11.5
  - full_version: 0.11.5
## dataset 
  - path: /nfs/bids-data/RawDataBIDS/NYU_2
  - repo: AnnexRepo
  - metadata: <SENSITIVE, report disabled by configuration>
## dependencies 
  - cmd:annex: 7.20190626-g85db42d
  - tqdm: 4.32.2
  - cmd:git: 2.19.1
  - cmd:system-git: 2.19.1
  - cmd:system-ssh: 7.4p1
  - appdirs: 1.4.3
  - boto: 2.49.0
  - git: 2.1.11
  - gitdb: 2.0.5
  - humanize: 0.5.1
  - iso8601: 0.1.12
  - keyring: 19.0.2
  - keyrings.alt: 3.1.1
  - msgpack: 0.6.1
  - requests: 2.22.0
  - six: 1.12.0
  - wrapt: 1.11.2
## environment 
  - PYTHON_PATH: :/nfs/SOEN-499-Project
  - PATH: /home/centos/miniconda3/envs/dask-dist/bin:/home/centos/miniconda3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/nfs/spark/bin:/home/centos/.local/bin:/home/centos/bin
  - LANG: en_CA.UTF-8
  - GIT_PYTHON_GIT_EXECUTABLE: /home/centos/miniconda3/envs/dask-dist/bin/git
## extensions 
## git-annex 
  - version: 7.20190626-g85db42d
  - build flags: 
    - Assistant
    - Webapp
    - Pairing
    - S3
    - WebDAV
    - Inotify
    - DBus
    - DesktopNotify
    - TorrentParser
    - MagicMime
    - Feeds
    - Testsuite
  - dependency versions: 
    - aws-0.21.1
    - bloomfilter-2.0.1.0
    - cryptonite-0.25
    - DAV-1.3.3
    - feed-1.0.1.0
    - ghc-8.4.2
    - http-client-0.5.14
    - persistent-sqlite-2.9.2
    - torrent-10000.1.1
    - uuid-1.3.13
    - yesod-1.6.0
  - key/value backends: 
    - SHA256E
    - SHA256
    - SHA512E
    - SHA512
    - SHA224E
    - SHA224
    - SHA384E
    - SHA384
    - SHA3_256E
    - SHA3_256
    - SHA3_512E
    - SHA3_512
    - SHA3_224E
    - SHA3_224
    - SHA3_384E
    - SHA3_384
    - SKEIN256E
    - SKEIN256
    - SKEIN512E
    - SKEIN512
    - BLAKE2B256E
    - BLAKE2B256
    - BLAKE2B512E
    - BLAKE2B512
    - BLAKE2B160E
    - BLAKE2B160
    - BLAKE2B224E
    - BLAKE2B224
    - BLAKE2B384E
    - BLAKE2B384
    - BLAKE2S256E
    - BLAKE2S256
    - BLAKE2S160E
    - BLAKE2S160
    - BLAKE2S224E
    - BLAKE2S224
    - BLAKE2SP256E
    - BLAKE2SP256
    - BLAKE2SP224E
    - BLAKE2SP224
    - SHA1E
    - SHA1
    - MD5E
    - MD5
    - WORM
    - URL
  - remote types: 
    - git
    - gcrypt
    - p2p
    - S3
    - bup
    - directory
    - rsync
    - web
    - bittorrent
    - webdav
    - adb
    - tahoe
    - glacier
    - ddar
    - hook
    - external
  - operating system: linux x86_64
  - supported repository versions: 
    - 5
    - 7
  - upgrade supported from repository versions: 
    - 0
    - 1
    - 2
    - 3
    - 4
    - 5
    - 6
  - local repository version: 5
## location 
  - path: /nfs/bids-data/RawDataBIDS/NYU_2
  - type: dataset
## metadata_extractors 
  - annex: 
    - module: datalad.metadata.extractors.annex
    - version: None
    - load_error: None
  - audio: 
    - module: datalad.metadata.extractors.audio
    - load_error: No module named 'mutagen' [audio.py:<module>:17]
  - datacite: 
    - module: datalad.metadata.extractors.datacite
    - version: None
    - load_error: None
  - datalad_core: 
    - module: datalad.metadata.extractors.datalad_core
    - version: None
    - load_error: None
  - datalad_rfc822: 
    - module: datalad.metadata.extractors.datalad_rfc822
    - version: None
    - load_error: None
  - exif: 
    - module: datalad.metadata.extractors.exif
    - load_error: No module named 'exifread' [exif.py:<module>:16]
  - frictionless_datapackage: 
    - module: datalad.metadata.extractors.frictionless_datapackage
    - version: None
    - load_error: None
  - image: 
    - module: datalad.metadata.extractors.image
    - version: None
    - load_error: None
  - xmp: 
    - module: datalad.metadata.extractors.xmp
    - load_error: No module named 'libxmp' [xmp.py:<module>:20]
## system 
  - type: posix
  - name: Linux
  - release: 3.10.0-862.11.6.el7.x86_64
  - version: #1 SMP Tue Aug 14 21:49:04 UTC 2018
  - distribution: centos/7.5.1804/Core
  - max_path_length: 288
  - encoding: 
    - default: utf-8
    - filesystem: utf-8
    - locale.prefered: UTF-8

Is there anything else that would be useful to know in this context?

Have you had any success using DataLad before? (to assess your expertise/prior luck. We would welcome your testimonial additions to https://github.com/datalad/datalad/wiki/Testimonials as well)

First time user

@yarikoptic
Member

A recent git-annex release has a regression affecting datalad: http://git-annex.branchable.com/bugs/Regression_in___96__find_--json__96___output/
I bet @joeyh will fix it soon. But looking forward, ideally we should make our code more robust and fail with an informative message indeed.

kyleam added a commit to kyleam/datalad that referenced this issue Jul 8, 2019
On git-annex 7.20190626, 'find --json' produces regular output rather
than json.  In addition to being broken, this is problematic because,
before the previous commit, we silently dropped lines that didn't
start with "{".  In the case of 'datalad get', this led to us
incorrectly claiming that an absent file was present.

We now fail with a RuntimeError in this situation, but let's raise a
more informative error when 'find --json' is used with git-annex
7.20190626.

This has been fixed in b8ef1bf3b (Fix find --json to output json once
more, 2019-07-05).

Re: http://git-annex.branchable.com/bugs/Regression_in___96__find_--json__96___output/
Closes datalad#3513.
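
The failure mode described in the commit message (silently dropping lines that do not start with "{") can be illustrated with a stricter parser. This is only a sketch of the general idea, not DataLad's actual code; the function name `parse_json_lines` is hypothetical, and it assumes one JSON object per output line:

```python
import json


def parse_json_lines(output):
    """Parse one JSON object per non-empty line, failing loudly otherwise."""
    records = []
    for lineno, line in enumerate(output.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines carry no information
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            # Raising here, instead of skipping the line, is what keeps an
            # absent file from being reported as present.
            raise RuntimeError(
                f"expected JSON on line {lineno}, got: {line!r}"
            ) from exc
        if not isinstance(record, dict):
            raise RuntimeError(
                f"expected a JSON object on line {lineno}, got: {line!r}"
            )
        records.append(record)
    return records
```

With plain-text output such as a bare file name per line, this raises a RuntimeError rather than returning an empty list.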
@kyleam
Contributor

kyleam commented Jul 8, 2019

fail with an informative message

gh-3516 does this.

@kyleam
Contributor

kyleam commented Jul 8, 2019

The latest git-annex release (7.20190708) includes the find --json fix.
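
For anyone who wants to check whether an installed git-annex is the affected release, here is a rough sketch. It assumes the version string has the form shown in the `datalad wtf` output above (for example `7.20190626-g85db42d`) and that only 7.20190626 is affected, as this thread indicates; the helper name `has_find_json_regression` is hypothetical:

```python
# Release named in this thread as producing non-JSON 'find --json' output.
AFFECTED_RELEASE = "7.20190626"


def has_find_json_regression(version):
    """Return True if the git-annex version string is the affected release.

    The part after the first '-' (e.g. '-g85db42d') is a build suffix and
    is ignored for the comparison.
    """
    base = version.strip().split("-", 1)[0]
    return base == AFFECTED_RELEASE
```

Upgrading to 7.20190708 or later, as noted above, avoids the problem.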

@kyleam kyleam closed this as completed in 0863ca8 Jul 8, 2019