GIT LFS seems to have corrupted data online. Local machine data is still fine. Fails while executing on Google Colab #3598

Closed
SachitNayak opened this issue Apr 7, 2019 · 3 comments

@SachitNayak

commented Apr 7, 2019

Hey folks,

I just found out that a huge chunk of the data I stored on LFS can no longer be parsed properly. My code is just skipping over the datasets stored on LFS and not parsing them at all. This is terrible; none of the data I stored seems to work anymore.

on line 0
can't parse line 0 so skipping
version https://git-lfs.github.com/spec/v1
...
can't parse line 1 so skipping
oid sha256:91125602f730fea7ca768736c6f442e668b49db095682bf2aad375db061c21ed
...
vector for line 2 has size 1 so skipping
size 1037962819
...
on line 0
can't parse line 0 so skipping
version https://git-lfs.github.com/spec/v1
...
can't parse line 1 so skipping
oid sha256:91125602f730fea7ca768736c6f442e668b49db095682bf2aad375db061c21ed
...
vector for line 2 has size 1 so skipping
size 1037962819
...
writing vectors!
CONSTRUCTING DEV SETS
WRITING LABEL MAPPINGS
  Writing label mapping for CHUNK BIOES
  0 classes

Now I'm really irritated that all my efforts over the past 4 days seem to have been completely wiped out. Is there something that can make the data parseable again? Fix my LFS gitattributes, maybe? On my local machine the data is fine and works as intended, generating 42 classes in the end (unlike the 0 classes shown above).

I'm wondering if my issue is similar to this https://github.com/git-lfs/git-lfs/issues/3531
or this https://github.com/git-lfs/git-lfs/issues/2503

Please help me understand how to efficiently use git LFS.

@SachitNayak

Author

commented Apr 8, 2019

I decided to store the 10 GB of data on my Google Drive and connect it to my Colab notebook instead, and that seems to work fine now. My experience with Git LFS has been really depressing.

@SachitNayak SachitNayak closed this Apr 8, 2019

@bk2204

Contributor

commented Apr 8, 2019

Hey, sorry to hear you were having trouble.

What it looks like happened here, just for future reference, is that the system you were checking out on didn't have Git LFS installed for the repository, which prevented the LFS objects from being checked out. As a consequence, the only things that server saw were the pointer files, which couldn't be parsed by your tool. Your data on the remote isn't corrupted. If you make sure the remote server has Git LFS installed and run git lfs install before checking out (or install it after the checkout and then run git lfs checkout), then you should be able to get your files.
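
For anyone hitting the same symptom, a small check like the one below can tell a pointer file apart from real data before the parser runs. This is only a sketch: the function names and the vectors.txt path are hypothetical, and the 1024-byte cutoff is an assumption based on pointer files being tiny text stubs like the three lines shown in the log above.

```python
from pathlib import Path

# First line of every pointer file, as seen in the log output above.
LFS_SPEC_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path, max_pointer_bytes=1024):
    """Return True if the file looks like a Git LFS pointer, not real data.

    max_pointer_bytes is an assumed upper bound on pointer size; real
    datasets (about 1 GB here) are far larger than that.
    """
    p = Path(path)
    if p.stat().st_size > max_pointer_bytes:
        return False
    return p.read_bytes().startswith(LFS_SPEC_PREFIX)


def parse_lfs_pointer(path):
    """Split the 'key value' lines of a pointer file into a dict of strings."""
    fields = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        key, _, value = line.partition(" ")
        if key:
            fields[key] = value
    return fields
```

Calling is_lfs_pointer("vectors.txt") before loading would let the script stop with a clear message (for example, "this is an LFS pointer, run git lfs pull") instead of silently skipping every line and ending up with 0 classes.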

@SachitNayak

Author

commented Apr 9, 2019

Oh I see. Thanks. I understood.
