When attempting to edit dataset, told 'file does not exist' #15

Open
physicsnerd opened this issue Jul 10, 2021 · 3 comments · May be fixed by #17

@physicsnerd

I uploaded a large dataset of Amazon reviews to Kobra. It's almost 100 MB, so definitely not small. I wanted to rename one of the columns once it was in there, so I used the mini-editor in Kobra. When I tried to save the change, it gave an error saying the file 'does not exist'. I tried to print a random value from the dataset, and it just prints null. However, Kobra doesn't throw any errors when I set a data variable to the dataset I uploaded, and the dataset shows as uploaded (when I clicked to edit it, all the values showed up normally as well, so it's clearly there). I checked the docs and uploaded the file just as they describe, and it's a standard .csv file. The dataset itself is available on Kaggle here, and it's been used in research before.

Am I doing something wrong?

I'm working on a pretty new PC running Windows 10, and I use Chrome, so I don't think it's my machine.

Screenshot of code/workspace (you can also see the null values from the print statement):

code

Screenshot of error when trying to edit file:

error

Let me know if there are any other pieces of info I can provide! Kobra seems like a very cool tool that I'm hoping to continue using.

@Merlin04
Member

Thanks so much for letting us know! It appears that the issue is with the datasets API. For some reason there is an error internally and it is returning a key of undefined when that dataset file is uploaded. Kobra doesn't know that there was an error (it just thinks that undefined is the key that the server assigned the dataset), so it stores the dataset in the local cache. When you try to save edits to the dataset, you get a "file doesn't exist" error because there isn't actually a dataset called undefined.
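
As a rough illustration of the failure mode described above, here is a minimal sketch of how a client could refuse to cache a dataset when the server does not return a usable key. Everything in it is an assumption for illustration: `extractDatasetKey`, `cacheUploadedDataset`, `localCache`, and the `{ key }` response shape are hypothetical names, not Kobra's actual code or the datasets API's actual response format.

```javascript
// Hypothetical helper: the real Kobra client and datasets API may look different.
// It rejects a missing key, an empty key, and the literal string "undefined".
function extractDatasetKey(response) {
  const key = response == null ? undefined : response.key;
  if (typeof key !== "string" || key.length === 0 || key === "undefined") {
    throw new Error("Dataset upload failed: server did not return a valid key");
  }
  return key;
}

// Hypothetical in-memory stand-in for the local dataset cache.
const localCache = new Map();

function cacheUploadedDataset(response, rows) {
  // Throws before anything is written, so a failed upload never reaches the cache.
  const key = extractDatasetKey(response);
  localCache.set(key, rows);
  return key;
}
```

With a check like this, a failed upload would surface as an error at upload time rather than later as a "file doesn't exist" error when saving edits.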

I'm not too familiar with how the actual dataset API works; @veritem wrote most of it and can probably take the debugging from here.

The null values appear to be a separate issue. It seems to occur on this line:

const dataset = parsedData
    // This line is the problem
    .map((element) => String(element).split(",").map(Number))
    .slice(1, parsedData.length);

This is trying to convert each value of the dataset from a string to a number. I'd guess that it's being thrown off by some of the values having commas in them, but I'm not entirely sure. @pranavnt probably wrote this; can we switch to something like this?

const dataset = parsedData
    .slice(1, parsedData.length)
    .map(row => row.map(val => Number(val)));
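
To make the guess about commas concrete, here is a small sketch comparing the two versions on made-up data. It assumes `parsedData` is an array of rows, each row an array of string cells with the header first; that shape and the sample values are assumptions, not taken from Kobra's parser.

```javascript
// Hypothetical sample: a header row plus one review row whose text contains a comma.
const parsedData = [
  ["rating", "review"],
  ["5", "Great, would buy again"],
];

// Original approach: String(row) joins the cells with commas, so the comma
// inside the review text is treated as a cell boundary when splitting again.
const original = parsedData
  .map((element) => String(element).split(",").map(Number))
  .slice(1, parsedData.length);

// Suggested approach: convert each cell on its own, keeping the column count.
const suggested = parsedData
  .slice(1, parsedData.length)
  .map((row) => row.map((val) => Number(val)));

console.log(original);  // one row with three entries: 5, NaN, NaN
console.log(suggested); // one row with two entries: 5, NaN
console.log(JSON.stringify(suggested)); // [[5,null]]
```

In this sample the original version turns a two-column row into three values, while the suggested version keeps two. `Number()` still returns `NaN` for non-numeric text in both versions, and `JSON.stringify` writes `NaN` as `null`, which may be where the printed nulls come from; text columns would probably need to stay as strings, though the thread doesn't say how Kobra should treat them.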

@Merlin04
Member

Merlin04 commented Jul 10, 2021

I'll split the null values issue out into a new one and also transfer this to the datasets-api repo

@veritem
Contributor

veritem commented Jul 10, 2021

Our current setup can't handle large files. The error was caused by this API, which didn't finish uploading the dataset.

@veritem veritem linked a pull request Jul 10, 2021 that will close this issue
4 participants