Import CSV #6

Closed
mwessley opened this issue Jul 25, 2017 · 3 comments

@mwessley

The process for importing your own database is not very clear.
From my understanding, each input sample has to be saved in a separate file.
Is that right?

Is it possible to import a database where the inputs are saved in one file
and the desired labels are saved in a separate file?

@olivierbichler-cea
Contributor

Hi. There are several ways to import your own dataset.
For data with a single label per sample, you can save each sample to a separate file and sort the files into sub-folders, one per label. You can then use the built-in DIR_Database database driver to load your data, as explained in the manual.
If all inputs are saved in one file and the labels in a separate file, I guess you are looking for a driver similar to MNIST_IDX_Database. Depending on how the data is stored, though, you may need to write your own database driver module.
Do you have an example of the input data and labels you want to handle? Are they images, or vectors of numbers?
Then we can see what the best way to handle them is!
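
For illustration, the "one file per sample, one sub-folder per label" layout described above could be produced from a single inputs file plus a separate labels file with a small script. This is only a sketch under assumptions: the function name `split_into_label_folders`, the `sample_NNNNNN.csv` file naming, and the idea that both files are line-aligned CSV are hypothetical, and whether DIR_Database can read the resulting per-sample file format is not confirmed in this thread.

```python
import csv
from pathlib import Path


def split_into_label_folders(inputs_path, labels_path, out_dir):
    """Write one file per sample into out_dir/<label>/.

    Assumes (hypothetically) that line i of the inputs file and line i of
    the labels file describe the same sample, and that the label is the
    first field of its line. If the files differ in length, the extra
    lines of the longer one are ignored.
    """
    out_dir = Path(out_dir)
    counts = {}
    with open(inputs_path, newline="") as fin, open(labels_path, newline="") as flab:
        pairs = zip(csv.reader(fin), csv.reader(flab))
        for index, (row, label_row) in enumerate(pairs):
            label = label_row[0].strip()
            label_dir = out_dir / label
            label_dir.mkdir(parents=True, exist_ok=True)
            # Hypothetical naming scheme: the sample index keeps names unique.
            sample_path = label_dir / f"sample_{index:06d}.csv"
            with open(sample_path, "w", newline="") as fout:
                csv.writer(fout).writerow(row)
            counts[label] = counts.get(label, 0) + 1
    return counts
```

Calling `split_into_label_folders("inputs.csv", "labels.csv", "dataset")` would return the number of samples written per label, which is a quick sanity check that the two files were paired as expected.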

@mwessley
Author

Thank you for the fast response. You are right, the data is stored in an MNIST-like way.
For example, the data is stored one input per line, comma-separated:
2.070008516069267,4.339715409523769,-1.2698122291539113,-7.00010533569940957

As the input should be 2D, a reshape needs to be performed, but as far as I have seen this can be handled by N2D2.
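
To make the reshaping step concrete, here is a minimal sketch in plain Python, independent of N2D2, that parses one comma-separated line and reshapes it row by row into a 2D grid. The helper name `parse_sample` and the 2x2 target shape are assumptions chosen only because the example line has four values; the real shape depends on the dataset.

```python
def parse_sample(line, rows, cols):
    """Parse one comma-separated line and reshape it into rows x cols."""
    values = [float(v) for v in line.strip().split(",")]
    if len(values) != rows * cols:
        raise ValueError(
            f"expected {rows * cols} values, got {len(values)}"
        )
    # Row-major reshape: consecutive values fill one row before the next.
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


line = "2.070008516069267,4.339715409523769,-1.2698122291539113,-7.00010533569940957"
grid = parse_sample(line, rows=2, cols=2)  # 2x2 is an assumed shape
```

The length check is there because a flat line carries no shape information of its own, so a wrong `rows`/`cols` pair would otherwise go unnoticed.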

Another point is whether the output data needs to be the labels themselves, or whether it could also be probabilities for each label (basically the output of a teacher network).
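
To illustrate the difference between the two kinds of target, the sketch below converts a per-class probability vector into a hard label (the index of the largest probability) and shows the one-hot vector that a hard label corresponds to. Both helper names are hypothetical, and this says nothing about whether N2D2 accepts probability targets, which this thread does not answer.

```python
def to_hard_label(probabilities):
    """Return the index of the most probable class (argmax)."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


def to_one_hot(label, num_classes):
    """Return the one-hot distribution equivalent to a hard label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


teacher_output = [0.1, 0.7, 0.2]          # made-up example probabilities
label = to_hard_label(teacher_output)      # class index with highest probability
one_hot = to_one_hot(label, len(teacher_output))
```

Collapsing teacher probabilities to a hard label discards the relative confidence between classes, which is why the question of supporting probabilities directly matters.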

@olivierbichler-cea
Contributor

Closing this old issue, which should be solved using the new CSV_Database driver.
