Data migration / setting fixed splits #34

## Problem
In issue #10 we introduced a new file structure for the ChEBI datasets. To help users transfer their data into this new structure, we need a migration script that automates this step.
For the most part, this should be relatively easy: it only means copying files from one directory to another. The splits are a bit more difficult. If we want users to be able to keep using their current splits, this requires a new feature: setting data splits based on a list of IDs.
This feature would also have the advantage that we can circumvent the performance issue (#32) by saving the configuration of the current split as a list of IDs and reloading the splits from this list. (This might look like a step back, but importantly, we do not save the splits as separate files. The standard method of creating splits via a seed stays intact.)
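To make the idea concrete, here is a minimal sketch of saving a seed-based split as a list of IDs and reloading it later. The function names, the `chebi_id`/`split` column names and the 80/10/10 fractions are assumptions made for illustration only; they are not taken from the existing code base.

```python
import csv
import random
from pathlib import Path


def create_split(ids, seed, fractions=(0.8, 0.1, 0.1)):
    """Assign every id to train/validation/test using a seeded shuffle.

    Hypothetical stand-in for the existing seed-based split logic.
    """
    shuffled = sorted(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_val = int(len(shuffled) * fractions[1])
    assignment = {}
    for position, chebi_id in enumerate(shuffled):
        if position < n_train:
            assignment[chebi_id] = "train"
        elif position < n_train + n_val:
            assignment[chebi_id] = "validation"
        else:
            assignment[chebi_id] = "test"
    return assignment


def save_split(assignment, path):
    """Write the id -> split mapping as a two-column CSV file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["chebi_id", "split"])  # assumed column names
        for chebi_id, split in sorted(assignment.items()):
            writer.writerow([chebi_id, split])


def load_split(path):
    """Read the id -> split mapping back from the CSV file."""
    with open(path, newline="") as handle:
        return {int(row["chebi_id"]): row["split"] for row in csv.DictReader(handle)}
```

Only the assignment is stored, not the data itself, so the seed-based creation of splits remains the source of truth and the CSV file merely records its outcome.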

## Solution
The behaviour in the end should be:
- When initialising a dataset, the user has the option to provide a path to a CSV file that contains a list of ChEBI IDs and their assignment to a split (train, validation or test). Then, instead of creating a new split, the provided split will be used.
- When initialising the dataset _without_ providing such a file, the splits will be created automatically (as before) and the resulting split will be saved as a CSV file.
- When running the migration script, the ChEBI data files will be copied into the new structure. The existing split files are combined into one data file, and a CSV file with the split assignment is created in addition.
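The migration step for the splits could look roughly like the sketch below. It assumes a purely hypothetical old layout with one CSV file per split (`train.csv`, `validation.csv`, `test.csv`) whose first column holds the ChEBI ID; the real file names and formats may differ, and `migrate_splits`, `data.csv` and `splits.csv` are illustrative names only.

```python
import csv
from pathlib import Path

SPLITS = ("train", "validation", "test")


def migrate_splits(old_dir, new_dir):
    """Combine per-split files into one data file plus a split-assignment CSV.

    Assumes (hypothetically) that each old file is a CSV with a header row
    and the ChEBI id in its first column.
    """
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    new_dir.mkdir(parents=True, exist_ok=True)
    with open(new_dir / "data.csv", "w", newline="") as data_file, \
            open(new_dir / "splits.csv", "w", newline="") as split_file:
        data_writer = csv.writer(data_file)
        split_writer = csv.writer(split_file)
        split_writer.writerow(["chebi_id", "split"])
        header_written = False
        for split in SPLITS:
            with open(old_dir / f"{split}.csv", newline="") as source:
                reader = csv.reader(source)
                header = next(reader, None)
                if header is None:
                    continue  # empty file, nothing to migrate
                if not header_written:
                    # Write the shared header only once.
                    data_writer.writerow(header)
                    header_written = True
                for row in reader:
                    data_writer.writerow(row)
                    # Record which split this id came from.
                    split_writer.writerow([row[0], split])
```

Running `migrate_splits("old/", "new/")` would leave the old files untouched, so the migration can be repeated or checked before anything is deleted.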