Update dataset info #389

bruAristimunha · 2023-05-28T22:01:26Z

Closes #381.

I wrote this short script to collect the information from each dataset and merge it across datasets. Maybe we can turn it into a function for the library. What do you think, @sylvchev?

Still running P300 and SSVEP paradigms.

import os.path

import moabb
import pandas as pd
from mne import count_events
from moabb.datasets.utils import dataset_search
from moabb.utils import set_download_dir

# set_download_dir("/workdir/dataset")

base_path = "/mnt/beegfs/projects/moabb/"

set_download_dir(f"{base_path}/mne_data/")

# Column labels shared by every per-dataset record.
columns = ['Dataset', '#Subj', '#Chan', '#Classes', '#Trials_per_subject', 'trials_ids',
           'Window Size', 'Freq', '#Session', '#Runs', 'Total_trials']

paradigms = {
    'imagery': moabb.paradigms.MotorImagery(),
    'ssvep': moabb.paradigms.SSVEP(),
    'p300': moabb.paradigms.P300(),
}

for parad_name, p in paradigms.items():

    dataset_list = dataset_search(paradigm=parad_name)

    metainfo = []

    for dataset in dataset_list:

        dataset_name = str(dataset).split(".")[-1].split(" ")[0]

        path = f"/mnt/beegfs/home/chevallier/metainfo/metainfo_{dataset_name}.csv"

        # Datasets with an existing CSV are skipped, so they are also
        # absent from the merged table written at the end.
        if os.path.exists(path):
            continue

        print(dataset)

        try:
            # Load every subject once to count subjects, sessions and runs.
            _, _, metadata = p.get_data(dataset, None, return_epochs=False)
            subjects = metadata['subject'].nunique()
            session = metadata['session'].nunique()
            runs = metadata['run'].nunique()

            # Load subject 1 as epochs to read the recording parameters.
            X, y, metadata = p.get_data(dataset, [1], return_epochs=True)

            sfreq = int(X.info['sfreq'])
            nchan = X.info['nchan']

            classes = len(X.event_id)
            epoch_size = X.tmax - X.tmin

            # Trials of subject 1, pooled over all sessions and runs.
            trials_per_events = count_events(X.events)
            total_trials = int(sum(trials_per_events.values()))

            # total_trials already covers every session and run of subject 1,
            # so the dataset total is estimated by scaling with the number of
            # subjects only (assumes all subjects have as many trials).
            info_dataset = pd.Series(
                [dataset_name, subjects, nchan, classes, trials_per_events, X.event_id,
                 epoch_size, sfreq, session, runs, total_trials * subjects],
                index=columns)

            info_dataset.to_csv(path)

            metainfo.append(info_dataset)
        except Exception as ex:
            print(f"Error with {dataset}")
            print(f"{ex}")

    # pd.concat raises on an empty list, so only merge when something was processed.
    if metainfo:
        # One column per dataset after concat; transpose to one row per dataset.
        df = pd.concat(metainfo, axis=1).T

        df.to_csv(f"/mnt/beegfs/home/chevallier/metainfo/metainfo_{parad_name}.csv", index=None)
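The trial-counting step can be illustrated without MNE or any downloaded data. The sketch below is only an approximation of that step: `trials_per_event` is a hypothetical helper, the event rows and class names are made up, and unlike `mne.count_events` (which keys its result by event code) it keys the counts by class name for readability.

```python
from collections import Counter


def trials_per_event(events, event_id):
    """Count epochs per class from MNE-style event rows (sample, previous, code)."""
    # Invert the name -> code mapping so counts can be reported by class name.
    code_to_name = {code: name for name, code in event_id.items()}
    counts = Counter(row[2] for row in events)
    # Ignore codes that are not part of the paradigm's event_id mapping.
    return {code_to_name[c]: n for c, n in counts.items() if c in code_to_name}


# Made-up events for one subject: the third column holds the event code.
events = [(0, 0, 1), (250, 0, 2), (500, 0, 1), (750, 0, 2), (1000, 0, 1)]
event_id = {"left_hand": 1, "right_hand": 2}

per_class = trials_per_event(events, event_id)
total_trials = sum(per_class.values())
```

Because the events of one subject are pooled over all of that subject's sessions and runs, `total_trials` here is a per-subject figure.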

sylvchev · 2023-05-29T09:58:32Z

I wrote this short script to collect the information from each dataset and merge it across datasets. Maybe we can turn it into a function for the library. What do you think, @sylvchev?

Yes, we could add it in scripts.

There is no need to extract the run information, so you could use the paradigms or moabb.datasets.utils.dataset_search to get the dataset list. You could then get the number of subjects and sessions directly from the dataset object instead of calling paradigm.get_data, which takes a long time to process the data. The part that gets sfreq, nchan, ... from the paradigm is good.
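A minimal sketch of that suggestion, reading the counts from attributes of the dataset object rather than loading any data. It assumes the dataset exposes `code`, `subject_list` and `n_sessions`; `FakeDataset` and `cheap_summary` are hypothetical names used only so the example runs without MOABB installed.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDataset:
    """Hypothetical stand-in exposing the attributes the sketch reads."""

    code: str
    subject_list: list = field(default_factory=list)
    n_sessions: int = 1


def cheap_summary(dataset):
    """Collect subject and session counts without calling paradigm.get_data."""
    return {
        "Dataset": dataset.code,
        "#Subj": len(dataset.subject_list),
        "#Session": dataset.n_sessions,
    }


# Made-up dataset with three subjects and two sessions per subject.
summary = cheap_summary(FakeDataset("Demo", subject_list=[1, 2, 3], n_sessions=2))
```

The recording parameters (sfreq, nchan, and so on) would still come from the paradigm, loading a single subject.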

bruAristimunha · 2023-05-29T12:11:02Z

Done @sylvchev!

sylvchev

Nice script!
It is a good idea to add the run; the information is not used yet, but it could be useful later.

bruAristimunha · 2023-06-02T14:18:14Z

Thank you for the review @sylvchev and @carraraig!

bruAristimunha added 3 commits May 28, 2023 23:52

Fixing data meta info

6281a91

Fixing description Cho and BNCI

e080f6a

Fixing order

96cc109

bruAristimunha added 6 commits May 29, 2023 12:42

Fixing the SSVEP and P300

8741807

Adding new script

6f3c9c4

Fixing saving

8e12de7

Updating script

f264fc7

Fixing columns

3817122

Updating the script to process trial/events

17a15bb

Updating the whats_new.rst

e2387fd

sylvchev reviewed May 31, 2023

docs/source/dataset_summary.rst (outdated, resolved)

docs/source/dataset_summary.rst (outdated, resolved)

bruAristimunha and others added 3 commits May 31, 2023 18:03

Update docs/source/dataset_summary.rst

2babf37

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update docs/source/dataset_summary.rst

092fa3b

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Merge branch 'develop' into update_dataset_info

de93a45

sylvchev reviewed May 31, 2023

bruAristimunha and others added 13 commits June 2, 2023 12:40

Update docs/source/dataset_summary.rst

71eb6db

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update moabb/datasets/gigadb.py

55ad4ae

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

f5e1339

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

a057927

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

caf6c77

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

7522f4d

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

58ff99b

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

b705198

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

75c0065

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

175161a

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

28a88eb

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

7cbb9df

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

292f442

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

bruAristimunha and others added 2 commits June 2, 2023 12:43

Update scripts/generating_metainfo.py

df1343b

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

0f44aab

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

carraraig reviewed Jun 2, 2023

docs/source/dataset_summary.rst (outdated, resolved)

bruAristimunha and others added 8 commits June 2, 2023 12:44

Update scripts/generating_metainfo.py

b987b9d

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

e8b3efb

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

85fee27

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

a43a844

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

99c174d

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Update scripts/generating_metainfo.py

1a71144

Co-authored-by: Sylvain Chevallier <sylvain.chevallier@universite-paris-saclay.fr>

Fixing small things

44d05d1

Merge branch 'develop' into update_dataset_info

a2da3ea

bruAristimunha merged commit 2938fcc into NeuroTechX:develop Jun 2, 2023
7 checks passed
