Adding PCA script for dimension reduction of metrics #673

+
+The script can take directly as input a connectoflow output folder. Simply use the --connectoflow flag. For
+other type of folder input, the script expects a single folder containing all matrices for all subjects. Example:
+                                                            input_folder/sub-01_ad.npy


Would it be possible to use this structure?

frheault · 2023-03-07T14:08:43Z

scripts/scil_compute_pca.py

+                                                                        /...
+
+Output connectivity matrix will be saved next to the other metrics in the input folder. The plots and tables
+will be outputted in the designated folder from the <output> argument.


Could you explain a bit how to start interpretation? I ran your test data with the command:
scil_compute_pca.py ./ ../lol --list_ids test.text --metrics ad rd fa md --connectoflow

and my first column is: PC1 0.516503256371792 0.496327739954754 0.473191960436531 0.512808472323706. In one sentence, how do you interpret the PCA? You can use ChatGPT or Stack Overflow, but the user needs a starting point, or at least a link to a good resource for beginners (something simpler than the paper: what is PCA?).

Also, can you fix this kind of display overlap?
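
As a possible starting point for interpretation, here is a minimal sketch. It assumes the four PC1 values are the loadings of the metrics in the order passed to --metrics (ad, rd, fa, md); that ordering is an assumption and is not confirmed in this thread. Under that reading, each value is the weight of one metric in the component, and because a principal axis is a unit vector, the squared loadings sum to roughly 1 and can be read as each metric's share of PC1.

```python
import numpy as np

# PC1 values quoted above. ASSUMPTION: the order follows --metrics (ad, rd, fa, md).
metrics = ["ad", "rd", "fa", "md"]
pc1 = np.array([0.516503256371792, 0.496327739954754,
                0.473191960436531, 0.512808472323706])

# A principal axis has unit length, so the squared loadings sum to ~1.
norm = float(np.linalg.norm(pc1))

# Squared loading = share of the component carried by each metric.
shares = dict(zip(metrics, (pc1 ** 2).tolist()))

print(f"norm of PC1: {norm:.6f}")
for name, share in shares.items():
    print(f"{name}: {share:.1%} of PC1")
```

With loadings this close to each other and all of the same sign, PC1 reads as a near-equal weighted combination of the four metrics rather than being driven by a single one.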

frheault · 2023-03-07T14:09:18Z

scripts/scil_compute_pca.py

+                                                                        /sub-02_md.npy
+                                                                        /...
+
+Output connectivity matrix will be saved next to the other metrics in the input folder. The plots and tables


Is there another name we could use besides connectivity matrix? The inputs are also called connectivity matrices...

frheault · 2023-03-07T14:10:19Z

scripts/scil_compute_pca.py

+# Import required libraries.
+import argparse
+import logging
+import numpy as np


Split the built-in imports from the third-party ones, and the scilpy imports should be alone in a third block.

frheault · 2023-03-07T14:11:52Z

scripts/scil_compute_pca.py

+                   help='Path to the input folder.')
+    p.add_argument('out_folder',
+                   help='Path to the output folder to export graphs and tables. \n'
+                        '*** Please note, PC connectivity matrix will be outputted in the original input folder'


I don't like that, why not save this in the output?

frheault · 2023-03-07T14:13:32Z

scripts/scil_compute_pca.py

+                        '*** Please note, PC connectivity matrix will be outputted in the original input folder'
+                        'next to all other metrics ***')
+    p.add_argument('--metrics', nargs='+', required=True,
+                   help='List of all metrics to include in PCA analysis.')


Could you specify that these are expected to be suffixes, and that the .npy extension must immediately follow them?
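
To make the naming convention concrete, here is a small sketch of how a file name would be derived from a subject id and a metric suffix, following the `{a}_{m}.npy` pattern visible in the diff. The helper `expected_matrix_path` is hypothetical and is not part of the script.

```python
from pathlib import Path


def expected_matrix_path(in_folder, subject_id, metric):
    """Hypothetical helper: build <in_folder>/<subject_id>_<metric>.npy.

    The metric is a suffix of the file name and is immediately followed
    by the .npy extension.
    """
    return Path(in_folder) / f"{subject_id}_{metric}.npy"


paths = [expected_matrix_path("input_folder", "sub-01", m)
         for m in ["ad", "rd", "fa", "md"]]
for p in paths:
    print(p.as_posix())
```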

frheault · 2023-03-07T14:16:39Z

scripts/scil_compute_pca.py

+    p.add_argument('--metrics', nargs='+', required=True,
+                   help='List of all metrics to include in PCA analysis.')
+    p.add_argument('--list_ids', required=True,
+                   help='List containing all ids to use in PCA computation.')


This is unclear: unlike --metrics, this is not a list, it is a file containing a list.

Adding a metavar argument, like metavar=FILE, could help.
It is also crucial that the help says this is a path to a file containing a list of all ids.
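
One way the suggestion could look, as a sketch only: the help wording and the `read_ids` helper are illustrative, and the one-id-per-line file layout is an assumption that this thread does not confirm.

```python
import argparse


def build_parser():
    p = argparse.ArgumentParser(description="Sketch of the --list_ids option.")
    # metavar='FILE' makes the usage line read "--list_ids FILE".
    p.add_argument('--list_ids', required=True, metavar='FILE',
                   help='Path to a text file containing the list of all ids '
                        'to use in the PCA computation.')
    return p


def read_ids(path):
    """Hypothetical reader. ASSUMPTION: one id per line, blank lines ignored."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(['--list_ids', 'test.text'])
print(args.list_ids)
```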

frheault · 2023-03-07T14:17:58Z

scripts/scil_compute_pca.py

+                   help='List of all metrics to include in PCA analysis.')
+    p.add_argument('--list_ids', required=True,
+                   help='List containing all ids to use in PCA computation.')
+    p.add_argument('--common', choices=['true', 'false'], default='true',


Instead, you should use action=store_true, which stores True or False directly without the user having to type it.

Renaming it to --only_common would be clearer.

I changed it to --not_only_common with action=store_true since I believe the common option should be the default one.
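
A minimal sketch of the resulting flag behaviour with argparse. The help string is a placeholder, since the exact meaning of "common" is not spelled out in this thread.

```python
import argparse

p = argparse.ArgumentParser()
# store_true: the attribute is False by default and becomes True when the
# flag is present, so the user never types 'true' or 'false'.
p.add_argument('--not_only_common', action='store_true',
               help='Placeholder help: disable the default "common only" '
                    'behaviour.')

default_args = p.parse_args([])
flag_args = p.parse_args(['--not_only_common'])

print(default_args.not_only_common, flag_args.not_only_common)
```

Naming the flag after the non-default behaviour keeps the "common" behaviour active when no option is given.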

frheault · 2023-03-07T14:19:50Z

scripts/scil_compute_pca.py

+        d = {f'{m}': [load_matrix_in_any_format(f'{args.in_folder}/{a}_{m}.npy') for a in subjects]
+             for m in args.metrics}
+        # Assert that all metrics have the same number of subjects.
+        nb_sub = [len(d[f'{m}']) for m in args.metrics]


I don't think the whole f'{m}' is required, since m is already a string.
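
For illustration, the same comprehension with `m` used directly as the key. `load_stub` is a stand-in for `load_matrix_in_any_format` so the sketch runs without scilpy or files on disk; the folder name and subject ids are made up.

```python
import numpy as np


def load_stub(path):
    """Stand-in loader. ASSUMPTION: the real loader returns a 2D array."""
    return np.zeros((2, 2))


in_folder = 'input_folder'
subjects = ['sub-01', 'sub-02']
metrics = ['ad', 'rd']

# m is already a str, so it can be the dictionary key as-is.
d = {m: [load_stub(f'{in_folder}/{a}_{m}.npy') for a in subjects]
     for m in metrics}

# Number of subjects loaded per metric.
nb_sub = [len(d[m]) for m in metrics]
print(nb_sub)
```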

gagnonanthony · 2023-03-07T16:35:28Z

@frheault, I updated the script according to your comments. Should be better now!

arnaudbore · 2023-03-07T17:00:25Z

Build passed ! Good Job 🍻 !

frheault

@arnaudbore this answered my comments

arnaudbore self-requested a review February 2, 2023 21:32

gagnonanthony force-pushed the pcadwi branch from 6586b77 to be9d8ab Compare March 1, 2023 15:16

gagnonanthony added 9 commits March 1, 2023 14:34

Added pca script

d155d71

Added alternative input structure support

12407bc

Fix typo in script description

21eccf4

Added required = true to --list_ids

de01fe7

Fixed PEP8 errors

53f2cf4

Fixed PEP8 errors

a03ccde

Fixed typo

6fc6944

Added new test data and fixed test functions

c4b8da3

Modified handling of subjects files and removed unused function

73b6bdc

gagnonanthony force-pushed the pcadwi branch from ccdce2b to 73b6bdc Compare March 1, 2023 19:35

arnaudbore reviewed Mar 1, 2023

arnaudbore changed the title ~~[WIP] Adding PCA script for dimension reduction of metrics~~ Adding PCA script for dimension reduction of metrics Mar 1, 2023

arnaudbore requested review from mdesco and arnaudbore March 1, 2023 21:46

modified according to the comments

2ae9bf6

added overwrite option in test.

04e3ada

frheault requested changes Mar 7, 2023

gagnonanthony added 11 commits March 7, 2023 09:48

Updated example input structure

0e20104

Updated example input structure

d4968d3

Added interpretation example

121d64c

Fix import order

96f225b

Change saving directory for output folder

a5425eb

Specifying help for --metrics and --list_ids

6c78456

Changing --common args to --not_only_common

d908c13

added specifications in function and removed the f'{}' when unnecessary'

2094aa4

Fix matplotlib plot display

bd090a4

Change --connectoflow flag to --input_connectoflow

d51bba3

Fix test to fit the args change

46cd79a

frheault approved these changes Mar 7, 2023

arnaudbore merged commit 0179bc0 into scilus:master Mar 7, 2023

gagnonanthony deleted the pcadwi branch March 8, 2023 03:07

gagnonanthony restored the pcadwi branch March 8, 2023 03:07

gagnonanthony deleted the pcadwi branch March 8, 2023 16:38

gagnonanthony commented Feb 2, 2023

arnaudbore commented Feb 2, 2023

arnaudbore commented Feb 20, 2023

arnaudbore commented Mar 1, 2023

arnaudbore commented Mar 1, 2023

arnaudbore commented Mar 1, 2023

arnaudbore commented Mar 1, 2023

arnaudbore commented Mar 1, 2023

arnaudbore left a comment

arnaudbore commented Mar 1, 2023

arnaudbore commented Mar 1, 2023

arnaudbore commented Mar 2, 2023

arnaudbore commented Mar 6, 2023

frheault commented Mar 6, 2023

gagnonanthony commented Mar 6, 2023

frheault left a comment

frheault Mar 7, 2023

frheault Mar 7, 2023

frheault Mar 7, 2023

frheault Mar 7, 2023

frheault Mar 7, 2023

frheault Mar 7, 2023

frheault Mar 7, 2023

frheault Mar 7, 2023

gagnonanthony Mar 7, 2023

frheault Mar 7, 2023

gagnonanthony commented Mar 7, 2023

arnaudbore commented Mar 7, 2023

frheault left a comment
