Steps to analyse the data in MDR
Anusha Ranganathan edited this page May 21, 2024 · 2 revisions
To analyse the data in MDR, we will generate a CSV file listing the properties in each dataset, with a separate CSV file for each nested property.
The PR https://github.com/nims-dpfc/nims-hyrax/pull/563 has the code needed to generate these CSV files.
The spreadsheets will be saved in a directory named data_analysis_{datetime} under /srv/ngdr/data/. For example, /srv/ngdr/data/data_analysis_20240521T004339 is the directory created by the last run, and data_analysis_20240521T004339.tar.gz is an archive copied from it.
On the test system, the run took a few hours to produce a csv file.
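The data_analysis_{datetime} naming can be illustrated with a minimal Ruby sketch. It assumes the timestamp follows the %Y%m%dT%H%M%S pattern seen in the example directory name above. The helper names analysis_dir_name and create_analysis_dir are hypothetical and are not taken from the PR.

```ruby
require "fileutils"

# Hypothetical helper: builds a name such as data_analysis_20240521T004339.
# The timestamp format is an assumption inferred from the example directory name.
def analysis_dir_name(time = Time.now)
  "data_analysis_#{time.strftime('%Y%m%dT%H%M%S')}"
end

# Hypothetical helper: creates the timestamped directory under base_dir
# and returns its full path.
def create_analysis_dir(base_dir, time = Time.now)
  path = File.join(base_dir, analysis_dir_name(time))
  FileUtils.mkdir_p(path)
  path
end

analysis_dir_name(Time.new(2024, 5, 21, 0, 43, 39))
# => "data_analysis_20240521T004339"
```

Because each run gets its own timestamped directory, repeated runs are unlikely to overwrite earlier output.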
- Run a rails console in the web container
  docker exec -it nims-hyrax-web-1 /bin/bash
  rails c
- Run the code to generate the csv files
  data_base_dir = "data"
  a = DataModelAnalysis.new(data_base_dir)
  a.run
- The code will
  - Create a directory named data_analysis_{datetime} in data, which is shared with the host at /srv/ngdr/data/
  - Within the data_analysis_{datetime} directory, there will be many csv files, starting with works.csv and one csv file for each nested property, as shown below
root@805d968fd6fd:/data# cd data/
root@805d968fd6fd:/data/data# ls -l
total 4
drwxr-xr-x 2 root root 4096 May 21 01:11 data_analysis_20240521T004339
root@805d968fd6fd:/data/data# cd data_analysis_20240521T004339/
root@805d968fd6fd:/data/data/data_analysis_20240521T004339# ls -ltr
total 12324
-rw-r--r-- 1 root root 5033784 May 21 01:11 works.csv
-rw-r--r-- 1 root root  334729 May 21 01:11 works_head.csv
-rw-r--r-- 1 root root  334729 May 21 01:11 works_tail.csv
-rw-r--r-- 1 root root   63184 May 21 01:11 works_complex_date.csv
-rw-r--r-- 1 root root  152956 May 21 01:11 works_complex_identifier.csv
-rw-r--r-- 1 root root 1898723 May 21 01:11 works_complex_person.csv
-rw-r--r-- 1 root root    9977 May 21 01:11 works_complex_version.csv
-rw-r--r-- 1 root root   98641 May 21 01:11 works_complex_source.csv
-rw-r--r-- 1 root root    9414 May 21 01:11 works_rights_notes.csv
-rw-r--r-- 1 root root    9414 May 21 01:11 works_complex_rights.csv
-rw-r--r-- 1 root root    6574 May 21 01:11 works_complex_event.csv
-rw-r--r-- 1 root root   70470 May 21 01:11 works_updated_subresources.csv
-rw-r--r-- 1 root root    1815 May 21 01:11 works_custom_property.csv
-rw-r--r-- 1 root root  575869 May 21 01:11 works_complex_relation.csv
-rw-r--r-- 1 root root  649087 May 21 01:11 works_complex_funding_reference.csv
-rw-r--r-- 1 root root  603019 May 21 01:11 works_complex_contact_agent.csv
-rw-r--r-- 1 root root  267653 May 21 01:11 works_complex_instrument.csv
-rw-r--r-- 1 root root 1337316 May 21 01:11 works_complex_specimen_type.csv
-rw-r--r-- 1 root root  513240 May 21 01:11 works_complex_chemical_composition.csv
-rw-r--r-- 1 root root   38861 May 21 01:11 works_complex_structural_feature.csv
-rw-r--r-- 1 root root  562590 May 21 01:11 works_complex_software.csv
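As a first check on the generated files, it may help to count the rows in each csv file. The following is a minimal sketch using Ruby's standard CSV library. It assumes each file starts with a header row, which this page does not confirm, and the helper name csv_row_counts is hypothetical.

```ruby
require "csv"

# Hypothetical helper: returns a hash of file name => number of data rows
# for every csv file in dir. Assumes the first row of each file is a header.
def csv_row_counts(dir)
  Dir.glob(File.join(dir, "*.csv")).sort.to_h do |path|
    rows = 0
    # Stream the file row by row so large files are not loaded into memory.
    CSV.foreach(path, headers: true) { rows += 1 }
    [File.basename(path), rows]
  end
end

# Example usage, with the directory path taken from the listing above:
# csv_row_counts("/data/data/data_analysis_20240521T004339").each do |name, rows|
#   puts "#{name}: #{rows}"
# end
```

Streaming with CSV.foreach is chosen here because works.csv is several megabytes, though reading whole files would also work at this size.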