# Data download and preliminaries

From prior work and data:

- Forbes, R. et al. (2018) ‘Quantum-beat photoelectron-imaging spectroscopy of Xe in the VUV’, Physical Review A, 97(6), p. 063417. Available at: https://doi.org/10.1103/PhysRevA.97.063417. arXiv: http://arxiv.org/abs/1803.01081, Authorea (original HTML version): https://doi.org/10.22541/au.156045380.07795038
- Data (OSF): https://osf.io/ds8mk/

## Preliminaries

### Configure env

### Obtain data

- Use [osfclient](https://github.com/osfclient/osfclient) for the Python methods below. Run `pip install osfclient` if required.
- Alternatively, download the data via the OSF web interface.
- With the osfclient CLI, `clone` pulls the full project (99 files, 1.1 GB).
- The Python route uses the local module `qbanalysis`; run `pip install -e .` from the repo root to install it.

Options:

1. Clone the full OSF repository/project.
2. Pull only the data matching the published case: the file `Xe_hyperfine_VMI_processing_distro_211217.zip` in the OSF repo.
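
Option 2 can also be done directly with the osfclient Python API rather than through the local `qbanalysis` wrapper. The following is a minimal sketch under stated assumptions, not the `getOSFdata` implementation: `fetch_osf_file` and `open_osf_storage` are hypothetical helpers, and the file is assumed to sit in the project's default `osfstorage` provider.

```python
from pathlib import Path


def fetch_osf_file(storage, file_name, data_path, overwrite=False):
    """Download `file_name` from an osfclient storage object into `data_path`.

    Hypothetical helper for illustration; returns the local path.
    """
    target = Path(data_path).expanduser() / file_name
    if target.exists() and not overwrite:
        # Skip the download if a local copy already exists
        return target

    for remote in storage.files:
        # osfclient file objects carry `name` and a `write_to(fp)` method
        if remote.name == file_name:
            target.parent.mkdir(parents=True, exist_ok=True)
            with open(target, 'wb') as fp:
                remote.write_to(fp)
            return target

    raise FileNotFoundError(f'{file_name} not found in OSF storage')


def open_osf_storage(project_id, provider='osfstorage'):
    """Connect to a public OSF project (needs `pip install osfclient`)."""
    from osfclient import OSF  # imported lazily so the sketch loads without it
    return OSF().project(project_id).storage(provider)


# Usage (requires network access):
# storage = open_osf_storage('ds8mk')
# fetch_osf_file(storage, 'Xe_hyperfine_VMI_processing_distro_211217.zip', '/tmp/xe_analysis')
```

Passing the storage object in, rather than creating it inside the download function, keeps the skip-if-present logic testable without network access.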

In [1]:
from pathlib import Path

project = 'ds8mk'
# dataPath = Path('~/tmp/xe_analysis_2024_scratch')
dataPath = Path('/tmp/xe_analysis')
dataFile = 'Xe_hyperfine_VMI_processing_distro_211217.zip'

In [2]:
# Option (1): download full repo at CLI
# fetch all files from a project and store them in `output_directory`
# Should pull 99 files/1.1 GB.

# !osf -p {project} clone {dataPath.as_posix()}

In [3]:
# Option (2) Minimal data via Python API
# Just pull final analysis `Xe_hyperfine_VMI_processing_distro_211217.zip`
# Note: can also use the CLI, `!osf fetch {project}/{dataFile} {(dataPath/dataFile).as_posix()}`,
# if the local env is configured for this.

# Load module
# import qbanalysis as qb
from qbanalysis import getOSFdata

# Get data
# Alternatively can run with project defaults as `getOSFdata.main()`
projDict = getOSFdata.getProjectFile(project,dataPath,dataFile)

2024-06-06 13:53:16.148 | INFO     | qbanalysis.config:<module>:11 - PROJ_ROOT path is: /home/jovyan/code-share/github-share/Quantum-Beat_Photoelectron-Imaging_Spectroscopy_of_Xe_in_the_VUV
2024-06-06 13:53:17.487 | INFO     | qbanalysis.getOSFdata:getProjectFile:61 - Found OSF project: Quantum Beat Photoelectron Imaging Spectroscopy of Xe in the VUV, https://osf.io/ds8mk/ .
2024-06-06 13:53:18.490 | INFO     | qbanalysis.getOSFdata:getProjectFile:80 - Found local file at /tmp/xe_analysis/Xe_hyperfine_VMI_processing_distro_211217.zip.
2024-06-06 13:53:18.490 | INFO     | qbanalysis.getOSFdata:getProjectFile:86 - Skipping download, pass `overwrite=True` to redownload.


In [4]:
# The returned dictionary contains a file list and other info
projDict.keys()

dict_keys(['project', 'name', 'URL', 'dataPath', 'dataFile', 'fullPath', 'fileList', 'fileNames'])

## Quick plot to check dataset

Basic functions are provided to reformat the raw data and plot $\beta_{LM}(t)$; the result should match Figure 5 in the manuscript.

In [5]:
from qbanalysis.dataset import loadFinalDataset
from qbanalysis.plots import plotFinalDatasetBLMt

OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.


* sparse not found, sparse matrix forms not available. 
* natsort not found, some sorting functions not available. 


* Setting plotter defaults with epsproc.basicPlotters.setPlotters(). Run directly to modify, or change options in local env.


* Set Holoviews with bokeh.
* pyevtk not found, VTK export not available. 


In [6]:
dataDict = loadFinalDataset(dataPath)

2024-06-06 13:53:24.130 | INFO     | qbanalysis.dataset:loadDataset:150 - Loaded data cpBasex_results_cycleSummed_rot90_quad1_ROI_results_with_FT_NFFT1024_hanningWindow_270717.mat.
2024-06-06 13:53:24.180 | INFO     | qbanalysis.dataset:loadDataset:150 - Loaded data cpBasex_results_allCycles_ROIs_with_FTs_NFFT1024_hanningWindow_270717.mat.
2024-06-06 13:53:24.487 | INFO     | qbanalysis.dataset:loadFinalDataset:132 - Processed data to Xarray OK.


In [7]:
plotFinalDatasetBLMt(**dataDict)

Cf. Figure 5 in the manuscript, lower two panels for $\beta$ parameters.

(Figure from Authorea version: https://doi.org/10.22541/au.156045380.07795038.)

<img src="https://www.authorea.com/users/71114/articles/188337/master/file/figures/image/Xe_hyperfine_ROIs_3up_rough_260218-01.png" />

## Save reformatted data

Write the Xarrays to file using the routines from :py:mod:`epsproc.IO`, which [include complex number handling](https://epsproc.readthedocs.io/en/latest/dataStructures/ePSproc_dataStructures_demo_070622.html#Basic-data-IO-(Xarray-data-file-read/write)), although this may not be necessary with newer versions of Xarray (TBC).

In [8]:
from epsproc import IO

for item in dataDict.items():
    IO.writeXarray(item[1], fileName=f'Xe_dataset_{item[0]}', filePath=dataPath)
    # print(item[0])

writeXarray caught exception: Invalid value for attr 'harmonics': {'dtype': 'sph', 'kind': 'complex', 'normType': 'ortho', 'csPhase': True}. For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple
Retrying file write with sanitized attrs.
['Written to h5netcdf format, with sanitized attribs (may be lossy)', '/tmp/xe_analysis/Xe_dataset_BLMall.nc']
writeXarray caught exception: Invalid value for attr 'harmonics': {'dtype': 'sph', 'kind': 'complex', 'normType': 'ortho', 'csPhase': True}. For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple
Retrying file write with sanitized attrs.
['Written to h5netcdf format, with sanitized attribs (may be lossy)', '/tmp/xe_analysis/Xe_dataset_BLMerr.nc']
writeXarray caught exception: Invalid value for attr 'harmonics': {'dtype': 'sph', 'kind': 'complex', 'normType': 'ortho', 'csPhase': True}. For serialization t
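
The warnings above arise because netCDF attributes must be flat values (strings, numbers, arrays, lists or tuples), while the `harmonics` attribute is a nested dictionary. One generic way to keep such metadata without loss is to store nested dictionaries as JSON strings and decode them on read. The sketch below illustrates the idea in plain Python; `encode_attrs` and `decode_attrs` are hypothetical helpers, and this is not how `epsproc.IO.writeXarray` sanitizes attributes.

```python
import json


def encode_attrs(attrs):
    """Return a copy of `attrs` with dict values replaced by JSON strings."""
    safe = {}
    for key, value in attrs.items():
        if isinstance(value, dict):
            # Prefix marks the value so it can be recognised on read
            safe[key] = 'json:' + json.dumps(value, sort_keys=True)
        else:
            safe[key] = value
    return safe


def decode_attrs(attrs):
    """Invert `encode_attrs`, restoring any JSON-encoded dictionaries."""
    restored = {}
    for key, value in attrs.items():
        if isinstance(value, str) and value.startswith('json:'):
            restored[key] = json.loads(value[len('json:'):])
        else:
            restored[key] = value
    return restored


attrs = {
    'harmonics': {'dtype': 'sph', 'kind': 'complex',
                  'normType': 'ortho', 'csPhase': True},
    'units': 'arb',  # illustrative value only, not taken from the dataset
}

safe = encode_attrs(attrs)
print(type(safe['harmonics']).__name__)   # str
print(decode_attrs(safe) == attrs)        # True
```

With Xarray, the encoded dictionary could be applied before writing with something like `da.assign_attrs(encode_attrs(da.attrs))`, and decoded again after reading.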

In [9]:
# Check data - read from HDF5/NetCDF files
dictFileTest = {}
for item in dataDict.items():
    dictFileTest[item[0]] = IO.readXarray(fileName=f'Xe_dataset_{item[0]}.nc', filePath=dataPath.as_posix())

*** Read /tmp/xe_analysis/Xe_dataset_BLMall.nc.
*** Read /tmp/xe_analysis/Xe_dataset_BLMerr.nc.
*** Read /tmp/xe_analysis/Xe_dataset_BLMerrCycle.nc.


In [10]:
# Test for identical values to verify round-trip to file
import numpy as np
for key, data in dataDict.items():
    # Sum absolute differences so that errors of opposite sign cannot cancel
    diff = np.abs(dictFileTest[key] - data).sum()

    if diff < 1e-10:
        print(f'{key}: OK')
    else:
        print(f'{key} Diff = {diff}')

BLMall: OK
BLMerr: OK
BLMerrCycle: OK
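
A round-trip check is most robust when it compares magnitudes element by element, since signed (or complex) differences can cancel when summed directly. The toy example below uses plain Python lists in place of the Xarray data to illustrate this; `max_abs_diff` is a hypothetical helper, not part of `qbanalysis` or `epsproc`.

```python
def max_abs_diff(a, b):
    """Largest element-wise |a - b| for two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError('sequences must have the same length')
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


original = [1.0 + 0.5j, 2.0 - 1.0j, 3.0 + 0.0j]
# Deliberately corrupted copy: the two errors are equal and opposite
corrupted = [1.25 + 0.5j, 1.75 - 1.0j, 3.0 + 0.0j]

# Signed differences cancel, hiding the mismatch
signed_sum = sum(x - y for x, y in zip(original, corrupted))
print(abs(signed_sum) < 1e-10)             # True

# The element-wise magnitude exposes it
print(max_abs_diff(original, corrupted))   # 0.25
print(max_abs_diff(original, original))    # 0.0
```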


## Versions

In [11]:
import scooby
scooby.Report(additional=['qbanalysis','pemtk','epsproc', 'holoviews', 'hvplot', 'xarray', 'matplotlib', 'bokeh'])

Thu Jun 06 13:53:25 2024 EDT

| Item | Value |
| --- | --- |
| OS | Linux |
| CPU(s) | 64 |
| Machine | x86_64 |
| Architecture | 64bit |
| RAM | 62.8 GiB |
| Environment | Jupyter |
| File system | btrfs |

Python 3.10.11 | packaged by conda-forge | (main, May 10 2023, 18:58:44) [GCC 11.3.0]

| Package | Version |
| --- | --- |
| qbanalysis | 0.0.1 |
| pemtk | 0.0.1 |
| epsproc | 1.3.2-dev |
| holoviews | 1.16.2 |
| hvplot | 0.8.4 |
| xarray | 2022.3.0 |
| matplotlib | 3.5.3 |
| bokeh | 3.1.1 |
| numpy | 1.23.5 |
| scipy | 1.10.1 |
| IPython | 8.13.2 |
| scooby | 0.7.2 |


In [12]:
# # Check current Git commit for local ePSproc version
# from pathlib import Path
# !git -C {Path(qbanalysis.__file__).parent} branch
# !git -C {Path(qbanalysis.__file__).parent} log --format="%H" -n 1

In [13]:
# # Check current remote commits
# !git ls-remote --heads https://github.com/phockett/qbanalysis

In [14]:
# Check current Git commit for local code version
import qbanalysis
!git -C {Path(qbanalysis.__file__).parent} branch
!git -C {Path(qbanalysis.__file__).parent} log --format="%H" -n 1

* master
b96870e7364001f73bed8ca755b4643145886dbb


In [15]:
# Check current remote commits
!git ls-remote --heads https://github.com/phockett/Quantum-Beat_Photoelectron-Imaging_Spectroscopy_of_Xe_in_the_VUV

b96870e7364001f73bed8ca755b4643145886dbb	refs/heads/master
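
The local and remote hashes above can also be compared programmatically, to confirm that the local checkout matches the remote branch head. The sketch below parses `git ls-remote --heads` output (one hash and ref per line, whitespace-separated); `parse_ls_remote` and `git_output` are hypothetical helpers, and querying the remote needs network access.

```python
import subprocess


def parse_ls_remote(text):
    """Map ref names to commit hashes from `git ls-remote` output."""
    refs = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split(None, 1)
        refs[ref.strip()] = sha
    return refs


def git_output(*args):
    """Run a git command and return its stdout as stripped text."""
    result = subprocess.run(['git', *args], capture_output=True,
                            text=True, check=True)
    return result.stdout.strip()


# Remote output as recorded above
remote_text = 'b96870e7364001f73bed8ca755b4643145886dbb\trefs/heads/master\n'
remote_heads = parse_ls_remote(remote_text)
print(remote_heads['refs/heads/master'])

# With a local clone at `repo_dir` (hypothetical path variable):
# local_sha = git_output('-C', str(repo_dir), 'rev-parse', 'HEAD')
# print(local_sha == remote_heads['refs/heads/master'])
```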
