# This Jupyter notebook documents our attempts to read the .mat files from the CALCE website.

The data is located [here](https://web.calce.umd.edu/batteries/data.htm#pls) and comes from experiments on various cells called 'PL Samples' (PL 3, 10, 4, 5, 9, 25, etc.). After unzipping the files, set the path below to the directory where they are stored.


In [79]:
# Replace this with the path to the downloaded mat files in your system
matdata_dir = '/home/ubuntu_cp/uwdirect/BattDeg/data/raw_mat_PL0310/'

In [80]:
# Import the necessary packages
import scipy.io as spio
from os.path import isfile, join
import numpy as np
import h5py 
import tables

## 1. `scipy.io` seems to be one of the most popular methods

In [81]:
# Load the file into a python object
mat = spio.loadmat(join(matdata_dir, 'PL03.mat'), squeeze_me=True)

In [82]:
# Display the contents as read
mat

{'__header__': b'MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Mon Feb 08 20:42:18 2016',
 '__version__': '1.0',
 '__globals__': [],
 'PL03': array([['Operation', 'Start Date', 'Data'],
        ['Single Cycle', 'December 09, 2014',
         MatlabOpaque([(b'', b'MCOS', b'table', array([3707764736,          2,          1,          1,          1,
                 1], dtype=uint32))],
              dtype=[('s0', 'O'), ('s1', 'O'), ('s2', 'O'), ('arr', 'O')])],
        ['75 Partial Cycles', 'December 10, 2014',
         MatlabOpaque([(b'', b'MCOS', b'table', array([3707764736,          2,          1,          1,          2,
                 1], dtype=uint32))],
              dtype=[('s0', 'O'), ('s1', 'O'), ('s2', 'O'), ('arr', 'O')])],
        ['75 Partial Cycles', 'December 30, 2014',
         MatlabOpaque([(b'', b'MCOS', b'table', array([3707764736,          2,          1,          1,          3,
                 1], dtype=uint32))],
              dtype=[('s0', 'O'), ('s1', 'O'), 

In [83]:
# Check the type of the object
type(mat)

dict

### For the dictionary object, read the keys

In [84]:
mat.keys()

dict_keys(['__header__', '__version__', '__globals__', 'PL03', '__function_workspace__'])

In [85]:
# Get the data in key 'PL03'
PL03 = mat['PL03']

In [86]:
# Get the type of this object
type(PL03)

numpy.ndarray

In [88]:
# Inspect the numpy array, starting from the second element, where all the cycling data is stored
pl03_1 = PL03[1]

In [90]:
pl03_1

array(['Single Cycle', 'December 09, 2014',
       MatlabOpaque([(b'', b'MCOS', b'table', array([3707764736,          2,          1,          1,          1,
                1], dtype=uint32))],
             dtype=[('s0', 'O'), ('s1', 'O'), ('s2', 'O'), ('arr', 'O')])],
      dtype=object)

### This is where we hit a wall. The only hint scipy gives is the name of the object, `MatlabOpaque`, and the documentation offers few details.

> "Subclass to signal this is a matlab opaque matrix" [1](https://www.pydoc.io/pypi/scipy-1.0.0rc2/autoapi/io/matlab/mio5_params/index.html#io.matlab.mio5_params.MatlabOpaque.__new__)

In [92]:
# Trying various functions that are available with this object as listed here https://kite.com/python/docs/scipy.io.matlab.mio5_params.MatlabOpaque
pl03_1.tolist()

['Single Cycle',
 'December 09, 2014',
 MatlabOpaque([(b'', b'MCOS', b'table', array([3707764736,          2,          1,          1,          1,
                 1], dtype=uint32))],
              dtype=[('s0', 'O'), ('s1', 'O'), ('s2', 'O'), ('arr', 'O')])]

### In other words, a dead end, as confirmed here: https://stackoverflow.com/questions/15512560/access-mat-file-containing-matlab-classes-in-python. Some success has been achieved by the author in reading the MATLAB files in Julia, but that seems like a big aside for this project.
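To see at a glance which cells of the loaded array are affected, the entries can be scanned by class name. The sketch below is only an illustration: `find_opaque_cells` is a hypothetical helper (not part of scipy), and it matches on the class name `MatlabOpaque` rather than importing the class, since the import path differs between scipy versions.

```python
def find_opaque_cells(cells, path=()):
    """Return index paths of entries whose class is named 'MatlabOpaque'.

    `cells` is a nested list, e.g. the result of calling .tolist() on the
    object array returned by scipy.io.loadmat.
    """
    hits = []
    for i, item in enumerate(cells):
        if type(item).__name__ == "MatlabOpaque":
            # This cell holds a MATLAB class instance scipy could not decode
            hits.append(path + (i,))
        elif isinstance(item, (list, tuple)):
            # Descend into nested rows
            hits.extend(find_opaque_cells(item, path + (i,)))
    return hits
```

Calling `find_opaque_cells(PL03.tolist())` would list the (row, column) positions of the undecoded tables; in the rows shown above these sit in the third column.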

__________________________________________________

## 2. Trying other Python packages: all result in errors
### 2.1 `mat4py`

In [78]:
from mat4py import loadmat
data = loadmat(join(matdata_dir, 'PL03.mat'))

ParseError: Got type 1, expected 5 (miINT32)

### 2.2 `h5py`

In [93]:
h5py.File(join(matdata_dir, 'PL03.mat'), 'r')

OSError: Unable to open file (file signature not found)

This error arises because `h5py` reads only HDF5-format files. Our file is a MATLAB 5.0 MAT-file, as is evident from the first few lines of the file (the `__header__` shown earlier), so `h5py` cannot read it, as confirmed [here](https://stackoverflow.com/questions/38089950/error-opening-file-in-h5py-file-signature-not-found).
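The format can be checked before picking a reader by looking at the first bytes of the file. This is a minimal sketch, not a full parser: `sniff_mat_format` is a hypothetical helper, and it assumes that a version 5 file begins with the text `MATLAB 5.0 MAT-file` (as in the `__header__` above) and that an HDF5-based file carries the HDF5 signature either at offset 0 or after a 512-byte header block.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def sniff_mat_format(path):
    """Guess the container format of a .mat file from its leading bytes."""
    with open(path, "rb") as fh:
        head = fh.read(520)
    if head.startswith(b"MATLAB 5.0 MAT-file"):
        # Old binary format: scipy.io.loadmat territory, not HDF5
        return "mat-v5"
    if HDF5_SIGNATURE in (head[:8], head[512:520]):
        # HDF5 container: h5py or pytables should be able to open it
        return "hdf5-based"
    return "unknown"
```

For `PL03.mat` this check would be expected to report `mat-v5`, which matches the "file signature not found" error from `h5py`.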

### 2.3 `pytables`

In [94]:
# Trying another package
tables.open_file(join(matdata_dir, 'PL03.mat'))

HDF5ExtError: HDF5 error back trace

  File "H5F.c", line 511, in H5Fopen
    unable to open file
  File "H5Fint.c", line 1604, in H5F_open
    unable to read superblock
  File "H5Fsuper.c", line 413, in H5F__super_read
    file signature not found

End of HDF5 error back trace

Unable to open/create file '/home/ubuntu_cp/uwdirect/BattDeg/data/raw_mat_PL0310/PL03.mat'

Again, this fails because the file is not in HDF5 format. It would have been good if MATLAB provided an online converter for old .mat files, but I did not find one at the time of writing. Another suggestion is to read these files as binary files, as described [here](https://stackoverflow.com/a/26295900/1328232); that seems like gross overkill for this project too, though it is quite interesting.

## 3. Converting the files in MATLAB

#### `The .mat files were successfully converted in MATLAB using the following code`

load mat_file_name  
writetable(old_table_name,'new_csv_file_name.csv')

### `The following code is an example showing how the tables in the .mat files can be converted to csv files`

#### Example: Converting a table named PL03{2,3} in the file PL03 to PL03(2).csv
load PL03  
writetable(PL03{2,3},'PL03(2).csv')
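
Once the CSV files exist, they can be read back in Python without any MATLAB-specific tooling. The sketch below uses only the standard library and assumes that `writetable` wrote the table's variable names as the first row (its default behaviour); `read_converted_csv` is a hypothetical helper, and all values come back as strings.

```python
import csv


def read_converted_csv(path):
    """Read a CSV written by MATLAB's writetable.

    Returns (column_names, rows), where rows is a list of dicts keyed by
    column name. Values are left as strings; convert them as needed.
    """
    with open(path, newline="") as fh:
        reader = csv.DictReader(fh)
        rows = list(reader)
        return reader.fieldnames, rows
```

For example, `columns, rows = read_converted_csv(join(matdata_dir, 'PL03(2).csv'))` would load the table converted above, assuming the CSV was saved in `matdata_dir`. If pandas is available, `pandas.read_csv` is a natural alternative that also infers numeric types.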