- This brain tumor dataset contains 3064 T1-weighted contrast-enhanced MRI (T1w CE-MRI) images from 233 patients with three kinds of brain tumor:
Sr. No. | Tumor Name | Number of Observations |
---|---|---|
1 | Meningioma | 708 |
2 | Glioma | 1426 |
3 | Pituitary Tumor | 930 |
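
As a quick sanity check on the table above, the per-class counts can be summed and turned into class shares. This is only an illustrative sketch: the dictionary keys are names chosen here, and the numbers are copied from the table.

```python
# Per-class image counts, copied from the table above.
counts = {"meningioma": 708, "glioma": 1426, "pituitary_tumor": 930}

total = sum(counts.values())  # should match the 3064 images stated above
shares = {name: n / total for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {counts[name]} images ({share:.1%})")
```

The classes are not balanced (glioma makes up a little under half of the images), which may matter when splitting the data or choosing evaluation metrics.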
- This dataset is organized in MATLAB data (`.mat` file) format, one of the data-exchange formats used by MATLAB. Each `.mat` file stores a `struct` containing the following fields for an MRI image:
cjdata.label: 1 for meningioma, 2 for glioma, 3 for pituitary tumor
cjdata.PID: patient ID
cjdata.image: image data
cjdata.tumorBorder: a vector storing the coordinates of discrete points on the tumor border.
For example, [x1, y1, x2, y2, ...], in which x1, y1 are the planar coordinates of one point on the tumor border.
It was generated by manually delineating the tumor border, so it can be used to generate
a binary image of the tumor mask.
cjdata.tumorMask: a binary image with 1s indicating the tumor region
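
The `tumorBorder` description says the border points can be turned into a binary mask. A minimal pure-Python sketch of that idea follows, using the even-odd (ray casting) rule to fill the polygon. The function names are hypothetical, and the sketch assumes the flat `[x1, y1, x2, y2, ...]` ordering described above, with x as the column index and y as the row index. It is not taken from the dataset authors' code, and the dataset already ships a ready-made `cjdata.tumorMask`.

```python
def border_to_vertices(border):
    """Group a flat [x1, y1, x2, y2, ...] sequence into (x, y) pairs."""
    if len(border) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    return [(float(border[i]), float(border[i + 1]))
            for i in range(0, len(border), 2)]


def point_in_polygon(x, y, vertices):
    """Even-odd (ray casting) test of a point against a closed polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def border_to_mask(border, height, width):
    """Rasterize the border polygon into a height x width mask of 0s and 1s."""
    vertices = border_to_vertices(border)
    return [[1 if point_in_polygon(col, row, vertices) else 0
             for col in range(width)]
            for row in range(height)]


# Toy example: a small square border on a 6 x 6 grid (not real dataset values).
toy_mask = border_to_mask([1, 1, 4, 1, 4, 4, 1, 4], height=6, width=6)
```

For full-size images a vectorized polygon-fill routine from an image library would be much faster. Pixels lying exactly on the border may fall on either side with this rule, so the result can differ slightly from `cjdata.tumorMask`.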
- The Jupyter notebook `preprocessing_mat_files.ipynb` is also provided. It contains Python code that extracts this information using the `h5py` library, which provides a `File` class to open and process `.mat` files.
- The following function processes a `.mat` file, which `h5py` exposes as a dictionary-like `File` instance, and extracts the image-related information into a plain dictionary. The rest of the source code calls this function on each `.mat` file and then exports the extracted images into directories created per tumor class:
import h5py


def mat_file_to_dict(filepath: str) -> dict:
    # Map the numeric label stored in the file to a class name.
    tumor_class = {1: 'meningioma', 2: 'glioma', 3: 'pituitary_tumor'}
    tumor_data_dict = {}
    with h5py.File(filepath, mode='r') as image_data:
        cjdata_struct = image_data['cjdata']
        # 'label' is stored as a 1x1 array, so take its single element.
        tumor_data_dict['class'] = tumor_class[int(cjdata_struct['label'][0, 0])]
        # MATLAB stores arrays column-major, so 2-D arrays appear transposed
        # when read through h5py; transpose them back.
        tumor_data_dict['image'] = cjdata_struct['image'][:].transpose()
        tumor_data_dict['tumor_border'] = cjdata_struct['tumorBorder'][0]
        tumor_data_dict['tumor_mask'] = cjdata_struct['tumorMask'][:].transpose()
    return tumor_data_dict
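
The export step mentioned above (one directory per tumor class) is not shown here, so the following is only a sketch of how the destination paths might be built. The helper name `class_output_path`, the `.png` suffix, and the index-based file name are assumptions made for illustration, not the notebook's actual layout.

```python
from pathlib import Path


def class_output_path(root, tumor_class, index, suffix=".png"):
    """Return root/<tumor_class>/<index><suffix>, creating the class directory.

    Hypothetical helper: the real notebook may name files differently.
    """
    class_dir = Path(root) / tumor_class
    # Create the per-class directory on first use; do nothing if it exists.
    class_dir.mkdir(parents=True, exist_ok=True)
    return class_dir / f"{index}{suffix}"
```

Given a dictionary `d` returned by `mat_file_to_dict`, `class_output_path(root, d['class'], i)` would give the destination for the i-th image. Writing the pixel data to that path is left to whichever image library is used.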
- The dataset was used in the following research papers:
NOTE: MATLAB source code is available in the repository: https://github.com/chengjun583/brainTumorRetrieval