This project aims to support the analysis of photocatalysis data through a high-throughput approach, generating datasets that can be analyzed using machine learning methods. The project’s workflow is inspired by the paper "High-Throughput Photocatalysis for Generating Reliable Datasets Analyzed by Machine Learning" and is organized into modules for data management, machine learning analysis, and visualization.
The project is structured into the following main directories and files:
- Database_generator/: Handles data generation and machine learning processes.
  - Data/: Stores the experimental data generated through high-throughput photocatalysis experiments.
  - experiment.ipynb: The main Jupyter notebook for running machine learning analyses on the experimental data. It preprocesses the data, trains models, and evaluates their performance on the generated datasets.
- Draw_Heatmaps/: Contains resources for visual analysis of results through heatmaps.
  - draw_heatmaps.ipynb: A Jupyter notebook dedicated to creating heatmaps that visualize trends and insights from the machine learning analysis results.
- Python 3.9 or 3.10 and the necessary packages
- Package requirements are listed in the requirements.txt file
- Clone the repository.
    git clone <repository-url>
    cd <repository-directory>
- Install dependencies.
    pip install -r requirements.txt
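After installing, it can be useful to confirm that the environment matches the prerequisites. The following is a minimal sketch and not part of the repository: the helper names (`python_is_supported`, `requirement_names`, `missing_packages`) are hypothetical, and the parser assumes simple requirement lines such as `name==version`, so more elaborate entries (URLs, editable installs) are outside its scope.

```python
import sys
from importlib import metadata
from pathlib import Path

# Python versions named in the prerequisites.
SUPPORTED = {(3, 9), (3, 10)}


def python_is_supported(version_info=sys.version_info):
    """Return True if the major.minor version is one the README lists."""
    return (version_info[0], version_info[1]) in SUPPORTED


def requirement_names(text):
    """Extract bare package names from requirements.txt-style text.

    Assumption: each entry is a plain name with an optional version
    specifier; comments, blank lines, and option lines are skipped.
    """
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        # Cut the line at the first specifier, extras, or marker character.
        for sep in ("==", ">=", "<=", "~=", "!=", ">", "<", "[", ";", " "):
            line = line.split(sep, 1)[0]
        names.append(line.strip())
    return names


def missing_packages(names):
    """Return the names that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    if not python_is_supported():
        print("Warning: this project lists Python 3.9/3.10 as prerequisites.")
    req_file = Path("requirements.txt")
    if req_file.exists():
        absent = missing_packages(requirement_names(req_file.read_text()))
        print("Missing packages:", absent or "none")
```

Run from the repository root, the script only reports what it finds; it does not install or modify anything.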
- Open and run Database_generator/experiment.ipynb to process the high-throughput photocatalysis data stored in Database_generator/Data/. This notebook loads the data, preprocesses it, and runs machine learning analyses to derive insights from the photocatalysis experiments.
- After running the machine learning analysis, open Draw_Heatmaps/draw_heatmaps.ipynb. This notebook creates heatmaps from the generated data and model predictions, giving a visual representation of patterns that aids further analysis.
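To illustrate the last step, the sketch below shows one way flat prediction records could be arranged into a grid suitable for a heatmap. It is not taken from draw_heatmaps.ipynb: the function name `build_heatmap_grid` and the field names `catalyst`, `condition`, and `predicted` are hypothetical, and averaging repeated measurements per cell is an assumption made for the example.

```python
def build_heatmap_grid(records, row_key, col_key, value_key):
    """Pivot flat records into (row_labels, col_labels, grid).

    Each grid cell holds the mean of value_key over all records that
    share the same row and column labels; cells with no data are None.
    """
    row_labels = sorted({r[row_key] for r in records})
    col_labels = sorted({r[col_key] for r in records})
    row_index = {label: i for i, label in enumerate(row_labels)}
    col_index = {label: j for j, label in enumerate(col_labels)}

    # Accumulate sums and counts per cell so repeats can be averaged.
    sums = [[0.0] * len(col_labels) for _ in row_labels]
    counts = [[0] * len(col_labels) for _ in row_labels]
    for r in records:
        i, j = row_index[r[row_key]], col_index[r[col_key]]
        sums[i][j] += r[value_key]
        counts[i][j] += 1

    grid = [
        [sums[i][j] / counts[i][j] if counts[i][j] else None
         for j in range(len(col_labels))]
        for i in range(len(row_labels))
    ]
    return row_labels, col_labels, grid


# Hypothetical prediction records, for illustration only.
example_records = [
    {"catalyst": "A", "condition": "c1", "predicted": 0.5},
    {"catalyst": "A", "condition": "c1", "predicted": 1.0},
    {"catalyst": "A", "condition": "c2", "predicted": 0.25},
    {"catalyst": "B", "condition": "c1", "predicted": 0.75},
]

rows, cols, grid = build_heatmap_grid(
    example_records, "catalyst", "condition", "predicted"
)
print(rows, cols, grid)
```

The resulting grid can then be handed to a plotting routine such as matplotlib's `imshow`, after converting the `None` cells to NaN so that missing combinations are rendered as gaps rather than zeros.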
This project supports the generation of reliable photocatalysis datasets that can be effectively analyzed by machine learning algorithms. By integrating high-throughput experimental data with computational methods, this approach aims to facilitate discoveries in photocatalysis, streamline data analysis, and enhance the robustness of data-driven conclusions.