MEEG-44403/54403 Machine Learning for Mechanical Engineers at the University of Arkansas

hanhuark/MEEG-54403

MEEG-44403/54403: Machine Learning for Mechanical Engineers

Instructor: Han Hu

Course Description:

Overview:

This course introduces supervised and unsupervised learning algorithms for engineering applications, such as visualization-based physical quantity prediction, dynamic signal classification and prediction, data-driven control of dynamical systems, surrogate modeling, and dimensionality reduction. The lectures cover the fundamental concepts and examples of developing machine learning models using Python and MATLAB. The course includes four homework assignments, which apply different machine learning algorithms to specific mechanical engineering problems, and a project assignment that lets students select their own topics to study using designated machine learning tools.

The overarching goal of this course is to equip mechanical engineers with machine learning skills and deepen the integration of data science into the mechanical engineering curriculum. Compared to machine learning courses offered by computer science and data science programs, this course has a much stronger focus on integration with mechanical engineering problems. Students are provided with concrete, specific engineering problems with experimental data. The projects, presentations, and in-class peer review practice are designed to foster students’ professional skills following the National Association of Colleges and Employers (NACE) competencies, including critical thinking, communication, teamwork, technology, leadership, and professionalism.

Graduate students are required to complete an extra assignment (selected from three provided options) and a supercomputing assignment.

Learning Objectives:

Students completing this course are expected to be able to:
• Develop, train, and test machine learning models using Python/TensorFlow and MATLAB
• Develop machine learning models for image classification and clustering
• Perform data dimensionality reduction for physics extraction
• Analyze images/maps from experiments and simulations to predict physical quantities
• Adapt trained machine learning models to new applications
• Analyze time series for classification and regression
• Develop surrogate models for computationally expensive numerical simulations
• Benchmark the scalability of machine learning models on CPU and GPU clusters
• Develop complex machine learning models by integrating two or more mechanisms in tandem

Textbook:

Steven L. Brunton and J. Nathan Kutz, Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control, 1st ed., Cambridge University Press, 2019

Software Packages:

Python Packages:

  • TensorFlow
  • PyTorch
  • NumPy
  • SciPy
  • scikit-learn
  • Keras
  • Pandas
  • Matplotlib
  • Seaborn
  • OpenCV

MATLAB and Toolboxes:

Assignment 1 - Regression

In pool boiling experiments, the boiling heat flux can be estimated as the supplied power divided by the heater surface area. However, this estimate is not very accurate because of heat loss and other non-ideal conditions in experiments, especially for thin-film heaters with relatively low thermal conductivities (e.g., ITO heaters). Conventionally, finite-element simulations are used to evaluate the heat loss to validate or correct the experimental assumptions. Machine learning provides another perspective for tackling this issue: the heat loss and other non-ideal conditions can be captured and accounted for by the hidden layers of neural networks. The target of Problem 1-1 is to develop a multilayer perceptron (MLP) model to predict heat flux from temperature. The data set includes the temperature and the heat flux during a transient pool boiling test.

a. Set up and train an MLP and a Gaussian process regression (GPR) model to predict the heat flux based on the temperature. Report the training curves (training/validation accuracy/loss vs. epoch) and the training time (time/epoch, time till the best model).
b. Mitigate the effects of overfitting using k-fold cross-validation (e.g., using 100 folds).
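The k-fold procedure in part (b) can be sketched as follows. This is a minimal illustration, not the course solution: the data are synthetic, a cubic polynomial fit stands in for the MLP/GPR models, and the names `kfold_indices` and `cross_validate` are hypothetical helpers introduced here.

```python
import numpy as np

def kfold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    # Shuffle once, then cut the shuffled indices into k nearly equal folds
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

def cross_validate(x, y, k=5, degree=3):
    """Return the validation RMSE of each fold for a polynomial stand-in model."""
    rmse = []
    for train_idx, val_idx in kfold_indices(len(x), k):
        # Fit on the training folds only, then score on the held-out fold
        coeffs = np.polyfit(x[train_idx], y[train_idx], degree)
        pred = np.polyval(coeffs, x[val_idx])
        rmse.append(np.sqrt(np.mean((pred - y[val_idx]) ** 2)))
    return np.array(rmse)

# Synthetic stand-in data (NOT the course data set): temperature -> heat flux
rng = np.random.default_rng(42)
temperature = np.linspace(0.0, 1.0, 200)  # normalized temperature
heat_flux = 2.0 * temperature**2 + 0.05 * rng.standard_normal(200)

fold_rmse = cross_validate(temperature, heat_flux, k=5)
print(f"RMSE per fold: {fold_rmse.round(3)}")
print(f"mean +/- std: {fold_rmse.mean():.3f} +/- {fold_rmse.std():.3f}")
```

Each sample is used for validation exactly once, so the spread of the per-fold errors indicates how sensitive the model is to the particular train/validation split.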

Assignment 2 - Image Classification

Pool boiling is a heat transfer mechanism that dissipates a large amount of heat with minimal temperature increase by taking advantage of the high latent heat of the working fluid. As such, boiling has been widely implemented in the thermal management of high-power-density systems, e.g., nuclear reactors, power electronics, and jet engines. The critical heat flux (CHF) condition is a practical limit of pool boiling heat transfer. When CHF is triggered, the heater surface temperature ramps up rapidly (~150 °C/min), leading to detrimental device failure. There is growing research interest in predicting CHF based on boiling images. Under the directory /ocean/projects/mch210006p/shared/HW1/classification, there are two folders, “pre-CHF” and “post-CHF”, that contain pool boiling images before and after CHF is triggered, respectively. The target of this problem is to develop a machine learning model to classify the boiling regime (pre- or post-CHF) based on boiling images.

a. Split the data set into training, validation, and testing subsets. This can be done before training with a separate package, “split-folders”, or directly in the code during training.
b. Set up and train a model to classify the pre-CHF and post-CHF images. Report the training curves (training/validation accuracy/loss vs. epoch) and the training time (time/epoch, time till the best model). Use EarlyStopping for fast convergence.
c. Test the model using the reserved test data and report the confusion matrix, accuracy, precision, recall, F1 score, receiver operating characteristic (ROC) curve, and area under the curve (AUC).
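The metrics requested in part (c) can be computed directly from the test labels and model scores. The sketch below is illustrative only: the labels and scores are made up, the convention 1 = post-CHF and 0 = pre-CHF is an assumption, and `binary_metrics` and `roc_auc` are hypothetical helper names.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Confusion matrix and derived metrics for labels in {0, 1}."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "confusion_matrix": np.array([[tn, fp], [fn, tp]]),  # rows: true, cols: predicted
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]  # all positive/negative score pairs
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

# Made-up test labels (assumed: 1 = post-CHF, 0 = pre-CHF) and model scores
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.6, 0.7, 0.9])
y_pred = (scores >= 0.5).astype(int)  # threshold the scores at 0.5

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics["confusion_matrix"])
print({k: v for k, v in metrics.items() if k != "confusion_matrix"}, "AUC:", auc)
```

In practice, `sklearn.metrics` offers equivalent functions (`confusion_matrix`, `precision_score`, `recall_score`, `f1_score`, `roc_curve`, `roc_auc_score`); the hand-written version only makes the definitions explicit.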

Assignment 3 - Dimensionality Reduction and Clustering

Run dimensionality reduction and clustering analysis of the same dataset used in HW-2.

(a) Run singular value decomposition (SVD) or principal component analysis (PCA) of the images and plot the percentage of explained variance vs. the number of principal components (PCs).
(b) Pick a representative image, run PCA, and plot the reconstructed images using different numbers of PCs (e.g., using PC 1, PCs 1-2, PCs 1-10, PCs 1-20, etc.).
(c) Calculate the error of the reconstructed images relative to the original image and plot the error as a function of the number of PCs.
(d) Run a clustering analysis of the boiling images using the PCs (the number of PCs to use is up to you) and evaluate the clustering results.
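Parts (a) through (c) can be sketched with NumPy's SVD. This is a minimal illustration under stated assumptions: the "images" are synthetic random data with five hidden modes rather than the boiling images, each image is flattened into one row, and `reconstruct` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the image set: 60 flattened "images" of 16 x 16 pixels
n_images, height, width = 60, 16, 16
latent = rng.standard_normal((n_images, 5))       # 5 hidden modes
modes = rng.standard_normal((5, height * width))
images = latent @ modes + 0.01 * rng.standard_normal((n_images, height * width))

# PCA via SVD of the mean-centered data matrix (rows = images)
mean_image = images.mean(axis=0)
U, S, Vt = np.linalg.svd(images - mean_image, full_matrices=False)

# (a) Fraction of variance explained by each PC, and its running total
explained = S**2 / np.sum(S**2)
cumulative = np.cumsum(explained)

def reconstruct(image, n_pcs):
    """Project an image onto the first n_pcs PCs and map it back to pixel space."""
    scores = (image - mean_image) @ Vt[:n_pcs].T
    return mean_image + scores @ Vt[:n_pcs]

# (b)-(c) Relative reconstruction error of one image vs. number of PCs
target = images[0]
pc_counts = [1, 2, 5, 10, 20]
errors = [
    np.linalg.norm(target - reconstruct(target, k)) / np.linalg.norm(target)
    for k in pc_counts
]

print("cumulative explained variance (first 5 PCs):", cumulative[:5].round(4))
for k, err in zip(pc_counts, errors):
    print(f"PCs 1-{k}: relative error = {err:.4f}")
```

Reshaping `reconstruct(target, k)` to `(height, width)` gives the image to plot for part (b); the PC scores `(images - mean_image) @ Vt[:k].T` are the features one would pass to a clustering algorithm in part (d).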

Assignment 4 - Time Series Regression

The data file vapor_fraction.txt includes the vapor fraction (second column, dimensionless) vs. time (first column, unit: ms) of the boiling image sequences. The data are sampled with a frequency of 3,000 Hz (namely, a time step of 0.33 ms). Develop a recurrent neural network (RNN) model to forecast the vapor fraction of future frames based on the past frames, e.g., predicting the vapor fraction profile of t = 33.33 ms – 66 ms using the vapor fraction history of t = 0.33 – 33 ms. Options include regular RNN, bidirectional RNN, gated recurrent unit (GRU), bidirectional GRU, long short-term memory (LSTM), and bidirectional LSTM.

(a) Develop a baseline model with an input sequence length of 16.33 ms (50 data points) and an output sequence length of 16.33 ms (50 data points). Plot the model-predicted signal vs. the true signal.
(b) Vary the input and output sequence lengths to evaluate their effect on the error of the model predictions.
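Before any RNN can be trained, the signal has to be cut into paired input/output windows. The sketch below shows one way to do this for the baseline in part (a), assuming a stride of one sample between windows; the sine wave is a synthetic stand-in for the vapor fraction column, and `make_windows` is a hypothetical helper.

```python
import numpy as np

def make_windows(series, n_in, n_out):
    """Slice a 1-D series into overlapping (input, output) window pairs."""
    series = np.asarray(series, dtype=float)
    n_windows = len(series) - n_in - n_out + 1
    if n_windows < 1:
        raise ValueError("series is shorter than n_in + n_out")
    starts = np.arange(n_windows)[:, None]            # first index of each window
    X = series[starts + np.arange(n_in)]              # past frames
    y = series[starts + n_in + np.arange(n_out)]      # frames to forecast
    return X, y

sampling_rate_hz = 3000
dt_ms = 1000 / sampling_rate_hz                       # about 0.33 ms per sample

# Synthetic stand-in signal (NOT the course data): 600 samples of a 50 Hz sine
signal = np.sin(2 * np.pi * 50 * np.arange(600) / sampling_rate_hz)

X, y = make_windows(signal, n_in=50, n_out=50)
print(f"time step: {dt_ms:.2f} ms")
print(X.shape, y.shape)  # (501, 50) (501, 50)
```

Keras recurrent layers expect three-dimensional input of shape (samples, time steps, features), so `X[..., np.newaxis]` would be the array passed to the model. For part (b), only `n_in` and `n_out` need to change.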

Extra Assignment 1 - Image Classification using PCA-MLP

Re-do the image classification problem in HW-2 using PCA-MLP. Run SVD or PCA to obtain the PCs of the images. Feed the PCs to an MLP neural network to classify the regime of the boiling images.

Extra Assignment 2 - Image Regression

The vapor fraction values (second column) in the data file vapor_fraction.txt are the labels of the images in the folder images. Train a convolutional neural network (CNN) model to predict the vapor fraction of the images and compare the model predictions against the true data.

Extra Assignment 3 - Sequence-to-Sequence Prediction

The image dataset represents a boiling image sequence under transient heat loads. The images have a frame rate of 1,000 fps (or a time step of 1 ms). Run PCA to obtain the PC profiles of the image sequence. Feed the extracted PC profiles to an RNN model to forecast the PCs of future frames. Reconstruct image sequences using the predicted PCs and compare the reconstructed images against the true images. The recommended RNN models are LSTM and BiLSTM.

Note:

  1. Use an input vector length of 100 and an output vector length of 100 for the model.
  2. Downsample the image sequence (e.g., reducing the frame rate from 1,000 fps to 500 fps) in case of memory issues.
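The data flow around the RNN (downsampling, PC profile extraction, and image reconstruction from PC profiles) can be sketched as below. This is only an outline under stated assumptions: the frames are synthetic rank-3 data, the number of PCs is chosen arbitrarily, and the RNN itself is omitted, with the true PC profiles standing in as a placeholder for its predictions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in: 400 flattened frames of 12 x 12 pixels at 1,000 fps
n_frames, height, width = 400, 12, 12
t = np.arange(n_frames) / 1000.0                       # time in seconds
modes = rng.standard_normal((3, height * width))
amplitudes = np.stack(
    [np.sin(2 * np.pi * 5 * t), np.cos(2 * np.pi * 9 * t), t], axis=1
)
frames = amplitudes @ modes

# Note 2: keep every second frame (1,000 fps -> 500 fps) to save memory
frames_ds = frames[::2]

# PCA via SVD; each column of pc_profiles is the time profile of one PC
mean_frame = frames_ds.mean(axis=0)
U, S, Vt = np.linalg.svd(frames_ds - mean_frame, full_matrices=False)
n_pcs = 3                                              # arbitrary choice for this sketch
pc_profiles = (frames_ds - mean_frame) @ Vt[:n_pcs].T  # shape (n_frames_ds, n_pcs)

# Note 1: 100 past steps in, 100 future steps out
history = pc_profiles[:100]       # what the RNN would receive
true_future = frames_ds[100:200]  # frames the forecast is compared against

# PLACEHOLDER for the RNN output: the true future PC profiles are used here
predicted_profiles = pc_profiles[100:200]

# Map the (predicted) PC profiles back to image space and compare
reconstructed = mean_frame + predicted_profiles @ Vt[:n_pcs]
error = np.linalg.norm(reconstructed - true_future) / np.linalg.norm(true_future)
print(frames_ds.shape, pc_profiles.shape, f"relative error = {error:.2e}")
```

With a trained model, `predicted_profiles` would come from the RNN, and the error would then combine the forecasting error with the truncation error from keeping only `n_pcs` components.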

Tutorials for Assignments (MATLAB and Python):

Developer: Najee Stubbs

Acknowledgments:

The initial development of the course was supported by the Department of Mechanical Engineering at the University of Arkansas (led by Darin Nutter) and the Arkansas NSF EPSCoR Data Analytics that are Robust and Trusted (DART) Project (led by Jennifer Fowler). The experimental datasets used in the assignments were prepared by Hari Pandey, and the development of the original course content was supported by Connor Heo and Christy Dunlap. The course syllabi and content were updated following comments from the Department of Mechanical Engineering Curriculum Committee (chaired by Steve Tung) and from students enrolled in this course from 2021 to 2023. The tutorials for the assignments were developed by Najee Stubbs with the support of a gift from the MathWorks Curriculum Development Support program (organized by Mehdi Vahab).

Publications:

A. Publications consisting of course projects/assignments

B. Educational papers