Automated turbulent structure identification charter
Turbulence dominates many flows of engineering interest, regulating mixing, heat transfer, and drag on vehicles. Although turbulence is hard to define concisely, it is noisy and stochastic, and it contains eddies across a wide range of scales that combine into chaotic motion that is difficult to decipher. Relatively recently, however, it has been discovered that turbulent flows can be decomposed into sets of coherent structures that explain many earlier statistical observations. Such structures can be identified by eye, but that identification is subjective, and new measurement and simulation techniques now produce so much data that manual methods are impractical. Automated methods for detecting and analyzing turbulent structures are needed to enable more in-depth statistical analysis. Many important questions about the nature of coherent structures also remain open: the full range of structure geometries, their prevalence, and the interactions between them are poorly understood.
We aim to develop a new tool for automated detection and analysis of turbulent structures through the application of established machine learning (ML) techniques. Leveraging advances in machine learning for the study of turbulent boundary layer structure will standardize the recognition of turbulent structures and allow for significantly greater yield from the largest and most modern computational and experimental datasets, some of which can require a terabyte of storage just to describe a single snapshot of the flow.
The primary objective of this project will be to establish the most appropriate ML tool for the study of turbulent vector fields and then to apply it to sets of experimental turbulent boundary layer data. This work will be conducted in Python with available libraries.
Over the course of the incubator we will implement ML routines to satisfy the following criteria:
- A major difficulty will be identifying structures across a wide range of physical scales and flow velocities, since turbulent structures may be topologically similar but convect at different speeds and have a range of sizes. As such, preference will be given to routines that offer some form of scale invariance.
- Dimensionality reduction techniques such as PCA (POD) or LLE can be leveraged to identify a set of topological modes that may separate different structures into groups, which can then be used to train an ML algorithm. These methods may also identify or classify a range of interesting structures on their own.
- Supervised learning algorithms can then be trained using sets of identified structures for the efficient tagging of flow topologies in large datasets.
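The last two criteria can be sketched as a single scikit-learn pipeline: reduce each flattened velocity patch with PCA, then train a classifier on the reduced coordinates. This is only an illustration under stated assumptions. The two synthetic patch types (a decaying swirl and a uniform shear), the helper `synthetic_patch`, the patch size, the number of components, and the choice of a k-nearest-neighbors classifier are all hypothetical stand-ins and not results or decisions from this project.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
grid = np.linspace(-1, 1, 16)
gx, gy = np.meshgrid(grid, grid)

def synthetic_patch(label):
    """Hypothetical stand-in for a labeled 2D velocity patch (u, v flattened)."""
    if label == 1:
        # Swirling motion that decays away from the patch center
        envelope = np.exp(-(gx**2 + gy**2) / 0.3)
        u, v = -gy * envelope, gx * envelope
    else:
        # Uniform shear with no swirl
        u, v = gy.copy(), np.zeros_like(gx)
    field = np.concatenate([u.ravel(), v.ravel()])
    return field + 0.1 * rng.normal(size=field.size)

labels = rng.integers(0, 2, size=200)
features = np.array([synthetic_patch(label) for label in labels])

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=0
)

# PCA compresses each 512-value patch to 5 coefficients before classification
model = make_pipeline(PCA(n_components=5), KNeighborsClassifier(n_neighbors=5))
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
```

With real data, the labels would come from manually or semi-automatically identified structures rather than from a generator function.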
The success of this project will be determined through the automated identification of turbulent structures in either of the following ways:
- The identification of physically relevant structural modes using dimensionality reduction techniques.
- The automated tagging of hairpin vortices in any two-dimensional vector field using a trained algorithm or a sequence of processing steps.
Much of this work depends on initial test results, so the plan is less detailed toward the end of the project.
First Half - Data exploration with unsupervised learning
Week 3: POD on small snapshots of data localized on potential hairpins or on randomized subsets.
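As a minimal sketch of the Week 3 step, POD of a set of snapshots can be computed from the singular value decomposition of the mean-subtracted snapshot matrix. The function name `pod_modes` and the synthetic data (two known spatial patterns plus weak noise) are assumptions made for illustration; real input would be flattened velocity patches.

```python
import numpy as np

def pod_modes(snapshots, n_modes):
    """Return the mean field, leading POD modes, and their energy fractions.

    snapshots has shape (n_snapshots, n_points); each row is one flattened patch.
    """
    mean = snapshots.mean(axis=0)
    fluctuations = snapshots - mean
    # Thin SVD: the rows of vt are orthonormal spatial modes,
    # ordered by decreasing singular value
    _, s, vt = np.linalg.svd(fluctuations, full_matrices=False)
    energy = s**2 / np.sum(s**2)
    return mean, vt[:n_modes], energy[:n_modes]

# Synthetic stand-in data: random mixtures of two spatial patterns
rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 64)
patterns = np.stack([np.sin(x), np.cos(2 * x)])
coeffs = rng.normal(size=(200, 2)) * np.array([3.0, 1.0])
snapshots = coeffs @ patterns + 0.01 * rng.normal(size=(200, 64))

mean, modes, energy = pod_modes(snapshots, n_modes=2)
```

The energy fractions indicate how many modes are needed to represent the data, which is one way to judge whether a small set of modes is physically meaningful.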
Week 4: LLE on the same data as Week 3. The aim is to try different processing options and compare the results.
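A possible starting point for the Week 4 experiments is scikit-learn's `LocallyLinearEmbedding`. The sketch below assumes synthetic patches (a Gaussian bump whose position varies from sample to sample), which vary nonlinearly in pixel space; the neighbor count and the number of components are arbitrary choices to be tuned, not recommendations.

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

rng = np.random.default_rng(0)

# Synthetic stand-in for flattened patches: a bump at a random position t
t = rng.uniform(0, 1, size=200)
x = np.linspace(0, 1, 50)
patches = np.exp(-((x[None, :] - t[:, None]) ** 2) / 0.02)

# n_neighbors and n_components are the main processing options to compare
lle = LocallyLinearEmbedding(n_neighbors=12, n_components=2, eigen_solver="dense")
embedding = lle.fit_transform(patches)
```

Running the same patches through both POD and LLE gives two sets of low-dimensional coordinates that can be compared directly, for example by plotting each against a known property of the patches.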
Week 5: Explore multi-scale features of the flow and the ability to capture them. Examine the inclusion of scale-independent feature representation options (re-interpolation, Fourier modes, or wavelets) in order to improve classification of the structures identified in Weeks 3 & 4.
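Two of the options named for Week 5, re-interpolation and Fourier modes, can be combined in a short sketch: resample every patch onto a fixed grid so that physical size drops out, normalize its amplitude so that velocity magnitude drops out, and keep Fourier amplitudes, which are insensitive to periodic shifts of the patch. The function names, the 16 x 16 target grid, and the Gaussian test patch are assumptions for illustration only.

```python
import numpy as np

def resample_patch(patch, size=16):
    """Bilinearly re-interpolate a 2D patch onto a fixed size x size grid."""
    ny, nx = patch.shape
    yi = np.linspace(0, ny - 1, size)
    xi = np.linspace(0, nx - 1, size)
    y0 = np.floor(yi).astype(int)
    x0 = np.floor(xi).astype(int)
    y1 = np.minimum(y0 + 1, ny - 1)
    x1 = np.minimum(x0 + 1, nx - 1)
    wy = (yi - y0)[:, None]
    wx = (xi - x0)[None, :]
    top = patch[np.ix_(y0, x0)] * (1 - wx) + patch[np.ix_(y0, x1)] * wx
    bottom = patch[np.ix_(y1, x0)] * (1 - wx) + patch[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bottom * wy

def scale_normalized_features(patch, size=16):
    """Fixed-length features that ignore patch resolution and amplitude."""
    resampled = resample_patch(patch, size)
    resampled = resampled - resampled.mean()
    norm = np.linalg.norm(resampled)
    if norm > 0:
        resampled = resampled / norm
    # Fourier amplitudes discard phase, and with it periodic translations
    return np.abs(np.fft.fft2(resampled)).ravel()

def gaussian_patch(n, amplitude):
    """Hypothetical test structure sampled on an n x n grid."""
    g = np.linspace(-1, 1, n)
    gx, gy = np.meshgrid(g, g)
    return amplitude * np.exp(-(gx**2 + gy**2) / 0.2)

# The same structure at two resolutions and two amplitudes
features_small = scale_normalized_features(gaussian_patch(32, 1.0))
features_large = scale_normalized_features(gaussian_patch(64, 2.5))
relative_difference = (
    np.linalg.norm(features_small - features_large) / np.linalg.norm(features_small)
)
```

A small `relative_difference` means the two patches map to nearly the same feature vector despite their different sizes and magnitudes.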
Second Half - Data classification pipeline
Weeks 6-10: Use what we have learned about feature identification to set up a classification pipeline and compute statistics of the identified structures.
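One way such a pipeline might be organized is a sliding window that passes each patch of a large field to a classifier and then aggregates the tags. In the sketch below, `placeholder_classifier` is a hypothetical threshold rule standing in for whatever trained model the first half of the project produces, and the window size, stride, and planted test structure are likewise assumptions.

```python
import numpy as np
from collections import Counter

def tag_field(field, classify, window=16, stride=8):
    """Slide a window over a 2D field and record (row, col, label) per patch."""
    tags = []
    ny, nx = field.shape
    for i in range(0, ny - window + 1, stride):
        for j in range(0, nx - window + 1, stride):
            patch = field[i:i + window, j:j + window]
            tags.append((i, j, classify(patch)))
    return tags

def placeholder_classifier(patch):
    # Hypothetical stand-in for a trained model: flag strong mean signal
    return "structure" if np.abs(patch).mean() > 1.5 else "background"

# Synthetic field: weak noise with one planted high-amplitude region
rng = np.random.default_rng(0)
field = 0.1 * rng.normal(size=(64, 64))
field[16:32, 16:32] += 2.0

tags = tag_field(field, placeholder_classifier)
counts = Counter(label for _, _, label in tags)
structure_fraction = counts["structure"] / len(tags)
```

The list of tags keeps the location of every patch, so statistics such as structure counts, spacing, or position can be computed afterward without re-running the classifier.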