- When: 10am - 10:50am, Monday and Wednesday
- Where: Steward 204 (or N305 if needed)
This is a graduate-level elective course that aims to connect astronomical data analysis problems with modern statistical methods. Modern astronomy and astrophysics are undergoing a revolution, with dramatic increases in both the volume and complexity of astronomical data. The last decade saw the emergence of many terabyte-level sky surveys across the electromagnetic spectrum; in the next decade, data volumes will enter the petabyte regime, with an ever stronger time-domain component. These new data sets represent quantum leaps in our ability to make new astronomical discoveries, but they also present significant challenges to the standard analysis tools normally employed in astronomy.
The goal of this course is to bridge the gap between modern large data surveys and the data analysis tools taught in standard graduate courses. The course will start with a brief review of the modern statistical framework relevant to large-scale data analysis, including probability and statistical distributions, and classical and Bayesian statistical inference. It will then cover the main topics of the course, data mining and machine learning, including density estimation, clustering analysis, dimensionality reduction, regression and model fitting, classification, and time series analysis. Another key component of the course is an introduction to commonly used data mining and machine learning tools, in the form of Python-based packages, which will be used to solve data problems throughout the course.
The course will be taught through a combination of instructor lectures, student-led seminars, and guest lectures. After the first section of introductory material, students will lead the discussion and demonstration of most topics, and guest lectures will introduce key current and future big data projects in astronomy. The class will conclude with final projects in which you apply data mining and machine learning tools to your own research data.