Welcome to INFO 490: Foundations of Data Science
Professor: Dr. Robert Brunner
Course Administrator: Edward Kim (Ph.D. candidate Physics)
- Taeyoung Kim (Statistics + CS undergraduate)
- Xinyang Lu (Ph.D. candidate Astronomy)
- Andrew Mehrmann (MS candidate Statistics)
- Samantha Thrush (Ph.D. candidate Astronomy)
For contact information, see the course Moodle site.
This class is an asynchronous, online course. It builds a practical foundation for data science by teaching students basic tools and techniques that can scale to large computational systems and massive data sets.
Students will first learn how to work at a Unix command prompt before learning about source code control software such as Git and the GitHub site. Next, the Python programming language will be covered, with a focus on the aspects of the language and the associated Python modules that are relevant to data science. Python will be introduced and used primarily through IPython (or Jupyter) Notebooks, and the course will cover the NumPy, SciPy, Matplotlib, Pandas, Seaborn, and scikit-learn Python modules. These capabilities will be demonstrated through simple data science tasks such as obtaining data, cleaning data, visualizing data, and basic data analysis. Students must have access to a fairly modern computer on which they can install software, ideally one that supports hardware virtualization.
This class is open to sophomores, juniors, seniors and graduate students in any discipline.
Please refer to the course syllabus for more information about course content and grading policies.
If you have any questions, or if something is not working properly, PLEASE look through the FAQs wiki page (on the right toolbar of the GitHub course page, click the icon labeled "Wiki" that looks like an open book) and the Moodle Q&A Forum before emailing a TA or the course instructor.
Click the link below to get live help on Gitter: