Course page on Notion
Pre-2023 course page: 資料科學軟體實作 (Data Science Software Practice) (notion.site)
GitHub repo: https://github.com/tjwei/DataScienceSoftware
Beginning in the fall 2023 semester, we chose English as the primary language of communication and Rust as the primary programming language for lecture examples. Students may submit assignments in either Python or Rust.
Python is the obvious language to choose because it is now the dominant programming language in data science, machine learning, and deep learning, mostly because of the massive ecosystem that has grown up around it. Other programming languages, including C, C++, SQL, Rust, and JavaScript, play important roles in this ecosystem, but Python is unquestionably at the center, connecting everything.
Rust is a more modern language that is rapidly gaining popularity. It has decent Python interoperability via PyO3 and can be used in a Jupyter notebook via evcxr. It may not replace Python anytime soon because of its long compile times and somewhat steep initial learning curve, but it is still a language worth studying.
We expect students to become acquainted with both Python and Rust. We chose Rust as the primary teaching language because the prior course repo already contains plenty of Python material, and there are many other opportunities and resources for learning these Python packages. We will still introduce many Python packages through PyO3.
This semester, we are collaborating with the KKCompany ARC Team, which will host the first KK Data Game on Kaggle. The competition is expected to attract 500 students from 11 machine learning and data science courses across 9 schools. It counts as one homework/project, and scores are based on performance in the game.
Our class examples will also use data provided by KKCompany.