The Scala programming language is a real-world project whose repository holds over 30,000 commits and more than ten years of history. As a mature, general-purpose language, Scala has gained popularity among data scientists in recent years.
Scala is also an open source project, so its entire development history is publicly available: who made each change, what was changed, and how the change was reviewed.
In this project, we will explore and analyze the Scala project repository data, which comes from two sources: Git (the version control system) and GitHub (the project hosting site). By cleaning and visualizing this data, we aim to identify the most influential contributors to Scala's development and find out who the experts on the project are.
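To make the goal concrete, here is a minimal sketch of one way to rank contributors by activity. It uses only the Python standard library. The records, field names (`author`, `date`), user names, and the `top_contributors` helper are all hypothetical and serve only as illustration; they are not taken from the actual Scala dataset, and counting commits is just one possible proxy for influence.

```python
from collections import Counter

# Hypothetical sample records standing in for commit data.
# The field names and values are assumptions made for illustration only.
commits = [
    {"author": "alice", "date": "2013-05-01"},
    {"author": " Alice ", "date": "2013-06-12"},  # same person, untidy formatting
    {"author": "bob", "date": "2014-01-20"},
    {"author": "alice", "date": "2014-02-03"},
    {"author": "bob", "date": "2015-07-09"},
    {"author": "carol", "date": "2016-11-30"},
    {"author": None, "date": "2017-03-15"},       # missing author, dropped below
]


def top_contributors(records, n=3):
    """Return the n authors with the most records, most active first."""
    # Cleaning step: skip records without an author, then normalise
    # whitespace and case so that " Alice " and "alice" count as one person.
    authors = [r["author"].strip().lower() for r in records if r.get("author")]
    return Counter(authors).most_common(n)


print(top_contributors(commits, n=2))  # [('alice', 3), ('bob', 2)]
```

The same idea scales to the real data by loading the Git and GitHub records in place of the sample list; a bar chart of the resulting counts would be a natural first visualization.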