Find out about the evolution of the Linux operating system by exploring its version control system.

NoahPlage/Exploring-the-Evolution-of-Linux

Exploring-the-Evolution-of-Linux (DataCamp Project)

Find out about the evolution of the Linux operating system by exploring its version control system.

Project Description

Version control repositories like CVS, Subversion or Git store rich information about how a software project evolves. In this project, you'll be challenged to read in, clean up and visualize a real-world Git repository dataset of the Linux kernel. With almost 700k commits and thousands of contributors (find out the exact number in this project ;-) ), you'll encounter a few small data cleaning and wrangling challenges. You'll also gain insights into the development activity over the last 13 years.

For this project, you need to be familiar with pandas DataFrames, especially the read_csv and groupby functions, and with handling time series data.
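As a quick refresher on those prerequisites, here is a minimal sketch of read_csv, timestamp conversion and groupby working together. The "timestamp#author" layout, the column names and the sample rows are assumptions made for illustration only; they are not taken from the project's dataset.

```python
import io

import pandas as pd

# Hypothetical sample in a "timestamp#author" layout. The separator,
# column names and rows are illustrative assumptions, not project data.
raw = io.StringIO(
    "1502826583#alice\n"
    "1501749089#bob\n"
    "1501749088#bob\n"
    "1498256413#alice\n"
    "1498256412#alice\n"
)

# The sample has no header row, so column names are supplied explicitly.
git_log = pd.read_csv(raw, sep="#", header=None, names=["timestamp", "author"])

# Interpret the integer column as seconds since the Unix epoch.
git_log["timestamp"] = pd.to_datetime(git_log["timestamp"], unit="s")

# Count commits per author with groupby.
commits_per_author = git_log.groupby("author")["timestamp"].count()
print(commits_per_author)
```

Reading from an in-memory buffer keeps the sketch self-contained; with a real log file you would pass its path to read_csv instead.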

Technology

  • Python

Topics

  • Importing and cleaning data
  • Data manipulation
  • Correcting and grouping data
  • Data visualisation

Outline

  1. Introduction
  2. Reading in the dataset
  3. Getting an overview
  4. Finding the top 10 contributors
  5. Wrangling the data
  6. Treating wrong timestamps
  7. Grouping commits per year
  8. Visualizing the history of Linux
  9. Conclusion
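Steps 4, 6 and 7 of the outline could be sketched roughly as follows. This is not the project's solution: the tiny DataFrame, the author names and the first_commit and last_commit bounds are hypothetical stand-ins, and how the real bounds are determined is left to the project itself.

```python
import pandas as pd

# Hypothetical commit log with one row per commit. The first and last
# rows carry implausible timestamps to mimic wrongly set clocks.
git_log = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "1970-01-01 00:00:01",
        "2005-04-16 22:20:36",
        "2005-06-01 10:00:00",
        "2006-03-02 08:30:00",
        "2006-07-15 12:00:00",
        "2037-04-25 08:08:26",
    ]),
    "author": ["alice", "bob", "alice", "carol", "alice", "bob"],
})

# Step 4: the most active contributors (at most ten entries).
top_10 = git_log["author"].value_counts().head(10)

# Step 6: keep only commits inside a plausible window.
# Both bounds are assumed values chosen for this sample.
first_commit = pd.Timestamp("2005-04-16 22:20:36")
last_commit = pd.Timestamp("2006-12-31 23:59:59")
corrected_log = git_log[git_log["timestamp"].between(first_commit, last_commit)]

# Step 7: number of commits per calendar year.
commits_per_year = corrected_log.groupby(
    corrected_log["timestamp"].dt.year
)["author"].count()

print(top_10)
print(commits_per_year)
```

For step 8, a Series like commits_per_year can be drawn with its plot method, for example commits_per_year.plot(kind="bar"), provided matplotlib is installed.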

Visualisation from the project

image
