Please view the blog at https://jamiepotter17.github.io/stackoverflow/

Table of Contents

  1. Installation
  2. Project Motivation
  3. File Descriptions
  4. Results
  5. Licensing, Authors, and Acknowledgements

Installation

Code in the Jupyter Notebook file 'CRISP-DM Process.ipynb' should run on a standard Anaconda distribution of Python 3.*.

If a known-good fallback is needed, it ran without errors on the following installation:

  • python 3.6.13
  • jupyter 1.0.0
  • matplotlib 3.3.4
  • numpy 1.17.0
  • pandas 1.1.5
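
To compare a local environment against the versions listed above, something like the following sketch may help. It is not part of the repository: the names `TESTED_VERSIONS`, `installed_version`, and `find_mismatches` are made up for illustration, and it assumes each package exposes a `__version__` attribute (true for matplotlib, numpy, and pandas; the `jupyter` metapackage is left out because it may not).

```python
import importlib
import sys

# Versions listed in this README; a known-good baseline, not a hard requirement.
TESTED_VERSIONS = {
    "matplotlib": "3.3.4",
    "numpy": "1.17.0",
    "pandas": "1.1.5",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(tested, lookup=installed_version):
    """Return {name: (tested, installed)} for each package whose version differs."""
    mismatches = {}
    for name, tested_version in tested.items():
        found = lookup(name)
        if found != tested_version:
            mismatches[name] = (tested_version, found)
    return mismatches


if __name__ == "__main__":
    print("python", sys.version.split()[0], "(tested with 3.6.13)")
    for name, (tested, found) in find_mismatches(TESTED_VERSIONS).items():
        print(f"{name}: tested {tested}, installed {found or 'not installed'}")
```

A mismatch does not necessarily mean the notebook will fail; it only flags where the environment differs from the one the notebook was run on.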

Project Motivation

This project was completed as the first required project of Udacity's Data Scientist Nanodegree Program. Information about this online course can be found here.

For this project, I was interested in using Stack Overflow survey data longitudinally in order to evaluate:

  1. What Size Organisation do People Work For and Has This Changed?
  2. What Countries are Developers From?
  3. What Programming Languages are Developers Using?

File Descriptions

  • ./docs/ - contains files for the maintenance of the blog post, which is viewable at https://jamiepotter17.github.io/stackoverflow/.
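  • .gitignore - Git configuration file listing paths excluded from version control.
  • ./.ipynb_checkpoints - autosave checkpoints created by Jupyter Notebook.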
  • README.md - this file.
  • CRISP-DM Process.ipynb - the Jupyter Notebook file used in the exploration and analysis of the Stack Overflow data.
  • StackOverflowData.zip - contains the relevant .csv files used by the Jupyter Notebook file.
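
As a rough illustration of how the .csv files in StackOverflowData.zip could be read without unpacking the archive by hand, here is a minimal sketch. It is an assumption about one workable approach, not necessarily what the notebook does; the helper names `list_csv_members` and `load_csvs` are hypothetical, and the file names inside the archive are not assumed.

```python
import zipfile


def list_csv_members(zip_path):
    """Return the names of all .csv files inside the archive, sorted."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".csv")
        )


def load_csvs(zip_path):
    """Read every .csv in the archive into a dict of DataFrames keyed by member name."""
    import pandas as pd  # imported lazily so list_csv_members works without pandas

    frames = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in list_csv_members(zip_path):
            with archive.open(name) as handle:
                # low_memory=False reads each file in one pass, which avoids
                # mixed-dtype warnings on wide survey files.
                frames[name] = pd.read_csv(handle, low_memory=False)
    return frames
```

Calling `load_csvs("StackOverflowData.zip")` from the repository root would then give one DataFrame per survey file, which can be inspected before running the notebook.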

Results

The main findings are summarised on the blog at https://jamiepotter17.github.io/stackoverflow/.

Licensing, Authors, and Acknowledgements

Many thanks to Stack Overflow for collecting and publishing the data, which is available here. Feel free to use this information and code as you will.