Introduction to Data Science Modules

This tutorial gives a brief overview of four of the best-known tools for data analysis in Python:

Jupyter - Jupyter is where we will run our code and document our findings and methods.
NumPy - NumPy is the numerical library that is the basis for most scientific Python libraries today. It introduces the n-dimensional arrays and matrices that pure Python lacks.
Pandas - Pandas is a well-known library built on top of NumPy. It makes data easier to work with in many respects, since it introduces column names and indexes.
Matplotlib - Matplotlib is a long-established Python library dedicated to visualization. The charts built into Pandas are drawn with it.
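
To make the difference between these layers concrete, here is a minimal sketch (not taken from the notebooks) of what NumPy adds to a plain Python list and what Pandas adds on top of NumPy. The numbers and the labels "a", "b", "row1" and "row2" are made up for illustration.

```python
import numpy as np
import pandas as pd

# A plain Python list has no element-wise arithmetic: * repeats the list.
values = [1, 2, 3]
repeated = values * 2                  # [1, 2, 3, 1, 2, 3]

# A NumPy array applies the operation to every element instead.
arr = np.array(values)
doubled = arr * 2                      # array([2, 4, 6])

# A 2-D array behaves like a matrix; @ is matrix multiplication.
matrix = np.array([[1, 2], [3, 4]])
product = matrix @ np.array([1, 1])    # array([3, 7])

# Pandas wraps the same numbers with column names and an index.
# These labels are hypothetical, chosen only for this example.
frame = pd.DataFrame(matrix, columns=["a", "b"], index=["row1", "row2"])
value = frame.loc["row2", "a"]         # 3
```

With labels in place, a value can be looked up by name rather than by position, which is the convenience the Pandas notebooks build on.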

The tutorial is divided into five Jupyter notebooks:

Jupyter - Introduction to the environment we will work in
NumPy - Introduction to NumPy methods, arrays and matrices
Pandas Series - Introduction to the concept of Series in Pandas (similar to arrays, but with labels)
Pandas DataFrames - Introduction to the concept of DataFrames in Pandas. DataFrames resemble matrices, but with their labeled rows and columns they work much like Excel spreadsheets
Matplotlib - Introduction to the Matplotlib visualization library. We will also explore the built-in charts that Pandas implements on top of Matplotlib.
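
As a rough preview of how the last three notebooks fit together, the sketch below builds a Series, places it in a DataFrame, and draws it with the chart support that Pandas provides through Matplotlib. The values, labels, and the "score" and "rank" column names are invented for this example and do not come from the tutorial's dataset.

```python
import pandas as pd

# Toy values invented for this example.
scores = pd.Series([7.5, 6.9, 5.1], index=["A", "B", "C"], name="score")

# A Series is a one-dimensional labeled array.
best = scores.idxmax()                 # label of the largest value

# A DataFrame is a table whose columns are Series sharing one index.
table = pd.DataFrame({"score": scores, "rank": [1, 2, 3]})
column = table["score"]                # selecting a column returns a Series

# DataFrame.plot delegates to Matplotlib and returns a Matplotlib Axes,
# so the chart can be adjusted with the usual Matplotlib methods.
ax = table.plot(kind="bar", y="score", title="Toy scores")
ax.set_ylabel("score")
```

In a Jupyter notebook the chart is normally rendered below the cell. Because `ax` is an ordinary Matplotlib Axes object, anything covered in the Matplotlib notebook can be applied to charts created this way.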

All of these libraries have excellent documentation. I encourage you to check it out once you understand these basics.

Do it yourself!

You can try out the notebooks on MyBinder.
