Investigate a Dataset: Influence of the nature and wellbeing of a country on its ecological behaviour
This project is the final task for part 2 of the Udacity Data Analyst Nanodegree.
The project consists of creating a Jupyter notebook that presents an analysis of a dataset of my choice.
- Know all the steps involved in a typical data analysis process
- Be comfortable posing questions that can be answered with a given dataset and then answering those questions
- Know how to investigate problems in a dataset and wrangle the data into a format you can use
- Have experience communicating the results of your analysis
- Be able to use vectorized operations in NumPy and pandas to speed up your data analysis code
- Be familiar with pandas' Series and DataFrame objects, which let you access your data more conveniently
- Know how to use Matplotlib to produce plots showing your findings
This analysis is based on data extracted from Gapminder and focuses on environmental pollution and what influences it.
The analysis consists of the following indicators:
- CO2 emissions (tonnes per person)
- Forest coverage (%)
- Democracy score
- Human Development Index (HDI)
More detailed information about the datasets can be found in the project.
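As a rough illustration of how indicators like these can be combined, the sketch below reshapes two tables and joins them by country and year. It assumes each indicator arrives as a wide table (one row per country, one column per year); the helper `to_long`, the column names `co2_per_person` and `forest_pct`, and the numbers are all hypothetical placeholders, not the notebook's actual code or real Gapminder values.

```python
import pandas as pd


def to_long(wide, value_name):
    """Reshape a wide table (one column per year) into long form."""
    long = wide.melt(id_vars="country", var_name="year", value_name=value_name)
    long["year"] = long["year"].astype(int)  # year labels start out as strings
    return long


# Toy placeholder values, NOT real Gapminder figures.
co2_wide = pd.DataFrame(
    {"country": ["A", "B"], "2000": [1.0, 2.0], "2001": [1.5, 2.5]}
)
forest_wide = pd.DataFrame(
    {"country": ["A", "B"], "2000": [30.0, 10.0], "2001": [29.0, 11.0]}
)

co2 = to_long(co2_wide, "co2_per_person")
forest = to_long(forest_wide, "forest_pct")

# Keep only the country/year pairs present in both indicators.
merged = co2.merge(forest, on=["country", "year"], how="inner")

# Pearson correlation between the two indicators across all rows.
correlation = merged["co2_per_person"].corr(merged["forest_pct"])
print(merged)
print(correlation)
```

With real data, each toy DataFrame would be replaced by a `pd.read_csv(...)` call on the corresponding .csv file, and the same reshape-and-merge steps could be repeated for the democracy score and HDI.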
The project includes the following files:
- .ipynb file: the notebook where the analysis has been developed, containing code and Markdown cells.
- .html file: the output of the .ipynb converted to a web version for easy viewing.
- .csv files: the data used to conduct the analysis.
No installation is needed to view the analysis. To reproduce the project, Python 3.5 and the following libraries are needed:
- pandas
- NumPy
- Matplotlib
- csv (part of the Python standard library)

Instructions written by me are available in a blog post.
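Before running the notebook, it may help to confirm that the required libraries can be imported. The following is a minimal sketch of such a check; the function name `check_environment` is hypothetical and is not part of the project.

```python
import importlib
import sys

# Import names of the libraries listed above.
REQUIRED = ["pandas", "numpy", "matplotlib", "csv"]


def check_environment(modules=REQUIRED):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in modules:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, ok in sorted(check_environment().items()):
        print(name, "OK" if ok else "MISSING")
```

Any library reported as MISSING would need to be installed before the notebook can run.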