This notebook contains an analysis of the fertility rates of different countries from the 1960s to 2013, completed as part of the Udacity Data Science Nanodegree.
No libraries beyond the Anaconda distribution of Python should be needed to run the code.
Libraries used are:
- Pandas
- NumPy
- scikit-learn
- SciPy
- Matplotlib
For this project, I was interested in the trends in fertility rates of different countries over the years. I used the Kaggle dataset provided here for the analysis, looking for insights on the following questions.
- Do African countries have greater fertility rates?
- Which countries have the highest and lowest fertility rates?
- What are the trends of fertility rates in developed and developing countries? Is there a difference in trends?
- Can you predict the fertility rates given 53 years of fertility rates of the countries?
- How have fertility rates been changing over the years?
There are 2 files in this repository.
- .csv file: Dataset used for the analysis
  - Dataset contains the fertility rates of 215 countries from 1963 to 2013
  - 215 rows and 58 columns
- .ipynb file: Jupyter notebook with all the analysis code
  - This notebook contains:
    - Data cleansing to get the required data
    - Data imputation using KNNImputer
    - 5 questions that were answered using techniques such as EDA and a machine learning algorithm
    - Visualisations to support the insights
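
The imputation step mentioned above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy table, the country labels, the year columns, and the choice of `n_neighbors=2` are all assumptions made for the example. It shows scikit-learn's `KNNImputer` filling each missing value from the countries whose observed rates are most similar.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Hypothetical toy data: rows are countries, columns are years.
# Country labels and values are made up for illustration only.
rates = pd.DataFrame(
    {
        "1960": [6.0, 6.2, 6.1, 2.5, 2.4, 2.6],
        "1961": [5.9, np.nan, 6.0, 2.4, 2.3, 2.5],
        "1962": [5.8, 6.0, 5.9, np.nan, 2.2, 2.4],
    },
    index=["A", "B", "C", "D", "E", "F"],
)

# n_neighbors=2 is an assumed setting, not taken from the notebook.
# Each missing value is replaced by the mean of that column over the
# two nearest rows, with distance computed on the observed columns.
imputer = KNNImputer(n_neighbors=2)
filled = pd.DataFrame(
    imputer.fit_transform(rates),
    index=rates.index,
    columns=rates.columns,
)

print(filled.round(2))
```

Observed values are left unchanged; only the gaps are filled, so countries with similar fertility histories borrow from each other rather than from a global average.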
Results of the analysis:
- African countries have greater fertility rates than other regions of the world.
- Niger is the country with the highest fertility rate and Liechtenstein is the country with the lowest.
- Developed countries showed a similar pattern in the variation of fertility rates. Some of the developed countries have maintained their fertility rates over the years. These results can be seen in detail here.
- A model to predict the fertility rates can be seen in the notebook provided in the repository.