The dataset I chose provides information about the catalogue of applications available on the Google Play Store.
The goal of this project was to extract insights from the dataset and to build the same data visualisations twice: once with Python (matplotlib/seaborn) and once with Tableau.
For instance, among other questions, I asked the following one:
What are the most popular categories of apps by number of installs?
50% of users have apps from the same top 9 categories shown below, but the categories with the highest number of downloads are: Game, Video Players, Communication, Entertainment and Photography.
On average, the most downloaded apps are communication apps, followed by video player apps.
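A minimal sketch of how these two rankings (total installs and average installs per category) could be computed with pandas. The column names `Category` and `Installs`, the `"1,000,000+"` string format, the sample rows, and the helper `installs_to_int` are assumptions made for illustration; they are not taken from the project's notebook.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset has many more apps and columns.
apps = pd.DataFrame({
    "Category": ["GAME", "GAME", "COMMUNICATION", "COMMUNICATION", "PHOTOGRAPHY"],
    "Installs": ["1,000,000+", "500,000+", "10,000,000+", "5,000,000+", "100,000+"],
})


def installs_to_int(series):
    """Convert strings such as '1,000,000+' to integers (assumed format)."""
    cleaned = series.str.replace(",", "", regex=False).str.replace("+", "", regex=False)
    return cleaned.astype("int64")


apps["Installs"] = installs_to_int(apps["Installs"])

# Total and mean installs per category.
by_category = apps.groupby("Category")["Installs"].agg(total="sum", mean="mean")

# Two rankings: categories by total installs and by average installs per app.
top_total = by_category.sort_values("total", ascending=False)
top_mean = by_category.sort_values("mean", ascending=False)

print(top_total)
```

Sorting by `total` answers "which categories have the most downloads overall", while sorting by `mean` answers "which categories have the most downloaded apps on average"; the two orderings can differ when a category contains many apps with few installs each.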
With matplotlib, you can build almost any visualization you want, but I was limited by my technical skills. With Tableau, you can create great visualizations, but sometimes it can't do exactly what you want (here, I wanted to make a scatter plot but the software wouldn't do it).
- Python - The programming language used
- Pandas - A library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language
- Tableau - A popular data visualization tool
- Matplotlib - A Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms
- Seaborn - A Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
- NumPy - A library adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
- SciPy - An ecosystem of open-source software for mathematics, science, and engineering.