# Play Store apps dataset

## Why is this dataset interesting?

The Google Play Store is Google's app marketplace. Most people use it when they want to install new apps on their Android phones.

Like any market, apps in the Play Store are subject to **supply** and **demand**... that is to say, certain kinds of apps get downloaded a lot while others don't. Certain kinds of apps get paid for while others don't. Some categories of apps have lots and lots of competition while others don't.

A dataset like this can help you spot opportunities.

## Ideas for questions this data can help you answer

* What categories of applications get a lot of downloads per day?
* What categories of applications don't get many downloads per day?
* In what app categories are there market leaders (one app that clearly is getting downloaded more than the others)?
* How many downloads per day might you expect if you took the time to build an app?
* What can the data tell you about monetization approaches?

## Where you can go to get this kind of data

I wrote a crawler to collect this dataset. The crawler is based on [facundoolano's very awesome google-play-scraper](https://github.com/facundoolano/google-play-scraper) library.

If you're going to write a crawler it's best to make sure it's ok to do the crawl... so...

In [1]:
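import urllib.robotparser

# Read the robots.txt of the site being crawled (Google Play itself).
rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://play.google.com/robots.txt")
rp.read()
crawl_delay = rp.crawl_delay("*")  # None if robots.txt sets no Crawl-delay

# A sample app listing URL to test against the rules.
test_crawl_url = "https://play.google.com/store/apps/details?id=com.wildnotion.poetscorner"

can_crawl_listings = rp.can_fetch("*", test_crawl_url)
print("We can crawl Google Play? {0}".format(can_crawl_listings))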

We can crawl Google Play? True


**Sweet! We can crawl the listings! 'twould be kinda funny if Google didn't let you crawl their site...**
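
The cell above asks robots.txt for a crawl delay but never uses the answer. Here is a minimal sketch of how a crawler might turn that value into a pause between requests. The helper name `polite_delay` and the one-second fallback are my own assumptions, not part of the original crawler, and the rules are parsed from a made-up list of lines so the example runs without network access.

```python
import urllib.robotparser


def polite_delay(robots_lines, user_agent="*", fallback_seconds=1.0):
    """Return the Crawl-delay for user_agent, or a fallback if none is set.

    polite_delay and fallback_seconds are hypothetical names chosen for
    illustration; they do not come from the original crawler.
    """
    rp = urllib.robotparser.RobotFileParser()
    # parse() applies the same rules as read(), without the HTTP request.
    rp.parse(robots_lines)
    delay = rp.crawl_delay(user_agent)  # None when no Crawl-delay is declared
    return float(delay) if delay is not None else fallback_seconds


# Made-up robots.txt content, only to exercise the helper.
rules_with_delay = ["User-agent: *", "Crawl-delay: 5", "Disallow: /private"]
rules_without_delay = ["User-agent: *", "Disallow: /private"]

print(polite_delay(rules_with_delay))
print(polite_delay(rules_without_delay))
```

A crawler could then call `time.sleep(polite_delay(...))` between page fetches.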

Anyway... on to the data...

## Some data - 62683 apps

In [6]:
import pandas as pd

df = pd.read_csv('../datasets/google-play-store-11-2018.csv')
df.describe()

| | reviews | ratings | min_installs | score | ratings_per_day | price | rating_one_star | rating_two_star | rating_three_star | rating_four_star | rating_five_star |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 62683.0 | 62683.0 | 62694.0 | 62683.0 | 62694.0 | 62694.0 | 62694.0 | 62694.0 | 62694.0 | 62694.0 | 62694.0 |
| mean | 15298.43 | 49363.28 | 2035663.0 | 4.221624 | 38.620506 | 0.414998 | 3078.124 | 1211.618 | 3094.328 | 7227.599 | 34742.95 |
| std | 226150.5 | 769025.5 | 23868720.0 | 0.815517 | 430.42277 | 3.793236 | 60502.31 | 21934.03 | 52302.51 | 111706.6 | 536804.2 |
| min | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% | 16.0 | 41.0 | 1000.0 | 4.100497 | 0.0 | 0.0 | 2.0 | 1.0 | 2.0 | 4.0 | 28.0 |
| 50% | 144.0 | 398.0 | 50000.0 | 4.403101 | 0.0 | 0.0 | 24.0 | 9.0 | 23.0 | 47.0 | 264.0 |
| 75% | 1500.0 | 4488.0 | 500000.0 | 4.637007 | 6.0 | 0.0 | 307.0 | 109.0 | 284.0 | 609.0 | 2933.0 |
| max | 22053770.0 | 81284860.0 | 1000000000.0 | 5.0 | 40526.0 | 369.99 | 9658715.0 | 3368101.0 | 7164984.0 | 12223420.0 | 52952660.0 |
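
The `count` row above is not the same for every column: `reviews`, `ratings` and `score` report 62683 values while the other columns report 62694, which suggests a handful of rows are missing those fields. A quick way to check is to count missing values per column. This is a minimal sketch; `missing_report` is a hypothetical helper name, and the small frame below is made up so the example runs without the CSV.

```python
import pandas as pd


def missing_report(df):
    """Count missing values per column, keeping only columns that have any."""
    missing = df.isna().sum()
    return missing[missing > 0].sort_values(ascending=False)


# Tiny made-up frame standing in for the real dataset.
sample = pd.DataFrame({
    "reviews": [375.0, None, 1974.0],
    "score": [4.18, None, 4.30],
    "price": [0.0, 0.0, 0.0],
})

print(missing_report(sample))
```

On the real data, `missing_report(df)` would list the columns that fall short of the full row count.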


In [7]:
df.head()

| | app_id | title | reviews | ratings | min_installs | score | offers_iap | ad_supported | released | ratings_per_day | genre | genre_id | price | rating_one_star | rating_two_star | rating_three_star | rating_four_star | rating_five_star |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | com.prettyteengames.royal.princess.wedding.mak... | Royal Princess Wedding Makeover and Dress Up | 375.0 | 1023.0 | 100000 | 4.179863 | True | True | 2017-12-20 | 3 | Casual | GAME_CASUAL | 0.0 | 115 | 31 | 98 | 90 | 689 |
| 1 | com.MayGreenStudio.dressup | Momo's Dressup | 13492.0 | 25974.0 | 1000000 | 4.711096 | False | True | 2017-03-07 | 42 | Casual | GAME_CASUAL | 0.0 | 673 | 213 | 806 | 2561 | 21721 |
| 2 | air.theflash.f2game.PrettyGirl23 | Princess Pretty Girl | 1974.0 | 4610.0 | 500000 | 4.295445 | False | True | 2015-01-18 | 3 | Casual | GAME_CASUAL | 0.0 | 382 | 206 | 287 | 528 | 3207 |
| 3 | air.com.dressupone.animeschooluniforms | Anime School Uniforms | 2586.0 | 6081.0 | 500000 | 4.209505 | False | True | 2013-08-20 | 3 | Casual | GAME_CASUAL | 0.0 | 628 | 193 | 524 | 668 | 4068 |
| 4 | air.theflash.f2game.PrettyGirl7 | Wedding Pretty girl | 1409.0 | 3728.0 | 500000 | 4.195011 | False | True | 2014-09-01 | 2 | Casual | GAME_CASUAL | 0.0 | 358 | 185 | 300 | 414 | 2471 |
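
With these columns in hand, a per-genre summary is one way to start on the questions listed earlier. The sketch below is only a starting point under some assumptions of mine: the dataset has no downloads-per-day column, so `ratings_per_day` stands in as a rough proxy for activity, and `min_installs` is treated as a lower bound on installs when estimating how dominant a genre's biggest app is. `genre_summary` is a hypothetical helper name. The sample frame reuses the five rows shown above so the example runs without the CSV.

```python
import pandas as pd


def genre_summary(df):
    """Per genre: app count, median ratings/day, leader share, monetization mix."""
    work = df.assign(
        is_paid=(df["price"] > 0).astype(float),
        has_iap=df["offers_iap"].astype(float),
        has_ads=df["ad_supported"].astype(float),
    )
    grouped = work.groupby("genre")
    installs = grouped["min_installs"]
    summary = pd.DataFrame({
        "apps": grouped.size(),
        "median_ratings_per_day": grouped["ratings_per_day"].median(),
        # Share of the genre's summed min_installs held by its biggest app.
        "top_app_install_share": installs.max() / installs.sum(),
        "share_paid": grouped["is_paid"].mean(),
        "share_iap": grouped["has_iap"].mean(),
        "share_ads": grouped["has_ads"].mean(),
    })
    return summary.sort_values("median_ratings_per_day", ascending=False)


# The five rows from df.head(), restricted to the columns the summary needs.
sample = pd.DataFrame({
    "genre": ["Casual"] * 5,
    "min_installs": [100000, 1000000, 500000, 500000, 500000],
    "ratings_per_day": [3, 42, 3, 3, 2],
    "price": [0.0, 0.0, 0.0, 0.0, 0.0],
    "offers_iap": [True, False, False, False, False],
    "ad_supported": [True, True, True, True, True],
})

print(genre_summary(sample))
```

Running `genre_summary(df)` on the full dataset would give one row per genre. A `top_app_install_share` close to 1 hints at a clear market leader, and the three `share_*` columns give a first look at how apps in that genre make money.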
