# Top Games on Google Play Store

**Let's find out which games are the most installed and loved on the Google Play Store. Do you have any ideas? What are your favorite games there?**

![](https://th.bing.com/th/id/R1a10f46b065287fa2fd22673e6728774?rik=YOhZpC2tGpJrvA&pid=ImgRaw)

**Import useful libraries**

In [None]:
import math
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go

In [None]:
filepath = '../input/top-play-store-games/android-games.csv'
data = pd.read_csv(filepath)

In [None]:
df = data.copy()

**A quick overview of the dataframe**

In [None]:
df.head()

**Let's see what kinds of features there are**

In [None]:
df.info()

**Finally, let's check for missing data**

In [None]:
df.isnull().sum()

# **Easy Data Visualization**

**Which game categories are most common?**

In [None]:
# Count each category once and reuse the result for both labels and values
category_counts = df['category'].value_counts()
n = len(category_counts)
# Pull the three largest slices out a bit further; the list always matches the number of slices
explode = ([0.15, 0.12, 0.1] + [0.08] * n)[:n]
plt.figure(figsize=(14, 7))
plt.pie(category_counts.values, labels=category_counts.index,
        explode=explode, autopct='%1.1f%%', startangle=90)
plt.title('Category Pie Chart', fontsize=20, pad=40)
plt.axis('equal')
plt.show()

**Looking at this pie chart we can see that the categories are present in almost equal quantities. Only card games, word games and casual games stand out a little.**
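
To back up the reading of the pie chart with numbers, the share of each category can be printed directly. This is a minimal sketch on a toy dataframe; the category names and counts below are made up for illustration, and only the `category` column name comes from the notebook.

```python
import pandas as pd

# Toy stand-in for the notebook's `df`; names and counts are illustrative only.
toy = pd.DataFrame(
    {"category": ["GAME ACTION"] * 5 + ["GAME CARD"] * 3 + ["GAME WORD"] * 2}
)

# Share of each category as a percentage, largest first.
shares = toy["category"].value_counts(normalize=True).mul(100).round(1)
print(shares)

# Gap between the largest and smallest share; a small gap means a near-uniform mix.
spread = shares.max() - shares.min()
print(f"spread: {spread:.1f} percentage points")
```

Replacing `toy` with `df` should give the same table for the real data.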

**Free or paid games?**

In [None]:
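def label_paid(paid):
    # A raw value of 0 (or False) means the game is free; anything else has a price
    return 'Free' if paid == 0 else 'Paid'

# Run this cell only once: on a second run the values are already strings and all map to 'Paid'
df['paid'] = df['paid'].map(label_paid)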

In [None]:
# Plotly builds its own figure, so no plt.figure() call is needed
fig = px.histogram(df, x='paid',
                   title='Free vs Paid games',
                   labels={'paid': 'Pricing'})
fig.show()

**Looking at this histogram we can see that the majority of the top games on Google Play Store are free.**

In [None]:
# First 10 rows of the dataframe, in file order (not re-sorted by rating)
top_rated = df[0:10]
fig = px.sunburst(
    top_rated,
    path=['title', 'category', 'paid'],
    values='average rating',
    color='average rating',
)
fig.update_layout(margin=dict(t=0, l=0, r=0, b=0))
fig.show()

**In this chart of the first 10 games in the dataframe, action games dominate and every game is free to play. Because the rows are taken in file order rather than sorted by average rating, treat this as a first impression rather than proof that action games are the most loved.**
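
The selection above takes rows in file order (`df[0:10]`). To rank games by average score instead, one option is to sort before slicing. The sketch below uses a toy dataframe; the `total ratings` column used as a tie-breaker is an assumption about the dataset and is not shown in the notebook.

```python
import pandas as pd

# Toy stand-in for the notebook's `df`; titles and numbers are illustrative only.
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "category": ["GAME ACTION", "GAME CARD", "GAME WORD", "GAME ACTION"],
    "average rating": [4.2, 4.7, 4.5, 4.7],
    "total ratings": [1000, 50, 300, 2000],  # assumed column, used to break ties
})

# Sort by rating (ties broken by number of ratings), then keep the top 3.
top_by_rating = (
    toy.sort_values(["average rating", "total ratings"], ascending=False)
       .head(3)
)
print(top_by_rating[["title", "category", "average rating"]])
```

The resulting frame can be passed to `px.sunburst` in place of `top_rated`.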

# **Installs**

In [None]:
# Known install labels and their numeric values
INSTALL_VALUES = {
    "100.0 k": 100_000, "500.0 k": 500_000,
    "1.0 M": 1_000_000, "5.0 M": 5_000_000,
    "10.0 M": 10_000_000, "50.0 M": 50_000_000,
    "100.0 M": 100_000_000, "500.0 M": 500_000_000,
}

def parse_install(install):
    # Any label not listed above is treated as one billion installs
    return INSTALL_VALUES.get(install, 1_000_000_000)

# Run once: after conversion the values are integers and would all hit the fallback
df['installs'] = df['installs'].map(parse_install)
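
The conversion above recognises a fixed set of labels and maps everything else to one billion, so an unexpected label would be silently inflated. A more general alternative is to parse the number and the unit suffix separately. This is a sketch, not the notebook's method: `parse_installs` and `_UNITS` are hypothetical names, and the `"1000.0 M"` label in the toy data is a guess at what the fallback case looks like.

```python
import pandas as pd

# Multipliers for the unit suffixes that appear in the notebook ("k" and "M").
_UNITS = {"k": 1_000, "M": 1_000_000}

def parse_installs(text: str) -> int:
    """Convert a label such as '500.0 k' or '10.0 M' to an integer count."""
    number, unit = text.split()
    if unit not in _UNITS:
        # Fail loudly instead of guessing a value for an unknown suffix.
        raise ValueError(f"unknown unit in {text!r}")
    return int(round(float(number) * _UNITS[unit]))

# Toy labels; the last one is an assumed format, not taken from the dataset.
toy = pd.Series(["500.0 k", "10.0 M", "1000.0 M"])
parsed = toy.map(parse_installs)
print(parsed.tolist())
```

Raising on unknown suffixes trades convenience for safety: a new label format surfaces as an error rather than as a wrong install count.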

In [None]:
# Re-take the slice so it picks up the numeric 'installs' column created above
top_rated = df[0:10]
fig = px.bar(top_rated,
             x='title',
             y='installs',
             title='What are the most downloaded games?')
fig.update_xaxes(categoryorder='total descending')
fig.show()

**As we could have predicted, the most popular games are also the ones users download the most. In this bar chart, the number of installs almost always follows a game's position in the ranking; the only exception is Temple Run 2. How strange! Did you expect such a large number of installs for Temple Run 2?**
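
One way to put a number on how closely installs follow the ranking is a Spearman rank correlation, computed here as the Pearson correlation of the ranks. This is a sketch on made-up values; it assumes the dataset has a numeric `rank` column, which the notebook does not show.

```python
import pandas as pd

# Toy stand-in: five games with an assumed 'rank' column and numeric installs.
toy = pd.DataFrame({
    "rank": [1, 2, 3, 4, 5],
    "installs": [500_000_000, 100_000_000, 100_000_000, 500_000_000, 10_000_000],
})

# Spearman correlation = Pearson correlation of the rank-transformed columns.
rho = toy["rank"].rank().corr(toy["installs"].rank())
print(f"Spearman rho: {rho:.3f}")
```

A value close to -1 would mean that better-ranked games (lower rank number) consistently have more installs; an outlier like the fourth game in the toy data pulls the value toward 0.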

# Thank you for reading my notebook, which I will update and complete soon. If you liked it, please leave an upvote, and if you have any advice, write it in the comments so I can improve. Happy Kaggling!