<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# Google Analytics - Get pageview ranking
<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Google%20Analytics/GoogleAnalytics_Get_pageview_ranking.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/open_in_naas.svg"/></a>

**Tags:** #googleanalytics #pageviews

**Author:** [Charles Demontigny](https://www.linkedin.com/in/charles-demontigny/)

**Prerequisite:** Create your own <a href="">Google API JSON credential</a>

## Input

In [None]:
#-> Uncomment the 2 lines below (by removing the hashtag) to schedule your job every day at 8:00 AM (NB: you can choose the time of your scheduling bot)
# import naas
# naas.scheduler.add(cron="0 8 * * *")

#-> Uncomment the line below (by removing the hashtag) to remove your scheduler
# naas.scheduler.delete()
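
The `cron` string passed to the scheduler follows the standard five-field cron layout (minute, hour, day of month, month, day of week). The sketch below only illustrates how `"0 8 * * *"` reads; `describe_cron` is a hypothetical helper written for this note and is not part of `naas`.

```python
# Names of the five fields of a standard cron expression, in order.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Map each field of a five-field cron expression to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))


# Hypothetical usage: the expression from the scheduler cell above.
schedule = describe_cron("0 8 * * *")
print(schedule)
```

Changing the second field moves the run to another hour; for example, `"0 18 * * *"` would run at 6:00 PM instead.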

### Import library

In [1]:
import pandas as pd
import plotly.graph_objects as go
import naas
from naas_drivers import googleanalytics

### Get your credential from Google Cloud Platform

In [2]:
json_path = 'naas-googleanalytics.json'
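
Before connecting, it can help to confirm that the credential file exists and parses as JSON. The sketch below assumes the credential is a Google service account key, which normally carries `type`, `client_email` and `private_key` fields; `check_credential_file` is a hypothetical helper, not part of `naas_drivers`.

```python
import json
from pathlib import Path

# Assumption: fields normally present in a Google service account key file.
REQUIRED_KEYS = {"type", "client_email", "private_key"}


def check_credential_file(path: str) -> list:
    """Return a list of problems found; an empty list means the file looks usable."""
    file = Path(path)
    if not file.is_file():
        return [f"file not found: {path}"]
    try:
        content = json.loads(file.read_text())
    except json.JSONDecodeError as error:
        return [f"invalid JSON: {error}"]
    if not isinstance(content, dict):
        return ["credential file must contain a JSON object"]
    missing = sorted(REQUIRED_KEYS - content.keys())
    return [f"missing key: {key}" for key in missing]


# Hypothetical usage with the path defined in the cell above.
problems = check_credential_file("naas-googleanalytics.json")
print(problems or "credential file looks usable")
```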

### Get view ID from Google Analytics

In [3]:
view_id = "228952707"

### Setup your output paths

In [4]:
csv_output = "googleanalytics_pages_views.csv"
html_output = "googleanalytics_pages_views.html"

## Model

### Ranking: Most visited web pages

In [5]:
df_pageview = googleanalytics.connect(json_path=json_path).views.get_pageview(view_id)
df_pageview

Unnamed: 0,Pages,Pageview
0,/,24092.0
1,/pricing,3980.0
2,/free-forever,3587.0
3,/tools,796.0
4,/templates,751.0
5,/community,354.0
6,/tools/airtable,206.0
7,/tools/aws,109.0
8,/tools/google-sheets,123.0
9,/tools/linkedin,129.0
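
In the output above, the last three rows (109, 123, 129) are not in strictly descending order, so an explicit sort may be useful before treating the table as a ranking. The sketch below uses a few rows copied from that output; the `Rank` and `Share` columns are additions for illustration and are not produced by `get_pageview`.

```python
import pandas as pd

# A few rows copied from the output above, used as sample data.
df_sample = pd.DataFrame(
    {
        "Pages": ["/", "/pricing", "/tools/aws", "/tools/linkedin"],
        "Pageview": [24092.0, 3980.0, 109.0, 129.0],
    }
)

# Sort by pageviews, highest first, and rebuild a clean 0..n-1 index.
ranked = df_sample.sort_values("Pageview", ascending=False).reset_index(drop=True)

# Illustrative extras: 1-based rank and each page's share of the sampled views.
ranked["Rank"] = ranked.index + 1
ranked["Share"] = ranked["Pageview"] / ranked["Pageview"].sum()

print(ranked)
```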


## Output

### Save dataframe as CSV

In [6]:
df_pageview.to_csv(csv_output, index=False)

### Plot horizontal bar chart

In [9]:
def plot_pageview(df: pd.DataFrame) -> go.Figure:
    """
    Plot pageviews per page as a horizontal bar chart in Plotly.
    """
    # Prep dataframe: work on a copy so the caller's dataframe is not modified
    df = df.copy()
    # Rename the root path and strip the leading slash from the other pages
    df.loc[df.Pages == "/", "Pages"] = "landing"
    df.loc[df.Pages != "landing", "Pages"] = df.Pages.str[1:]
    
    # Get total views, formatted with a space as thousands separator
    value = "{:,.0f}".format(df["Pageview"].sum()).replace(",", " ")
    
    # Create data
    data = go.Bar(y=df['Pages'],
                  x=df['Pageview'],
                  text=df['Pageview'],
#                   marker=dict(color="black"),
                  orientation="h")
    # Create layout (bars are ordered by total, largest at the top)
    layout = go.Layout(
        yaxis={'categoryorder': 'total ascending'},
        margin={"l":150, "pad": 20},
        title=f"<b>Most visited web pages, by total visits</b><br><span style='font-size: 13px;'>Total visits: {value}</span>",
        title_font=dict(family="Arial", size=18, color="black"),
        xaxis_title="Number of views",
        xaxis_title_font=dict(family="Arial", size=11, color="black"),
        plot_bgcolor="#ffffff",
        width=1200,
        height=800,
        margin_pad=10,
    )
    fig = go.Figure(data=data, layout=layout)
    # Show the value labels outside the bars
    fig.update_traces(textposition="outside")
    return fig

fig = plot_pageview(df_pageview)

### Export and share graph

In [10]:
fig.write_html(html_output)

#-> Uncomment the line below (by removing the hashtag) to share your asset with naas
# naas.asset.add(html_output, params={"inline": True})

#-> Uncomment the line below (by removing the hashtag) to delete your asset
# naas.asset.delete(html_output)

👌 Well done! Your Assets has been sent to production.



PS: to remove the "Assets" feature, just replace .add by .delete


'https://public.naas.ai/bWV0cmljcy00MG5hYXMtMkVhaQ==/asset/8edf3a3bfca321ebe0e9e5b357fbd0ff1b0d6fc5dcc0335ccfd285ceef3d'