<a href="https://colab.research.google.com/github/arnav-is-op/google-collab/blob/main/views.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import sys
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

# If running in Google Colab, install PostgreSQL and restore the database
if 'google.colab' in sys.modules:
    # Update the apt package index
    !sudo apt-get update -qq > /dev/null 2>&1

    # Install PostgreSQL
    !sudo apt-get install postgresql -qq > /dev/null 2>&1

    # Start PostgreSQL service (suppress output)
    !sudo service postgresql start > /dev/null 2>&1

    # Set password for the 'postgres' user to avoid authentication errors (suppress output)
    !sudo -u postgres psql -c "ALTER USER postgres WITH PASSWORD 'password';" > /dev/null 2>&1

    # Create the 'contoso_100k' database (suppress output)
    !sudo -u postgres psql -c "CREATE DATABASE contoso_100k;" > /dev/null 2>&1

    # Download the PostgreSQL .sql dump
    !wget -q -O contoso_100k.sql https://github.com/lukebarousse/Int_SQL_Data_Analytics_Course/releases/download/v.0.0.0/contoso_100k.sql

    # Restore the dump file into the PostgreSQL database (suppress output)
    !sudo -u postgres psql contoso_100k < contoso_100k.sql > /dev/null 2>&1

    # Switch from ipython-sql to jupysql
    !pip uninstall -y ipython-sql > /dev/null 2>&1
    !pip install jupysql > /dev/null 2>&1

# Load the sql extension for SQL magic
%load_ext sql

# Connect to the PostgreSQL database
%sql postgresql://postgres:password@localhost:5432/contoso_100k

# Enable automatic conversion of SQL results to pandas DataFrames
%config SqlMagic.autopandas = True

# Disable named parameters for SQL magic
%config SqlMagic.named_parameters = "disabled"

# Display pandas floats with two decimal places
pd.options.display.float_format = '{:.2f}'.format

# **VIEWS**

· **Why Use Views in PostgreSQL?**

· Simplifies complex queries by storing them as reusable, named objects.

· Ensures consistency and readability when multiple queries rely on the same logic.

· Enhances security by restricting access to specific rows/columns.

· Improves maintainability by centralizing changes to the query logic.

**CREATE VIEW**

· Syntax:

    CREATE VIEW view_name AS
    SELECT
        column1,
        column2,
        column3
    FROM table_name
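As a rough illustration of the syntax above, the sketch below creates a small view and queries it. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a database server. The three `sales` rows are made up, and the view mirrors the `daily_revenue` example kept in a commented-out cell of this notebook. For a simple view like this one, the `CREATE VIEW` statement has the same shape in PostgreSQL.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical miniature of the sales table with made-up rows.
cur.execute(
    "CREATE TABLE sales ("
    "orderdate TEXT, quantity INTEGER, netprice REAL, exchangerate REAL)"
)
cur.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2024-01-01", 2, 10.0, 1.0),
        ("2024-01-01", 1, 5.0, 2.0),
        ("2024-01-02", 3, 4.0, 1.0),
    ],
)

# Store the aggregation once as a named view.
cur.execute(
    """
    CREATE VIEW daily_revenue AS
    SELECT
        orderdate,
        SUM(quantity * netprice * exchangerate) AS total_revenue
    FROM sales
    GROUP BY orderdate
    """
)

# Later queries reuse the view as if it were a table.
daily_rows = cur.execute(
    "SELECT orderdate, total_revenue FROM daily_revenue ORDER BY orderdate"
).fetchall()
print(daily_rows)
```

The view stores only the query, not the result, so rows added to `sales` later would show up the next time `daily_revenue` is queried.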

In [16]:
%%sql
-- IF EXISTS keeps this cell from failing when the view has not been created yet
DROP VIEW IF EXISTS cohort_analysis

In [17]:
%%sql
CREATE VIEW cohort_analysis AS
WITH customer_revenue AS (
    -- Revenue and order count per customer per order date.
    -- This aggregation sits in a CTE so the window functions below can build the cohort from it.
    SELECT
        s.customerkey,
        s.orderdate,
        SUM(s.quantity * s.netprice * s.exchangerate) AS total_net_revenue,
        COUNT(s.orderkey) AS num_orders,
        -- c.* would return every column of the customer table; use it to inspect the table, then keep only the columns needed here
        c.countryfull,
        c.age,
        c.givenname,
        c.surname
    FROM
        sales s
        LEFT JOIN customer c ON s.customerkey = c.customerkey
    GROUP BY
        s.customerkey,
        s.orderdate,
        c.countryfull,
        c.age,
        c.givenname,
        c.surname
)
SELECT
    cr.*,
    -- Cohort date: the customer's first purchase date
    MIN(cr.orderdate) OVER (PARTITION BY cr.customerkey) AS first_purchase_date,
    -- Cohort year: the year of that first purchase
    EXTRACT(YEAR FROM MIN(cr.orderdate) OVER (PARTITION BY cr.customerkey)) AS cohort_year
FROM
    customer_revenue cr

-- This entire query was also saved as a view in the DBeaver app.
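To make the window-function step easier to check, here is a minimal pandas sketch of the same cohort logic. The `customer_revenue` rows are hypothetical, made-up values, not data from `contoso_100k`. `groupby(...).transform("min")` plays the role of `MIN(orderdate) OVER (PARTITION BY customerkey)`, and `.dt.year` plays the role of `EXTRACT(YEAR FROM ...)`.

```python
import pandas as pd

# Hypothetical customer_revenue rows (made-up values): one row per customer per order date.
customer_revenue = pd.DataFrame(
    {
        "customerkey": [1, 1, 2, 2, 3],
        "orderdate": pd.to_datetime(
            ["2015-03-01", "2016-07-15", "2016-01-10", "2016-02-20", "2017-05-05"]
        ),
        "total_net_revenue": [100.0, 50.0, 200.0, 25.0, 75.0],
    }
)

# Broadcast each customer's earliest order date to all of that customer's rows,
# like MIN(orderdate) OVER (PARTITION BY customerkey).
customer_revenue["first_purchase_date"] = (
    customer_revenue.groupby("customerkey")["orderdate"].transform("min")
)

# The cohort year is the year of the first purchase.
customer_revenue["cohort_year"] = customer_revenue["first_purchase_date"].dt.year

# Total revenue per cohort year, attributed to the cohort rather than the order year.
cohort_totals = customer_revenue.groupby("cohort_year")["total_net_revenue"].sum()
print(cohort_totals)
```

Customer 1's 2016 order is counted under the 2015 cohort here, because every row carries the year of that customer's first purchase.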

In [None]:
%%sql
-- daily revenue view
--CREATE VIEW daily_revenue AS
--
--SELECT
--	orderdate,
--	SUM(quantity * netprice * exchangerate) AS total_revenue
--FROM
--	sales
--GROUP BY
--	orderdate;

--DROP VIEW daily_revenue

# **Project: View--Cohort Analysis View**

Now suppose we want the total net revenue for each cohort year.

In [11]:
%%sql
SELECT
cohort_year,
SUM(total_net_revenue)
FROM
cohort_analysis
GROUP BY
cohort_year
ORDER BY cohort_year
-- Querying the view returns this result without rewriting the full cohort query.

|   | cohort_year | sum |
|---|---|---|
| 0 | 2015 | 14892230.47 |
| 1 | 2016 | 18360521.74 |
| 2 | 2017 | 21979733.96 |
| 3 | 2018 | 36460385.42 |
| 4 | 2019 | 36696243.88 |
| 5 | 2020 | 11921900.97 |
| 6 | 2021 | 18387736.18 |
| 7 | 2022 | 29872808.3 |
| 8 | 2023 | 14979328.33 |
| 9 | 2024 | 2856649.33 |




---



# **Updating Views--Cohort Analysis View**

ALTER VIEW [ IF EXISTS ] name ALTER [ COLUMN ] column_name SET DEFAULT expression

ALTER VIEW [ IF EXISTS ] name ALTER [ COLUMN ] column_name DROP DEFAULT

ALTER VIEW [ IF EXISTS ] name OWNER TO { new_owner | CURRENT_ROLE | CURRENT_USER | SESSION_USER }

ALTER VIEW [ IF EXISTS ] name RENAME [ COLUMN ] column_name TO new_column_name

ALTER VIEW [ IF EXISTS ] name RENAME TO new_name

ALTER VIEW [ IF EXISTS ] name SET SCHEMA new_schema

ALTER VIEW [ IF EXISTS ] name SET ( view_option_name [= view_option_value] [, ... ] )

ALTER VIEW [ IF EXISTS ] name RESET ( view_option_name [, ... ] )
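The forms above are easier to read with concrete names filled in. The sketch below only assembles a few PostgreSQL statements as strings against the `cohort_analysis` view; it does not connect to a database. The names `cohort_analysis_tmp` and `first_order_date` are hypothetical, and the `RENAME COLUMN` form is assumed to be supported by the installed PostgreSQL version.

```python
# Concrete statements following the ALTER VIEW forms listed above.
# cohort_analysis_tmp and first_order_date are hypothetical names used for illustration.
alter_statements = [
    # Rename the view, then rename it back so later cells still find it.
    "ALTER VIEW IF EXISTS cohort_analysis RENAME TO cohort_analysis_tmp",
    "ALTER VIEW IF EXISTS cohort_analysis_tmp RENAME TO cohort_analysis",
    # Rename one output column of the view.
    "ALTER VIEW IF EXISTS cohort_analysis "
    "RENAME COLUMN first_purchase_date TO first_order_date",
    # Make the current user the owner of the view.
    "ALTER VIEW IF EXISTS cohort_analysis OWNER TO CURRENT_USER",
]

for statement in alter_statements:
    print(statement + ";")
```

In this notebook, each printed statement could be pasted into its own `%%sql` cell. None of these forms changes the query stored in the view; they only alter its name, column names, ownership, schema, or options.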



---



[Views Colab notes](https://colab.research.google.com/drive/1k80otGRJsSVvHHDh1RHsKGP7eBmsfNFP?authuser=0#scrollTo=8-oImNlYRJlk)