# View: Revenue and Number of Patients by Payment Status

## Data Source
- **Visits:** `workspace.hospital_silver.visits`

## Details
- Location: `workspace.hospital_gold.view_revenue_paymentstatus`
- Description: Total revenue and approximate distinct patient count for each payment status


In [0]:
# Databricks Storage
catalog_name = "workspace"
schema_bronze = "hospital_bronze"
schema_silver = "hospital_silver"
schema_gold = "hospital_gold"

# view name: used as the table name in the gold schema and in the checkpoint path
view_name = "view_revenue_paymentstatus"

# data source path
data_source = "s3://buckethospitaldata/view/"

# for streaming: checkpoint location (stored under the data source S3 path)
checkpoint_location = f"{data_source}_checkpoints/{view_name}"
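
The settings above resolve to a three-part table name (`catalog.schema.table`) for the source and the target, plus a checkpoint path. The following is a minimal sketch of how those strings come out; `qualified_name` is a hypothetical helper introduced only for illustration, since the notebook builds the names inline with f-strings.

```python
# Hypothetical helper for illustration; not part of the notebook itself.
def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Return a three-part name of the form catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


# Same values as the configuration cell above.
catalog_name = "workspace"
schema_silver = "hospital_silver"
schema_gold = "hospital_gold"
view_name = "view_revenue_paymentstatus"
data_source = "s3://buckethospitaldata/view/"

source_table = qualified_name(catalog_name, schema_silver, "visits")
target_table = qualified_name(catalog_name, schema_gold, view_name)
checkpoint_location = f"{data_source}_checkpoints/{view_name}"

print(source_table)
print(target_table)
print(checkpoint_location)
```

The two table names match the Data Source and Location entries listed at the top of this notebook. Note that `data_source` already ends with `/`, so the checkpoint path needs no extra separator.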

## Read data from Silver Layer

In [0]:
df_visits = spark.read.table(f"{catalog_name}.{schema_silver}.visits")

## Aggregate data

In [0]:
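from pyspark.sql import functions as F  # avoids shadowing Python's built-in sum

# Total revenue and approximate distinct patient count per payment status.
# approx_count_distinct returns an estimate; use F.countDistinct for an exact count.
view_revenue_paymentstatus = (
    df_visits.groupBy("Payment_Status")
    .agg(
        F.sum("Revenue_per_visit").alias("Revenue"),
        F.approx_count_distinct("Patient_ID").alias("Number_of_unique_patient"),
    )
    .sort("Payment_Status")
)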
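
To make the aggregation logic concrete, here is a small plain-Python sketch of the same grouping on a handful of made-up rows. It assumes the column names used above; the sample data and the `summarize` function are hypothetical. It counts patients exactly with a set, whereas the notebook uses `approx_count_distinct`, which returns an estimate that may differ slightly on large data.

```python
from collections import defaultdict

# Made-up sample rows for illustration only; not real hospital data.
visits = [
    {"Patient_ID": "P1", "Payment_Status": "Paid", "Revenue_per_visit": 100.0},
    {"Patient_ID": "P1", "Payment_Status": "Paid", "Revenue_per_visit": 50.0},
    {"Patient_ID": "P2", "Payment_Status": "Paid", "Revenue_per_visit": 25.0},
    {"Patient_ID": "P3", "Payment_Status": "Pending", "Revenue_per_visit": 80.0},
]


def summarize(rows):
    """Sum revenue and count distinct patients per payment status."""
    revenue = defaultdict(float)
    patients = defaultdict(set)
    for row in rows:
        status = row["Payment_Status"]
        revenue[status] += row["Revenue_per_visit"]
        patients[status].add(row["Patient_ID"])  # a set keeps each patient once
    # Sort by status to mirror .sort("Payment_Status")
    return [
        {
            "Payment_Status": status,
            "Revenue": revenue[status],
            "Number_of_unique_patient": len(patients[status]),
        }
        for status in sorted(revenue)
    ]


result = summarize(visits)
for row in result:
    print(row)
```

Patient `P1` has two paid visits but is counted once, which is the difference between counting visits and counting distinct patients.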

## Write data to the Gold Layer as a Delta table

In [0]:
(
    view_revenue_paymentstatus.write
    .format("delta")
    .mode("overwrite")  # replace the existing table contents on each run
    .option("overwriteSchema", "true")  # allow the table schema to change on overwrite
    .saveAsTable(f"{catalog_name}.{schema_gold}.{view_name}")
)