# SNOWFLAKE database

A shared database provided by Snowflake.
By default, the SNOWFLAKE database is visible to all users,
but ***access*** to its objects is limited to the ACCOUNTADMIN role until that role grants privileges to other roles.
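
As a minimal sketch of how ACCOUNTADMIN can open this database to another role, the statement below grants read access through imported privileges. The role name ANALYST_ROLE is hypothetical and only stands in for a role that exists in your account.

```sql
-- Run as ACCOUNTADMIN. ANALYST_ROLE is a hypothetical role name.
USE ROLE ACCOUNTADMIN;

-- Lets the role query the views in the shared SNOWFLAKE database.
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE analyst_role;
```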

Schemas:


## ACCOUNT_USAGE -> frequently used and focus of this exercise

Views that display object metadata and usage metrics for your account

## ALERT

Functions that are intended for use in alert objects.

## BILLING

Views that contain billing information for the customers of Snowflake resellers and distributors.
Only resellers and distributors can access the views in the BILLING schema.

## CORE
Contains views and other schema objects to support:
- system tags used with classifying data 
- system data metric functions used to measure data quality

## DATA_PRIVACY

Contains functions and stored procedures related to data privacy. Also contains the custom_classifier class.

## DATA_SHARING_USAGE

Views that display object metadata and usage metrics related to listings published in the Snowflake Marketplace or a data exchange.

## EXTERNAL_ACCESS

Schema that contains built-in network rules specific to connections for network traffic outbound from Snowflake.

## INFORMATION_SCHEMA

This schema is automatically created in all databases. In a shared database, such as SNOWFLAKE, this schema doesn’t serve a purpose and can be disregarded.

## LOCAL

This schema is used by some account-level Snowflake features for logging to telemetry event tables.

## ML

Contains ML functions, a suite of analysis tools built by Snowflake, and the DOCUMENT_INTELLIGENCE class used in Document AI.

## MONITORING

Views that provide historical information for objects in your account. The Information Schema views and table functions that return historical information will eventually be migrated to the MONITORING schema.

## NETWORK_SECURITY

Schema that contains built-in network rules that define the set of allowed IP addresses that a frequently used, third-party partner application uses to connect with Snowflake.

## NOTIFICATION

Stored procedures and functions for sending notifications.

## ORGANIZATION_USAGE

Views that display historical usage data across all the accounts in your organization.

## READER_ACCOUNT_USAGE

Similar to ACCOUNT_USAGE, but contains only the views relevant to the reader accounts (if any) provisioned for the account.

## SPCS

Functions for use with Snowpark Container Services.

## TELEMETRY


Tables, views, and stored procedures to support collecting telemetry data such as log messages, trace event data, and metrics data.

## TRUST_CENTER

Views that display data about the Trust Center extensions.



# ACCOUNT_USAGE vs INFORMATION_SCHEMA

### ACCOUNT_USAGE:
Enables querying object metadata and historical usage data for your account.

### READER_ACCOUNT_USAGE:
Same as ACCOUNT_USAGE, but covers the reader accounts associated with your account.

### INFORMATION_SCHEMA:
Also known as the “Data Dictionary”, it consists of a set of system-defined views and table functions that provide extensive metadata information about the objects created in your account.


### Main differences:
Dropped objects:
- ACCOUNT_USAGE: Includes dropped objects
- INFORMATION_SCHEMA: No dropped objects (except cases where some cost is still involved)

Data Latency:
- ACCOUNT_USAGE: Latency of 45 minutes to 3 hours, depending on the view.
- INFORMATION_SCHEMA: None

Data Retention:
- ACCOUNT_USAGE: 1 year.
- INFORMATION_SCHEMA: 7 days to 6 months, depending on the view or table function.
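
The dropped-objects difference can be seen by asking both sources about tables. This is a sketch that assumes the SF_CERT_PREP database used in the exercise cells and a role that can read SNOWFLAKE.ACCOUNT_USAGE; because of the latency noted above, a table dropped a few minutes ago may not yet appear in the first result.

```sql
-- ACCOUNT_USAGE keeps rows for dropped tables; DELETED holds the drop time.
SELECT table_catalog, table_schema, table_name, deleted
FROM SNOWFLAKE.ACCOUNT_USAGE.TABLES
WHERE deleted IS NOT NULL
ORDER BY deleted DESC
LIMIT 20;

-- INFORMATION_SCHEMA lists only tables that currently exist in the database.
SELECT table_schema, table_name, created
FROM SF_CERT_PREP.INFORMATION_SCHEMA.TABLES
ORDER BY created DESC
LIMIT 20;
```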

# INFORMATION_SCHEMA -> main views / functions

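## Common arguments for table functions

- TIME_RANGE_START => constant_expr (LOGIN_HISTORY functions; the QUERY_HISTORY functions use END_TIME_RANGE_START and the warehouse functions use DATE_RANGE_START)
- TIME_RANGE_END => constant_expr (likewise END_TIME_RANGE_END and DATE_RANGE_END)
- RESULT_LIMIT => num - default = 100
- USER_NAME => 'string' - default = CURRENT_USER (only for the _BY_USER functions)
- WAREHOUSE_NAME => 'string' - default = CURRENT_WAREHOUSE (only for the warehouse-specific functions)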

### LOGIN_HISTORY() / LOGIN_HISTORY_BY_USER()

Login attempts by Snowflake users.

These are table functions, not views, so query them with "from table(...)".

    from table(information_schema.login_history())

    
####  LOGIN_HISTORY_BY_USER() samples:

Retrieve up to the last 100 login events of the current user:

    select * from table(information_schema.login_history_by_user()) order by event_timestamp;

Retrieve up to the last 1000 login events of the specified user:

    select * from table(information_schema.login_history_by_user(USER_NAME => 'USER1', result_limit => 1000)) order by event_timestamp;

####  LOGIN_HISTORY() samples:

Retrieve up to 100 login events of every user your current role is allowed to monitor in the last hour:


    select * from table(information_schema.login_history(TIME_RANGE_START => dateadd('hours',-1,current_timestamp()), TIME_RANGE_END => current_timestamp()))
    order by event_timestamp;


### QUERY_HISTORY()

Table functions for querying Snowflake query history:

- QUERY_HISTORY returns queries within a specified time range.
- QUERY_HISTORY_BY_SESSION returns queries within a specified session and time range.
- QUERY_HISTORY_BY_USER returns queries submitted by a specified user within a specified time range.
- QUERY_HISTORY_BY_WAREHOUSE returns queries executed by a specified warehouse within a specified time range.
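
A short sketch of two of these variants follows. USER1 is the sample user from the login examples above, and COMPUTE_WH is a hypothetical warehouse name; replace both with names from your account.

```sql
-- Up to 100 queries (the default RESULT_LIMIT) submitted by USER1
-- that finished in the last hour.
SELECT query_id, start_time, execution_status, total_elapsed_time, query_text
FROM TABLE(information_schema.query_history_by_user(
       user_name            => 'USER1',
       end_time_range_start => dateadd('hour', -1, current_timestamp()),
       end_time_range_end   => current_timestamp()
))
ORDER BY start_time DESC;

-- Up to 500 queries executed on the (hypothetical) COMPUTE_WH warehouse.
SELECT query_id, user_name, start_time, total_elapsed_time
FROM TABLE(information_schema.query_history_by_warehouse(
       warehouse_name => 'COMPUTE_WH',
       result_limit   => 500
))
ORDER BY start_time DESC;
```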

### WAREHOUSE_LOAD_HISTORY()

Queries the activity history (defined as the “query load”) for a single warehouse within a specified date range.
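
As a sketch, the call below pins the function to one warehouse for the last hour. COMPUTE_WH is a hypothetical warehouse name.

```sql
-- Load for one warehouse over the last hour.
-- avg_running: queries executing; avg_queued_load: queries queued
-- because the warehouse was overloaded.
SELECT start_time, end_time, avg_running, avg_queued_load, avg_blocked
FROM TABLE(information_schema.warehouse_load_history(
       date_range_start => dateadd('hour', -1, current_timestamp()),
       date_range_end   => current_timestamp(),
       warehouse_name   => 'COMPUTE_WH'
))
ORDER BY start_time;
```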

### COPY_HISTORY

Queries Snowflake data loading history along various dimensions within the last 14 days.

#### Arguments

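- TABLE_NAME => 'string' (required)
- START_TIME => constant_expr (required)
- END_TIME => constant_expr (optional)

PIPE_NAME is returned as an output column for Snowpipe loads; it is not an argument.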

### LOGIN_HISTORY 

Queries login attempts by Snowflake users.
These functions return login activity within the last 7 days.

- LOGIN_HISTORY returns login events within a specified time range.

- LOGIN_HISTORY_BY_USER returns login events of a specified user within a specified time range.

### TABLES view

Displays a row for each table and view in the specified (or current) database.

- The view does not honor the MANAGE GRANTS privilege and consequently may show less information than a SHOW command.
- Summing BYTES for a table does not give its total storage usage, because the amount does not include Time Travel and Fail-safe usage.
- The view does not include tables that have been dropped.

### TABLE_STORAGE_METRICS view

Displays table-level storage utilization information.

- Includes tables that have been dropped but are still incurring storage costs.
- You must use the ACCOUNTADMIN role. The view is visible to other roles and can be queried, but the queries will return no rows.
- There may be a 1-2 hour delay in updating storage-related statistics for active_bytes, time_travel_bytes, failsafe_bytes, and retained_for_clone_bytes.
- ID does not change for a table throughout its lifecycle, including if the table is renamed or dropped.
- CLONE_GROUP_ID is the ID of the oldest ancestor of a clone, including if the table has been dropped but is still accruing storage costs.
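
To illustrate the first point, this sketch lists dropped tables that still hold bytes in Time Travel, Fail-safe, or clone retention. It assumes the SF_CERT_PREP database from the exercise cells and the ACCOUNTADMIN role.

```sql
-- Dropped tables that are still accruing storage.
-- TABLE_DROPPED is NULL for tables that still exist.
SELECT id, clone_group_id, table_schema, table_name, table_dropped,
       time_travel_bytes, failsafe_bytes, retained_for_clone_bytes
FROM SF_CERT_PREP.INFORMATION_SCHEMA.TABLE_STORAGE_METRICS
WHERE table_dropped IS NOT NULL
ORDER BY (time_travel_bytes + failsafe_bytes + retained_for_clone_bytes) DESC
LIMIT 20;
```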



In [None]:
use database sf_cert_prep;

In [ ]:
-- List login events for all users in the last 60 minutes (up to RESULT_LIMIT rows, 100 by default)

select *
from table(information_schema.LOGIN_HISTORY(TIME_RANGE_START => dateadd('hours',-1,current_timestamp()), TIME_RANGE_END => current_timestamp()))
order by event_timestamp;


In [None]:
-- Recent COPY INTO loads: no latency, but history is limited to the last 14 days.
select * from sf_cert_prep.information_schema.LOAD_HISTORY where table_name = 'ORDERS';


In [None]:
SELECT query_id, user_name, warehouse_name, total_elapsed_time/1000 AS sec, query_text
FROM TABLE(information_schema.QUERY_HISTORY(
  end_time_range_start => dateadd('hour', -1, current_timestamp()),
  end_time_range_end   => current_timestamp(),
  result_limit         => 10000
))
ORDER BY total_elapsed_time DESC
LIMIT 20;

In [None]:
SELECT warehouse_name,
       DATE_TRUNC('hour', end_time) AS hour_utc,
       AVG(avg_queued_load)         AS avg_q_overload
FROM TABLE(information_schema.warehouse_load_history(
       date_range_start => dateadd('day', -1, current_timestamp()),
       date_range_end   => current_timestamp()
))
GROUP BY 1,2
ORDER BY avg_q_overload DESC
LIMIT 20;

In [None]:
SELECT warehouse_name,
       ROUND(SUM(credits_used), 2)                  AS credits,
       ROUND(SUM(credits_used_compute), 2)          AS compute,
       ROUND(SUM(credits_used_cloud_services), 2)   AS cloud_services
FROM TABLE(information_schema.warehouse_metering_history(
       date_range_start => dateadd('day', -7, current_date()),
       date_range_end   => current_date()
))
GROUP BY 1
ORDER BY credits DESC;

In [None]:
SELECT last_load_time, status, file_name, row_count, pipe_name
FROM TABLE(information_schema.copy_history(
       table_name => 'sf_cert_prep.public.orders',
       start_time => dateadd('day', -14, current_timestamp())
))
ORDER BY last_load_time DESC;

In [None]:
SELECT event_timestamp, user_name, client_ip, reported_client_type, is_success
FROM TABLE(information_schema.login_history(
       time_range_start => dateadd('day', -6, current_timestamp())
))
WHERE is_success = 'NO'
ORDER BY event_timestamp DESC;

In [None]:
SELECT
table_schema, table_name, row_count, bytes
FROM SF_CERT_PREP.INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'PUBLIC'
ORDER BY bytes DESC
LIMIT 50;

In [None]:
SELECT
id, table_schema, table_name, active_bytes, time_travel_bytes, failsafe_bytes, retained_for_clone_bytes, table_dropped
FROM INFORMATION_SCHEMA.TABLE_STORAGE_METRICS
WHERE table_schema = 'PUBLIC'
ORDER BY (active_bytes + time_travel_bytes + failsafe_bytes + retained_for_clone_bytes) DESC
LIMIT 50;

# ACCOUNT_USAGE -> main views / functions:

## SNOWFLAKE.ACCOUNT_USAGE

### LOAD_HISTORY view

History of data loaded into tables using the COPY INTO <table> command within the last 365 days (1 year).

Does not return the history of data loaded using Snowpipe; use COPY_HISTORY for that.

### COPY_HISTORY view

Displays load activity for both COPY INTO <table> statements and continuous data loading using Snowpipe.


### PIPE_USAGE_HISTORY view

Displays the history of data loaded using Snowpipe, and the credits billed for it, across your entire Snowflake account.
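
A minimal sketch of a per-pipe summary over the last 30 days; the 30-day window is an arbitrary choice for illustration.

```sql
-- Credits, bytes, and files per pipe over the last 30 days.
SELECT pipe_name,
       ROUND(SUM(credits_used), 4) AS credits,
       SUM(bytes_inserted)         AS bytes_inserted,
       SUM(files_inserted)         AS files_inserted
FROM SNOWFLAKE.ACCOUNT_USAGE.PIPE_USAGE_HISTORY
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY credits DESC;
```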

### METERING_HISTORY view

Returns the hourly credit usage for an account within the last 365 days (1 year).
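
Because each row is tagged with the service that consumed the credits, the hourly rows can be rolled up by SERVICE_TYPE. The query below is a sketch with an arbitrary 30-day window.

```sql
-- Credits per service (warehouse metering, Snowpipe, automatic clustering, ...)
-- over the last 30 days.
SELECT service_type,
       ROUND(SUM(credits_used), 2) AS credits
FROM SNOWFLAKE.ACCOUNT_USAGE.METERING_HISTORY
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY credits DESC;
```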

### WAREHOUSE_METERING_HISTORY view

Hourly credit usage for a single warehouse (or all the warehouses in your account) within the last 365 days (1 year).

### QUERY_HISTORY view

Queries Snowflake query history by various dimensions (time range, session, user, warehouse, and so on) within the last 365 days (1 year).

### LOGIN_HISTORY view

Query login attempts by Snowflake users within the last 365 days (1 year).

### DATABASE_STORAGE_USAGE_HISTORY view

Query the average daily storage usage, in bytes, for databases in the account for the last 365 days (1 year). 

The data includes:

- All data stored in tables in the database(s).
- All historical data maintained in Fail-safe for the database(s).
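
The two components listed above are exposed as separate columns, so they can be compared side by side. This sketch averages the daily values over an arbitrary 30-day window and converts them to GB.

```sql
-- Average table storage vs. Fail-safe storage per database, in GB.
SELECT database_name,
       ROUND(AVG(average_database_bytes) / POWER(1024, 3), 2) AS avg_table_gb,
       ROUND(AVG(average_failsafe_bytes) / POWER(1024, 3), 2) AS avg_failsafe_gb
FROM SNOWFLAKE.ACCOUNT_USAGE.DATABASE_STORAGE_USAGE_HISTORY
WHERE usage_date >= DATEADD(day, -30, CURRENT_DATE())
GROUP BY 1
ORDER BY avg_failsafe_gb DESC;
```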

### GRANTS_TO_ROLES view

Query access control privileges that have been granted to an account role, application, application role, database role, instance role, or user.

### GRANTS_TO_USERS view

This Account Usage view can be used to query the roles that have been granted to a user.

### TASK_HISTORY view

History of task usage within the last 365 days (1 year). The view displays one row for each run of a task in the history.

In [None]:
-- Use SNOWFLAKE.ACCOUNT_USAGE when you need up to 365 days of history
-- LOAD_HISTORY does not include Snowpipe loads
SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.LOAD_HISTORY WHERE TABLE_NAME = 'ORDERS';


In [None]:
-- Use SNOWFLAKE.ACCOUNT_USAGE when you need up to 365 days of history
-- COPY_HISTORY covers both COPY INTO loads and Snowpipe loads
SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.COPY_HISTORY WHERE TABLE_NAME = 'ORDERS';


In [None]:
-- METERING_HISTORY provides credit usage data at an hourly level.
-- The view provides the start and end times during which credit usage occurred.
-- It also breaks the usage down by the service that consumed the credits, such as virtual warehouse compute, Snowpipe, Automatic Clustering, etc.

SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.METERING_HISTORY;

In [ ]:
SELECT warehouse_name,
       ROUND(SUM(credits_used),2) AS credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE start_time >= DATEADD(day,-7,CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY credits DESC;

In [None]:
SELECT warehouse_name,
       SUM(credits_used_compute)               AS compute_credits,
       SUM(credits_attributed_compute_queries) AS query_credits,
       SUM(credits_used_compute) - SUM(credits_attributed_compute_queries) AS idle_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE start_time >= DATEADD(day, -10, CURRENT_DATE())
GROUP BY 1
ORDER BY idle_credits DESC;


In [None]:
SELECT query_id, user_name, warehouse_name, total_elapsed_time/1000 AS sec, query_text
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE start_time >= DATEADD(hour,-24,CURRENT_TIMESTAMP())
ORDER BY total_elapsed_time DESC
LIMIT 20;

In [None]:
SELECT user_name, COUNT(*) AS failures
FROM SNOWFLAKE.ACCOUNT_USAGE.LOGIN_HISTORY
WHERE is_success = 'NO'
  AND event_timestamp >= DATEADD(day,-30,CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY failures DESC;

In [None]:
SELECT database_name,
       ROUND(SUM(average_database_bytes)/(1024*1024*1024),2) AS gb_days
FROM SNOWFLAKE.ACCOUNT_USAGE.DATABASE_STORAGE_USAGE_HISTORY
WHERE usage_date >= DATEADD(day,-30,CURRENT_DATE())
GROUP BY 1
ORDER BY gb_days DESC;

In [None]:
SELECT grantee_name AS role, privilege, granted_on, name AS object_name, granted_by
FROM SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_ROLES
WHERE deleted_on IS NULL
ORDER BY role, granted_on, object_name;

In [None]:
-- GRANTS_TO_ROLES shows the access privileges granted to a role.
-- It also contains historical information (up to 365 days), so privileges granted and then revoked in the last 365 days are shown as well.
-- The query below uses GRANTS_TO_USERS instead, which shows the roles granted to a user:
-- https://docs.snowflake.com/en/sql-reference/account-usage/grants_to_users

SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_USERS
WHERE GRANTEE_NAME = CURRENT_USER() LIMIT 100;


In [None]:
SELECT name AS task_name, state, error_code, error_message, completed_time
FROM SNOWFLAKE.ACCOUNT_USAGE.TASK_HISTORY
WHERE completed_time >= DATEADD(day,-7,CURRENT_TIMESTAMP())
  AND state = 'FAILED'
ORDER BY completed_time DESC;