## Comparing Annual Share Repurchases Across Companies Using the Snowflake PIVOT Function
### Author: [Prasanna Rajagopal](https://www.linkedin.com/in/prasannarajagopal/)

## [PIVOT](https://docs.snowflake.com/en/sql-reference/constructs/pivot)

Rotates a table by turning the unique values from one column in the input expression into multiple columns and aggregating results where required on any remaining column values. In a query, it is specified in the FROM clause after the table name or subquery.

The operator supports the built-in aggregate functions AVG, COUNT, MAX, MIN, and SUM.

PIVOT can be used to transform a narrow table (for example, empid, month, sales) into a wider table (for example, empid, jan_sales, feb_sales, mar_sales).
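To make the narrow-to-wide idea concrete, here is a minimal plain-Python sketch of what a `SUM` pivot does to the (empid, month, sales) example above. It is only an illustration of the semantics, not how Snowflake implements PIVOT; the sample rows and the helper name `pivot_sum` are made up for this example.

```python
from collections import defaultdict

# Hypothetical narrow rows: (empid, month, sales). The values are made up.
narrow_rows = [
    (1, "jan", 100),
    (1, "feb", 150),
    (1, "jan", 50),   # second January row for empid 1, to show aggregation
    (2, "jan", 200),
    (2, "mar", 300),
]


def pivot_sum(rows, pivot_values):
    """Group by empid and sum sales into one <month>_sales column per pivot value."""
    wide = defaultdict(lambda: {f"{m}_sales": None for m in pivot_values})
    for empid, month, sales in rows:
        row = wide[empid]  # every empid gets an output row, even if all columns stay empty
        if month not in pivot_values:
            continue  # months that are not listed do not fill any column
        col = f"{month}_sales"
        row[col] = sales if row[col] is None else row[col] + sales
    return dict(wide)


wide_rows = pivot_sum(narrow_rows, ["jan", "feb", "mar"])
for empid, cols in sorted(wide_rows.items()):
    print(empid, cols)
```

Each employee ends up with exactly one row, and a month with no sales stays empty (`None`), much like a NULL in the pivoted SQL result.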

#### Ask the Snowflake Cortex AI COMPLETE function about the uses and value of the PIVOT function for analyzing or comparing values

In [None]:
SELECT SNOWFLAKE.CORTEX.COMPLETE('claude-3-5-sonnet', 'Explain the PIVOT function in SQL and, if possible, specifically describe Snowflake\'s PIVOT function. Also, describe its uses and value in analyzing and/or comparing values between the pivot columns. For instance, I am using the PIVOT function to compare the annual share repurchases made by Apple, Microsoft, Google, and Amazon. I am also using the PIVOT function to compare the capex of Apple, Microsoft, Google, Meta, and Amazon.');

#### Here's part of the answer generated by Anthropic's Claude Sonnet LLM:
The PIVOT function is particularly valuable when:
1. Comparing multiple entities
2. Analyzing time-series data
3. Creating executive dashboards
4. Preparing data for visualization
5. Conducting competitive analysis

## Stage for Hosting Company Share Repurchase Data
Create a stage in Snowflake to host the share repurchase data downloaded from the [SEC](https://www.sec.gov/search-filings). 
The data files are in the [JSON](https://en.wikipedia.org/wiki/JSON) format.  

In [None]:
CREATE STAGE US_GAAP_Financials_Int_Stg 
	DIRECTORY = ( ENABLE = true ) 
	ENCRYPTION = ( TYPE = 'SNOWFLAKE_SSE' ) 
	COMMENT = 'Store the JSON Files of US Financials of Publicly Listed Companies. Downloaded From the U.S. SEC.';

## Instructions to upload the JSON share repurchase data files into the stage.

- #### You can download the JSON data files, stored as a zip archive, from [GitHub here](https://github.com/rrprasan/Finance/tree/main/Snowflake/Notebooks/Company_Financials/Share_Repurchase_Comparison).
    - ##### The zip file is titled: Share_Repurchase_Data.zip
    - ##### Unzip the files to your local drive.
    - ##### Copy the data files into the Snowflake internal stage US_GAAP_Financials_Int_Stg.

- #### Follow [these steps](https://docs.snowflake.com/en/user-guide/data-load-local-file-system-stage-ui#upload-files-onto-a-named-internal-stage) in Snowsight to upload the files into the stage.
#### 1. Sign in to Snowsight.
#### 2. Select Data » Add Data.
#### 3. On the Add Data page, select Load files into a Stage.
#### 4. In the Upload Your Files dialog that appears, select the files that you want to upload. You can upload multiple files at the same time.
#### 5. Select the database schema in which you created the stage, then select the stage.
#### 6. Optionally, select or create a path where you want to save your files within the stage.
#### 7. Select Upload.

#### The zipped archive contains share repurchase data files for the following companies:
- Apple
- Google
- Meta
- Microsoft
- Oracle

## Create Table STOCK_REPURCHASES_RAW_TBL to Store the JSON Data in a Column of VARIANT Datatype 

In [None]:
CREATE TRANSIENT TABLE STOCK_REPURCHASES_RAW_TBL
(
    STOCK_REPURCHASES_JSON VARIANT
);

### COPY INTO <TABLE> to Load the JSON Data into STOCK_REPURCHASES_RAW_TBL

The [COPY INTO](https://docs.snowflake.com/en/sql-reference/sql/copy-into-table) command is an easy and flexible way to load data from files into an existing Snowflake table. The files must already be in one of the following locations:

- Named internal stage (or table/user stage). Files can be staged using the PUT command.
- Named external stage that references an external location (Amazon S3, Google Cloud Storage, or Microsoft Azure).
- External location (Amazon S3, Google Cloud Storage, or Microsoft Azure).

You cannot access data held in archival cloud storage classes that require restoration before the data can be retrieved. These archival storage classes include, for example, the Amazon S3 Glacier Flexible Retrieval or Glacier Deep Archive storage class, or Microsoft Azure Archive Storage.

### The COPY command below reads the JSON data files from a directory path within the stage.
### Specify the path where you uploaded the files in the COPY command.
### Creating a directory when you upload the JSON files to the stage is optional; if you skip it, omit the path from the COPY command.

In [None]:
COPY INTO STOCK_REPURCHASES_RAW_TBL FROM @US_GAAP_Financials_Int_Stg/Cash_Flow_Stmt/Financing/Stock_Repurchases
FILE_FORMAT = (TYPE = JSON);

In [None]:
SELECT * FROM STOCK_REPURCHASES_RAW_TBL;

In [None]:
SELECT
    STOCK_REPURCHASES_JSON:"cik" CIK,
    STOCK_REPURCHASES_JSON:"entityName"::VARCHAR COMPANY_NAME,
    STOCK_REPURCHASES_JSON:"label"::VARCHAR LABEL,
    STOCK_REPURCHASES_JSON:"tag"::VARCHAR TAG,
    SR_FLAT.VALUE:"fy"::VARCHAR FY,
    SR_FLAT.VALUE:"start"::DATE FY_PERIOD_START_DATE,
    SR_FLAT.VALUE:"end"::DATE FY_PERIOD_END_DATE,
    SR_FLAT.VALUE:"val"::NUMBER SHARE_REPURCHASE
FROM
    STOCK_REPURCHASES_RAW_TBL,
    LATERAL FLATTEN(input => STOCK_REPURCHASES_JSON:"units":"USD") sr_flat;

## Using LATERAL FLATTEN to extract each reported value
The FLATTEN function takes a semi-structured column and creates a separate row for each element in an array or object. Here, each element under "units" and "USD" in the JSON becomes its own row.
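As a rough illustration of what flattening does, the following plain-Python sketch turns one JSON document into one row per element of its "units" / "USD" array. The field names mirror the ones read by the query above; the sample document, its values, and the helper name `flatten_usd` are made up for this example and are not taken from the SEC files.

```python
import json

# A made-up document that uses the same field names the query above reads.
sample_json = """
{
  "cik": 1,
  "entityName": "Example Corp",
  "units": {
    "USD": [
      {"fy": 2023, "start": "2022-01-01", "end": "2022-12-31", "val": 1000},
      {"fy": 2023, "start": "2023-01-01", "end": "2023-12-31", "val": 1500}
    ]
  }
}
"""


def flatten_usd(document):
    """Yield one flat row per element of document["units"]["USD"]."""
    for element in document["units"]["USD"]:
        yield {
            # Top-level fields are repeated on every output row.
            "COMPANY_NAME": document["entityName"],
            # Per-element fields come from the array element itself.
            "FY": element["fy"],
            "FY_PERIOD_START_DATE": element["start"],
            "FY_PERIOD_END_DATE": element["end"],
            "SHARE_REPURCHASE": element["val"],
        }


flat_rows = list(flatten_usd(json.loads(sample_json)))
for row in flat_rows:
    print(row)
```

The single input document produces two output rows, one per array element, which is the same shape the LATERAL FLATTEN query returns.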
## Using DISTINCT to filter out duplicate values from the JSON file
When companies file their financials with the SEC, they repeat financial data from earlier periods to make it easy for investors to compare financial performance across quarters or years.
As a result, the same value can appear more than once in the JSON file. The DISTINCT keyword removes these repeated rows from the SQL result set.
## Using DATEDIFF to filter for annual data
We also want to look at only the annual data. So, we use the DATEDIFF function to compute the number of days in each reporting period and keep only the periods longer than 300 days.
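The two filters described above can be sketched in plain Python as follows. This is only an illustration of the logic, assuming made-up rows and a hypothetical helper named `annual_distinct`; the 300-day threshold is the one used in the SQL query.

```python
from datetime import date

# Made-up flattened rows: (company, period start, period end, amount).
rows = [
    ("Example Corp", date(2023, 1, 1), date(2023, 12, 31), 1500),
    ("Example Corp", date(2023, 1, 1), date(2023, 12, 31), 1500),  # repeated in a later filing
    ("Example Corp", date(2023, 10, 1), date(2023, 12, 31), 400),  # quarterly period
    ("Example Corp", date(2022, 1, 1), date(2022, 12, 31), 1000),
]


def annual_distinct(rows, min_days=300):
    """Keep periods longer than min_days and drop exact duplicate output rows."""
    seen = set()
    result = []
    for company, start, end, amount in rows:
        days_in_period = (end - start).days  # same idea as DATEDIFF(DAY, start, end)
        if days_in_period <= min_days:
            continue  # quarterly and other short periods are filtered out
        output_row = (company, end.year, days_in_period, amount)
        if output_row not in seen:  # same idea as SELECT DISTINCT
            seen.add(output_row)
            result.append(output_row)
    return result


annual_rows = annual_distinct(rows)
for row in annual_rows:
    print(row)
```

Of the four input rows, only two survive: the duplicate annual row collapses into one, and the 91-day quarterly row is dropped.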

## Query Explanation Provided by [Snowflake Copilot](https://docs.snowflake.com/en/user-guide/snowflake-copilot)
This SQL query is designed to retrieve distinct company names, the fiscal year, the number of days in the fiscal period, and the share repurchase amount from a table named STOCK_REPURCHASES_RAW_TBL. The query first flattens the nested JSON data in the STOCK_REPURCHASES_JSON column to extract relevant information. It then filters the results to only include fiscal periods with more than 300 days and orders the output by company name in ascending order and fiscal year in descending order. The query uses the DISTINCT keyword to eliminate duplicate rows, and the EXTRACT and DATEDIFF functions to extract the year from the fiscal period end date and calculate the number of days in the fiscal period, respectively. The query also uses the LATERAL FLATTEN function to flatten the nested JSON data in the STOCK_REPURCHASES_JSON column.

In [None]:
SELECT
    DISTINCT COMPANY_NAME,
    EXTRACT(YEAR FROM FY_PERIOD_END_DATE) FY_YEAR,
    DATEDIFF(DAY, FY_PERIOD_START_DATE, FY_PERIOD_END_DATE) NUMBER_OF_DAYS_IN_PERIOD,
    SHARE_REPURCHASE_AMOUNT
FROM
(SELECT
    STOCK_REPURCHASES_JSON:"cik",
    STOCK_REPURCHASES_JSON:"entityName"::VARCHAR COMPANY_NAME,
    STOCK_REPURCHASES_JSON:"label"::VARCHAR LABEL,
    STOCK_REPURCHASES_JSON:"tag"::VARCHAR TAG,
    SR_FLAT.VALUE:"fy"::VARCHAR FY,
    SR_FLAT.VALUE:"start"::DATE FY_PERIOD_START_DATE,
    SR_FLAT.VALUE:"end"::DATE FY_PERIOD_END_DATE,
    SR_FLAT.VALUE:"val"::NUMBER SHARE_REPURCHASE_AMOUNT
FROM
    STOCK_REPURCHASES_RAW_TBL,
    LATERAL FLATTEN(input => STOCK_REPURCHASES_JSON:"units":"USD") sr_flat)
WHERE
    NUMBER_OF_DAYS_IN_PERIOD > 300
ORDER BY COMPANY_NAME ASC, FY_YEAR DESC;

## Create a view using the above query.

In [None]:
CREATE OR REPLACE VIEW ANNUAL_SHARE_REPURCHASE_AMOUNT_VW
AS
SELECT
    DISTINCT COMPANY_NAME,
    EXTRACT(YEAR FROM FY_PERIOD_END_DATE) FY_YEAR,
    DATEDIFF(DAY, FY_PERIOD_START_DATE, FY_PERIOD_END_DATE) NUMBER_OF_DAYS_IN_PERIOD,
    SHARE_REPURCHASE_AMOUNT
FROM
(SELECT
    STOCK_REPURCHASES_JSON:"cik",
    STOCK_REPURCHASES_JSON:"entityName"::VARCHAR COMPANY_NAME,
    STOCK_REPURCHASES_JSON:"label"::VARCHAR LABEL,
    STOCK_REPURCHASES_JSON:"tag"::VARCHAR TAG,
    SR_FLAT.VALUE:"fy"::VARCHAR FY,
    SR_FLAT.VALUE:"start"::DATE FY_PERIOD_START_DATE,
    SR_FLAT.VALUE:"end"::DATE FY_PERIOD_END_DATE,
    SR_FLAT.VALUE:"val"::NUMBER SHARE_REPURCHASE_AMOUNT
FROM
    STOCK_REPURCHASES_RAW_TBL,
    LATERAL FLATTEN(input => STOCK_REPURCHASES_JSON:"units":"USD") sr_flat)
WHERE
    NUMBER_OF_DAYS_IN_PERIOD > 300
ORDER BY COMPANY_NAME ASC, FY_YEAR DESC;

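## Excluding the NUMBER_OF_DAYS_IN_PERIOD Column from the View
- This step is optional.
- It creates a new view, which allows you to keep the original view.
- The NUMBER_OF_DAYS_IN_PERIOD column is not needed in the PIVOT output. PIVOT groups by every remaining column, so leaving it in would split rows by period length.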

In [None]:
CREATE OR REPLACE VIEW ANNUAL_SHARE_REPURCHASE_AMOUNT_MINUS_NUM_DAYS_IN_PERIOD_VW
AS
SELECT 
    *
    EXCLUDE NUMBER_OF_DAYS_IN_PERIOD
FROM ANNUAL_SHARE_REPURCHASE_AMOUNT_VW;

In [None]:
SELECT * FROM ANNUAL_SHARE_REPURCHASE_AMOUNT_MINUS_NUM_DAYS_IN_PERIOD_VW;

## Using the PIVOT Function to Compare Share Repurchases

### Query Explanation Provided by [Snowflake Copilot](https://docs.snowflake.com/en/user-guide/snowflake-copilot)
The given SQL query is selecting the fiscal year, and the total share repurchase amount for five companies (Microsoft, Apple, Alphabet, Meta, and Oracle) from the ANNUAL_SHARE_REPURCHASE_AMOUNT_MINUS_NUM_DAYS_IN_PERIOD_VW view. The query uses the PIVOT function to transform the data from a row-based format to a column-based format, with each company's share repurchase amount displayed in a separate column. The query also filters the results to only include fiscal years greater than 2012 and orders the output by fiscal year in descending order.

### Comparing Share Repurchases Made by Apple, Microsoft, Alphabet, Meta, and Oracle.

In [None]:
SELECT
    TO_VARCHAR(TO_DATE(TO_VARCHAR("FISCAL YEAR"), 'YYYY'), 'YYYY') FISCAL_YEAR,
    TO_VARCHAR(MICROSOFT, '$999,999,999,999') MICROSOFT,
    TO_VARCHAR(APPLE, '$999,999,999,999') APPLE,
    TO_VARCHAR(ALPHABET, '$999,999,999,999') ALPHABET,
    TO_VARCHAR(META, '$999,999,999,999') META,
    TO_VARCHAR(ORACLE, '$999,999,999,999') ORACLE
FROM
    ANNUAL_SHARE_REPURCHASE_AMOUNT_MINUS_NUM_DAYS_IN_PERIOD_VW
    PIVOT(SUM(SHARE_REPURCHASE_AMOUNT) FOR COMPANY_NAME IN ('MICROSOFT CORPORATION', 'Apple Inc.', 'Alphabet Inc.', 'Meta Platforms, Inc.', 'Oracle Corporation')) P ("FISCAL YEAR", MICROSOFT, APPLE, ALPHABET, META, ORACLE)
WHERE
    "FISCAL YEAR" > 2012
ORDER BY "FISCAL YEAR" DESC;

## In FY 2024, Apple spent about 5.5x as much on share repurchases as Microsoft.
**In 2024, Apple spent \$94.9 billion on share repurchases compared to Microsoft's \$17.2 billion.**
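The multiple in the heading above follows directly from the two figures quoted. A quick check in Python, using only the rounded amounts stated in the text (in billions of dollars):

```python
# Figures quoted in the text above, in billions of US dollars.
apple_repurchases_bn = 94.9
microsoft_repurchases_bn = 17.2

# Ratio of Apple's repurchases to Microsoft's repurchases.
ratio = apple_repurchases_bn / microsoft_repurchases_bn
print(f"Apple spent about {ratio:.1f}x as much as Microsoft")
```

Because the inputs are rounded to one decimal place, the ratio is only meaningful to about one decimal place as well.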



In [None]:
SELECT
    TO_VARCHAR(TO_DATE(TO_VARCHAR("FISCAL YEAR"), 'YYYY'), 'YYYY') FISCAL_YEAR,
    TO_VARCHAR(MICROSOFT, '$999,999,999,999') MICROSOFT,
    TO_VARCHAR(APPLE, '$999,999,999,999') APPLE,
    APPLE/MICROSOFT APPLE_TO_MICROSOFT_REPURCHASE_RATIO,
    TO_VARCHAR(ALPHABET, '$999,999,999,999') ALPHABET,
    TO_VARCHAR(META, '$999,999,999,999') META,
    TO_VARCHAR(ORACLE, '$999,999,999,999') ORACLE
FROM
    ANNUAL_SHARE_REPURCHASE_AMOUNT_MINUS_NUM_DAYS_IN_PERIOD_VW
    PIVOT(SUM(SHARE_REPURCHASE_AMOUNT) FOR COMPANY_NAME IN ('MICROSOFT CORPORATION', 'Apple Inc.', 'Alphabet Inc.', 'Meta Platforms, Inc.', 'Oracle Corporation')) P ("FISCAL YEAR", MICROSOFT, APPLE, ALPHABET, META, ORACLE)
WHERE
    "FISCAL YEAR" > 2012
ORDER BY "FISCAL YEAR" DESC;