### Finviz Data Cleaning and Transformation Pipeline

This notebook processes the raw data downloaded from Finviz, cleans it, and transforms it into an analysis-ready format.

**Workflow:**

1.  **Load Data:** Reads a raw Finviz `.parquet` file for a specific date.
2.  **Feature Engineering:** Creates new composite columns (`Info`, `MktCap AUM`).
3.  **Data Type Conversion:**
    *   Converts abbreviated dollar and share-count strings (e.g., `1.5B`, `250K`) into numeric values in millions.
    *   Converts percentage strings (e.g., `12.5%`) into numeric values.
    *   Converts other object columns to their proper numeric types.
4.  **Final Processing:** Sorts the data by the combined `MktCap AUM` value (largest first), adds a `Rank` column, and sets `Ticker` as the index.
5.  **Save & Verify:** Saves the cleaned DataFrame to a new `.parquet` file and verifies the saved file.

### Setup and Configuration

**This is the only cell you need to modify.** It contains all imports, paths, and lists of columns for processing.

In [1]:
import sys
from pathlib import Path
import pandas as pd
import numpy as np

# --- Project Path Setup ---
NOTEBOOK_DIR = Path.cwd()
ROOT_DIR = NOTEBOOK_DIR.parent if NOTEBOOK_DIR.name == 'notebooks' else NOTEBOOK_DIR
if str(ROOT_DIR) not in sys.path:
    sys.path.append(str(ROOT_DIR))

SRC_DIR = ROOT_DIR / 'src'
if str(SRC_DIR) not in sys.path:
    sys.path.append(str(SRC_DIR))

# Import config and custom utils now that path is set
from config import DATE_STR, DOWNLOAD_DIR, DEST_DIR
import utils

# --- File Path Configuration ---
# Build paths using pathlib for cross-platform compatibility
SOURCE_PATH = Path(DOWNLOAD_DIR) / f'df_finviz_{DATE_STR}_stocks_etfs.parquet'
DEST_PATH = Path(DEST_DIR) / f'{DATE_STR}_df_finviz_stocks_etfs.parquet'

# --- Column Processing Configuration ---
# Define which columns need specific cleaning operations.

# Columns to combine into the 'Info' column
INFO_COLS = ["Sector", "Industry", "Single Category", "Asset Type"]

# Columns with abbreviated values (T, B, M, K suffixes) to be converted to millions
CURRENCY_COLS = [
    'Market Cap', 'AUM', 'Sales', 'Income', 'Outstanding', 'Float', 
    'Short Interest', 'Avg Volume', 'Flows 1M', 'Flows 3M', 'Flows YTD',
    'MktCap AUM' # This is the new column we create
]

# Other columns that are numeric but stored as strings (objects)
# Note: Percentage columns are detected automatically in Step 3.
OTHER_NUMERIC_COLS = [
    "No.", "P/E", "Fwd P/E", "PEG", "P/S", "P/B", "P/C", "P/FCF",
    "Book/sh", "Cash/sh", "Dividend TTM", "EPS", "EPS next Q", "Short Ratio",
    "Curr R", "Quick R", "LTDebt/Eq", "Debt/Eq", "Beta", "ATR", "RSI",
    "Employees", "Recom", "Rel Volume", "Volume", "Target Price",
    "Prev Close", "Open", "High", "Low", "Price", "Holdings"
]

# --- Notebook Setup ---
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 200)
pd.set_option('display.width', 2500)
%load_ext autoreload
%autoreload 2

# --- Verification ---
print(f"Source file: {SOURCE_PATH}")
print(f"Destination file: {DEST_PATH}")
print(f"Processing for date: {DATE_STR}")

Source file: C:\Users\ping\Downloads\df_finviz_2025-06-26_stocks_etfs.parquet
Destination file: c:\Users\ping\Files_win10\python\py311\stocks_v0_works\data\2025-06-26_df_finviz_stocks_etfs.parquet
Processing for date: 2025-06-26
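
The setup cell imports `DATE_STR`, `DOWNLOAD_DIR`, and `DEST_DIR` from a `config` module that is not shown in this notebook. As a rough illustration only, such a module might look like the sketch below. The directory locations and the hard-coded date are assumptions, not the project's actual configuration.

```python
# config.py (hypothetical sketch; the real module is not shown in this notebook)
from datetime import date
from pathlib import Path

# Assumption: the date is an ISO-formatted string such as '2025-06-26'.
DATE_STR = date(2025, 6, 26).isoformat()

# Assumption: raw downloads land in the user's Downloads folder and
# cleaned files go to a 'data' folder under the current working directory.
DOWNLOAD_DIR = Path.home() / "Downloads"
DEST_DIR = Path.cwd() / "data"

# The notebook builds its file names from these values like so:
source_path = Path(DOWNLOAD_DIR) / f"df_finviz_{DATE_STR}_stocks_etfs.parquet"
dest_path = Path(DEST_DIR) / f"{DATE_STR}_df_finviz_stocks_etfs.parquet"

print(source_path.name)
print(dest_path.name)
```

Keeping the date and directories in one module means the notebook itself only has to change when the column lists change.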


### Step 1: Load Raw Data

Load the source Parquet file into a pandas DataFrame.

In [2]:
print(f"--- Step 1: Loading data from {SOURCE_PATH.name} ---")

try:
    df = pd.read_parquet(SOURCE_PATH, engine='pyarrow')
    print("Data loaded successfully.")
    df.info()
    display(df.head(3))
except FileNotFoundError:
    print(f"ERROR: Source file not found at {SOURCE_PATH}")
    df = None  # Ensure df is None if loading fails
except Exception as e:
    print(f"An error occurred during file loading: {e}")
    df = None

--- Step 1: Loading data from df_finviz_2025-06-26_stocks_etfs.parquet ---
Data loaded successfully.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1560 entries, 0 to 1559
Columns: 111 entries, No. to Tags
dtypes: object(111)
memory usage: 1.3+ MB


Unnamed: 0,No.,Ticker,Company,Index,Sector,Industry,Country,Exchange,Market Cap,P/E,Fwd P/E,PEG,P/S,P/B,P/C,P/FCF,Book/sh,Cash/sh,Dividend,Dividend TTM,Dividend Ex Date,Payout Ratio,EPS,EPS next Q,EPS This Y,EPS Next Y,EPS Past 5Y,EPS Next 5Y,Sales Past 5Y,Sales Q/Q,EPS Q/Q,EPS YoY TTM,Sales YoY TTM,Sales,Income,EPS Surprise,Revenue Surprise,Outstanding,Float,Float %,Insider Own,Insider Trans,Inst Own,Inst Trans,Short Float,Short Ratio,Short Interest,ROA,ROE,ROIC,Curr R,Quick R,LTDebt/Eq,Debt/Eq,Gross M,Oper M,Profit M,Perf Week,Perf Month,Perf Quart,Perf Half,Perf Year,Perf YTD,Beta,ATR,Volatility W,Volatility M,SMA20,SMA50,SMA200,50D High,50D Low,52W High,52W Low,52W Range,All-Time High,All-Time Low,RSI,Earnings,IPO Date,Optionable,Shortable,Employees,Change from Open,Gap,Recom,Avg Volume,Rel Volume,Volume,Target Price,Prev Close,Open,High,Low,Price,Change,Single Category,Asset Type,Expense,Holdings,AUM,Flows 1M,Flows% 1M,Flows 3M,Flows% 3M,Flows YTD,Flows% YTD,Return% 1Y,Return% 3Y,Return% 5Y,Tags
0,741,SOLV,Solventum Corp,S&P 500,Healthcare,Medical Instruments & Supplies,USA,NYSE,12.91B,34.39,12.55,-,1.55,3.96,24.18,33.02,18.85,3.09,-,-,-,0.00%,2.17,1.45,-16.86%,6.73%,-,-0.19%,-,2.68%,-42.89%,-70.83%,1.28%,8.31B,378.00M,10.64%,2.66%,173.01M,138.41M,80.00%,20.00%,0.00%,67.78%,5.05%,3.03%,3.58,4.20M,2.59%,10.63%,3.41%,1.19,0.85,2.4,2.53,55.45%,10.67%,4.55%,2.75%,2.90%,0.53%,11.67%,45.48%,12.97%,0.69,1.7,2.05%,2.04%,0.67%,5.01%,4.35%,-2.29%,18.25%,-13.14%,58.25%,47.16 - 85.92,-22.30%,58.25%,55.85,May 08/a,3/26/2024,Yes,Yes,22000,-0.39%,-0.04%,2.77,1.17M,0.53,616831,81.57,74.95,74.92,75.68,74.28,74.63,-0.43%,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-
1,742,REG,Regency Centers Corporation,S&P 500,Real Estate,REIT - Retail,USA,NASD,12.88B,33.27,28.89,3.83,8.75,1.97,164.02,33.55,35.76,0.43,4.02%,2.78,6/11/2025,128.50%,2.12,0.55,8.25%,6.90%,8.16%,8.69%,6.21%,4.44%,1.46%,3.41%,7.39%,1.47B,386.55M,2.98%,3.56%,181.05M,180.20M,99.53%,0.73%,-8.52%,103.47%,0.69%,3.45%,5.5,6.21M,3.17%,5.83%,3.31%,0.97,0.97,0.74,0.79,43.40%,35.96%,26.24%,-0.25%,-2.22%,-3.14%,-4.42%,13.30%,-4.59%,1.01,1.18,1.51%,1.53%,-1.14%,-1.73%,-2.76%,-5.06%,1.73%,-9.77%,16.33%,60.64 - 78.18,-24.55%,366.38%,45.07,Apr 29/a,10/29/1993,Yes,Yes,500,0.54%,0.19%,1.7,1.13M,1.06,1198269,78.89,70.03,70.16,70.6,69.95,70.54,0.73%,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-
2,743,PAA,Plains All American Pipeline LP,-,Energy,Oil & Gas Midstream,USA,NASD,12.84B,19.65,12.19,13.74,0.26,1.69,30.08,6.24,10.8,0.61,8.50%,1.40,5/1/2025,173.45%,0.93,0.35,7.18%,-7.41%,-22.68%,1.43%,8.56%,-1.01%,68.23%,-20.48%,3.40%,50.17B,653.00M,-10.51%,-15.05%,703.78M,457.18M,64.96%,35.00%,0.00%,39.16%,-5.71%,2.38%,2.95,10.87M,3.46%,9.43%,3.60%,1.01,0.94,0.88,0.93,3.96%,3.19%,1.30%,-0.81%,8.18%,-10.40%,7.79%,3.93%,6.91%,0.66,0.46,2.35%,2.37%,3.65%,5.15%,0.70%,-3.08%,14.88%,-13.05%,17.24%,15.57 - 21.00,-70.11%,508.67%,59.94,May 09/b,11/18/1998,Yes,Yes,4200,1.39%,0.33%,2.4,3.68M,0.58,2152966,21.0,17.95,18.01,18.27,17.96,18.26,1.73%,-,-,-,-,-,-,-,-,-,-,-,-,-,-,-
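
Later steps skip any configured column that is absent from the DataFrame (`if col in df.columns`), so a renamed or missing Finviz column would pass silently. A small check right after loading can surface this. The helper name `find_missing_columns` and the sample column lists below are illustrative assumptions, not part of the notebook.

```python
from typing import Iterable, List

def find_missing_columns(expected: Iterable[str], actual: Iterable[str]) -> List[str]:
    """Return the expected column names that do not appear in `actual`, in order."""
    actual_set = set(actual)
    return [col for col in expected if col not in actual_set]

# Toy example: pretend the loaded frame lacks 'AUM' and 'Flows 1M'.
loaded_columns = ["No.", "Ticker", "Market Cap", "Sales"]
expected_columns = ["Market Cap", "AUM", "Sales", "Flows 1M"]

missing = find_missing_columns(expected_columns, loaded_columns)
print(f"Missing columns: {missing}")
```

In the notebook, the expected list could be built from `INFO_COLS`, `CURRENCY_COLS` (without `'MktCap AUM'`, which is created in Step 2), and `OTHER_NUMERIC_COLS`, and compared against `df.columns`.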


### Step 2: Feature Engineering - Create Composite Columns

Combine existing columns to create more meaningful features: `Info` and `MktCap AUM`.

In [3]:
if df is not None:
    print("\n--- Step 2: Engineering new features ---")
    
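    # 1. Create 'Info' column by concatenating category columns.
    # Only use the INFO_COLS that are present, so a missing column cannot raise a KeyError.
    info_cols = [col for col in INFO_COLS if col in df.columns]
    for col in info_cols:
        df[col] = df[col].replace('-', '')
    df['Info'] = df[info_cols].apply(lambda row: ', '.join(filter(None, row.astype(str))), axis=1)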
    print("Created 'Info' column.")

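    # 2. Create 'MktCap AUM' by concatenating 'Market Cap' and 'AUM'.
    # This combines the stock and ETF size metrics into a single string column for now.
    # The concatenation is only meaningful when a row has a value in one of the two
    # columns and the '-' placeholder in the other.
    # The column is converted to numeric in the next step.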
    df['MktCap AUM'] = df['Market Cap'].replace('-', '') + df['AUM'].replace('-', '')
    print("Created 'MktCap AUM' column.")

    # Display the new columns for verification
    display(df[['Ticker', 'Info', 'MktCap AUM']].head(3))


--- Step 2: Engineering new features ---


Created 'Info' column.
Created 'MktCap AUM' column.


| | Ticker | Info | MktCap AUM |
|---|---|---|---|
| 0 | SOLV | Healthcare, Medical Instruments & Supplies | 12.91B |
| 1 | REG | Real Estate, REIT - Retail | 12.88B |
| 2 | PAA | Energy, Oil & Gas Midstream | 12.84B |
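
String concatenation works here because each row carries a value in only one of the two columns. An alternative that makes the fallback explicit is to treat `-` as missing and fill `Market Cap` from `AUM`. The sketch below uses made-up rows and is not the notebook's method.

```python
import pandas as pd

# Made-up rows: one stock, one ETF, and one row with no size data at all.
toy = pd.DataFrame({
    "Ticker": ["STOCK", "ETF", "NONE"],
    "Market Cap": ["12.91B", "-", "-"],
    "AUM": ["-", "5.20B", "-"],
})

# Turn the '-' placeholder into missing values, then take Market Cap
# where it exists and fall back to AUM otherwise.
market_cap = toy["Market Cap"].where(toy["Market Cap"] != "-")
aum = toy["AUM"].where(toy["AUM"] != "-")
toy["MktCap AUM"] = market_cap.fillna(aum)

print(toy)
```

With this variant, a row that happened to have both values would keep its `Market Cap` instead of producing a fused string that later fails numeric conversion.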


### Step 3: Data Type Conversion

This multi-part step cleans and converts all string-based numeric and percentage columns into proper numeric types.

#### Part A: Convert Abbreviated Currency Columns to Millions

In [4]:
def convert_to_millions(value: str) -> float:
    """Converts a string with a T/B/M/K suffix to a numeric value in millions."""
    if pd.isna(value):
        return np.nan
    
    value_str = str(value).strip().upper()
    if not value_str:
        return np.nan

    multipliers = {'T': 1_000_000, 'B': 1_000, 'M': 1, 'K': 0.001}
    suffix = value_str[-1]
    
    if suffix in multipliers:
        number_part = value_str[:-1]
        try:
            return float(number_part) * multipliers[suffix]
        except (ValueError, TypeError):
            return np.nan
    return np.nan

if df is not None:
    print("\n--- Step 3a: Converting currency columns to millions ---")
    new_names = {}
    for col in CURRENCY_COLS:
        if col in df.columns:
            df[col] = df[col].apply(convert_to_millions)
            new_names[col] = f"{col}, M"
    
    df.rename(columns=new_names, inplace=True)
    print(f"Converted and renamed {len(new_names)} columns.")
    display(df[[name for name in new_names.values() if name in df.columns]].head(3))


--- Step 3a: Converting currency columns to millions ---
Converted and renamed 12 columns.


| | Market Cap, M | AUM, M | Sales, M | Income, M | Outstanding, M | Float, M | Short Interest, M | Avg Volume, M | Flows 1M, M | Flows 3M, M | Flows YTD, M | MktCap AUM, M |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 12910.0 | NaN | 8310.0 | 378.0 | 173.01 | 138.41 | 4.2 | 1.17 | NaN | NaN | NaN | 12910.0 |
| 1 | 12880.0 | NaN | 1470.0 | 386.55 | 181.05 | 180.2 | 6.21 | 1.13 | NaN | NaN | NaN | 12880.0 |
| 2 | 12840.0 | NaN | 50170.0 | 653.0 | 703.78 | 457.18 | 10.87 | 3.68 | NaN | NaN | NaN | 12840.0 |
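
The cell above converts values one at a time with `.apply`. For larger files, a vectorized variant may be worth considering. The sketch below is one possible approach, not the notebook's implementation. The function name `to_millions_vectorized` is an assumption, and the regular expression is stricter than `float()`: it accepts only plain decimals such as `12.91B` or `-3.5M`.

```python
import pandas as pd

MULTIPLIERS = {"T": 1_000_000.0, "B": 1_000.0, "M": 1.0, "K": 0.001}

def to_millions_vectorized(series: pd.Series) -> pd.Series:
    """Convert strings like '1.5B' or '250K' to floats in millions; others become NaN."""
    text = series.astype(str).str.strip().str.upper()
    # Column 0 is the signed number, column 1 is the single-letter suffix.
    parts = text.str.extract(r"^(-?\d+(?:\.\d+)?)([TBMK])$")
    number = pd.to_numeric(parts[0], errors="coerce")
    factor = parts[1].map(MULTIPLIERS).astype(float)
    return number * factor

sample = pd.Series(["12.91B", "378.00M", "250K", "-", None])
print(to_millions_vectorized(sample))
```

As in the original function, the `-` placeholder, missing values, and numbers without a suffix all come out as `NaN`.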


#### Part B: Convert Percentage Columns to Numeric

In [5]:
if df is not None:
    print("\n--- Step 3b: Converting percentage columns ---")
    percent_cols = [
        col for col in df.columns if df[col].dtype == 'object' and df[col].str.endswith('%', na=False).any()
    ]

    if not percent_cols:
        print("No new percentage columns found to modify.")
    else:
        print("Processing the following percentage columns:")
        for col in percent_cols:
            df[col] = pd.to_numeric(df[col].str.replace('%', ''), errors='coerce')
            new_name = f"{col} %" if '%' not in col else col
            df.rename(columns={col: new_name}, inplace=True)
            print(f"  - Converted '{col}' to numeric and renamed to '{new_name}'")


--- Step 3b: Converting percentage columns ---
Processing the following percentage columns:
  - Converted 'Dividend' to numeric and renamed to 'Dividend %'
  - Converted 'Payout Ratio' to numeric and renamed to 'Payout Ratio %'
  - Converted 'EPS This Y' to numeric and renamed to 'EPS This Y %'
  - Converted 'EPS Next Y' to numeric and renamed to 'EPS Next Y %'
  - Converted 'EPS Past 5Y' to numeric and renamed to 'EPS Past 5Y %'
  - Converted 'EPS Next 5Y' to numeric and renamed to 'EPS Next 5Y %'
  - Converted 'Sales Past 5Y' to numeric and renamed to 'Sales Past 5Y %'
  - Converted 'Sales Q/Q' to numeric and renamed to 'Sales Q/Q %'
  - Converted 'EPS Q/Q' to numeric and renamed to 'EPS Q/Q %'
  - Converted 'EPS YoY TTM' to numeric and renamed to 'EPS YoY TTM %'
  - Converted 'Sales YoY TTM' to numeric and renamed to 'Sales YoY TTM %'
  - Converted 'EPS Surprise' to numeric and renamed to 'EPS Surprise %'
  - Converted 'Revenue Surprise' to numeric and renamed to 'Revenue Surprise %'

  - Converted 'Profit M' to numeric and renamed to 'Profit M %'
  - Converted 'Perf Week' to numeric and renamed to 'Perf Week %'
  - Converted 'Perf Month' to numeric and renamed to 'Perf Month %'
  - Converted 'Perf Quart' to numeric and renamed to 'Perf Quart %'
  - Converted 'Perf Half' to numeric and renamed to 'Perf Half %'
  - Converted 'Perf Year' to numeric and renamed to 'Perf Year %'
  - Converted 'Perf YTD' to numeric and renamed to 'Perf YTD %'
  - Converted 'Volatility W' to numeric and renamed to 'Volatility W %'
  - Converted 'Volatility M' to numeric and renamed to 'Volatility M %'
  - Converted 'SMA20' to numeric and renamed to 'SMA20 %'
  - Converted 'SMA50' to numeric and renamed to 'SMA50 %'
  - Converted 'SMA200' to numeric and renamed to 'SMA200 %'
  - Converted '50D High' to numeric and renamed to '50D High %'
  - Converted '50D Low' to numeric and renamed to '50D Low %'
  - Converted '52W High' to numeric and renamed to '52W High %'
  - Converted '52W Low' to numeric and renamed to '52W Low %'
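
Note that the converted values are percentage points (`12.5%` becomes `12.5`, not `0.125`). The toy example below repeats the detection and conversion logic on made-up data, collects the renames in a dictionary so they are applied in one call, and shows how a fraction could be derived when needed. The sample values and the use of `pd.api.types.is_string_dtype` are choices made for this sketch.

```python
import pandas as pd

toy = pd.DataFrame({
    "Perf Week": ["2.75%", "-0.25%", "-"],
    "Flows% 1M": ["1.10%", "-", "0.40%"],
    "Company": ["A Corp", "B Corp", "C Corp"],
})

# Detect string columns in which at least one value ends with '%'.
percent_cols = [
    col for col in toy.columns
    if pd.api.types.is_string_dtype(toy[col])
    and toy[col].str.endswith("%", na=False).any()
]

rename_map = {}
for col in percent_cols:
    # '2.75%' -> 2.75 (percentage points); the '-' placeholder becomes NaN.
    toy[col] = pd.to_numeric(toy[col].str.replace("%", "", regex=False), errors="coerce")
    # Keep names that already contain '%', otherwise append ' %'.
    rename_map[col] = col if "%" in col else f"{col} %"

toy = toy.rename(columns=rename_map)

# Optional: a fraction version for arithmetic (2.75 -> 0.0275).
perf_week_fraction = toy["Perf Week %"] / 100

print(toy.dtypes)
```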

#### Part C: Convert Other String-Based Numeric Columns

In [6]:
if df is not None:
    print("\n--- Step 3c: Converting other numeric string columns ---")
    converted_count = 0
    for col in OTHER_NUMERIC_COLS:
        if col in df.columns and df[col].dtype == 'object':
            df[col] = pd.to_numeric(df[col].str.replace(',', '', regex=False), errors='coerce')
            converted_count += 1
            
    print(f"Converted {converted_count} additional columns to numeric type.")
    print("\nData types after all conversions:")
    df.info()


--- Step 3c: Converting other numeric string columns ---
Converted 32 additional columns to numeric type.

Data types after all conversions:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1560 entries, 0 to 1559
Columns: 113 entries, No. to MktCap AUM, M
dtypes: float64(94), int64(2), object(17)
memory usage: 1.3+ MB
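
Because `errors='coerce'` turns anything unparseable into `NaN` without complaint, it can be useful to count how many values were lost that were not simply the `-` placeholder. The helper below is a hypothetical sketch of such an audit, run on made-up data.

```python
import pandas as pd

def count_coercion_losses(raw: pd.Series, placeholder: str = "-") -> int:
    """Count values that are present in `raw` yet fail numeric conversion."""
    cleaned = raw.astype(str).str.replace(",", "", regex=False)
    converted = pd.to_numeric(cleaned, errors="coerce")
    # A value counts as present if it is neither missing nor the placeholder.
    was_present = raw.notna() & (raw != placeholder)
    return int((was_present & converted.isna()).sum())

raw_employees = pd.Series(["22,000", "500", "-", "n/a", None])
print(count_coercion_losses(raw_employees))
```

Running this on each column in `OTHER_NUMERIC_COLS` before the conversion would flag columns whose format has changed upstream.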


### Step 4: Final Processing - Sort, Index, and Rank

Sort the DataFrame by the combined `MktCap AUM, M` column in descending order, add a `Rank` column, and set `Ticker` as the index.

In [7]:
if df is not None:
    print("\n--- Step 4: Finalizing DataFrame ---")
    
    # 1. Sort by the primary metric in descending order
    df.sort_values(by='MktCap AUM, M', ascending=False, inplace=True, na_position='last')
    print("Sorted DataFrame by 'MktCap AUM, M'.")
    
    # 2. Add a 'Rank' column based on the new sort order
    df['Rank'] = range(1, len(df) + 1)
    print("Added 'Rank' column.")
    
    # 3. Set 'Ticker' as the index
    if 'Ticker' in df.columns:
        df.set_index('Ticker', inplace=True)
        print("Set 'Ticker' as the index.")
    
    print("\nFinal DataFrame structure:")
    display(df[['Rank', 'Info', 'MktCap AUM, M']].head())


--- Step 4: Finalizing DataFrame ---
Sorted DataFrame by 'MktCap AUM, M'.
Added 'Rank' column.
Set 'Ticker' as the index.

Final DataFrame structure:



| Ticker | Rank | Info | MktCap AUM, M |
|---|---|---|---|
| NVDA | 1 | Technology, Semiconductors | 3782490.0 |
| MSFT | 2 | Technology, Software - Infrastructure | 3697320.0 |
| AAPL | 3 | Technology, Consumer Electronics | 3002100.0 |
| AMZN | 4 | Consumer Cyclical, Internet Retail | 2305020.0 |
| GOOG | 5 | Communication Services, Internet Content & Inf... | 2111210.0 |
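
The same sort, rank, and index steps can also be written as a single chained expression that returns a new DataFrame instead of modifying one in place. The sketch below uses made-up tickers and values. Note that rows with a missing size value are sorted last but still receive a rank.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "Ticker": ["AAA", "BBB", "CCC", "DDD"],
    "MktCap AUM, M": [500.0, np.nan, 12_000.0, 3_400.0],
})

ranked = (
    toy.sort_values(by="MktCap AUM, M", ascending=False, na_position="last")
       # Rank follows the sorted row order: 1 for the largest value.
       .assign(Rank=lambda d: np.arange(1, len(d) + 1))
       .set_index("Ticker")
)

print(ranked)
```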


### Step 5: Save and Verify Cleaned Data

Save the final, cleaned DataFrame to a new Parquet file and read it back to verify integrity.
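
Reading the file back confirms that it can be parsed. A stricter check would compare the reloaded frame with the one in memory. The helper below is a minimal sketch of that comparison. The name `frames_match` is an assumption, and the example uses an in-memory copy as a stand-in for the Parquet round trip.

```python
import pandas as pd

def frames_match(original: pd.DataFrame, reloaded: pd.DataFrame) -> bool:
    """Return True when both frames have equal index, columns, dtypes, and values."""
    try:
        pd.testing.assert_frame_equal(original, reloaded)
        return True
    except AssertionError as err:
        print(f"Round-trip mismatch: {err}")
        return False

# Stand-in for the real round trip, which would be:
#   reloaded = pd.read_parquet(DEST_PATH, engine='pyarrow')
original = pd.DataFrame({"Rank": [1, 2], "Price": [155.02, 497.45]}, index=["NVDA", "MSFT"])
reloaded = original.copy()

print(frames_match(original, reloaded))
print(frames_match(original, reloaded.assign(Price=[155.02, 0.0])))
```

If a storage format legitimately changes some dtypes, `assert_frame_equal` accepts `check_dtype=False` to compare values only.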

In [8]:
if df is not None:
    print("\n--- Step 5: Saving and verifying data ---")
    try:
        # Ensure destination directory exists
        DEST_PATH.parent.mkdir(parents=True, exist_ok=True)
        
        # Save the file
        df.to_parquet(DEST_PATH, engine='pyarrow', compression='zstd')
        print(f"Successfully saved cleaned data to: {DEST_PATH}")

        # Verify by loading it back
        loaded_df = pd.read_parquet(DEST_PATH, engine='pyarrow')
        print("\nVerification successful. First 20 rows of the saved file:")
        display(loaded_df.head(20))
        
    except Exception as e:
        print(f"An error occurred during save or verification: {e}")


--- Step 5: Saving and verifying data ---
Successfully saved cleaned data to: c:\Users\ping\Files_win10\python\py311\stocks_v0_works\data\2025-06-26_df_finviz_stocks_etfs.parquet

Verification successful. First 20 rows of the saved file:


Ticker,No.,Company,Index,Sector,Industry,Country,Exchange,"Market Cap, M",P/E,Fwd P/E,PEG,P/S,P/B,P/C,P/FCF,Book/sh,Cash/sh,Dividend %,Dividend TTM,Dividend Ex Date,Payout Ratio %,EPS,EPS next Q,EPS This Y %,EPS Next Y %,EPS Past 5Y %,EPS Next 5Y %,Sales Past 5Y %,Sales Q/Q %,EPS Q/Q %,EPS YoY TTM %,Sales YoY TTM %,"Sales, M","Income, M",EPS Surprise %,Revenue Surprise %,"Outstanding, M","Float, M",Float %,Insider Own %,Insider Trans %,Inst Own %,Inst Trans %,Short Float %,Short Ratio,"Short Interest, M",ROA %,ROE %,ROIC %,Curr R,Quick R,LTDebt/Eq,Debt/Eq,Gross M %,Oper M %,Profit M %,Perf Week %,Perf Month %,Perf Quart %,Perf Half %,Perf Year %,Perf YTD %,Beta,ATR,Volatility W %,Volatility M %,SMA20 %,SMA50 %,SMA200 %,50D High %,50D Low %,52W High %,52W Low %,52W Range,All-Time High %,All-Time Low %,RSI,Earnings,IPO Date,Optionable,Shortable,Employees,Change from Open %,Gap %,Recom,"Avg Volume, M",Rel Volume,Volume,Target Price,Prev Close,Open,High,Low,Price,Change %,Single Category,Asset Type,Expense %,Holdings,"AUM, M","Flows 1M, M",Flows% 1M,"Flows 3M, M",Flows% 3M,"Flows YTD, M",Flows% YTD,Return% 1Y,Return% 3Y,Return% 5Y,Tags,Info,"MktCap AUM, M",Rank
NVDA,1,NVIDIA Corp,"DJIA, NDX, S&P 500",Technology,Semiconductors,USA,NASD,3782490.0,49.93,27.12,1.69,25.47,45.09,70.45,52.49,3.44,2.2,0.03,0.04,6/11/2025,1.16,3.1,1.0,44.35,32.42,91.83,29.57,64.24,69.18,27.6,81.36,86.17,148510.0,76770.0,9.89,1.68,24390.0,23400.0,95.97,4.08,-0.36,66.39,0.57,0.88,0.82,206.8,75.89,115.46,81.82,3.39,2.96,0.12,0.12,70.11,58.03,51.69,6.56,14.41,36.27,15.09,31.25,15.44,2.13,3.92,2.27,2.36,7.93,20.5,20.22,0.37,63.11,0.37,78.97,86.62 - 154.45,0.37,464959.99,74.45,May 28/a,1/22/1999,Yes,Yes,36000.0,-0.63,1.1,1.37,250.77,0.78,196761441,173.58,154.31,156.0,156.71,154.0,155.02,0.46,,,,,,,,,,,,,,,-,"Technology, Semiconductors",3782490.0,1
MSFT,2,Microsoft Corporation,"DJIA, NDX, S&P 500",Technology,Software - Infrastructure,USA,NASD,3697320.0,38.44,32.83,2.65,13.69,11.49,46.44,53.3,43.3,10.71,0.66,2.41,8/21/2025,25.42,12.94,3.37,13.5,13.12,18.45,14.53,14.33,13.27,17.88,12.1,14.13,270010.0,96640.0,7.38,2.38,7430.0,7320.0,98.5,1.48,-0.12,73.61,0.68,0.7,2.23,51.17,18.46,33.61,23.24,1.37,1.36,0.29,0.33,69.07,45.23,35.79,3.58,7.98,27.56,13.94,11.12,18.02,1.03,6.99,1.55,1.25,4.76,12.17,17.72,0.59,39.86,0.59,44.28,344.79 - 494.56,0.59,624220.41,79.42,Apr 30/a,3/13/1986,Yes,Yes,228000.0,0.92,0.13,1.31,22.91,0.94,21475419,519.16,492.27,492.93,498.04,492.81,497.45,1.05,,,,,,,,,,,,,,,-,"Technology, Software - Infrastructure",3697320.0,2
AAPL,3,Apple Inc,"DJIA, NDX, S&P 500",Technology,Consumer Electronics,USA,NASD,3002100.0,31.37,25.87,4.09,7.5,44.95,61.9,30.48,4.47,3.25,0.51,1.01,5/12/2025,16.11,6.41,1.41,6.21,8.39,15.41,7.67,8.51,5.08,7.68,-0.36,4.91,400370.0,97290.0,1.39,0.86,14940.0,14920.0,99.88,0.1,-1.28,63.81,-0.18,0.67,1.63,100.23,29.1,138.02,66.93,0.82,0.78,1.18,1.47,46.63,31.81,24.3,2.25,0.39,-9.27,-21.02,-3.43,-19.73,1.21,4.26,1.77,1.89,0.31,-0.81,-10.04,-6.32,5.89,-22.72,18.79,169.21 - 260.10,-22.72,315858.21,49.62,May 01/a,12/12/1980,Yes,Yes,164000.0,-0.12,-0.15,2.04,61.34,0.82,50495251,228.01,201.56,201.25,202.64,199.46,201.0,-0.28,,,,,,,,,,,,,,,-,"Technology, Consumer Electronics",3002100.0,3
AMZN,4,Amazon.com Inc,"DJIA, NDX, S&P 500",Consumer Cyclical,Internet Retail,USA,NASD,2305020.0,35.41,29.86,2.06,3.54,7.53,23.46,110.77,28.82,9.25,,,-,0.0,6.13,1.32,12.18,17.23,36.89,17.22,17.86,8.62,62.33,71.88,10.08,650310.0,65940.0,16.38,0.33,10610.0,9490.0,89.45,10.58,-0.02,64.43,0.38,0.65,1.27,61.84,11.23,25.24,15.02,1.05,0.84,0.44,0.49,49.16,11.15,10.14,2.16,5.39,7.95,-3.47,17.0,-1.03,1.33,5.1,2.28,2.06,2.7,8.52,6.07,-0.59,31.36,-10.47,43.21,151.61 - 242.52,-10.47,330749.53,62.03,May 01/a,5/15/1997,Yes,Yes,1556000.0,1.92,0.5,1.23,48.71,1.03,50181208,240.25,211.99,213.04,218.04,212.01,217.12,2.42,,,,,,,,,,,,,,,-,"Consumer Cyclical, Internet Retail",2305020.0,4
GOOG,5,Alphabet Inc,"NDX, S&P 500",Communication Services,Internet Content & Information,USA,NASD,2111210.0,19.45,17.16,1.5,5.88,6.14,22.15,28.19,28.41,7.88,0.32,0.81,6/9/2025,7.46,8.97,2.16,19.2,6.07,26.76,12.93,16.73,11.81,48.77,37.73,13.02,359310.0,111000.0,38.84,1.15,5470.0,5070.0,92.65,58.21,-0.01,27.09,-1.27,0.65,1.21,33.03,25.15,34.79,30.02,1.77,1.77,0.07,0.08,58.54,32.6,30.89,0.26,0.26,4.36,-9.6,-3.52,-8.41,1.01,4.61,3.02,2.23,0.71,4.59,0.44,-4.39,17.54,-16.42,22.27,142.66 - 208.70,-16.42,617.49,54.89,Apr 24/a,3/27/2014,Yes,Yes,183323.0,0.62,1.08,1.44,27.26,0.95,25811374,199.8,171.49,173.35,174.65,170.86,174.43,1.71,,,,,,,,,,,,,,,-,"Communication Services, Internet Content & Inf...",2111210.0,5
GOOGL,6,Alphabet Inc,"NDX, S&P 500",Communication Services,Internet Content & Information,USA,NASD,2110400.0,19.35,17.07,1.5,5.87,6.11,22.14,28.18,28.41,7.84,0.3,0.81,6/9/2025,7.46,8.97,2.16,19.23,6.06,26.76,12.93,16.73,11.81,48.77,37.73,13.02,359310.0,111000.0,38.81,1.15,5830.0,5800.0,99.65,52.17,-0.01,38.3,-1.26,1.17,1.66,67.94,25.15,34.79,30.02,1.77,1.77,0.07,0.08,58.54,32.6,30.89,0.13,0.37,5.14,-9.34,-3.17,-8.33,1.01,4.53,3.0,2.2,0.89,4.99,0.85,-4.18,18.78,-16.18,23.49,140.53 - 207.05,-16.18,7126.84,55.39,Apr 24/a,8/19/2004,Yes,Yes,183323.0,0.6,1.07,1.43,41.05,0.77,31653102,199.85,170.68,172.5,173.69,169.94,173.54,1.68,,,,,,,,,,,,,,,-,"Communication Services, Internet Content & Inf...",2110400.0,6
META,7,Meta Platforms Inc,"NDX, S&P 500",Communication Services,Internet Content & Information,USA,NASD,1825500.0,28.32,25.64,2.8,10.72,9.9,25.97,34.9,73.34,27.96,0.24,2.05,6/16/2025,8.38,25.64,5.81,7.31,10.58,29.99,10.13,18.4,16.07,36.38,47.56,19.37,170360.0,66640.0,22.83,2.36,2180.0,2170.0,99.44,13.78,-0.42,68.15,0.35,1.39,1.88,30.14,26.49,39.83,28.65,2.66,2.66,0.26,0.27,81.75,43.0,39.11,4.36,13.04,18.84,24.06,45.54,24.01,1.27,16.39,2.4,2.09,5.3,15.62,19.21,1.33,51.33,-2.0,64.03,442.65 - 740.91,-2.0,4037.27,69.06,Apr 30/a,5/18/2012,Yes,Yes,74067.0,1.7,0.75,1.45,16.07,0.87,13916762,708.83,708.68,713.98,728.22,711.05,726.09,2.46,,,,,,,,,,,,,,,-,"Communication Services, Internet Content & Inf...",1825500.0,7
AVGO,8,Broadcom Inc,"NDX, S&P 500",Technology,Semiconductors,USA,NASD,1270740.0,101.56,32.91,3.89,22.28,18.26,134.16,55.99,14.8,2.01,0.88,2.3,6/20/2025,170.61,2.66,1.66,35.67,24.27,13.91,26.13,17.94,20.16,132.81,14.43,33.85,57050.0,12920.0,0.69,0.31,4700.0,4610.0,98.02,1.99,-1.34,77.31,0.83,0.94,1.61,43.56,7.79,18.98,9.84,1.08,0.98,0.89,0.97,61.72,37.9,22.64,7.53,14.65,50.71,22.37,69.68,16.53,1.13,8.03,3.0,2.98,6.8,20.08,34.72,0.11,67.17,0.11,110.25,128.50 - 269.87,0.11,18753.46,72.14,Jun 05/a,8/6/2009,Yes,Yes,37000.0,1.9,0.18,1.41,27.1,0.87,23501526,288.55,264.65,265.13,271.67,264.12,270.17,2.09,,,,,,,,,,,,,,,-,"Technology, Semiconductors",1270740.0,8
TSM,9,Taiwan Semiconductor Manufacturing ADR,-,Technology,Semiconductors,Taiwan,NYSE,1161690.0,28.82,20.42,1.26,12.01,8.38,14.27,37.42,26.72,15.7,1.26,2.66,9/16/2025,29.86,7.77,2.33,37.68,15.94,26.75,22.79,21.09,35.36,53.28,47.97,35.45,96700.0,40300.0,3.48,1.35,5190.0,5180.0,99.88,0.11,0.0,16.16,0.02,0.51,1.69,26.18,20.37,32.11,23.98,2.39,2.18,0.22,0.24,56.02,47.19,41.68,4.92,13.32,29.11,13.59,33.49,13.43,1.28,4.97,2.21,1.88,7.09,18.18,18.37,0.23,53.6,-1.06,67.71,133.57 - 226.40,-1.06,8526.03,73.23,Apr 17/a,10/8/1997,Yes,Yes,,-0.44,1.01,1.28,15.52,0.49,7650155,224.76,222.74,225.0,225.22,222.7,224.01,0.57,,,,,,,,,,,,,,,-,"Technology, Semiconductors",1161690.0,9
BRK-A,10,Berkshire Hathaway Inc,-,Financial,Insurance - Diversified,USA,NYSE,1049010.0,12.97,22.58,22.35,2.83,1.6,3.02,86.95,455055.17,241884.66,,,-,0.0,56289.42,7486.53,-7.69,6.18,4.43,0.58,7.84,-0.16,-63.73,10.77,0.63,371290.0,80900.0,-5.4,-1.22,0.54,0.54,100.0,62.25,-0.0,7.7,-5.07,0.05,0.43,0.0,7.24,13.2,10.39,6.35,6.02,0.19,0.21,23.58,15.29,21.79,0.22,-3.99,-8.66,6.93,16.94,7.18,0.83,9826.38,1.29,1.12,-1.07,-4.03,0.81,-10.22,0.94,-10.22,20.04,607954.81 - 812855.00,-10.22,80541.75,40.42,May 03/a,3/17/1980,No,Yes,392400.0,-0.03,0.0,2.33,0.00065,0.54,351,782731.13,730000.0,730000.0,732541.5,726736.38,729807.81,-0.03,,,,,,,,,,,,,,,-,"Financial, Insurance - Diversified",1049010.0,10
