### Ticker Generation Workflow

This notebook builds the master ticker list used by the `Yloader` application.

**Workflow:**

1.  **Prerequisite:** Run an external process (e.g., a `finviz` scraper) to generate one or more ticker files (e.g., `ticker_2023-10-27_stocks_etfs.csv`) and save them to your `Downloads` directory.
2.  **Find Files:** The notebook scans the `Downloads` directory for the most recently modified CSV files whose names start with a configured prefix.
3.  **Combine & Unify:** It reads all found files, combines the tickers into a single list, and removes duplicates.
4.  **Save List:** The final, unique list of tickers is saved as `tickers.csv` in the `tickers` subdirectory of the `Yloader` project, ready for use.


### Setup and Configuration

This cell contains all the necessary imports and configuration variables. **Modify the parameters in the `Configuration` section below to match your setup before running the notebook.**

In [1]:
import csv
from pathlib import Path
import pandas as pd
from typing import List, Set

# --- Configuration ---

# Directory to search for incoming ticker files.
# Assumes the Downloads folder sits directly under the user's home directory.
DOWNLOADS_DIR = Path.home() / "Downloads"

# The prefix of the ticker files to look for (e.g., 'ticker' for 'ticker_2023-10-27.csv').
TICKER_FILE_PREFIX = 'ticker'

# The number of most recent ticker files to process.
# Set to a higher number if you need to combine many historical files.
RECENT_FILE_COUNT = 10

# The target directory where the final, combined ticker list will be saved.
YLOADER_TICKERS_DIR = Path.home() / "Desktop" / "yloader" / "tickers"

# The name of the final output file.
OUTPUT_TICKER_FILENAME = "tickers.csv"

# --- Verification ---
print(f"Searching for ticker files in: {DOWNLOADS_DIR}")
print(f"Output directory for Yloader: {YLOADER_TICKERS_DIR}")

Searching for ticker files in: C:\Users\ping\Downloads
Output directory for Yloader: C:\Users\ping\Desktop\yloader\tickers
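
Before running the remaining cells, it can help to confirm that the configured values are usable. The following is a minimal sketch rather than part of the original notebook: `check_config` is a hypothetical helper that collects problems as messages instead of raising, so all issues are visible at once.

```python
from pathlib import Path
from typing import List


def check_config(downloads_dir: Path, output_dir: Path, file_count: int) -> List[str]:
    """Return a list of human-readable problems; an empty list means the config looks usable.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    problems: List[str] = []
    if not downloads_dir.is_dir():
        problems.append(f"Downloads directory not found: {downloads_dir}")
    if file_count < 1:
        problems.append(f"RECENT_FILE_COUNT must be at least 1, got {file_count}")
    # The output directory may not exist yet (the save step creates it),
    # but if the path already exists it must be a directory.
    if output_dir.exists() and not output_dir.is_dir():
        problems.append(f"Output path exists but is not a directory: {output_dir}")
    return problems
```

With the variables defined above, a call such as `check_config(DOWNLOADS_DIR, YLOADER_TICKERS_DIR, RECENT_FILE_COUNT)` should return an empty list when everything is in order.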


### Step 1: Find Recent Ticker Files

This cell defines a function that finds the most recently modified ticker files and immediately runs it on the `Downloads` directory.

In [2]:
def find_recent_csv_files(
    search_dir: Path,
    prefix: str,
    count: int
) -> List[Path]:
    """
    Finds the most recent CSV files in a directory that start with a given prefix.

    Args:
        search_dir (Path): The Path object for the directory to search.
        prefix (str): The prefix the CSV filenames must start with.
        count (int): The maximum number of recent file paths to return.

    Returns:
        List[Path]: A list of Path objects for the found files, sorted from
                    most recent to oldest. Returns an empty list if the
                    directory doesn't exist or no matching files are found.
    """
    if not search_dir.is_dir():
        print(f"Error: Directory not found at '{search_dir}'")
        return []

    candidate_files = [f for f in search_dir.glob(f"{prefix}*.csv") if f.is_file()]
    if not candidate_files:
        return []

    sorted_files = sorted(
        candidate_files,
        key=lambda f: f.stat().st_mtime,
        reverse=True
    )
    return sorted_files[:count]

# --- Execute Step 1 ---
print("--- Step 1: Finding recent ticker files ---")
recent_ticker_files = find_recent_csv_files(
    search_dir=DOWNLOADS_DIR,
    prefix=TICKER_FILE_PREFIX,
    count=RECENT_FILE_COUNT
)

if recent_ticker_files:
    print(f"Found {len(recent_ticker_files)} recent ticker file(s):")
    for i, file_path in enumerate(recent_ticker_files):
        print(f"  {i+1}. {file_path.name}")
else:
    print(f"No recent CSV files starting with '{TICKER_FILE_PREFIX}' found in '{DOWNLOADS_DIR}'.")
    # find_recent_csv_files() already returned an empty list, so
    # recent_ticker_files is defined and the next cell can still run.

--- Step 1: Finding recent ticker files ---
Found 10 recent ticker file(s):
  1. ticker_2025-08-25_stocks_etfs.csv
  2. ticker_2025-08-22_stocks_etfs.csv
  3. ticker_2025-08-21_stocks_etfs.csv
  4. ticker_2025-08-20_stocks_etfs.csv
  5. ticker_2025-08-19_stocks_etfs.csv
  6. ticker_2025-08-18_stocks_etfs.csv
  7. ticker_2025-08-16_stocks_etfs.csv
  8. ticker_2025-08-15_stocks_etfs.csv
  9. ticker_2025-08-14_stocks_etfs.csv
  10. ticker_2025-08-13_stocks_etfs.csv
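
The function above ranks files by modification time, which can differ from the date in the filename, for example when an older file is downloaded again. As an alternative, the sketch below orders files by the date embedded in the name. It assumes the `ticker_YYYY-MM-DD_...csv` naming seen in the output above; `date_from_filename` and `sort_by_embedded_date` are hypothetical helpers, not part of the original notebook.

```python
import re
from datetime import date
from pathlib import Path
from typing import List, Optional

# Assumed pattern, inferred from names such as 'ticker_2025-08-25_stocks_etfs.csv'.
DATE_IN_NAME = re.compile(r"_(\d{4})-(\d{2})-(\d{2})")


def date_from_filename(path: Path) -> Optional[date]:
    """Return the date embedded in the filename, or None if there is none."""
    match = DATE_IN_NAME.search(path.name)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:  # e.g. '2025-13-40' matches the regex but is not a real date
        return None


def sort_by_embedded_date(paths: List[Path], count: int) -> List[Path]:
    """Return up to `count` paths, newest embedded date first; undated files are dropped."""
    dated = [(date_from_filename(p), p) for p in paths]
    dated = [(d, p) for d, p in dated if d is not None]
    dated.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in dated[:count]]
```

Whether this ordering is preferable depends on how the upstream scraper names its files; modification time remains the simpler choice when filenames carry no reliable date.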


### Step 2: Combine Tickers into a Unique List

This cell defines the function to combine tickers and immediately runs it on the files found in Step 1.

In [3]:
def combine_tickers_from_files(file_paths: List[Path]) -> List[str]:
    """
    Reads tickers from multiple CSV files, combines them, and returns a
    sorted, unique list.

    Args:
        file_paths (List[Path]): A list of Path objects pointing to the CSV files.

    Returns:
        List[str]: A sorted list of unique ticker symbols.
    """
    all_tickers: Set[str] = set()

    for file_path in file_paths:
        try:
            df = pd.read_csv(file_path, header=None, names=['ticker'], skip_blank_lines=True)
            if not df.empty:
                tickers_from_file = df['ticker'].dropna().astype(str).str.strip()
                valid_tickers = tickers_from_file[tickers_from_file != ''].tolist()
                all_tickers.update(valid_tickers)
        except pd.errors.EmptyDataError:
            print(f"Warning: File '{file_path.name}' is empty and will be skipped.")
        except Exception as e:
            print(f"An error occurred while processing file '{file_path.name}': {e}")

    return sorted(all_tickers)

# --- Execute Step 2 ---
print("\n--- Step 2: Combining tickers into a unique list ---")
ticker_list = combine_tickers_from_files(file_paths=recent_ticker_files)

if ticker_list:
    print(f"Successfully combined tickers from {len(recent_ticker_files)} file(s).")
    print(f"Total unique tickers found: {len(ticker_list)}")
    # Display a sample of the tickers to avoid flooding the output
    print(f"Sample tickers: {ticker_list[:10]}...")
else:
    print("No tickers were extracted. The final list is empty.")
    # combine_tickers_from_files() already returned an empty list, so
    # ticker_list is defined and the next cell can still run.


--- Step 2: Combining tickers into a unique list ---
Successfully combined tickers from 10 file(s).
Total unique tickers found: 1587
Sample tickers: ['A', 'AA', 'AAL', 'AAON', 'AAPL', 'ABBV', 'ABEV', 'ABNB', 'ABT', 'ACGL']...
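
One detail of `pd.read_csv` is worth knowing when reading ticker symbols: by default, strings such as `NA` or `NULL` are parsed as missing values, so a symbol spelled that way would be removed by `dropna()`. The sketch below uses a hypothetical in-memory file to illustrate the difference. `dtype=str` and `keep_default_na=False` are existing `read_csv` parameters, but whether they are needed here depends on the actual contents of the ticker files, which are not shown in this notebook.

```python
import io

import pandas as pd

# Hypothetical three-line ticker file; 'NA' stands in for any symbol that
# collides with pandas' default missing-value markers.
sample_csv = "AAPL\nNA\nMSFT\n"

# Default parsing: 'NA' is converted to NaN and then removed by dropna().
default_df = pd.read_csv(io.StringIO(sample_csv), header=None, names=["ticker"])
default_tickers = default_df["ticker"].dropna().astype(str).str.strip().tolist()

# Literal parsing: every cell is kept as the string that appears in the file.
literal_df = pd.read_csv(
    io.StringIO(sample_csv),
    header=None,
    names=["ticker"],
    dtype=str,
    keep_default_na=False,
)
literal_tickers = literal_df["ticker"].str.strip().tolist()

print(default_tickers)  # ['AAPL', 'MSFT']
print(literal_tickers)  # ['AAPL', 'NA', 'MSFT']
```

With literal parsing, empty cells arrive as empty strings rather than NaN, so a filter on `!= ''` like the one in `combine_tickers_from_files` would still be needed to discard them.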


### Step 3: Save the Combined Ticker List

Finally, this cell defines the save function and executes it to write the master `tickers.csv` file.

In [4]:
def save_tickers_to_csv(ticker_list: List[str], output_path: Path) -> None:
    """
    Saves a list of tickers to a single-column CSV file.

    Args:
        ticker_list (List[str]): The list of ticker symbols to save.
        output_path (Path): The full Path object for the output CSV file.
    """
    if not ticker_list:
        print("Warning: Ticker list is empty. Nothing to save.")
        return

    try:
        # Create the parent directory if it doesn't exist
        output_path.parent.mkdir(parents=True, exist_ok=True)

        with open(output_path, 'w', newline='', encoding='utf-8') as csvfile:
            writer = csv.writer(csvfile)
            for ticker in ticker_list:
                writer.writerow([ticker])

        print(f"Successfully saved {len(ticker_list)} tickers to: {output_path}")

    except OSError as e:  # IOError is an alias of OSError in Python 3
        print(f"Error: Could not write to file at {output_path}. Details: {e}")
    except Exception as e:
        print(f"An unexpected error occurred during file save: {e}")

# --- Execute Step 3 ---
print("\n--- Step 3: Saving the final ticker list ---")
output_file_path = YLOADER_TICKERS_DIR / OUTPUT_TICKER_FILENAME
save_tickers_to_csv(ticker_list=ticker_list, output_path=output_file_path)


--- Step 3: Saving the final ticker list ---
Successfully saved 1587 tickers to: C:\Users\ping\Desktop\yloader\tickers\tickers.csv
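
After saving, a quick round-trip check can confirm that the file on disk matches the list in memory. This is a minimal sketch under the assumption that the output is the single-column CSV written above; `read_tickers_from_csv` and `verify_saved_tickers` are hypothetical helpers, not part of the original notebook.

```python
import csv
from pathlib import Path
from typing import List


def read_tickers_from_csv(csv_path: Path) -> List[str]:
    """Read a single-column CSV back into a list, skipping empty rows."""
    with open(csv_path, newline="", encoding="utf-8") as csvfile:
        return [row[0] for row in csv.reader(csvfile) if row]


def verify_saved_tickers(expected: List[str], csv_path: Path) -> bool:
    """Return True if the file contains exactly the expected tickers, in order."""
    return read_tickers_from_csv(csv_path) == expected
```

In this notebook the call would be `verify_saved_tickers(ticker_list, output_file_path)`, which should return `True` immediately after a successful save.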
