# filoma — Quick interactive examples

This notebook demonstrates key filoma capabilities and includes lightweight checks that confirm it works in your environment.

It covers: imports and version checks, probing a file and a directory, working with the `filoma.DataFrame` wrapper, using `probe_to_df`, a small image probe example, and saving a CSV export.

Note: cells wrap operations in `try/except` so the notebook still runs if optional dependencies (e.g. `polars`, `numpy`, or image backends) are missing.

In [None]:
# Basic environment and import checks
from pathlib import Path

import filoma
from filoma import DataFrame


def check_imports():
    results = {}
    try:
        import filoma

        results["filoma"] = getattr(filoma, "__version__", "unknown")
    except Exception as e:
        results["filoma"] = f"IMPORT ERROR: {e}"

    for pkg in ("polars", "numpy", "PIL"):
        try:
            __import__(pkg if pkg != "PIL" else "PIL.Image")
            results[pkg] = "available"
        except Exception as e:
            results[pkg] = f"missing ({e})"

    # show where we are running the notebook from
    results["cwd"] = str(Path(".").resolve())
    return results


check_imports()
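
The check above imports each package to see whether it is present. As a complement, availability can also be tested without running a package's import-time code. The following is a minimal sketch using only the standard library (`importlib.util.find_spec` and `importlib.metadata.version`); the helper name `optional_status` is hypothetical and not part of filoma.

```python
from importlib import metadata, util


def optional_status(module_name, dist_name=None):
    """Return a version string, 'available', or 'missing' for a module.

    module_name is the import name (e.g. "PIL"); dist_name is the name of
    the installed distribution (e.g. "pillow") when the two differ.
    """
    # find_spec locates a top-level module without importing it.
    if util.find_spec(module_name) is None:
        return "missing"
    try:
        return metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata (e.g. stdlib modules).
        return "available"


for module_name, dist_name in (("polars", None), ("numpy", None), ("PIL", "pillow")):
    print(module_name, "->", optional_status(module_name, dist_name))
```

Reporting a version rather than just "available" makes it easier to compare environments when a later cell behaves differently on two machines.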

## 1) Quick probe: a single file and a directory

Try probing a README or small file, then probe a lightweight sample directory from the repo's `tests/` tree.

In [None]:
file_candidate = "../README.md"
dir_candidate = "../tests/"

print("probing file ->", file_candidate)
if Path(file_candidate).exists():
    try:
        file_report = filoma.probe(file_candidate)
        print("file probe result type:", type(file_report))
        try:
            # many filoma dataclasses implement a nice repr or to-dict
            print(file_report)
        except Exception:
            pass
    except Exception as e:
        print("file probe failed:", e)
else:
    print("No small file found to probe in the repository root.")

print("probing directory ->", dir_candidate)
if Path(dir_candidate).is_dir():
    try:
        dir_report = filoma.probe(dir_candidate, max_depth=2, threads=2)
        print("directory probe returned an object of type:", type(dir_report))
        # If it exposes a to_df() method we can inspect a little
        if hasattr(dir_report, "to_df"):
            try:
                dfw = dir_report.to_df()
                print("to_df() -> wrapper type:", type(dfw))
            except Exception as e:
                print("to_df() raised:", e)
    except Exception as e:
        print("directory probe failed:", e)
else:
    print("No small directory found to probe in tests/; adjust the path and re-run.")
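
The fallback messages in the cell above assume that a candidate path may be absent. One way to make that explicit is to pick the first candidate that actually exists and treat `None` as "nothing found". This is a minimal sketch with a hypothetical helper, `first_existing`, built on `pathlib` only; it runs in a temporary directory so the result does not depend on the repository layout.

```python
import tempfile
from pathlib import Path


def first_existing(candidates, want_dir=False):
    """Return the first candidate that exists as a file (or directory), else None."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_dir() if want_dir else path.is_file():
            return path
    return None


# Demonstration in a throwaway directory with one file and one folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "README.md").write_text("demo\n")
    (root / "tests").mkdir()

    file_pick = first_existing([root / "missing.txt", root / "README.md"])
    dir_pick = first_existing([root / "nope", root / "tests"], want_dir=True)
    none_pick = first_existing([root / "missing.txt"])
    print(file_pick.name, dir_pick.name, none_pick)
```

In the notebook, the same helper could be fed a list such as `["../README.md", "../pyproject.toml"]` before calling `filoma.probe`.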

## 2) Working with `filoma.DataFrame` wrapper

Construct a `filoma.DataFrame` from a list of paths and run the convenience enrichers: `.add_path_components()`, `.add_file_stats_cols()`, and `.add_depth_col()`.

In [None]:
sample_paths = [p for p in (Path("../README.md"), Path("../pyproject.toml"), Path("../Cargo.toml")) if p.exists()]
if not sample_paths:
    # fallback to a couple of files from tests if present
    sample_paths = [p for p in (Path("../tests/test_basic_dataframe.py"), Path("../tests/test_rust_comprehensive.py")) if p.exists()]

print("sample paths used:", sample_paths)
dfw = DataFrame(sample_paths)
print("Initial wrapper and head:")
print(dfw.head(10))

print("With path components:")
try:
    df_components = dfw.add_path_components()
    print(df_components.head(10))
except Exception as e:
    print("add_path_components failed:", e)

print("With file stats:")
try:
    df_stats = dfw.add_file_stats_cols()
    print(df_stats.head(10))
except Exception as e:
    print("add_file_stats_cols failed:", e)

print("Add depth column relative to repo root:")
try:
    df_depth = dfw.add_depth_col(Path(".."))
    print(df_depth.head(10))
except Exception as e:
    print("add_depth_col failed:", e)
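
To make concrete what kind of information the enrichers add, the sketch below computes path components, basic stat fields, and a depth value for a single file using `pathlib` alone. It is an illustration of the idea, not filoma's implementation: the helper `describe_path`, the field names, and the depth convention (number of path segments below the root) are all assumptions.

```python
import tempfile
from pathlib import Path


def describe_path(path, root):
    """Collect path components, basic stat fields, and depth below root."""
    path, root = Path(path).resolve(), Path(root).resolve()
    stat = path.stat()
    return {
        "parent": str(path.parent),
        "name": path.name,
        "stem": path.stem,
        "suffix": path.suffix,
        "size_bytes": stat.st_size,
        "modified": stat.st_mtime,
        # Number of path segments between root and the file itself.
        "depth": len(path.relative_to(root).parts),
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    nested = root / "pkg" / "module.py"
    nested.parent.mkdir()
    nested.write_bytes(b"x = 1\n")  # 6 bytes on every platform
    row = describe_path(nested, root)
    print(row["name"], row["stem"], row["suffix"], row["size_bytes"], row["depth"])
```

Note that `relative_to` raises `ValueError` when the file does not live under the chosen root, which is why the root passed to a depth calculation should be an ancestor of every path.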

## 3) Build a DataFrame from a directory using `probe_to_df`

This uses filoma's convenience function `probe_to_df`, which builds a Polars DataFrame if `polars` is installed. We scan the `tests/` folder with a shallow `max_depth` to keep the runtime small.

In [None]:
from filoma import probe_to_df

dir_path = "../tests"
if not Path(dir_path).is_dir():
    print("No test directory available for probe_to_df; skip this cell.")
else:
    try:
        pl_df = probe_to_df(dir_path, to_pandas=False, enrich=True, max_depth=2, threads=2)
        print("probe_to_df returned a Polars DataFrame with shape:", pl_df.shape)
        # Show a small sample and a group_by_extension summary when available
        try:
            print("Sample rows:")
            print(pl_df.head(5))
        except Exception:
            pass
        try:
            print("Extension counts:")
            # wrap it in a DataFrame wrapper if needed
            from filoma import DataFrame as DFWrap

            wrapper = DFWrap(pl_df)
            print(wrapper.group_by_extension().head(10))
        except Exception as e:
            print("group_by_extension failed:", e)
    except Exception as e:
        print("probe_to_df failed:", e)
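
For readers without `polars`, the extension summary can be approximated with the standard library. The sketch below walks a directory with a depth limit and counts file suffixes. The helper `extension_counts` is hypothetical, and its depth convention (files directly inside the root are at depth 1) is an assumption that may differ from how filoma interprets `max_depth`.

```python
import os
import tempfile
from collections import Counter
from pathlib import Path


def extension_counts(root, max_depth=2):
    """Count file suffixes under root, descending at most max_depth levels (>= 1)."""
    root = Path(root)
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        dir_depth = len(Path(dirpath).relative_to(root).parts)  # root itself is 0
        if dir_depth + 1 >= max_depth:
            dirnames[:] = []  # prune in place: do not descend any further
        for name in filenames:
            counts[Path(name).suffix or "<none>"] += 1
    return counts


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub" / "deep").mkdir(parents=True)
    for rel in ("a.py", "b.py", "notes.md", "sub/c.py", "sub/deep/d.txt"):
        (root / rel).write_text("")
    shallow = extension_counts(root, max_depth=1)
    two_levels = extension_counts(root, max_depth=2)
    print(dict(shallow), dict(two_levels))
```

Pruning `dirnames` in place is what stops `os.walk` from descending, so the deep file is never visited rather than visited and discarded.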

## 4) Image probing (in-memory)

Create a small NumPy array and pass it to `filoma.probe_image` to exercise the code path that accepts in-memory arrays. This avoids the need for image files or heavy image-decoding dependencies.

In [None]:
try:
    import numpy as np

    arr = np.random.randn(16, 16)
    img_report = filoma.probe_image(arr)
    print("probe_image on numpy array returned type:", type(img_report))
    try:
        print(img_report)
    except Exception:
        pass
except Exception as e:
    print("Skipping image probe; numpy unavailable or probe failed:", e)
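
When sanity-checking an image report, it can help to have independent numbers to compare against. The sketch below computes a few basic statistics directly with NumPy. The helper `summarize_array` and its field names are assumptions chosen for illustration; they are not a description of what `filoma.probe_image` returns.

```python
import numpy as np


def summarize_array(arr):
    """Return basic statistics for an array, ignoring non-finite values."""
    arr = np.asarray(arr)
    finite = arr[np.isfinite(arr)]  # drop NaN and +/-inf before min/max/mean
    return {
        "shape": arr.shape,
        "dtype": str(arr.dtype),
        "min": float(finite.min()),
        "max": float(finite.max()),
        "mean": float(finite.mean()),
        "nan_count": int(np.isnan(arr).sum()),
        "inf_count": int(np.isinf(arr).sum()),
    }


rng = np.random.default_rng(0)  # seeded so reruns give the same array
arr = rng.standard_normal((16, 16))
arr[0, 0] = np.nan  # inject one invalid pixel to show it is counted
summary = summarize_array(arr)
print(summary)
```

The function assumes at least one finite value; an array consisting only of NaN or infinity would make `finite.min()` raise.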

## 5) Save a small CSV export (if `polars` is available)

This cell attempts to save the `probe_to_df` result or our small DataFrame example to `/tmp/filoma_example.csv`. It prints a short verification sample.

In [None]:
out_path = Path("/tmp/filoma_example.csv")
saved = False
try:
    if "pl_df" in globals():
        # Try write via polars if present
        try:
            pl_df.write_csv(str(out_path))
            saved = True
        except Exception:
            pass
    if not saved and "df_stats" in globals():
        try:
            df_stats.save_csv(out_path)
            saved = True
        except Exception:
            pass
    if saved:
        print("Saved CSV to", out_path)
        try:
            print("CSV sample:", out_path.read_text().splitlines()[:10])
        except Exception:
            pass
    else:
        print("Could not save CSV; polars or file-writer not available.")
except Exception as e:
    print("Saving CSV failed:", e)
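
The hard-coded `/tmp` path works on Linux and macOS but not on a default Windows setup. A portable variant can ask the standard library for the temporary directory. The sketch below also shows a write-then-read-back check using the `csv` module, so it runs without `polars`. The rows and the file name `filoma_example_sketch.csv` are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Made-up rows standing in for a small probe result.
rows = [
    {"path": "README.md", "suffix": ".md", "size_bytes": 1024},
    {"path": "pyproject.toml", "suffix": ".toml", "size_bytes": 2048},
]

# gettempdir() resolves to the platform's temp folder (e.g. /tmp on Linux).
out_path = Path(tempfile.gettempdir()) / "filoma_example_sketch.csv"
with out_path.open("w", newline="", encoding="utf-8") as fh:
    writer = csv.DictWriter(fh, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to verify that the export round-trips.
with out_path.open(newline="", encoding="utf-8") as fh:
    read_back = list(csv.DictReader(fh))
print(len(read_back), "rows; first:", read_back[0])
```

Values come back as strings after the round trip, so numeric columns need converting again if they are used for arithmetic.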

---

### Notes and next steps

- If a cell raised an exception because a dependency is missing, install `polars`, `numpy`, and optionally `pillow`.
- To run longer scans, increase `max_depth` and `threads` in the `probe()` calls.
- Use `probe_to_df(..., to_pandas=True)` to get a `pandas.DataFrame` if you prefer pandas.