# ln Module - lionherd Utility Toolkit

The `ln` module collects general-purpose utilities used across lionherd, grouped into the following categories:

**Async Operations:**
- **alcall / bcall**: Concurrent function mapping with retry, throttling, and error handling
- **lcall**: List comprehension-style function application

**Fuzzy Matching & Validation:**
- **fuzzy_match_keys**: Dictionary key validation with typo correction
- **fuzzy_validate_mapping / fuzzy_validate_pydantic**: Schema validation with fuzzy matching

**JSON Utilities (orjson-based):**
- **json_dumps / json_dumpb**: Fast JSON serialization with custom handlers
- **json_dict / json_lines_iter**: Object → dict conversion via a JSON round trip, and NDJSON generation
- **get_orjson_default / make_options**: Serialization configuration

**Data Conversion:**
- **to_dict**: Flexible object → dict conversion
- **to_list**: Normalize iterables to lists

**Hashing:**
- **hash_dict**: Stable hashing for dicts, lists, Pydantic models

**General Utilities:**
- **now_utc**: UTC timestamps
- **acreate_path**: Async path creation with timestamps/hashes
- **get_bins**: Bin packing for strings
- **import_module / is_import_installed**: Dynamic imports and checks

In [1]:
import asyncio

from lionherd_core import ln

# Show all exports
print("ln module exports:")
print(sorted([name for name in dir(ln) if not name.startswith("_")]))

ln module exports:
['AlcallParams', 'BcallParams', 'FuzzyMatchKeysParams', 'acreate_path', 'alcall', 'bcall', 'fuzzy_match_keys', 'fuzzy_validate_mapping', 'fuzzy_validate_pydantic', 'get_bins', 'get_orjson_default', 'hash_dict', 'import_module', 'is_import_installed', 'json_dict', 'json_dumpb', 'json_dumps', 'json_lines_iter', 'lcall', 'make_options', 'now_utc', 'to_dict', 'to_list']


## 1. Async Operations - Concurrent Processing

Apply functions to lists with comprehensive control over concurrency, retries, and error handling.

In [2]:
# alcall - async list map with full control
async def api_fetch(user_id):
    await asyncio.sleep(0.01)  # Simulate API call
    return {"id": user_id, "name": f"User{user_id}"}


# Concurrent execution with limits
results = await ln.alcall(
    [1, 2, 3, 4, 5],
    api_fetch,
    max_concurrent=2,  # Limit concurrent requests
    retry_attempts=2,  # Retry failures
    retry_timeout=1.0,  # Per-call timeout
)

print(f"Fetched {len(results)} users:")
for user in results[:2]:
    print(f"  {user}")

Fetched 5 users:
  {'id': 1, 'name': 'User1'}
  {'id': 2, 'name': 'User2'}
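
To make the effect of `max_concurrent` concrete, here is a minimal standard-library sketch of a concurrency-limited async map. It is not `ln.alcall`'s implementation. The helper `bounded_map` and the `peak` counter are hypothetical names introduced for illustration, and retries and timeouts are left out.

```python
import asyncio


async def bounded_map(items, func, max_concurrent):
    """Apply an async func to every item, with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(item):
        async with semaphore:  # wait here while max_concurrent calls are already running
            return await func(item)

    # gather returns results in input order, regardless of completion order
    return await asyncio.gather(*(run_one(item) for item in items))


async def demo():
    active = 0
    peak = 0  # highest number of simultaneous calls observed

    async def api_fetch(user_id):
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # simulate API latency
        active -= 1
        return {"id": user_id, "name": f"User{user_id}"}

    results = await bounded_map([1, 2, 3, 4, 5], api_fetch, max_concurrent=2)
    return results, peak


results, peak = asyncio.run(demo())
print(f"Fetched {len(results)} users, peak concurrency: {peak}")
```

In a notebook cell, where an event loop is already running, use `results, peak = await demo()` instead of `asyncio.run(demo())`.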


In [3]:
# bcall - batch processing with incremental results
async def process_item(x):
    await asyncio.sleep(0.01)
    return x * 2


batch_num = 0
async for batch in ln.bcall(list(range(1, 11)), process_item, batch_size=3):
    batch_num += 1
    print(f"Batch {batch_num}: {batch}")
    if batch_num >= 2:  # Show first 2 batches
        break

Batch 1: [2, 4, 6]
Batch 2: [8, 10, 12]
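
The batch-by-batch behaviour can be approximated with a plain async generator. This is a sketch under the assumption that each batch is a fixed-size slice of the input processed concurrently. `batched_map` is a hypothetical helper, not part of `ln`.

```python
import asyncio


async def batched_map(items, func, batch_size):
    """Yield one list of results per batch_size-sized slice of items."""
    for start in range(0, len(items), batch_size):
        chunk = items[start : start + batch_size]
        # Run the calls of one batch concurrently, then hand the batch to the caller
        yield await asyncio.gather(*(func(x) for x in chunk))


async def demo():
    async def process_item(x):
        await asyncio.sleep(0)  # yield control, as a real async call would
        return x * 2

    return [batch async for batch in batched_map(list(range(1, 11)), process_item, 3)]


batches = asyncio.run(demo())
for number, batch in enumerate(batches, start=1):
    print(f"Batch {number}: {batch}")
```

Because results are yielded per batch, a consumer can stop early (as the cell above does with `break`) without the remaining batches being processed.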


In [4]:
# lcall - simple synchronous list mapping
from lionherd_core.ln import lcall

# Apply function to each element
result = lcall([1, 2, 3, 4, 5], lambda x: x**2)
print(f"Squares: {result}")

# With input/output processing
nested = [[1, 2], [3, 4], [5]]
flat_doubled = lcall(nested, lambda x: x * 2, input_flatten=True)
print(f"Flattened and doubled: {flat_doubled}")

Squares: [1, 4, 9, 16, 25]
Flattened and doubled: [2, 4, 6, 8, 10]


## 2. Fuzzy Matching & Validation - Robust Schema Handling

Handle typos and variations in data with intelligent string similarity matching.

In [5]:
# fuzzy_match_keys - correct typos in dictionary keys
user_input = {
    "usrname": "Alice",  # typo: username
    "emal": "alice@example.com",  # typo: email
    "age": 30,
}

expected_schema = ["username", "email", "age", "created_at"]

# Correct keys and fill missing fields
corrected = ln.fuzzy_match_keys(
    user_input,
    expected_schema,
    similarity_threshold=0.85,
    handle_unmatched="force",  # Drop unmatched, fill missing
    fill_value=None,
)

print("Original keys:", list(user_input.keys()))
print("Corrected:", corrected)

Original keys: ['usrname', 'emal', 'age']
Corrected: {'age': 30, 'email': 'alice@example.com', 'username': 'Alice', 'created_at': None}
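
For intuition about threshold-based key correction, the sketch below reproduces the same result with the standard library's `difflib`. This is a swapped-in similarity measure: the notebook does not say which algorithm `fuzzy_match_keys` uses, so scores and edge cases may differ. `correct_keys` is a hypothetical helper.

```python
from difflib import get_close_matches


def correct_keys(data, expected, threshold=0.85, fill_value=None):
    """Map keys of data onto expected keys: exact matches first, then fuzzy ones."""
    remaining = list(expected)
    out = {}

    # Pass 1: exact matches claim their expected key
    for key, value in data.items():
        if key in remaining:
            out[key] = value
            remaining.remove(key)

    # Pass 2: fuzzy-match leftover keys against still-unclaimed expected keys
    for key, value in data.items():
        if key in out:
            continue
        match = get_close_matches(key, remaining, n=1, cutoff=threshold)
        if match:
            out[match[0]] = value
            remaining.remove(match[0])
        # keys with no match are dropped, mirroring the "force" behaviour shown above

    # Pass 3: fill expected keys that nothing mapped to
    for key in remaining:
        out[key] = fill_value
    return out


user_input = {"usrname": "Alice", "emal": "alice@example.com", "age": 30}
expected_schema = ["username", "email", "age", "created_at"]
corrected_sketch = correct_keys(user_input, expected_schema)
print(corrected_sketch)
```

Matching exact keys before fuzzy ones prevents a typo from claiming an expected key that a correctly spelled key should get.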


In [6]:
# fuzzy_validate_mapping - validate dict against schema
from lionherd_core.ln import fuzzy_validate_mapping

schema = {"username": str, "age": int, "email": str}
data = {"usrname": "Bob", "age": "25", "emal": "bob@example.com"}  # typos; age is a str, not an int

# Fuzzy match keys against the schema (values pass through unchanged; see the types printed below)
validated = fuzzy_validate_mapping(
    data,
    schema,
    fuzzy_match=True,
    similarity_threshold=0.8,
    handle_unmatched="force",
)

print("Validated:", validated)
print(
    f"Types: username={type(validated['username']).__name__}, age={type(validated['age']).__name__}"
)

Validated: {'age': '25', 'email': 'bob@example.com', 'username': 'Bob'}
Types: username=str, age=str


In [7]:
# fuzzy_validate_pydantic - validate against Pydantic model
from pydantic import BaseModel

from lionherd_core.ln import FuzzyMatchKeysParams, fuzzy_validate_pydantic


class User(BaseModel):
    username: str
    age: int
    email: str


# Data with typos
messy_data = {"usrname": "Charlie", "age": "30", "emal": "charlie@example.com"}

# Fuzzy validate and create model instance
user = fuzzy_validate_pydantic(
    messy_data,
    User,
    fuzzy_match=True,
    fuzzy_match_params=FuzzyMatchKeysParams(similarity_threshold=0.8),
)

print(f"User model: {user}")
print(f"Type: {type(user).__name__}")

User model: username='Charlie' age=30 email='charlie@example.com'
Type: User


## 3. JSON Utilities - Fast Serialization with orjson

High-performance JSON operations with support for custom types (UUID, datetime, Pydantic models, etc.).

In [8]:
import datetime as dt
from uuid import uuid4

from pydantic import BaseModel


# Custom types that need special handling
class Event(BaseModel):
    id: str
    timestamp: dt.datetime
    data: dict


event = Event(id=str(uuid4()), timestamp=ln.now_utc(), data={"status": "active", "count": 42})

# json_dumps - serialize to string
json_str = ln.json_dumps(event)
print(f"JSON string: {json_str}")

# json_dumpb - serialize to bytes (faster)
json_bytes = ln.json_dumpb(event)
print(f"JSON bytes: {json_bytes[:80]}...")

JSON string: {"id":"b39e4fbf-8b8f-4e21-9d7a-cd8b9f6a82ba","timestamp":"2025-11-09T15:36:35.366594+00:00","data":{"status":"active","count":42}}
JSON bytes: b'{"id":"b39e4fbf-8b8f-4e21-9d7a-cd8b9f6a82ba","timestamp":"2025-11-09T15:36:35.36'...


In [9]:
# json_dict - convert object to dict via JSON roundtrip
from pydantic import BaseModel


class Config(BaseModel):
    host: str
    port: int


config = Config(host="localhost", port=8000)

# Convert Pydantic model to plain dict via JSON
plain_dict = ln.json_dict(config)
print(f"Pydantic → dict: {plain_dict}")
print(f"Type: {type(plain_dict).__name__}")

Pydantic → dict: {'host': 'localhost', 'port': 8000}
Type: dict


In [None]:
# json_lines_iter - generate JSON Lines format
records = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}, {"id": 3, "name": "Charlie"}]

# Iterate to generate NDJSON lines (bytes)
for line in ln.json_lines_iter(records):
    print(f"  {line.decode('utf-8')}", end="")
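
The cell above has no recorded output, so as a reference for the NDJSON format itself, here is a standard-library sketch: one compact JSON document per line, each terminated by a newline. It uses `json` rather than orjson, and `ndjson_lines` is a hypothetical helper. Whether `ln.json_lines_iter` emits byte-identical lines is an assumption.

```python
import json


def ndjson_lines(records):
    """Yield each record as one newline-terminated line of UTF-8 encoded JSON."""
    for record in records:
        # Compact separators keep every record on a single line without extra spaces
        yield json.dumps(record, separators=(",", ":")).encode("utf-8") + b"\n"


records = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}, {"id": 3, "name": "Charlie"}]
lines = list(ndjson_lines(records))

for line in lines:
    print(line.decode("utf-8"), end="")  # each line already ends with "\n"
```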

## 4. Data Conversion - Flexible Type Handling

Convert objects to dictionaries and normalize iterables to lists with extensive options.

In [11]:
# to_dict - convert various types to dict
from pydantic import BaseModel


class Person(BaseModel):
    name: str
    age: int


# From Pydantic model
person = Person(name="Alice", age=30)
dict1 = ln.to_dict(person)
print(f"From Pydantic: {dict1}")

# From dict-like object
dict2 = ln.to_dict({"a": 1, "b": 2})
print(f"From dict: {dict2}")

# From JSON string
dict3 = ln.to_dict('{"username": "Bob", "age": 25}')
print(f"From JSON string: {dict3}")

From Pydantic: {'name': 'Alice', 'age': 30}
From dict: {'a': 1, 'b': 2}
From JSON string: {'username': 'Bob', 'age': 25}


In [12]:
# to_list - normalize to list with filtering
from lionherd_core.ln import to_list

# Flatten nested structures
nested = [[1, 2], [3, None], 4, None, [5, 6]]
flat = to_list(nested, flatten=True, dropna=True)
print(f"Flattened: {flat}")

# Remove duplicates (requires flatten=True)
with_dupes = [1, 2, 2, 3, 1, 4, 3]
unique = to_list(with_dupes, flatten=True, unique=True)
print(f"Unique: {unique}")

# Combine all options
messy = [[1, None, 2], None, [2, 3], 3, [4, None]]
clean = to_list(messy, flatten=True, dropna=True, unique=True)
print(f"Clean: {clean}")

Flattened: [1, 2, 3, 4, 5, 6]
Unique: [1, 2, 3, 4]
Clean: [1, 2, 3, 4]
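
The three options compose in a fixed order: flatten first, then drop `None`, then deduplicate while keeping first-seen order. A pure-Python sketch of that pipeline follows. `flatten_clean` is a hypothetical helper, and it only descends into lists and tuples, which may be narrower than what `to_list` handles.

```python
def flatten_clean(value, dropna=True, unique=True):
    """Flatten nested lists/tuples, optionally dropping None and duplicates."""

    def walk(item):
        if isinstance(item, (list, tuple)):
            for child in item:
                yield from walk(child)
        else:
            yield item  # scalars (including strings) are kept whole

    out = []
    for item in walk(value):
        if dropna and item is None:
            continue
        if unique and item in out:  # linear scan; fine for short lists
            continue
        out.append(item)
    return out


messy = [[1, None, 2], None, [2, 3], 3, [4, None]]
clean_sketch = flatten_clean(messy)
print(f"Clean: {clean_sketch}")
```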


## 5. Hashing - Stable Content-Based Hashing

Generate consistent hashes for dicts, lists, sets, and Pydantic models (order-independent for dicts/sets).

In [13]:
# hash_dict - stable hashing for data structures

# Order-independent for dicts
dict1 = {"a": 1, "b": 2, "c": 3}
dict2 = {"c": 3, "a": 1, "b": 2}  # Different order

hash1 = ln.hash_dict(dict1)
hash2 = ln.hash_dict(dict2)

print(f"Dict 1 hash: {hash1}")
print(f"Dict 2 hash: {hash2}")
print(f"Hashes equal (order-independent): {hash1 == hash2}")

Dict 1 hash: 8599943207739827362
Dict 2 hash: 8599943207739827362
Hashes equal (order-independent): True


In [14]:
# Works with nested structures
complex_data = {
    "users": [{"name": "Alice", "id": 1}, {"name": "Bob", "id": 2}],
    "metadata": {"version": "1.0", "created": "2025-01-01"},
    "tags": {"important", "reviewed"},  # Set - order independent
}

hash_complex = ln.hash_dict(complex_data)
print(f"Complex structure hash: {hash_complex}")

# Same structure, different tag order → same hash
complex_data2 = {
    "users": [{"name": "Alice", "id": 1}, {"name": "Bob", "id": 2}],
    "metadata": {"created": "2025-01-01", "version": "1.0"},  # Different order
    "tags": {"reviewed", "important"},  # Different order
}

hash_complex2 = ln.hash_dict(complex_data2)
print(f"Same structure hash: {hash_complex2}")
print(f"Hashes equal: {hash_complex == hash_complex2}")

Complex structure hash: -7655103619936329721
Same structure hash: -7655103619936329721
Hashes equal: True
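
One common way to get order-independent hashes is to canonicalize the structure before hashing: sort dict items and set members, but keep list order. The sketch below illustrates that idea with `hashlib`. It is not `hash_dict`'s algorithm (which returns integers, as shown above), and `canonical` and `stable_digest` are hypothetical helpers.

```python
import hashlib
import json


def canonical(obj):
    """Rewrite obj so that equal content always serializes to the same JSON."""
    if isinstance(obj, dict):
        items = [(repr(key), canonical(value)) for key, value in obj.items()]
        return ["dict", sorted(items, key=lambda pair: pair[0])]  # key order no longer matters
    if isinstance(obj, (set, frozenset)):
        return ["set", sorted((canonical(x) for x in obj), key=repr)]  # member order no longer matters
    if isinstance(obj, (list, tuple)):
        return ["list", [canonical(x) for x in obj]]  # sequence order is significant
    return obj


def stable_digest(obj):
    payload = json.dumps(canonical(obj), default=repr)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


a = {"meta": {"version": "1.0", "created": "2025-01-01"}, "tags": {"important", "reviewed"}}
b = {"tags": {"reviewed", "important"}, "meta": {"created": "2025-01-01", "version": "1.0"}}
c = {"meta": {"version": "2.0", "created": "2025-01-01"}, "tags": {"important", "reviewed"}}

print(stable_digest(a) == stable_digest(b))
print(stable_digest(a) == stable_digest(c))
```

Tagging each container with its kind (`"dict"`, `"set"`, `"list"`) keeps a set and a list with the same members from colliding.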


In [15]:
# Works with Pydantic models
from pydantic import BaseModel


class Config(BaseModel):
    host: str
    port: int
    tags: set[str]


config1 = Config(host="localhost", port=8000, tags={"dev", "test"})
config2 = Config(host="localhost", port=8000, tags={"test", "dev"})  # Different tag order

print(f"Config 1 hash: {ln.hash_dict(config1)}")
print(f"Config 2 hash: {ln.hash_dict(config2)}")
print(f"Pydantic models match: {ln.hash_dict(config1) == ln.hash_dict(config2)}")

Config 1 hash: -3676016240202627472
Config 2 hash: -3676016240202627472
Pydantic models match: True


## 6. General Utilities - Time, Paths, Imports

Essential utilities for timestamps, async path operations, data organization, and dynamic imports.

In [16]:
# now_utc - UTC timestamps
timestamp = ln.now_utc()
print(f"Current UTC: {timestamp}")
print(f"Timezone: {timestamp.tzinfo}")
print(f"ISO format: {timestamp.isoformat()}")

Current UTC: 2025-11-09 15:36:35.400821+00:00
Timezone: UTC
ISO format: 2025-11-09T15:36:35.400821+00:00


In [17]:
# acreate_path - async path creation with timestamps/hashes
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Create path with timestamp and random hash
    path = await ln.acreate_path(
        tmpdir,
        "logs/session",
        "log",
        timestamp=True,
        random_hash_digits=6,
        timestamp_format="%Y%m%d_%H%M%S",
    )

    print(f"Created path: {path}")
    print(f"Filename: {path.name}")
    print(f"Parent exists: {await path.parent.exists()}")

Created path: /var/folders/5p/rcbw097d29j3s2qt861tsjfh0000gn/T/tmpdqxz8hip/logs/session_20251109_103635-b6ad14.log
Filename: session_20251109_103635-b6ad14.log
Parent exists: True


In [18]:
# get_bins - bin packing by cumulative string length
messages = [
    "Hello world",
    "How are you?",
    "I'm great!",
    "What's the weather?",
    "Sunny",
    "Perfect!",
]

# Organize into 30-character bins
bins = ln.get_bins(messages, upper=30)

print("Message batches (max 30 chars):")
for i, bin_indices in enumerate(bins):
    batch = [messages[idx] for idx in bin_indices]
    print(f"  Batch {i + 1}: {batch}")

Message batches (max 30 chars):
  Batch 1: ['Hello world', 'How are you?']
  Batch 2: ["I'm great!", "What's the weather?"]
  Batch 3: ['Sunny', 'Perfect!']
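
The grouping above is consistent with a greedy pass that closes the current bin once adding the next string would push the cumulative length past the limit. A sketch of that rule follows. `greedy_bins` is a hypothetical helper, and the exact boundary behaviour of `get_bins` (for example, when the total equals `upper`) is an assumption.

```python
def greedy_bins(strings, upper):
    """Group string indices so each bin's total character count stays within upper."""
    bins, current, length = [], [], 0
    for idx, text in enumerate(strings):
        # Close the current bin if this string would overflow it
        if current and length + len(text) > upper:
            bins.append(current)
            current, length = [], 0
        current.append(idx)
        length += len(text)
    if current:
        bins.append(current)
    return bins


messages = [
    "Hello world",
    "How are you?",
    "I'm great!",
    "What's the weather?",
    "Sunny",
    "Perfect!",
]
bins_sketch = greedy_bins(messages, upper=30)
print(bins_sketch)
```

Returning indices rather than the strings themselves lets the caller map bins back onto any parallel data, which is how the cell above rebuilds each batch.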


In [19]:
# import_module - dynamic imports
from lionherd_core.ln import import_module, is_import_installed

# Check if package is available
has_pydantic = is_import_installed("pydantic")
print(f"Pydantic installed: {has_pydantic}")

# Import module
json_mod = import_module("json")
print(f"Imported: {json_mod.__name__}")

# Import specific object
Path = import_module("pathlib", import_name="Path")
print(f"Imported Path: {Path.__name__}")

# Import multiple objects
dumps, loads = import_module("json", import_name=["dumps", "loads"])
print(f"Imported: {dumps.__name__}, {loads.__name__}")

Pydantic installed: True
Imported: json
Imported Path: Path
Imported: dumps, loads


## 7. Integration Patterns - Common Workflows

Real-world combinations of ln utilities.

In [20]:
# Pattern 1: API response normalization with fuzzy matching
from lionherd_core.ln import FuzzyMatchKeysParams

# Define reusable normalizer
api_normalizer = FuzzyMatchKeysParams(
    similarity_threshold=0.8,
    handle_unmatched="force",
    fill_value=None,
)

# Simulate messy API responses
api_responses = [
    {"usr_id": 1, "usrname": "Alice", "emal": "alice@example.com"},
    {"user_id": 2, "name": "Bob"},  # Missing email; "name" does not match "username"
    {"id": 3, "username": "Charlie", "email": "charlie@example.com"},
]

expected = ["user_id", "username", "email"]

print("Normalized API responses:")
for resp in api_responses:
    normalized = api_normalizer(resp, expected)
    print(f"  {normalized}")

Normalized API responses:
  {'email': 'alice@example.com', 'user_id': 1, 'username': 'Alice'}
  {'user_id': 2, 'username': None, 'email': None}
  {'username': 'Charlie', 'email': 'charlie@example.com', 'user_id': None}


In [21]:
# Pattern 2: Concurrent API fetching with retry and batching
async def fetch_user(user_id):
    """Simulate API fetch."""
    await asyncio.sleep(0.01)
    return {"id": user_id, "name": f"User{user_id}", "status": "active"}


# Fetch users in controlled batches
user_ids = list(range(1, 11))

all_users = []
async for batch in ln.bcall(
    user_ids,
    fetch_user,
    batch_size=3,
    max_concurrent=2,  # Respect rate limits
    retry_attempts=2,
):
    all_users.extend(batch)
    print(f"Processed batch of {len(batch)} users")

print(f"\nTotal fetched: {len(all_users)} users")
print(f"Sample: {all_users[:2]}")

Processed batch of 3 users
Processed batch of 3 users
Processed batch of 3 users
Processed batch of 1 users

Total fetched: 10 users
Sample: [{'id': 1, 'name': 'User1', 'status': 'active'}, {'id': 2, 'name': 'User2', 'status': 'active'}]


In [22]:
# Pattern 3: Data deduplication with stable hashing
records = [
    {"name": "Alice", "age": 30, "tags": {"dev", "senior"}},
    {"age": 30, "name": "Alice", "tags": {"senior", "dev"}},  # Duplicate (different order)
    {"name": "Bob", "age": 25, "tags": {"junior"}},
    {"name": "Alice", "age": 30, "tags": {"dev", "senior"}},  # Another duplicate
]

# Deduplicate using stable hashing
seen_hashes = set()
unique_records = []

for record in records:
    record_hash = ln.hash_dict(record)
    if record_hash not in seen_hashes:
        seen_hashes.add(record_hash)
        unique_records.append(record)

print(f"Original: {len(records)} records")
print(f"Unique: {len(unique_records)} records")
print(f"Deduplicated: {unique_records}")

Original: 4 records
Unique: 2 records
Deduplicated: [{'name': 'Alice', 'age': 30, 'tags': {'senior', 'dev'}}, {'name': 'Bob', 'age': 25, 'tags': {'junior'}}]


In [23]:
# Pattern 4: Message batching and timestamping
# Organize messages into batches for processing
messages = [
    "User logged in",
    "Fetched profile data",
    "Updated preferences",
    "Saved changes to database",
    "Logged out",
]

# Batch by length (max 50 chars)
bins = ln.get_bins(messages, upper=50)

print("Processing message batches:")
for i, bin_indices in enumerate(bins):
    batch = [messages[idx] for idx in bin_indices]
    timestamp = ln.now_utc()
    print(f"[{timestamp.strftime('%H:%M:%S')}] Batch {i + 1}:")
    for msg in batch:
        print(f"  - {msg}")

Processing message batches:
[15:36:35] Batch 1:
  - User logged in
  - Fetched profile data
[15:36:35] Batch 2:
  - Updated preferences
  - Saved changes to database
[15:36:35] Batch 3:
  - Logged out


## 8. Parameter Objects - Reusable Configurations

Use parameter dataclasses for consistent behavior across calls.

In [24]:
# AlcallParams - reusable async call configuration
from lionherd_core.ln import AlcallParams

# Define standard API client config
api_client_params = AlcallParams(
    max_concurrent=5,
    retry_attempts=3,
    retry_initial_delay=0.1,
    retry_backoff=2.0,
    retry_timeout=5.0,
)


# Use with different functions/inputs
async def api_call_1(x):
    await asyncio.sleep(0.01)
    return x * 2


async def api_call_2(x):
    await asyncio.sleep(0.01)
    return x**2


result1 = await api_client_params([1, 2, 3], api_call_1)
result2 = await api_client_params([1, 2, 3], api_call_2)

print(f"Result 1 (double): {result1}")
print(f"Result 2 (square): {result2}")

Result 1 (double): [2, 4, 6]
Result 2 (square): [1, 4, 9]
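
The retry parameters suggest an exponential backoff schedule. The sketch below assumes the delay starts at `retry_initial_delay` and is multiplied by `retry_backoff` after each failed attempt, and that `retry_attempts` counts retries after the first call. Both are assumptions about `AlcallParams`, and `backoff_delays` and `call_with_retry` are hypothetical helpers.

```python
import asyncio


def backoff_delays(retry_attempts, retry_initial_delay, retry_backoff):
    """Delay before each retry: initial, initial * backoff, initial * backoff**2, ..."""
    return [retry_initial_delay * retry_backoff**i for i in range(retry_attempts)]


async def call_with_retry(func, arg, retry_attempts, retry_initial_delay, retry_backoff):
    for delay in backoff_delays(retry_attempts, retry_initial_delay, retry_backoff):
        try:
            return await func(arg)
        except Exception:
            await asyncio.sleep(delay)  # wait longer after every failure
    return await func(arg)  # final attempt; an error here propagates to the caller


state = {"calls": 0}


async def flaky_double(x):
    state["calls"] += 1
    if state["calls"] < 3:  # fail the first two calls
        raise ConnectionError("transient failure")
    return x * 2


print(backoff_delays(3, 0.1, 2.0))
result = asyncio.run(
    call_with_retry(flaky_double, 3, retry_attempts=3, retry_initial_delay=0.001, retry_backoff=2.0)
)
print(f"Result: {result} after {state['calls']} calls")
```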


In [25]:
# FuzzyMatchKeysParams - reusable fuzzy matching config
from lionherd_core.ln import FuzzyMatchKeysParams

# Strict validator for API inputs
strict_validator = FuzzyMatchKeysParams(
    similarity_threshold=0.9,
    handle_unmatched="raise",
    strict=True,
)

# Lenient normalizer for user inputs
lenient_normalizer = FuzzyMatchKeysParams(
    similarity_threshold=0.7,
    handle_unmatched="force",
    fill_value=None,
)

# Use them
user_data = {"usrname": "Alice", "age": 30}
schema = ["username", "age", "email"]

normalized = lenient_normalizer(user_data, schema)
print(f"Lenient normalization: {normalized}")

# Strict would raise (uncomment to test)
# strict_validator(user_data, schema)

Lenient normalization: {'age': 30, 'username': 'Alice', 'email': None}


## Summary Checklist

**ln Module Essentials:**

**Async Operations:**
- ✅ `alcall` for concurrent function mapping with retry/throttle/error handling
- ✅ `bcall` for batch processing with incremental results
- ✅ `lcall` for simple list comprehension-style operations

**Fuzzy Matching:**
- ✅ `fuzzy_match_keys` for typo-tolerant dictionary validation
- ✅ `fuzzy_validate_mapping` for schema validation with fuzzy key matching
- ✅ `fuzzy_validate_pydantic` for Pydantic model validation

**JSON Utilities:**
- ✅ `json_dumps/json_dumpb` for fast serialization with custom handlers
- ✅ `json_dict` for converting objects to plain dicts via a JSON round trip
- ✅ `json_lines_iter` for generating NDJSON format

**Data Conversion:**
- ✅ `to_dict` for flexible object → dict conversion
- ✅ `to_list` for normalizing iterables with flatten/dropna/unique

**Hashing:**
- ✅ `hash_dict` for stable, order-independent hashing

**General Utilities:**
- ✅ `now_utc` for UTC timestamps
- ✅ `acreate_path` for async path creation with timestamps/hashes
- ✅ `get_bins` for bin packing by string length
- ✅ `import_module/is_import_installed` for dynamic imports

**Next Steps:**
- See individual notebooks for deep dives: `ln_async_call`, `ln_fuzzy_match`, `ln_utils`, etc.
- See `base.Element` for integration with lionherd's serialization
- See `types.Spec` for structured data with fuzzy validation
- Use in production for robust data processing, API integration, and workflow orchestration