# Quick Start Guide

This tutorial introduces the core concepts of `projio` and shows you how to get started with centralized path management.

## Installation

```bash
pip install projio
```

Or install from source:

```bash
pip install git+https://github.com/s01st/project-io.git
```

## Basic Usage

The main class is `ProjectIO`. It manages all your project paths from a single root directory.

In [1]:
import tempfile
from projio import ProjectIO

# Create a temporary directory for this tutorial
tmp = tempfile.mkdtemp()

# Create a ProjectIO instance
io = ProjectIO(root=tmp, use_datestamp=False)

print(f"Root: {io.root}")
print(f"Inputs: {io.inputs}")
print(f"Outputs: {io.outputs}")

Root: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq
Inputs: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq
Outputs: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq


## Directory Structure

`ProjectIO` exposes properties for common directories, each located under the root:

In [2]:
print(f"Cache directory: {io.cache}")
print(f"Logs directory: {io.logs}")
print(f"Data directory: {io.data_dir}")
print(f"Downloads directory: {io.downloads}")

Cache directory: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/cache
Logs directory: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/logs
Data directory: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/data
Downloads directory: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/downloads


## Root Cascade

A key feature is **root cascade**: when you set `root`, both `iroot` (input root) and `oroot` (output root) default to the same directory:

In [3]:
# Both iroot and oroot follow root
print(f"iroot == root: {io.iroot == io.root}")
print(f"oroot == root: {io.oroot == io.root}")

iroot == root: True
oroot == root: True


You can override either one individually by passing `iroot` or `oroot`:

In [4]:
# Create separate directories
data_dir = tempfile.mkdtemp()
output_dir = tempfile.mkdtemp()

io2 = ProjectIO(
    root=tmp,
    iroot=data_dir,    # Override input root
    oroot=output_dir,  # Override output root
    use_datestamp=False
)

print(f"Root: {io2.root}")
print(f"Input root: {io2.iroot}")
print(f"Output root: {io2.oroot}")

Root: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq
Input root: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpjd5flc4f
Output root: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmppxulspf0
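
The cascade rule can be pictured with a small stand-alone sketch. The `Roots` class and its `iroot_override` / `oroot_override` fields are hypothetical names chosen for illustration; this is not how `projio` is implemented, only a model of the behavior shown above, where an unset input or output root falls back to `root`.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Roots:
    """Hypothetical model of root cascade (not projio's actual code)."""

    root: Path
    iroot_override: Optional[Path] = None
    oroot_override: Optional[Path] = None

    @property
    def iroot(self) -> Path:
        # Fall back to root when no input root was given
        return self.iroot_override if self.iroot_override is not None else self.root

    @property
    def oroot(self) -> Path:
        # Fall back to root when no output root was given
        return self.oroot_override if self.oroot_override is not None else self.root


# No overrides: both roots follow root
cascaded = Roots(root=Path("/project"))

# Only the output root is overridden; the input root still follows root
split = Roots(root=Path("/project"), oroot_override=Path("/scratch/out"))

print(cascaded.iroot == cascaded.root, cascaded.oroot == cascaded.root)
print(split.iroot, split.oroot)
```

Resolving the fallback at access time, rather than copying `root` at construction, is one way to keep the three values consistent if `root` is changed later.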


## Building Paths

Use `path_for()` to build a file path from a directory kind (such as `'outputs'`, `'cache'`, or `'logs'`), a file name, and an extension:

In [5]:
# Build paths for different purposes
output_file = io.path_for('outputs', 'results', ext='.csv')
cache_file = io.path_for('cache', 'preprocessed', ext='.pkl')
log_file = io.path_for('logs', 'training', ext='.log')

print(f"Output file: {output_file}")
print(f"Cache file: {cache_file}")
print(f"Log file: {log_file}")

Output file: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/results.csv
Cache file: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/cache/preprocessed.pkl
Log file: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/logs/training.log


Pass `subdir` as a string for one level, or as a list for several nested levels:

In [6]:
# With subdirectory
nested_path = io.path_for('outputs', 'model', subdir='experiment_1', ext='.pt')
print(f"Nested path: {nested_path}")

# Multiple subdirectory levels
deep_path = io.path_for('outputs', 'results', subdir=['2024', 'march', 'run_1'], ext='.json')
print(f"Deep path: {deep_path}")

Nested path: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/experiment_1/model.pt
Deep path: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/2024/march/run_1/results.json
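
The layout of the paths printed above can be reproduced with plain `pathlib`. The `build_path` helper below is a hypothetical stand-in written for this tutorial, not `projio`'s implementation; it only mirrors the observed pattern of base directory, optional subdirectories, then file name plus extension.

```python
from pathlib import Path
from typing import Sequence, Union


def build_path(
    base: Path,
    name: str,
    subdir: Union[str, Sequence[str], None] = None,
    ext: str = "",
) -> Path:
    """Hypothetical helper mirroring the layout path_for() prints above."""
    if subdir is None:
        parts = []
    elif isinstance(subdir, str):
        parts = [subdir]      # a single subdirectory level
    else:
        parts = list(subdir)  # several nested levels
    return base.joinpath(*parts, f"{name}{ext}")


base = Path("/project")
flat = build_path(base, "results", ext=".csv")
nested = build_path(base, "model", subdir="experiment_1", ext=".pt")
deep = build_path(base, "results", subdir=["2024", "march", "run_1"], ext=".json")

for p in (flat, nested, deep):
    print(p)
```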


## Singleton Access with PIO

For convenience, `PIO` gives class-level access to a default instance, so you don't have to pass a `ProjectIO` object around:

In [7]:
from projio import PIO

# Set up the default instance
PIO.default = ProjectIO(root=tmp, use_datestamp=False)

# Now access paths directly from the class
print(f"PIO.root: {PIO.root}")
print(f"PIO.cache: {PIO.cache}")
print(f"PIO.outputs: {PIO.outputs}")

PIO.root: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq
PIO.cache: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/cache
PIO.outputs: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq
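
Class-level access of this kind is often built with a metaclass that forwards unknown attributes to a default instance. The sketch below shows that general technique with hypothetical names (`_DelegateToDefault`, `Paths`, `Facade`); whether `PIO` works this way internally is an assumption, not something this tutorial confirms.

```python
from pathlib import Path


class _DelegateToDefault(type):
    """Metaclass: class attributes not found normally are read from cls.default."""

    def __getattr__(cls, name):
        # Only called when normal class attribute lookup fails
        default = type.__getattribute__(cls, "default")
        if default is None:
            raise AttributeError(f"{cls.__name__}.default is not set")
        return getattr(default, name)


class Paths:
    """Hypothetical instance type that holds a root directory."""

    def __init__(self, root):
        self.root = Path(root)

    @property
    def cache(self):
        return self.root / "cache"


class Facade(metaclass=_DelegateToDefault):
    """Hypothetical stand-in for a class-level accessor such as PIO."""

    default = None


Facade.default = Paths("/project")
print(Facade.root == Path("/project"))         # True
print(Facade.cache == Path("/project/cache"))  # True
```

With this pattern, reading an attribute before `default` is assigned raises `AttributeError`, which makes a missing setup step visible early.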


## Auto-Creating Directories

By default, `ProjectIO` creates directories when you access paths. You can control this with the `auto_create` argument:

In [8]:
# With auto_create=True (default), directories are created
io_auto = ProjectIO(root=tempfile.mkdtemp(), auto_create=True, use_datestamp=False)
path = io_auto.path_for('outputs', 'test', ext='.txt')
print(f"Parent exists: {path.parent.exists()}")

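# With auto_create=False, directories are not created. The parent here is the
# temp root itself, which mkdtemp() already made, so this check still prints True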
io_manual = ProjectIO(root=tempfile.mkdtemp(), auto_create=False, use_datestamp=False)
path2 = io_manual.path_for('outputs', 'test', ext='.txt')
print(f"Parent exists (manual): {path2.parent.exists()}")

Parent exists: True
Parent exists (manual): True
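
A difference between the two modes only shows up when the target lives in a directory that does not exist yet. The following plain-`pathlib` sketch involves no `projio` calls; it shows which parent directories exist after `tempfile.mkdtemp()` and what an auto-create step amounts to.

```python
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())  # mkdtemp() creates this directory itself

# A file directly under root: its parent is root, which already exists
flat = root / "test.txt"
print(f"Flat parent exists: {flat.parent.exists()}")

# A file in a nested subdirectory: nothing has created the parent yet
nested = root / "experiment_1" / "test.txt"
before = nested.parent.exists()
print(f"Nested parent exists: {before}")

# What an auto-create step amounts to in plain pathlib
nested.parent.mkdir(parents=True, exist_ok=True)
after = nested.parent.exists()
print(f"After mkdir: {after}")
```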


## Describing Configuration

Use `describe()` to get a summary of the current configuration:

In [9]:
config = io.describe()
for key, value in config.items():
    print(f"{key}: {value}")

root: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq
iroot: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq
oroot: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq
inputs: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq
outputs: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq
cache: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/cache
logs: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/logs
lightning_root: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/lightning
checkpoints: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/lightning/checkpoints
tensorboard: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpy76pa7pq/lightning/tensorboard
auto_create: True
use_datestamp: False
datestamp_format: %Y_%m_%d
datestamp_in: dirs
dry_run: False
package: None
producer_records: 0
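
Because the result of `describe()` is iterated with `.items()`, it behaves like a mapping and can be saved alongside your results as a record of the configuration. The sketch below uses a stand-in dictionary with illustrative values instead of a real `describe()` call. Whether the real values are `Path` objects or strings is not shown here, so `default=str` is used to cover both cases.

```python
import json
from pathlib import Path

# Stand-in for the mapping returned by io.describe(); values are illustrative
config = {
    "root": Path("/project"),
    "cache": Path("/project/cache"),
    "auto_create": True,
    "use_datestamp": False,
    "datestamp_format": "%Y_%m_%d",
    "package": None,
}

# default=str converts any value json cannot encode (such as Path) to a string
snapshot = json.dumps(config, indent=2, default=str, sort_keys=True)
print(snapshot)

# The snapshot loads back as plain JSON types
restored = json.loads(snapshot)
```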


## Next Steps

- Learn about [datestamp handling](02_datestamp.ipynb) for time-based organization
- Explore [Lightning integration](03_lightning.ipynb) for ML workflows
- Discover [templates](04_templates.ipynb) for common file patterns
- Check out [advanced features](05_advanced.ipynb) like producer tracking and dry-run mode