
## Storage policy and paths (keep HOME empty)

- Personal HOME (small/quota): `/cluster/home/zwu09` — use only for emergencies/bootstrapping.
- Primary working storage (all code, data, caches, envs):
  - Datalab: `/cluster/tufts/datalab/zwu09` (granted) and `/cluster/tufts/datalab/EM212-LLM`
  - Class: `/cluster/tufts/em212class/zwu09` (1 TB pool for class)
- OnDemand sessions and SSH shells are separate allocations. Cancel unneeded OnDemand jobs to free resources.
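
To find sessions worth cancelling, something like the following may help. It is a minimal sketch that assumes OnDemand sessions appear as Slurm jobs under your username (typical for OnDemand setups, but not confirmed by the tickets); the function name `list_my_jobs` is made up for illustration.

```bash
# Hypothetical helper: list your own Slurm jobs so stale sessions are easy to spot.
list_my_jobs() {
  if ! command -v squeue >/dev/null 2>&1; then
    echo "squeue not found: run this on a cluster login node"
    return 0
  fi
  # Columns: job id, job name, state, elapsed time, node list or pending reason.
  squeue -u "${USER:-$(id -un)}" -o '%.10i %.20j %.8T %.10M %R'
}

list_my_jobs
# Cancel a job you no longer need by its id (placeholder shown, not a real id):
# scancel <jobid>
```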

From tickets (2025-03 to 2025-04):
- Datalab granted: `/cluster/tufts/datalab/EM212-LLM`, `/cluster/tufts/datalab/zwu09` (pool ~2.3T; near full at times)
- Class space: `/cluster/tufts/em212class` 1000GB with ample free capacity

Check quotas/usage:
```bash
# lab/class space
df -h /cluster/tufts/datalab
df -h /cluster/tufts/em212class

# top-level HOME usage: first open an interactive shell on a compute node, then run du inside it
srun -n 1 -t 0-03:00 --mem 4G -p batch --pty bash
du -h --max-depth=1 ~ | sort -hr | head -n 20
```
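
To make the check repeatable, the `df` call can be wrapped in a small helper that warns when a filesystem passes a threshold. This is only a sketch: the name `warn_if_full` and the 90% default are assumptions for illustration, not a site policy.

```bash
# Hypothetical helper: warn when the filesystem holding a path is at or above
# a usage threshold (default 90%, an assumed value).
warn_if_full() {
  local path="$1" limit="${2:-90}" pct
  # df -P prints one POSIX-format line per filesystem; column 5 is the use percentage.
  pct=$(df -P "$path" | awk 'NR==2 { gsub("%", "", $5); print $5 }')
  if [ "$pct" -ge "$limit" ]; then
    echo "WARNING: $path is ${pct}% full (limit ${limit}%)"
  else
    echo "OK: $path is ${pct}% full"
  fi
}

# Check both project spaces when they exist on this machine.
for p in /cluster/tufts/datalab /cluster/tufts/em212class; do
  if [ -d "$p" ]; then
    warn_if_full "$p"
  fi
done
```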

Rule of thumb:
- Keep `~` empty; do not store datasets, models, or caches there.
- Put EVERYTHING (code repos, conda/mamba envs, caches, temp, models) under `/cluster/tufts/datalab/zwu09` (or class path when appropriate).
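
For directories that already live in `~`, one common approach is to move them to project storage and leave a symlink behind so tools that hard-code the old location keep working. The sketch below assumes nothing is using the directory while it moves; `relocate` is a hypothetical name, not an existing tool.

```bash
# Hypothetical helper: move a directory out of HOME and leave a symlink in its place.
# Usage: relocate <source_dir> <destination_parent>
relocate() {
  local src="$1" dest_parent="$2" name
  name=$(basename "$src")
  if [ -L "$src" ]; then
    echo "skip: $src is already a symlink"
    return 0
  fi
  if [ ! -d "$src" ]; then
    echo "skip: $src is not a directory"
    return 0
  fi
  if [ -e "$dest_parent/$name" ]; then
    echo "refusing: $dest_parent/$name already exists" >&2
    return 1
  fi
  mkdir -p "$dest_parent"
  mv "$src" "$dest_parent/$name"       # copies then deletes when crossing filesystems
  ln -s "$dest_parent/$name" "$src"    # old path now points at the new location
  echo "moved $src -> $dest_parent/$name"
}

# Example (uncomment to use): move ~/.cache under the datalab caches directory.
# relocate "$HOME/.cache" /cluster/tufts/datalab/zwu09/caches
```

Running it a second time on the same directory is a no-op, because the source is then a symlink.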



In [1]:
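%%bash
# Redirect caches, tools, and environments to datalab.
# The %%bash magic runs this cell in a bash subprocess instead of the Python
# kernel (pasting the same lines into a login shell also works). Run once, then
# log in again so the ~/.bashrc additions take effect.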

export DATALAB_BASE=/cluster/tufts/datalab/zwu09
mkdir -p "$DATALAB_BASE"/{code,envs,caches,tmp,models}

# pip cache (downloads and locally built wheels)
mkdir -p "$DATALAB_BASE/caches/pip"
export PIP_CACHE_DIR="$DATALAB_BASE/caches/pip"

# Conda/Mamba envs and pkgs
mkdir -p "$DATALAB_BASE/envs/conda_pkgs"
export CONDA_ENVS_PATH="$DATALAB_BASE/envs"
export CONDA_PKGS_DIRS="$DATALAB_BASE/envs/conda_pkgs"

# Hugging Face
mkdir -p "$DATALAB_BASE/caches/huggingface"
export HF_HOME="$DATALAB_BASE/caches/huggingface"

# PyTorch
mkdir -p "$DATALAB_BASE/caches/torch"
export TORCH_HOME="$DATALAB_BASE/caches/torch"

# Jupyter runtime
mkdir -p "$DATALAB_BASE/tmp/jupyter"
export JUPYTER_RUNTIME_DIR="$DATALAB_BASE/tmp/jupyter"

# General tmp
export TMPDIR="$DATALAB_BASE/tmp"

# Persist by appending to ~/.bashrc (skipped when an earlier run already added the block)
if ! grep -q '^export DATALAB_BASE=' ~/.bashrc 2>/dev/null; then
  {
    echo "export DATALAB_BASE=$DATALAB_BASE"
    echo 'export PIP_CACHE_DIR="$DATALAB_BASE/caches/pip"'
    echo 'export CONDA_ENVS_PATH="$DATALAB_BASE/envs"'
    echo 'export CONDA_PKGS_DIRS="$DATALAB_BASE/envs/conda_pkgs"'
    echo 'export HF_HOME="$DATALAB_BASE/caches/huggingface"'
    echo 'export TORCH_HOME="$DATALAB_BASE/caches/torch"'
    echo 'export JUPYTER_RUNTIME_DIR="$DATALAB_BASE/tmp/jupyter"'
    echo 'export TMPDIR="$DATALAB_BASE/tmp"'
  } >> ~/.bashrc
fi

echo "Configured caches and envs under $DATALAB_BASE"


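
After logging in again, it can be worth confirming that every redirected variable really points under the datalab base. The following is a rough sketch; `check_redirects` is a hypothetical helper name, and the variable list simply mirrors the exports in the cell above.

```bash
# Hypothetical check: report whether each redirected variable points under a base path.
check_redirects() {
  local base="$1" var value status=0
  for var in PIP_CACHE_DIR CONDA_ENVS_PATH CONDA_PKGS_DIRS HF_HOME \
             TORCH_HOME JUPYTER_RUNTIME_DIR TMPDIR; do
    # Indirect expansion: read the variable whose name is stored in $var.
    value="${!var:-}"
    case "$value" in
      "$base"/*) echo "ok      $var=$value" ;;
      "")        echo "unset   $var"; status=1 ;;
      *)         echo "outside $var=$value"; status=1 ;;
    esac
  done
  return "$status"
}

# Example (uncomment in a fresh login shell after the setup has been run):
# check_redirects "${DATALAB_BASE:?DATALAB_BASE is not set}"
```

The function returns a non-zero status when any variable is unset or points elsewhere, so it can also serve as a guard at the top of job scripts.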