# iRODS sync tutorial

`sync` synchronizes data between a local copy (on the local file system) and the copy stored in iRODS. It compares the checksums of local and remote files to determine whether they have changed and need to be synchronized. It creates missing files and overwrites older copies, but it does not delete files from the target location when they have been deleted from the source.
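
The checksum comparison described above can be illustrated with a small offline sketch. It is not how `sync` is implemented: the helper names `sha256_of` and `needs_transfer` are hypothetical, both files are local, and SHA-256 is an assumption (the algorithm iRODS uses depends on the server configuration).

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_transfer(source_file: Path, target_file: Path) -> bool:
    """Transfer when the target is missing or its checksum differs."""
    if not target_file.exists():
        return True
    return sha256_of(source_file) != sha256_of(target_file)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "src.txt")
    dst = Path(tmp, "dst.txt")
    src.write_text("version 2")
    missing = needs_transfer(src, dst)    # target does not exist yet
    dst.write_text("version 1")
    changed = needs_transfer(src, dst)    # contents differ
    dst.write_text("version 2")
    unchanged = needs_transfer(src, dst)  # identical contents

print(missing, changed, unchanged)
```

Comparing digests rather than timestamps means a file counts as changed only when its content differs, whatever its modification time says.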

The command operates in one of two modes: it synchronizes data either from the client's local file system to iRODS, or from iRODS to the local file system.

In [1]:
import os
from pathlib import Path
from pprint import pprint
from ibridges.interactive import interactive_auth
from ibridges.path import IrodsPath
from ibridges import sync

### Create a session

Set up a session with an iRODS server. This example assumes you have a valid, locally cached iRODS password from a previous session.

In [2]:
session = interactive_auth()

### Uploading/downloading
Upload or download mode is determined by the type of `source` and `target` (`IrodsPath` or `str`/`Path`).

When uploading, `source` must be an existing local folder and `target` an existing iRODS collection. When downloading, `source` must be an existing iRODS collection and `target` an existing local folder. An exception is raised if either does not exist.
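
The type-based rule above can be sketched without a server connection. The `RemotePath` class below is a hypothetical stand-in for `IrodsPath`, and `sync_mode` is an illustrative helper, not part of ibridges; raising `TypeError` for other combinations is an assumption of this sketch.

```python
from pathlib import Path


class RemotePath:
    """Hypothetical stand-in for IrodsPath so the sketch runs offline."""

    def __init__(self, *parts: str):
        self.parts = parts


def sync_mode(source, target) -> str:
    """Derive the transfer direction from the argument types."""
    source_local = isinstance(source, (str, Path))
    target_local = isinstance(target, (str, Path))
    if source_local and isinstance(target, RemotePath):
        return "upload"
    if isinstance(source, RemotePath) and target_local:
        return "download"
    # Assumption: two local or two remote paths are rejected.
    raise TypeError("expected one local path and one remote path")


upload = sync_mode(Path("data"), RemotePath("~", "Demo"))
download = sync_mode(RemotePath("~", "Demo"), "Downloads/Sync")
print(upload, download)
```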

In [8]:
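# Download mode: the source is an iRODS collection, the target a local folder.
source = IrodsPath(session, "~", "Demo")
source.create_collection()
# The local target folder must already exist.
target = Path.home().joinpath("Downloads/Sync")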

In [9]:
print(target, target.is_dir())
print(source, source.collection_exists())

/Users/staig001/Downloads/Sync True
/uu/home/research-christine/Demo True


### Setting sync options
`sync` takes various options:

- The `max_level` option controls the depth to which the file tree is synchronized. With `max_level` set to `None` (the default), there is no limit (full recursive synchronization). A `max_level` of 1 synchronizes only the source's root, a `max_level` of 2 also includes the first level of subfolders/subcollections and their contents, and so on.
- The `copy_empty_folders` option (default `False`) controls whether folders/collections that contain no files or subfolders/subcollections are synchronized.
- The `dry_run` option lists all the source files and folders that need to be synchronized without actually performing the synchronization.
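
The `max_level` semantics described in the list can be illustrated on a local folder tree. This is a sketch of the depth rule only, under the assumption that level 1 means the root folder itself; `files_within` is a hypothetical helper, not an ibridges function.

```python
import os
import tempfile
from pathlib import Path
from typing import List, Optional


def files_within(root: Path, max_level: Optional[int] = None) -> List[str]:
    """List files under root (relative paths) up to max_level folder levels.

    Level 1 is the root folder itself; None means no depth limit.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = Path(dirpath).relative_to(root)
        level = len(rel_dir.parts) + 1  # root -> 1, first subfolder -> 2, ...
        for name in filenames:
            found.append((rel_dir / name).as_posix())
        if max_level is not None and level >= max_level:
            dirnames[:] = []  # stop os.walk from descending further
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub" / "deeper").mkdir(parents=True)
    (root / "top.txt").write_text("a")
    (root / "sub" / "mid.txt").write_text("b")
    (root / "sub" / "deeper" / "low.txt").write_text("c")
    level_1 = files_within(root, max_level=1)
    level_2 = files_within(root, max_level=2)
    unlimited = files_within(root)

print(level_1)    # ['top.txt']
print(level_2)    # ['sub/mid.txt', 'top.txt']
print(unlimited)  # ['sub/deeper/low.txt', 'sub/mid.txt', 'top.txt']
```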

By default, the checksums of all transferred files are calculated and verified after uploading or downloading. A checksum mismatch raises an error and aborts the synchronization. If this happens, the transfer may have been interrupted or the data corrupted along the way. Check both copies of the offending file and keep the correct one.

In [10]:
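max_level = None           # no depth limit: synchronize the full tree
copy_empty_folders = True  # also create empty folders/collections
dry_run = True             # only list what would be synchronized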

### Dry run
Setting `dry_run` to `True` lists what would be synchronized without performing any transfers.

In [11]:
ops = sync(
    source=source,
    target=target,
    max_level=max_level,
    dry_run=dry_run,
    copy_empty_folders=copy_empty_folders,
)
ops.print_summary()
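
The idea behind a dry run, and the rule that nothing is ever deleted from the target, can be shown with a purely local sketch. The function `one_way_sync` is hypothetical and works on two local folders; it compares file bytes for brevity, whereas `sync` compares checksums.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def one_way_sync(src_dir: Path, dst_dir: Path, dry_run: bool = True) -> List[str]:
    """Copy new or changed files from src_dir to dst_dir; never delete.

    Returns the relative paths that were (or, in a dry run, would be) copied.
    """
    planned = []
    for src_file in sorted(p for p in src_dir.rglob("*") if p.is_file()):
        rel = src_file.relative_to(src_dir)
        dst_file = dst_dir / rel
        if not dst_file.exists() or dst_file.read_bytes() != src_file.read_bytes():
            planned.append(rel.as_posix())
            if not dry_run:
                dst_file.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src_file, dst_file)
    return planned


with tempfile.TemporaryDirectory() as tmp:
    src_dir, dst_dir = Path(tmp, "source"), Path(tmp, "target")
    src_dir.mkdir()
    dst_dir.mkdir()
    (src_dir / "new.txt").write_text("new")
    (dst_dir / "only_in_target.txt").write_text("kept")

    plan = one_way_sync(src_dir, dst_dir, dry_run=True)
    untouched = not (dst_dir / "new.txt").exists()  # dry run copied nothing

    done = one_way_sync(src_dir, dst_dir, dry_run=False)
    copied = (dst_dir / "new.txt").read_text()
    extra_kept = (dst_dir / "only_in_target.txt").exists()  # no deletions
    second_run = one_way_sync(src_dir, dst_dir, dry_run=False)

print(plan, done, second_run)
```

Running the real transfer a second time reports nothing to do, because every source file now has an identical copy in the target.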




To perform the actual synchronization, set `dry_run` to `False` and run `sync` again.

In [12]:
dry_run = False

In [13]:
ops = sync(
    source=source,
    target=target,
    max_level=max_level,
    dry_run=dry_run,
    copy_empty_folders=copy_empty_folders,
)


