# Testing code with Scitacean

Testing programs that use Scitacean can be tricky as those tests might try to access a SciCat server or fileserver.
This is why Scitacean provides [FakeClient](../generated/classes/scitacean.testing.client.FakeClient.rst) and [FakeFileTransfer](../generated/classes/scitacean.testing.transfer.FakeFileTransfer.rst).
Those two classes follow the same separation of concerns as the real classes.
That is, `FakeClient` handles metadata and `FakeFileTransfer` handles files.
They can be mixed and matched freely with the real client and file transfers.
However, it is generally recommended to use the two fakes together.

First, create a test dataset and file.

In [None]:
from scitacean import Dataset

dataset = Dataset(
    type="raw",
    owner_group="faculty",
    owner="ridcully",
    principal_investigator="ridcully",
    contact_email="ridcully@uu.am",
    data_format="spellbook-9000",
    source_folder="/upload",
)

In [None]:
from pathlib import Path

path = Path("test-data/spellbook.txt")
path.parent.mkdir(parents=True, exist_ok=True)
with path.open("w") as f:
    f.write("fireball power=1000 mana=123")

In [None]:
dataset.add_local_files("test-data/spellbook.txt", base_path="test-data")

## FakeClient

[scitacean.testing.client.FakeClient](../generated/classes/scitacean.testing.client.FakeClient.rst) has the same interface as the regular [Client](../generated/classes/scitacean.Client.rst) but never connects to any SciCat server.
Instead, it maintains an internal record of datasets and datablocks.
It is easiest to explain with an example.
First, create a `FakeClient`.
The URL is completely arbitrary and only needs to be passed for parity with the real client.

In [None]:
from scitacean.testing.client import FakeClient
from scitacean.testing.transfer import FakeFileTransfer

client = FakeClient.without_login(
    url="https://fake.scicat",
    file_transfer=FakeFileTransfer(source_folder="/upload/{uid}"))

### Upload

And now we can upload our test dataset as usual:

In [None]:
finalized = client.upload_new_dataset_now(dataset)
str(finalized)

However, this did not talk to a SciCat server.
We can check if the fake upload was successful by inspecting the `client`.
`client.datasets` is a `dict` that contains all datasets known to the fake server keyed by PID:

In [None]:
client.datasets.keys()

In [None]:
pid = list(client.datasets.keys())[0]
client.datasets[pid]

The client has recorded the upload from earlier.
However, it stored the dataset as a [model](../reference/index.rst#models), not as a regular `Dataset` object.
Datablocks store metadata and paths of files in SciCat.
Since the dataset has a file, an original datablock was uploaded as well:

In [None]:
client.orig_datablocks.keys()

In [None]:
# use the pid of the dataset
client.orig_datablocks[pid]

When writing tests, those recorded dataset and datablock models can be used to check if an upload worked.
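
As a minimal sketch of such a check, the helper below asserts that a dataset and at least one original datablock were recorded for a given PID.
The name `assert_upload_recorded` is hypothetical and not part of Scitacean; it only relies on the `datasets` and `orig_datablocks` dictionaries described above.

```python
def assert_upload_recorded(client, pid):
    """Fail unless the client recorded a dataset and a datablock for `pid`.

    Hypothetical test helper. `client` is expected to expose `datasets`
    and `orig_datablocks` dictionaries keyed by dataset PID, as the
    FakeClient in this guide does.
    """
    assert pid in client.datasets, f"no dataset recorded for {pid}"
    # A list of datablock models is stored for each dataset PID.
    datablocks = client.orig_datablocks.get(pid, [])
    assert len(datablocks) > 0, f"no original datablock recorded for {pid}"
```

In a test, this could be called as `assert_upload_recorded(client, finalized.pid)` right after `client.upload_new_dataset_now(dataset)`.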

### Download

`FakeClient` can also download datasets that are stored in its `datasets` dictionary:

In [None]:
downloaded = client.get_dataset(pid)
str(downloaded)

This is now an actual `Dataset` object like you would get from a real client.

If we want to test downloads independently of uploads, we can populate `client.datasets` and `client.orig_datablocks` manually.
But keep in mind that those store *models*. See the [model reference](../reference/index.rst#models) for an overview.
And also note that `orig_datablocks` stores a list of models for each dataset as there can be multiple datablocks per dataset.
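
The following sketch shows the shape of such manual population.
`seed_dataset` is a hypothetical helper, not part of Scitacean, and it assumes that the caller has already built the dataset and datablock models as described in the model reference.

```python
def seed_dataset(client, pid, dataset_model, datablock_models=()):
    """Register pre-built models with a fake client.

    Hypothetical helper. `dataset_model` and `datablock_models` must be
    model objects created by the caller; this function only stores them.
    """
    client.datasets[pid] = dataset_model
    # A dataset can have several datablocks, so a list is stored per PID.
    client.orig_datablocks[pid] = list(datablock_models)
```

After seeding, `client.get_dataset(pid)` can be used in the test as in the download example above.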

### Fidelity

Although `FakeClient` is sufficient for many tests, it does not behave exactly the same way as a real client.
For example, it does not perform any validation of datasets or handle credentials.
In addition, it does not modify uploaded datasets like a real server would.
This can be seen by inspecting both `finalized`, the dataset returned by `client.upload_new_dataset_now(dataset)` above, and `downloaded`.

If a test requires these properties, consider using a locally deployed SciCat server.
See in particular the [developer documentation on testing](../developer/testing.rst).

## FakeFileTransfer

The `FakeClient` used above only fakes a SciCat server, i.e., the handling of metadata.
If we also want to test file uploads and downloads, we can use [scitacean.testing.transfer.FakeFileTransfer](../generated/classes/scitacean.testing.transfer.FakeFileTransfer.rst).
Starting from a clean slate, create a fake client with a fake file transfer as above:

In [None]:
from scitacean.testing.client import FakeClient
from scitacean.testing.transfer import FakeFileTransfer

client = FakeClient.without_login(
    url="https://fake.scicat",
    file_transfer=FakeFileTransfer(source_folder="/upload/{uid}"))

And upload a dataset:

In [None]:
finalized = client.upload_new_dataset_now(dataset)

The file transfer has recorded the upload of the file without actually uploading it anywhere.
We can inspect all files on the fake fileserver using:

In [None]:
client.file_transfer.files

This is a dictionary that maps the [remote_access_path](../generated/classes/scitacean.File.rst#scitacean.File.remote_access_path) of each file to the content of that file.
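
In a test, this dictionary can be used to check that a file with the expected content was uploaded.
The sketch below uses a hypothetical helper, `assert_content_uploaded`, and assumes that file contents are stored as `bytes`; adjust the comparison if your version stores them differently.

```python
def assert_content_uploaded(client, expected):
    """Fail unless some recorded file has exactly the content `expected`.

    Hypothetical test helper. It only reads `client.file_transfer.files`
    and ignores the remote paths used as keys.
    """
    contents = list(client.file_transfer.files.values())
    assert expected in contents, "no uploaded file has the expected content"
```

For the dataset in this guide, the call could look like `assert_content_uploaded(client, b"fireball power=1000 mana=123")`.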

We can also download the file.

In [None]:
downloaded = client.get_dataset(finalized.pid)
with_downloaded_file = client.download_files(downloaded, target="test-data/download")

In [None]:
file = list(with_downloaded_file.files)[0]
file

In [None]:
with file.local_path.open() as f:
    print(f.read())

If we want to test downloads independently of uploads, we can populate `client.file_transfer.files` manually.
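
A minimal sketch of this is shown below.
`seed_remote_files` is a hypothetical helper, not part of Scitacean; the keys it receives are assumed to be the remote access paths of the files that the test will later download.

```python
def seed_remote_files(client, contents_by_remote_path):
    """Place files on the fake file server without going through an upload.

    Hypothetical helper. Keys are stored unchanged, so they should match
    the remote access paths that the test expects to download from.
    """
    for remote_path, content in contents_by_remote_path.items():
        client.file_transfer.files[remote_path] = content
```

Note that this only fakes the file server; the dataset and datablock models still need to be present in the client for `client.download_files` to find the files.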

In [None]:
# This cell is hidden.
# It should remove *only* files and directories created by this notebook.
import shutil
shutil.rmtree("test-data", ignore_errors=True)