# Dataset Submission

In this example, we will work with a singlepoint dataset. However, the same concepts apply to all other dataset types.

In [None]:
import qcportal as ptl
from qcportal.molecules import Molecule
from qcportal.singlepoint import QCSpecification

In [None]:
# Submission requires a username/password
client = ptl.PortalClient("https://qcademo.molssi.org", username="YOUR_USERNAME", password="YOUR_PASSWORD")
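
To keep credentials out of a shared notebook, they could be read from environment variables instead of being typed into the cell. This is a minimal sketch; `QCA_USERNAME` and `QCA_PASSWORD` are hypothetical variable names chosen for illustration, not names that QCPortal itself defines.

```python
import os


def load_credentials(env=os.environ):
    """Return (username, password) read from environment variables.

    QCA_USERNAME and QCA_PASSWORD are hypothetical names used only
    in this sketch; pick whatever names suit your setup.
    """
    try:
        return env["QCA_USERNAME"], env["QCA_PASSWORD"]
    except KeyError as err:
        # Report which variable is missing without echoing any secret
        raise RuntimeError(f"Missing environment variable: {err.args[0]}") from None
```

The returned pair can then be passed as the `username` and `password` arguments of `ptl.PortalClient`, as in the cell above.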

## Creating the dataset, entries, and specifications

We will first create the singlepoint dataset on the server.

The first argument is the type of dataset. See
[PortalClient.add_dataset](../api/qcportal.rst#qcportal.client.PortalClient.add_dataset) for more options. We are also adding a default tag for all the computations.

In [None]:
ds = client.add_dataset("singlepoint",
                        name="Element Benchmark",
                        description="Variety of calculations on single atoms",
                        default_tag="sp_el_tag")

Now add dataset entries. For a singlepoint dataset, these correspond to the molecules that the singlepoint calculations run on.

This cell creates ten Molecule objects, one for each of the first 10 elements, with the atom at the origin. It then creates entries for the dataset, and adds them to the dataset.

Dataset entries can have other fields as well. See, for example, [SinglepointDatasetNewEntry](../api/qcportal.datasets.singlepoint.rst#qcportal.datasets.singlepoint.models.SinglepointDatasetNewEntry).

In [None]:
for m in ['h', 'he', 'li', 'be', 'b', 'c', 'n', 'o', 'f', 'ne']:
    mol = Molecule(symbols=[m], geometry=[0.0, 0.0, 0.0])
    
    # Creates an entry from the molecule. The entry contains the molecule and a name,
    # but there are additional fields you can have as well
    entry_name = m + "_atom"
    ds.add_entry(name=entry_name, molecule=mol)

We will now create two different specifications and add them to the dataset. The first will be hf/sto-3g, and the second will be mp2/aug-cc-pvtz.

For both, we will increase the maximum number of SCF iterations to 100.

In [None]:
spec_1 = QCSpecification(
            program="psi4",
            driver="energy",
            method="hf",
            basis="sto-3g",
            keywords={"maxiter": 100}
)

spec_2 = QCSpecification(
            program="psi4",
            driver="properties",
            method="mp2",
            basis="aug-cc-pvtz",
            keywords={"maxiter": 100}
)

ds.add_specification(name="hf/sto-3g", specification=spec_1)
ds.add_specification(name="mp2/aug-cc-pvtz", specification=spec_2)
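
Each record in a dataset is addressed by an entry name together with a specification name, so the ten entries and two specifications above describe twenty calculations. The pure-Python sketch below only enumerates those pairs to make the bookkeeping explicit; it does not contact the server.

```python
from itertools import product

# Entry and specification names exactly as created in the cells above
entry_names = [el + "_atom" for el in
               ["h", "he", "li", "be", "b", "c", "n", "o", "f", "ne"]]
specification_names = ["hf/sto-3g", "mp2/aug-cc-pvtz"]

# One record per (entry, specification) pair
record_keys = list(product(entry_names, specification_names))

print(len(record_keys))  # 20
print(record_keys[0])    # ('h_atom', 'hf/sto-3g')
```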

## Submitting the computations and checking the status

At this point, we have added specifications and entries,
but have not submitted any calculations yet. We do that with
the `submit()` function.

By default, this submits all calculations, but we could restrict the entries
and specifications that get submitted.

The compute tag for all these computations can be specified here, but by default, the `default_tag` we passed to the `add_dataset` function will be used.

In [None]:
ds.submit()
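
As a sketch of the restricted submission mentioned above, the helper below forwards a subset of entries and specifications to `submit()`. It assumes that `submit()` accepts the keyword arguments `entry_names` and `specification_names`; check the signature in your installed QCPortal version before relying on it. The helper name `submit_subset` is hypothetical.

```python
def submit_subset(ds, entry_names, specification_names):
    """Submit only the given entries and specifications of a dataset.

    Assumption: ds.submit takes `entry_names` and `specification_names`
    keyword arguments (verify against your QCPortal version).
    """
    return ds.submit(
        entry_names=list(entry_names),
        specification_names=list(specification_names),
    )
```

For example, `submit_subset(ds, ["h_atom", "he_atom"], ["hf/sto-3g"])` would request only the two lightest atoms with the first specification.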

We can check the status of the calculations on the server with the `status()` function. Note that this will always be computed on the server, and will not use any locally-cached records.

In [None]:
ds.status()
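
To get a single overview across all specifications, the per-specification counts can be summed. The sketch below assumes that `status()` returns a mapping from specification name to a mapping of status to record count; if your QCPortal version returns a different structure, the helper would need adjusting. The name `total_by_status` is hypothetical.

```python
from collections import Counter


def total_by_status(status):
    """Sum record counts per status over all specifications.

    Assumption: `status` maps specification name -> {status: count}.
    """
    totals = Counter()
    for per_specification in status.values():
        # Counter.update adds the counts of a mapping to the running totals
        totals.update(per_specification)
    return dict(totals)
```

It would be called as `total_by_status(ds.status())`.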

We can then view and manipulate records as before.

In [None]:
rec = ds.get_record("h_atom", "hf/sto-3g")
print(rec.id)
print(rec.properties["return_energy"])

In [None]:
df = ds.compile_values(lambda r: r.properties["return_energy"])

In [None]:
print(df)
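
The lambda passed to `compile_values` assumes that every record it receives has a `properties` dictionary containing `return_energy`. If that might not hold, for instance for records that have not finished, a more defensive value function could be used. This is a minimal sketch; `safe_return_energy` is a hypothetical helper, and it makes no assumption about which records `compile_values` passes in.

```python
def safe_return_energy(record):
    """Return the record's return_energy, or None if it is unavailable."""
    # `properties` may be missing or None, so fall back to an empty dict
    properties = getattr(record, "properties", None) or {}
    return properties.get("return_energy")
```

It can be passed in place of the lambda: `ds.compile_values(safe_return_energy)`.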