From a22043625c224ec3e3158fa8874a6dedec3e5071 Mon Sep 17 00:00:00 2001
From: Axel Huebl
Date: Sat, 16 Feb 2019 16:33:50 +0100
Subject: [PATCH] Docs: New First Steps

A new, simple, step-by-step "first steps" section with detailed instructions.
Cheat-sheet like.
---
 CHANGELOG.rst                    |   1 +
 docs/source/index.rst            |   3 +-
 docs/source/usage/firstread.rst  | 322 ++++++++++++++++++++++++++++
 docs/source/usage/firststeps.rst |  22 --
 docs/source/usage/firstwrite.rst | 357 +++++++++++++++++++++++++++++++
 docs/source/usage/parallel.rst   |   4 +-
 docs/source/usage/serial.rst     |   4 +-
 7 files changed, 686 insertions(+), 27 deletions(-)
 create mode 100644 docs/source/usage/firstread.rst
 delete mode 100644 docs/source/usage/firststeps.rst
 create mode 100644 docs/source/usage/firstwrite.rst

diff --git a/CHANGELOG.rst b/CHANGELOG.rst
index 15792a8967..adb5f3b462 100644
--- a/CHANGELOG.rst
+++ b/CHANGELOG.rst
@@ -38,6 +38,7 @@ Other

   - PyPI install method #450 #451
   - more info on MPI #449
+  - new "first steps" section #473

 0.7.1-alpha

diff --git a/docs/source/index.rst b/docs/source/index.rst
index 72c15962c1..d25f39c5af 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -65,7 +65,8 @@ Usage
    :maxdepth: 1
    :hidden:

-   usage/firststeps
+   usage/firstwrite
+   usage/firstread
    usage/serial
    usage/parallel
    usage/examples

diff --git a/docs/source/usage/firstread.rst b/docs/source/usage/firstread.rst
new file mode 100644
index 0000000000..f68c78a262
--- /dev/null
+++ b/docs/source/usage/firstread.rst
@@ -0,0 +1,322 @@
.. _usage-firstread:

First Read
==========

Step-by-step: how to read openPMD data?
We are using the example files from `openPMD-example-datasets `_.

Include / Import
----------------

After successful :ref:`installation `, you can start using openPMD-api as follows:

C++11
^^^^^

.. 
code-block:: cpp

   #include <openPMD/openPMD.hpp>

   // example: data handling & print
   #include <vector>   // std::vector
   #include <iostream> // std::cout
   #include <memory>   // std::shared_ptr

   namespace api = openPMD;

Python
^^^^^^

.. code-block:: python3

   import openpmd_api as api

   # example: data handling
   import numpy as np


Open
----

Open an existing openPMD series in ``data<N>.h5``.
File formats other than ``.h5`` (`HDF5 `_) are supported as well:
``.bp`` (`ADIOS1 `_) or ``.json`` (`JSON `_).

C++11
^^^^^

.. code-block:: cpp

   auto series = api::Series(
       "data%T.h5",
       api::AccessType::READ_ONLY);


Python
^^^^^^

.. code-block:: python3

   series = api.Series(
       "data%T.h5",
       api.Access_Type.read_only)

Iteration
---------

Grouping by an arbitrary, positive integer number ``<N>`` in a series.
Let's take the iteration ``100``:

C++11
^^^^^

.. code-block:: cpp

   auto i = series.iterations[100];

Python
^^^^^^

.. code-block:: python3

   i = series.iterations[100]

Attributes
----------

openPMD defines a core set of meta attributes and can always be extended with more.
Let's see what we've got:

C++11
^^^^^

.. code-block:: cpp

   std::cout
       << "Author: "
       << series.author() << "\n"
       << "openPMD version: "
       << series.openPMD() << "\n";

Python
^^^^^^

.. code-block:: python3

   print(
       "Author: {0}\n"
       "openPMD version: {1}"
       .format(
           series.author,
           series.openPMD))

Record
------

An openPMD record can be either structured (mesh) or unstructured (particles).
Let's read an electric field:

C++11
^^^^^

.. code-block:: cpp

   // record
   auto E = i.meshes["E"];

   // record components
   auto E_x = E["x"];

Python
^^^^^^

.. code-block:: python3

   # record
   E = i.meshes["E"]

   # record components
   E_x = E["x"]

Units
-----

On read, units are automatically converted to SI unless otherwise specified (not yet implemented).
Even without the name ``"E"`` we can check the `dimensionality `_ of a record to understand its purpose.
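As a standalone sketch of that idea (plain Python, no openPMD-api calls; the lookup table and helper below are purely illustrative, and the seven exponents follow the conventional SI base-dimension ordering L, M, T, I, θ, N, J):

```python
# Sketch: guess a record's physical quantity from its unit dimension,
# i.e. the powers of the 7 SI base dimensions (L, M, T, I, theta, N, J).
# An electric field has SI unit V/m = kg * m / (A * s^3)
#   -> (L=1, M=1, T=-3, I=-1, 0, 0, 0)
# A magnetic field has SI unit T = kg / (A * s^2)
#   -> (L=0, M=1, T=-2, I=-1, 0, 0, 0)
KNOWN_QUANTITIES = {
    (1., 1., -3., -1., 0., 0., 0.): "electric field",
    (0., 1., -2., -1., 0., 0., 0.): "magnetic field",
}


def guess_quantity(unit_dimension):
    """Return a human-readable guess for a record's quantity."""
    return KNOWN_QUANTITIES.get(tuple(unit_dimension), "unknown")


# a record named arbitrarily, e.g. "E", can still be identified:
print(guess_quantity((1., 1., -3., -1., 0., 0., 0.)))  # electric field
```

In real code, the tuple would come from ``E.unit_dimension`` (Python) or ``E.getUnitDimension()`` (C++), as shown in the snippets of this section.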

C++11
^^^^^

.. code-block:: cpp

   // unit system agnostic dimension
   auto E_unitDim = E.getUnitDimension();

   // ...
   // api::UnitDimension::M

Python
^^^^^^

.. code-block:: python3

   # unit system agnostic dimension
   E_unitDim = E.unit_dimension

   # ...
   # api.Unit_Dimension.M

.. note::

   This example is not yet written :-)

Register Chunk
--------------

We can load record components partially and in parallel or all at once.
Reading small data sets one by one is a performance killer for I/O.
Therefore, we register the data to be loaded first and then flush it in collectively.

C++11
^^^^^

.. code-block:: cpp

   // alternatively, pass a pre-allocated array
   std::shared_ptr< double > x_data =
       E_x.loadChunk< double >();

Python
^^^^^^

.. code-block:: python3

   # returns an allocated but
   # undefined numpy array
   x_data = E_x.load_chunk()

.. attention::

   After registering a data chunk such as ``x_data`` for loading, it MUST NOT be modified or deleted until the ``flush()`` step is performed!
   **You must not yet access** ``x_data`` **!**

Flush Chunk
-----------

We now flush the registered data chunks and fill them with actual data from the I/O backend.
Flushing several chunks at once can increase I/O performance significantly.
**Only after that** can the variable ``x_data`` be read, manipulated and/or deleted.

C++11
^^^^^

.. code-block:: cpp

   series.flush();

Python
^^^^^^

.. code-block:: python3

   series.flush()

Data
----

We can now work with the newly loaded data in ``x_data``:

C++11
^^^^^

.. code-block:: cpp

   auto extent = E_x.getExtent();

   std::cout << "First values in E_x "
                "of shape: ";
   for( auto const& dim : extent )
       std::cout << dim << ", ";
   std::cout << "\n";

   for( size_t col = 0;
        col < extent[1] && col < 5;
        ++col )
       std::cout << x_data.get()[col]
                 << ", ";
   std::cout << "\n";


Python
^^^^^^

.. 
code-block:: python3

   extent = E_x.shape

   print(
       "First values in E_x "
       "of shape: ",
       extent)


   print(x_data[0, 0, :5])

Close
-----

Finally, the Series is closed when its destructor is called.
Make sure to have ``flush()``-ed all registered data loads at this point; otherwise, ``flush()`` will be called once more implicitly.

C++11
^^^^^

.. code-block:: cpp

   // destruct series object,
   // e.g. when out-of-scope

Python
^^^^^^

.. code-block:: python3

   del series

diff --git a/docs/source/usage/firststeps.rst b/docs/source/usage/firststeps.rst
deleted file mode 100644
index a799ca56ae..0000000000
--- a/docs/source/usage/firststeps.rst
+++ /dev/null
@@ -1,22 +0,0 @@
-.. _usage-firststeps:
-
-First Steps
-===========
-
-For brevity, all following examples assume the following includes/imports:
-
-C++11
------
-
-.. code-block:: cpp
-
-   #include <openPMD/openPMD.hpp>
-
-   using namespace openPMD;
-
-Python
-------
-
-.. code-block:: python3
-
-   import openpmd_api

diff --git a/docs/source/usage/firstwrite.rst b/docs/source/usage/firstwrite.rst
new file mode 100644
index 0000000000..fbdf1a8db2
--- /dev/null
+++ b/docs/source/usage/firstwrite.rst
@@ -0,0 +1,357 @@
.. _usage-firstwrite:

First Write
===========

Step-by-step: how to write scientific data with openPMD-api?

Include / Import
----------------

After successful :ref:`installation `, you can start using openPMD-api as follows:

C++11
^^^^^

.. code-block:: cpp

   #include <openPMD/openPMD.hpp>

   // example: data handling
   #include <numeric> // std::iota
   #include <vector>  // std::vector

   namespace api = openPMD;

Python
^^^^^^

.. code-block:: python3

   import openpmd_api as api

   # example: data handling
   import numpy as np


Open
----

Write into a new openPMD series in ``myOutput/data_<00...N>.h5``.
File formats other than ``.h5`` (`HDF5 `_) are supported as well:
``.bp`` (`ADIOS1 `_) or ``.json`` (`JSON `_).

C++11
^^^^^

.. 
code-block:: cpp

   auto series = api::Series(
       "myOutput/data_%05T.h5",
       api::AccessType::CREATE);


Python
^^^^^^

.. code-block:: python3

   series = api.Series(
       "myOutput/data_%05T.h5",
       api.Access_Type.create)

Iteration
---------

Grouping by an arbitrary, positive integer number ``<N>`` in a series:

C++11
^^^^^

.. code-block:: cpp

   auto i = series.iterations[42];

Python
^^^^^^

.. code-block:: python3

   i = series.iterations[42]

Attributes
----------

Everything in openPMD can be extended and user-annotated.
Let us try this by writing some meta data:

C++11
^^^^^

.. code-block:: cpp

   series.setAuthor(
       "Axel Huebl ");
   series.setMachine(
       "Hall Probe 5000, Model 3");
   series.setAttribute(
       "dinner", "Pizza and Coke");
   i.setAttribute(
       "vacuum", true);

Python
^^^^^^

.. code-block:: python3

   series.set_author(
       "Axel Huebl ")
   series.set_machine(
       "Hall Probe 5000, Model 3")
   series.set_attribute(
       "dinner", "Pizza and Coke")
   i.set_attribute(
       "vacuum", True)

Data
----

Let's prepare some data that we want to write.
For example, a magnetic field :math:`\vec B(i, j)` slice in two dimensions with three components :math:`(B_x, B_y, B_z)^\intercal` of which the :math:`B_y` component shall be constant for all :math:`(i, j)` indices.

C++11
^^^^^

.. code-block:: cpp

   std::vector< float > x_data(
       150 * 300);
   std::iota(
       x_data.begin(),
       x_data.end(),
       0.f);

   float y_data = 4.f;

   std::vector< float > z_data(x_data);
   for( auto& c : z_data )
       c -= 8000.f;

Python
^^^^^^

.. code-block:: python3

   x_data = np.arange(
       150 * 300,
       dtype=np.float32
   ).reshape(150, 300)

   y_data = 4.

   z_data = x_data.copy() - 8000.

Record
------

An openPMD record can be either structured (mesh) or unstructured (particles).
We prepared a vector field in 2D above, which is a mesh:

C++11
^^^^^

.. 
code-block:: cpp

   // record
   auto B = i.meshes["B"];

   // record components
   auto B_x = B["x"];
   auto B_y = B["y"];
   auto B_z = B["z"];

   auto dataset = api::Dataset(
       api::determineDatatype< float >(),
       {150, 300});
   B_x.resetDataset(dataset);
   B_y.resetDataset(dataset);
   B_z.resetDataset(dataset);

Python
^^^^^^

.. code-block:: python3

   # record
   B = i.meshes["B"]

   # record components
   B_x = B["x"]
   B_y = B["y"]
   B_z = B["z"]

   dataset = api.Dataset(
       x_data.dtype,
       x_data.shape)
   B_x.reset_dataset(dataset)
   B_y.reset_dataset(dataset)
   B_z.reset_dataset(dataset)

Units
-----

Ouch, our measured magnetic field data is in `Gauss `_!
Quick, let's store the conversion factor to SI (`Tesla `_).

C++11
^^^^^

.. code-block:: cpp

   // conversion to SI
   B_x.setUnitSI(1.e-4);
   B_y.setUnitSI(1.e-4);
   B_z.setUnitSI(1.e-4);

   // unit system agnostic dimension
   B.setUnitDimension({
       {api::UnitDimension::M, 1},
       {api::UnitDimension::I, -1},
       {api::UnitDimension::T, -2}
   });

Python
^^^^^^

.. code-block:: python3

   # conversion to SI
   B_x.set_unit_SI(1.e-4)
   B_y.set_unit_SI(1.e-4)
   B_z.set_unit_SI(1.e-4)

   # unit system agnostic dimension
   B.set_unit_dimension({
       api.Unit_Dimension.M: 1,
       api.Unit_Dimension.I: -1,
       api.Unit_Dimension.T: -2
   })

.. tip::

   Annotating the `dimensionality `_ of a record allows us to read data sets with *arbitrary names* and understand their purpose simply by *dimensionality*.

Register Chunk
--------------

We can write record components partially and in parallel or all at once.
Writing very small data sets one by one is a performance killer for I/O.
Therefore, we register the data to be written first and then flush it out collectively.

C++11
^^^^^

.. code-block:: cpp

   B_x.storeChunk(
       api::shareRaw(x_data),
       {0, 0}, {150, 300});
   B_z.storeChunk(
       api::shareRaw(z_data),
       {0, 0}, {150, 300});

   B_y.makeConstant(y_data);

Python
^^^^^^

.. 
code-block:: python3

   B_x.store_chunk(x_data)

   B_z.store_chunk(z_data)


   B_y.make_constant(y_data)

.. attention::

   After registering data chunks such as ``x_data`` and ``z_data``, they MUST NOT be modified or deleted until the ``flush()`` step is performed!

Flush Chunk
-----------

We now flush the registered data chunks to the I/O backend.
Flushing several chunks at once can increase I/O performance significantly.
After that, the variables ``x_data`` and ``z_data`` can be used again.

C++11
^^^^^

.. code-block:: cpp

   series.flush();

Python
^^^^^^

.. code-block:: python3

   series.flush()

Close
-----

Finally, the Series is fully closed (and newly registered data or attributes since the last ``.flush()`` are written) when its destructor is called.

C++11
^^^^^

.. code-block:: cpp

   // destruct series object,
   // e.g. when out-of-scope

Python
^^^^^^

.. code-block:: python3

   del series

diff --git a/docs/source/usage/parallel.rst b/docs/source/usage/parallel.rst
index fd2eccc897..0168fbfb7a 100644
--- a/docs/source/usage/parallel.rst
+++ b/docs/source/usage/parallel.rst
@@ -1,7 +1,7 @@
 .. _usage-parallel:

-Parallel API
-============
+Parallel Examples
+=================

 The following examples show parallel reading and writing of domain-decomposed data with MPI.

diff --git a/docs/source/usage/serial.rst b/docs/source/usage/serial.rst
index fe77c2d446..5c8b1e0c26 100644
--- a/docs/source/usage/serial.rst
+++ b/docs/source/usage/serial.rst
@@ -1,7 +1,7 @@
 .. _usage-serial:

-Serial API
-==========
+Serial Examples
+===============

 The serial API provides sequential, one-process read and write access.
 Most users will use this for exploration and processing of their data.
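The serial write examples in this patch start from plain arrays; the data preparation used in the "First Write" page above can be sanity-checked standalone with numpy (a sketch, assuming only that numpy is installed; no openPMD-api calls involved):

```python
import numpy as np

# Mirror the "Data" step of the First Write page: a 150 x 300 slice with
# components B_x (a ramp), B_y (a constant scalar, later stored via
# make_constant) and B_z (a shifted copy, independent of x_data).
x_data = np.arange(150 * 300, dtype=np.float32).reshape(150, 300)
y_data = 4.0
z_data = x_data.copy() - 8000.0

print(x_data.shape)   # (150, 300)
print(x_data[0, :3])  # [0. 1. 2.]
print(z_data[0, 0])   # -8000.0
```

Because ``z_data`` is a copy, modifying it after registration would not touch ``x_data``; this is exactly why the "Register Chunk" step warns against mutating registered arrays before ``flush()``.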