Merge branch 'v0.5.0'
csadorf committed Aug 31, 2016
2 parents d5a70e8 + ee25101 commit 780f83e
Showing 27 changed files with 613 additions and 222 deletions.
16 changes: 16 additions & 0 deletions changelog.txt
@@ -1,3 +1,19 @@
0.5.0:

- General updates:

- The performance of project indexing and crawling has been improved.

- API changes:

- New function: `signac.init_project()` simplifies project initialization
within python
- Added optional `root` argument to `signac.get_project()` to simplify
getting a project handle outside of the current working directory
- Added optional argument to `signac.get_project()`, to allow for projec
- Added two class factory methods to `Project`: `get_project()` and
`init_project()`

0.4.0

- General updates:
4 changes: 2 additions & 2 deletions doc/conf.py
@@ -76,9 +76,9 @@ def __getattr__(cls, name):
# built documents.
#
# The short X.Y version.
version = '0.4'
version = '0.5'
# The full version, including alpha/beta/rc tags.
release = '0.4.0'
release = '0.5.0'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
2 changes: 1 addition & 1 deletion doc/configuration.rst
@@ -35,7 +35,7 @@ Project configuration
A project configuration file is defined by containing the keyword *project*.
Once **signac** has found a project configuration file, it stops searching for more configuration files above the current working directory.

For example, to initialize a project named *MyProject*, navigate to the project's root directory and either execute ``$ signac init MyProject`` on the command line or create the project configuration file manually.
For example, to initialize a project named *MyProject*, navigate to the project's root directory and either execute ``$ signac init MyProject`` on the command line, use the :py:func:`signac.init_project` function or create the project configuration file manually.
This is an example for a project configuration file:

.. code-block:: ini
14 changes: 12 additions & 2 deletions doc/projects.rst
@@ -11,7 +11,17 @@ Specifying a project name as identifier within a configuration file initiates a
$ mkdir my_project
$ cd my_project
$ signac init MyProject
# or
You can alternatively initialize your project within python with

.. code-block:: python
>>> project = signac.init_project('MyProject')
Finally, you can of course also create the configuration manually, e.g., with:

.. code-block:: bash
$ echo project=MyProject >> signac.rc
The directory that contains this configuration file is the project's root directory.
@@ -62,7 +72,7 @@ Get an instance of :py:class:`~signac.contrib.job.Job`, which is a handle on you
>>> job = project.open_job(statepoint)
>>> job.get_id()
'9bfd29df07674bc4aa960cf661b5acd2'
Equivalent from the command line:

.. code-block:: shell
4 changes: 4 additions & 0 deletions doc/reference.rst
@@ -10,6 +10,10 @@ Projects
Start a new project
-------------------

.. code-block:: python
project = signac.init_project('MyProject')
.. code-block:: bash
$ mkdir my_project
62 changes: 36 additions & 26 deletions doc/tutorial/basics.rst
@@ -52,6 +52,7 @@ Project Setup

To start we create a new project directory.
You can create the directory anywhere, for example in your home directory.
Initialize your project either on the command line

.. code-block:: bash
@@ -60,13 +61,21 @@ You can create the directory anywhere, for example in your home directory.
$ signac init IdealGasProject
Initialized project 'IdealGasProject'.
or within python:

.. code-block:: python
>>> import signac
>>> project = signac.init_project('IdealGasProject')
>>>
This creates a config file ``signac.rc`` within our project root directory with the following content:

.. code-block:: bash
project=IdealGasProject
Alternatively you can create the config file manually with ``$ echo "project=IdealGasProject" > signac.rc``.
Alternatively you can create the config file manually, e.g., with ``$ echo "project=IdealGasProject" > signac.rc``.
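
For scripted setups, the same one-line configuration file can be written from plain Python. This is only a sketch of the manual route described above: it writes into a temporary directory (an assumption made so the example has no side effects) and does not call signac at all.

```python
import os
import tempfile

# Write the same content that the shell `echo` command would produce.
# The temporary directory stands in for the project root (assumption).
with tempfile.TemporaryDirectory() as root:
    config_path = os.path.join(root, 'signac.rc')
    with open(config_path, 'w') as file:
        file.write('project=IdealGasProject\n')
    with open(config_path) as file:
        content = file.read()

print(content, end='')
```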

The project is the interface to our data space.
We can either interact with it on the command line or use the python interface:
@@ -81,24 +90,25 @@ We can either interact with it on the command line or use the python interface:
A minimal Example
=================

For this tutorial we want to compute the volume of an ideal gas as a function of its pressure and temperature.
For this tutorial we want to compute the volume of an ideal gas as a function of its pressure and thermal energy.

.. math::
p V = N k_B T
p V = N k_B T,
We will set :math:`k_B=1` and execute the complete study in **7 lines** of code:
where :math:`N` is the system size, :math:`p` the pressure, :math:`k_B T` the thermal energy and :math:`V` the volume.
We will execute the complete study in **7 lines of code**:

.. code-block:: python
0. # minimal.py
1. import signac
2. project = signac.get_project()
3. for p in 0.1, 1.0, 10.0:
4. sp = {'p': p, 'T': 10.0, 'N': 10}
4. sp = {'p': p, 'kT': 1.0, 'N': 1000}
5. with project.open_job(sp) as job:
6. if 'V' not in job.document:
7. job.document['V'] = sp['N'] * sp['T'] / sp['p']
7. job.document['V'] = sp['N'] * sp['kT'] / sp['p']
1. Import the ``signac`` package.
2. Obtain a handle for the configured project.
@@ -115,9 +125,9 @@ We can then examine our results by iterating over the data space:
>>> for job in project.find_jobs():
... print(job.statepoint()['p'], job.document['V'])
...
0.1 1000.0
1.0 100.0
10.0 10.0
0.1 10000.0
1.0 1000.0
10.0 100.0
This concludes the minimal example.
In the next section we will assume that the ideal gas computation represents a more expensive computation.
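
To check the numbers independently of signac, the ideal gas relation can be evaluated in plain Python. The sketch below assumes that ``calc_volume()`` simply returns ``N * kT / p``, which is what the minimal example computes inline; the function body is not shown in this diff, so treat it as an assumption.

```python
def calc_volume(N, kT, p):
    """Return the ideal gas volume V = N * kT / p (assumed implementation)."""
    return N * kT / p

# Same state points as the updated minimal example: kT = 1.0, N = 1000.
volumes = {p: calc_volume(1000, 1.0, p) for p in (0.1, 1.0, 10.0)}
for p, V in volumes.items():
    print(p, V)
```

The values should agree with the updated output listed for the minimal example.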
@@ -134,7 +144,7 @@ Data space initialization
In the minimal example we initialized the data space *implicitly*.
Let's see how we can initialize it *explicitly*.
In general, the data space needs to contain all parameters that will affect our data.
For the ideal gas that is a 3-dimensional space spanned by the temperature *T*, the pressure *p* and the system size *N*.
For the ideal gas that is a 3-dimensional space spanned by the thermal energy *kT*, the pressure *p* and the system size *N*.

Each state point represents a unique set of parameters that we want to associate with data.
In terms of signac this relationship is represented by a :py:class:`~signac.contrib.job.Job`.
@@ -150,7 +160,7 @@ Let's define our initialization routine in a script called ``init.py``:
project = signac.get_project()
for pressure in 0.1, 1.0, 10.0:
statepoint = {'p': pressure, 'T': 1.0, 'N': 1000}
statepoint = {'p': pressure, 'kT': 1.0, 'N': 1000}
job = project.open_job(statepoint)
job.init()
print(job, 'initialized')
@@ -160,9 +170,9 @@ We can now initialize the workspace with:
.. code-block:: bash
$ python init.py
3daa7dc28de43a2ff132a4b48c6abe0e initialized
9e100da58ccdf6ad7941fce7d14deeb5 initialized
07dc3f53615713900208803484b87253 initialized
5a6c687f7655319db24de59a2336eff8 initialized
ee617ad585a90809947709a7a45dda9a initialized
5a456c131b0c5897804a4af8e77df5aa initialized
The output shows the job ids associated with each state point.
The *job id* is a unique identifier representing the state point.
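
One way to picture why renaming the key ``T`` to ``kT`` changes every job id in this diff: an id derived from the state point changes whenever the state point's content changes. The sketch below hashes a canonical JSON encoding of the state point. ``sketch_job_id`` is a hypothetical helper written for illustration; it is not claimed to be signac's implementation, and the ids it produces are not expected to match the ones printed in the tutorial.

```python
import hashlib
import json

def sketch_job_id(statepoint):
    """Hypothetical helper: hash a canonical JSON encoding of the state point."""
    # sort_keys makes the encoding independent of dictionary key order.
    blob = json.dumps(statepoint, sort_keys=True).encode('utf-8')
    return hashlib.md5(blob).hexdigest()

a = sketch_job_id({'p': 0.1, 'kT': 1.0, 'N': 1000})
b = sketch_job_id({'N': 1000, 'kT': 1.0, 'p': 0.1})  # same content, other order
c = sketch_job_id({'p': 0.1, 'T': 1.0, 'N': 1000})   # old key name 'T'

print(a == b)  # identical content gives the same id
print(a == c)  # a renamed key gives a different id
```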
@@ -175,9 +185,9 @@ The project's workspace has been populated with directories for each state point
.. code-block:: bash
$ ls -1 workspace/
07dc3f53615713900208803484b87253
3daa7dc28de43a2ff132a4b48c6abe0e
9e100da58ccdf6ad7941fce7d14deeb5
5a6c687f7655319db24de59a2336eff8
ee617ad585a90809947709a7a45dda9a
5a456c131b0c5897804a4af8e77df5aa
We could execute the initialization script multiple times to add more state points; jobs that already exist will be ignored.

@@ -199,12 +209,12 @@ For this we define two functions inside a ``run.py`` script:
"Compute the volume of this state point."
sp = job.statepoint()
with job:
V = calc_volume(sp['N'], sp['T'], sp['p'])
V = calc_volume(sp['N'], sp['kT'], sp['p'])
with open('V.txt', 'w') as file:
file.write(str(V)+'\n')
print(job, 'computed volume')
The ``calc_volume()`` function returns the volume of an ideal gas with a system size *N*, temperature *T* and pressure *p*.
The ``calc_volume()`` function returns the volume of an ideal gas with a system size *N*, thermal energy *kT* and pressure *p*.
The ``compute_volume()`` function retrieves the state point from the job argument and stores the result of the ideal gas law calculation in a file called ``V.txt``.
The ``with job:`` clause utilizes the ``job`` handle as a context manager.
It means that all commands below it are executed within the job's workspace directory.
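
The directory-switching behaviour of ``with job:`` can be imitated with a small context manager. The sketch below is an illustration only: ``working_directory`` is a hypothetical helper, and a temporary directory stands in for the job's workspace directory.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Hypothetical helper: run the with-block inside `path`, then switch back."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield path
    finally:
        os.chdir(previous)  # restore the old directory even if the block raises

start = os.getcwd()
with tempfile.TemporaryDirectory() as workspace:
    with working_directory(workspace):
        # The relative path resolves inside the workspace directory.
        with open('V.txt', 'w') as file:
            file.write(str(1000.0) + '\n')
    written = os.path.exists(os.path.join(workspace, 'V.txt'))
restored = os.getcwd() == start
```

Writing to a relative path such as ``V.txt`` inside the block is what places the file in the job's own directory rather than in the project root.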
@@ -232,7 +242,7 @@ Let's add a few more lines to complete the ``run.py`` script:
"Compute the volume of this state point."
sp = job.statepoint()
with job:
V = calc_volume(sp['N'], sp['T'], sp['p'])
V = calc_volume(sp['N'], sp['kT'], sp['p'])
with open('V.txt', 'w') as file:
file.write(str(V)+'\n')
print(job, 'computed volume')
@@ -246,16 +256,16 @@ We are now ready to execute:
.. code-block:: bash
$ python run.py
07dc3f53615713900208803484b87253 computed volume
3daa7dc28de43a2ff132a4b48c6abe0e computed volume
9e100da58ccdf6ad7941fce7d14deeb5 computed volume
5a456c131b0c5897804a4af8e77df5aa computed volume
5a6c687f7655319db24de59a2336eff8 computed volume
ee617ad585a90809947709a7a45dda9a computed volume
And we can verify that we actually stored data:

.. code-block:: bash
$ cat workspace/07dc3f53615713900208803484b87253/V.txt
100.0
$ cat workspace/ee617ad585a90809947709a7a45dda9a/V.txt
1000.0
Analyzing data
--------------
@@ -307,7 +317,7 @@ To use the job document instead of a file, we need to modify our operation funct
def compute_volume(job):
sp = job.statepoint()
with job:
V = calc_volume(sp['N'], sp['T'], sp['N'])
V = calc_volume(sp['N'], sp['kT'], sp['p'])
job.document['V'] = V # <-- new line
with open('V.txt', 'w') as file:
file.write(str(V)+'\n')
28 changes: 14 additions & 14 deletions doc/tutorial/data.rst
@@ -14,8 +14,8 @@ By default, using the :py:meth:`~.Project.find_jobs` function will return a list
>>> for job in project.find_jobs():
... print(job)
...
474778977e728a74b4ebc2e14221bef6
3daa7dc28de43a2ff132a4b48c6abe0e
8629822576debc2bfbeffa56787ca348
9110d0837ad93ff6b4013bae30091edd
# ...
Similarly, we can execute ``signac find`` on the command line to get a list of all *job ids* within the workspace:
@@ -24,8 +24,8 @@ Similarly, we can execute ``signac find`` on the command line to get a list of a
$ signac find
Indexing project...
474778977e728a74b4ebc2e14221bef6
3daa7dc28de43a2ff132a4b48c6abe0e
8629822576debc2bfbeffa56787ca348
9110d0837ad93ff6b4013bae30091edd
# ...
A standard operation is to find and operate on a **data subset**.
@@ -36,7 +36,7 @@ For this purpose we can use a filter argument, which will return all jobs with m
>>> for job in project.find_jobs({'p': 1.0}):
... print(job)
...
3daa7dc28de43a2ff132a4b48c6abe0e
ee617ad585a90809947709a7a45dda9a
Or equivalently on the command line:
@@ -45,15 +45,15 @@ Or equivalently on the command line:
$ signac find '{"p": 0.1}'
Indexing project...
3daa7dc28de43a2ff132a4b48c6abe0e
5a6c687f7655319db24de59a2336eff8
Next, we verify the selection by piping the output of ``signac find`` into the ``signac statepoint`` command via ``xargs``:

.. code-block:: bash
$ signac find '{"p": 0.1}' | xargs signac statepoint
Indexing project...
{"p": 0.1, "T": 1.0, "N": 1000}
{"N": 1000, "p": 0.1, "kT": 1.0}
Instead of filtering by statepoint, we can also filter by values in the *job document* (or both):
@@ -63,7 +63,7 @@ Instead of filtering by statepoint, we can also filter by values in the *job doc
>>> for job in project.find_jobs(doc_filter={'V': 100}):
... print(job)
...
07dc3f53615713900208803484b87253
5a456c131b0c5897804a4af8e77df5aa
Finding jobs by certain criteria requires an index of the data space.
In the previous examples this index was created implicitly; however, depending on the size of the data space, it may make sense to create the index explicitly and reuse it.
@@ -80,8 +80,8 @@ To create an index, we need to crawl through the project's data space, for examp
>>> for doc in project.index():
... print(doc)
{'statepoint': {'N': 1000, 'T': 1.0, 'p': 10.0}, '_id': '07dc3f53615713900208803484b87253', 'signac_id': '07dc3f53615713900208803484b87253', 'V': 100.0}
{'statepoint': {'N': 1000, 'T': 1.0, 'p': 4.5}, '_id': '14ba699529683f7132c863c51facc79c', 'signac_id': '14ba699529683f7132c863c51facc79c', 'V': 222.22222222222223}
{'statepoint': {'p': 5.6, 'N': 1000, 'kT': 1.0}, '_id': '05061d2acea19d2d9a25ac3360f70e04', 'signac_id': '05061d2acea19d2d9a25ac3360f70e04', 'V': 178.57142857142858}
{'statepoint': {'p': 1.2000000000000002, 'N': 1000, 'kT': 1.0}, '_id': '22582e83c6b12336526ed304d4378ff8', 'signac_id': '22582e83c6b12336526ed304d4378ff8', 'V': 833.3333333333333}
# ...
Or by executing the ``signac index`` function on the command line:
@@ -90,8 +90,8 @@ Or by executing the ``signac index`` function on the command line:
$ signac index
Indexing project...
{"signac_id": "07dc3f53615713900208803484b87253", "V": 100.0, "_id": "07dc3f53615713900208803484b87253", "statepoint": {"N": 1000, "p": 10.0, "T": 1.0}}
{"signac_id": "14ba699529683f7132c863c51facc79c", "V": 222.22222222222223, "_id": "14ba699529683f7132c863c51facc79c", "statepoint": {"N": 1000, "p": 4.5, "T": 1.0}}
{"signac_id": "05061d2acea19d2d9a25ac3360f70e04", "V": 178.57142857142858, "statepoint": {"N": 1000, "p": 5.6, "kT": 1.0}, "_id": "05061d2acea19d2d9a25ac3360f70e04"}
{"signac_id": "22582e83c6b12336526ed304d4378ff8", "V": 833.3333333333333, "statepoint": {"N": 1000, "p": 1.2000000000000002, "kT": 1.0}, "_id": "22582e83c6b12336526ed304d4378ff8"}
# ...
We can store and reuse this index, e.g. to speed up find operations:
@@ -102,8 +102,8 @@ We can store and reuse this index, e.g. to speed up find operations:
Indexing project...
$ signac find --index=index.txt
Reading index from file 'index.txt'...
b0dd91c4755b81b47becf83e6fb22413
957349e42149cea3b0362226535a3973
05061d2acea19d2d9a25ac3360f70e04
e8186b9b68e18a82f331d51a7b8c8c15
# ...
At this point the index contains information about the statepoint and all data stored in the *job document*.
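
Because each index entry is a JSON document, a stored index can also be inspected with nothing but the standard library. The sketch below parses the two entries shown in the ``signac index`` output above and selects ids by state point. ``find_ids`` is a hypothetical helper for illustration; it is not part of signac and only mimics simple equality filters.

```python
import json

# Two index entries copied from the `signac index` output in this diff.
index_lines = [
    '{"signac_id": "05061d2acea19d2d9a25ac3360f70e04", "V": 178.57142857142858, '
    '"statepoint": {"N": 1000, "p": 5.6, "kT": 1.0}, '
    '"_id": "05061d2acea19d2d9a25ac3360f70e04"}',
    '{"signac_id": "22582e83c6b12336526ed304d4378ff8", "V": 833.3333333333333, '
    '"statepoint": {"N": 1000, "p": 1.2000000000000002, "kT": 1.0}, '
    '"_id": "22582e83c6b12336526ed304d4378ff8"}',
]
index = [json.loads(line) for line in index_lines]

def find_ids(index, statepoint_filter):
    """Hypothetical helper: ids whose state point matches every key in the filter."""
    return [
        doc['_id'] for doc in index
        if all(doc['statepoint'].get(key) == value
               for key, value in statepoint_filter.items())
    ]

print(find_ids(index, {'p': 5.6}))
print(find_ids(index, {'N': 1000}))
```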