-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Doc issue 78 #94
Doc issue 78 #94
Conversation
I left some text between "begin remove" ... "end remove" markers. |
docs/source/data-model.rst
Outdated
Data Model | ||
********** | ||
******************************** | ||
Data Model: Documents and Events |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think of Events as one of the kinds of Documents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have had some confusion with the event/Event terminology. In my mind there are event-based execution model "events" and then there are "Event Documents". I try not to think of the documents as "events" but as "Events". I think you are right, this heading should go back to just "Data Model".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, let's be clearer. Instead of starting with the next paragraph, start with something more direct.
In data acquisition, an *event* is a set of readings coordinated at a specific time. A primary design goal of bluesky is to enable better research by recording rich metadata alongside the event data for use in later analysis. Bluesky encapsulates all information from each event in a *Document*. Other types of Document describe the events associated with an experimental run.
docs/source/data-model.rst
Outdated
|
||
A *document* is our term for a Python dictionary with a schema --- that is, | ||
organized in a | ||
`formally specified <https://github.com/NSLS-II/event-model>`_ way --- created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This link is a cross-reference from bluesky to event-model. Now that we are in event-model, I think we can just remove this link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AND, leave the title of this section as "Data Model: Documents and Events" since it creates the necessary association with the data acquisition term event and the bluesky document type event.
docs/source/data-model.rst
Outdated
Minimal nontrivial valid example: | ||
|
||
.. code-block:: python | ||
|
||
# 'stop' document |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we should add a comment like this to all the examples? Most of them do not have one. I would rather apply it to none or all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added them to all the examples.
docs/source/data-model.rst
Outdated
Data Model: Documents and Events | ||
******************************** | ||
|
||
*begin remove* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still like this original first paragraph, doesn't seem exactly redundant to what follows. Maybe combine it with the next paragraph?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent some time trying to combine the original first paragraph and the text from documents.rst. My problem is that the original is talking about little-e "events" and it seems confusing to talk about big-e "Events" at the same time.
I will try again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a new proposal: I added the original paragraph to the second new paragraph. I still think using the term "event-based" is confusing so I propose using "document-based" instead. I felt like the original text fits here since the following paragraph talks about subscribing to a stream of documents.
A primary design goal of bluesky is to enable better research by recording
rich metadata alongside measured data for use in later analysis. Documents are
how we do this.
A *document* is our term for a Python dictionary with a schema created
by the RunEngine during plan execution. All of the metadata and data generated
by executing the plan is organized into documents. Bluesky's document-based
data model supports complex, asynchronous data collection and enables
sophisticated live, prompt, streaming, and *post-facto* data analysis.
A :doc:"later section <callbacks>" describes how outside functions can
"subscribe" to a stream of these documents, visualizing, processing, or saving
them. This section provides an outline of documents themselves, aiming to give
a sense of the structure and familiarity with useful components.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
working on the third paragraph below....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I agree this is tricky. This line once made sense
created by the RunEngine during plan execution
but is now too limiting: we can "ingest" data files from non-bluesky beamlines or simulations to create documents now. I wouldn't want to give the wrong impression that the RunEngine is the only way to mint documents.
docs/source/data-model.rst
Outdated
by the RunEngine during plan execution. All of the metadata and data generated | ||
by executing the plan is organized into documents. | ||
|
||
A :doc:"later section <callbacks>" describes how outside functions can |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the bluesky internal references of the kind I think we should just remove. Maybe this whole paragraph can go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked the mention of subscribing to a document stream. It convinces me that this is very useful information. I propose to link to the Bluesky documentation:
The `bluesky documentation <https://blueskyproject.io/bluesky/>`_ describes
how outside functions can "subscribe" to a stream of these documents,
visualizing, processing, or saving them. This section provides an outline of
documents themselves, aiming to give a sense of the structure and familiarity
with useful components.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes a lot of sense.
docs/source/event-descriptors.rst
Outdated
Event Descriptors | ||
================= | ||
|
||
In the section on :doc:`documents`, we gave an overview of the four kinds of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the section on :doc:`documents`, we gave an overview of the four kinds of | |
In the section on :doc:`data-model`, we gave an overview of the types of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check
docs/source/event-descriptors.rst
Outdated
================= | ||
|
||
In the section on :doc:`documents`, we gave an overview of the four kinds of | ||
document. We presented an example Run Start, Event, and Run Stop, but we |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This second sentence is no longer accurate. We do present an Event Descriptor example. We just down break it down and discuss it in detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check
docs/source/event-descriptors.rst
Outdated
|
||
.. code-block:: python | ||
|
||
# 'event' document (same as above, shown again for reference) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"above" is out of date --- this is own its own page
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check
docs/source/data-model.rst
Outdated
Overview | ||
======== | ||
|
||
The data model is composed of six types of Documents, which in Python are | ||
represented as dictionaries but could be represented as nested mappings (e.g. | ||
JSON) in any language. Each document class has a defined, but flexible, schema. | ||
|
||
*begin remove* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised you marked this for removal. I figure it's good to have a very short summary of the meaning of each document somewhere. As someone coming to this fresh, do you think this is so succinct that it doesn't add much value, that we might as well go straight to the more detailed description?
I would be happy with any of the following:
- Drop bluesky's version of this overview and keep event-model's.
- Merge some of event-model's verbiage (e.g. "the who / why / what") into bluesky's.
It depends on how much detail we want to present this early.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked the similar summary of each document from documents.rst because there was a lot of detail, but I see the value of a shorter summary for each document. I will drop the bluesky version.
docs/source/data-model.rst
Outdated
|
||
generates a typical 'run start' document like this: | ||
|
||
.. code-block:: python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is older and less accurate that the version in event-model. We should keep the version in event-model. We could however reincorporate the example bluesky code if that seems helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check
In the "Overview" section at the top of data-model.rst there is no summary for resource or datum documents. @danielballan if you would like to write those I will put them in. |
The first sentence in the Overview section is "The data model is composed of six types of Documents...." so I feel let down because the list has only four. I would like to move the Resource and Datum documents up. |
Sounds good. |
"Reference" is not much better than "Pointer" in terms of clarity. It's not really a It is "A bundle of data you can trade with another system to get the data you really want" (which is how I understand more-than-one-reference-deep pointers in c++ 😜 ). |
I'm with @prjemian that "pointer" has the potential to confuse sophisticated programmers who, in first encountering our work, provisionally map "pointer" to "memory reference" and need more context to be sure that's not what we mean. I have some empirical evidence for this: I avoided using that term at first, then tried using it for awhile, and found that it confused people. Maybe "record that points to" in place of "pointer to"? |
I suggest opening a new issue on the topic of "pointer" vs "reference" vs something else since this PR is focused on moving some specific documentation from the bluesky repo to the event-model repo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, this is getting close. Just a couple picky comments to make sure all our statements are up to date and not misleading in any way.
docs/source/data-model.rst
Outdated
rich metadata alongside measured data for use in later analysis. Documents are | ||
how we do this. | ||
|
||
A *document* is our term for a Python dictionary with a schema created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This language should be updated since the RunEngine is now not the only way to create documents. Something like:
A document is our term for a Python dictionary with a schema. The bluesky RunEngine emits documents during plan execution.
We might want to elaborate:
There are tools for composing documents directly, which can be useful for ingesting data that was not acquired with bluesky into this data model.
But that might be too much detail for this section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am inclined to leave out the elaboration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that is too much for this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great. I lean that way as well, though I included it for consideration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the chance to review. Need another view before I can approve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the one additional change (Globally unique ID of this run's 'start' document.), then I'm good.
docs/source/data-model.rst
Outdated
rich metadata alongside measured data for use in later analysis. Documents are | ||
how we do this. | ||
|
||
A *document* is our term for a Python dictionary with a schema created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that is too much for this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's much better!
Sorry @prjemian I don't know why that change wasn't showing up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for driving this, @jklynch!
Move event-model documentation from bluesky to event-model.
Description
Merge https://blueskyproject.io/bluesky/documents.html into https://blueskyproject.io/event-model/data-model.html.
Move https://nsls-ii.github.io/bluesky/event_descriptors.html into the event-model project.
Motivation and Context
This PR addresses issue #78.
How Has This Been Tested?