New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-13164: Update documentation for ap_pipe #22
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments on the more technical aspects of the documentation. I'm a bit concerned that it's not really topic-based documentation, but I assume that's partly to get a tutorial out there ASAP.
doc/ap_pipe/index.rst
Outdated
https://github.com/lsst-dm/ap_pipe | ||
|
||
JIRA component | ||
``ap_pipe`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Links are your (users') friend: `ap_pipe <https://jira.lsstcorp.org/issues/?jql=project %3D DM AND component %3D ap_pipe>`_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there... supposed to be spaces in that URL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will work on any modern browser, and IMO it's more readable. You can use %20 if you really want to, the template does.
`lsst.ip.diffim`; and | ||
3. `~lsst.ap.associate.AssociationTask`, which makes a catalog of | ||
Difference Image Analysis (DIA) Objects from the DIASources created | ||
during image differencing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implies that is must be these specific tasks, which is not the case -- can you make this more general (say, by writing "CCD processing (e.g., by `~lsst.pipe.tasks.ProcessCcdTask`)")?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing some subtlety here, but isn't the stack's CCD processing task pretty much forever and always ProcessCcdTask
? It's not like we have a whole suite of CCD processing task options one can choose from when running ap_pipe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if we don't now, we could in the future (especially for the sub-sub-tasks you mention).
doc/lsst.ap.pipe/getting-started.rst
Outdated
can be somewhat camera-specific, and is outside the scope of the AP Pipeline. | ||
|
||
A utility to ingest data before running `lsst.ap.pipe` | ||
is available in `lsst.ap.verify.ingestion`. Alternately, you may use a pre- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not true; perhaps you're thinking of ap_verify/bin/ingest_dataset.py
? Link as :ref:`ap-verify-run-ingest`.
It might also be worth noting that the ap_verify
script only works on Reiss-style datasets (:ref:`ap-verify-datasets`); the current wording makes it sound like it will magically ingest anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I was looking at the filename ingestion.py
, but I will clarify both the name and say the ingestion script in ap_verify only ingests a "dataset" of the proper format.
doc/lsst.ap.pipe/getting-started.rst
Outdated
|
||
`lsst.ap.pipe` is not yet included with the LSST Science Pipelines stack. | ||
You will need to clone, setup, and build it by following the directions | ||
`here <https://pipelines.lsst.io/install/package-development.html>`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A link to the GitHub repo might be useful here and for ap_association
.
directly interface with data files. `lsst.ap.pipe` is no exception. | ||
The process of turning a set of raw data files and corresponding calibration | ||
products into a format the Butler understands is called ingestion. Ingestion | ||
can be somewhat camera-specific, and is outside the scope of the AP Pipeline. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explaining what ingestion is and that it's camera-specific violates the "assume the reader is qualified" principle encouraged by DMTN-030. Consider a placeholder link and/or a TODO:
comment for fixing this once we have a proper processing topic for ingestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add a TODO comment. I appreciate your point about the "assume the reader is qualified" principle, which I usually interpret a bit more as "don't underestimate intelligence and don't overestimate expertise." At any rate, I agree that once ingestion docs exist, this paragraph can certainly be cut or linked away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's not the clearest name. I think Jonathan Sick explicitly said in a Community post that the "qualified" thing is about not including background.
|
||
Until a resolution for `DM-12672 <https://jira.lsstcorp.org/browse/DM-12672>`_ | ||
is found, you should include a filter in the ``dataId`` string for | ||
``ap_pipe`` to run successfully. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a good idea to link to Jira issues from user-facing documentation, especially in cases like this where a user who doesn't have Jira access will have no way of figuring out what's going on. Why not simply say that some instruments may require you to explicitly give a filter, that this is a known issue, but that you don't know when it will be fixed?
deepDiff/ | ||
v123456/ <--- difference images and DIASource tables are in here | ||
schema/ | ||
123456/ <--- all other processed data products are in here (calexps etc.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this (for the reasons you give in the following paragraph).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I find it instructive to have a directory structure to stare at when typing things into the command line or the Butler. I will clarify that this is one example of how the output or rerun directory may be structured and omit the config
and schema
subdirectories that users don't need to worry about in this tutorial.
.. code-block:: python | ||
|
||
import lsst.daf.persistence as dafPersist | ||
butler = dafPersist.Butler('repo') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this needs to be dafPersist.Butler(inputs='repo/rerun/processed')
-- I remember hearing that the Butler API does not support reruns, that they're only a CmdLineTask
thing.
.. _section-configuration-options: | ||
|
||
Configuration options | ||
===================== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kind of a confusing section, because it combines different sorts of things. Why not move the subsections to top-level sections of their own? (I guess "previewing data IDs" might be easiest to find if it went in the main tutorial?)
ups/ap_pipe.table
Outdated
@@ -13,6 +13,8 @@ setupRequired(ndarray) | |||
setupRequired(obs_decam) | |||
setupRequired(ap_association) | |||
|
|||
envPrepend(PYTHONPATH, ${PRODUCT_DIR}/bin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean PATH
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably did! That's probably why it doesn't work as I thought it should 😆
doc/lsst.ap.pipe/getting-started.rst
Outdated
`lsst.ap.association` is a prerequisite for `lsst.ap.pipe` and is also | ||
not currently part of the LSST Stack. You must clone ``ap_association`` | ||
and set it up first. | ||
`lsst.ap.association`, available `here <https://github.com/lsst-dm/ap_association>`_, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just make lsst.ap.association
the link text? It's cleaner and more web-idiomatic than "click here!".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't lsst.ap.association
eventually link to the docs on that package? I didn't want to break that future auto-linkage with a manual GitHub link.
Includes note about decompressing and extracting certain files, note about decam defect ingestion, and consistent section headers. Also adds ap_pipe boilerplate to ups table file and updates the placeholder unit test to point to the correct ticket.
No description provided.