Commit

Address feedback/comments from Wil
fritzm committed Jul 7, 2017
1 parent 5822be6 commit af68c2d
Showing 1 changed file with 25 additions and 20 deletions.
45 changes: 25 additions & 20 deletions LDM-135.tex
@@ -158,11 +158,9 @@ \section{Introduction}\label{introduction}

\section{Requirements}\label{requirements}

-The key requirements driving the LSST database architecture include:
-incremental scaling, near-real-time response time for ad-hoc simple user
-queries, fast turnaround for full-sky scans/correlations, reliability,
-and low cost, all at multi-petabyte scale. These requirements are
-primarily driven by the ad-hoc user query access.
+Formal DM database requirements are called out in \citeds{LDM-555}.
+For purposes of exposition, this section summarizes some of the
+key requirements which drive the LSST database architecture.

\subsection{General Requirements}\label{general-requirements}

@@ -227,11 +225,7 @@ \subsection{Query Access Related Requirements}\label{query-access-related-requir
The Science Data Archive Data Release query load is defined primarily in
terms of access to the large catalogs in the archive: Object, Source,
and ForcedSource. Queries to image metadata, for example, though
-numerous, are expected to be fast and can easily be handled by
-replicating the relatively small metadata tables.
-
-Specific query requirements are called out in \citeds{LDM-555}; in general
-the following are required:
+numerous, are expected to be fast. In general the following are required:

\textbf{Reproducibility}. Queries executed on any Level 1 and Level 2
data products must be reproducible.
@@ -250,7 +244,7 @@ \subsection{Query Access Related Requirements}\label{query-access-related-requir

\textbf{Cross-matching with external/user data}. Occasionally, LSST
database catalog will need to be cross-matched with external catalogs:
-both large, such as SDSS, SKA or GAIA, and small, such as small amateur
+both large, such as SDSS, SKA, or Gaia, and small, such as small amateur
data sets. Users should be able to save results of their queries, and
access them during subsequent queries.

@@ -267,7 +261,7 @@ \subsection{Query Access Related Requirements}\label{query-access-related-requir

\subsection{Discussion}\label{discussion}

-\subsubsection{Implications}\label{implications}
+\subsubsection{Design Considerations}\label{design-consideration}

The above requirements have important implications on the LSST data
access architecture.
@@ -355,9 +349,10 @@ \section{Baseline Architecture}\label{baseline-architecture}

\begin{itemize}
\item
-The LSST baseline architecture for Alert Production is an off-the-shelf
-RDBMS system which uses replication for fault tolerance and which takes
-advantage of horizontal (time-based) partitioning;
+The LSST baseline architecture for Alert Production is a (yet to be
+selected) off-the-shelf RDBMS system which uses replication for fault
+tolerance and which takes advantage of horizontal (time-based)
+partitioning;
\item
The baseline architecture for user access to Data Releases is an MPP
(multi-processor, parallel) relational database running on a
@@ -1203,7 +1198,7 @@ \subsection{Data Distribution}\label{data-distribution}
\subsubsection{Database data
distribution}\label{database-data-distribution}

-The baseline database system will provide access for two database
+The baseline database system will provide access for at least two database
releases: latest and previous. Data for each release will be spread out
among all nodes in the cluster.

@@ -1579,8 +1574,8 @@ \subsubsection{Implementation}\label{shared-scan-implementation}
single user query, and the impact is amortized among all disks on all
workers.

-For discussion about the performance of the existing prototype, refer
-\citeds{DMTR-16}.
+For discussion about the performance of the current implementation, refer
+to \citeds{DMTR-16}.

\subsubsection{Memory management}\label{shared-scan-memory-management}

@@ -1993,7 +1988,9 @@ \subsection{Current Status and Future
\item
resource management;
\item
-security.
+security;
+\item
+early engagement with astronomy users.
\end{itemize}

\textbf{Automatic data distribution and replication}. We have experimented
@@ -2042,6 +2039,14 @@ \subsection{Current Status and Future
\textbf{Security}. The system needs to be secure and resilient against
denial of service attacks.

+\textbf{Early engagement with astronomy users}. It is important that we
+engage members of our target user community early enough that we have
+time to act on their feedback about what we are building. Does the system
+have the capabilities they need and expect? Is the query syntax usable and
+practical for them? We have begun some work in this area through activities
+in the PDAC (Prototype Data Access Center) cluster at NCSA with a very
+limited audience, and plan to expand that audience in upcoming months.
+
\subsection{Open Issues}\label{open-issues}

What follows is a (non-exhaustive) list of issues, technical and scientific,
@@ -2195,7 +2200,7 @@ \subsection{Potential Key Risks}\label{potential-key-risks}
DM-075: New SRD requirements require new DM functionality
\end{itemize}

-\subsection{Risks Mitigations}\label{risks-mitigations}
+\subsection{Risk Mitigations}\label{risk-mitigations}

To mitigate the insufficient performance/scalability risk, we developed Qserv,
and demonstrated scalability and performance. In addition, to increase chances