background start
viktoralmqvist committed May 7, 2012
1 parent 4ed43b2 commit 34095d3
Showing 1 changed file with 93 additions and 21 deletions.
114 changes: 93 additions & 21 deletions thesis/Chapters/Introduction.tex
@@ -11,42 +11,113 @@ \chapter{Introduction}

\highlight{something}


\section{Background}
To run efficiently on modern supercomputers and computer clusters,
applications need strong scaling. Applications that lack it cannot
use the available resources to reach the highest possible
performance.
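
As a brief illustration (the symbols $T(N)$, $S(N)$ and $E(N)$ are
introduced here only for this purpose), strong scaling can be
quantified by the speedup and parallel efficiency obtained when a
problem of fixed size is run on $N$ cores instead of one:
\begin{equation}
  S(N) = \frac{T(1)}{T(N)}, \qquad E(N) = \frac{S(N)}{N},
\end{equation}
where $T(N)$ is the wall-clock time on $N$ cores. Ideal strong scaling
corresponds to $S(N) = N$, that is $E(N) = 1$. For an application that
stops scaling, $E(N)$ falls further below one as $N$ grows.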

%Copernicus paper

%Many interesting real-world applications (all that are not
%embarrassingly parallel) require some interprocess communication for
%scaling and are therefore limited both by the availability of this
%bandwidth as well as the total amount of resources for high absolute
%performance.


Molecular dynamics simulations are limited in this way, but many of
them are statistical in nature, which offers a way around the
problem. Because the results rely on sampling many individual
simulations, the workload can be distributed over supercomputers and
computer clusters. Parallelizing the simulations in this way gives a
large performance boost when a high number of cores is available.
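
As a rough sketch of this argument, assume that the $M$ individual
simulations are fully independent, that each one runs well on $n$
cores, and that the $N$ available cores are a multiple of $n$ (all
symbols are introduced here for illustration only). Then $N/n$
simulations can run at the same time, and the whole ensemble takes
approximately
\begin{equation}
  T_{\mathrm{ensemble}} \approx
    \left\lceil \frac{M\,n}{N} \right\rceil T_{\mathrm{sim}}(n),
\end{equation}
where $T_{\mathrm{sim}}(n)$ is the wall-clock time of one simulation
on $n$ cores. Under these assumptions all cores stay busy as long as
$M\,n \geq N$, even when a single simulation cannot use more than $n$
cores efficiently.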

%Molecular dynamics simulations pose significant computational
%challenges. The systems are big enough to be parallelized, with
%100-500 particles assigned to each core in high-performance molecular
%dynamics (MD) packages such as Gromacs [10, 17] when run on a system
%with sufficiently low interconnect latency.


Clouds are one way to run computations on high-performance computer
systems. \cite{foster:2008} defines Clouds as:

\begin{quote} \slshape
A large-scale distributed computing paradigm that is driven by
economies of scale, in which a pool of abstracted, virtualized,
dynamically-scalable, managed computing power, storage, platforms,
  and services are delivered on demand to external customers over
the Internet.
\end{quote}

The resources are opaque to the user, who runs and uses the system
through a pre-defined API. The system can therefore contain different
kinds of computing resources without affecting the user. To benefit
from the possible performance boost, molecular dynamics simulations
running on a Cloud need a high degree of parallelization, as
described above.

%In a Cloud, different levels of services can be offered to an end
%user, the user is only exposed to a pre-defined API, and the lower
%level resources are opaque to the user...

\highlight{computations with potential for strong scaling, sampling
molecular simulations}

\highlight{does not use available power}

%Cloud Computing and Grid Computing 360-Degree Compared:

%''Nevertheless, yes: the problems are mostly the same in Clouds and
%In this paper, we show that Clouds and Grids share a lot commonality
%in their vision, architecture and technology, but they also differ in
%various aspects such as security, programming model, business model,
%compute model, data model, applications, and abstractions.

%Nevertheless, yes: the problems are mostly the same in Clouds and
%Grids. There is a common need to be able to manage large facilities;
%to define methods by which consumers discover, request, and use
%resources provided by the central facilities; and to implement the
%often highly parallel computations that execute on those resources.''
%often highly parallel computations that execute on those resources.

%Another challenge that virtualization brings to Clouds is the
%potential difficulty in fine-control over the monitoring of
%resources.

%PROVENANCE

%Provenance refers to the derivation history of a data product,
%including all the data sources, intermediate data products, and the
%procedures that were applied to produce the data product.

%On the other hand, Clouds are becoming the future playground for
%e-science research, and provenance management is extremely important
%in order to track the processes and support the reproducibility of
%scientific results.

%''Provenance is still an unexplored area in Cloud environments, in
%Provenance is still an unexplored area in Cloud environments, in
%which we need to deal with even more challenging issues such as
%tracking data production across different service providers (with
%different platform visibility and access policies) and across
%different software and hardware abstraction layers within on
%provider.''
%different software and hardware abstraction layers within one
%provider.

%PROGRAMMING MODEL

%Copernicus paper
%More specifically, a workflow system allows the composition of
%individual (single step) components into a complex dependency graph,
%and it governs the flow of data and/or control through these
%components.

%Many interesting real-world applications (all that are not
%embarrassingly parallel) require some interprocess communication for
%scaling and are therefore limited both by the availability of this
%bandwidth as well as the total amount of resources for high absolute
%performance.

%Molecular dynamics simulations pose significant computational
%challenges. The systems are big enough to be parallelized, with
%100-500 particles assigned to each core in high-performance molecular
%dynamics (MD) packages such as Gromacs [10, 17] when run on a system
%with sufficiently low interconnect latency.

%The data Grid...

%In an increasing number of scientific disciplines, large data
%collections are emerging as important community resources.


One solution for parallelizing molecular simulations is called
Copernicus.


\subsection{Copernicus}
Copernicus is a software system that is made to distribute and
@@ -66,7 +137,7 @@ \subsection{Copernicus}
while keeping the performance advantages of massively parallel
simulations. Such computations are called projects in the system.

\begin{quote}
\begin{quote} \slshape
A project is executed as a single job, but breaks it up into coupled
individual parallel simulations over all available computational
  resources, with the single simulation as the individual work
@@ -79,7 +150,7 @@ \subsection{Copernicus}
To handle projects with many simulations as a single entity Copernicus
needs to be able to
\renewcommand{\labelitemi}{-}
\begin{itemize}
\begin{itemize} \slshape
\item match and distribute the individual simulations to the available
computational resources,
\item run simulations on a variety of remote platforms simultaneously:
@@ -117,6 +188,7 @@ \subsection{Copernicus}

%There are primitive types like files, strings, ints, etc. There are
%also compound types lists, dictionaries and function types
\highlight{types, monitoring, provenance?}

The problem with Copernicus was the lack of a good way to describe
projects. There was no intuitive way of giving input to the system,