First semi complete
michielbaird committed May 6, 2012
1 parent dc1ce24 commit ded3c17
Showing 2 changed files with 164 additions and 8 deletions.
117 changes: 110 additions & 7 deletions writeup/litsynth/writeup.tex
@@ -1,5 +1,6 @@
\documentclass[11pt,twocolumn]{article}
\usepackage[cm]{fullpage}
\usepackage{paralist}


\title{Scientific Workbench and Workflow management for GIS workflow}
@@ -15,13 +16,23 @@
\begin{document}
\maketitle
\begin{abstract}
This Literature synthesis does an overview of what has been done
in the field of building a Scientific Workbench and Automated
Workflow management. It focusses on implementation methods and
case studies on where it has been implemented.
Automated workflow systems have been successfully implemented
across various disciplines, including scientific and business
workflows. This is an overview of what has been done
in the field of building these systems, paying special attention
to building a Scientific Workbench. It focuses on what
has been done in the past and highlights some of the methods used
as well as the lessons learned during these implementations.

It also looks at how these principles could be applied
specifically to GIS workflow by giving an overview of the
structure of the field. It seeks to find an appropriate
mapping to these systems using known principles of SOA and
grid computing.

This then illustrates that this solution is highly applicable
for GIS workflow.
to GIS workflow, provided the necessary middleware can be built
to facilitate integration.
\end{abstract}
\section{Introduction}
Automated workflow management has been in wide use across
@@ -34,9 +45,17 @@ \section{Introduction}
This has been very successful in the field of science as
the process can be rerun on different sets of data.\cite{4721191}
This not only aids in reproducibility but also gives
clear direction and saves time.
clear direction and saves time. This is done by efficiently
abstracting the operations of the flow and allowing it
to be automatically handled.

WRITE SHORT PIECE ON GIS
Geographic Information Systems (GIS) is the field that
concerns itself with the organisation and representation
of geographic data, for the purpose of querying it and
making considered decisions based on the data
\cite{DiMartino:2007:TAG:1341012.1341081}. The field
therefore seems to lend itself to being
handled effectively by a workflow system.

\section{Overview}
A workflow management system consists of definitions
@@ -197,8 +216,92 @@ \section{Implementations}


\section{Case Studies}
This section looks at two instances where
workflow management systems were implemented and used.
The case studies cover both a business and a
scientific application.
\subsection*{Danske Bank}
The workflow management system at \emph{Danske Bank} was
implemented incrementally as the bank moved away
from a manual system.

The system was introduced gradually, once assembling client
packages became too time-consuming for the customer
advisors and was suboptimal for the customers.
The manual process was replaced by a document that contained
a description of the package. This document was then sent
to back office workers, who assembled the packages
from it, as a sort of primitive workflow system.
This slowly developed to include web services
and almost full automation. It then rapidly expanded
to almost every function of the bank and increased
productivity drastically.

Several lessons were learned during the process that
are applicable to other workflow systems. When work
was divided purely from an efficiency point of view,
the workers became complacent, as they felt that they
did not understand the overall mechanism and
were not involved. The bank also discovered that the
system did not handle change very well. Since change
was as expensive as it was inevitable, the system had
to be adapted to handle it. The success of the
system is mainly attributed to its interruptibility and to the
close relationship between the users and the developers
\cite{Brahe:2007:SWW:1316624.1316661}.

\subsection*{OrthoSearch}
\emph{OrthoSearch} is a workflow
built on \emph{Kepler} that is designed to
work on data in the field of bioinformatics. It
was used in the discipline of \emph{Neglected Diseases},
which have the potential to kill millions of people.

A pipeline was created to represent the workflow
of this research. This pipeline was then implemented
using \emph{Perl} scripts. This solution did not meet
some of the original requirements, nor did it provide
the desired level of abstraction.

The system was moved to \emph{Kepler} as it addressed
the requirements better, including: \begin{inparaenum}[(i)]
\item workflow definition and design; \item workflow execution
control; \item fault tolerance; \item intermediate data management;
and \item data provenance support. \end{inparaenum}

This approach turned out to be much more successful. With
its flexibility, stability and grid-enabled features, it
addressed most of the problems that the manual system had.

Although the system was not without its hiccups and changes,
the integration with Kepler provided the workflow with
much-needed direction and increased overall productivity
drastically \cite{daCruz:2008:OSW:1363686.1363983}.


\section{Conclusion}
The field of GIS concerns itself with a vast amount of geographic
data. This data comes in various sizes, and as such different
methods of handling and transferring it would need to be used to
facilitate dataflows within the system. It was also found that
there are a large number of transformations that a workflow would
need to support.

The work, however, is done in a very distributed manner, which allows
for a very effective mapping onto a grid-based computing solution,
provided middleware can be developed to support the systems that
are used. This would allow for an effective Content Delivery Network
that provides data on demand where it is needed on the grid
\cite{Montella:2007:UGC:1272980.1272995}.

A workflow system would appear to be highly effective in this
field, as it supports the nature of the science extremely well.
It would allow for effective automation of some of the
functions and would be able to remove a large number of the
problems associated with content delivery
\cite{Withana:2010:VWE:1851476.1851586}.




\bibliography{../references}{}
55 changes: 54 additions & 1 deletion writeup/references.bib
@@ -63,7 +63,7 @@ @article{Suchman:1983:OPP:357442.357445
}

@article{vanderAalst2002125,
title = "Inheritance of workflows: an approach to tackling problems related to change",
title = "Inheritance of workflows: an approach to tackling pdaCruz:2008:OSW:1363686.136398oroblems related to change",
journal = "Theoretical Computer Science",
volume = "270",
number = "1–2",
@@ -335,3 +335,56 @@ @inproceedings{Sanders:2008:SSA:1400549.1400595
keywords = {DoDAF, MoDAF, TOGAF, service-oriented architecture},
}

@inproceedings{daCruz:2008:OSW:1363686.1363983,
author = {da Cruz, Sergio Manuel Serra and Batista, Vanessa and D\'{a}vila, Alberto M. R. and Silva, Edno and Tosta, Frederico and Vilela, Clarissa and Campos, Maria Luiza M. and Cuadrat, Rafael and Tschoeke, Diogo and Mattoso, Marta},
title = {OrthoSearch: a scientific workflow approach to detect distant homologies on protozoans},
booktitle = {Proceedings of the 2008 ACM symposium on Applied computing},
series = {SAC '08},
year = {2008},
isbn = {978-1-59593-753-7},
location = {Fortaleza, Ceara, Brazil},
pages = {1282--1286},
numpages = {5},
url = {http://doi.acm.org/10.1145/1363686.1363983},
doi = {10.1145/1363686.1363983},
acmid = {1363983},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {bioinformatics, provenance, scientific workflows},
}

@inproceedings{Montella:2007:UGC:1272980.1272995,
author = {Montella, Raffaele and Giunta, Giulio and Riccio, Angelo},
title = {Using grid computing based components in on demand environmental data delivery},
booktitle = {Proceedings of the second workshop on Use of P2P, GRID and agents for the development of content networks},
series = {UPGRADE '07},
year = {2007},
isbn = {978-1-59593-718-6},
location = {Monterey, California, USA},
pages = {81--86},
numpages = {6},
url = {http://doi.acm.org/10.1145/1272980.1272995},
doi = {10.1145/1272980.1272995},
acmid = {1272995},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {environmental data distribution, grid computing, resource broking},
}

@inproceedings{Withana:2010:VWE:1851476.1851586,
author = {Withana, Eran Chinthaka and Plale, Beth and Barga, Roger and Araujo, Nelson},
title = {Versioning for workflow evolution},
booktitle = {Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing},
series = {HPDC '10},
year = {2010},
isbn = {978-1-60558-942-8},
location = {Chicago, Illinois},
pages = {756--765},
numpages = {10},
url = {http://doi.acm.org/10.1145/1851476.1851586},
doi = {10.1145/1851476.1851586},
acmid = {1851586},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {evolution, versioning, workflows},
}
