I cannot make a pull request for this section, because I am on mobile internet and cannot afford to clone the stratosphere.github.io repository. I have pasted my text below. Sorry Ufuk, for causing additional work.
Introduction
Analysis programs in Stratosphere are regular Java programs that implement transformations on data sets (e.g., filtering, mapping, joining, grouping). The data sets are initially created from certain sources (e.g., by reading files, or from collections). The results are returned by sinks, which may for example write the data to (distributed) files or print it to the command line. The sections on the program skeleton and transformations show the general template of a program and describe the available transformations.
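To make the source / transformation / sink structure concrete, here is a minimal sketch. It deliberately uses plain `java.util.stream` from the JDK as a stand-in and not the Stratosphere Java API; the class name `PipelineSketch` and the sample words are made up for illustration. The real API calls belong in the program skeleton section.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Analogy only: JDK streams, NOT the Stratosphere API.
public class PipelineSketch {

    // Source: a data set created from an in-memory collection.
    static List<String> source() {
        return List.of("apple", "avocado", "banana", "blueberry", "cherry", "fig");
    }

    // Transformations: filter out short words, map to upper case,
    // then group by first letter (TreeMap keeps the keys sorted).
    static Map<Character, List<String>> transform(List<String> words) {
        return words.stream()
                .filter(w -> w.length() > 3)
                .map(String::toUpperCase)
                .collect(Collectors.groupingBy(
                        w -> w.charAt(0), TreeMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        // Sink: print the result to the command line.
        transform(source()).forEach(
                (letter, group) -> System.out.println(letter + " -> " + group));
    }
}
```

A sink that writes to a file would replace only the last step; the source and the transformations stay the same.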
Stratosphere programs can run in a variety of contexts, for example locally as standalone programs, locally embedded in other programs, or on clusters of many machines (see [program skeleton] for how to define different environments). All programs are executed lazily: when the program is run and a transformation method is invoked on a data set, it only creates a specific transformation operation. That operation is executed only once program execution is triggered on the environment. Whether the program is executed locally or on a cluster depends on the environment of the program.
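Lazy execution can be surprising at first, so a small illustration may help. The sketch below again uses JDK streams as an analogy (they are also lazy) and not the Stratosphere API; `LazySketch` and the log messages are hypothetical. The terminal `collect` call plays the role of triggering execution on the environment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Analogy only: JDK streams, NOT the Stratosphere API.
public class LazySketch {

    // Returns a log of the order in which things actually happen.
    static List<String> run() {
        List<String> log = new ArrayList<>();

        // Invoking the transformation only describes the operation.
        Stream<Integer> doubled = Stream.of(1, 2, 3)
                .map(x -> {
                    log.add("map " + x);
                    return x * 2;
                });
        log.add("plan defined");

        // The terminal operation triggers the actual execution.
        List<Integer> result = doubled.collect(Collectors.toList());
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The log starts with "plan defined" and only then lists the `map` calls, which shows that defining the transformation did not run it.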
In contrast to Stratosphere's Record API, the Java API is strongly typed: all data sets and transformations accept typed elements rather than generic records. This allows typing errors to be caught very early and supports safe refactoring of programs.
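The difference between generic records and typed elements could be sketched as follows. This is plain Java with made-up names (`TypedSketch`, `WordCount`, and an `Object[]` standing in for a generic record); it does not reproduce the Record API or the Java API, it only shows why typing errors surface earlier.

```java
import java.util.List;

// Illustration only: hypothetical types, NOT the Stratosphere API.
public class TypedSketch {

    // A typed element: field names and types are known to the compiler.
    static final class WordCount {
        final String word;
        final int count;

        WordCount(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    // Generic record style: fields are addressed by position and cast.
    // A wrong position or type is only detected at runtime.
    static int totalUntyped(List<Object[]> records) {
        int sum = 0;
        for (Object[] r : records) {
            sum += (Integer) r[1]; // ClassCastException if r[1] is not an Integer
        }
        return sum;
    }

    // Typed style: writing c.word here instead of c.count would not compile.
    static int totalTyped(List<WordCount> counts) {
        return counts.stream().mapToInt(c -> c.count).sum();
    }

    public static void main(String[] args) {
        System.out.println(totalTyped(
                List.of(new WordCount("a", 2), new WordCount("b", 3))));
    }
}
```

Renaming `count` in the typed variant is a refactoring the compiler and IDE can follow, while positional access in the untyped variant has to be checked by hand.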
I tried to keep it short (I guess people do not want to read a lot). If you find something non-intuitive or if you think something is missing, please comment.