FALCON-1106 Documentation for extensions
Author: Sowmya Ramesh <sramesh@hortonworks.com>
Reviewers: Balu Vellanki <balu@apache.org>, Ying Zheng <yzheng@hortonworks.com>
Closes #120 from sowmyaramesh/FALCON-1106
1 parent fc34d42, commit 85345ad7e7421fbd25829381f27eb5b165d2f8d0
Showing 19 changed files with 442 additions and 262 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
---+ Falcon Extensions | ||
|
||
---++ Overview | ||
|
||
A Falcon extension is a static process template with a parameterized workflow that realizes a specific use case and lets non-programmers capture and reuse complex business logic. Extensions are defined in server space. The objective of an extension is to solve a standard data management function that can be invoked as a tool through the standard Falcon interfaces (REST API, CLI and UI). | ||
|
||
For example: | ||
|
||
* Replicating directories from one HDFS cluster to another (not timed partitions) | ||
* Replicating Hive metadata (databases, tables, views, etc.) | ||
* Replicating between HDFS and Hive, in either direction | ||
* Data masking etc. | ||
|
||
---++ Proposal | ||
|
||
Falcon provides a Process abstraction that encapsulates the configuration for a user workflow with scheduling controls. Every extension can be modeled within Falcon as a Process and its dependent feeds, which executes the user | ||
workflow periodically. The process and its associated workflow are parameterized. The user provides properties as <name, value> pairs, which Falcon substitutes before scheduling the process. Falcon translates an extension | ||
into a process entity by replacing the parameters in the workflow definition. | ||
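The substitution step described above can be sketched in a few lines of shell. This is only an illustration, not Falcon's implementation: the `##name##` placeholder syntax, the `render_template` function, and the one-pair-per-line `name=value` properties format are all assumptions made for the example.

```shell
#!/bin/sh
# Illustrative sketch only; not Falcon's implementation.
# Assumptions: placeholders look like ##name##, property names are simple
# identifiers, and properties are one name=value pair per line.

# The body runs in a subshell "( ... )" so its variables stay local.
render_template() (
  template_file=$1
  props_file=$2
  sed_script=$(mktemp)
  while IFS='=' read -r name value || [ -n "$name" ]; do
    # Skip blank lines and comments.
    case $name in ''|'#'*) continue ;; esac
    # Escape characters that are special in a sed replacement.
    escaped=$(printf '%s' "$value" | sed 's/[\\&|]/\\&/g')
    printf 's|##%s##|%s|g\n' "$name" "$escaped" >> "$sed_script"
  done < "$props_file"
  # Apply every substitution to the template and print the result.
  sed -f "$sed_script" "$template_file"
  rm -f "$sed_script"
)
```

Calling `render_template template.xml job.properties` (both file names are hypothetical) prints the template with each placeholder replaced by the matching property value. Placeholders without a matching property are left untouched.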
|
||
---++ Falcon extension artifacts to manage extensions | ||
|
||
Extension artifacts are published in addons/extensions. Artifacts are expected to be installed on HDFS at the "extension.store.uri" path defined in startup properties. Each extension is expected to have the following artifacts: | ||
* A JSON file under the META directory that lists all the required and optional parameters/arguments for scheduling an extension job | ||
* A process entity template to be scheduled, under the resources directory | ||
* A parameterized workflow under the resources directory | ||
* Required libs under the libs directory | ||
* A README describing the functionality achieved by the extension | ||
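As a rough aid, the artifact list above can be turned into a check against a local copy of one extension directory. The `check_extension_layout` function is hypothetical. The META, resources and libs directory names come from the list, while the `*.json` and `README*` file name patterns are assumptions.

```shell
#!/bin/sh
# Illustrative sketch only: checks a local copy of one extension directory.
# META, resources and libs come from the artifact list; the *.json and README*
# name patterns are assumptions.

# The body runs in a subshell "( ... )" so its variables stay local.
check_extension_layout() (
  ext_dir=$1
  missing=0
  # If the glob matches nothing it stays literal and the -e test fails.
  set -- "$ext_dir"/META/*.json
  [ -e "$1" ] || { echo "missing: META/*.json"; missing=1; }
  for dir in resources libs; do
    [ -d "$ext_dir/$dir" ] || { echo "missing: $dir/"; missing=1; }
  done
  set -- "$ext_dir"/README*
  [ -e "$1" ] || { echo "missing: README"; missing=1; }
  # Exit status is 0 only when nothing was missing.
  [ "$missing" -eq 0 ]
)
```

The function prints one line per missing artifact and returns a non-zero status if anything is missing, so it can be used in an `if` before copying an extension to HDFS.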
|
||
REST API and CLI support has been added for extension artifact management on HDFS. Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details. |
|
||
---++ CLI and REST API support | ||
REST API and CLI support has been added to manage extension jobs and instances. | ||
|
||
Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details on using the CLI and REST APIs to manage extension jobs and instances. |
|
||
---++ Metrics | ||
The HDFS mirroring and Hive mirroring extensions capture replication metrics such as TIMETAKEN, BYTESCOPIED and COPY (number of files copied) for an instance and populate them to the GraphDB. | ||
|
||
---++ Sample extensions | ||
|
||
Sample extensions are published in addons/extensions. | ||
|
||
---++ Types of extensions | ||
* [[HDFSMirroring][HDFS mirroring extension]] | ||
* [[HiveMirroring][Hive mirroring extension]] | ||
|
||
---++ Packaging and installation | ||
|
||
Extension artifacts in addons/extensions are packaged in the Falcon war under the extensions directory. For manual installation, the user is expected to copy the extension artifacts from the extensions directory in the Falcon war to HDFS at the "extension.store.uri" path defined in startup properties and then restart Falcon. | ||
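The manual installation could look roughly like the sketch below. It assumes the extension directories have already been extracted from the Falcon war into a local directory and that the target URI is the value of "extension.store.uri". The `install_extensions` function and its dry-run argument are hypothetical; `hdfs dfs -mkdir -p` and `hdfs dfs -put` are standard Hadoop file system shell commands.

```shell
#!/bin/sh
# Illustrative sketch only. Assumptions: the extension directories have already
# been extracted from the Falcon war into a local directory, and the target URI
# matches "extension.store.uri" in startup properties.

# Usage: install_extensions LOCAL_DIR STORE_URI [echo]
# Pass "echo" as the third argument to print the commands instead of running them.
# The body runs in a subshell "( ... )" so its variables stay local.
install_extensions() (
  local_dir=$1
  store_uri=$2
  runner=${3:-}
  $runner hdfs dfs -mkdir -p "$store_uri"
  # Copy each extension directory (one per subdirectory) to the store.
  for ext in "$local_dir"/*/; do
    [ -d "$ext" ] || continue
    $runner hdfs dfs -put "${ext%/}" "$store_uri/"
  done
)
```

Running it first with `echo` as the third argument shows the commands that would be issued. After the real copy, Falcon still has to be restarted as described above.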
|
||
---++ Migration | ||
The Recipes framework and the HDFS mirroring capability were added in the Apache Falcon 0.6.0 release as client-side logic. With the 0.10 release this logic moved to the server side and was renamed server-side extensions. Client-side recipes had only CLI support and required certain preparatory steps to work. These steps are no longer required in the 0.10 release, which provides new CLI and REST API support. | ||
|
||
If the user is migrating to the 0.10 release or above, the old Recipe setup and CLI commands will not work. For manual installation, the user is expected to copy the extension artifacts to HDFS. Please refer to the "Packaging and installation" section above for more details. | ||
Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details on using the CLI and REST APIs to manage extension jobs and instances. |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
---+ HDFS mirroring Extension | ||
---++ Overview | ||
Falcon supports an HDFS mirroring extension to replicate data from a source cluster to a destination cluster. The extension replicates arbitrary directories on HDFS and builds on the replication solution in Falcon, which uses the DistCp tool. It also allows users to replicate data from on-premise clusters to the cloud, either Azure WASB or S3. | ||
|
||
---++ Use Case | ||
* Copy directories between HDFS clusters without dated partitions | ||
* Archive directories from HDFS to the cloud, for example S3 or Azure WASB | ||
|
||
---++ Limitations | ||
As the data volume and the number of files grow, replication with this extension can become inefficient. | ||
|
||
---++ Usage | ||
---+++ Setup source and destination clusters | ||
<verbatim> | ||
$FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml | ||
</verbatim> | ||
|
||
---+++ HDFS mirroring extension properties | ||
Extension artifacts are expected to be installed on HDFS at the path specified by "extension.store.uri" in startup properties. The hdfs-mirroring-properties.json file, located at "<extension.store.uri>/hdfs-mirroring/META/hdfs-mirroring-properties.json", lists all the required and optional parameters/arguments for scheduling an HDFS mirroring job. | ||
|
||
---+++ Submit and schedule HDFS mirroring extension | ||
|
||
<verbatim> | ||
$FALCON_HOME/bin/falcon extension -submitAndSchedule -extensionName hdfs-mirroring -file /process/definition.xml | ||
</verbatim> | ||
|
||
Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details on using the CLI and REST APIs. |
This file was deleted.