Moving Genie 3 concepts to the reference documentation (#444)
tgianos committed Dec 22, 2016
1 parent cd7f1b8 commit 998f7bd
Showing 13 changed files with 1,660 additions and 1 deletion.
9 changes: 9 additions & 0 deletions genie-docs/src/docs/asciidoc/concepts/_concepts.adoc
@@ -0,0 +1,9 @@
== Concepts

include::_dataModel.adoc[]

include::_howItWorks.adoc[]

include::_netflixExample.adoc[]

include::_netflixDeployment.adoc[]
155 changes: 155 additions & 0 deletions genie-docs/src/docs/asciidoc/concepts/_dataModel.adoc
@@ -0,0 +1,155 @@
=== Data Model

The Genie 3 data model contains modifications and additions to the
https://netflix.github.io/genie/concepts/2/DataModel.html[Genie 2 data model] to enable even more flexibility,
modularity and metadata retention.

Several limitations of the Genie 2 data model were uncovered in its more than two years of production use at Netflix. A
few of those limitations were:

* Inability to link multiple applications to a command
* Lack of insight into what the original user request for a job contained
* Too much normalization, particularly between tags and jobs, causing inefficient joins and slow performance
* Lack of system data for job dimensions like attachment size, output size, etc.

The changes made to the data model will become apparent in the following sections.

==== Caveats

* What we called `entities` in Genie 2 we'll call `resources` in Genie 3 to decouple the concept from the database
implementation and frame it at the user API level
* The specific resource fields are *NOT* defined in this document. These fields are available in the
https://netflix.github.io/genie/docs/{revnumber}/rest/[REST API documentation]
** This documentation covers the higher level purposes of the resources and not their contents

==== Resources

The following sections describe the various resources available from the Genie REST APIs. You should reference the
https://netflix.github.io/genie/docs/{revnumber}/rest/[API Docs] for how to manipulate these resources. These sections
will focus on the high level purposes for each resource and how they rely and/or interact with other resources within
the system.

===== Tagging

One important aspect to comment on is tagging of resources. Genie relies heavily on tags for how the system discovers
resources like clusters and commands for a job. Each of the core resources has a set of tags that can be associated
with them. These tags can be set to whatever you want, but it is recommended to come up with some sort of consistent
structure for your tags to make it easier for users to understand their purpose. For example, at Netflix we've adopted
some of the following standards for our tags:

* `sched:{something}`
** This corresponds to any schedule that this resource (likely a cluster) is expected to adhere to
** e.g. `sched:sla` or `sched:adhoc`
* `type:{something}`
** e.g. `type:yarn` or `type:presto` for a cluster or `type:hive` or `type:spark` for a command
* `ver:{something}`
** The specific version of said resource
** e.g. two different Spark commands could have `ver:1.6.1` vs `ver:2.0.0`
* `data:{something}`
** Used to classify the type of data a resource (usually a command) will access
** e.g. `data:prod` or `data:test`
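
As a concrete illustration, here is a small sketch (not part of Genie itself) of how an organization might validate
that its tags follow a `{category}:{value}` convention before registering resources; the regular expression and helper
function are purely illustrative.

[source,python]
----
import re

# Illustrative convention check: "{category}:{value}" with a lowercase category and no whitespace.
TAG_PATTERN = re.compile(r"^[a-z]+:\S+$")

def validate_tags(tags):
    """Raise if any tag doesn't follow the {category}:{value} convention."""
    bad = [tag for tag in tags if not TAG_PATTERN.match(tag)]
    if bad:
        raise ValueError("Tags not matching convention: {}".format(bad))
    return tags

cluster_tags = validate_tags(["sched:sla", "type:yarn", "ver:2.7.1"])
command_tags = validate_tags(["type:sparksubmit", "ver:2.0.0", "data:prod"])
----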

===== Configuration Resources

The following resources (applications, commands and clusters) are considered configuration, or admin, resources.
They're generally set up by the Genie administrator and available to all users for use with their jobs.

====== Application Resource

An application resource represents pretty much what you'd expect. It is a reusable set of binaries, configuration files
and setup files that can be used to install and configure (surprise!) an application. Generally these resources are
used when an application isn't already installed and on the `PATH` on a Genie node.

When a job is run and applications are involved in running the job Genie will download all the dependencies,
configuration files and setup files of each application and store it all in the job working directory. It will then
execute the setup script in order to install that application for that job. Genie is "dumb" as to the contents or
purpose of any of these files so the onus is on the administrators to create and test these packages.

Applications are very useful for decoupling application binaries from a Genie deployment. For example you could deploy
a Genie cluster and change the version of Hadoop, Hive or Spark that Genie will download without actually re-deploying
Genie. Applications can be combined together via a command. This will be explained more in the Command section.

Applications are linked to commands in order for binaries and configurations to be downloaded and installed at
runtime. Within Netflix this is frequently used to deploy new clients without redeploying a Genie cluster.

At Netflix our applications frequently consist of a zip of all the binaries uploaded to S3 along with a setup file to
unzip them and configure environment variables for that application.

It is important to note that applications are entirely optional; if you prefer to just install all client binaries on
a Genie node beforehand you're free to do that. It will save overhead in job launch times, but you will lose
flexibility in the trade-off.
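
To make this concrete, below is a minimal sketch of registering a Spark application resource over the REST API using
Python's `requests` library. It assumes a Genie 3 server URL, the `/api/v3/applications` endpoint described in the
REST API documentation, and placeholder S3 locations; the field names and values are illustrative, so check the API
docs for the authoritative schema.

[source,python]
----
import requests

GENIE = "http://genie.example.com:8080"  # placeholder Genie 3 server URL

spark_app = {
    "id": "spark200",        # optional; Genie generates an id if omitted
    "name": "spark",
    "user": "genie-admin",
    "version": "2.0.0",
    "status": "ACTIVE",
    "tags": ["type:spark", "ver:2.0.0"],
    # Zipped binaries and a setup script uploaded somewhere Genie can reach (placeholder paths)
    "dependencies": ["s3://example-bucket/genie/applications/spark-2.0.0.tgz"],
    "setupFile": "s3://example-bucket/genie/applications/spark-2.0.0-setup.sh",
}

response = requests.post(GENIE + "/api/v3/applications", json=spark_app)
response.raise_for_status()
print(response.headers.get("Location"))  # URI of the newly created application resource
----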

====== Command Resource

Command resources primarily represent what a user would enter at the command line to run a process on a cluster, and
what dependencies (applications) need to be on the `PATH` to make that possible.

Commands can have configuration and setup files just like applications but primarily they should have an ordered list
of applications associated with them if necessary. For example, let's take a typical scenario involving running Hive. To
run Hive you generally need a few things:

. A cluster to run its processing on (we'll get to that in the Cluster section)
. A hive-site.xml file which specifies, amongst other settings, what metastore to connect to
. Hive binaries
. Hadoop configurations and binaries

So a typical setup for Hive in Genie would be to have one, or many, Hive commands configured. Each command would have
its own hive-site.xml pointing to a specific metastore (prod, test, etc). The command would depend on Hadoop and Hive
applications already configured which would have all the default Hadoop and Hive binaries and configurations. All this
would be combined in the job working directory in order to run Hive.

You can have any number of commands configured in the system. They should then be linked to the clusters they can
execute on. Clusters are explained next.
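
For illustration, here is a hedged sketch of registering one such Hive command via the REST API, again using Python's
`requests`; the server URL, S3 path and field values are placeholders, and the exact fields (e.g. `executable`,
`checkDelay`) should be confirmed against the REST API documentation.

[source,python]
----
import requests

GENIE = "http://genie.example.com:8080"  # placeholder Genie 3 server URL

# A Hive command pointing at the prod metastore via its own hive-site.xml.
hive_prod = {
    "id": "hive-prod",
    "name": "hive",
    "user": "genie-admin",
    "version": "1.2.1",
    "status": "ACTIVE",
    "executable": "hive",   # what goes on the command line for jobs using this command
    "checkDelay": 5000,     # assumed field: milliseconds between process status checks
    "tags": ["type:hive", "ver:1.2.1", "data:prod"],
    "configs": ["s3://example-bucket/genie/commands/hive-prod/hive-site.xml"],
}

requests.post(GENIE + "/api/v3/commands", json=hive_prod).raise_for_status()
----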

====== Cluster

A cluster stores all the details of an execution cluster including connection information, properties, etc. Some
cluster examples are Hadoop, Spark, Presto, etc. Every cluster can be linked to a set of commands that it can run.

Genie does *not* launch clusters for you. It merely stores metadata about the clusters you have running in your
environment so that jobs using commands and applications can connect to them.

Once a cluster has been linked to commands your Genie instance is ready to start running jobs. The job resources are
described in the following section. One important thing to note is that the list of commands linked to the cluster
is a priority ordered list. That means if you have two Pig commands available on your system for the same cluster the
first one found in the list will be chosen provided all tags match. See <<How it Works>> for more details.
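
Below is a sketch of registering a YARN cluster and linking it to a priority-ordered list of previously registered
commands. It assumes the `/api/v3/clusters` endpoint and a cluster-commands sub-resource as described in the REST API
documentation; all ids, paths and the server URL are placeholders.

[source,python]
----
import requests

GENIE = "http://genie.example.com:8080"  # placeholder Genie 3 server URL

prod_yarn = {
    "id": "prod-yarn",
    "name": "prod-yarn",
    "user": "genie-admin",
    "version": "2.7.1",
    "status": "UP",
    "tags": ["sched:sla", "type:yarn", "ver:2.7.1"],
    # Site configs that clients of this cluster need (placeholder paths)
    "configs": [
        "s3://example-bucket/genie/clusters/prod-yarn/core-site.xml",
        "s3://example-bucket/genie/clusters/prod-yarn/yarn-site.xml",
    ],
}
requests.post(GENIE + "/api/v3/clusters", json=prod_yarn).raise_for_status()

# Link the commands this cluster can run, in priority order: when more than one
# linked command matches a job's command criteria, the first one in this list wins.
requests.post(
    GENIE + "/api/v3/clusters/prod-yarn/commands",
    json=["hive-prod", "spark16", "spark161", "spark200"],
).raise_for_status()
----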

===== Job Resources

The following resources all relate to a user submitting and monitoring a given job. They are split up from the Genie 2
Job concept to provide better separation of concerns, as a user usually doesn't care about certain details, such as
what node a job ran on or its Linux process exit code.

Users interact with these resources directly, though all but the initial job request are *read-only* in the sense that
you can only get their current state back from Genie.

====== Job Request

This is the resource you use to kick off a job. It contains all the information the system needs to run a job.
Optionally the REST APIs can take attachments. All attachments and file dependencies are downloaded into the root of
the job's working directory. The most important aspects are the command line arguments, the cluster criteria and the
command criteria. These dictate which cluster, command and arguments will be used when the job is executed. See the
<<How it Works>> section for more details.
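
As an illustration, here is a minimal job request sketch using the `clusterCriterias` and `commandCriteria` fields
described above, submitted with Python's `requests`. The other field names, the tags and the dependency location are
assumptions for the example; see the REST API documentation for the full request schema.

[source,python]
----
import requests

GENIE = "http://genie.example.com:8080"  # placeholder Genie 3 server URL

job_request = {
    "name": "daily-report",
    "user": "jdoe",
    "version": "1.0",
    "commandArgs": "-f report.hql",
    # Which cluster to run on: each entry is one ClusterCriteria (a set of tags).
    "clusterCriterias": [{"tags": ["sched:sla", "type:yarn"]}],
    # Which command to run with.
    "commandCriteria": ["type:hive", "data:prod"],
    # File dependencies are downloaded into the root of the job working directory.
    "dependencies": ["s3://example-bucket/users/jdoe/report.hql"],
}

response = requests.post(GENIE + "/api/v3/jobs", json=job_request)
response.raise_for_status()              # 202 Accepted on success
print(response.headers.get("Location"))  # URI of the new job resource
----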

====== Job

The job resource is created in the system after a Job Request is received. All the information a typical user would be
interested in should be contained within this resource. It has links to the command, cluster and applications used to
run the job as well as the meta information like status, start time, end time and others. See the
https://netflix.github.io/genie/docs/{revnumber}/rest/[REST API documentation] for more details.

====== Job Execution

The job execution resource contains information about where a job was run and other information that may be more
interesting to a system admin than a regular user. It is frequently useful in debugging.

A job contains all the details of a job request and execution including any command line arguments. Based on the
request parameters, a cluster and command combination is selected for execution. Job requests can also supply necessary
files to Genie either as attachments or using the file dependencies field if they already exist in an accessible file
system. As a job executes, its details are recorded in the job record within the Genie database.

==== Wrap-up

Hopefully this guide provides insight into how the Genie data model is organized and how its pieces work together. It
is meant to be very generic and support as many use cases as possible without modifications to the Genie codebase.
174 changes: 174 additions & 0 deletions genie-docs/src/docs/asciidoc/concepts/_howItWorks.adoc
@@ -0,0 +1,174 @@
=== How it Works

This section is meant to provide context for how Genie can be configured with Clusters, Commands and Applications (see
<<Data Model>> for details) and then how these work together in order to run a job on a Genie node.

==== Resource Configuration

This section describes how configuration of Genie works from an administrator point of view. This isn't how to
install and configure the Genie application itself. Rather it is how to configure the various resources involved in
running a job.

===== Register Resources

All resources (clusters, commands, applications) should be registered with Genie before attempting to link them
together. Any files these resources depend on should be uploaded somewhere Genie can access them (S3, web server,
mounted disk, etc).

Tagging of the resources, particularly Clusters and Commands, is extremely important. Genie will use the tags in order
to find a cluster/command combination to run a job. You should come up with a convenient tagging scheme for your
organization. At Netflix we try to stick to a pattern for tag structures like `{tag category}:{tag value}`. For
example `type:yarn` or `data:prod`. This gives the tags some context, so that when users look at what resources are
available they can figure out what to submit their jobs with so they are routed to the correct cluster/command
combination.

===== Linking Resources

Once resources are registered they should be linked together. By linking we mean to represent relationships between the
resources.

====== Commands for a Cluster

Adding commands to a cluster means that the administrator acknowledges that this cluster can run a given set of
commands. If a command is not linked to a cluster it cannot be used to run a job.

The commands are added in priority order. For example say you have different Spark commands you want to add to a given
YARN cluster but you want Genie to treat one as the default. Here is how those commands might be tagged:

Spark 1.6.0 (id: spark16)

* `type:sparksubmit`
* `ver:1.6`
* `ver:1.6.0`

Spark 1.6.1 (id: spark161)

* `type:sparksubmit`
* `ver:1.6.1`

Spark 2.0.0 (id: spark200)

* `type:sparksubmit`
* `ver:2.0`
* `ver:2.0.0`

Now suppose we added the commands to the cluster in this order: `spark16, spark161, spark200`. A user who submitted a
job requesting only a command tagged with `type:sparksubmit` (as in, they don't care which version, just the default)
would get Spark 1.6.0. However, if we later deemed 2.0.0 ready to be the default, we would reorder the commands to
`spark200, spark16, spark161` and that same job, if submitted again, would now run with Spark 2.0.0.
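
A sketch of that reordering through the REST API follows; it assumes the cluster-commands sub-resource accepts a `PUT`
that replaces the ordered list (check the REST API documentation), and the ids and server URL are placeholders.

[source,python]
----
import requests

GENIE = "http://genie.example.com:8080"  # placeholder Genie 3 server URL
CLUSTER_ID = "prod-yarn"                 # illustrative cluster id

# Spark 1.6.0 is the default because spark16 is first in the priority-ordered list.
requests.put(
    GENIE + "/api/v3/clusters/{}/commands".format(CLUSTER_ID),
    json=["spark16", "spark161", "spark200"],
).raise_for_status()

# Later, promote Spark 2.0.0: a job asking only for "type:sparksubmit" now
# resolves to spark200 because it is found first in the list.
requests.put(
    GENIE + "/api/v3/clusters/{}/commands".format(CLUSTER_ID),
    json=["spark200", "spark16", "spark161"],
).raise_for_status()
----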

====== Applications for a Command

Linking application(s) to commands means that a command has a dependency on said application(s). The order of the
applications is important because Genie will set up the applications in that order. This means that if one application
depends on another (e.g. Spark depends on Hadoop being on the classpath for YARN mode), the dependency (Hadoop) should
be ordered first. All applications must be successfully installed before Genie will start running the job.
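
For example, a Spark command that needs Hadoop set up first might be linked like the sketch below; it assumes a
command-applications sub-resource taking an ordered list of application ids, as described in the REST API
documentation, with placeholder ids and server URL.

[source,python]
----
import requests

GENIE = "http://genie.example.com:8080"  # placeholder Genie 3 server URL

# Order matters: Hadoop is set up before Spark so Spark finds it on the classpath.
requests.post(
    GENIE + "/api/v3/commands/spark200/applications",
    json=["hadoop271", "spark200app"],   # ids of previously registered applications
).raise_for_status()
----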

==== Job Submission

OK. The system admin has everything registered and linked together. Things could obviously change but that's mostly
transparent to end users. They just want to run jobs. How does that work? This section attempts to walk through what
happens at a high level. The example section lower down will walk through a step by step example.

===== User Submits a Job Request

In order to submit a job request there is some work a user will have to do up front. What kind of job are they running?
What cluster do they want to run on? What command do they want to use? Do they care about certain details like version
or just want the defaults? Once they determine the answers to these questions they can decide how they want to tag their
job request for the `clusterCriterias` and `commandCriteria` fields.

A general rule of thumb for these fields is to use the lowest common denominator of tags to accomplish what a user
requires. This will allow the most flexibility for the job to be moved to different clusters or commands as need be.
For example, if they want to run a Spark job and don't really care about the version, it is better to say only
`type:sparksubmit` (assuming this is the tagging structure at your organization) instead of that *and* `ver:2.0.0`.
This way, when version 2.0.1 or 2.1.0 comes down the pipe, the job moves along with the new default. Obviously, if they
do care about the version they should set it, or any other specific tag.

The `clusterCriterias` field is an array of `ClusterCriteria` objects. This is done to provide a fallback mechanism.
If no match is found for the first `ClusterCriteria` and `commandCriteria` combination, Genie will move on to the
second and so on until all options are exhausted. This is handy when it is desirable to run a job on a cluster that is
only up some of the time, while it is fine to fall back to another, always-available cluster the rest of the time.
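
A sketch of what such a fallback might look like in the request body is shown below; the tags are illustrative and the
exact field names inside each `ClusterCriteria` should be confirmed against the REST API documentation.

[source,python]
----
# Illustrative clusterCriterias with a fallback: prefer an ad-hoc cluster that may
# not always be up, then fall back to the always-available SLA cluster.
job_request_fragment = {
    "clusterCriterias": [
        {"tags": ["sched:adhoc", "type:yarn"]},   # preferred, tried first
        {"tags": ["sched:sla", "type:yarn"]},     # fallback if nothing matches above
    ],
    "commandCriteria": ["type:sparksubmit"],      # no version tag: take the default
}
----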

There are other things a user needs to consider when submitting a job. All dependencies which aren't sent as
attachments must already be uploaded somewhere Genie can access them, such as S3, a web server or a shared disk.

Users should familiarize themselves with whatever the `executable` for their desired command includes. It's possible
the system admin has set up some default parameters they should know are there so as to avoid duplication or unexpected
behavior. They should also make sure they know all the environment variables that may be available to them as part of
the cluster, command and application setup processes.

===== Genie Receives the Job Request

When Genie receives the job request it does a few things immediately:

. If the job request doesn't have an id it creates a GUID for the job
. It saves the job request to the database so it is recorded
.. If the ID is in use a 409 will be returned to the user saying there is a conflict
. It creates job and job execution records in the database for consistency
. It saves any attachments in a temporary location

Next Genie will attempt to find a cluster and command matching the requested tag combinations. If none is found it will
send a failure back to the user and mark the job failed in the database.

If a combination is found, Genie will then attempt to determine whether the node can run the job. By this we mean it
will check the amount of client memory the job requires against the available memory in the Genie allocation. If there
is enough, the job will be accepted to run on this node and the job's memory is subtracted from the available pool. If
not, the job is rejected with a 503 error message and the user should retry later.

The order of priority for how memory for a job is determined is:

. The memory a user requested in the job request
.. Cannot exceed the max memory allowed by system admin for a given job
. The memory set as the default for a given command by the admins
. The default memory size for a job as set by the system admin

Successful job submission results in a 202 message to the user stating it's accepted and will be processed
asynchronously by the system.
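
Putting the response codes together, a client might handle submission like the sketch below; the server URL, endpoint
path and retry policy are illustrative, while the 202/409/503 semantics are as described above.

[source,python]
----
import time
import requests

GENIE = "http://genie.example.com:8080"  # placeholder Genie 3 server URL

def submit_with_retry(job_request, retries=5, backoff_seconds=30):
    """Submit a job request, retrying when the node has no memory free for new jobs."""
    for _ in range(retries):
        response = requests.post(GENIE + "/api/v3/jobs", json=job_request)
        if response.status_code == 202:      # accepted; processed asynchronously
            return response.headers["Location"]
        if response.status_code == 409:      # the supplied job id is already in use
            raise RuntimeError("Job id conflict: " + response.text)
        if response.status_code == 503:      # not enough memory right now; retry later
            time.sleep(backoff_seconds)
            continue
        response.raise_for_status()          # any other error is unexpected
    raise RuntimeError("Job was not accepted after {} attempts".format(retries))
----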

===== After Job Accepted

Once a job has been accepted to run on a Genie node a workflow is executed in order to set up the job working
directory and launch the job. Some minor steps are left out for brevity.

. Job is marked in `INIT` state in the database
. A job directory is created under the admin configured jobs directory with a name of the job id
. A run script file is created with the name `run` under the job working directory
.. Currently this is a bash script
. Kill handlers are added to the run script
. Directories for Genie logs, application files, command files, cluster files are created under the job working
directory
. Default environment variables are added to the run script to export their values
. Cluster configuration files are downloaded and stored in the job work directory
. Cluster related variables are written into the run script
. Application configuration and dependency files are downloaded and stored in the job directory if any applications are
needed
. Application related variables are written into the run script
. Command configuration and dependency files are downloaded and stored in the job directory
. Command related variables are written into the run script
. All job dependency files (including attachments) are downloaded into the job working directory
. Job related variables are written into the run script
. Job script is executed in a forked process.
. Script `pid` stored in database `job_executions` table and job marked as `RUNNING` in database
. Monitoring process created for pid

===== After Job Running

Once the job is running Genie will poll the PID periodically, waiting for it to no longer exist.

NOTE: An assumption is made about the amount of process churn on the Genie node. We're aware PIDs can be reused, but
reasonably this shouldn't happen within the poll period given the number of available PIDs relative to the number of
processes a typical Genie node will run.

Once the pid no longer exists Genie checks the done file for the exit code. It marks the job succeeded, failed or
killed depending on that code.
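
The following is a conceptual sketch of that polling loop, not Genie's actual implementation; in particular, the
done-file format (a small JSON document with an exit code) is an assumption.

[source,python]
----
import json
import os
import time

def wait_for_job_process(pid, done_file, poll_seconds=10):
    """Conceptual sketch: wait for the forked job process to exit, then read its exit code."""
    while True:
        try:
            os.kill(pid, 0)            # signal 0 checks for existence without killing
        except ProcessLookupError:     # the pid no longer exists; the job process exited
            break
        time.sleep(poll_seconds)
    with open(done_file) as handle:    # assumed format: {"exitCode": <int>}
        return json.load(handle).get("exitCode")
----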

===== Cleaning Up

To save disk space Genie will delete application dependencies from the job working directory after a job is completed.
This can be disabled by an admin. If the job is marked to be archived, the working directory will be zipped up and
stored in the default archive location as `{jobId}.tar.gz`.

==== User Behavior

Users can check on the status of their job using the `status` API and get the output using the output APIs. See the
https://netflix.github.io/genie/docs/{revnumber}/rest/[REST Documentation] for specifics on how to do that.
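
For example, with Python's `requests` a user might check status and fetch the standard output log like this; the
`/api/v3/jobs/{id}/status` and `/api/v3/jobs/{id}/output/...` paths and the response shape are assumptions to be
verified against the REST documentation, and the server URL and job id are placeholders.

[source,python]
----
import requests

GENIE = "http://genie.example.com:8080"   # placeholder Genie 3 server URL
JOB_ID = "my-job-id"                      # illustrative job id

# Check the job's current status (e.g. RUNNING, SUCCEEDED, FAILED, KILLED).
status = requests.get(GENIE + "/api/v3/jobs/{}/status".format(JOB_ID)).json()

# Fetch a file from the job's output directory, e.g. the standard output log.
stdout = requests.get(GENIE + "/api/v3/jobs/{}/output/stdout".format(JOB_ID)).text
----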

==== Wrap Up

WARNING: TODO