Moving Genie 3 concepts to the reference documentation (#444)
Showing 13 changed files with 1,660 additions and 1 deletion.
== Concepts

include::_dataModel.adoc[]

include::_howItWorks.adoc[]

include::_netflixExample.adoc[]

include::_netflixDeployment.adoc[]
=== Data Model

The Genie 3 data model contains modifications and additions to the
https://netflix.github.io/genie/concepts/2/DataModel.html[Genie 2 data model] to enable even more flexibility,
modularity and metadata retention.

Several limitations of the Genie 2 data model were uncovered in its more than two years of production use at Netflix.
A few of those limitations were:

* Inability to link multiple applications to a command
* Lack of insight into what the original user request for a job contained
* Too much normalization, particularly between tags and jobs, causing inefficient joins and slow performance
* Lack of system data for job dimensions like attachment size, output size, etc.

The changes made to the data model will become apparent in the following sections.

==== Caveats

* What we called `entities` in Genie 2 we call `resources` in Genie 3, to decouple the concept from the database
implementation and raise it to the user API level
* The specific resource fields are *NOT* defined in this document. These fields are available in the
https://netflix.github.io/genie/docs/{revnumber}/rest/[REST API documentation]
** This documentation covers the higher level purposes of the resources and not their contents

==== Resources

The following sections describe the various resources available from the Genie REST APIs. You should reference the
https://netflix.github.io/genie/docs/{revnumber}/rest/[API Docs] for how to manipulate these resources. These sections
focus on the high level purpose of each resource and how it relies on and/or interacts with other resources within
the system.

===== Tagging

One important aspect to comment on is the tagging of resources. Genie relies heavily on tags for how the system
discovers resources like clusters and commands for a job. Each of the core resources has a set of tags that can be
associated with it. These tags can be set to whatever you want, but it is recommended to come up with some sort of
consistent structure for your tags to make it easier for users to understand their purpose. For example, at Netflix
we've adopted some of the following standards for our tags:

* `sched:{something}`
** This corresponds to any schedule that this resource (likely a cluster) is expected to adhere to
** e.g. `sched:sla` or `sched:adhoc`
* `type:{something}`
** e.g. `type:yarn` or `type:presto` for a cluster or `type:hive` or `type:spark` for a command
* `ver:{something}`
** The specific version of said resource
** e.g. two different Spark commands could have `ver:1.6.1` vs `ver:2.0.0`
* `data:{something}`
** Used to classify the type of data a resource (usually a command) will access
** e.g. `data:prod` or `data:test`
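A `{category}:{value}` convention like the one above is easy to validate mechanically. The following is a minimal
sketch of such a validator; the set of allowed categories is an organization-specific assumption, not something Genie
itself enforces:

```python
# Minimal sketch (not part of Genie) of validating that resource tags
# follow a "category:value" convention like the one described above.

VALID_CATEGORIES = {"sched", "type", "ver", "data"}  # assumed, org-specific


def parse_tag(tag: str) -> tuple:
    """Split a tag such as 'type:yarn' into its category and value."""
    category, sep, value = tag.partition(":")
    if not sep or not category or not value:
        raise ValueError(f"tag {tag!r} is not in 'category:value' form")
    return category, value


def check_tags(tags) -> dict:
    """Group a resource's tags by category, rejecting unknown categories."""
    grouped = {}
    for tag in tags:
        category, value = parse_tag(tag)
        if category not in VALID_CATEGORIES:
            raise ValueError(f"unknown tag category {category!r} in {tag!r}")
        grouped.setdefault(category, set()).add(value)
    return grouped
```

For example, `check_tags({"type:yarn", "sched:sla"})` groups the tags by category, while a bare tag such as
`defaultcluster` would be rejected, nudging users toward the structured scheme.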

===== Configuration Resources

The following resources (applications, commands and clusters) are considered configuration, or admin, resources.
They're generally set up by the Genie administrator and available to all users for use with their jobs.

====== Application Resource

An application resource represents pretty much what you'd expect. It is a reusable set of binaries, configuration
files and setup files that can be used to install and configure (surprise!) an application. Generally these resources
are used when an application isn't already installed and on the `PATH` on a Genie node.

When a job is run and applications are involved in running the job, Genie will download all the dependencies,
configuration files and setup files of each application and store them all in the job working directory. It will then
execute the setup script in order to install that application for that job. Genie is "dumb" as to the contents or
purpose of any of these files, so the onus is on the administrators to create and test these packages.

Applications are very useful for decoupling application binaries from a Genie deployment. For example, you could
deploy a Genie cluster and change the version of Hadoop, Hive or Spark that Genie will download without actually
re-deploying Genie. Applications can be combined together via a command. This will be explained more in the Command
section.

Applications are linked to commands so that binaries and configurations are downloaded and installed at runtime.
Within Netflix this is frequently used to deploy new clients without redeploying a Genie cluster. Our applications
frequently consist of a zip of all the binaries uploaded to S3, along with a setup file to unzip them and configure
environment variables for that application.

It is important to note that applications are entirely optional. If you prefer to just install all client binaries on
a Genie node beforehand, you're free to do that. It will save in overhead for job launch times, but you will lose
flexibility in the trade-off.

====== Command Resource

Command resources primarily represent what a user would enter at the command line to run a process on a cluster, and
what dependencies (applications) would need to be on the `PATH` to make that possible.

Commands can have configuration and setup files just like applications, but primarily they should have an ordered
list of applications associated with them if necessary. For example, let's take a typical scenario involving running
Hive. To run Hive you generally need a few things:

. A cluster to run its processing on (we'll get to that in the Cluster section)
. A `hive-site.xml` file which says what metastore to connect to, amongst other settings
. Hive binaries
. Hadoop configurations and binaries

So a typical setup for Hive in Genie would be to have one, or many, Hive commands configured. Each command would have
its own `hive-site.xml` pointing to a specific metastore (prod, test, etc.). The command would depend on already
configured Hadoop and Hive applications, which would have all the default Hadoop and Hive binaries and
configurations. All of this would be combined in the job working directory in order to run Hive.

You can have any number of commands configured in the system. They should then be linked to the clusters they can
execute on. Clusters are explained next.

====== Cluster

A cluster resource stores all the details of an execution cluster, including connection information, properties, etc.
Some cluster examples are Hadoop, Spark, Presto, etc. Every cluster can be linked to a set of commands that it can
run.

Genie does *not* launch clusters for you. It merely stores metadata about the clusters you have running in your
environment so that jobs using commands and applications can connect to them.

Once a cluster has been linked to commands, your Genie instance is ready to start running jobs. The job resources are
described in the following section. One important thing to note is that the list of commands linked to the cluster is
a priority ordered list. That means that if you have two Pig commands available for the same cluster, the first one
found in the list will be chosen, provided all tags match. See <<How it Works>> for more details.

===== Job Resources

The following resources all relate to a user submitting and monitoring a given job. They are split up from the Genie
2 Job idea to provide better separation of concerns, as usually a user doesn't care about certain things, such as
what node a job ran on or its Linux process exit code.

Users interact with these resources directly, though all but the initial job request are *read-only* in the sense
that you can only get their current state back from Genie.

====== Job Request

This is the resource you use to kick off a job. It contains all the information the system needs to run a job.
Optionally the REST APIs can take attachments. All attachments and file dependencies are downloaded into the root of
the job's working directory. The most important aspects are the command line arguments, the cluster criteria and the
command criteria. These dictate which cluster, command and arguments will be used when the job is executed. See the
<<How it Works>> section for more details.

====== Job

The job resource is created in the system after a Job Request is received. All the information a typical user would
be interested in should be contained within this resource. It has links to the command, cluster and applications used
to run the job, as well as meta information like status, start time, end time and others. See the
https://netflix.github.io/genie/docs/{revnumber}/rest/[REST API documentation] for more details.

====== Job Execution

The job execution resource contains information about where a job was run and other details that may be more
interesting to a system admin than to a regular user. It is frequently useful in debugging.

A job contains all the details of a job request and execution, including any command line arguments. Based on the
request parameters, a cluster and command combination is selected for execution. Job requests can also supply
necessary files to Genie, either as attachments or using the file dependencies field if they already exist in an
accessible file system. As a job executes, its details are recorded in the job record within the Genie database.

==== Wrap-up

Hopefully this guide provides insight into how the Genie data model is thought out and how the pieces work together.
It is meant to be very generic and to support as many use cases as possible without modifications to the Genie
codebase.
=== How it Works

This section is meant to provide context for how Genie can be configured with Clusters, Commands and Applications
(see <<Data Model>> for details), and then how these work together in order to run a job on a Genie node.

==== Resource Configuration

This section describes how configuration of Genie works from an administrator's point of view. It isn't about how to
install and configure the Genie application itself. Rather, it is about how to configure the various resources
involved in running a job.

===== Register Resources

All resources (clusters, commands, applications) should be registered with Genie before attempting to link them
together. Any files these resources depend on should be uploaded somewhere Genie can access them (S3, a web server, a
mounted disk, etc.).

Tagging of the resources, particularly clusters and commands, is extremely important. Genie will use the tags in
order to find a cluster/command combination to run a job. You should come up with a convenient tagging scheme for
your organization. At Netflix we try to stick to a pattern for tag structures like `{tag category}:{tag value}`, for
example `type:yarn` or `data:prod`. This gives the tags some context, so that when users look at what resources are
available they can find what to submit their jobs with, so that each job is routed to the correct cluster/command
combination.

===== Linking Resources

Once resources are registered, they should be linked together. By linking we mean representing the relationships
between the resources.

====== Commands for a Cluster

Adding commands to a cluster means that the administrator acknowledges that this cluster can run a given set of
commands. If a command is not linked to a cluster, it cannot be used to run a job.

The commands are added in priority order. For example, say you have different Spark commands you want to add to a
given YARN cluster, but you want Genie to treat one as the default. Here is how those commands might be tagged:

Spark 1.6.0 (id: spark16)

* `type:sparksubmit`
* `ver:1.6`
* `ver:1.6.0`

Spark 1.6.1 (id: spark161)

* `type:sparksubmit`
* `ver:1.6.1`

Spark 2.0.0 (id: spark200)

* `type:sparksubmit`
* `ver:2.0`
* `ver:2.0.0`

Now if we added the commands to the cluster in the order `spark16, spark161, spark200`, and a user submitted a job
only requesting a command tagged with `type:sparksubmit` (as in they don't care what version, just the default), they
would get Spark 1.6.0. However, if we later deemed 2.0.0 ready to be the default, we would reorder the commands to
`spark200, spark16, spark161`, and that same job, if submitted again, would now run with Spark 2.0.0.
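The selection behavior above can be sketched as a simple rule: the first command in the cluster's priority-ordered
list whose tags are a superset of the requested tags wins. This is an illustration of the idea only, not Genie's
actual implementation, which also considers things like resource status:

```python
# Simplified sketch of priority-ordered command resolution. The real logic
# lives inside the Genie server; this only illustrates the ordering rule.

def select_command(ordered_commands, requested_tags):
    """Return the id of the first command whose tags cover the request."""
    for command_id, tags in ordered_commands:
        if set(requested_tags) <= set(tags):
            return command_id
    return None  # no match: Genie would reject the job


# The Spark commands from the example above, in cluster priority order.
commands = [
    ("spark16", {"type:sparksubmit", "ver:1.6", "ver:1.6.0"}),
    ("spark161", {"type:sparksubmit", "ver:1.6.1"}),
    ("spark200", {"type:sparksubmit", "ver:2.0", "ver:2.0.0"}),
]

# A user asking only for {"type:sparksubmit"} gets spark16, the current
# default. Moving spark200 to the front of the list changes the default
# without the user changing their job request at all.
```

This is exactly why requesting only the tags you truly need (covered under Job Submission below) keeps jobs portable
across version changes.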

====== Applications for a Command

Linking application(s) to commands means that a command has a dependency on said application(s). The order in which
the applications are added is important, because Genie will set up the applications in that order. Meaning, if one
application depends on another (e.g. Spark depends on Hadoop being on the classpath for YARN mode), Hadoop should be
ordered first. All applications must be successfully installed before Genie will start running the job.
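The ordering contract can be pictured as follows. This is a toy illustration, not Genie code; the setup callback
stands in for executing each application's real setup script:

```python
# Toy illustration of ordered application setup: setup steps run in list
# order, and any failure aborts before the job ever starts.

def setup_applications(ordered_apps, run_setup):
    """Run each application's setup in order; stop at the first failure."""
    installed = []
    for app in ordered_apps:
        if not run_setup(app):  # stand-in for running the app's setup script
            raise RuntimeError(f"setup of {app!r} failed; job will not start")
        installed.append(app)
    return installed


# Spark needs Hadoop on the classpath in YARN mode, so Hadoop comes first:
# setup_applications(["hadoop", "spark"], run_setup=...)
```

If Hadoop's setup fails, Spark's setup never runs and the job never starts, matching the "all applications must be
successfully installed" rule above.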

==== Job Submission

OK. The system admin has everything registered and linked together. Things could obviously change, but that's mostly
transparent to end users. They just want to run jobs. How does that work? This section attempts to walk through what
happens at a high level. The example section further down will walk through a step by step example.

===== User Submits a Job Request

In order to submit a job request there is some work a user will have to do up front. What kind of job are they
running? What cluster do they want to run on? What command do they want to use? Do they care about certain details
like version, or do they just want the defaults? Once they determine the answers to these questions they can decide
how to tag their job request in the `clusterCriterias` and `commandCriteria` fields.

The general rule of thumb for these fields is to use the lowest common denominator of tags that accomplishes what the
user requires. This will allow the most flexibility for the job to be moved to different clusters or commands as need
be. For example, if they want to run a Spark job and don't really care about the version, it is better to just say
`type:sparksubmit` (assuming this is the tagging structure at your organization) instead of that *and* `ver:2.0.0`.
This way, when version 2.0.1 or 2.1.0 comes down the pipe, the job moves along with the new default. Obviously, if
they do care about the version they should set it, or any other specific tag.

The `clusterCriterias` field is an array of `ClusterCriteria` objects. This is done to provide a fallback mechanism.
If no match is found for the first `ClusterCriteria` and `commandCriteria` combination, Genie moves on to the second,
and so on until all options are exhausted. This is handy if it is desirable to run a job on some cluster that is only
up some of the time, and it's fine to run it on some other, always available cluster the rest of the time.
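The fallback can be sketched like this (an illustration only; cluster ids and tags here are hypothetical, and the
real server-side matching also intersects with the `commandCriteria`):

```python
# Sketch of the ClusterCriteria fallback: try each criterion in order and
# take the first cluster whose tags satisfy it. Not Genie's actual code.

def resolve_cluster(cluster_criterias, clusters):
    """clusters: list of (cluster_id, tags). Return first match or None."""
    for criteria in cluster_criterias:
        for cluster_id, tags in clusters:
            if set(criteria) <= set(tags):
                return cluster_id
    return None  # every criterion exhausted: the job request fails


# Hypothetical registered clusters.
clusters = [
    ("presto_adhoc", {"type:presto", "sched:adhoc"}),
    ("yarn_sla", {"type:yarn", "sched:sla"}),
]

# Prefer an SLA Presto cluster, but fall back to any Presto cluster:
# resolve_cluster([{"type:presto", "sched:sla"}, {"type:presto"}], clusters)
# matches nothing on the first criterion, then finds "presto_adhoc".
```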

There are other things a user needs to consider when submitting a job. All dependencies which aren't sent as
attachments must already be uploaded somewhere Genie can access them: S3, a web server, a shared disk, etc.

Users should familiarize themselves with whatever the `executable` for their desired command includes. It's possible
the system admin has set up some default parameters they should know about, so as to avoid duplication or unexpected
behavior. They should also make sure they know all the environment variables that may be available to them as a
result of the cluster, command and application setup processes.

===== Genie Receives the Job Request

When Genie receives the job request it does a few things immediately:

. If the job request doesn't have an id, it creates a GUID for the job
. It saves the job request to the database so it is recorded
.. If the id is already in use, a 409 is returned to the user to signal the conflict
. It creates job and job execution records in the database for consistency
. It saves any attachments in a temporary location

Next, Genie will attempt to find a cluster and command matching the requested tag combinations. If none is found, it
will send a failure back to the user and mark the job failed in the database.

If a combination is found, Genie will then attempt to determine whether the node can run the job. By this we mean it
will check the amount of client memory the job requires against the available memory in the Genie allocation. If
there is enough, the job will be accepted and run on this node, and the job's memory is subtracted from the available
pool. If not, the job is rejected with a 503 error message and the user should retry later.

The order of priority for how the memory for a job is determined is:

. The memory a user requested in the job request
.. This cannot exceed the max memory allowed by the system admin for a given job
. The memory set as the default for the given command by the admins
. The default memory size for a job as set by the system admin
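The priority order above can be sketched as a small resolution function. The semantics here (request wins, then
command default, then system default, with the request capped by the admin maximum) are assumed from the list above,
not lifted from Genie source:

```python
# Sketch of the memory-resolution priority: user request > command default
# > system default, with the request checked against the admin maximum.

def resolve_job_memory(requested_mb, command_default_mb,
                       system_default_mb, max_allowed_mb):
    """Pick the job memory in MB following the priority order above."""
    if requested_mb is not None:
        if requested_mb > max_allowed_mb:
            raise ValueError("requested memory exceeds the admin maximum")
        return requested_mb
    if command_default_mb is not None:
        return command_default_mb
    return system_default_mb


# With no user request, the command default (if any) applies:
# resolve_job_memory(None, 2048, 1024, 10240) -> 2048
# A user request within the maximum always wins:
# resolve_job_memory(4096, 2048, 1024, 10240) -> 4096
```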

A successful job submission results in a 202 message to the user, stating that the job is accepted and will be
processed asynchronously by the system.

===== After Job Accepted

Once a job has been accepted to run on a Genie node, a workflow is executed in order to set up the job working
directory and launch the job. Some minor steps are left out for brevity.

. The job is marked in the `INIT` state in the database
. A job directory is created under the admin-configured jobs directory, with the job id as its name
. A run script file is created with the name `run` under the job working directory
.. Currently this is a bash script
. Kill handlers are added to the run script
. Directories for Genie logs, application files, command files and cluster files are created under the job working
directory
. Default environment variables are added to the run script to export their values
. Cluster configuration files are downloaded and stored in the job working directory
. Cluster related variables are written into the run script
. Application configuration and dependency files are downloaded and stored in the job directory, if any applications
are needed
. Application related variables are written into the run script
. Command configuration and dependency files are downloaded and stored in the job directory
. Command related variables are written into the run script
. All job dependency files (including attachments) are downloaded into the job working directory
. Job related variables are written into the run script
. The job script is executed in a forked process
. The script `pid` is stored in the database `job_executions` table and the job is marked as `RUNNING` in the
database
. A monitoring process is created for the pid

===== After Job Running

Once the job is running, Genie will poll the PID periodically, waiting for it to no longer be in use.

NOTE: An assumption is made as to the amount of process churn on the Genie node. We're aware PIDs can be reused, but
reasonably this shouldn't happen within the poll period, given the number of PIDs available to the processes a
typical Genie node will run.

Once the pid no longer exists, Genie checks the done file for the exit code. It marks the job succeeded, failed or
killed depending on that code.
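The final-status decision can be sketched as a tiny mapping. How "killed" is actually recorded in the done file is an
assumption here; only the zero/non-zero split for succeeded vs failed follows directly from the text above:

```python
# Sketch of mapping the done file's exit code to a terminal job status.
# The "killed" signal is modeled as a separate flag; the real done-file
# contents are defined by Genie's run script.

def final_status(exit_code, killed=False):
    """Map the exit code from the done file to a terminal job status."""
    if killed:
        return "KILLED"
    return "SUCCEEDED" if exit_code == 0 else "FAILED"
```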

===== Cleaning Up

To save disk space, Genie will delete application dependencies from the job working directory after a job is
completed. This can be disabled by an admin. If the job is marked for archival, the working directory will be zipped
up and stored in the default archive location as `{jobId}.tar.gz`.

==== User Behavior

Users can check on the status of their job using the `status` API and get the output using the output APIs. See the
https://netflix.github.io/genie/docs/{revnumber}/rest/[REST Documentation] for specifics on how to do that.

==== Wrap Up

WARNING: TODO