- Java 8 must be installed
- ssh must be installed and sshd must be running
Yarn Log Processor (YALP) is a CLI tool for developers, helping them to effectively analyze YARN logs.
Analyzing the data helps to find the reason for application or test failures.
Run the processor with the following steps:
1. Clone the repository and navigate to its root folder.
2. Build the project:
   mvn clean package
3. Run the start script:
   ./start.sh <parameters>
4. Execute various subshell commands (see the Subshell commands section).
Steps 2. and 3. can be executed with one single command, by building the project and running the start script:
./start.sh <parameters> --build
There are three different ways to provide input for YALP:
1. Run with direct URL input:
   ./start.sh --url <direct url> --logFolder <log folder> [--keep] [--shell]
   YALP downloads a zip archive from the given URL and extracts its content.
   The archive file and the folder of the extracted files will be named after the current time.
2. Run with local archive input:
   ./start.sh --local <local file path> --logFolder <log folder> [--keep] [--shell]
   YALP extracts the zip archive at the provided file path and filters YARN related log files.
   The folder of the extracted files will be named after the input archive file.
3. Run on an already extracted log folder (if neither --url nor --local is provided):
   ./start.sh --logFolder <log folder> --shell
   YALP uses the files in the given folder and filters YARN related log files.
   This makes it possible to extract and filter the log files only once and analyze them multiple times.
   With an already extracted bundle, --shell must always be provided and --keep must not be provided.
   Keep in mind that input options 1. and 2. place the extracted files in a subfolder, which needs to be specified here. For example, after:
   ./start.sh --logFolder someFolder --local fileName
   the logs will be generated in someFolder/fileName, so to examine these logs again without extracting them, specify:
   ./start.sh --logFolder someFolder/fileName --shell
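The subfolder rule for local archives can be sketched in Java as follows. This is only an illustration of the naming described above, not YALP's actual code: the class and method names are hypothetical, and the sketch assumes that the folder name is the archive file name with a trailing .zip removed (as suggested by first.zip producing a folder called first).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ExtractedFolderExample {

    // Hypothetical helper: computes the folder that a later run would
    // pass to --logFolder after a --local run has extracted the archive.
    static Path extractedFolder(String logFolder, String localArchive) {
        // Only the file name of the archive matters, not its parent folders.
        String name = Paths.get(localArchive).getFileName().toString();
        // Assumption: the folder is named after the archive without ".zip".
        if (name.toLowerCase().endsWith(".zip")) {
            name = name.substring(0, name.length() - ".zip".length());
        }
        return Paths.get(logFolder).resolve(name);
    }

    public static void main(String[] args) {
        Path folder = extractedFolder("someFolder", "downloads/first.zip");
        System.out.println("./start.sh --logFolder " + folder + " --shell");
    }
}
```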
About the extraction process:
The tool is able to extract zip and gz archive files recursively. Given another archive format, the user is required to convert the input to zip format before using it.
- Build option: needs to be specified once after downloading to build the project:
  ./start.sh <parameters> --build
- Keep option: needs to be specified for keeping the input archive file (this is an invalid command if the input is an already extracted bundle):
  ./start.sh <parameters> --keep
- Shell option: needs to be specified every time to open the subshell and analyze the log files:
  ./start.sh <parameters> --shell
- Command option: executes a command without opening the subshell:
  ./start.sh <parameters> --command <arg>
  where <arg> needs to be a subshell command (defined in the Subshell commands section).
The YALP configuration file can be found at ./src/main/resources/config.json.
By default, the configuration file contains the following pieces of information:
{
  "regularExpressions": {
    "timeStamp": "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}",
    "logFile": ".*(?<role>(RESOURCEMANAGER|NODEMANAGER))-(?<host>.+)\\.log\\.out",
    "configFile": ".*-site\\.xml"
  },
  "directoryNames": {
    "directoryNameForYarnRelatedLogs": "workspace",
    "subdirectoryNameForNodeLogs": "logs",
    "subdirectoryNameForConfigFiles": "configs"
  },
  "cache": {
    "cacheDirectory": "./.blp/cache",
    "cacheType": "InMemoryLRUCache",
    "cacheItemCapacity": "10"
  }
}
There is a block for the regular expressions, where we can define the timestamp used in the log files, a regular expression to find the YARN related log files and another for the configuration files.
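To illustrate how the default logFile expression behaves, the sketch below applies it to file names and reads the role and host named groups. The pattern is copied from the default configuration; the class name, the helper method and the sample file name are hypothetical, and this is not necessarily how YALP itself applies the expression.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogFileNameExample {

    // Copied from the default "logFile" entry of config.json.
    static final Pattern LOG_FILE = Pattern.compile(
            ".*(?<role>(RESOURCEMANAGER|NODEMANAGER))-(?<host>.+)\\.log\\.out");

    // Returns {role, host}, or null when the file name is not a YARN log.
    static String[] roleAndHost(String fileName) {
        Matcher matcher = LOG_FILE.matcher(fileName);
        if (!matcher.matches()) {
            return null;
        }
        return new String[] {matcher.group("role"), matcher.group("host")};
    }

    public static void main(String[] args) {
        // Hypothetical file name, chosen only to fit the pattern.
        String[] parts = roleAndHost("yarn-NODEMANAGER-host-1.example.com.log.out");
        System.out.println(parts[0] + " on " + parts[1]);
    }
}
```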
The second block defines the name of the directories created by the program. In the next section, we can see the structure of these directories.
The third block defines cache-related variables. cacheType can be either "InMemoryLRUCache" or "GeneralCache". InMemoryLRUCache stores the cache items in memory and evicts the least recently used items. GeneralCache stores the cache items in the filesystem and does not delete them. cacheItemCapacity only matters for InMemoryLRUCache, where it defines the maximum number of items stored. cacheDirectory only matters for GeneralCache, where it names the folder in which YALP stores the cache items.
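For readers unfamiliar with LRU eviction, the following minimal sketch shows the general idea of an in-memory cache bounded by an item capacity. It relies on the standard java.util.LinkedHashMap access-order mode; the class name is hypothetical, and YALP's own InMemoryLRUCache may be implemented differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative LRU cache; not taken from the YALP sources.
final class LruCacheExample<K, V> extends LinkedHashMap<K, V> {

    private final int capacity;

    LruCacheExample(int capacity) {
        // accessOrder = true keeps the least recently used entry first.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after every insertion; evicts once the capacity is exceeded.
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCacheExample<String, String> cache = new LruCacheExample<>(2);
        cache.put("roles", "cached roles output");
        cache.put("info", "cached info output");
        cache.get("roles");                          // "roles" becomes most recently used
        cache.put("resources", "cached resources");  // evicts "info"
        System.out.println(cache.keySet());
    }
}
```

With a capacity of 2, the third insertion pushes out the entry that was touched least recently, which mirrors the role cacheItemCapacity plays in the configuration.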
logfolder/
├── first/
│   ├── first/
│   └── workspace/
│       ├── configs/
│       └── logs/
└── second/
    ├── second/
    └── workspace/
        ├── configs/
        └── logs/

The example above shows the log folder created with the default configuration file after processing first.zip and second.zip. first/first and second/second contain all the log files from the archive file. The workspace/logs folders contain the relevant YARN related logs. YALP also creates a folder for the config files, but the config files are not used in the current version.
The name of workspace, configs and logs folders can be changed in the configuration file (./src/main/resources/config.json).
To launch the subshell append the --shell parameter to one of the input options described above (Input options chapter).
The following commands and parameters can be executed in the subshell:
| COMMAND | DESCRIPTION | EXAMPLE |
|---|---|---|
| help | Lists all valid commands operating in the subshell and their expected behaviour | help |
| roles | Lists all YARN roles in the cluster and the corresponding hosts the roles are running on | roles |
| applications | Lists all applications in the cluster, along with their owners and submission time | applications |
| appattempts <appId> | Lists all application attempts of a given application | appattempts application_1583168158408_0002 |
| containers --application <appId> | Lists all containers of a given application | containers --application application_1583168158408_0002 |
| containers --appattempt <attemptId> | Lists all containers of a given appattempt | containers --appattempt appattempt_1583162983042_1961_000001 |
| containers --exiting | Lists all exiting containers | containers --exiting |
| containers --killed | Lists all killed containers | containers --killed |
| events --application <appId> | Lists all events of a given application | events --application application_1583168158408_0002 |
| events --appattempt <attemptId> | Lists all events of a given appattempt | events --appattempt appattempt_1583168158408_0002_000001 |
| states --application <appId> | Lists all state changes of a given application | states --application application_1583168158408_0002 |
| states --appattempt <attemptId> | Lists all state changes of a given appattempt | states --appattempt appattempt_1583168158408_0002_000001 |
| states --container <containerId> | Lists all state changes of a given container | states --container container_1583167118773_0001_01_000006 |
| grep <expression> | Lists occurrences of a user-defined regular expression in ResourceManager and NodeManager logs | grep application_1583168158408_0002 |
| grep <expression> [-rm/-nm] | Lists occurrences of a user-defined regular expression only from ResourceManager/NodeManager logs | grep application_1583168158408_0002 -rm |
| resources | Lists all nodes and their resource capabilities | resources |
| exceptions | Lists all exceptions in the logs | exceptions |
| info | Prints generic information about the cluster | info |
| exit | Terminates the subshell | exit |
The output of most of the commands can be modified with verbosity modifiers:
- List the found items with the --list modifier:
events --application application_1583169702218_0001 --list
EVENT
START
APP_ACCEPTED
ATTEMPT_REGISTERED
ATTEMPT_UNREGISTERED
- Default verbosity
events --application application_1583169702218_0001
| TIME | EVENT |
| --------------- | --------------- |
| 2020-03-02 09:22:19 | START |
| 2020-03-02 09:22:19 | APP_ACCEPTED |
| 2020-03-02 09:22:26 | ATTEMPT_REGISTERED |
| 2020-03-02 09:22:33 | ATTEMPT_UNREGISTERED |
- Verbose output with the --verbose modifier:
events --application application_1583169702218_0001 --verbose
| TIME | EVENT | FROM STATE | TO STATE |
| --------------- | --------------- | --------------- | --------------- |
| 2020-03-02 09:22:19 | START | NEW | NEW_SAVING |
| 2020-03-02 09:22:19 | APP_ACCEPTED | SUBMITTED | ACCEPTED |
| 2020-03-02 09:22:26 | ATTEMPT_REGISTERED | ACCEPTED | RUNNING |
| 2020-03-02 09:22:33 | ATTEMPT_UNREGISTERED | RUNNING | FINAL_SAVING |
- Print whole lines of logs with the --raw modifier:
events --application application_1583169702218_0001 --raw
MATCHING LINES IN LOGS
2020-03-02 09:22:19,969 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1583169702218_0001 State change from NEW to NEW_SAVING on event = START
2020-03-02 09:22:19,998 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1583169702218_0001 State change from SUBMITTED to ACCEPTED on event = APP_ACCEPTED
2020-03-02 09:22:26,582 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1583169702218_0001 State change from ACCEPTED to RUNNING on event = ATTEMPT_REGISTERED
2020-03-02 09:22:33,275 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1583169702218_0001 State change from RUNNING to FINAL_SAVING on event = ATTEMPT_UNREGISTERED
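The relationship between the raw and the verbose output can be sketched as follows: a regular expression pulls the timestamp, the two states and the event out of a raw state-change line and formats them as a verbose row. The expression, class name and method name are assumptions written for this example; they are not taken from the YALP sources.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StateChangeExample {

    // Assumed pattern for the ResourceManager "State change" lines shown above.
    static final Pattern STATE_CHANGE = Pattern.compile(
            "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}),\\d+ .*"
            + " State change from (\\w+) to (\\w+) on event = (\\w+)$");

    // Formats a raw line as a TIME | EVENT | FROM STATE | TO STATE row,
    // or returns null when the line is not a state-change line.
    static String toVerboseRow(String rawLine) {
        Matcher m = STATE_CHANGE.matcher(rawLine);
        if (!m.matches()) {
            return null;
        }
        return String.format("| %s | %s | %s | %s |",
                m.group(1), m.group(4), m.group(2), m.group(3));
    }

    public static void main(String[] args) {
        String raw = "2020-03-02 09:22:19,969 INFO "
                + "org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: "
                + "application_1583169702218_0001 State change from NEW to NEW_SAVING on event = START";
        System.out.println(toVerboseRow(raw));
    }
}
```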
There are separate classes for the four main tasks of preprocessing:
- CliParser parses the CLI arguments
- FileDownloader downloads the archive file
- FileExtractor extracts the archive file
- FileFilter filters YARN related log files
Preprocessor coordinates the work of these classes.
The interactive subshell's structure is built around pairs of classes, where one class is responsible for the formatting (reading/writing) and the other class is responsible for the execution.
CommandLine parses the input from the CLI and CommandExecutor calls the appropriate Command to take action. OptionParser (belonging to the specified Command) parses the parameters, and through an Executable the Command generates the desired output. SearchEngine looks through the log files and finds matches for a specified regular expression, and Formatter generates the output from the matched lines in the logs.
Every time a command is executed in the subshell one of the Command classes will be called. In the Subshell commands section, we already saw the output of these commands. In the following diagram, we can see how these Commands can be grouped into four categories.
- Primitive commands implement the Command interface directly. No modifiers can be attached to them; they always display the same output.
  Example:
  exit
- SimpleSearch is extended by Roles and Info. These commands don't have modifiers, but their output depends on the content of the log files.
  Example:
  info
- ParameterizedSearch commands can have various options defined, which modify the output.
  Example:
  containers --application <appId> [--verbose]
- HybridSearch commands have a compulsory first parameter without a specified option. Options can be defined after the first parameter.
  Examples:
  appattempts <appId> [--verbose]
  grep <expression> [-rm] [--verbose]
The Commands always return a Printable object to the Subshell which will be printed to the CLI. The diagram below shows the types of Printable objects.
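The flow described in this section (input line, matching Command, Printable result) could look roughly like the sketch below. Only the names Command and Printable come from the text above; the method signatures, the dispatcher class and the registered sample commands are hypothetical and are not YALP's actual API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Assumed shape: something that can render itself for the CLI.
interface Printable {
    String print();
}

// Assumed shape: a command receives its parameters and returns a Printable.
interface Command {
    Printable execute(String[] parameters);
}

// Hypothetical dispatcher standing in for the CommandLine/CommandExecutor pair.
final class CommandDispatchExample {

    private final Map<String, Command> commands = new HashMap<>();

    void register(String name, Command command) {
        commands.put(name, command);
    }

    // Splits the input line, looks up the command and renders its result.
    String run(String inputLine) {
        String[] tokens = inputLine.trim().split("\\s+");
        Command command = commands.get(tokens[0]);
        if (command == null) {
            return "Unknown command: " + tokens[0];
        }
        String[] parameters = Arrays.copyOfRange(tokens, 1, tokens.length);
        return command.execute(parameters).print();
    }

    public static void main(String[] args) {
        CommandDispatchExample shell = new CommandDispatchExample();
        // A primitive-style command: same output regardless of parameters.
        shell.register("exit", params -> () -> "Bye");
        // A hybrid-style command: compulsory first parameter, options after it.
        shell.register("appattempts", params -> () -> "attempts of " + params[0]);
        System.out.println(shell.run("appattempts application_1583168158408_0002 --verbose"));
        System.out.println(shell.run("exit"));
    }
}
```

Keeping the return type a Printable rather than a plain string leaves the choice of rendering (list, table or raw lines) to the object that is returned.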
Hudáky Márton