Add File Interpreter, HDFS Interpreter and Tests #276
Changes from all commits
865e6ab
7d61e5f
1239fe6
70507a8
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| --- | ||
| layout: page | ||
| title: "HDFS File Interpreter" | ||
| description: "" | ||
| group: manual | ||
| --- | ||
| {% include JB/setup %} | ||
|
|
||
|
|
||
| ## HDFS File Interpreter for Apache Zeppelin | ||
|
|
||
| <br/> | ||
| <table class="table-configuration"> | ||
| <tr> | ||
| <th>Name</th> | ||
| <th>Class</th> | ||
| <th>Description</th> | ||
| </tr> | ||
| <tr> | ||
| <td>%hdfs</td> | ||
| <td>HDFSFileInterpreter</td> | ||
| <td>Provides File System commands for HDFS</td> | ||
| </tr> | ||
| </table> | ||
|
|
||
| <br/> | ||
| This interpreter connects to HDFS using the HTTP WebHDFS interface. | ||
| It supports basic shell-style file commands applied to HDFS; currently only browsing is supported: | ||
| * You can use <i>ls [PATH]</i> and <i>ls -l [PATH]</i> to list a directory. If the path is missing, then the current directory is listed. | ||
| * You can use <i>cd [PATH]</i> to change your current directory by giving a relative or an absolute path. | ||
| * You can invoke <i>pwd</i> to see your current directory. | ||
|
|
||
| ### Create Interpreter | ||
|
|
||
| You can create the HDFS browser by pointing it to the WebHDFS interface of your Hadoop cluster. | ||
|
|
||
| ### Configuration | ||
| You can modify the configuration of HDFS from the `Interpreter` section. The HDFS interpreter exposes the following properties: | ||
|
|
||
| <table class="table-configuration"> | ||
| <tr> | ||
| <th>Property Name</th> | ||
| <th>Description</th> | ||
| <th>Default Value</th> | ||
| </tr> | ||
| <tr> | ||
| <td>hdfs.url</td> | ||
| <td>The URL for WebHDFS</td> | ||
| <td>http://localhost:50070/webhdfs/v1/</td> | ||
| </tr> | ||
| <tr> | ||
| <td>hdfs.user</td> | ||
| <td>The WebHDFS user</td> | ||
| <td>hdfs</td> | ||
| </tr> | ||
| <tr> | ||
| <td>hdfs.maxlength</td> | ||
| <td>Maximum number of lines of results fetched</td> | ||
| <td>1000</td> | ||
| </tr> | ||
| </table> | ||
|
|
||
|
|
||
| #### WebHDFS REST API | ||
| You can confirm that you're able to access the WebHDFS API by running a curl command against the WebHDFS endpoint provided to the interpreter. | ||
|
|
||
| Here is an example: | ||
| $> curl "http://localhost:50070/webhdfs/v1/?op=LISTSTATUS" |
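The curl check above can also be expressed in code. The following is a minimal sketch, not part of this PR, of how a `LISTSTATUS` request URL might be assembled from the `hdfs.url` and `hdfs.user` properties. The class and method names are hypothetical, and the sketch assumes the path and user need no URL encoding.

```java
// Hypothetical helper, not part of this PR: assembles a WebHDFS LISTSTATUS URL.
class WebHdfsUrlSketch {

    // Joins the configured base URL (hdfs.url) with an HDFS path and appends
    // the operation and the user.name query parameter (hdfs.user).
    // Assumption: neither the path nor the user needs URL encoding.
    static String listStatusUrl(String baseUrl, String hdfsPath, String user) {
        // Drop a trailing slash from the base so the join never yields "//".
        String base = baseUrl.endsWith("/")
            ? baseUrl.substring(0, baseUrl.length() - 1)
            : baseUrl;
        // Treat the HDFS path as absolute.
        String path = hdfsPath.startsWith("/") ? hdfsPath : "/" + hdfsPath;
        return base + path + "?op=LISTSTATUS&user.name=" + user;
    }

    public static void main(String[] args) {
        // Uses the default property values from the configuration table.
        System.out.println(
            listStatusUrl("http://localhost:50070/webhdfs/v1/", "/", "hdfs"));
    }
}
```

With the defaults from the configuration table, the root listing resolves to the same URL as the curl example, plus the `user.name` parameter.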
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,139 @@ | ||
| <?xml version="1.0" encoding="UTF-8"?> | ||
| <!-- | ||
| ~ Licensed to the Apache Software Foundation (ASF) under one or more | ||
| ~ contributor license agreements. See the NOTICE file distributed with | ||
| ~ this work for additional information regarding copyright ownership. | ||
| ~ The ASF licenses this file to You under the Apache License, Version 2.0 | ||
| ~ (the "License"); you may not use this file except in compliance with | ||
| ~ the License. You may obtain a copy of the License at | ||
| ~ | ||
| ~ http://www.apache.org/licenses/LICENSE-2.0 | ||
| ~ | ||
| ~ Unless required by applicable law or agreed to in writing, software | ||
| ~ distributed under the License is distributed on an "AS IS" BASIS, | ||
| ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| ~ See the License for the specific language governing permissions and | ||
| ~ limitations under the License. | ||
| --> | ||
|
|
||
| <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd"> | ||
| <modelVersion>4.0.0</modelVersion> | ||
|
|
||
| <parent> | ||
| <artifactId>zeppelin</artifactId> | ||
| <groupId>org.apache.zeppelin</groupId> | ||
| <version>0.6.0-incubating-SNAPSHOT</version> | ||
| </parent> | ||
|
|
||
| <groupId>org.apache.zeppelin</groupId> | ||
| <artifactId>zeppelin-file</artifactId> | ||
| <packaging>jar</packaging> | ||
| <version>0.6.0-incubating-SNAPSHOT</version> | ||
| <name>Zeppelin: File Manager</name> | ||
| <url>http://www.apache.org</url> | ||
|
|
||
| <dependencies> | ||
| <dependency> | ||
| <groupId>org.apache.zeppelin</groupId> | ||
| <artifactId>zeppelin-interpreter</artifactId> | ||
| <version>${project.version}</version> | ||
| <scope>provided</scope> | ||
| </dependency> | ||
|
|
||
| <dependency> | ||
| <groupId>javax.ws.rs</groupId> | ||
| <artifactId>javax.ws.rs-api</artifactId> | ||
| <version>2.0</version> | ||
| </dependency> | ||
|
|
||
| <dependency> | ||
| <groupId>org.slf4j</groupId> | ||
| <artifactId>slf4j-api</artifactId> | ||
| </dependency> | ||
|
|
||
| <dependency> | ||
| <groupId>org.slf4j</groupId> | ||
| <artifactId>slf4j-log4j12</artifactId> | ||
| </dependency> | ||
|
|
||
| <dependency> | ||
| <groupId>junit</groupId> | ||
| <artifactId>junit</artifactId> | ||
| <scope>test</scope> | ||
| </dependency> | ||
| </dependencies> | ||
|
|
||
| <build> | ||
| <plugins> | ||
| <plugin> | ||
| <groupId>org.apache.maven.plugins</groupId> | ||
| <artifactId>maven-deploy-plugin</artifactId> | ||
| <version>2.7</version> | ||
| <configuration> | ||
| <skip>true</skip> | ||
| </configuration> | ||
| </plugin> | ||
|
|
||
| <plugin> | ||
| <groupId>org.apache.maven.plugins</groupId> | ||
| <artifactId>maven-surefire-plugin</artifactId> | ||
| <version>2.18.1</version> | ||
| </plugin> | ||
|
|
||
| <plugin> | ||
| <artifactId>maven-enforcer-plugin</artifactId> | ||
| <version>1.3.1</version> | ||
| <executions> | ||
| <execution> | ||
| <id>enforce</id> | ||
| <phase>none</phase> | ||
| </execution> | ||
| </executions> | ||
| </plugin> | ||
|
|
||
| <plugin> | ||
| <artifactId>maven-dependency-plugin</artifactId> | ||
| <version>2.8</version> | ||
| <executions> | ||
| <execution> | ||
| <id>copy-dependencies</id> | ||
| <phase>package</phase> | ||
| <goals> | ||
| <goal>copy-dependencies</goal> | ||
| </goals> | ||
| <configuration> | ||
| <outputDirectory>${project.build.directory}/../../interpreter/file</outputDirectory> | ||
| <overWriteReleases>false</overWriteReleases> | ||
| <overWriteSnapshots>false</overWriteSnapshots> | ||
| <overWriteIfNewer>true</overWriteIfNewer> | ||
| <includeScope>runtime</includeScope> | ||
| </configuration> | ||
| </execution> | ||
| <execution> | ||
| <id>copy-artifact</id> | ||
| <phase>package</phase> | ||
| <goals> | ||
| <goal>copy</goal> | ||
| </goals> | ||
| <configuration> | ||
| <outputDirectory>${project.build.directory}/../../interpreter/file</outputDirectory> | ||
| <overWriteReleases>false</overWriteReleases> | ||
| <overWriteSnapshots>false</overWriteSnapshots> | ||
| <overWriteIfNewer>true</overWriteIfNewer> | ||
| <!--<includeScope>runtime</includeScope>--> | ||
| <artifactItems> | ||
| <artifactItem> | ||
| <groupId>${project.groupId}</groupId> | ||
| <artifactId>${project.artifactId}</artifactId> | ||
| <version>${project.version}</version> | ||
| <type>${project.packaging}</type> | ||
| </artifactItem> | ||
| </artifactItems> | ||
| </configuration> | ||
| </execution> | ||
| </executions> | ||
| </plugin> | ||
| </plugins> | ||
| </build> | ||
|
|
||
| </project> |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,169 @@ | ||
| /** | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| package org.apache.zeppelin.file; | ||
|
|
||
| import org.apache.zeppelin.interpreter.Interpreter; | ||
| import org.apache.zeppelin.interpreter.InterpreterContext; | ||
| import org.apache.zeppelin.interpreter.InterpreterResult; | ||
| import org.apache.zeppelin.interpreter.InterpreterResult.Code; | ||
| import org.apache.zeppelin.interpreter.InterpreterResult.Type; | ||
| import org.apache.zeppelin.scheduler.Scheduler; | ||
| import org.apache.zeppelin.scheduler.SchedulerFactory; | ||
| import org.slf4j.Logger; | ||
| import org.slf4j.LoggerFactory; | ||
| import java.nio.file.Path; | ||
| import java.nio.file.Paths; | ||
| import java.util.*; | ||
|
|
||
| /** | ||
| * File interpreter for Zeppelin. | ||
| * | ||
| */ | ||
| public abstract class FileInterpreter extends Interpreter { | ||
| Logger logger = LoggerFactory.getLogger(FileInterpreter.class); | ||
| String currentDir = null; | ||
| CommandArgs args = null; | ||
|
|
||
| public FileInterpreter(Properties property) { | ||
| super(property); | ||
| currentDir = "/"; | ||
| } | ||
|
|
||
| /** | ||
| * Handling the arguments of the command | ||
| */ | ||
| public class CommandArgs { | ||
| public String input = null; | ||
| public String command = null; | ||
| public ArrayList<String> args = null; | ||
| public HashSet<Character> flags = null; | ||
|
|
||
| public CommandArgs(String cmd) { | ||
| input = cmd; | ||
| args = new ArrayList<String>(); | ||
| flags = new HashSet<Character>(); | ||
| } | ||
|
|
||
| private void parseArg(String arg) { | ||
| if (arg.charAt(0) == '-') { // handle flags | ||
| for (int i = 1; i < arg.length(); i++) { // start at 1 to skip the leading '-' | ||
| Character c = arg.charAt(i); | ||
| flags.add(c); | ||
| } | ||
| } else { // handle other args | ||
| args.add(arg); | ||
| } | ||
| } | ||
|
|
||
| public void parseArgs() { | ||
| if (input == null) | ||
| return; | ||
| StringTokenizer st = new StringTokenizer(input); | ||
| if (st.hasMoreTokens()) { | ||
| command = st.nextToken(); | ||
| while (st.hasMoreTokens()) | ||
| parseArg(st.nextToken()); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Functions that each file system implementation must override | ||
|
|
||
| public abstract String listAll(String path); | ||
|
|
||
| public abstract boolean isDirectory(String path); | ||
|
|
||
| // Combine paths, takes care of arguments such as .. | ||
|
|
||
| private String getNewPath(String argument){ | ||
| Path arg = Paths.get(argument); | ||
| Path ret = arg.isAbsolute() ? arg : Paths.get(currentDir, argument); | ||
| return ret.normalize().toString(); | ||
| } | ||
|
|
||
| // Handle commands uniformly across all file systems | ||
|
|
||
| @Override | ||
| public InterpreterResult interpret(String cmd, InterpreterContext contextInterpreter) { | ||
| logger.info("Run File command '" + cmd + "'"); | ||
|
|
||
| args = new CommandArgs(cmd); | ||
| args.parseArgs(); | ||
|
|
||
| if (args.command == null) { | ||
| logger.info("Error: No command"); | ||
| return new InterpreterResult(Code.ERROR, Type.TEXT, "No command"); | ||
| } | ||
|
|
||
| // Simple parsing of the command | ||
|
|
||
| if (args.command.equals("cd")) { | ||
|
Member
If this is an abstract base class, should the specific commands be handled by the subclass instead of being enforced here?
Author
The current commands are not specific to HDFS. They're ls, cd and pwd, which you'd use in any Linux shell. It might be better to use the same general commands so that the user does not have to learn new commands for each file system. If a file system needs something specific, it can override and still call the base class for the shared functionality. This way parsing is also handled in one place. |
||
|
|
||
| String newPath = !args.args.isEmpty() ? getNewPath(args.args.get(0)) : currentDir; | ||
| if (!isDirectory(newPath)) | ||
| return new InterpreterResult(Code.ERROR, Type.TEXT, "Invalid Directory"); | ||
|
|
||
| currentDir = newPath; | ||
| return new InterpreterResult(Code.SUCCESS, Type.TEXT, "OK"); | ||
|
|
||
| } else if (args.command.equals("ls")) { | ||
|
|
||
| String newPath = !args.args.isEmpty() ? getNewPath(args.args.get(0)) : currentDir; | ||
| if (!isDirectory(newPath)) | ||
| return new InterpreterResult(Code.ERROR, Type.TEXT, "Invalid List Directory"); | ||
|
|
||
| String results = listAll(newPath); | ||
| return new InterpreterResult(Code.SUCCESS, Type.TEXT, results); | ||
|
|
||
| } else if (args.command.equals("pwd")) { | ||
|
|
||
| return new InterpreterResult(Code.SUCCESS, Type.TEXT, currentDir); | ||
|
|
||
| } else { | ||
|
|
||
| return new InterpreterResult(Code.ERROR, Type.TEXT, "Unknown command"); | ||
|
|
||
| } | ||
| } | ||
|
|
||
| @Override | ||
| public void cancel(InterpreterContext context) { | ||
| } | ||
|
|
||
| @Override | ||
| public FormType getFormType() { | ||
| return FormType.SIMPLE; | ||
| } | ||
|
|
||
| @Override | ||
| public int getProgress(InterpreterContext context) { | ||
| return 0; | ||
| } | ||
|
|
||
| @Override | ||
| public Scheduler getScheduler() { | ||
| return SchedulerFactory.singleton().createOrGetFIFOScheduler( | ||
| FileInterpreter.class.getName() + this.hashCode()); | ||
| } | ||
|
|
||
| @Override | ||
| public List<String> completion(String buf, int cursor) { | ||
| return null; | ||
| } | ||
| } | ||
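The design discussed in the review thread (shared `cd`/`ls`/`pwd` handling in the base class, with file systems supplying only `listAll` and `isDirectory`) can be sketched without any Zeppelin dependencies. This is a simplified illustration, not the PR's code: results are plain strings rather than `InterpreterResult`, flags are ignored, and `FileShellSketch`, `InMemoryShellSketch` and the `count` command are hypothetical names introduced here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.TreeSet;

// Simplified stand-in for the abstract base class: parsing and dispatch live here.
abstract class FileShellSketch {
    protected String currentDir = "/";

    // Hooks that each file system implementation supplies.
    abstract String listAll(String path);
    abstract boolean isDirectory(String path);

    // Resolve an argument against currentDir, collapsing "." and "..".
    String resolve(String argument) {
        Path arg = Paths.get(argument);
        Path ret = arg.isAbsolute() ? arg : Paths.get(currentDir, argument);
        return ret.normalize().toString();
    }

    String interpret(String cmd) {
        StringTokenizer st = new StringTokenizer(cmd == null ? "" : cmd);
        if (!st.hasMoreTokens()) {
            return "ERROR: No command";
        }
        String command = st.nextToken();
        String target = currentDir;
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            if (!token.startsWith("-")) { // flags such as -l are ignored in this sketch
                target = resolve(token);
                break;
            }
        }
        if (command.equals("pwd")) {
            return currentDir;
        } else if (command.equals("cd")) {
            if (!isDirectory(target)) return "ERROR: Invalid Directory";
            currentDir = target;
            return "OK";
        } else if (command.equals("ls")) {
            if (!isDirectory(target)) return "ERROR: Invalid List Directory";
            return listAll(target);
        }
        return "ERROR: Unknown command";
    }
}

// Hypothetical file system backed by a fixed, sorted set of directory paths.
class InMemoryShellSketch extends FileShellSketch {
    private final TreeSet<String> dirs =
        new TreeSet<String>(Arrays.asList("/", "/tmp", "/user", "/user/zeppelin"));

    @Override
    boolean isDirectory(String path) {
        return dirs.contains(path);
    }

    @Override
    String listAll(String path) {
        // Collect the names of the directories whose parent is the given path.
        List<String> names = new ArrayList<String>();
        for (String dir : dirs) {
            Path parent = Paths.get(dir).getParent();
            if (parent != null && parent.toString().equals(path)) {
                names.add(Paths.get(dir).getFileName().toString());
            }
        }
        return String.join("\n", names);
    }

    @Override
    String interpret(String cmd) {
        // Hypothetical file-system-specific command; everything else is shared.
        if (cmd != null && cmd.trim().equals("count")) {
            return String.valueOf(dirs.size());
        }
        return super.interpret(cmd);
    }
}
```

In this sketch the subclass adds one command of its own and delegates the rest to `super.interpret`, which is the override-and-call-base pattern the author describes. The path handling assumes a Unix-style default file system, as `Paths.get` is platform dependent.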
shouldn't this be ../docs/ like the other ones?
Thanks for looking at the change again. The real ones are different from the missing ones (../docs/pleasecontribute.html). Below is a larger snippet that should help:
edited - to show text
ah, got it. thanks