Setting up a Schedoscope Project

Utz Westermann edited this page May 8, 2018 · 93 revisions

Schedoscope is an internal Scala DSL for specifying views (Hive table partitions), their structure and dependencies, as well as the transformation logic required to compute views from other views. As a consequence, setting up a Schedoscope project means setting up a Scala project that uses Schedoscope as a library.

For this purpose, we provide a Maven POM template in this section. It is of course possible to use other build tools such as SBT or Ant/Ivy. You are also encouraged to take a look at the POM of the tutorial.

To run Schedoscope, the template uses the exec Maven plugin, which assembles a classpath from the Maven dependencies and launches the Schedoscope REST service right out of the project folder.

In real-world production deployments, you should probably follow a different [bundling and deployment](Bundling and Deploying) and [launching strategy](Starting Schedoscope).

Maven POM Template

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>my-projects-group-id</groupId>
    <artifactId>my-project</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>Schedoscope POM template</name>
    <description>A template making it easier for you to set up a Schedoscope project</description>

    <dependencies>
        <dependency>
            <groupId>schedoscope</groupId>
            <artifactId>schedoscope-core</artifactId>
            <version>0.10.2</version>
        </dependency>

        <!-- If you need Oozie transformations, add the following dependency -->
        <dependency>
            <groupId>schedoscope</groupId>
            <artifactId>schedoscope-transformation-oozie</artifactId>
            <version>0.10.2</version>
        </dependency>

        <!-- If you need Pig transformations, add the following dependency -->
        <dependency>
            <groupId>schedoscope</groupId>
            <artifactId>schedoscope-transformation-pig</artifactId>
            <version>0.10.2</version>
        </dependency>

        <!-- If you need Spark transformations, add the following dependency -->
        <dependency>
            <groupId>schedoscope</groupId>
            <artifactId>schedoscope-transformation-spark</artifactId>
            <version>0.10.2</version>
        </dependency>

        <dependency>
            <groupId>org.scalatest</groupId>
            <artifactId>scalatest_2.11</artifactId>
            <version>2.2.5</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>hadoop-launcher</groupId>
            <artifactId>hadoop-launcher</artifactId>
            <version>0.1.1</version>
            <scope>test</scope>
        </dependency>

        <!-- For tests of Oozie transformations, add this dependency -->
        <dependency>
            <groupId>minioozie</groupId>
            <artifactId>minioozie</artifactId>
            <version>1.2.4</version>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-source-plugin</artifactId>
                <version>2.4</version>
                <executions>
                    <execution>
                        <id>attach-sources</id>
                        <goals>
                            <goal>jar</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.19</version>
                <configuration>
                    <skipTests>true</skipTests>
                </configuration>
            </plugin>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                            <goal>doc-jar</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <scalaVersion>2.11.11</scalaVersion>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.scalatest</groupId>
                <artifactId>scalatest-maven-plugin</artifactId>
                <version>1.0</version>
                <configuration>
                    <reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
                    <junitxml>.</junitxml>
                    <filereports>WDF TestSuite.txt</filereports>
                    <argLine>-Xmx1024m -XX:MaxPermSize=512M</argLine>
                    <environmentVariables>
                        <HADOOP_HOME>${project.build.directory}/hadoop</HADOOP_HOME>
                    </environmentVariables>
                </configuration>
                <executions>
                    <execution>
                        <id>test</id>
                        <goals>
                            <goal>test</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <version>1.4.0</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>java</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <mainClass>org.schedoscope.scheduler.rest.server.SchedoscopeRestService</mainClass>
                    <arguments>
                        <argument>--shell</argument>
                    </arguments>
                    <systemProperties>
                        <systemProperty>
                            <key>config.file</key>
                            <value>src/main/resources/schedoscope.conf</value>
                        </systemProperty>
                    </systemProperties>
                    <additionalClasspathElements>
                        <additionalClasspathElement>/etc/hadoop/conf</additionalClasspathElement>
                        <additionalClasspathElement>/etc/hive/conf</additionalClasspathElement>
                        <additionalClasspathElement>target/${project.build.finalName}-mapreduce.jar</additionalClasspathElement>
                        <additionalClasspathElement>target/${project.build.finalName}-hive.jar</additionalClasspathElement>
                    </additionalClasspathElements>
                </configuration>
            </plugin>
        </plugins>
    </build>

    <repositories>
        <repository>
            <id>otto-bintray</id>
            <url>https://dl.bintray.com/ottogroup/maven</url>
        </repository>
    </repositories>
</project> 
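
The template repeats the Schedoscope version in every module dependency. One way to keep those versions in sync is a Maven property, sketched below as a fragment to merge into the POM. The property name `schedoscope.version` is a hypothetical choice, not something Schedoscope prescribes; any name works as long as the references match.

```xml
<properties>
    <!-- Hypothetical property name; defined once, referenced by each Schedoscope dependency -->
    <schedoscope.version>0.10.2</schedoscope.version>
</properties>

<dependencies>
    <dependency>
        <groupId>schedoscope</groupId>
        <artifactId>schedoscope-core</artifactId>
        <version>${schedoscope.version}</version>
    </dependency>

    <!-- Optional transformation modules pick up the same version -->
    <dependency>
        <groupId>schedoscope</groupId>
        <artifactId>schedoscope-transformation-oozie</artifactId>
        <version>${schedoscope.version}</version>
    </dependency>
</dependencies>
```

With this in place, upgrading Schedoscope means changing a single line.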

In case you want to use [JDBC exports](JDBC Exports), you should also add the JDBC driver of your database to your POM.
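
As a minimal sketch, a driver dependency could look like the following, assuming the export target is a PostgreSQL database. Both the choice of driver and the version shown are illustrative assumptions; substitute the coordinates of the driver for your own database.

```xml
<!-- Assumption: the export target is PostgreSQL; swap in the driver for your database -->
<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <!-- Illustrative version; pick one compatible with your database server -->
    <version>42.2.2</version>
</dependency>
```

The element goes inside the `<dependencies>` section of the template, next to the Schedoscope dependencies.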

As a reminder, a Maven project folder structure looks like this:

project
|
+-- src
|   |
|   +-- main
|   |   |
|   |   +-- scala
|   |   |
|   |   +-- resources
|   |
|   +-- test
|       |
|       +-- scala
|       |
|       +-- resources
|   
+-- pom.xml