IMPRO-3 (SS14)

This project contains a set of machine learning algorithms implemented on top of Scala, Stratosphere, and Spark by master students attending the "Big Data Analytics Project" at FG DIMA, TU Berlin in the 2014 spring term.

Instructions for Contributors

Master students contributing to the project should follow the instructions below:

1. Clone the code with Git

Use your group repository as «origin» and the main repository as «upstream»:

export GROUP=GXX # configure your group number, e.g. G07
git clone git@github.com:TU-Berlin-DIMA/IMPRO-3.SS14.${GROUP}.git
cd IMPRO-3.SS14.${GROUP}
git remote add upstream git@github.com:TU-Berlin-DIMA/IMPRO-3.SS14.git
git fetch upstream

Set up and push an appropriate branch structure (this should be done only once per group):

git checkout -b dev_scala
git checkout -b dev_stratosphere
git checkout -b dev_spark
git push origin dev_scala
git push origin dev_stratosphere
git push origin dev_spark

Each of the other group members can then simply check out the existing branches:

git checkout -b dev_scala origin/dev_scala
git checkout -b dev_stratosphere origin/dev_stratosphere
git checkout -b dev_spark origin/dev_spark

2. Import the project into your IDE

We recommend using either IntelliJ or Eclipse. To enable auto-completion and syntax highlighting for the Scala code in your project, make sure you have the appropriate Scala plugin installed.

Project dependencies and the build lifecycle are configured via Maven, so the easiest way to set up the project in your IDE is to point the Maven importer to the local Git clone location.
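
Because the build is driven by Maven, it can be useful to check that a module compiles and its tests pass outside the IDE. The sketch below is only an illustration: it assumes Maven is installed and on your PATH, and that the top-level directories (for example impro3-ss14-scala) are modules of the root pom.xml. The helper name build_module is hypothetical.

```shell
# Build a single module and run its unit tests from the command line.
# Assumption: the directory passed in is a Maven module of the root pom.xml.
build_module() {
  local module="$1"   # e.g. impro3-ss14-scala (assumed module name)
  # -pl selects the module, -am also builds the modules it depends on
  mvn -pl "$module" -am clean test
}

# Example (run from the root of the Git clone):
#   build_module impro3-ss14-scala
```

If the directories turn out not to be modules, running `mvn clean test` from the root of the clone builds everything instead.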

3. Contribute code

In the course of the spring term, each group should provide unit-tested implementations of one machine learning algorithm for Scala, Stratosphere, and Spark.

When you develop your code, please follow the workflow below:

Collaborate within the group. Create small commits on the dev_{system} branches and exchange them by pushing to and pulling from «origin»/dev_{system}.
Frequently fetch «upstream»/master and rebase your dev_{system} branch onto it.
Once the algorithm is unit-tested and works, squash all small commits from the dev_{system} branch into one or two commits (e.g. one for the algorithm and one for the unit test) and push them to «origin»/dev_{system}.
Create a pull request from «origin»/dev_{system} to «upstream»/master.
If everything is fine, we will merge your code into «upstream»/master. You can then pull from «upstream»/master and push the merged version into «origin»/master.
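
The workflow above can be sketched as a few shell helpers. This is one possible way to do it, not a prescribed procedure: the function names are hypothetical, the remotes «origin» and «upstream» are assumed to be configured as in step 1, and the squash uses a soft reset that always produces a single commit.

```shell
# Rebase the given dev branch onto the latest upstream/master.
sync_with_upstream() {
  local branch="$1"   # e.g. dev_scala
  git fetch upstream &&
  git checkout "$branch" &&
  git rebase upstream/master
}

# Collapse every commit made since upstream/master into one commit.
# Run sync_with_upstream first so the branch sits on top of upstream/master.
squash_onto_upstream() {
  local branch="$1" message="$2"
  git checkout "$branch" &&
  git reset --soft "$(git merge-base upstream/master HEAD)" &&
  git commit -m "$message"
}

# After the pull request is merged: update master and mirror it to origin.
update_master() {
  git checkout master &&
  git pull upstream master &&
  git push origin master
}

# Example for the Scala branch (the commit message is a placeholder):
#   sync_with_upstream dev_scala
#   squash_onto_upstream dev_scala "Add algorithm and unit test for Scala"
#   git push --force-with-lease origin dev_scala
```

Rebasing and squashing rewrite history, so the push to «origin»/dev_{system} has to be forced; `--force-with-lease` refuses to overwrite commits a teammate pushed in the meantime. To keep two commits (algorithm and unit test) instead of one, `git rebase -i upstream/master` lets you choose which commits to squash.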

4. Contribute your project presentations

Each group should also prepare and update an algorithm presentation for their particular algorithm. Please contribute updates to your slide sets using pull requests.

The current set of presentations can be found below:
