BETTER Hackathon 2020 Exercise 4: Semantic Geo-Clustering with SANSA

This exercise deploys a preconfigured Docker image on Linux.

Requirements

  • Docker Engine >= 1.13.0
  • docker-compose >= 1.10.0
  • Around 10 GB of disk space for Docker images
  • Around 8 GB of RAM on the host machine, with 4 GB available to the Docker containers
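
The version requirements above can be checked before the hackathon with a short script. This is a minimal sketch, not part of the repository: the helper names `version_ge` and `check_tool` are made up for illustration, and it assumes a `sort` that supports `-V` (version sort, as in GNU coreutils).

```shell
#!/bin/sh
# Sketch: compare installed tool versions against the minimums listed above.
# version_ge and check_tool are hypothetical helpers, not part of this repo.

# version_ge A B: succeeds when dotted version A is >= B.
# Relies on `sort -V`, so that 1.10.0 correctly sorts after 1.9.0.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME MINIMUM VERSION: report whether VERSION satisfies MINIMUM.
# An empty VERSION means the tool was not found or did not respond.
check_tool() {
    if [ -z "$3" ]; then
        echo "$1: not found"
    elif version_ge "$3" "$2"; then
        echo "$1 $3: ok (>= $2)"
    else
        echo "$1 $3: too old (need >= $2)"
    fi
}

# Ask each tool for its version; leave the variable empty on any failure.
docker_version=$(docker version --format '{{.Server.Version}}' 2>/dev/null || true)
compose_version=$(docker-compose version --short 2>/dev/null || true)

check_tool "Docker Engine" 1.13.0 "$docker_version"
check_tool "docker-compose" 1.10.0 "$compose_version"
```

The script only reports; it does not exit with an error, so it is safe to run on a machine where Docker is not installed yet.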

Preparation before the Hackathon

  1. Clone the Hackathon Project from here: https://github.com/ec-better/hackathon-2020-semanticgeoclustering

  2. The hackathon requires Python, SANSA, Hadoop, Apache Spark, and Apache Zeppelin. Do not worry: they are all preinstalled in a Docker image, ready to download and use.

  3. To use the Docker image you will need:

  1. After installation, configure Docker:
sudo usermod -aG docker %username%

Replace %username% with your login name. This lets you run docker commands without the sudo prefix, which is required for running the make targets.
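
A new group membership normally only takes effect in a new login session, so it can be worth confirming that the docker group is active before running the make targets. The following is a small sketch under that assumption; `in_group` is a hypothetical helper, not something the repository provides.

```shell
#!/bin/sh
# Sketch: check whether the current session already has the docker group.
# in_group is a hypothetical helper introduced for this example.

# in_group GROUP_LIST GROUP: succeeds when GROUP is one of the
# space-separated names in GROUP_LIST (whole-word match only).
in_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# `id -nG` prints the group names of the current process.
if in_group "$(id -nG)" docker; then
    echo "docker group: active in this session"
else
    echo "docker group: not active; log out and back in, or run 'newgrp docker'"
fi
```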

Get the hackathon jar file (requires wget):

make

Start the cluster (this downloads the BDE Docker images and will take a while):

make up

To load the data into your cluster, run:

make load-data

You are now ready for the hackathon!
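
The three make steps above can also be chained so that a failure in one step stops the sequence instead of silently continuing. This is only a sketch; `run_steps` is a hypothetical helper and not one of the repository's make targets.

```shell
#!/bin/sh
# Sketch: run setup commands in order and stop at the first failure.
# run_steps is a hypothetical helper introduced for this example.

# run_steps CMD...: run each command string with `sh -c`, in order.
# Returns 1 as soon as one command fails; later commands are skipped.
run_steps() {
    for step in "$@"; do
        echo "==> $step"
        sh -c "$step" || { echo "step failed: $step" >&2; return 1; }
    done
}
```

From the root of the cloned repository, the preparation sequence would then be `run_steps "make" "make up" "make load-data"`.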

Starting page on the day of the hackathon

When start-up is done you will be able to access the following interfaces:

Open Zeppelin, create a new notebook, and wait for the moderator to start the session.

Notes

To restart Zeppelin without restarting the whole stack:

make restart

Stop the whole stack:

make down

Executing the hackathon from the command line

It is also possible to execute the applications from the command line. Get the SANSA-Examples jar and start the cluster if you have not already done so:

make
make up
make load-data

Repository info.

  • The instructions in this repo were tested on Ubuntu 18.04 and macOS 10.15.5 with Docker Engine 17.03 and Docker Engine 19.03.13.

  • This repository holds a docker-compose.yml for running a Hadoop/Spark cluster locally.

  • The cluster also includes Hue for browsing HDFS and copying files to it.

  • The notebooks are created and run using Apache Zeppelin.
