
DS4DM-Backend

The DS4DM-Backend is a webservice that works in conjunction with the Data Search for Data Mining (DS4DM) RapidMiner Extension. The memory-intensive and processing-intensive functionality of the DS4DM RapidMiner Extension has been outsourced to the DS4DM Backend. This includes various data searches, data pre-processing functions, and data repository management functions. For more information, please refer to the website of the DS4DM Backend.

Setup Option 1 - Installing on a computer

  1. Download this GitHub repository
  2. Go to the releases page (https://github.com/BenediktKleppmann/DS4DM-Backend/releases) and download the three jar-files: "CreateCorrespondenceFiles-0.0.1-SNAPSHOT.jar", "CreateLuceneIndex-0.0.1-SNAPSHOT-jar-with-dependencies.jar" and "winter-1.0-jar-with-dependencies.jar".
    Copy them to the following location in the downloaded GitHub repository: DS4DM-Backend\DS4DM-Webservice\DS4DM_webservice\lib
  3. Make sure that the environment variable JAVA_HOME points to a JDK 8 folder (jdk_8...).
  4. Open a terminal and execute:
    cd <path_to_downloaded_folder>/DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice
    java -Xms1024m -Xmx1024m -XX:MetaspaceSize=64m -XX:MaxMetaspaceSize=256m -jar activator-launch-1.2.12.jar "run -Dhttp.port=9004"

  5. In your RapidMiner process, set the url parameter of the Data Search operator to "http://localhost:9004".
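
Before pointing RapidMiner at the backend, it can help to confirm that something is answering on the configured port. The following is a minimal sketch, not part of this repository: the class name `BackendCheck` and the method `isReachable` are made up for illustration, and the check assumes that any HTTP response (even an error status) means the webservice is up. It only uses JDK 8 classes.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of DS4DM-Backend.
class BackendCheck {

    /** Returns true if any HTTP response comes back from baseUrl within the timeout. */
    public static boolean isReachable(String baseUrl, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(baseUrl).openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            conn.setRequestMethod("GET");
            // Any status code (200, 404, ...) means a server answered the request.
            int status = conn.getResponseCode();
            conn.disconnect();
            return status > 0;
        } catch (IOException e) {
            // Connection refused, timeout, or malformed URL: treat as not reachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default matches the port passed via -Dhttp.port in the launch command.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:9004";
        boolean up = isReachable(baseUrl, 3000);
        System.out.println(baseUrl + (up ? " is reachable" : " is not reachable"));
    }
}
```

Compile with `javac BackendCheck.java` and run with `java BackendCheck`, optionally passing a different base URL as the first argument.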

Setup Option 2 - Running a virtual machine

  1. Download the virtual machine image 'Ubuntu Server 16.04.4 (32bit).vdi' from the project's Google Drive repository.
  2. Launch the virtual machine.
  3. Log in as user 'osboxes.org' with password 'osboxes.org'.
  4. Open a terminal and execute the following commands:
    cd /home/osboxes/Desktop/DS4DM-Backend-master/DS4DM-Webservice/DS4DM_webservice
    java -Xms1024m -Xmx1024m -XX:MetaspaceSize=64m -XX:MaxMetaspaceSize=256m -jar activator-launch-1.2.12.jar "run -Dhttp.port=9004"

Technical description of the Backend components

DS4DM-CreateCorrespondences

This backend component contains methods for finding and creating correspondences between tables. These methods are used by DS4DM-Webservice (the main backend component). For this, the CreateCorrespondenceFiles Maven project is compiled into a jar file. The jar file with dependencies is saved to the folder DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/lib/ and added to the Build Path of the DS4DM_webservice Maven project.

DS4DM-CreateLuceneIndex

This backend component contains methods for indexing tables. These methods are also used by the DS4DM-Webservice. As with CreateCorrespondences, the CreateLuceneIndex-maven-project is compiled to a jar file and the jar file with dependencies is saved to DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/lib/, from where it is included in the Build Path of DS4DM_webservice.
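
Since the webservice depends on these jar files being present in its lib folder, a quick presence check can save a failed start. The sketch below is illustrative only: the class `LibCheck` and the method `missingJars` are hypothetical names, and the jar file names are the three listed in the setup steps of this README. It only uses JDK 8 classes.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of DS4DM-Backend.
class LibCheck {

    // Jar names as given in the setup steps of this README.
    public static final List<String> REQUIRED_JARS = Arrays.asList(
            "CreateCorrespondenceFiles-0.0.1-SNAPSHOT.jar",
            "CreateLuceneIndex-0.0.1-SNAPSHOT-jar-with-dependencies.jar",
            "winter-1.0-jar-with-dependencies.jar");

    /** Returns the required jar names that are not regular files in libDir. */
    public static List<String> missingJars(Path libDir) {
        List<String> missing = new ArrayList<>();
        for (String jar : REQUIRED_JARS) {
            if (!Files.isRegularFile(libDir.resolve(jar))) {
                missing.add(jar);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Default path is relative to the folder that contains the downloaded repository.
        Path libDir = Paths.get(args.length > 0
                ? args[0]
                : "DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/lib");
        List<String> missing = missingJars(libDir);
        if (missing.isEmpty()) {
            System.out.println("All required jar files found in " + libDir);
        } else {
            System.out.println("Missing in " + libDir + ": " + missing);
        }
    }
}
```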

DS4DM-Webservice

This is the main backend component. The Maven project is structured according to the Java Play framework guidelines. This allows the program activator-launch-1.2.12.jar to provide an API endpoint which calls various methods in this backend component. The file DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/conf/routes specifies which API calls are possible and which methods they call. The majority of the called methods are in the class DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/app/controllers/ExtendTable.java. All of the executed code is in the folder DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/app.

The DS4DM-Webservice uses repositories of tables. These repositories are in the folder DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/public/repositories. Each repository has one folder containing csv tables, one folder containing indexes, and another folder containing correspondences, as well as a file with repository statistics.
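
To see which repositories are installed and what each one contains, the repositories folder can be listed programmatically. The following sketch is an illustration rather than code from this project: the class `RepositoryOverview` and the method `describe` are hypothetical, and the code makes no assumption about the actual names of the subfolders or the statistics file. It simply reports whatever it finds one level below each repository folder, using only JDK 8 classes.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of DS4DM-Backend.
class RepositoryOverview {

    /** Maps each repository folder name to the sorted names of its direct children. */
    public static Map<String, List<String>> describe(Path repositoriesDir) throws IOException {
        Map<String, List<String>> overview = new TreeMap<>();
        try (DirectoryStream<Path> repos = Files.newDirectoryStream(repositoriesDir)) {
            for (Path repo : repos) {
                if (!Files.isDirectory(repo)) {
                    continue; // only folders count as repositories here
                }
                List<String> children = new ArrayList<>();
                try (DirectoryStream<Path> entries = Files.newDirectoryStream(repo)) {
                    for (Path entry : entries) {
                        String name = entry.getFileName().toString();
                        // Mark subfolders with a trailing slash to tell them apart from files.
                        children.add(Files.isDirectory(entry) ? name + "/" : name);
                    }
                }
                Collections.sort(children);
                overview.put(repo.getFileName().toString(), children);
            }
        }
        return overview;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0
                ? args[0]
                : "DS4DM-Backend/DS4DM-Webservice/DS4DM_webservice/public/repositories");
        for (Map.Entry<String, List<String>> e : describe(dir).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

A repository that follows the layout described above should show three subfolders and one file in its entry.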

Evaluation tables

This isn't a backend component, but a collection of the csv files that were used for the evaluations. For more information on the evaluations, please refer to http://web.informatik.uni-mannheim.de/ds4dm/#evaluation.

Other Resources: