Andrew Krizhanovsky edited this page Jun 4, 2015 · 4 revisions

Introduction

This guide explains how to map the parsed Wiktionary database (MySQL) to an RDF view with D2RQ. The D2RQ platform lets applications access an RDF view of a non-RDF database via the SPARQL Protocol.

Setup

Download the latest parsed Wiktionary database (e.g. the file enwikt20101030_parsed.7z in the Downloads section), unpack it, and load it into your local MySQL database:

mysql> CREATE DATABASE enwikt20101030_parsed;
mysql> USE enwikt20101030_parsed;
mysql> SOURCE c:\enwikt20101030_parsed.sql

Create a MySQL user and grant it read access to the parsed Wiktionary database:

mysql> CREATE USER rdfmapper;
mysql> GRANT SELECT ON enwikt20101030_parsed.* TO rdfmapper@'%';
mysql> FLUSH PRIVILEGES;

Download and install D2RQ.

Generate the .n3 mapping file from the schema of the MySQL database:

cd C:\w\d2r-server-0.7\
generate-mapping -o mapping-wikt_parsed.n3 -u rdfmapper jdbc:mysql://localhost/enwikt20101030_parsed

You can find an example .n3 file in the Downloads section (file mapping-enwikt20101030_parsed_en.7z).

Edit the .n3 mapping file as follows:

  • delete the lines containing index_ith.lat.foreign_has_definition (outdated);
  • change vocab to wikpa (WIktionary PArsed database) everywhere, e.g. with the Vim substitute command:
%s/vocab/wikpa/g
  • add the following lines:
map:Configuration a d2rq:Configuration;
    d2rq:resultSizeLimit 33;
    .

Run the server:

cd C:\w\d2r-server-0.7\
d2r-server mapping-wikt_parsed.n3

Open the server's start page in a web browser:

http://localhost:2020

SPARQL endpoint

You can test the SPARQL examples presented below against this endpoint.
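Since the server speaks the SPARQL Protocol, the examples can also be sent over plain HTTP. The following is a minimal sketch, not part of the WikPaSPARQL project. It assumes Java 11 or later and that the endpoint is available at http://localhost:2020/sparql (the usual D2R Server location; adjust it if your setup differs). The class and method names are made up for illustration.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical helper class, not part of D2RQ or WikPaSPARQL.
class SparqlEndpointClient {

    // Assumption: default endpoint of a locally running D2R Server.
    static final String ENDPOINT = "http://localhost:2020/sparql";

    /** Builds the GET URI used by the SPARQL Protocol: endpoint?query=<url-encoded query>. */
    static URI buildQueryUri(String endpoint, String sparql) {
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return URI.create(endpoint + "?query=" + encoded);
    }

    /** Sends the query and returns the raw response body (SPARQL results in XML). */
    static String run(String endpoint, String sparql) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(buildQueryUri(endpoint, sparql))
                .header("Accept", "application/sparql-results+xml")
                .timeout(Duration.ofSeconds(30)) // give up if the server does not answer in time
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("HTTP " + response.statusCode() + ": " + response.body());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Requires a running server; looks up the identifier of the English language.
        String sparql =
            "PREFIX wikpa: <http://localhost:2020/wikpa/resource/> " +
            "SELECT ?langId WHERE { ?lang wikpa:lang_code \"en\"; wikpa:lang_id ?langId. }";
        System.out.println(run(ENDPOINT, sparql));
    }
}
```

Only buildQueryUri works without a running server; run and main need the D2R Server started as described in Setup.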

WikPaSPARQL - Java project with SPARQL examples

This small Java project, developed in Eclipse, contains SPARQL queries against the parsed Wiktionary database. Download WikPaSPARQL_20110618.7z.

SPARQL examples

Get definition (meaning) by word and language

Let's find all definitions for the English word "dog":

Open the URL: http://localhost:2020/snorql/

Paste the SPARQL request:

SELECT ?langId ?pageId ?langPosId ?meaningId ?wikiTextIdDef ?definition
WHERE {
    ?lang wikpa:lang_code "en";
          wikpa:lang_id ?langId.

    ?page wikpa:page_page_title "dog";
          wikpa:page_id ?pageId.

    ?lang_pos wikpa:lang_pos_page_id ?pageId;
              wikpa:lang_pos_lang_id ?langId;
              wikpa:lang_pos_id ?langPosId.

    ?meaning wikpa:meaning_id ?meaningId;
             wikpa:meaning_lang_pos_id ?langPosId;
             wikpa:meaning_wiki_text_id ?wikiTextIdDef.

    ?wiki_text wikpa:wiki_text_id ?wikiTextIdDef;
             wikpa:wiki_text_text ?definition.
}

You can access the same data from Java code using Jena:

package wikpasparql;

import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;

import de.fuberlin.wiwiss.d2rq.ModelD2RQ;

public class SPARQLExample {

    public static void main(String[] args) {
        // Load the D2RQ mapping file generated in the Setup section.
        ModelD2RQ m = new ModelD2RQ("file:mapping-wikt_parsed.n3");

        // Same joins as the SPARQL request above; only three variables are selected.
        String sparql =
            "PREFIX wikpa: <http://localhost:2020/wikpa/resource/> " +
            "SELECT ?langId ?pageId ?wikiText WHERE {" +
            "   ?lang wikpa:lang_code \"en\"; " +
            "         wikpa:lang_id ?langId. " +

            "   ?page wikpa:page_page_title \"dog\"; " +
            "         wikpa:page_id ?pageId. " +

            "   ?lang_pos wikpa:lang_pos_page_id ?pageId; " +
            "         wikpa:lang_pos_lang_id ?langId; " +
            "         wikpa:lang_pos_id ?langPosId. " +

            "   ?meaning wikpa:meaning_lang_pos_id ?langPosId; " +
            "         wikpa:meaning_wiki_text_id ?wikiTextId. " +

            "   ?wiki_text wikpa:wiki_text_id ?wikiTextId; " +
            "         wikpa:wiki_text_text ?wikiText. " +
            "}";

        Query q = QueryFactory.create(sparql);
        ResultSet rs = QueryExecutionFactory.create(q, m).execSelect();
        // Print one block of three lines per definition found.
        while (rs.hasNext()) {
            QuerySolution row = rs.nextSolution();
            System.out.println("langId: " + row.getLiteral("langId").getString());
            System.out.println("pageId: " + row.getLiteral("pageId").getString());
            System.out.println("wikiText: " + row.getLiteral("wikiText").getString());
        }
    }
}
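The Java example above hard-codes the word "dog" and the language code "en" in the query string. A possible way to reuse the query for other words is sketched below. The class and method names are hypothetical. The escaping step is an addition of this sketch: it keeps a word that contains a quote or backslash from breaking the SPARQL string literal.

```java
// Hypothetical helper, not part of the WikPaSPARQL project.
class DefinitionQuery {

    static final String PREFIX = "PREFIX wikpa: <http://localhost:2020/wikpa/resource/>\n";

    /** Escapes characters that may not appear raw inside a double-quoted SPARQL string literal. */
    static String escapeLiteral(String value) {
        return value.replace("\\", "\\\\")   // backslash first, so later escapes are not doubled
                    .replace("\"", "\\\"")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    /** Builds the "definitions by word and language" query for the given language code and word. */
    static String build(String langCode, String word) {
        return PREFIX
            + "SELECT ?definition WHERE {\n"
            + "  ?lang wikpa:lang_code \"" + escapeLiteral(langCode) + "\";\n"
            + "        wikpa:lang_id ?langId.\n"
            + "  ?page wikpa:page_page_title \"" + escapeLiteral(word) + "\";\n"
            + "        wikpa:page_id ?pageId.\n"
            + "  ?lang_pos wikpa:lang_pos_page_id ?pageId;\n"
            + "            wikpa:lang_pos_lang_id ?langId;\n"
            + "            wikpa:lang_pos_id ?langPosId.\n"
            + "  ?meaning wikpa:meaning_lang_pos_id ?langPosId;\n"
            + "           wikpa:meaning_wiki_text_id ?wikiTextId.\n"
            + "  ?wiki_text wikpa:wiki_text_id ?wikiTextId;\n"
            + "             wikpa:wiki_text_text ?definition.\n"
            + "}";
    }

    public static void main(String[] args) {
        // Prints the query text; pass it to QueryFactory.create(...) as in the example above.
        System.out.println(build("en", "dog"));
    }
}
```

The string returned by build can be passed to QueryFactory.create in place of the hard-coded sparql variable.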

Get list of synonyms

There are two ways to find synonyms in the parsed English Wiktionary database:

  • direct way - from an entry to the words listed in its Synonyms section;
  • reverse way - from a synonym to entries, i.e. the list of entries that contain this word in their Synonyms section.

1. From entry to synonyms

Let's get the synonyms of the Swedish (code sv) word ordbok.

SPARQL request:

SELECT ?langId ?pageId ?langPosId ?meaningId ?relationTypeId ?wikiTextIdRel ?relationWord
WHERE {
    ?lang wikpa:lang_code "sv";
          wikpa:lang_id ?langId.

    ?page wikpa:page_page_title "ordbok";
          wikpa:page_id ?pageId.

    ?lang_pos wikpa:lang_pos_page_id ?pageId;
              wikpa:lang_pos_lang_id ?langId;
              wikpa:lang_pos_id ?langPosId.

    ?meaning wikpa:meaning_id ?meaningId;
             wikpa:meaning_lang_pos_id ?langPosId.

    ?relation_type wikpa:relation_type_name "synonyms";
                   wikpa:relation_type_id ?relationTypeId.

    ?relation wikpa:relation_meaning_id ?meaningId;
              wikpa:relation_relation_type_id ?relationTypeId;
              wikpa:relation_wiki_text_id ?wikiTextIdRel.

    ?wiki_text wikpa:wiki_text_id ?wikiTextIdRel;
              wikpa:wiki_text_text ?relationWord.
}

2. From synonym to entries

Let's get the list of entries that contain a given word (e.g. the English phrase computer language) in their Synonyms section.

SPARQL request:

SELECT ?langId ?pageId ?langPosId ?meaningId ?relationTypeId ?wikiTextIdRel ?entry
WHERE {
    ?lang wikpa:lang_code "en";
          wikpa:lang_id ?langId.

    ?page wikpa:page_page_title ?entry;
          wikpa:page_id ?pageId.

    ?lang_pos wikpa:lang_pos_page_id ?pageId;
              wikpa:lang_pos_lang_id ?langId;
              wikpa:lang_pos_id ?langPosId.

    ?meaning wikpa:meaning_id ?meaningId;
             wikpa:meaning_lang_pos_id ?langPosId.

    ?relation_type wikpa:relation_type_name "synonyms";
                   wikpa:relation_type_id ?relationTypeId.

    ?relation wikpa:relation_meaning_id ?meaningId;
              wikpa:relation_relation_type_id ?relationTypeId;
              wikpa:relation_wiki_text_id ?wikiTextIdRel.

    ?wiki_text wikpa:wiki_text_id ?wikiTextIdRel;
              wikpa:wiki_text_text "computer language".
}

Result:

entry=language
entry=programming language

Translate word from one language to another

There are three ways to find a translation in the parsed English Wiktionary database:

  • English entries:
    • direct translation - from an English word to the words in other languages listed in the Translation section of the entry;
    • reverse translation - from a non-English word in the Translation section to the English title (header) of the entry;
  • non-English entries:
    • from a non-English word (entry title) to the English definition in this entry.

Let's find translations of the English (code en) phrase rain cats and dogs into all languages using the first way (direct translation).

SPARQL request:

SELECT ?langId ?pageId ?langPosId ?meaningId ?translationId ?translationEntryId ?wikiTextIdTrans ?translationWord
WHERE {
    ?lang wikpa:lang_code "en";
          wikpa:lang_id ?langId.

    ?page wikpa:page_page_title "rain cats and dogs";
          wikpa:page_id ?pageId.

    ?lang_pos wikpa:lang_pos_page_id ?pageId;
              wikpa:lang_pos_lang_id ?langId;
              wikpa:lang_pos_id ?langPosId.

    ?meaning wikpa:meaning_id ?meaningId;
             wikpa:meaning_lang_pos_id ?langPosId.

    ?translation wikpa:translation_id ?translationId;
                 wikpa:translation_lang_pos_id ?langPosId;
                 wikpa:translation_meaning_id ?meaningId.

    ?translation_entry wikpa:translation_entry_id ?translationEntryId;
                       wikpa:translation_entry_translation_id ?translationId;
                       wikpa:translation_entry_wiki_text_id ?wikiTextIdTrans.

    ?wiki_text wikpa:wiki_text_id ?wikiTextIdTrans;
              wikpa:wiki_text_text ?translationWord.
}

Let's find translations of the English (code en) word book into French using the first way (direct translation).

SPARQL request:

SELECT ?langIdDest ?langIdSource ?pageId ?langPosId ?meaningId ?translationId ?translationEntryId ?wikiTextIdTrans ?translationWord
WHERE {
    ?langDest wikpa:lang_code "en";
          wikpa:lang_id ?langIdDest.

    ?page wikpa:page_page_title "book";
          wikpa:page_id ?pageId.

    ?lang_pos wikpa:lang_pos_page_id ?pageId;
              wikpa:lang_pos_lang_id ?langIdDest;
              wikpa:lang_pos_id ?langPosId.

    ?meaning wikpa:meaning_id ?meaningId;
             wikpa:meaning_lang_pos_id ?langPosId.

    ?translation wikpa:translation_id ?translationId;
                 wikpa:translation_lang_pos_id ?langPosId;
                 wikpa:translation_meaning_id ?meaningId.

    ?langSource wikpa:lang_code "fr";
          wikpa:lang_id ?langIdSource.

    ?translation_entry wikpa:translation_entry_id ?translationEntryId;
                       wikpa:translation_entry_translation_id ?translationId;
                       wikpa:translation_entry_lang_id ?langIdSource;
                       wikpa:translation_entry_wiki_text_id ?wikiTextIdTrans.

    ?wiki_text wikpa:wiki_text_id ?wikiTextIdTrans;
              wikpa:wiki_text_text ?translationWord.
}

Let's find translations of the French (code fr) word livre into English using the second way (reverse translation).

As I understand it, the system finds the translations in the data extracted from the translation sections of the following Wiktionary pages: pound and book.

SPARQL request:

SELECT ?langIdDest ?englishWord ?langIdSource ?pageId ?langPosId ?meaningId ?translationId ?translationEntryId ?wikiTextIdTrans ?translationWord
WHERE {
    ?langDest wikpa:lang_code "en";
          wikpa:lang_id ?langIdDest.

    ?page wikpa:page_page_title ?englishWord;
          wikpa:page_id ?pageId.

    ?lang_pos wikpa:lang_pos_page_id ?pageId;
              wikpa:lang_pos_lang_id ?langIdDest;
              wikpa:lang_pos_id ?langPosId.

    ?meaning wikpa:meaning_id ?meaningId;
             wikpa:meaning_lang_pos_id ?langPosId.

    ?translation wikpa:translation_id ?translationId;
                 wikpa:translation_lang_pos_id ?langPosId;
                 wikpa:translation_meaning_id ?meaningId.

    ?langSource wikpa:lang_code "fr";
          wikpa:lang_id ?langIdSource.

    ?translation_entry wikpa:translation_entry_id ?translationEntryId;
                       wikpa:translation_entry_translation_id ?translationId;
                       wikpa:translation_entry_lang_id ?langIdSource;
                       wikpa:translation_entry_wiki_text_id ?wikiTextIdTrans.

    ?wiki_text wikpa:wiki_text_id ?wikiTextIdTrans;
              wikpa:wiki_text_text "livre".
}

Problems and questions

  • How to track and stop very long queries?
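One possible client-side approach to the question above, offered as a sketch rather than a tested solution, is to run each query in a worker thread and stop waiting after a timeout. The class and method names are hypothetical. Note the limitation: cancelling only interrupts the Java thread, and whether the underlying SQL query in MySQL actually stops is an open assumption that would need checking.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of D2RQ, Jena or WikPaSPARQL.
class QueryTimeout {

    /**
     * Runs the task in a separate thread and waits at most timeoutMillis.
     * Returns the result, or an empty Optional if the task did not finish in time.
     */
    static <T> Optional<T> runWithTimeout(Callable<T> task, long timeoutMillis)
            throws ExecutionException, InterruptedException {
        // Daemon thread, so a task that ignores interruption cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "query-worker");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            long limit = timeoutMillis;
            System.err.println("Query exceeded " + limit + " ms, cancelling");
            future.cancel(true); // interrupts the worker thread; the database may keep working
            return Optional.empty();
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a slow query: sleeps for five seconds.
        Optional<String> result = runWithTimeout(() -> {
            Thread.sleep(5000);
            return "done";
        }, 200);
        System.out.println(result.isPresent() ? result.get() : "timed out");
    }
}
```

In real use, the Callable would wrap the query execution from the Jena example, for instance a lambda that runs the query and collects the rows into a list.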

See also

  • [SQL examples](SQL examples)