Trim the README.md and add quickstart guide #202

niklas88 · 2019-03-07T11:50:18Z

Pretty aggressive trimming and a very quick, no-details, no-fuss quickstart guide.

floriankramer

I read through everything and marked all language, spelling and grammar errors and potential improvements I noticed. I also added one or two comments as to the context, but did not verify the correctness and completeness of the commands.

floriankramer · 2019-03-11T09:59:00Z

docs/advanced_features.md

+
+On top of the vanilla SPARQL functionality, QLever allows so-called SPARQL+Text
+queries on a text corpus linked to a knowledge base via entity recognition.  For
+example, the following query find all mentions of astronauts next to the words


the following query find all mentions -> the following query finds all mentions

floriankramer · 2019-03-11T09:59:56Z

docs/advanced_features.md

+[here](docs/sparql_plus_text.md).
+
+QLever also supports efficient SPARQL autocompletion.  For example, the
+following query yields a list of all predicates associated with persons in the


persons -> people

floriankramer · 2019-03-11T10:00:17Z

docs/advanced_features.md

+
+QLever also supports efficient SPARQL autocompletion.  For example, the
+following query yields a list of all predicates associated with persons in the
+knowledge base, ordered by the number of persons which have that predicate.


persons -> people

Interestingly it turns out "persons" is okay too and is in fact the more formal (older) form. Still, since we are using C++17 we might as well stick to the more modern form for English as well. So thanks for pointing this out to me!

floriankramer · 2019-03-11T10:01:41Z

docs/advanced_features.md

+    GROUP BY ?predicate
+    ORDER BY DESC(?count)
+
+Note that this query could also be processed by standard SPARQL simply by


As SPARQL is the language, I would replace this with "Note that this query could also be processed by a standard SPARQL engine" or "Note that this query is equivalent to a standard SPARQL query".

floriankramer · 2019-03-11T10:03:01Z

docs/advanced_features.md

+    ORDER BY DESC(?count)
+
+Note that this query could also be processed by standard SPARQL simply by
+replacing the second triple by ?x ?predicate ?object. However, that query is


replacing by -> replacing with. That avoids having two repetitive "by"s. Also, "replace with" is normally used in the active voice (versus "replace by" in the passive voice).

floriankramer · 2019-03-11T10:48:29Z

docs/wikidata.md

+## Build a QLever Index
+
+Now we can build a QLever Index from the `latest-all.ttl` Wikidata Turtle file
+using the `wikidata_settings.json` file for some useful default settings for


... .json file for some ... this sentence is ginormous and should probably be split into two. E.g.:
.json file. The .json file contains some useful...

floriankramer · 2019-03-11T10:51:18Z

docs/wikidata.md

+
+Now we can build a QLever Index from the `latest-all.ttl` Wikidata Turtle file
+using the `wikidata_settings.json` file for some useful default settings for
+relations that can be safely stored on disk because their actual values are


that can be stored on disk safely as their

floriankramer · 2019-03-11T10:54:17Z

docs/wikidata.md

+-u`) is not 1000 you have to make the `./index` folder writable for QLever
+inside the container e.g. by running `chmod -R o+rw ./index`
+
+**Note (1):** This takes about half a day but should be much faster than with most


but should be -> which is still faster than [most] other triple stores. I would consider simply removing that part of the sentence though, as this is a tutorial and not a comparison, and it feels somewhat out of place to me.
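
For reference, the permission step quoted at the top of this hunk (`chmod -R o+rw ./index` when `id -u` is not 1000) could be sketched roughly as follows. This is only an illustration: the variable names are made up for the example, and a temporary directory stands in for `./index` so that running it has no side effects.

```shell
# Assumption (from the quoted text): the container user has UID 1000.
CONTAINER_UID=1000

# Stand-in for ./index so the sketch does not touch a real index folder.
index_dir="$(mktemp -d)/index"
mkdir -p "$index_dir"

# Only open the folder up to "others" when the host user is not UID 1000.
if [ "$(id -u)" != "$CONTAINER_UID" ]; then
  chmod -R o+rw "$index_dir"
fi
```

In the real setup `index_dir` would simply be `./index`.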

floriankramer · 2019-03-11T10:55:14Z

docs/wikidata.md

+        qlever
+
+Then point your browser to [http://localhost:7001/](http://localhost:7001/) and
+enter the query.


Same as before, I prefer "open ... in your browser", and I would remove the "enter the query" part, as that should be obvious, and the implication that the user wants to enter exactly one query sounds strange to me.

floriankramer · 2019-03-11T10:55:23Z

docs/wikidata.md

+Then point your browser to [http://localhost:7001/](http://localhost:7001/) and
+enter the query.
+
+For example the following query retrieves all mountains above 8000 m


For example,

niklas88 · 2019-03-11T15:42:31Z

@floriankramer thanks for the review. I think I addressed all of your comments. I also reworked the Wikidata Quickstart Guide a bit:

- Moved the filesystem space part to the git clone so it's less likely one needs to move it later
- Changed to using a wikidata-input folder so it's more clear how the input can be stored separately
- Uncompress wikidata directly while downloading, so that it is one slow command and one can get started immediately on the next steps once that is finished
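
As a rough illustration of the "uncompress while downloading" change listed above, the sketch below pipes a compressed stream straight into `bzip2 -dc`, so that no compressed copy is written to disk. The helper name `stream_decompress`, the `DUMP_URL` variable and the bzip2 format are assumptions made for this example and are not taken from the guide. The `wget` line is shown only as a comment, and the demonstration runs on a tiny local stand-in.

```shell
# Read compressed data from stdin and write the decompressed result to $1.
stream_decompress() {
  bzip2 -dc > "$1"
}

# Real use would look roughly like this (not run here; DUMP_URL is assumed):
#   wget -O - "$DUMP_URL" | stream_decompress wikidata-input/latest-all.ttl

# Offline demonstration with a one-triple stand-in for the dump.
tmp="$(mktemp -d)"
printf '<s> <p> <o> .\n' | bzip2 -c | stream_decompress "$tmp/sample.ttl"
```

Because download and decompression run as one pipeline, the next steps can start as soon as that single command finishes.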

floriankramer

I read through everything again and added some more comments for both things I missed the first time around, and new changes. Overall I really like the new documentation though, and the comments focus mostly on language details.

floriankramer · 2019-03-15T11:03:40Z

README.md

-on > 4 GB files or allocate enough RAM for larger KBs), docker version 18.05 or newer
-(needs multi-stage builds without leaking files (for End-to-End Tests)) and `git`.
-Then you can simply do the following:
+If you use QLever in your work, please cite this paper.


I believe this should be "that paper", since this is written inside of the README and not the paper.

floriankramer · 2019-03-15T11:06:08Z

README.md

+Alternatively to get started with a real (and really big) dataset we have prepared
+a [Wikidata Quickstart Guide](docs/wikidata.md). This guide takes you through the entire
+process of loading the full Wikidata Knowledge Base into QLever, but don't worry
+it is pretty.


Did you mean to write pretty simple / easy, or do you want to express the beauty of wikidata, qlever, the process or the guide?

I guess I missed a flush in my brain's output path.

floriankramer · 2019-03-15T11:12:11Z

README.md

 machine. If you have no input data yet obtain it from one of our [recommended
-sources](docs/obtaining_data.md) or create your own knowledge base in standard
+sources](docs/knowledge_bases.md) or create your own knowledge base in standard


input data yet -> input data yet,
or ->, or

floriankramer · 2019-03-15T11:14:03Z

README.md

 machine. If you have no input data yet obtain it from one of our [recommended
-sources](docs/obtaining_data.md) or create your own knowledge base in standard
+sources](docs/knowledge_bases.md) or create your own knowledge base in standard
 *NTriple* or *Turtle* formats and (obtionally) add a [text
 corpus](docs/sparql_plus_text.md).

 Note that QLever only accepts UTF-8 encoded input files, then again [you should


Not related to your changes, but I would replace the comma here with a full stop.

floriankramer · 2019-03-15T11:16:08Z

README.md

+By default and when running `docker` **without user namespaces**, the container
+will use the user ID 1000 which on Linux is almost always the first real user.
+If the default user does not work add `-u "$(id -u):$(id -g)"` to `docker run`
+to let QLever execute as the current user.


to let -> to have / to make
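
As a side note on the hunk above, the `-u "$(id -u):$(id -g)"` argument could be assembled as in the sketch below. Treating a UID different from 1000 as the trigger is an assumption of this example (the README only says to add the flag if the default user does not work), and the variable names are hypothetical. The flag is printed rather than passed to `docker run`.

```shell
# Assumption (from the quoted README text): the container defaults to UID 1000.
DEFAULT_CONTAINER_UID=1000

# Build the optional flag only when the current user differs from the default.
user_flag=""
if [ "$(id -u)" != "$DEFAULT_CONTAINER_UID" ]; then
  user_flag="-u $(id -u):$(id -g)"
fi

# Print the flag instead of running docker, so the sketch has no side effects.
echo "extra docker run flag: ${user_flag:-<none>}"
```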

floriankramer · 2019-03-15T11:53:11Z

docs/wikidata.md

+[here](https://docs.docker.com/install/linux/docker-ce/ubuntu/).
+
+To download QLever we will clone the `git` repository from GitHub. As we
+create the QLever index in a subfolder of the repository in this tutorial, **make


I think the "in this tutorial" needs to either be moved to after the "As we", or simply be omitted (the context should be clear).
make sure -> you should make sure, that you
of available space -> of space available on the drive on which you execute...

floriankramer · 2019-03-15T11:53:36Z

docs/wikidata.md

+To download QLever we will clone the `git` repository from GitHub. As we
+create the QLever index in a subfolder of the repository in this tutorial, **make
+sure you have about 2 TB of available space** where you execute the following
+steps. Alternatively you can see the full


Alternatively -> Alternatively,

floriankramer · 2019-03-15T11:54:02Z

docs/wikidata.md

+## Download and uncompress Wikidata
+
+If you already downloaded **and decrompressed** Wikidata to uncompressed Turtle
+format you can skip this step, otherwise we download and uncompress it.


step, otherwise -> step. Otherwise
You could also simply remove the second sentence, as it is redundant, given that we just said you could skip this step, if you already downloaded and uncompressed wikidata.

floriankramer · 2019-03-15T11:55:59Z

docs/wikidata.md

+[README](https://github.com/ad-freiburg/QLever#building-the-index) for
+instructions on using a different path for the index.
+
+**The index plus unpacked Wikidata will use up to about 2 TB.**


I still think that turning this into a more fully formed sentence would help the text flow.

floriankramer · 2019-03-15T11:58:33Z

docs/wikidata.md

+Now we can build a QLever Index from the `latest-all.ttl` Wikidata Turtle file.
+For the process of building an index we can tune some settings to the particular
+Knowledge Base. The most important of these is a list of relations which can safely be
+stored on disk as their actual values are rarely accessed. For Wikidata these


on disk -> on disk, (the comma is not required, but I think it helps structure the sentence)

niklas88 · 2019-03-15T15:03:50Z

@floriankramer thank you for the great (as always) review. I've addressed your comments. I'm really looking forward to the new README. I also used the commands in the Wikidata Quickstart Guide for building the xsd:double index yesterday, so these definitely work.

floriankramer

I commented on some minor things, but overall I think that this is ready to be merged.

floriankramer · 2019-03-19T11:05:18Z

README.md

@@ -55,8 +55,9 @@ Further documentation is available on the following topics

 # Building the QLever Docker Container

-We recommend using QLever with `docker` if you absolutely want to run QLever
-directly on your host see [here](docs/native_setup.md).
+We recommend using QLever with [docker](https://www.docker.com) if you


if -> . If

floriankramer · 2019-03-19T11:06:10Z

docs/advanced_features.md

-a correspondingly long query time.  In contrast, the query above takes only
-about 100 ms on a standard Linux machine (with 16 GB memory) and a dataset with 360
-million triples and 530 million text records.
+by replacing the second triple with `?x ?predicate ?object` and add `DISTINCT`


add -> adding

floriankramer · 2019-03-19T11:11:32Z

docs/wikidata.md

@@ -27,12 +29,14 @@ build the index under a different path.
 ## Download and uncompress Wikidata

 If you already downloaded **and decrompressed** Wikidata to uncompressed Turtle
-format you can skip this step, otherwise we download and uncompress it.
+format you can skip this step. Otherwise we download and uncompress it.


to uncompressed -> to the uncompressed
I'm probably just nitpicking at this point, but the "Otherwise we download and uncompress it." still sounds slightly weird to me. What about "Otherwise we'll download and uncompress it in this step."?

niklas88 requested a review from joka921 March 8, 2019 15:37

floriankramer reviewed Mar 11, 2019

niklas88 requested review from floriankramer and removed request for joka921 March 13, 2019 13:26

floriankramer reviewed Mar 15, 2019

floriankramer approved these changes Mar 19, 2019

niklas88 added 10 commits March 19, 2019 12:34

Trim the README.md and add quickstart guide

4c25a94

Typo and outdated "next section" comment

233738a

Headers in quickstart

52a4570

First draft of Wikidata Quickstart Guide fix #201

b465b62

Docs: Wikidata, uncompress during download

394905b

Also use a folder for the input file and address Florian's comments

Address review comments

980439a

Address more review comments

df8cb53

Use a non-hostname URL for db downloads

bc49d6c

Fix wrong wikipedia-freebase path and rename

34ac6ee

Further improve some details of docs after review

f9ffe59

niklas88 merged commit 647bb7e into ad-freiburg:master Mar 19, 2019

niklas88 deleted the improve_readme branch October 1, 2019 15:53
