Trim the README.md and add quickstart guide #202
I read through everything and marked all language, spelling and grammar errors and potential improvements I noticed. I also added one or two comments as to the context, but did not verify the correctness and completeness of the commands.
docs/advanced_features.md
Outdated
On top of the vanilla SPARQL functionality, QLever allows so-called SPARQL+Text
queries on a text corpus linked to a knowledge base via entity recognition. For
example, the following query find all mentions of astronauts next to the words
the following query find all mentions
-> the following query finds all mentions
docs/advanced_features.md
Outdated
[here](docs/sparql_plus_text.md).

QLever also supports efficient SPARQL autocompletion. For example, the
following query yields a list of all predicates associated with persons in the
persons
-> people
docs/advanced_features.md
Outdated
QLever also supports efficient SPARQL autocompletion. For example, the
following query yields a list of all predicates associated with persons in the
knowledge base, ordered by the number of persons which have that predicate.
persons
-> people
Interestingly it turns out "persons" is okay too and is in fact the more formal (older) form. Still, since we are using C++17 we might as well stick to the more modern form for English as well. So thanks for pointing this out to me!
docs/advanced_features.md
Outdated
GROUP BY ?predicate
ORDER BY DESC(?count)

Note that this query could also be processed by standard SPARQL simply by
As SPARQL is the language, I would replace this with "Note that this query could also be processed by a standard SPARQL engine" or "Note that this query is equivalent to a standard SPARQL query".
docs/advanced_features.md
Outdated
ORDER BY DESC(?count)

Note that this query could also be processed by standard SPARQL simply by
replacing the second triple by ?x ?predicate ?object. However, that query is
replacing by
-> replacing with
That avoids having two repetitive "by"s. Also, "replace with" is normally used in the active voice (versus "replace by" in the passive voice).
docs/wikidata.md
Outdated
## Build a QLever Index

Now we can build a QLever Index from the `latest-all.ttl` Wikidata Turtle file
using the `wikidata_settings.json` file for some useful default settings for
... .json file for some ...
This sentence is ginormous and should probably be split into two. E.g.:
.json file. The .json file contains some useful...
docs/wikidata.md
Outdated
Now we can build a QLever Index from the `latest-all.ttl` Wikidata Turtle file
using the `wikidata_settings.json` file for some useful default settings for
relations that can be safely stored on disk because their actual values are
that can be stored on disk safely as their
docs/wikidata.md
Outdated
-u`) is not 1000 you have to make the `./index` folder writable for QLever
inside the container e.g. by running `chmod -R o+rw ./index`

**Note (1):** This takes about half a day but should be much faster than with most
but should be
-> which is still faster than [most] other triple stores.
I would consider simply removing that part of the sentence though, as this is a tutorial and not a comparison, and it feels somewhat out of place to me.
docs/wikidata.md
Outdated
qlever

Then point your browser to [http://localhost:7001/](http://localhost:7001/) and
enter the query.
Same as before, I prefer "open ... in your browser", and I would remove the "enter the query" part, as that should be obvious, and the implication that the user wants to enter exactly one query sounds strange to me.
docs/wikidata.md
Outdated
Then point your browser to [http://localhost:7001/](http://localhost:7001/) and
enter the query.

For example the following query retrieves all mountains above 8000 m
For example,
@floriankramer thanks for the review. I think I addressed all of your comments. I also reworked the Wikidata Quickstart Guide a bit
I read through everything again and added some more comments for both things I missed the first time around, and new changes. Overall I really like the new documentation though, and the comments focus mostly on language details.
README.md
Outdated
on > 4 GB files or allocate enough RAM for larger KBs), docker version 18.05 or newer
(needs multi-stage builds without leaking files (for End-to-End Tests)) and `git`.
Then you can simply do the following:
If you use QLever in your work, please cite this paper.
I believe this should be "that paper", since this is written inside of the readme and not the paper.
README.md
Outdated
Alternatively to get started with a real (and really big) dataset we have prepared
a [Wikidata Quickstart Guide](docs/wikidata.md). This guide takes you through the entire
process of loading the full Wikidata Knowledge Base into QLever, but don't worry
it is pretty.
Did you mean to write "pretty simple / easy", or do you want to express the beauty of wikidata, qlever, the process or the guide?
I guess I missed a flush in my brain's output path.
machine. If you have no input data yet obtain it from one of our [recommended
sources](docs/obtaining_data.md) or create your own knowledge base in standard
sources](docs/knowledge_bases.md) or create your own knowledge base in standard
input data yet
-> input data yet,
or
->, or
README.md
Outdated
machine. If you have no input data yet obtain it from one of our [recommended
sources](docs/obtaining_data.md) or create your own knowledge base in standard
sources](docs/knowledge_bases.md) or create your own knowledge base in standard
*NTriple* or *Turtle* formats and (obtionally) add a [text
corpus](docs/sparql_plus_text.md).

Note that QLever only accepts UTF-8 encoded input files, then again [you should
Not related to your changes, but I would replace the comma here with a full stop.
README.md
Outdated
By default and when running `docker` **without user namespaces**, the container
will use the user ID 1000 which on Linux is almost always the first real user.
If the default user does not work add `-u "$(id -u):$(id -g)"` to `docker run`
to let QLever execute as the current user.
to let
-> to have
/ to make
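For readers of this thread, the check described in the hunk above could be sketched in shell roughly as follows. This is only an illustration, assuming (as the README text states) that the container's default user ID is 1000. The variable names `default_uid` and `user_flag` are made up for this sketch and are not part of QLever, and the rest of the `docker run` command line is deliberately elided:

```shell
#!/bin/sh
# Sketch: decide whether `docker run` needs an explicit -u flag.
# Assumption: the container's default user ID is 1000, as the README hunk says.
default_uid=1000
current_uid="$(id -u)"

if [ "$current_uid" -eq "$default_uid" ]; then
  # The default container user already matches the invoking user.
  user_flag=""
else
  # Run as the invoking user and group instead of the container default.
  user_flag="-u $(id -u):$(id -g)"
fi

# Only the flag is shown; image name and other options are elided here.
printf 'docker run %s ...\n' "$user_flag"
```

If the printed line contains a `-u` flag, the current user differs from the assumed default and the flag would need to be passed to `docker run`.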
docs/wikidata.md
Outdated
[here](https://docs.docker.com/install/linux/docker-ce/ubuntu/).

To download QLever we will clone the `git` repository from GitHub. As we
create the QLever index in a subfolder of the repository in this tutorial, **make
I think the "in this tutorial" needs to either be moved to after the "As we", or simply be omitted (the context should be clear).
make sure
-> you should make sure, that you
of available space
-> of space available on the drive on which you execute...
docs/wikidata.md
Outdated
To download QLever we will clone the `git` repository from GitHub. As we
create the QLever index in a subfolder of the repository in this tutorial, **make
sure you have about 2 TB of available space** where you execute the following
steps. Alternatively you can see the full
Alternatively
-> Alternatively,
docs/wikidata.md
Outdated
## Download and uncompress Wikidata

If you already downloaded **and decrompressed** Wikidata to uncompressed Turtle
format you can skip this step, otherwise we download and uncompress it.
step, otherwise
-> step. Otherwise
You could also simply remove the second sentence, as it is redundant, given that we just said you could skip this step, if you already downloaded and uncompressed wikidata.
docs/wikidata.md
Outdated
[README](https://github.com/ad-freiburg/QLever#building-the-index) for
instructions on using a different path for the index.

**The index plus unpacked Wikidata will use up to about 2 TB.**
I still think that turning this into a more fully formed sentence would help the text flow.
docs/wikidata.md
Outdated
Now we can build a QLever Index from the `latest-all.ttl` Wikidata Turtle file.
For the process of building an index we can tune some settings to the particular
Knowledge Base. The most important of these is a list of relations which can safely be
stored on disk as their actual values are rarely accessed. For Wikidata these
on disk
-> on disk,
(the comma is not required, but I think it helps structure the sentence)
@floriankramer thank you for the great (as always) review. I've addressed your comments. I'm really looking forward to the new README. I also used the commands in the Wikidata Quickstart Guide for building the |
I commented on some minor things, but overall I think that this is ready to be merged.
README.md
Outdated
@@ -55,8 +55,9 @@ Further documentation is available on the following topics

# Building the QLever Docker Container

We recommend using QLever with `docker` if you absolutely want to run QLever
directly on your host see [here](docs/native_setup.md).
We recommend using QLever with [docker](https://www.docker.com) if you
if
-> . If
docs/advanced_features.md
Outdated
a correspondingly long query time. In contrast, the query above takes only
about 100 ms on a standard Linux machine (with 16 GB memory) and a dataset with 360
million triples and 530 million text records.
by replacing the second triple with `?x ?predicate ?object` and add `DISTINCT`
add
-> adding
docs/wikidata.md
Outdated
@@ -27,12 +29,14 @@ build the index under a different path.
## Download and uncompress Wikidata

If you already downloaded **and decrompressed** Wikidata to uncompressed Turtle
format you can skip this step, otherwise we download and uncompress it.
format you can skip this step. Otherwise we download and uncompress it.
to uncompressed
-> to the uncompressed
I'm probably just nitpicking at this point, but the "Otherwise we download and uncompress it." still sounds slightly weird to me. What about "Otherwise we'll download and uncompress it in this step."?
Also use a folder for the input file and address Florian's comments
pretty aggressive trimming and a very quick no details no fuss quickstart guide